
Rayleigh quotient

The Rayleigh quotient, named after the British physicist John William Strutt, 3rd Baron Rayleigh, is a scalar-valued function defined for a square matrix A \in \mathbb{R}^{n \times n} and a nonzero vector x \in \mathbb{R}^n as R(A, x) = \frac{x^T A x}{x^T x}, providing an approximation to the eigenvalues of A. When x is an eigenvector of A corresponding to eigenvalue \lambda, the quotient exactly equals \lambda. This expression arises naturally in the study of quadratic forms and has roots in Rayleigh's 1870s work on the theory of sound, where it modeled vibrational modes through Sturm-Liouville eigenvalue problems. For Hermitian matrices (the complex analog of symmetric matrices), the Rayleigh quotient plays a central role in the Courant-Fischer min-max theorem, which characterizes the eigenvalues \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n as \lambda_k = \min_{\dim S = k} \max_{x \in S, \|x\|=1} R(A, x), where the minimum is over all k-dimensional subspaces S of \mathbb{C}^n. This implies that the Rayleigh quotient is stationary at eigenvectors, with its maximum value over the unit sphere equaling the largest eigenvalue \lambda_n and its minimum equaling the smallest \lambda_1. These extremal properties make it a cornerstone for bounding eigenvalues without full diagonalization.

Beyond theoretical foundations, the Rayleigh quotient is instrumental in numerical algorithms for eigenvalue computation, such as Rayleigh quotient iteration, which refines approximate eigenvectors by solving shifted linear systems and achieves cubic convergence near isolated eigenvalues for non-defective matrices. It also extends to generalized eigenvalue problems of the form A v = \lambda B v with positive definite B, yielding R(A, B, x) = \frac{x^T A x}{x^T B x}, and finds applications in statistics, quantum mechanics, and optimization. Generalizations handle non-Hermitian cases, though convergence may degrade due to non-normality.

Definition and Formulation

General Definition

The Rayleigh quotient, for an n \times n matrix A and a nonzero vector x \in \mathbb{C}^n, is defined as R(A, x) = \frac{x^* A x}{x^* x}, where x^* denotes the conjugate transpose (Hermitian transpose) of x. Here, the numerator x^* A x is a quadratic form, a scalar-valued function that arises from the sesquilinear form induced by A via the standard inner product \langle u, v \rangle = u^* v on \mathbb{C}^n, while the denominator x^* x = \|x\|^2 is the squared Euclidean norm of x. This ratio normalizes the quadratic form to make it scale-invariant, as R(A, cx) = R(A, x) for any nonzero scalar c. If a basis for \mathbb{C}^n is chosen with the normalized vector x / \|x\| as the first basis vector (completed by an orthonormal set for the remainder), R(A, x) equals the (1,1) diagonal entry of the matrix representation of A in that basis.

Introduced by John William Strutt, the third Baron Rayleigh, the concept originated in the late nineteenth century during his investigations into the theory of sound, particularly for analyzing normal modes of vibration in continuous systems like strings and membranes. Rayleigh employed the quotient to approximate fundamental frequencies by assuming simple trial functions for displacement, deriving energy ratios that yield estimates for eigenvalues of associated differential operators. This variational approach laid foundational groundwork for later numerical methods in eigenvalue problems, extending beyond acoustics to broader linear algebra applications.

The Rayleigh quotient admits a basic interpretation as the sole eigenvalue of the linear operator A restricted (or projected) to the one-dimensional subspace spanned by x. Specifically, if P is the orthogonal projection onto \operatorname{span}\{x\}, then R(A, x) is the eigenvalue of the compressed operator P A P on that subspace. For illustration, consider the 2 \times 2 matrix A = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} and the eigenvector x = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. Then x^* A x = 1 and x^* x = 1, so R(A, x) = 1, matching the corresponding eigenvalue \lambda_1 = 1. If instead x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, then x^* A x = 4 and x^* x = 2, yielding R(A, x) = 2, a value between the two eigenvalues that reflects x's equal alignment with both eigenvectors.
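The worked example above can be checked directly. The following minimal sketch (using NumPy; the helper name rayleigh_quotient is ours, not a standard library function) evaluates the quotient for an eigenvector, a mixed vector, and a rescaled vector to confirm scale invariance:

```python
import numpy as np

def rayleigh_quotient(A, x):
    """R(A, x) = (x* A x) / (x* x) for a nonzero vector x."""
    x = np.asarray(x, dtype=complex)
    return (x.conj() @ A @ x) / (x.conj() @ x)

A = np.array([[1.0, 0.0], [0.0, 3.0]])

print(rayleigh_quotient(A, [1, 0]).real)  # eigenvector: quotient equals eigenvalue 1
print(rayleigh_quotient(A, [1, 1]).real)  # mixed vector: 4 / 2 = 2
print(rayleigh_quotient(A, [5, 5]).real)  # scale invariance: still 2
```

The third call illustrates R(A, cx) = R(A, x): only the direction of x matters.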

Hermitian Case

When the matrix A is Hermitian, meaning A = A^* where A^* denotes the conjugate transpose, the Rayleigh quotient R(A, x) = \frac{x^* A x}{x^* x} for a nonzero x \in \mathbb{C}^n simplifies significantly. In this case, R(A, x) is always real-valued because both the numerator x^* A x and denominator x^* x are real scalars, as A being Hermitian ensures x^* A x = (x^* A x)^* and x^* x > 0. Furthermore, if x is an eigenvector of A corresponding to eigenvalue \lambda, then R(A, x) = \lambda, providing a direct link to the eigenvalues.

The critical points of the Rayleigh quotient occur precisely at the eigenvectors of A. To see this, consider the stationarity condition by computing the gradient \nabla R(A, x) with respect to x, treating R as a function on the unit sphere \|x\| = 1 for simplicity. The computation leads to \nabla R = 2(A x - R(A, x) x) = 0, which implies A x = R(A, x) x, showing that stationary points satisfy the eigenvalue equation with eigenvalue R(A, x). This property holds because Hermitian matrices have real eigenvalues and a complete set of orthonormal eigenvectors, making the Rayleigh quotient a natural variational characterization.

For Hermitian matrices, eigenvectors corresponding to distinct eigenvalues are orthogonal with respect to the standard inner product \langle u, v \rangle = u^* v. This orthogonality follows directly from the eigenvalue equations: if A u = \lambda u and A v = \mu v with \lambda \neq \mu, then u^* A v = \mu u^* v and u^* A v = \lambda u^* v, yielding (\lambda - \mu) u^* v = 0. This property ensures the eigenvectors form an orthonormal basis, facilitating diagonalization.

Consider the 2×2 matrix A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}, which has eigenvalues \lambda_1 = 5 and \lambda_2 = 0 with corresponding normalized eigenvectors v_1 = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 \\ 2 \end{pmatrix} and v_2 = \frac{1}{\sqrt{5}} \begin{pmatrix} -2 \\ 1 \end{pmatrix}. For x = v_1, R(A, x) = 5; for x = v_2, R(A, x) = 0. For an intermediate vector, say x = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, R(A, x) = \frac{x^* A x}{x^* x} = \frac{9}{2}, which lies strictly between the minimum and maximum eigenvalues. The Rayleigh quotient originated in Lord Rayleigh's analysis of normal modes in vibrating systems, where it was used to approximate fundamental frequencies by assuming a trial displacement function, as detailed in his 1877 treatise on acoustics.
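The stationarity property can be verified numerically for this 2×2 example. This sketch (NumPy; helper names R and grad_R are ours) evaluates the gradient formula 2(Ax - R(x)x) at an eigenvector and checks that it vanishes:

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])

def R(x):
    return (x @ A @ x) / (x @ x)

def grad_R(x):
    # Gradient of the Rayleigh quotient for real symmetric A:
    # grad R = 2 (A x - R(x) x) / (x @ x)
    return 2 * (A @ x - R(x) * x) / (x @ x)

v1 = np.array([1.0, 2.0]) / np.sqrt(5)   # eigenvector for eigenvalue 5
x  = np.array([1.0, 1.0]) / np.sqrt(2)   # generic non-eigenvector

print(R(v1))                        # 5.0: quotient recovers the eigenvalue
print(np.linalg.norm(grad_R(v1)))   # ~0: stationary at the eigenvector
print(R(x))                         # 4.5, strictly between 0 and 5
```

At the non-eigenvector x, the gradient is nonzero and points toward faster-growing directions, which is what gradient-based eigenvalue methods exploit.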

Properties

Rayleigh Bounds

For a Hermitian matrix A \in \mathbb{C}^{n \times n} with eigenvalues ordered as \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n, the Rayleigh quotient R(A, x) = \frac{x^* A x}{x^* x} for x \neq 0 satisfies \lambda_1 \leq R(A, x) \leq \lambda_n. These basic bounds are achieved when x is a corresponding eigenvector: the minimum \min_{x \neq 0} R(A, x) = \lambda_1 at the eigenvector for \lambda_1, and the maximum \max_{x \neq 0} R(A, x) = \lambda_n at the eigenvector for \lambda_n. This follows directly from the spectral theorem, which diagonalizes A in an orthonormal basis of eigenvectors, allowing expansion of x in that basis to show the quadratic form is a convex combination of the eigenvalues weighted by the squared coefficients.

The extremal properties extend to intermediate eigenvalues via the Courant-Fischer min-max theorem, which characterizes each \lambda_k variationally over subspaces. Specifically, \lambda_k = \min_{\dim S = k} \max_{\substack{x \in S \\ \|x\| = 1}} x^* A x = \max_{\dim T = n-k+1} \min_{\substack{x \in T \\ \|x\| = 1}} x^* A x, where the minima and maxima are taken over all subspaces S, T \subseteq \mathbb{C}^n of the indicated dimensions. The first equality identifies \lambda_k as the smallest possible maximum Rayleigh quotient over all k-dimensional subspaces, achieved when S is the span of the eigenvectors corresponding to \lambda_1, \dots, \lambda_k. The second equality provides a dual max-min characterization, achieved when T is the span of the eigenvectors for \lambda_k, \dots, \lambda_n. These formulations, originally due to Fischer (1905) for the max-min version and Courant (1920) for extensions to boundary value problems, enable bounding eigenvalues without full computation.

A proof sketch relies on the spectral theorem and orthonormal eigenbases. Let \{v_1, \dots, v_n\} be an orthonormal eigenbasis with A v_i = \lambda_i v_i. For the min-max form, consider any k-dimensional subspace S; by a dimension count, S must intersect nontrivially with the orthogonal complement of \operatorname{span}\{v_1, \dots, v_{k-1}\}, which has dimension n-k+1. Thus, there exists a unit vector x \in S with components only in \{v_k, \dots, v_n\}; writing x = \sum_{i=k}^n \alpha_i v_i with \sum_{i=k}^n |\alpha_i|^2 = 1 gives x^* A x = \sum_{i=k}^n |\alpha_i|^2 \lambda_i \geq \lambda_k (since \lambda_i \geq \lambda_k for i \geq k), implying \max_{x \in S, \|x\|=1} x^* A x \geq \lambda_k. Equality holds for S = \operatorname{span}\{v_1, \dots, v_k\}, where the maximum is \lambda_k. Taking the minimum over all such S yields \lambda_k. The max-min form follows symmetrically by reversing the ordering or using complements. This spectral decomposition proof highlights how the Rayleigh quotient's range on subspaces is constrained by the eigenvalue distribution.

To illustrate, consider the 3×3 matrix A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix}, with eigenvalues \lambda_1 \approx 0.586, \lambda_2 = 2, \lambda_3 \approx 3.414. The full-space bounds are \min R(A, x) = 0.586 and \max R(A, x) = 3.414. For the second eigenvalue, take the 2-dimensional subspace S spanned by the vectors e_1, e_2; the restricted Rayleigh quotients solve the 2×2 subproblem with eigenvalues 1 and 3, so \max_{x \in S, \|x\|=1} R(A, x) = 3 > \lambda_2, but \min_{x \in S, \|x\|=1} R(A, x) = 1 < \lambda_2. The min-max characterization gives \lambda_2 = \min_{\dim S=2} \max_{x \in S} R(A, x) = 2, achieved by the optimal subspace. Similarly, for T of dimension 2 orthogonal to the smallest eigenvector, the max in T approximates \lambda_2 from above. This shows how subspace choices yield bounds bracketing intermediate eigenvalues.
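The 3×3 illustration can be reproduced numerically. This sketch (NumPy; the helper max_rq_over is ours) compares the maximum Rayleigh quotient over the coordinate subspace span{e_1, e_2} with that over the optimal subspace spanned by the two lowest eigenvectors:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
lam, V = np.linalg.eigh(A)          # eigenvalues 2-sqrt(2), 2, 2+sqrt(2), ascending

def max_rq_over(S):
    """Max Rayleigh quotient over the column span of S: the largest
    eigenvalue of the projected matrix Q^T A Q."""
    Q, _ = np.linalg.qr(S)
    return np.linalg.eigvalsh(Q.T @ A @ Q).max()

coord = np.eye(3)[:, :2]            # S = span{e1, e2}: a suboptimal choice
opt = V[:, :2]                      # span of the two lowest eigenvectors

print(max_rq_over(coord))           # 3.0 >= lambda_2 = 2
print(max_rq_over(opt))             # 2.0 == lambda_2, the min-max value
```

Every 2-dimensional subspace gives a maximum quotient of at least \lambda_2; only the eigenvector subspace attains it, exactly as the min-max theorem states.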

Variational Principles

The Rayleigh quotient provides a variational framework for characterizing the eigenvalues of a real symmetric (or Hermitian) matrix A as extrema attained by restricting the quotient to finite-dimensional subspaces of \mathbb{R}^n. For a subspace S \subset \mathbb{R}^n of dimension k, the Rayleigh-Ritz procedure approximates the eigenvalues by solving the eigenvalue problem for the projected operator P_S A|_S, where P_S is the orthogonal projector onto S. To derive this, select an orthonormal basis Q for S (with columns forming the basis vectors), and form the Rayleigh matrix Q^T A Q. The eigenvalues of this k \times k symmetric matrix, termed Ritz values \theta_1 \leq \cdots \leq \theta_k, represent the stationary values of the Rayleigh quotient R(x) = \frac{x^T A x}{x^T x} over unit vectors x \in S, with corresponding Ritz vectors Q v_i approximating the eigenvectors of A. This projection ensures that the Ritz values interlace the true eigenvalues when S is varied appropriately.

In the infinite-dimensional setting of self-adjoint operators on Hilbert spaces, the Rayleigh quotient extends to a functional R = \frac{\langle Au, u \rangle}{\langle u, u \rangle} over suitable function spaces, connecting directly to the calculus of variations. The critical points of this functional, found by setting the variational derivative to zero, satisfy the Euler-Lagrange equation A u = \lambda u, yielding the eigenfunctions and eigenvalues as stationary values. This formulation underpins the Rayleigh-Ritz method in partial differential equations, where trial functions spanning a subspace minimize the functional to approximate solutions.

A key extension involves trace minimization over subspaces, capturing sums of eigenvalues variationally. Ky Fan's principle states that the sum of the smallest k eigenvalues of A equals \min \operatorname{tr}(Q^T A Q), where the minimum is over all n \times k matrices Q with orthonormal columns, equivalent to optimizing the trace of the projected Rayleigh matrix over k-dimensional subspaces. This provides an upper bound approximation for the sum when restricting to trial subspaces, useful for estimating spectral gaps or low-lying spectra collectively.

Convergence of Ritz values to the true eigenvalues follows from perturbation theory applied to the subspace projection. Specifically, as the dimension k increases or the principal angles between the trial subspace S and the true invariant subspace shrink, the Ritz values approach the corresponding eigenvalues; error bounds are derived using majorization inequalities and the generalized pinching lemma to quantify residuals and perturbations. In quantum mechanics, the Ritz method approximates the ground state energy by minimizing the expectation value of the Hamiltonian over a low-dimensional trial subspace. For instance, consider a 2D subspace spanned by Gaussian trial functions \psi_1(x) = e^{-\alpha x^2} and \psi_2(x) = x e^{-\beta x^2} for the 1D anharmonic oscillator H = -\frac{d^2}{dx^2} + x^2 + \gamma x^4; the optimal linear combination \psi = c_1 \psi_1 + c_2 \psi_2 (normalized) yields the Ritz energy as the smallest eigenvalue of the 2×2 matrix of Hamiltonian matrix elements \langle \psi_i | H | \psi_j \rangle, providing a variational upper bound that improves upon single-trial estimates for small \gamma.
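Both the interlacing of Ritz values and Ky Fan's trace bound can be demonstrated on a random symmetric matrix. This sketch (NumPy; the matrix, subspace, and seed are arbitrary choices of ours) projects onto a random 2-dimensional trial subspace and checks the two properties:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                       # random symmetric test matrix
lam = np.linalg.eigvalsh(A)             # true eigenvalues, ascending

# Ritz values on a random k-dimensional trial subspace
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
ritz = np.linalg.eigvalsh(Q.T @ A @ Q)

# Cauchy interlacing: lam[i] <= ritz[i] <= lam[i + n - k]
print(all(lam[i] <= ritz[i] + 1e-12 and ritz[i] <= lam[i + n - k] + 1e-12
          for i in range(k)))           # True

# Ky Fan: tr(Q^T A Q) is bounded below by the sum of the k smallest
# eigenvalues, with equality when Q spans the corresponding eigenvectors.
print(np.trace(Q.T @ A @ Q) >= lam[:k].sum() - 1e-12)   # True
```

Any orthonormal Q satisfies both inequalities; minimizing the trace over Q recovers \lambda_1 + \cdots + \lambda_k exactly.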

Applications in Linear Algebra

Covariance Matrices and PCA

In statistics, the Rayleigh quotient provides the foundational framework for principal component analysis (PCA), a method for identifying patterns of variation in multivariate data. Consider a data matrix X \in \mathbb{R}^{n \times p} whose rows represent n observations of p variables, assumed to be centered (mean zero). The sample covariance matrix is defined as \Sigma = \frac{1}{n} X^T X, which captures the pairwise variances and covariances among the variables. The principal components of the data are the eigenvectors of \Sigma, and the variance of the data projected onto a unit vector x \in \mathbb{R}^p is given by \mathrm{var}(X x) = x^T \Sigma x = R(\Sigma, x), where R(\Sigma, x) denotes the Rayleigh quotient. This equivalence positions the Rayleigh quotient as the objective function for maximizing projected variance, the core goal of PCA. To derive the principal directions, the first principal component solves the constrained optimization problem of maximizing x^T \Sigma x subject to the unit norm constraint x^T x = 1. This is addressed using the method of Lagrange multipliers, forming the Lagrangian L(x, \lambda) = x^T \Sigma x + \lambda (1 - x^T x). Taking the derivative with respect to x and setting it to zero yields the eigenvalue equation \Sigma x = \lambda x. The solution x is thus the eigenvector of \Sigma corresponding to its largest eigenvalue \lambda_1, with the maximum value of the Rayleigh quotient being \lambda_1. Subsequent principal components are obtained by solving similar problems on the deflated covariance matrix or enforcing orthogonality to prior components, yielding eigenvectors associated with decreasing eigenvalues \lambda_2 \geq \cdots \geq \lambda_p. In the context of PCA, the largest Rayleigh quotient identifies the first principal component, which captures the direction of greatest data variability; the associated eigenvalue \lambda_1 quantifies this variance. 
The proportion of total variance explained by this component is \lambda_1 / \sum_{i=1}^p \lambda_i, allowing practitioners to assess its informativeness and decide on the number of components to retain for dimensionality reduction. Lower-order components similarly maximize remaining variance under orthogonality. In machine learning, this Rayleigh quotient-based formulation underpins PCA's role in preprocessing high-dimensional data for tasks such as classification, regression, and visualization, where it reduces noise and computational burden while preserving key structure. As an illustrative example, consider a simple 2D dataset with observations (1,2), (3,4), (5,6). The sample mean is (3, 4), so the centered data matrix has rows (-2, -2), (0, 0), (2, 2). The covariance matrix is \Sigma = \frac{1}{3} \begin{pmatrix} (-2)^2 + 0^2 + 2^2 & (-2)(-2) + 0 \cdot 0 + 2 \cdot 2 \\ (-2)(-2) + 0 \cdot 0 + 2 \cdot 2 & (-2)^2 + 0^2 + 2^2 \end{pmatrix} = \begin{pmatrix} 8/3 & 8/3 \\ 8/3 & 8/3 \end{pmatrix}. The eigenvalues of \Sigma are 16/3 and 0, with corresponding unit eigenvectors (1/\sqrt{2}, 1/\sqrt{2})^T and (1/\sqrt{2}, -1/\sqrt{2})^T. The Rayleigh quotient along the first eigenvector is (1/\sqrt{2}, 1/\sqrt{2}) \, \Sigma \, (1/\sqrt{2}, 1/\sqrt{2})^T = 16/3, confirming it as the direction of maximum variance (the data lie along the line y = x), while the second yields 0 (orthogonal to the variation). This example demonstrates how the Rayleigh quotient identifies the principal direction aligned with the data spread.
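The 2D dataset above makes a compact end-to-end check. This sketch (NumPy) centers the data, forms the covariance matrix, and confirms that the Rayleigh quotient at the top eigenvector equals the largest eigenvalue 16/3:

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Xc = X - X.mean(axis=0)                 # centered rows: (-2,-2), (0,0), (2,2)
Sigma = Xc.T @ Xc / len(X)              # sample covariance [[8/3, 8/3], [8/3, 8/3]]

lam, V = np.linalg.eigh(Sigma)          # ascending eigenvalues: 0 and 16/3
pc1 = V[:, -1]                          # first principal component, +-(1,1)/sqrt(2)

rq = pc1 @ Sigma @ pc1                  # Rayleigh quotient at the top eigenvector
print(rq, lam[-1])                      # both 16/3: maximum projected variance
print(lam[-1] / lam.sum())              # proportion of variance explained: 1.0
```

Because the points lie exactly on the line y = x, the first component explains all the variance and the second eigenvalue is 0.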

Eigenvalue Computation Methods

The Rayleigh quotient serves as a key tool in iterative algorithms for approximating eigenvalues and eigenvectors of matrices, particularly for symmetric matrices where its stationarity properties enhance convergence. In the power method, designed to find the dominant eigenvalue, the iteration generates a sequence x_{k+1} = A x_k / \| A x_k \| starting from a normalized initial vector x_0, and the Rayleigh quotient R(A, x_k) = \frac{x_k^T A x_k}{x_k^T x_k} provides a progressively accurate estimate of the dominant eigenvalue. The method converges linearly with asymptotic error constant |\lambda_2 / \lambda_1|, where \lambda_1 and \lambda_2 are the eigenvalues of largest and second-largest magnitude, respectively, and the Rayleigh quotient effectively monitors this convergence by stabilizing as the iterates approach the dominant eigenvector.

Inverse iteration targets an eigenvalue near a user-specified shift \sigma, typically chosen close to an a priori estimate. The algorithm begins with a normalized x_0 and iterates x_{k+1} = (A - \sigma I)^{-1} x_k / \| (A - \sigma I)^{-1} x_k \|, with the Rayleigh quotient R(A, x_k) yielding an approximation to the targeted eigenvalue and the difference R(A, x_k) - \sigma monitoring convergence to the eigenvalue nearest the shift. This achieves linear convergence at a rate governed by the ratio of distances from \sigma to the closest and next-closest eigenvalues.

Rayleigh quotient iteration improves upon inverse iteration by dynamically setting the shift \sigma_k = R(A, x_k) at each step, producing the update x_{k+1} = (A - \sigma_k I)^{-1} x_k / \| (A - \sigma_k I)^{-1} x_k \| followed by \sigma_{k+1} = R(A, x_{k+1}). For symmetric matrices, this yields local cubic convergence to a simple eigenvalue, where the error in the eigenvector approximation satisfies \| e_{k+1} \| \approx C \| e_k \|^3 for some constant C depending on eigenvalue gaps and matrix condition, often tripling the number of accurate digits per iteration once near the solution. To illustrate, consider the symmetric matrix A = \begin{pmatrix} 6 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix}. Starting with a normalized initial vector such as x_0 = \frac{1}{\sqrt{3}} (1, 1, 1)^T, Rayleigh quotient iteration rapidly refines the approximate eigenvalue and eigenvector toward an isolated eigenvalue, demonstrating the cubic convergence. These techniques are integrated into established numerical libraries; for instance, LAPACK's MRRR algorithm (in routines like xSTEGR) employs Rayleigh quotient corrections to refine selected eigenvectors of symmetric tridiagonal matrices, achieving high relative accuracy even for nearly degenerate cases.
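The iteration described above can be sketched compactly. This implementation (NumPy; the function name, tolerance, and iteration cap are our choices) runs Rayleigh quotient iteration on the 3×3 example matrix:

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, tol=1e-12, max_iter=50):
    """RQI for symmetric A: shift by the current Rayleigh quotient each step."""
    x = x0 / np.linalg.norm(x0)
    sigma = x @ A @ x
    for _ in range(max_iter):
        try:
            y = np.linalg.solve(A - sigma * np.eye(len(x)), x)
        except np.linalg.LinAlgError:
            break                        # shift hit an eigenvalue exactly
        x = y / np.linalg.norm(y)
        sigma_new = x @ A @ x
        if abs(sigma_new - sigma) < tol:
            return sigma_new, x
        sigma = sigma_new
    return sigma, x

A = np.array([[6.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
sigma, v = rayleigh_quotient_iteration(A, np.ones(3))

print(sigma)                                 # an eigenvalue of A
print(np.linalg.norm(A @ v - sigma * v))     # residual ~ 0
```

The shifted system becomes nearly singular as the shift approaches an eigenvalue; the normalization step keeps the iterate well-scaled, which is exactly what drives the fast convergence.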

Applications in Analysis

Sturm-Liouville Theory

In the context of Sturm-Liouville theory, the Rayleigh quotient arises naturally from the self-adjoint operator associated with the differential equation -(p u')' + q u = \lambda r u, where p(x) > 0, q(x), and r(x) > 0 are given functions on an interval [a, b], subject to appropriate boundary conditions ensuring self-adjointness. The functional form of the Rayleigh quotient for a twice-differentiable function u satisfying the boundary conditions is R(u) = \frac{\int_a^b \left[ p (u')^2 + q u^2 \right] \, dx}{\int_a^b r u^2 \, dx}, which represents the quotient \langle u, L u \rangle / \langle u, u \rangle_r, where L u = -(p u')' + q u is the Sturm-Liouville operator and \langle \cdot, \cdot \rangle_r denotes the inner product weighted by r. This formulation derives from integrating by parts the expression \int_a^b u \, (L u) \, dx, yielding the quadratic form in the numerator that is positive definite under standard assumptions on the coefficients.

The Rayleigh quotient provides a variational characterization of the eigenvalues: the smallest eigenvalue \lambda_1 is the minimum of R(u) over all admissible nonzero functions u, attained precisely when u is the corresponding eigenfunction, with higher eigenvalues given by the constrained minimization \lambda_{n+1} = \min \{ R(u) \mid \langle u_i, u \rangle_r = 0, \, i=1,\dots,n \}. To approximate these eigenvalues numerically, the Rayleigh-Ritz method employs a finite-dimensional subspace of trial functions (e.g., polynomials or trigonometric functions satisfying the boundary conditions), reducing the problem to minimizing R(u) over that subspace, which yields upper bounds on the eigenvalues via the min-max principle. The eigenfunctions corresponding to distinct eigenvalues are orthogonal with respect to the weight r, satisfying \int_a^b \phi_m \phi_n r \, dx = 0 for m \neq n.

A classic example is the vibrating string problem, modeled by the Sturm-Liouville equation u'' + \lambda u = 0 on [0, L] with Dirichlet boundary conditions u(0) = u(L) = 0, corresponding to p=1, q=0, r=1. Here, R(u) = \int_0^L (u')^2 \, dx / \int_0^L u^2 \, dx, and the exact eigenvalues are \lambda_n = (n \pi / L)^2 with eigenfunctions \sin(n \pi x / L); the trial function \sin(\pi x / L) yields the exact lowest eigenvalue, while linear combinations of several trial functions provide bounds on subsequent modes. This continuous formulation ties historically to Lord Rayleigh's original derivation of the quotient for estimating natural frequencies in vibrating systems, as developed in his analysis of membrane and string vibrations.
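A simple polynomial trial function already gives a tight upper bound for the string problem. This sketch (NumPy; the trial function u(x) = x(L - x) and the grid resolution are our choices) evaluates R(u) by the trapezoidal rule on [0, 1]:

```python
import numpy as np

# Trial function u(x) = x(L - x) for u'' + lambda u = 0, u(0) = u(L) = 0, L = 1
L = 1.0
x = np.linspace(0.0, L, 200001)
dx = x[1] - x[0]
u = x * (L - x)
du = L - 2.0 * x

def trapezoid(f, dx):
    # composite trapezoidal rule on a uniform grid
    return dx * (f[0] / 2 + f[1:-1].sum() + f[-1] / 2)

# R(u) = int (u')^2 dx / int u^2 dx = (L^3/3) / (L^5/30) = 10 / L^2
R = trapezoid(du**2, dx) / trapezoid(u**2, dx)
print(R)           # 10.0: an upper bound on lambda_1
print(np.pi**2)    # exact lambda_1 = (pi/L)^2, about 9.8696
```

The bound 10 exceeds the true eigenvalue \pi^2 \approx 9.87 by only about 1.3%, illustrating how well even a crude trial function can perform.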

Rayleigh-Ritz Method

The Rayleigh-Ritz method is a variational technique for obtaining approximate solutions to eigenvalue problems governed by partial differential equations (PDEs), such as those arising in structural mechanics or quantum mechanics. It projects the original operator onto a finite-dimensional subspace spanned by basis functions \{\phi_1, \phi_2, \dots, \phi_n\} that satisfy the essential boundary conditions, reducing the problem to a finite generalized eigenvalue problem in the coefficients of the expansion. This approach, originally developed by Walter Ritz in 1909 as an extension of Lord Rayleigh's variational approach, is particularly effective for elliptic PDEs and forms the foundation of the finite element method in numerical analysis.

In this method, an approximate eigenfunction is sought as a linear combination u \approx \sum_{i=1}^n c_i \phi_i, where the coefficients c_i are determined by minimizing the Rayleigh quotient restricted to the subspace. The resulting Ritz values \lambda_k are the eigenvalues of the reduced problem and serve as upper bounds to the true eigenvalues \lambda_k^* of the original operator, thanks to the min-max characterization provided by the Courant-Fischer theorem. Specifically, for the k-th smallest eigenvalue, \lambda_k \geq \lambda_k^*, with equality achieved in the limit as the dimension n \to \infty and the basis becomes complete. Error bounds follow from the same characterization: the error \lambda_k - \lambda_k^* is controlled, up to a constant, by the square of the distance from the true eigenfunction to the trial subspace.

To implement the method, the basis functions are substituted into the weak form of the PDE, leading to the stiffness matrix K_{ij} = \int_\Omega \phi_i \mathcal{L} \phi_j \, d\Omega (where \mathcal{L} is the differential operator) and the mass matrix M_{ij} = \int_\Omega \phi_i \phi_j \, d\Omega. The eigenpairs (\lambda, \mathbf{v}) are then obtained by solving the generalized eigenvalue problem K \mathbf{v} = \lambda M \mathbf{v}, typically using standard linear algebra routines for symmetric positive-definite matrices. For conforming finite element approximations, where the basis consists of piecewise polynomials of degree p on a mesh of size h, approximation theory guarantees that the error in the k-th Ritz value satisfies |\lambda_k - \lambda_k^*| = O(h^{2p}) for smooth eigenfunctions, with faster convergence for lower modes due to their smoother behavior.

A representative example is the transverse vibration of a cantilever beam governed by the Euler-Bernoulli equation \frac{\partial^2}{\partial x^2} \left( EI \frac{\partial^2 w}{\partial x^2} \right) = \lambda \rho A w, with \lambda = \omega^2, clamped at x=0 (w(0) = w'(0) = 0) and free at x=L. Using a two-term polynomial basis \phi_1(x) = x^2/L^2, \phi_2(x) = x^3/L^3 (normalized for convenience), the stiffness matrix entries are K_{ij} = \frac{EI}{L^3} \int_0^1 \phi_i'' \phi_j'' \, d\xi and mass matrix M_{ij} = \rho A L \int_0^1 \phi_i \phi_j \, d\xi (with \xi = x/L). Computing the integrals yields K = \frac{EI}{L^3} \begin{pmatrix} 4 & 6 \\ 6 & 12 \end{pmatrix} and M = \rho A L \begin{pmatrix} \frac{1}{5} & \frac{1}{6} \\ \frac{1}{6} & \frac{1}{7} \end{pmatrix}; solving K \mathbf{v} = \lambda M \mathbf{v} gives the first approximate frequency \omega_1 \approx 3.533 \sqrt{EI / \rho A L^4} (error ~0.5% vs. exact 3.516) and the second \omega_2 \approx 34.8 \sqrt{EI / \rho A L^4} (poor approximation vs. exact second mode ~22.0, as low-order polynomials better capture lower modes). Adding terms like \phi_3(x) = x^4/L^4 refines the approximations further, converging monotonically from above as per the min-max principle.
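The two-term beam example reduces to a 2×2 generalized eigenproblem that can be solved directly. This sketch (NumPy/SciPy) works in the dimensionless form K v = \mu M v, so the printed frequencies are in units of \sqrt{EI / \rho A L^4}:

```python
import numpy as np
from scipy.linalg import eigh

# Dimensionless stiffness and mass matrices for the basis
# phi_1 = (x/L)^2, phi_2 = (x/L)^3 of the clamped-free beam.
Khat = np.array([[4.0, 6.0], [6.0, 12.0]])
Mhat = np.array([[1/5, 1/6], [1/6, 1/7]])

mu = eigh(Khat, Mhat, eigvals_only=True)   # generalized eigenvalues, ascending
omega = np.sqrt(mu)                        # frequencies in sqrt(EI / rho A L^4)
print(omega)   # ~ [3.533, 34.81]; exact first cantilever mode is 3.516
```

The first Ritz frequency overshoots the exact value by about 0.5%, while the second is far too high, matching the observation that a low-order polynomial basis resolves only the lowest modes well.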

Generalizations

Non-Hermitian Extensions

For non-Hermitian matrices A, the standard Rayleigh quotient R(A, x) = \frac{x^* A x}{x^* x} produces complex values rather than real ones, precluding the direct application of the min-max theorems that characterize eigenvalues for Hermitian cases. The set of all such quotients over unit vectors \|x\| = 1 defines the numerical range (or field of values) W(A), a closed and bounded convex subset of the complex plane. By the Toeplitz-Hausdorff theorem, W(A) is convex and contains the entire spectrum of A, providing an enclosure for the (possibly complex) eigenvalues without the ordered extremal properties of the Hermitian case. The numerical range relates closely to pseudospectra, which quantify eigenvalue sensitivity under perturbations and reveal transient amplification in non-normal systems; specifically, W(A) is contained within the \varepsilon-pseudospectrum for sufficiently large \varepsilon, highlighting non-normality effects.

Unlike Hermitian matrices, no real-valued min-max characterization exists for the eigenvalues, but the geometry of W(A) offers practical bounds, such as the spectral radius being at most the numerical radius, the maximum modulus over W(A). To approximate interior eigenvalues more effectively, the harmonic Rayleigh quotient H(A, x) = \frac{x^* A x}{x^* A^{-1} x} is used, particularly in projection methods like harmonic Ritz extraction, where it targets shifts near the spectrum's interior. In fluid dynamics, non-normal operators from discretized Navier-Stokes equations exhibit large transient growth despite stable eigenvalues; here, the Rayleigh quotient and numerical range elucidate such non-modal instabilities, as in shear flow analyses where pseudospectra predict amplification factors exceeding eigenvalue magnitudes by orders of magnitude. An illustrative example is the 2×2 Jordan block J = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, a prototypical non-normal matrix with eigenvalue 0; its numerical range is the closed disk \{ z \in \mathbb{C} : |z| \leq 1/2 \}, demonstrating how W(J) extends beyond the spectrum to capture sensitivity.
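The Jordan-block claim can be probed by sampling. This sketch (NumPy; the sample count and seed are arbitrary choices of ours) evaluates the complex Rayleigh quotient at random unit vectors and checks that the values stay inside the disk of radius 1/2 while approaching its boundary:

```python
import numpy as np

J = np.array([[0.0, 1.0], [0.0, 0.0]])   # 2x2 Jordan block, sole eigenvalue 0

rng = np.random.default_rng(1)
zs = []
for _ in range(20000):
    # random complex unit vector
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)
    zs.append(x.conj() @ J @ x)          # quotient = conj(x1) * x2
zs = np.array(zs)

print(np.abs(zs).max())   # close to 1/2: W(J) is the disk |z| <= 1/2
```

Every sampled quotient has modulus at most 1/2 (attained when |x_1| = |x_2|), even though the only eigenvalue is 0, showing how far the numerical range of a non-normal matrix can extend beyond the spectrum.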

Functional Forms

In infinite-dimensional Hilbert spaces, the Rayleigh quotient generalizes from finite-dimensional matrices to self-adjoint operators, providing a variational characterization of eigenvalues. For a self-adjoint operator A on a Hilbert space H, the Rayleigh quotient is defined as R(A, f) = \frac{\langle A f, f \rangle}{\langle f, f \rangle} for any f \in H with f \neq 0, where \langle \cdot, \cdot \rangle denotes the inner product on H. This form extends the finite-dimensional case, and R(A, f) equals the corresponding point of the spectrum \sigma(A) when f is an eigenvector. The quotient remains real-valued due to the self-adjointness of A, ensuring \langle A f, f \rangle = \langle f, A f \rangle.

In spectral theory, the Rayleigh quotient underpins the min-max principle for the eigenvalues of unbounded operators, which are common in applications like differential operators. For a densely defined, unbounded self-adjoint operator A with domain D(A) \subset H, the eigenvalues \lambda_k below the essential spectrum are characterized by \lambda_k = \min_{\dim V = k} \max_{f \in V, \|f\|=1, f \in D(A)} R(A, f) = \max_{\dim W = k-1} \min_{f \perp W, \|f\|=1, f \in D(A)} R(A, f), where the minima and maxima are taken over finite-dimensional subspaces V (of dimension k) and W (of dimension k-1) of H, with restrictions ensuring A f is well-defined. This extension requires careful handling of the domain D(A) to maintain self-adjointness and avoid essential spectrum intrusion, distinguishing it from the bounded, finite-dimensional case. Such formulations enable eigenvalue approximations via trial functions in D(A), linking to broader variational principles for operator spectra.

A key application arises in quantum mechanics, where the Rayleigh quotient computes the expectation value of the Hamiltonian H for a trial state \psi \in H, given by R(H, \psi) = \langle \psi | H | \psi \rangle / \langle \psi | \psi \rangle. This yields an upper bound to the ground-state energy via the variational principle, facilitating approximations when exact eigenstates are unavailable. For instance, in the quantum harmonic oscillator with H = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + \frac{1}{2} m \omega^2 x^2 on L^2(\mathbb{R}), the Gaussian trial function \psi(x) = \left( \frac{\alpha}{\pi} \right)^{1/4} e^{-\alpha x^2 / 2} (with \alpha > 0) minimizes R(H, \psi). Optimizing over \alpha gives R(H, \psi) = \frac{1}{2} \hbar \omega, exactly matching the ground-state energy, as the Gaussian coincides with the true wavefunction. In the context of Hilbert-Schmidt operators, which are compact operators on H, the Rayleigh quotient aids in spectral approximation by estimating eigenvalues through finite-rank projections, leveraging the operator's compactness for convergence guarantees. Furthermore, Kato's perturbation theory employs the Rayleigh quotient to analyze eigenvalue shifts under small perturbations of operators, deriving corrections via \delta \lambda \approx R(A + \epsilon B, f) - R(A, f) for unperturbed eigenvector f, with higher-order terms from resolvent expansions. This framework is foundational for stability analyses in quantum mechanics and applied spectral theory.
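The Gaussian trial calculation can be reproduced in natural units. For this trial family the standard closed-form expectation is \langle H \rangle(\alpha) = \hbar^2 \alpha / (4m) + m \omega^2 / (4\alpha); this sketch (NumPy; the grid over \alpha is our choice) minimizes it numerically:

```python
import numpy as np

hbar = m = omega = 1.0   # natural units

def energy(alpha):
    # <H> for the Gaussian trial state psi ~ exp(-alpha x^2 / 2):
    # kinetic term hbar^2 alpha / (4 m) plus potential term m omega^2 / (4 alpha)
    return hbar**2 * alpha / (4 * m) + m * omega**2 / (4 * alpha)

alphas = np.linspace(0.1, 5.0, 4901)
E = energy(alphas)
i = E.argmin()
print(alphas[i])   # optimal alpha = m * omega / hbar = 1.0
print(E[i])        # minimum <H> = hbar * omega / 2 = 0.5, the exact ground state
```

The minimum sits exactly at \alpha = m\omega/\hbar with energy \hbar\omega/2, confirming that the variational bound is tight here because the trial family contains the true ground state.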

    The 𝑙 2 normalized inverse, shifted inverse, and Rayleigh quotient iterations are classic algorithms for approximating an eigenvector of a symmetric matrix.
  17. [17]
    [PDF] lapack working note 162: the design and implementation of the mrrr ...
    Furthermore, it is used for the computation of singletons in the case that Rayleigh Quotient Correction does not converge to the wanted eigenvalue. It has ...
  18. [18]
    [PDF] Sturm-Liouville Eigenvalue Problems Motivation - Penn Math
    A Sturm-Liouville eigenvalue problem consists of the Sturm-Liouville ... The Rayleigh quotient provides the eigenvalues in terms of the eigenfunctions, as.
  19. [19]
    [PDF] STURM-LIOUVILLE THEORY - OSU Math
    Mar 23, 2015 · Recall that, using the Rayleigh quotient. R[u] = hu, Lui hu, ui. , u ∈ D, u 6= 0 we have λ1 = minR[u] then λ2 = min{R[u]|hu1,ui = 0}, λ3 ...<|control11|><|separator|>
  20. [20]
    Rayleigh-Ritz Method - an overview | ScienceDirect Topics
    The Rayleigh-Ritz method is a direct numerical method of approximating eigenvalues [62], originated in the context of solving physical boundary value problems.
  21. [21]
  22. [22]
    [PDF] Appendix A Rayleigh Ratios and the Courant-Fischer Theorem
    Rayleigh–Ritz theorem. As an application of Propositions A.1 and A.2, we give a proof of a proposition which is the key to the proof of. Theorem 2.2. Given ...
  23. [23]
    The convergence of the Rayleigh-Ritz Method in quantum chemistry
    Cite this article. Klahn, B., Bingel, W.A. The convergence of the Rayleigh-Ritz Method in quantum chemistry. Theoret. Chim. Acta 44, 9–26 (1977). https://doi ...
  24. [24]
    Rayleigh–Ritz–Galerkin Methods for Multidimensional Problems
    This paper is concerned with the application of the Ritz–Galerkin method to the numerical solution of singular boundary value problems of the type arising when ...
  25. [25]
    5.10 Rayleigh-Ritz method for vibrating elastic solids
    We wish to estimate the lowest natural frequency of vibration. The deformation of a beam can be characterized by the deflection of its neutral section. The ...
  26. [26]
  27. [27]
    Bounds for the Rayleigh Quotient and the Spectrum of Self-Adjoint ...
    If 𝑥 is an eigenvector of a self-adjoint bounded operator 𝐴 in a Hilbert space, then the RQ of the vector 𝑥 , denoted by 𝜌 ⁡ ( 𝑥 ) , is an exact eigenvalue of ...
  28. [28]
    Higher-order Rayleigh-quotient gradient effect on electron correlations
    Apr 4, 2023 · In quantum mechanics, the Rayleigh quotient E is commonly recognized as the expectation value of the energy for a given state ψ, E = ψ H ψ ψ S ...
  29. [29]
    The Rayleigh–Ritz Variation Method: An Illustrative Application to ...
    Mar 20, 2025 · The method is illustrated by the example of its application to the problem of anharmonicity of the HCl molecule vibrations.
  30. [30]
    [PDF] First-Order Perturbation Theory for Eigenvalues and Eigenvectors
    As Kato notes, these results are exactly what were anticipated by Rayleigh, Schr\"odinger, and others, but to prove them is by no means trivial, even in the ...