Eigenvalues and eigenvectors

In linear algebra, eigenvalues and eigenvectors are core concepts that describe the intrinsic scaling and directional properties of linear transformations represented by square matrices. Specifically, for an n \times n matrix A, an eigenvalue \lambda is a scalar such that there exists a nonzero v \in \mathbb{R}^n, called an eigenvector corresponding to \lambda, satisfying the equation A v = \lambda v. This equation implies that the linear transformation defined by A scales the eigenvector v by the factor \lambda without altering its direction. The eigenvalues of A are the roots of its characteristic equation \det(A - \lambda I_n) = 0, where I_n is the identity matrix, and there are at most n such values, counting multiplicities. The prefix "eigen" derives from the German word meaning "own" or "characteristic," reflecting how these vectors and scalars capture the self-similar behavior inherent to the matrix.

Historically, the concepts emerged in the early 19th century, with Joseph Fourier popularizing their use in 1822 for analyzing heat conduction through solids, and Augustin-Louis Cauchy establishing key properties for symmetric matrices in 1829, including eigenvalue interlacing. By the mid-20th century, computational methods for finding eigenvalues advanced significantly, driven by needs in science and engineering, as surveyed in historical reviews of numerical eigenvalue computation. Eigenvalues and eigenvectors underpin the spectral theory of matrices, enabling diagonalization when a full set of linearly independent eigenvectors exists, which simplifies matrix powers and exponentials as A^k = P D^k P^{-1}, where D is diagonal with eigenvalues on the diagonal and P has eigenvectors as columns. This structure is crucial for understanding matrix stability, as the eigenvalues determine whether powers of A converge (if all |\lambda| < 1) or diverge.

Beyond pure mathematics, they have broad applications: in differential equations, eigenvalues govern the stability and oscillatory behavior of solutions to systems like \dot{x} = A x; in principal component analysis (PCA), the largest eigenvalues of the covariance matrix identify directions of maximum variance for dimensionality reduction in data science; and in Google's PageRank algorithm, the dominant eigenvector (with eigenvalue 1) ranks web pages based on link structures modeled as Markov chains. Other uses include spectral clustering for graph partitioning in biology and social networks, image compression via singular value decomposition (closely related to eigenvalues), and stability analysis in control systems for mechanical and electrical engineering.

Basic Concepts

Definition

In linear algebra, eigenvalues and eigenvectors arise in the context of linear transformations on vector spaces. A linear transformation represented by a square matrix A maps vectors from a vector space to itself, preserving addition and scalar multiplication. For an n \times n square matrix A over the real or complex numbers, a scalar \lambda is called an eigenvalue of A if there exists a non-zero vector v in \mathbb{R}^n or \mathbb{C}^n such that A v = \lambda v; in this case, v is an eigenvector corresponding to \lambda. This equation can be expressed in component form as a system: for v = (v_1, \dots, v_n)^T, the i-th row gives \sum_{j=1}^n a_{ij} v_j = \lambda v_i for each i = 1, \dots, n, where a_{ij} are the entries of A. Intuitively, an eigenvector v represents a direction in the vector space that remains unchanged under the linear transformation defined by A, except for scaling by the factor \lambda; thus, eigenvalues quantify how the transformation stretches or compresses along those invariant directions. In two dimensions, this has a geometric interpretation: eigenvectors align with the invariant axes of the transformation, where an eigenvalue \lambda > 1 indicates expansion, 0 < \lambda < 1 indicates contraction, \lambda = 1 indicates no scaling, and \lambda < 0 indicates reflection along that axis. For example, a rotation matrix by an angle \theta has eigenvalues e^{\pm i\theta} (complex unless \theta = 0 or \pi), with no real scaling directions except in degenerate cases.
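As a quick numerical illustration, the following minimal Python sketch (assuming NumPy is available) verifies the defining relation A v = \lambda v for a simple matrix whose eigenvalues 2 and 0.5 correspond to expansion along one axis and contraction along the other.

```python
import numpy as np

# A simple 2x2 matrix whose eigenvectors are easy to see by hand.
A = np.array([[2.0, 0.0],
              [0.0, 0.5]])

# numpy returns the eigenvalues and a matrix whose columns are eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Check the defining relation A v = lambda v.
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.2f}, eigenvector = {v}")
```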

Historical background

The origins of eigenvalues and eigenvectors trace back to the 18th century, rooted in efforts to solve systems of linear differential equations and analyze mechanical systems. Leonhard Euler laid early groundwork in 1755 through his investigations into linear differential equations with constant coefficients, where he employed methods that implicitly involved characteristic roots to find solutions, particularly in the context of vibrations and celestial mechanics. These ideas prefigured the systematic treatment of the roots that determine the behavior of linear systems. In the 19th century, advancements accelerated with contributions from Joseph-Louis Lagrange and Pierre-Simon Laplace, who focused on secular equations arising in celestial mechanics to describe long-term perturbations in planetary orbits. In their work on the stability of the solar system during the 1770s and 1780s, the secular equation was formulated as a determinant whose roots captured the principal modes of variation, influencing later geometric and analytic developments. Augustin-Louis Cauchy extended this in his 1829 paper "Sur l'équation à l'aide de laquelle on détermine les inégalités séculaires des mouvements des planètes," introducing the term "valeur propre" for eigenvalues and proving that the roots of the symmetric matrices associated with these equations are real; he elaborated further in his 1851 memoir on linear differential equations, where he explored their integration and properties. Concurrently, Charles-François Sturm and Joseph Liouville developed the theory of second-order linear differential equations and boundary value problems now known as Sturm-Liouville theory. The formalization of these concepts occurred around 1900, with David Hilbert playing a pivotal role in spectral theory through his 1904 work "Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen," where he coined the German term "Eigenwert" (proper value) for eigenvalues and "Eigenfunktion" for eigenfunctions in the context of integral operators. This terminology, emphasizing the intrinsic or "proper" values of operators, was later carried into English as "eigenvalue." In the late 1920s, John von Neumann advanced the framework through his early papers and collaborations, such as the 1928 work with Hilbert, on the mathematical foundations of quantum mechanics, applying matrix mechanics to demonstrate how eigenvalues represent observable quantities like energy levels, solidifying their role in physics. These developments marked the transition from ad hoc solutions in mechanics to a unified abstract theory.

Matrix Eigenvalues and Eigenvectors

Characteristic polynomial and eigenvalues

The characteristic polynomial of an n \times n matrix A is defined as p_A(\lambda) = \det(\lambda I_n - A), where I_n denotes the n \times n identity matrix. The polynomial equation p_A(\lambda) = 0 is known as the characteristic equation, and the eigenvalues of A are the roots of this polynomial. By the Fundamental Theorem of Algebra, the characteristic polynomial has exactly n roots in the complex numbers, counted with algebraic multiplicity. The characteristic polynomial is monic of degree n, with leading coefficient 1 for the \lambda^n term. It can be expressed in terms of the eigenvalues \lambda_1, \dots, \lambda_n of A as p_A(\lambda) = \prod_{i=1}^n (\lambda - \lambda_i). The coefficients of this polynomial are related to the elementary symmetric functions of the eigenvalues, up to sign, via Vieta's formulas. Specifically, the coefficient of \lambda^{n-1} is -\sum_{i=1}^n \lambda_i = -\operatorname{trace}(A), and the constant term is (-1)^n \prod_{i=1}^n \lambda_i = (-1)^n \det(A). To compute p_A(\lambda), the determinant \det(\lambda I_n - A) is expanded using the Leibniz formula, which sums signed products of matrix entries over all permutations of the indices. The set of all eigenvalues of A is called the spectrum of A, denoted \sigma(A). For real matrices, eigenvalues may be complex, in which case they occur in conjugate pairs. As an example, consider the 2 \times 2 matrix A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. The characteristic polynomial is p_A(\lambda) = \det\begin{pmatrix} \lambda - a & -b \\ -c & \lambda - d \end{pmatrix} = \lambda^2 - (a + d)\lambda + (ad - bc) = \lambda^2 - \operatorname{trace}(A) \lambda + \det(A). The eigenvalues are the roots of this quadratic equation, given explicitly by the quadratic formula: \lambda = \frac{\operatorname{trace}(A) \pm \sqrt{ [\operatorname{trace}(A)]^2 - 4 \det(A) }}{2}. This formula yields the eigenvalues directly from the trace and determinant without a full determinant expansion, though numerical stability considerations apply in computations.
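The trace-determinant formula for the 2 \times 2 case can be checked numerically. The sketch below (assuming NumPy, with a hypothetical helper eigenvalues_2x2) compares it against a general eigenvalue routine.

```python
import numpy as np

def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr, det = np.trace(A), np.linalg.det(A)
    disc = np.lib.scimath.sqrt(tr**2 - 4 * det)  # complex square root if needed
    return (tr + disc) / 2, (tr - disc) / 2

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(eigenvalues_2x2(A))    # (3.0, 1.0) from the quadratic formula
print(np.linalg.eigvals(A))  # agrees with the general routine
```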

Eigenspaces and multiplicities

The eigenspace corresponding to an eigenvalue \lambda of a square matrix A is the null space of A - \lambda I, denoted \ker(A - \lambda I), consisting of all eigenvectors associated with \lambda together with the zero vector. This subspace captures the directions in which A acts as scalar multiplication by \lambda. The dimension of this eigenspace, known as the geometric multiplicity m_g(\lambda), measures the number of linearly independent eigenvectors for \lambda. The algebraic multiplicity m_a(\lambda) of \lambda is defined as the multiplicity of \lambda as a root of the characteristic polynomial \det(A - \lambda I). A fundamental property is that m_a(\lambda) \geq m_g(\lambda) for every eigenvalue \lambda, with equality holding if and only if the eigenspace coincides with the generalized eigenspace for \lambda. In the Jordan canonical form of A, the algebraic multiplicity m_a(\lambda) equals the sum of the sizes of all Jordan blocks associated with \lambda, while the geometric multiplicity m_g(\lambda) equals the number of such blocks. Additionally, the trace of A, \operatorname{tr}(A), equals the sum of all eigenvalues counted with their algebraic multiplicities: \operatorname{tr}(A) = \sum_\lambda m_a(\lambda) \lambda. For matrices that admit a basis of eigenvectors (those that are diagonalizable), an eigenbasis can be formed by selecting bases from the eigenspaces of distinct eigenvalues, ensuring the entire space is spanned by these eigenvectors. This requires m_g(\lambda) = m_a(\lambda) for all \lambda. Consider the 2 \times 2 matrix A = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}. The characteristic polynomial is (\lambda - 2)^2 = 0, so \lambda = 2 has algebraic multiplicity m_a(2) = 2. The eigenspace is \ker(A - 2I) = \ker\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, which has dimension m_g(2) = 1, spanned by \begin{pmatrix} 1 \\ 0 \end{pmatrix}. Here, m_a(2) > m_g(2), illustrating a non-diagonalizable case with a single Jordan block of size 2.
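The multiplicities in this example can be verified exactly with a computer algebra system. The short sketch below (assuming SymPy) reports the algebraic and geometric multiplicity of \lambda = 2 and confirms that the matrix is not diagonalizable.

```python
import sympy as sp

A = sp.Matrix([[2, 1],
               [0, 2]])

# eigenvects() returns (eigenvalue, algebraic multiplicity, basis of the eigenspace).
for lam, alg_mult, basis in A.eigenvects():
    geo_mult = len(basis)  # geometric multiplicity = dimension of the eigenspace
    print(lam, alg_mult, geo_mult)   # 2, 2, 1  -> defective eigenvalue

print(A.is_diagonalizable())  # False
```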

Diagonalization and spectral decomposition

A square matrix A \in \mathbb{C}^{n \times n} is diagonalizable if it possesses a complete set of n linearly independent eigenvectors, allowing the representation A = P D P^{-1}, where D is a diagonal matrix containing the eigenvalues of A on its diagonal, and the columns of the invertible matrix P are the corresponding eigenvectors. This condition holds if and only if the geometric multiplicity equals the algebraic multiplicity for each eigenvalue of A. A sufficient but not necessary condition for diagonalizability is that all eigenvalues of A are distinct. The eigendecomposition A = P D P^{-1} simplifies powers of the matrix to A^k = P D^k P^{-1}, where D^k is obtained by raising each diagonal entry to the power k, facilitating analysis in applications such as dynamical systems and Markov chains. For matrices that are not diagonalizable, the Jordan canonical form provides a generalized block-diagonal decomposition with Jordan blocks for eigenvalues having deficient eigenspaces, though details of its construction are addressed in computational methods. The spectral theorem extends diagonalizability to real symmetric matrices, stating that every real symmetric matrix A is orthogonally diagonalizable: A = Q D Q^T, where Q is an orthogonal matrix (Q^T Q = I) whose columns are orthonormal eigenvectors, and D is diagonal with real eigenvalues. This decomposition ensures the eigenvalues are real and the eigenvectors form an orthonormal basis. For positive semi-definite symmetric matrices, all eigenvalues in D are non-negative. As an illustrative example, consider the real symmetric matrix A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}. The eigenvalues are \lambda_1 = 3 and \lambda_2 = 1, with corresponding orthonormal eigenvectors \mathbf{u}_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} and \mathbf{u}_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. The spectral decomposition is then A = Q D Q^T, \quad D = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}, \quad Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}, verifying Q^T Q = I and A = Q D Q^T.
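A numerical sketch of this example (assuming NumPy) computes the orthogonal diagonalization and checks the identities Q^T Q = I, A = Q D Q^T, and A^k = Q D^k Q^T.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is specialized for symmetric/Hermitian matrices: real eigenvalues,
# orthonormal eigenvectors returned as the columns of Q.
eigenvalues, Q = np.linalg.eigh(A)
D = np.diag(eigenvalues)

assert np.allclose(Q.T @ Q, np.eye(2))   # Q is orthogonal
assert np.allclose(Q @ D @ Q.T, A)       # spectral decomposition A = Q D Q^T
assert np.allclose(Q @ np.diag(eigenvalues**3) @ Q.T,
                   np.linalg.matrix_power(A, 3))  # A^k = Q D^k Q^T
print(eigenvalues)  # [1. 3.]
```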

Properties and Characterizations

Variational characterization

The variational characterization provides a way to describe the eigenvalues of a Hermitian matrix through optimization principles involving quadratic forms. For a Hermitian matrix A \in \mathbb{C}^{n \times n}, the Rayleigh quotient is defined as R_A(x) = \frac{x^* A x}{x^* x} for any nonzero vector x \in \mathbb{C}^n, where x^* denotes the conjugate transpose of x. When restricted to unit vectors with \|x\| = 1, the maximum value of R_A(x) equals the largest eigenvalue \lambda_{\max}(A) of A, achieved at a corresponding eigenvector, while the minimum value equals the smallest eigenvalue \lambda_{\min}(A). For real symmetric matrices, the eigenvalues can be viewed as the stationary points of the Rayleigh quotient under the constraint \|x\| = 1. Specifically, if x is an eigenvector corresponding to eigenvalue \lambda, then R_A(x) = \lambda, and the critical points of R_A on the unit sphere yield exactly the eigenvalues. The full set of eigenvalues admits a more refined characterization via the Courant-Fischer min-max theorem. Assuming the eigenvalues of the Hermitian matrix A are ordered as \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n, the k-th smallest eigenvalue is given by \lambda_k = \min_{\dim S = k} \max_{\substack{x \in S \\ \|x\| = 1}} x^* A x = \max_{\dim T = n-k+1} \min_{\substack{x \in T \\ \|x\| = 1}} x^* A x, where the minimum is over all subspaces S \subseteq \mathbb{C}^n of dimension k, and the maximum is over all subspaces T \subseteq \mathbb{C}^n of dimension n-k+1. A sketch of the proof relies on the spectral decomposition of A. Since A is Hermitian, it admits an orthonormal basis of eigenvectors \{v_1, \dots, v_n\} with A v_i = \lambda_i v_i. For any unit vector x = \sum_{i=1}^n c_i v_i with \sum |c_i|^2 = 1, the quadratic form expands as x^* A x = \sum_{i=1}^n \lambda_i |c_i|^2. To establish the min-max formula, consider the subspace S_k = \operatorname{span}\{v_1, \dots, v_k\}; the maximum of x^* A x over unit vectors in S_k is \lambda_k, since the eigenvectors for the larger eigenvalues are orthogonal to this span. For an arbitrary subspace S of dimension k, a dimension count shows that S intersects \operatorname{span}\{v_k, \dots, v_n\} nontrivially, so the maximum of the quadratic form over S is at least \lambda_k; together these observations yield equality. The max-min form follows dually by considering the complementary subspace. This variational approach enables bounding eigenvalues without computing the full spectrum, for instance by evaluating the Rayleigh quotient on trial subspaces to obtain upper or lower estimates for specific \lambda_k.
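The following sketch (assuming NumPy) illustrates the variational bounds: every Rayleigh quotient of a random symmetric matrix lies between \lambda_{\min} and \lambda_{\max}, and evaluating the quotient at an eigenvector recovers the corresponding eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                       # a random real symmetric matrix

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

eigenvalues, V = np.linalg.eigh(A)      # sorted ascending, orthonormal columns

# The Rayleigh quotient of any nonzero vector lies between the extreme eigenvalues.
for _ in range(1000):
    x = rng.standard_normal(n)
    r = rayleigh(A, x)
    assert eigenvalues[0] - 1e-12 <= r <= eigenvalues[-1] + 1e-12

# Evaluated at an eigenvector, it recovers the eigenvalue itself.
print(rayleigh(A, V[:, -1]), eigenvalues[-1])   # both equal lambda_max
```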

Left and right eigenvectors

In linear algebra, right eigenvectors of a square matrix A are defined by the equation A \mathbf{v} = \lambda \mathbf{v}, where \lambda is an eigenvalue and \mathbf{v} is a nonzero column vector. This standard formulation captures the action of A on column vectors from the right. For a more complete description, particularly with non-symmetric matrices, left eigenvectors are introduced, satisfying \mathbf{w}^* A = \lambda \mathbf{w}^*, where \mathbf{w}^* denotes the row vector that is the conjugate transpose of a column vector \mathbf{w}. Equivalently, in column vector form, this is A^* \mathbf{w} = \bar{\lambda} \mathbf{w}, where A^* is the adjoint (conjugate transpose) of A. Left eigenvectors (as columns) thus correspond to the right eigenvectors of the matrix A^* with eigenvalue \bar{\lambda}. The eigenvalues of A^* are the complex conjugates of the eigenvalues of A. However, for non-symmetric matrices, the left and right eigenvectors generally differ, unlike the case for symmetric or Hermitian matrices, where they coincide up to scaling. A key property is biorthogonality: if \{\mathbf{v}_i\} and \{\mathbf{w}_i\} are the right and left eigenvectors corresponding to distinct eigenvalues \{\lambda_i\}, then \mathbf{w}_i^* \mathbf{v}_j = \delta_{ij} after appropriate normalization, forming a biorthogonal basis. Consider the non-symmetric 2 \times 2 matrix A = \begin{pmatrix} 3 & 1 \\ 0 & 2 \end{pmatrix}. The eigenvalues are \lambda_1 = 3 and \lambda_2 = 2. The right eigenvectors are \mathbf{v}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} for \lambda_1 and \mathbf{v}_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix} for \lambda_2. The left eigenvectors (as rows) are \mathbf{w}_1^* = \begin{pmatrix} 1 & 1 \end{pmatrix} for \lambda_1 and \mathbf{w}_2^* = \begin{pmatrix} 0 & 1 \end{pmatrix} for \lambda_2, demonstrating the difference between left and right vectors and satisfying biorthogonality relations such as \mathbf{w}_1^* \mathbf{v}_2 = 0 and \mathbf{w}_2^* \mathbf{v}_1 = 0.
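These relations can be reproduced with a standard solver. The sketch below (assuming SciPy, whose scipy.linalg.eig can also return left eigenvectors) checks the biorthogonality of the left and right eigenvectors of the example matrix.

```python
import numpy as np
from scipy.linalg import eig

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

# left=True also returns left eigenvectors: the columns of vl satisfy vl^H A = lambda vl^H.
w, vl, vr = eig(A, left=True, right=True)
print(w)   # [3.+0.j  2.+0.j]

# Biorthogonality: left and right eigenvectors of *different* eigenvalues are orthogonal.
for i in range(2):
    for j in range(2):
        inner = vl[:, i].conj() @ vr[:, j]
        print(i, j, np.round(inner, 6))   # off-diagonal inner products are ~0
```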

Additional matrix properties

The trace of a square matrix A, defined as the sum of its diagonal entries, equals the sum of its eigenvalues, counted with algebraic multiplicity. Similarly, the determinant of A is the product of its eigenvalues, also counted with multiplicity. These relations follow from the coefficients of the characteristic polynomial and hold for any complex square matrix. Eigenvalues are invariant under similarity transformations; that is, if B = P^{-1} A P for an invertible matrix P, then A and B share the same eigenvalues with identical multiplicities. This property underscores that eigenvalues capture intrinsic spectral information independent of the basis chosen to represent the linear transformation. The Gershgorin circle theorem provides bounds on eigenvalue locations: every eigenvalue of an n \times n complex matrix A = [a_{ij}] lies in at least one of the disks in the complex plane centered at a_{ii} with radius \sum_{j \neq i} |a_{ij}|, for i = 1, \dots, n. Named after Sergei Gershgorin, this result offers a simple way to estimate eigenvalue regions without computing the spectrum explicitly, particularly useful for matrices with dominant diagonal entries. For symmetric matrices, positive definiteness is equivalent to all eigenvalues being positive. A real symmetric matrix A is positive definite if x^T A x > 0 for all nonzero x \in \mathbb{R}^n, and this condition aligns directly with its spectral properties, ensuring the quadratic form remains positive. As an illustrative example, consider a 2 \times 2 matrix A arising in a linear system \dot{x} = A x. The eigenvalues determine stability: the origin is asymptotically stable if both eigenvalues have negative real parts, which occurs precisely when \operatorname{tr}(A) < 0 and \det(A) > 0. For instance, with A = \begin{pmatrix} -1 & 1 \\ 0 & -2 \end{pmatrix}, \operatorname{tr}(A) = -3 < 0 and \det(A) = 2 > 0, confirming stability without explicit eigenvalue computation.
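A short numerical sketch (assuming NumPy) illustrates the Gershgorin bound by forming the disk centers and radii for a small matrix and checking that every computed eigenvalue falls inside at least one disk.

```python
import numpy as np

A = np.array([[ 4.0,  1.0, 0.5],
              [ 0.2, -3.0, 0.3],
              [ 0.1,  0.4, 1.0]])

centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)   # off-diagonal row sums

for lam in np.linalg.eigvals(A):
    # Every eigenvalue lies in at least one Gershgorin disk.
    in_some_disk = np.any(np.abs(lam - centers) <= radii)
    print(np.round(lam, 3), in_some_disk)
```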

Generalizations

Linear operators and eigenfunctions

In infinite-dimensional settings, the notions of eigenvalues and eigenvectors from finite-dimensional linear algebra extend to linear operators on Banach or Hilbert spaces, where non-zero elements satisfying Tf = \lambda f are termed eigenfunctions corresponding to the eigenvalue \lambda. The point spectrum of an operator T, consisting precisely of its eigenvalues, comprises those \lambda \in \mathbb{C} for which T - \lambda I is not injective, meaning there exists a non-trivial kernel. This framework applies to spaces of functions, such as L^2 spaces on intervals, where eigenfunctions play a central role in solving differential equations. A simple example is the differentiation operator D = \frac{d}{dx} acting on the space of polynomials, where constant functions serve as eigenfunctions for the eigenvalue \lambda = 0, since their derivatives vanish identically. More generally, the spectrum of a bounded linear operator T on a Banach space decomposes into the union of the point spectrum (eigenvalues), the continuous spectrum (where T - \lambda I is injective with dense but non-surjective range), and the residual spectrum (where T - \lambda I is injective but the range is not dense). For compact operators on infinite-dimensional Banach spaces, the non-zero elements of the spectrum are isolated eigenvalues of finite multiplicity, forming a set that can only accumulate at 0; the point 0 itself belongs to the spectrum. This discreteness contrasts with the potentially continuous spectra of non-compact operators and facilitates spectral decompositions in applications like integral equations. In the context of boundary value problems, Sturm-Liouville theory addresses differential operators of the form L = \frac{d}{dx} \left( p(x) \frac{d}{dx} \right) + q(x) on finite intervals with appropriate boundary conditions, yielding real eigenvalues \lambda_n that are simple and countable with \lambda_1 < \lambda_2 < \cdots \to \infty. The corresponding eigenfunctions \phi_n are orthogonal with respect to the weight function \sigma(x) in the inner product \int_a^b \phi_n(x) \phi_m(x) \sigma(x) \, dx = 0 for n \neq m, forming a complete orthogonal basis for the underlying Hilbert space. This orthogonality, established via Green's identity, underpins expansions of arbitrary functions in series of eigenfunctions.
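Eigenfunctions of differential operators can be approximated by discretizing the operator. The sketch below (assuming NumPy) is a rough illustrative discretization: a finite-difference matrix for -d^2/dx^2 on (0, \pi) with Dirichlet boundary conditions, whose exact eigenvalues are n^2 with eigenfunctions \sin(nx).

```python
import numpy as np

# Finite-difference approximation of L = -d^2/dx^2 on (0, pi) with
# Dirichlet boundary conditions; exact eigenvalues are n^2, n = 1, 2, ...
N = 200
h = np.pi / (N + 1)
main = 2.0 * np.ones(N) / h**2
off = -1.0 * np.ones(N - 1) / h**2
L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

eigenvalues, eigenvectors = np.linalg.eigh(L)
print(eigenvalues[:4])   # approximately [1, 4, 9, 16]
```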

Spectral theory overview

Spectral theory provides a framework for understanding the structure of linear operators on Hilbert spaces through their eigenvalues and eigenvectors, generalizing finite-dimensional concepts to infinite dimensions. The spectral theorem for normal operators states that every bounded normal operator T on a separable Hilbert space \mathcal{H} admits a spectral decomposition of the form T = \int_{\sigma(T)} \lambda \, dE(\lambda), where \sigma(T) denotes the spectrum of T, and E is a unique projection-valued measure (spectral measure) supported on the Borel subsets of \mathbb{C} such that E(\mathbb{C}) = I, the identity operator on \mathcal{H}. This integral representation expresses T as a "continuous superposition" of scalar multiples of projections, with E(\Delta) projecting onto the subspace spanned by generalized eigenvectors associated with eigenvalues in the Borel set \Delta \subseteq \sigma(T). For self-adjoint operators, a subclass of normal operators, the spectral theorem yields particularly concrete consequences. The spectrum \sigma(T) is real and contained in \mathbb{R}, and the Hilbert space \mathcal{H} decomposes orthogonally as a direct sum of closed subspaces \mathcal{H} = \bigoplus_{\lambda \in \sigma_p(T)} \ker(T - \lambda I) \oplus \overline{\operatorname{ran} E(\sigma(T) \setminus \sigma_p(T))}, where \sigma_p(T) is the point spectrum, and the continuous part corresponds to the range of the spectral projections over the continuous spectrum. This orthogonal decomposition ensures that eigenvectors (or generalized eigenvectors) for distinct eigenvalues are orthogonal, facilitating the analysis of operator behavior through its spectral components. The spectral theorem underpins the development of functional calculus for normal operators, allowing the definition of f(T) for Borel measurable functions f: \sigma(T) \to \mathbb{C} via f(T) = \int_{\sigma(T)} f(\lambda) \, dE(\lambda). This construction preserves the algebraic structure, such that f(T) is normal if T is, and it enables the computation of functions like exponentials e^{i t T} or powers T^n, which are essential in applications like quantum mechanics. The resolution of the identity provided by E is a projection-valued measure satisfying E(\emptyset) = 0, E(\mathbb{C}) = I, and E(\Delta_1) E(\Delta_2) = E(\Delta_1 \cap \Delta_2) for Borel sets \Delta_1, \Delta_2, with the projections E(\{\lambda\}) isolating point masses corresponding to eigenvalues and their associated generalized eigenspaces. An illustrative example is the multiplication operator M_m on L^2(X, \mu), where m \in L^\infty(X, \mu) is a bounded measurable function and (M_m \phi)(x) = m(x) \phi(x) for \phi \in L^2(X, \mu). This operator is normal (self-adjoint if m is real-valued), and its spectrum \sigma(M_m) coincides with the essential range of m, defined as the set of \lambda \in \mathbb{C} such that \mu(\{ x : |m(x) - \lambda| < \epsilon \}) > 0 for every \epsilon > 0. The spectral measure E is given by E(\Delta) \phi = \chi_{\{ m \in \Delta \}} \phi, where \chi is the characteristic function, projecting onto the subspace of functions supported where m falls in \Delta.

Associative algebras and representations

In a unital associative algebra over the complex numbers, the spectrum of an element a is defined as the set \sigma(a) = \{\lambda \in \mathbb{C} \mid a - \lambda \cdot 1 \text{ is not invertible in the algebra}\}. This generalizes the notion of eigenvalues from matrix algebras, where the spectrum coincides with the set of eigenvalues of the matrix. For finite-dimensional algebras, the spectrum of a matches the eigenvalues arising from its action via left multiplication in the regular representation. In commutative unital associative algebras over \mathbb{C}, the characters (algebra homomorphisms \chi: A \to \mathbb{C}) provide a concrete realization of the spectrum, with \sigma(a) = \{\chi(a) \mid \chi \text{ a character of } A\}. Each character has a kernel that is a maximal ideal of A, establishing a bijection between the set of characters and the maximal ideals when A is finitely generated. These characters correspond to one-dimensional representations, where \chi(a) is the eigenvalue of a in that representation. In the representation theory of associative algebras or finite groups, a representation \rho: A \to \mathrm{End}(V) or \rho: G \to \mathrm{GL}(V) maps elements a \in A or g \in G to endomorphisms whose eigenvalues capture the "spectral content" of a or g in the module V. Irreducible representations often allow triangularization over \mathbb{C}, with eigenvalues appearing on the diagonal in a suitable basis, though full diagonalization requires additional structure such as commutativity. The character \chi_\rho(g) = \mathrm{tr}(\rho(g)) equals the sum of these eigenvalues, providing a conjugation-invariant summary of the spectrum across the representation. For finite groups, the eigenvalues of \rho(g) are roots of unity, reflecting the finite order of g. Schur's lemma plays a key role in controlling eigenvalue multiplicities within irreducible representations: it states that any operator commuting with all \rho(g) (or \rho(a)) must be a scalar multiple of the identity. Consequently, central elements of the algebra or group act as scalars in irreducible representations, yielding a single eigenvalue with multiplicity equal to the representation dimension. For non-central elements, the lemma implies that eigenspaces for distinct eigenvalues cannot be interchanged by representation operators without contradicting irreducibility, leading to balanced multiplicities determined by the representation's structure; in unitary irreducible representations of compact groups, conjugate eigenvalues occur with equal multiplicity. A concrete illustration arises in the representation theory of the compact Lie group \mathrm{SO}(3). Its irreducible representations, labeled by non-negative integers l = 0, 1, 2, \dots, have dimension 2l + 1. The angular momentum operators, which generate the Lie algebra \mathfrak{so}(3), act on these representations; for instance, the operator corresponding to rotation around the z-axis has eigenvalues m = -l, -l+1, \dots, l-1, l, each with multiplicity one. The Casimir operator J^2, associated with the quadratic form on the Lie algebra, acts as the scalar l(l+1) times the identity, with multiplicity 2l + 1. These eigenvalues encode the decomposition of representations and underpin the theory's classification.
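As a concrete numerical check of the \mathrm{SO}(3) example, the sketch below (assuming NumPy) uses the standard spin-1 (l = 1) matrices for the \mathfrak{so}(3) generators: the z-rotation generator has eigenvalues m = -1, 0, 1, and the Casimir operator acts as l(l+1) = 2 times the identity.

```python
import numpy as np

# Standard spin-1 (l = 1) angular momentum matrices (hbar = 1), a concrete
# 3-dimensional irreducible representation of the so(3) Lie algebra.
s = 1 / np.sqrt(2)
Jx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Jy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Jz = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]], dtype=complex)

print(np.linalg.eigvalsh(Jz))            # m = -1, 0, 1, each with multiplicity one

J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz         # Casimir operator
print(np.allclose(J2, 2 * np.eye(3)))    # acts as l(l+1) = 2 times the identity
```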

Computation

Analytical methods

The classical analytical approach to finding eigenvalues and eigenvectors of an n \times n matrix A begins by forming the characteristic equation \det(A - \lambda I) = 0, where I is the identity matrix; the roots \lambda of this equation are the eigenvalues. For each eigenvalue \lambda_k, the corresponding eigenvectors v_k are obtained by solving the homogeneous system (A - \lambda_k I) v = 0, which yields a basis for the eigenspace associated with \lambda_k. This method provides exact solutions when the characteristic polynomial can be solved symbolically. For 2×2 matrices, the characteristic polynomial is quadratic, \lambda^2 - \operatorname{tr}(A) \lambda + \det(A) = 0, and the eigenvalues are given explicitly by the quadratic formula \lambda = \frac{\operatorname{tr}(A) \pm \sqrt{[\operatorname{tr}(A)]^2 - 4 \det(A)}}{2}. Eigenvectors follow by substituting each eigenvalue into the homogeneous system. For 3×3 matrices, the characteristic polynomial is cubic, solvable in closed form using Cardano's formula, though the expressions involve cube roots and may yield complex intermediate values even for real matrices. The Cayley-Hamilton theorem asserts that A satisfies its own characteristic polynomial, p(A) = 0, enabling symbolic reduction of matrix powers to lower-degree polynomials via the eigenvalues, which aids in verifying solutions or computing functions of A analytically without explicit eigenvalue extraction in some cases. Special cases simplify the process: for diagonal matrices, the eigenvalues are precisely the diagonal entries, and the eigenvectors are the standard basis vectors. For upper or lower triangular matrices, the eigenvalues are also the diagonal entries, with eigenvectors found by solving the triangular system (A - \lambda_i I) v = 0 via forward or back substitution. Consider a 3×3 matrix representing a rotation by angle \theta around the z-axis: A = \begin{pmatrix} \cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. The characteristic equation \det(A - \lambda I) = 0 factors as (\lambda - 1)(\lambda^2 - 2\lambda \cos \theta + 1) = 0, yielding eigenvalues \lambda_1 = 1, \lambda_{2,3} = \cos \theta \pm i \sin \theta = e^{\pm i \theta}. The eigenvector for \lambda_1 = 1 is v_1 = (0, 0, 1)^T, along the rotation axis. For the complex eigenvalues, the eigenvectors are v_2 = (1, -i, 0)^T and v_3 = (1, i, 0)^T, spanning the plane of rotation in the complex sense; real approximations can be formed from their linear combinations for practical use.
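The rotation-matrix example can be worked symbolically. The sketch below (assuming SymPy) forms the characteristic polynomial and its roots for general \theta.

```python
import sympy as sp

theta = sp.symbols('theta', real=True)
A = sp.Matrix([[sp.cos(theta), -sp.sin(theta), 0],
               [sp.sin(theta),  sp.cos(theta), 0],
               [0,              0,             1]])

lam = sp.symbols('lambda')
# Characteristic polynomial det(lambda*I - A); mathematically it factors as
# (lambda - 1)*(lambda**2 - 2*cos(theta)*lambda + 1).
p = A.charpoly(lam).as_expr()
print(sp.factor(p))

# Eigenvalues: 1 and cos(theta) +/- sqrt(cos(theta)**2 - 1), i.e. e^{+/- i*theta}.
print(A.eigenvals())
```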

Iterative techniques

Iterative techniques provide efficient ways to approximate the dominant eigenvalues and eigenvectors of large matrices where exact eigendecomposition is computationally prohibitive. These methods rely on repeated matrix-vector multiplications to refine initial guesses, assuming the matrix has a dominant eigenvalue whose absolute value exceeds that of all others. Under this condition, the iterates converge linearly to the corresponding eigenvector. The power iteration, also known as the power method, is the simplest such algorithm. Starting from an initial vector \mathbf{v}_0, it generates the sequence \mathbf{v}_{k+1} = \frac{A \mathbf{v}_k}{\| A \mathbf{v}_k \|}, where \|\cdot\| denotes a vector norm, typically the Euclidean norm. If the eigenvalues are ordered by |\lambda_1| > |\lambda_2| \geq \cdots \geq |\lambda_n|, and \mathbf{v}_0 has a nonzero component in the direction of the dominant eigenvector \mathbf{u}_1, then \mathbf{v}_k converges to \mathbf{u}_1 as k \to \infty. The corresponding eigenvalue \lambda_1 can be approximated by the Rayleigh quotient \mathbf{v}_k^T A \mathbf{v}_k. This method originates from early 20th-century work on iterative approximations for linear operators and is widely used for its minimal storage requirements, needing only O(n) space beyond the matrix itself. The convergence of power iteration is linear, with the error typically reducing by a factor of \left| \frac{\lambda_2}{\lambda_1} \right| per iteration in the asymptotic regime. More precisely, the angle \theta_k between \mathbf{v}_k and \mathbf{u}_1 satisfies \tan \theta_{k+1} \approx \left| \frac{\lambda_2}{\lambda_1} \right| \tan \theta_k, implying that the number of iterations needed for a desired accuracy grows logarithmically with the reciprocal of the error tolerance and inversely with \log \left| \frac{\lambda_1}{\lambda_2} \right|. This rate makes the method effective when the gap between the two largest eigenvalue magnitudes is large but slow otherwise. To illustrate, consider the symmetric matrix A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}, which has eigenvalues \lambda_1 = 4 and \lambda_2 = 2, with eigenvectors \mathbf{u}_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} and \mathbf{u}_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. Starting with \mathbf{v}_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, the first few iterations yield:
  • \mathbf{v}_1 = \frac{A \mathbf{v}_0}{\|A \mathbf{v}_0\|} = \frac{1}{\sqrt{10}} \begin{pmatrix} 3 \\ 1 \end{pmatrix} \approx \begin{pmatrix} 0.949 \\ 0.316 \end{pmatrix}
  • \mathbf{v}_2 \approx \begin{pmatrix} 0.858 \\ 0.515 \end{pmatrix}
  • \mathbf{v}_3 \approx \begin{pmatrix} 0.789 \\ 0.614 \end{pmatrix}
The Rayleigh quotients are approximately 3.60, 3.88, and 3.97, approaching \lambda_1 = 4 and demonstrating linear convergence governed by the ratio |\lambda_2 / \lambda_1| = 0.5. Inverse iteration extends the method to target the eigenvalue nearest a chosen shift \mu by applying power iteration to the shifted inverse (A - \mu I)^{-1}. The update becomes \mathbf{v}_{k+1} = \frac{(A - \mu I)^{-1} \mathbf{v}_k}{\| (A - \mu I)^{-1} \mathbf{v}_k \|}, where the dominant eigenvalue of the inverse corresponds to the original eigenvalue closest to \mu. Each step requires solving a linear system (A - \mu I) \mathbf{w} = \mathbf{v}_k, which can be done efficiently by reusing a single factorization when \mu is fixed. This technique accelerates convergence for interior eigenvalues when a good shift is available, with the rate determined by the ratio of distances to the nearest eigenvalues. Orthogonal iteration generalizes the power method to compute multiple dominant eigenvectors simultaneously by operating on an initial orthonormal basis. Starting with an n \times p matrix Q_0 whose columns form an orthonormal basis for a p-dimensional subspace (with p equal to the number of desired eigenvectors), the method computes Y_{k+1} = A Q_k and then orthogonalizes via QR decomposition to obtain Q_{k+1} R_{k+1} = Y_{k+1}, yielding Q_{k+1} as the updated basis. The columns of Q_k converge to the dominant p eigenvectors, assuming a sufficient gap |\lambda_p| > |\lambda_{p+1}|, with linear convergence governed by |\lambda_{p+1} / \lambda_p|. This approach is particularly useful for symmetric matrices and maintains numerical stability through repeated orthogonalization.
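The power-iteration behavior described above can be reproduced in a few lines. The sketch below (assuming NumPy, with a hypothetical helper power_iteration) approximates the dominant eigenpair of the example matrix.

```python
import numpy as np

def power_iteration(A, num_iters=50, seed=0):
    """Approximate the dominant eigenpair of A by repeated multiplication."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    # The Rayleigh quotient gives the eigenvalue estimate for the current iterate.
    return v @ A @ v, v

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
lam, v = power_iteration(A)
print(lam)   # ~4.0
print(v)     # ~[0.707, 0.707] up to sign
```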

Numerical algorithms

Numerical algorithms for computing eigenvalues and eigenvectors address the full spectrum of matrices, particularly through direct methods for dense problems and subspace projections for large-scale sparse cases. These approaches aim to deliver the complete set of eigenvalues (and optionally eigenvectors) with high accuracy and efficiency, scaling from moderate-sized dense matrices to massive sparse systems arising in scientific computing. Modern implementations balance numerical stability, speed, and computational cost, often building on orthogonal transformations to preserve matrix norms and avoid ill-conditioning. The QR algorithm stands as a cornerstone for solving the general nonsymmetric eigenvalue problem, transforming the matrix via iterative orthogonal similarities to reveal its Schur form, where eigenvalues appear on the diagonal of an upper triangular matrix. Prior to the iterations, the matrix undergoes Hessenberg reduction using Householder reflections, which converts a dense n \times n matrix to unreduced Hessenberg form in approximately \frac{10}{3} n^3 floating-point operations, preserving eigenvalues while simplifying subsequent steps. Each QR iteration then decomposes the Hessenberg matrix H_k = Q_k R_k and forms H_{k+1} = R_k Q_k, converging quadratically near the diagonal for well-separated eigenvalues; shifts, such as the Wilkinson shift, accelerate this process to ensure deflation of converged eigenvalues. For a symmetric matrix, the algorithm yields a real tridiagonal form after reduction (in \frac{4}{3} n^3 operations), followed by iterations that produce a diagonal matrix of eigenvalues. For symmetric tridiagonal matrices, the divide-and-conquer algorithm offers an efficient alternative to QR iterations, particularly on parallel architectures, by recursively splitting the problem into smaller subproblems while exploiting the banded structure. The method, as refined by Gu and Eisenstat, divides the tridiagonal matrix at a rank-one update point, computes the eigensystems of the resulting blocks independently, and merges them by solving secular equations with stable root-finding, achieving high relative accuracy for the eigenvectors. This approach reduces complexity to O(n^2) per level in the recursion tree, outperforming standard QR for large n by factors of 2–4 on vector processors, though it requires careful handling of nearly singular merges. Subspace methods like Lanczos and Arnoldi excel for large sparse matrices, projecting the problem onto low-dimensional Krylov subspaces to approximate extremal eigenvalues without forming the full eigendecomposition. The Lanczos algorithm, tailored for symmetric sparse matrices, generates an orthonormal basis Q_k for the Krylov subspace \mathcal{K}_k(A, q_1) = \operatorname{span}\{q_1, A q_1, \dots, A^{k-1} q_1\} via a three-term recurrence, yielding a tridiagonal projection T_k = Q_k^T A Q_k whose Ritz values (eigenvalues of T_k) converge rapidly to the largest or smallest eigenvalues of A. For nonsymmetric cases, the Arnoldi iteration extends this by constructing a Hessenberg projection H_k = Q_k^T A Q_k through Gram-Schmidt orthogonalization, with Ritz values approximating a cluster of eigenvalues near the spectral edge, often requiring restarts for targeted spectra. These methods leverage sparse matrix-vector products, making them suitable for matrices with O(n) nonzeros. Practical implementations rely on libraries like LAPACK, which provide robust routines for dense problems.
The LAPACK routine DGEEV computes all eigenvalues and optionally left/right eigenvectors of a general real matrix via the QR algorithm with Hessenberg reduction, handling complex conjugate pairs appropriately. For symmetric matrices, DSYEV applies a QR-based approach to the tridiagonal form (the related DSYEVD routine uses divide-and-conquer) to deliver eigenvalues and orthonormal eigenvectors, with workspace optimized for stability. Computational complexity for dense matrices is O(n^3) overall, dominated by the initial reduction and the iterations in QR-based methods, while sparse Krylov methods scale as O(k \cdot \mathrm{nnz}(A)) for k iterations and \mathrm{nnz}(A) nonzeros, enabling solutions for n > 10^6. As an illustrative example, consider the companion matrix for the polynomial t^3 - 6t^2 + 11t - 6 = 0, A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 6 & -11 & 6 \end{pmatrix}, whose eigenvalues are the roots 1, 2, 3. The QR algorithm applied to this matrix converges after several iterations to an upper triangular form with diagonal entries 1, 2, 3.
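For a concrete check, the sketch below (assuming NumPy, whose np.linalg.eig dispatches to LAPACK's general nonsymmetric eigensolver) recovers the roots 1, 2, 3 from the companion matrix.

```python
import numpy as np

# Companion matrix of t^3 - 6 t^2 + 11 t - 6, whose roots are 1, 2, 3.
A = np.array([[0.0,   1.0, 0.0],
              [0.0,   0.0, 1.0],
              [6.0, -11.0, 6.0]])

# numpy.linalg.eig calls LAPACK's general eigensolver, which performs
# Hessenberg reduction followed by QR iterations internally.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(np.sort(eigenvalues.real))   # [1. 2. 3.]
```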

Applications

Geometric and dynamic systems

In the context of linear transformations, eigenvectors represent directions that remain invariant under the transformation, meaning they are mapped to scalar multiples of themselves, while eigenvalues quantify the scaling factor along those directions. For a linear transformation represented by a matrix A, if Av = \lambda v with v \neq 0, then v is an eigenvector and \lambda is the corresponding eigenvalue, indicating how the vector is stretched or compressed geometrically. In dynamical systems governed by the linear equation x' = A x, the stability of the equilibrium at the origin is determined by the eigenvalues of A: the system is asymptotically stable if all eigenvalues have negative real parts, as solutions then decay exponentially toward zero over time. If any eigenvalue has a positive real part, the equilibrium is unstable, with trajectories diverging; eigenvalues with zero real parts lead to marginal stability without guaranteed decay. Eigenvalues play a central role in analyzing vibration modes of mechanical systems, such as coupled mass-spring arrangements, where they correspond to the squares of the natural frequencies of oscillation. For a system with masses m_i and springs with stiffness k_j, the equations of motion form a matrix eigenvalue problem K \phi = \omega^2 M \phi, where the eigenvalues \omega^2 yield the squared frequencies, and the eigenvectors \phi describe the mode shapes of synchronized displacements. In continuous dynamical systems, such as those modeled by partial differential equations, eigenvalues of operators like the Laplacian dictate the decay rates of solutions; for the heat equation \partial_t u = \Delta u on a bounded domain, separation of variables leads to solutions of the form u(x,t) = \sum c_n \phi_n(x) e^{-\lambda_n t}, where \lambda_n > 0 are the eigenvalues of -\Delta, and larger \lambda_n produce faster decay, smoothing the temperature distribution over time. An illustrative example is the two-dimensional phase portrait of a saddle point, arising when the matrix A has one positive eigenvalue \lambda_2 > 0 and one negative eigenvalue \lambda_1 < 0: trajectories along the eigenvector for \lambda_1 approach the origin, while those along the eigenvector for \lambda_2 diverge, forming hyperbolic curves that separate stable and unstable manifolds. This configuration highlights how opposing eigenvalue signs lead to unstable dynamics, with solutions expressible via diagonalization as x(t) = e^{A t} x(0).
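The eigenvalue-based stability criteria can be illustrated numerically. The sketch below (assuming NumPy and SciPy) compares a stable node, with both eigenvalues negative, against a saddle, with eigenvalues of opposite sign, by evaluating x(t) = e^{At} x(0).

```python
import numpy as np
from scipy.linalg import expm

# Stable node: both eigenvalues negative, so trajectories decay to the origin.
A_stable = np.array([[-1.0,  1.0],
                     [ 0.0, -2.0]])
# Saddle: eigenvalues of opposite sign, so the origin is unstable.
A_saddle = np.array([[1.0,  0.0],
                     [0.0, -1.0]])

x0 = np.array([1.0, 1.0])
for name, A in [("stable node", A_stable), ("saddle", A_saddle)]:
    lam = np.linalg.eigvals(A)
    x_t = expm(A * 5.0) @ x0          # solution x(t) = e^{At} x(0) at t = 5
    print(name, lam, np.linalg.norm(x_t))   # norm shrinks for the node, grows for the saddle
```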

Statistics and data analysis

In statistics and data analysis, eigenvalues and eigenvectors play a central role in dimensionality reduction techniques, enabling the identification of underlying patterns in high-dimensional datasets. Principal component analysis (PCA), introduced by Karl Pearson, decomposes the covariance matrix of the data into its eigenvalues and eigenvectors, where the eigenvectors represent the principal components (orthogonal directions of maximum variance) and the corresponding eigenvalues quantify the amount of variance explained by each component. By selecting the principal components with the largest eigenvalues, PCA reduces data dimensionality while preserving the most significant variability, facilitating visualization and preprocessing for machine learning models. Singular value decomposition (SVD) extends this framework to rectangular matrices, commonly encountered in data analysis, by factoring a matrix A as A = U \Sigma V^T, where the singular values in \Sigma are the square roots of the eigenvalues of A^T A (or A^* A for complex cases), and the columns of V are the corresponding eigenvectors. In statistical applications, such as latent semantic analysis or recommender systems, SVD identifies low-rank approximations that capture essential data structure, with larger singular values indicating dominant features. Eigenvalues also underpin probabilistic models like Markov chains, where the transition matrix is row-stochastic, ensuring a dominant eigenvalue of 1 with an associated left eigenvector representing the stationary distribution. The subdominant eigenvalues, with magnitudes less than 1 for irreducible and aperiodic chains, determine the convergence rate to the stationary distribution, as the powers of the matrix asymptotically approach the projector onto the eigenspace of eigenvalue 1. In epidemiology, the basic reproduction number R_0 is defined as the spectral radius (the largest eigenvalue) of the next-generation matrix, which models the expected number of secondary infections produced by a single infected individual in a susceptible population. If R_0 > 1, the disease-free equilibrium is unstable, indicating potential epidemic growth; conversely, R_0 < 1 implies stability and eventual die-out of the disease. For illustration, consider a 2D dataset with covariance matrix \Sigma = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}. The eigenvalues are approximately 5.56 and 1.44, with eigenvectors such as \begin{pmatrix} 0.8 \\ 0.6 \end{pmatrix} for the larger eigenvalue and \begin{pmatrix} -0.6 \\ 0.8 \end{pmatrix} for the smaller; data points projected onto the first eigenvector align along the axis of greatest spread, reducing the data to a 1D representation that explains approximately 79% of the variance.
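The 2D covariance example can be reproduced directly. The sketch below (assuming NumPy) computes the eigenvalues, the leading principal direction, and the fraction of variance explained by the first component.

```python
import numpy as np

Sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eigh(Sigma)   # ascending order
order = np.argsort(eigenvalues)[::-1]               # sort descending
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

print(np.round(eigenvalues, 2))          # [5.56 1.44]
print(np.round(eigenvectors[:, 0], 2))   # first principal direction, ~[0.79 0.62] up to sign
explained = eigenvalues[0] / eigenvalues.sum()
print(f"variance explained by PC1: {explained:.0%}")   # ~79%
```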

Physics and engineering

In quantum mechanics, the time-independent Schrödinger equation, H \psi = E \psi, where H is the Hamiltonian operator, identifies the eigenvalues E as the energy levels of a quantum system. These eigenvalues represent the possible stationary states, with the corresponding eigenfunctions \psi describing the wavefunctions of the system. The spectral theorem ensures that for self-adjoint (Hermitian) Hamiltonians, the eigenvalues are real, aligning with the physical requirement that energy measurements yield definite, real-valued outcomes. In quantum chemistry, the eigenvalues of the electronic Hamiltonian determine the energies of molecular orbitals, which form the basis for understanding electronic structure and chemical bonding. Approximations such as the Hartree-Fock method solve the eigenvalue problem for the Fock operator, yielding orbital energies that predict molecular stability and reactivity. These eigenvalues help quantify the distribution of electrons in bonding and antibonding orbitals, essential for computational simulations of molecular properties. Vibration analysis in engineering and physics relies on solving the generalized eigenvalue problem M^{-1} K \mathbf{u} = \lambda \mathbf{u}, where M is the mass matrix, K is the stiffness matrix, \mathbf{u} are the mode shapes (eigenvectors), and \lambda are the eigenvalues corresponding to squared natural frequencies. The natural frequencies are then given by \sqrt{\lambda}, enabling the identification of normal modes in structures like bridges or mechanical systems to predict resonance and avoid excessive vibration. This approach is fundamental in finite element analysis for designing vibration-resistant components. The stress tensor in continuum mechanics is diagonalized to find the principal stresses, which are its eigenvalues, with the principal axes as the corresponding eigenvectors, simplifying the analysis of material failure under load. Similarly, the inertia tensor for rigid bodies yields the principal moments of inertia as eigenvalues and the principal axes as eigenvectors, simplifying rotational calculations in rigid-body dynamics. These decompositions reduce complex tensor operations to scalar problems along orthogonal directions. A representative example is the one-dimensional quantum harmonic oscillator, whose Hamiltonian H = \frac{p^2}{2m} + \frac{1}{2} m \omega^2 x^2 has eigenvalues E_n = \left(n + \frac{1}{2}\right) \hbar \omega, where n = 0, 1, 2, \dots, illustrating quantized energy levels in systems like molecular vibrations. This model underpins approximations for anharmonic potentials in real physical systems.
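The harmonic-oscillator spectrum can be approximated by discretizing the Hamiltonian. The sketch below (assuming NumPy, with \hbar = m = \omega = 1) is a rough illustrative finite-difference discretization that recovers energies close to n + 1/2.

```python
import numpy as np

# Finite-difference discretization of H = -1/2 d^2/dx^2 + 1/2 x^2
# (hbar = m = omega = 1); exact eigenvalues are n + 1/2.
N, L = 1000, 10.0
x = np.linspace(-L, L, N)
h = x[1] - x[0]

kinetic = (np.diag(np.full(N, 1.0)) - 0.5 * np.diag(np.ones(N - 1), 1)
           - 0.5 * np.diag(np.ones(N - 1), -1)) / h**2
potential = np.diag(0.5 * x**2)
H = kinetic + potential

energies = np.linalg.eigvalsh(H)
print(np.round(energies[:4], 3))   # approximately [0.5 1.5 2.5 3.5]
```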

Other domains

In graph theory, the eigenvalues of the adjacency matrix and the graph Laplacian provide key insights into connectivity and structural properties. The second-smallest eigenvalue of the Laplacian, known as the algebraic connectivity or Fiedler value, quantifies how well-connected the graph is, with larger values indicating stronger overall connectivity and resistance to disconnection. Spectral graph partitioning leverages the Fiedler vector (the eigenvector corresponding to this eigenvalue) to identify balanced cuts that minimize the number of edges between partitions, enabling efficient division of large networks and supporting community detection. In seismology and structural geology, eigenvalues derived from the stress tensor are used to analyze fault plane solutions and predict tectonic faulting patterns. By treating the stress tensor as a symmetric matrix, its eigenvalues represent the principal stresses, which determine the orientation and type of potential faults through criteria like Mohr-Coulomb failure envelopes, aiding in the reconstruction of paleostress fields from observed fault data. In glaciology, eigenvalues of crystal orientation tensors model ice fabric anisotropy, influencing flow dynamics in polar ice sheets; for instance, the spread of eigenvalues reflects how crystal alignments affect deformation and flow in ice cores, improving simulations of ice-sheet movement under varying conditions. The eigenfaces method applies eigendecomposition to face recognition, where eigenvalues of the image covariance matrix rank the importance of facial features by capturing variance in image datasets. Larger eigenvalues correspond to principal components that explain the most variation across faces, allowing efficient dimensionality reduction for identifying individuals from images. This builds on statistical techniques like principal component analysis to project high-dimensional face data onto a low-dimensional eigenspace. In wave transport, eigenvalues in scattering theory characterize propagation modes, particularly in waveguides or disordered media, where they determine resonant frequencies and transmission efficiencies for acoustic or electromagnetic waves. For example, the eigenvalues of the scattering operator help model how waves scatter off obstacles, revealing quasi-normal modes that predict energy flow and decay rates in open systems. In emerging applications, the eigenvalues of the Hessian of a neural network's loss function illuminate the optimization landscape, with the eigenvalue spectrum indicating the presence of flat minima (small eigenvalues) that promote better generalization or sharp regions (large eigenvalues) that hinder training. Analysis of the top Hessian eigenvalues during training reveals how network depth affects the curvature of the loss surface, guiding adaptive optimizers to navigate saddle points more effectively.
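The role of the Fiedler vector in partitioning can be seen on a toy graph. The sketch below (assuming NumPy) builds two triangles joined by a single edge, forms the graph Laplacian, and shows that the signs of the second eigenvector separate the two clusters.

```python
import numpy as np

# Two triangle clusters {0,1,2} and {3,4,5} joined by a single bridge edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A                 # graph Laplacian L = D - A

eigenvalues, eigenvectors = np.linalg.eigh(L)
fiedler_value = eigenvalues[1]                 # algebraic connectivity
fiedler_vector = eigenvectors[:, 1]
print(round(fiedler_value, 3))
print(np.sign(fiedler_vector))                 # signs split the two clusters
```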