Jordan matrix
In linear algebra, a Jordan matrix, also known as a matrix in Jordan canonical form and named after the French mathematician Camille Jordan, who introduced the form in 1870,[1] is a block diagonal matrix composed of Jordan blocks, where each Jordan block is a square matrix with a single eigenvalue along its main diagonal, ones on the superdiagonal, and zeros elsewhere.[2][3][4] This structure provides a canonical representation for any square matrix over an algebraically closed field, such as the complex numbers, revealing the matrix's eigenvalue multiplicities and the dimensions of its generalized eigenspaces.[2][3]

The Jordan canonical form theorem states that every square matrix A is similar to a Jordan matrix J, unique up to permutation of the blocks: there exists an invertible matrix P such that A = P J P^{-1}, where the columns of P consist of eigenvectors and generalized eigenvectors forming Jordan chains.[3][4] For an eigenvalue \lambda, the number of Jordan blocks equals the geometric multiplicity (the dimension of the eigenspace), while the sizes of the blocks determine the algebraic multiplicity and the structure of the nilpotent part.[2] A matrix is diagonalizable if and only if all its Jordan blocks have size 1, in which case the form reduces to a diagonal matrix of eigenvalues.[2][3]

Jordan matrices are fundamental for analyzing linear transformations, computing matrix powers and functions (e.g., via the binomial theorem applied blockwise), and solving systems of linear ordinary differential equations, since the form simplifies exponentiation and reveals stability properties.[2][4] The minimal polynomial of a single Jordan block with eigenvalue \mu and size k is (\lambda - \mu)^k, and the minimal polynomial of the whole matrix is the least common multiple of the minimal polynomials of its blocks, so for each eigenvalue the exponent is the size of its largest block.[3] Over fields that are not algebraically closed, the rational canonical form serves as an alternative, but over \mathbb{C} the Jordan form is always achievable.[2]

Background Concepts
Eigenvalues and Diagonalizability
Eigenvalues of a square matrix A are the scalars \lambda that satisfy the characteristic equation \det(A - \lambda I) = 0, where I is the identity matrix of the same dimension; these values are the roots of the characteristic polynomial p_A(\lambda) = \det(A - \lambda I).[5][6] For each eigenvalue \lambda, a corresponding eigenvector is a non-zero vector v such that Av = \lambda v.[7] The set of all such eigenvectors for a fixed \lambda, together with the zero vector, forms the eigenspace, which is the null space of A - \lambda I; the dimension of this eigenspace is the geometric multiplicity of \lambda.[8] The algebraic multiplicity of \lambda is its multiplicity as a root of the characteristic polynomial, counting repetitions in the factorization.[8] For every eigenvalue, the geometric multiplicity is at most the algebraic multiplicity.[8] The concept of eigenvalues traces back to Augustin-Louis Cauchy, who introduced the characteristic equation in the context of systems of linear equations and quadratic forms around 1829.[9]

A square matrix A is diagonalizable if there exists an invertible matrix P such that P^{-1} A P = D, where D is a diagonal matrix with the eigenvalues of A on its main diagonal.[10] By the diagonalizability theorem, A is diagonalizable if and only if, for every eigenvalue \lambda, its algebraic multiplicity equals its geometric multiplicity; equivalently, the minimal polynomial of A factors into distinct linear terms over the base field.[10][11] This condition ensures the existence of a basis of eigenvectors spanning the entire space, so the matrix can be represented in diagonal form.[10] The canonical form for matrices that fail this condition was formalized by Camille Jordan in 1870, building on earlier work by mathematicians such as Karl Weierstrass.[1]

For example, consider the 2×2 rotation matrix by an angle \theta that is not a multiple of \pi: A = \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}. The characteristic polynomial is \lambda^2 - 2\cos\theta \, \lambda + 1 = 0, with roots \lambda = e^{i\theta} and \lambda = e^{-i\theta}, which are distinct complex conjugates.[12] Each has algebraic and geometric multiplicity one, so A is diagonalizable over \mathbb{C}: an invertible P exists such that P^{-1} A P = D = \operatorname{diag}(e^{i\theta}, e^{-i\theta}).[12]
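As a concrete check of this example, the following sketch (using NumPy, with an arbitrarily chosen angle for illustration) computes the complex-conjugate eigenvalues and verifies the diagonalization A = P D P^{-1}:

```python
import numpy as np

theta = 0.7  # any angle that is not a multiple of pi
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# The eigenvalues are the distinct complex conjugates e^{±iθ} ...
eigvals, P = np.linalg.eig(A)
print(eigvals)  # ≈ [exp(iθ), exp(-iθ)]

# ... so A is diagonalizable over C: A = P D P^{-1}.
D = np.diag(eigvals)
print(np.allclose(P @ D @ np.linalg.inv(P), A))  # True
```

Limitations of Diagonalization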
A matrix is defective if, for at least one eigenvalue, its geometric multiplicity is strictly less than its algebraic multiplicity.[13] In that case the eigenspace for that eigenvalue is a proper subspace of the corresponding generalized eigenspace, so the matrix has no complete basis of eigenvectors.[13] A classic example is the 2×2 nilpotent matrix A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, which has eigenvalue 0 with algebraic multiplicity 2 but geometric multiplicity 1.[14] Here, the characteristic polynomial is \det(A - \lambda I) = \lambda^2, confirming the algebraic multiplicity, while the eigenspace consists of vectors of the form \begin{pmatrix} a \\ 0 \end{pmatrix}, yielding dimension 1.[14] Consequently, no basis of eigenvectors exists, and A is not diagonalizable.[14]

Diagonalization also fails whenever the minimal polynomial of the matrix has repeated factors: a matrix over a field F is diagonalizable if and only if its minimal polynomial factors into distinct linear factors over F.[15] Repeated factors indicate that generalized eigenvectors of higher order are needed beyond ordinary eigenvectors. Over algebraically closed fields such as the complex numbers, every square matrix has at least one eigenvalue, as the characteristic polynomial splits completely.[16] Over the real numbers, however, a matrix may lack real eigenvalues, necessitating an extension to the complex field for full analysis.[16] A matrix is non-diagonalizable precisely when it is not similar to a diagonal matrix; in such cases, the Jordan canonical form serves as the next best canonical form, capturing the structure via Jordan blocks.[2]
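A minimal SymPy sketch verifying the defectiveness of this example, following the multiplicity definitions above:

```python
import sympy as sp

A = sp.Matrix([[0, 1],
               [0, 0]])

# Characteristic polynomial λ²: eigenvalue 0 has algebraic multiplicity 2 ...
lam = sp.symbols('lambda')
print(A.charpoly(lam).as_expr())    # lambda**2

# ... but the eigenspace ker(A - 0·I) is one-dimensional (geometric
# multiplicity 1), so A is defective and not diagonalizable.
print(A.nullspace())                # [Matrix([[1], [0]])]
print(A.is_diagonalizable())        # False
```

Definition and Structure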
Jordan Blocks
A Jordan block J_k(\lambda) of size k associated with an eigenvalue \lambda is a k \times k upper triangular matrix with \lambda along the main diagonal and 1's on the superdiagonal, with all other entries equal to zero.[17] This structure makes it the fundamental building block for representing matrices that are not diagonalizable.[18] The matrix admits the explicit decomposition J_k(\lambda) = \lambda I_k + N_k, where I_k denotes the k \times k identity matrix and N_k is the strictly upper triangular nilpotent matrix with 1's on the superdiagonal and zeros elsewhere.[17] Here N_k satisfies N_k^k = 0 but N_k^{k-1} \neq 0, so the nilpotency index of N_k (the smallest positive integer m such that N_k^m = 0) is exactly k.[18] The only eigenvalue of J_k(\lambda) is \lambda, with algebraic multiplicity k (the dimension of the matrix) and geometric multiplicity 1 (the dimension of the eigenspace).[17]

Powers of a Jordan block exploit the nilpotency of N_k through the binomial theorem, yielding J_k(\lambda)^m = \sum_{i=0}^{k-1} \binom{m}{i} \lambda^{m-i} N_k^i for any nonnegative integer m, since the higher terms vanish as N_k^k = 0.[19] This formula simplifies computations of functions of the matrix and highlights the "shift" behavior induced by the superdiagonal 1's.

For illustration, consider the 3×3 Jordan block with \lambda = 2: J_3(2) = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}. This matrix embodies a chain of generalized eigenvectors: writing v_1, v_2, v_3 for the standard basis vectors, we have (J_3(2) - 2I)v_1 = 0, (J_3(2) - 2I)v_2 = v_1, and (J_3(2) - 2I)v_3 = v_2, so J_3(2) - 2I shifts each chain vector to its predecessor, underscoring the one-dimensional eigenspace and the defective nature of the matrix.[17]
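The binomial power formula above can be verified numerically; in this sketch (the helper functions are illustrative, not library routines), the formula is compared against direct matrix powers:

```python
import numpy as np
from math import comb

def jordan_block(lam, k):
    """Return the k×k Jordan block: lam on the diagonal, ones above it."""
    return lam * np.eye(k) + np.eye(k, k=1)

def jordan_block_power(lam, k, m):
    """J_k(lam)^m via the binomial formula sum_i C(m,i) lam^(m-i) N^i,
    truncated at i = k-1 because N^k = 0."""
    N = np.eye(k, k=1)  # nilpotent shift matrix
    return sum(comb(m, i) * lam ** (m - i) * np.linalg.matrix_power(N, i)
               for i in range(k))

J = jordan_block(2.0, 3)
print(np.allclose(jordan_block_power(2.0, 3, 5),
                  np.linalg.matrix_power(J, 5)))  # True
```

Full Jordan Canonical Form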
The Jordan canonical form theorem asserts that every square matrix A \in M_n(\mathbb{C}) is similar to a block-diagonal matrix J, known as the Jordan canonical form of A, where J consists of Jordan blocks along the diagonal, and this form is unique up to the ordering of the blocks.[20] This representation provides a canonical structure for matrices that are not diagonalizable, capturing the deviation from diagonality through the sizes and number of off-diagonal 1's in the blocks.[21]

To construct the Jordan canonical form, the vector space is first decomposed into a direct sum of generalized eigenspaces, one for each eigenvalue \lambda, where the generalized eigenspace for \lambda is the kernel of (A - \lambda I)^n.[22] Within each generalized eigenspace, a basis is formed by identifying Jordan chains of generalized eigenvectors, obtained through successive iterations of the kernel of (A - \lambda I)^k for increasing k, starting from the eigenspace (where k = 1).[23] These chains determine the structure of the Jordan blocks for that eigenvalue, with the length of each chain corresponding to the size of a block. The uniqueness of the Jordan canonical form stems from the fact that, for each eigenvalue \lambda, the sizes of the associated Jordan blocks are uniquely determined by the dimensions of the kernels \dim \ker((A - \lambda I)^m) for m = 1, 2, \dots, which reveal the number and lengths of the chains via differences of these dimensions.[24]

For instance, consider a 4 \times 4 matrix with eigenvalues \lambda = 1 (algebraic multiplicity 3, consisting of one block of size 2 and one of size 1) and \lambda = 2 (multiplicity 1); its Jordan canonical form is the block-diagonal matrix J = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}, where the first 2 \times 2 block corresponds to the chain for \lambda = 1, the 1 \times 1 block is the remaining eigenvector for \lambda = 1, and the final block is for \lambda = 2.[25]

For matrices over the real numbers \mathbb{R}, the Jordan canonical form is adapted to the real Jordan form, where non-real eigenvalues occur in complex conjugate pairs \lambda, \overline{\lambda}, and the corresponding blocks are replaced by real 2 \times 2 blocks of the form \begin{pmatrix} a & -b \\ b & a \end{pmatrix} (for \lambda = a + bi), combined with analogous larger structures for generalized eigenvectors, so the entire form remains real.[26]

A standard algorithmic approach to obtain the Jordan structure begins by computing the characteristic polynomial \det(\lambda I - A) to identify the eigenvalues and their algebraic multiplicities.[27] For each eigenvalue \lambda, the ascending chain of kernel dimensions \dim \ker((A - \lambda I)^k) is computed for k = 1, 2, \dots until it stabilizes; the differences \dim \ker((A - \lambda I)^{k}) - \dim \ker((A - \lambda I)^{k-1}) give the number of Jordan blocks of size at least k, thereby determining the complete block partition.[28]
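This kernel-dimension procedure can be carried out symbolically. In the sketch below (the change-of-basis matrix Q is an arbitrary invertible choice made for illustration), SymPy recovers the block structure of the 4 \times 4 example above:

```python
import sympy as sp

# The 4×4 example: blocks of sizes 2 and 1 for λ = 1 and size 1 for λ = 2.
J = sp.Matrix([[1, 1, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 2]])
Q = sp.Matrix([[1, 2, 0, 1],
               [0, 1, 1, 0],
               [1, 0, 1, 0],
               [0, 0, 0, 1]])  # arbitrary invertible change of basis
A = Q * J * Q.inv()

# Kernel dimensions of (A - I)^k for k = 1, 2 reveal the structure at λ = 1:
dims = [len(((A - sp.eye(4)) ** k).nullspace()) for k in (1, 2)]
print(dims)  # [2, 3]: two blocks in total, one of them of size ≥ 2

# SymPy reconstructs the Jordan form directly (block order may differ):
P, Jc = A.jordan_form()
print(Jc)
```

Algebraic Properties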
Similarity Invariants
Two matrices A and B are similar if there exists an invertible matrix P such that B = P^{-1} A P.[29] Similarity preserves several fundamental properties, including eigenvalues, trace, determinant, and rank.[30][31] The Jordan canonical form provides a complete set of similarity invariants through the structure of its Jordan blocks for each eigenvalue. Specifically, the number and sizes of these blocks are encoded by the Segre characteristic, which lists the sizes of the blocks in non-increasing order for each eigenvalue, or equivalently by the Weyr characteristic, the non-increasing sequence w_k(\lambda) in which w_k(\lambda) counts the number of blocks of size at least k for eigenvalue \lambda.[32][33] The Segre and Weyr characteristics are conjugate partitions of each other, and either one determines the Jordan form up to permutation of blocks.[32]

Over non-algebraically closed fields, where the characteristic polynomial may not split into linear factors, the Jordan form may not exist, and the rational canonical form serves as an alternative invariant, based on invariant factors rather than elementary divisors.[34] When the minimal polynomial splits completely (as over algebraically closed fields like the complex numbers), however, the Jordan form is preferred for its direct revelation of eigenvalue multiplicities and generalized eigenspace structures.[35]

For example, two 4×4 nilpotent matrices that both have Jordan blocks of sizes 3 and 1 share the Segre characteristic (3, 1) and are therefore similar, however different their entries may look.[36] A fundamental theorem states that two matrices are similar if and only if they have the same eigenvalues (with matching algebraic multiplicities) and identical Segre (or Weyr) characteristics for each eigenvalue, so the similarity class is classified uniquely by the Jordan block sizes.[32][33]
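As a sketch (the weyr helper below is an illustrative implementation, not a library routine), the Weyr characteristic can be computed from kernel dimensions and used to verify similarity of two such nilpotent matrices:

```python
import sympy as sp

def weyr(A, lam):
    """Weyr characteristic of eigenvalue lam: the successive differences
    of dim ker (A - lam I)^k, i.e. the number of blocks of size >= k."""
    n = A.rows
    dims = [0] + [len(((A - lam * sp.eye(n)) ** k).nullspace())
                  for k in range(1, n + 1)]
    return [b - a for a, b in zip(dims, dims[1:]) if b > a]

# Nilpotent matrix with Jordan blocks of sizes 3 and 1 ...
N1 = sp.Matrix([[0, 1, 0, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 0],
                [0, 0, 0, 0]])
# ... and a conjugate of it with different entries:
Q = sp.Matrix([[1, 1, 1, 1],
               [0, 1, 1, 1],
               [0, 0, 1, 1],
               [0, 0, 0, 1]])  # unit upper triangular, hence invertible
N2 = Q * N1 * Q.inv()

print(weyr(N1, 0), weyr(N2, 0))  # [2, 1, 1] [2, 1, 1] -> similar
```

Polynomials Associated with Jordan Form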
The characteristic polynomial of a matrix A coincides with that of its Jordan canonical form J, as similar matrices share the same characteristic polynomial. Since J is a block diagonal matrix consisting of Jordan blocks, the characteristic polynomial \chi_A(t) = \det(tI - A) factors as \prod_{\lambda} (t - \lambda)^{m_{\lambda}}, where the product is over the distinct eigenvalues \lambda of A, and m_{\lambda} denotes the algebraic multiplicity of \lambda, equal to the total dimension of the generalized eigenspace for \lambda, or equivalently the sum of the sizes of all Jordan blocks corresponding to \lambda. For an individual Jordan block of size k associated with eigenvalue \lambda, the characteristic polynomial is (t - \lambda)^k.[37]

The minimal polynomial m_A(t) of A is the monic polynomial of least degree such that m_A(A) = 0, and it too is invariant under similarity, matching that of J. For the Jordan form, m_A(t) = \prod_{\lambda} (t - \lambda)^{k_{\lambda}}, where k_{\lambda} is the size of the largest Jordan block for eigenvalue \lambda, known as the index of \lambda. Thus the exponent k_{\lambda} in the minimal polynomial directly reflects the longest chain of generalized eigenvectors for \lambda. For a single Jordan block of size k with eigenvalue \lambda, both the minimal and characteristic polynomials are (t - \lambda)^k.[37]

By the Cayley-Hamilton theorem, the characteristic polynomial annihilates A, so the minimal polynomial divides the characteristic polynomial in the ring of polynomials; in particular, k_{\lambda} \leq m_{\lambda} for each \lambda, with equality for a given \lambda exactly when there is a single Jordan block for that eigenvalue. The minimal and characteristic polynomials coincide precisely when each k_{\lambda} = m_{\lambda}, meaning the Jordan form has a single block of full multiplicity for each eigenvalue.[37]

The structure encoded by these polynomials relates to the primary decomposition theorem, which decomposes the underlying vector space as a direct sum of the generalized eigenspaces \bigoplus_{\lambda} \ker((A - \lambda I)^{k_{\lambda}}), where the exponents k_{\lambda} are exactly those from the minimal polynomial. On each generalized eigenspace, the restriction of A has minimal polynomial (t - \lambda)^{k_{\lambda}}, and the overall minimal polynomial is the least common multiple of these factors, yielding the product form since the linear factors (t - \lambda) are distinct. This decomposition highlights how the Jordan block structure determines the polynomial annihilators and the invariant subspace lattice.[37]
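A short SymPy check of these facts on the 4 \times 4 example from the previous section: the characteristic polynomial is (t-1)^3 (t-2), but the minimal polynomial is only (t-1)^2 (t-2), since the largest block for \lambda = 1 has size 2.

```python
import sympy as sp

t = sp.symbols('t')
J = sp.Matrix([[1, 1, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 2]])

# Characteristic polynomial: (t - 1)**3 * (t - 2).
print(sp.factor(J.charpoly(t).as_expr()))

# Minimal polynomial check by direct annihilation:
I4 = sp.eye(4)
print((J - I4) ** 2 * (J - 2 * I4) == sp.zeros(4, 4))  # True
print((J - I4) * (J - 2 * I4) == sp.zeros(4, 4))       # False: degree too low
```

Applications in Computation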
Matrix Powers and Functions
One of the primary advantages of the Jordan canonical form is its utility in computing powers and analytic functions of a matrix. For a matrix A \in \mathbb{C}^{n \times n} with Jordan decomposition A = P J P^{-1}, where J is the Jordan form, any analytic function f applied to A satisfies f(A) = P f(J) P^{-1}. Since J is block-diagonal with Jordan blocks J_k(\lambda) along the diagonal, f(J) is also block-diagonal, with each block given by f(J_k(\lambda)).

For a single Jordan block J_k(\lambda) of size k \times k, the entries of f(J_k(\lambda)) are determined by the Taylor series expansion of f around \lambda. Specifically, f(J_k(\lambda)) is an upper triangular Toeplitz matrix in which the main diagonal consists of f(\lambda), the first superdiagonal has entries f'(\lambda), the second superdiagonal has f''(\lambda)/2!, and in general the i-th superdiagonal (for i = 0, \dots, k-1) has entries f^{(i)}(\lambda)/i!, with zeros beyond the (k-1)-th superdiagonal. This structure arises because J_k(\lambda) = \lambda I_k + N_k, where N_k is the nilpotent shift matrix with N_k^k = 0, allowing f(J_k(\lambda)) to be expressed via the Taylor series truncated at degree k-1.

A key application is the computation of the matrix exponential e^A, defined by the power series e^A = \sum_{m=0}^\infty A^m / m!. Using the Jordan form yields e^A = P e^J P^{-1}, where for each block e^{J_k(\lambda)} = e^\lambda e^{N_k} = e^\lambda \sum_{i=0}^{k-1} N_k^i / i!, since higher powers of N_k vanish. This results in a closed-form expression that is particularly efficient for small block sizes. Matrix powers A^m follow similarly: A^m = P J^m P^{-1}, with J^m computed block-wise. For a Jordan block J_k(\lambda), J_k(\lambda)^m = \sum_{i=0}^{k-1} \binom{m}{i} \lambda^{m-i} N_k^i, again leveraging the nilpotency of N_k. This binomial expansion simplifies iterative power computations, especially for large m, by reducing them to polynomial evaluations in the eigenvalues and nilpotent parts.

As an illustrative example, consider a nilpotent Jordan block of size 3, J_3(0) = N_3 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}. The exponential is e^{N_3} = I_3 + N_3 + N_3^2 / 2! = \begin{pmatrix} 1 & 1 & 1/2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, a finite sum terminating at the second power because N_3^3 = 0. This demonstrates how the Jordan structure yields exact, compact expressions without infinite series.

In numerical linear algebra, the Jordan form facilitates these computations theoretically but is often impractical because of its sensitivity to perturbations in A: small changes can alter the block structure dramatically. Consequently, the Schur canonical form, which is more stable and computable via unitary transformations, is frequently preferred for approximating matrix functions and powers in practice.
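As a concrete check, the finite-series exponential of the nilpotent block above can be compared against a general-purpose routine such as scipy.linalg.expm:

```python
import numpy as np
from scipy.linalg import expm

# Nilpotent 3×3 Jordan block J_3(0) = N_3.
N = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])

# Finite series e^N = I + N + N²/2!, exact because N³ = 0.
exp_finite = np.eye(3) + N + (N @ N) / 2

print(exp_finite)                        # [[1, 1, 0.5], [0, 1, 1], [0, 0, 1]]
print(np.allclose(exp_finite, expm(N)))  # True
```

Numerical Computation of Jordan Form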
Computing the Jordan canonical form numerically presents significant challenges due to its inherent ill-posedness. The form is not unique, as Jordan blocks corresponding to the same eigenvalue can be permuted arbitrarily, and even small perturbations in the matrix entries can drastically alter the block structure or merge and split blocks, so the entries of the Jordan form are discontinuous functions of the original matrix.[38] This sensitivity arises because the Jordan form relies on the precise identification of algebraic and geometric multiplicities, which are highly unstable under rounding errors in floating-point arithmetic.[38]

The standard approach to numerical computation begins with a Schur decomposition, which triangularizes the matrix via a unitary similarity transformation into an upper triangular form with eigenvalues on the diagonal, providing a stable starting point. From this form, the Jordan structure is determined by analyzing the ranks of certain submatrices or by using deflation techniques to identify invariant subspaces corresponding to each eigenvalue cluster. For generalized eigenvalue problems, the QZ algorithm computes a generalized Schur decomposition, yielding block upper triangular forms for the pair of matrices.[39]

The process typically involves the following steps. First, compute the eigenvalues using the QR algorithm, which iteratively applies orthogonal transformations to produce the Hessenberg form and then the Schur form. Next, group the diagonal entries by eigenvalue proximity and compute the dimensions of the generalized eigenspaces via rank determinations of deflating subspaces, often using rank-revealing QR decompositions on the Schur blocks; the sizes of the Jordan blocks are then inferred from the Weyr characteristic, which counts, for each k, the number of blocks of size at least k. Finally, a change-of-basis matrix is constructed by solving for the Jordan chains within these subspaces.

Early numerical methods were pioneered by J. H. Wilkinson in his 1965 monograph, which laid the groundwork for analyzing the stability of eigenvalue computations and highlighted the difficulties in obtaining the Jordan form. Modern implementations often employ staircase algorithms, originally developed in the 1970s by Ruhe and Kågström, which progressively deflate the matrix by computing staircase forms to reveal the Kronecker or Jordan structure through orthogonal transformations and rank computations. In software libraries, direct computation of the Jordan form is rarely provided because of this instability; instead, LAPACK routines such as DGEES (for the Schur decomposition) and DGGES (for the QZ algorithm) are used as proxies, allowing users to extract approximate Jordan information post hoc.[39] MATLAB's jordan function is primarily intended for symbolic matrices and issues warnings for numerical inputs, recommending the schur function for practical computations.[40]
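The contrast between the stable Schur computation and the fragile rank decisions underlying the Jordan structure can be illustrated with a small perturbation experiment (the perturbation size and random seed below are arbitrary choices made for illustration):

```python
import numpy as np
from scipy.linalg import schur

# Exact Jordan block J_3(2) plus a tiny random perturbation.
J = 2 * np.eye(3) + np.eye(3, k=1)
eps = 1e-10
A = J + eps * np.random.default_rng(0).standard_normal((3, 3))

# The Schur form is computed stably via unitary transformations; the
# eigenvalues of the perturbed block scatter by roughly eps**(1/3).
T, Z = schur(A, output='complex')
print(np.diag(T))

# The rank of (A - 2I) decides the Jordan structure and is fragile:
# numerically, the perturbed matrix looks diagonalizable (full rank 3
# instead of the exact rank 2 of the unperturbed block).
print(np.linalg.matrix_rank(A - 2 * np.eye(3)))  # 3
```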
When the exact Jordan form proves too unstable, alternatives include the real Schur form, which uses orthogonal transformations to produce a block upper triangular matrix with 1×1 or 2×2 blocks on the diagonal for real matrices, or the Hessenberg form as an intermediate step in the QR process, both of which preserve eigenvalues while avoiding the full Jordan structure.[38]