Sylvester's determinant identity
Sylvester's determinant identity is a fundamental result in linear algebra stating that if A is an n \times m matrix and B is an m \times n matrix over a commutative ring, then \det(I_n + AB) = \det(I_m + BA), where I_k denotes the k \times k identity matrix. This equality holds even when n \neq m, allowing the determinant of a larger matrix to be reduced to that of a smaller one in cases involving low-rank updates.[1]

The identity was first explicitly stated by the English mathematician James Joseph Sylvester in his 1883 paper "On the equation to the secular inequalities in the planetary theory," published in the Philosophical Magazine. Sylvester presented it as part of a broader discussion on the characteristic equations of matrix products in the context of planetary motion perturbations, noting that AB and BA share the same non-zero eigenvalues, which implies the determinant equality upon specialization.[2] Although Sylvester did not provide a full proof in the paper, subsequent works, such as those by Turnbull and Aitken in 1932, offered rigorous demonstrations using properties of determinants and block matrices.

This identity, sometimes referred to as a special case of the Weinstein–Aronszajn identity, has wide applications in matrix analysis, numerical linear algebra, and related fields.[1] It facilitates computations in low-rank perturbations, such as updating determinants after rank-one modifications, which is crucial in algorithms for solving linear systems. In random matrix theory, it aids in evaluating expectations of characteristic polynomials for products of random matrices. Additionally, generalizations of the identity appear in ring theory for studying matrix invertibility and stable range conditions.[1]
Statement and Notation
The Core Identity
Sylvester's determinant identity states that for an n \times m matrix A and an m \times n matrix B with entries over the real or complex numbers, \det(I_n + AB) = \det(I_m + BA), where I_k denotes the k \times k identity matrix.[3] This equality relates the determinants of two matrices of potentially different sizes, with AB being n \times n and BA being m \times m. A simple illustrative example occurs when n = m = 1, so A = (a) and B = (b) are scalars treated as 1 \times 1 matrices. In this case, the identity reduces to \det(1 + ab) = \det(1 + ba), or simply 1 + ab = 1 + ba, which holds since scalar multiplication is commutative. The identity is particularly useful when n \neq m, as it allows computation of the larger determinant by reducing it to the smaller one, leveraging the fact that the nonzero eigenvalues of AB and BA coincide (with the same algebraic multiplicities).[3]
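The identity can be checked numerically. The following minimal sketch, assuming NumPy and with dimensions and seed chosen purely for illustration, compares both sides for n \neq m:

```python
import numpy as np

rng = np.random.default_rng(42)          # illustrative seed
n, m = 5, 2                              # deliberately unequal dimensions
A = rng.standard_normal((n, m))          # n x m
B = rng.standard_normal((m, n))          # m x n

lhs = np.linalg.det(np.eye(n) + A @ B)   # determinant of an n x n matrix
rhs = np.linalg.det(np.eye(m) + B @ A)   # determinant of an m x m matrix
print(np.isclose(lhs, rhs))              # True, up to floating-point rounding
```

Here the 5 \times 5 determinant on the left reduces to the 2 \times 2 determinant on the right, which is the practical content of the identity when n \neq m.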
Assumptions and Matrix Dimensions
Sylvester's determinant identity holds for matrices A \in M_{n \times m}(R) and B \in M_{m \times n}(R), where R is a commutative ring with identity, such as the real numbers \mathbb{R} or complex numbers \mathbb{C}.[1] No invertibility is required for A or B, as the identity relies solely on the multiplicative properties of determinants over such rings.[1] The identity equates \det(I_n + AB) and \det(I_m + BA), where I_k denotes the k \times k identity matrix for dimension k. This equality persists regardless of whether n = m, n > m, or n < m. When n > m, the matrix AB has n - m additional zero eigenvalues compared to BA, but since the determinant involves the product of (1 + \lambda_i) over eigenvalues \lambda_i, these extra factors of 1 + 0 = 1 ensure the determinants remain equal without adjustment.[4] The symmetric case m > n follows analogously.[4] The identity is trivially satisfied when \min(n, m) = 0, reducing to \det(I_n) = \det(I_m) or 1 = 1, but it becomes non-trivial for \min(n, m) \geq 1, where the matrices AB and BA share the same non-zero eigenvalues with matching algebraic multiplicities.
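The role of the extra zero eigenvalues described above can be observed directly. In this illustrative NumPy sketch (random matrices with n > m), the spectra of AB and BA agree except for n - m near-zero eigenvalues, so the products \prod_i (1 + \lambda_i) coincide:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 2
A = rng.standard_normal((n, m))
B = rng.standard_normal((m, n))

eig_AB = np.linalg.eigvals(A @ B)   # n eigenvalues; n - m are numerically ~0
eig_BA = np.linalg.eigvals(B @ A)   # m eigenvalues

# The n - m near-zero eigenvalues of AB contribute factors of 1 + 0 = 1,
# so both products equal det(I + AB) = det(I + BA).
print(np.prod(1 + eig_AB).real)
print(np.prod(1 + eig_BA).real)
```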
Historical Context
Discovery by Sylvester
James Joseph Sylvester introduced what is now known as his determinant identity in 1851, during his early investigations into the theory of invariants for binary quadratic forms.[5] This work emerged as part of his broader efforts to understand canonical forms and the properties preserved under linear transformations, where determinants played a central role in quantifying algebraic invariants associated with quadratic expressions.[5] In the paper "On the relation between the minor determinants of linearly equivalent quadratic functions," published in the Philosophical Magazine, Sylvester explored how minor determinants of quadratic forms remain related when the forms are subjected to equivalent linear substitutions.[5] The identity itself arose naturally in this setting, serving as a tool to connect the structure of these forms to their invariant properties, thereby facilitating the evaluation of expressions invariant under group actions in algebraic geometry.[5] Sylvester presented the identity explicitly as a lemma within this 1851 publication, applying it specifically to the computation of bordered determinants, that is, matrices augmented with additional rows and columns derived from the original form's coefficients.[5] This formulation highlighted the identity's utility in simplifying determinant calculations for systems related by linear equivalence, laying foundational insights into the interplay between matrix structure and algebraic invariance.[5]
Early Publications and Recognition
Following its initial discovery in 1851, Sylvester republished the determinant identity in his 1852 paper titled "On a theorem concerning the combination of determinants," appearing in volume 8 of the Cambridge and Dublin Mathematical Journal (pp. 60–62). This publication elaborated on the identity in the context of compound determinants and quantics, providing a more detailed exposition without a proof.[6] The identity received early attention from Sylvester's contemporary Arthur Cayley, who referenced related determinant techniques in his seminal 1858 memoir "A memoir on the theory of matrices," published in the Philosophical Transactions of the Royal Society (vol. 148, pp. 17–37), where such properties underpinned the formal development of matrix operations.[7][8] By the late 19th and early 20th centuries, the identity had achieved prominence in the mathematical community, as documented in Thomas Muir's influential 1906 textbook The Theory of Determinants in the Historical Order of Development (vol. 2, Macmillan), which explicitly credits Sylvester for the theorem and traces its role in advancing determinant theory up to 1860.[9] The formal designation "Sylvester's determinant identity" solidified in 20th-century literature on matrix theory, clearly distinguishing it from Sylvester's other major contributions, including the law of nullity.[1][10]
Proofs
Schur Complement Approach
One elegant proof of Sylvester's determinant identity employs block matrix decomposition and the Schur complement. Consider the (n+m) \times (n+m) block matrix K = \begin{pmatrix} I_n & A \\ -B & I_m \end{pmatrix}, where A is an n \times m matrix and B is an m \times n matrix.[11] Adding B times the first block row of K to the second block row (equivalently, left-multiplying K by \begin{pmatrix} I_n & 0 \\ B & I_m \end{pmatrix}, which has determinant 1) yields the upper block triangular matrix \begin{pmatrix} I_n & A \\ 0 & I_m + BA \end{pmatrix} without altering the determinant. The determinant of this triangular form is \det(I_n) \det(I_m + BA) = \det(I_m + BA), so \det(K) = \det(I_m + BA). The Schur complement provides an alternative verification of this result and connects it to the other side of the identity. For a block matrix \begin{pmatrix} P & Q \\ R & S \end{pmatrix} with P invertible, the determinant formula is \det\begin{pmatrix} P & Q \\ R & S \end{pmatrix} = \det(P) \det(S - R P^{-1} Q). Applying this to K with P = I_n, Q = A, R = -B, and S = I_m gives \det(K) = \det(I_n) \det(I_m - (-B) I_n^{-1} A) = \det(I_m + BA), confirming the row operation outcome. To establish the full identity, apply the Schur complement formula to K using the lower-right block S = I_m as the pivot (which is invertible); this yields \det(K) = \det(I_m) \det(I_n - A I_m^{-1} (-B)) = \det(I_n + AB). Equating the two expressions for \det(K) shows \det(I_n + AB) = \det(I_m + BA).
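Both Schur complement evaluations of \det(K) can be confirmed numerically; a minimal sketch, assuming NumPy and small randomly chosen blocks:

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 4, 3
A = rng.standard_normal((n, m))
B = rng.standard_normal((m, n))

# Assemble K = [[I_n, A], [-B, I_m]] and evaluate its determinant directly.
K = np.block([[np.eye(n), A], [-B, np.eye(m)]])

d_K = np.linalg.det(K)
d_BA = np.linalg.det(np.eye(m) + B @ A)  # Schur complement of I_n in K
d_AB = np.linalg.det(np.eye(n) + A @ B)  # Schur complement of I_m in K
print(np.allclose([d_BA, d_AB], d_K))    # both equal det(K)
```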
Inductive Proof
The inductive proof of Sylvester's determinant identity, which states that \det(I_n + AB) = \det(I_m + BA) for an n \times m matrix A and an m \times n matrix B, proceeds by induction on the smaller dimension \min(m, n). Without loss of generality, assume m \leq n, so the induction is on m; the case m > n follows by symmetry. For the base case, when m = 1, A is an n \times 1 column vector and B is a 1 \times n row vector. Then BA is the scalar product of B and A, so \det(I_1 + BA) = 1 + BA. On the other side, I_n + AB is the identity matrix plus the rank-1 outer product AB, and by the matrix determinant lemma, \det(I_n + uv^T) = 1 + v^T u with u = A and v = B^T, yielding \det(I_n + AB) = 1 + BA. Thus the two determinants agree as scalars. For the inductive step, assume the identity holds whenever the smaller dimension is m - 1. To verify it for dimension m, expand \det(I_m + BA) by cofactors along the last row: the last row of I_m + BA is the m-th standard basis vector plus b_m A, where b_m denotes the m-th (1 \times n) row of B. The cofactor attached to the diagonal entry is \det(I_{m-1} + B' A'), where B' is B without its last row ((m-1) \times n) and A' is A without its last column (n \times (m-1)), so B' A' is (m-1) \times (m-1) and the inductive hypothesis applies to it. The remaining cofactor terms pair, one for one, with the terms arising from the analogous expansion of \det(I_n + AB) along its last column, and matching the two expansions term by term reduces the claim to the inductive hypothesis, completing the proof. This elementary derivation relies solely on properties of determinants and recursion, avoiding block inverses or spectral methods.[12]
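The m = 1 base case is precisely the matrix determinant lemma, and it can be checked numerically in isolation; a quick sketch with randomly chosen vectors (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
u = rng.standard_normal((n, 1))   # A: an n x 1 column vector
v = rng.standard_normal((1, n))   # B: a 1 x n row vector

lhs = np.linalg.det(np.eye(n) + u @ v)   # det(I_n + AB), a rank-one update
rhs = 1.0 + (v @ u).item()               # det(I_1 + BA) = 1 + BA, a scalar
print(np.isclose(lhs, rhs))              # True
```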
Eigenvalue-Based Proof
One approach to proving Sylvester's determinant identity, \det(I_n + AB) = \det(I_m + BA), over fields like \mathbb{R} or \mathbb{C}, utilizes the connection between determinants and eigenvalues. The determinant of a square matrix equals the product of its eigenvalues, counted with algebraic multiplicity. Thus, \det(I_m + BA) = \prod_{i=1}^m (1 + \lambda_i), where \{\lambda_i\}_{i=1}^m are the eigenvalues of BA \in \mathbb{R}^{m \times m}, and similarly \det(I_n + AB) = \prod_{j=1}^n (1 + \mu_j), where \{\mu_j\}_{j=1}^n are the eigenvalues of AB \in \mathbb{R}^{n \times n}. A fundamental result in matrix theory states that AB and BA share the same non-zero eigenvalues, including algebraic multiplicities. To see this, suppose \lambda \neq 0 is an eigenvalue of BA with eigenvector v \neq 0, so BA v = \lambda v. Then A (BA v) = \lambda A v, or AB (A v) = \lambda (A v). Since \lambda \neq 0 and v \neq 0, A v \neq 0, establishing \lambda as an eigenvalue of AB. The argument extends to algebraic multiplicities via the characteristic polynomials or Jordan forms, confirming that the spectra match except possibly for the eigenvalue zero. The number of zero eigenvalues also aligns in a way that preserves the determinant equality. If r denotes the number of non-zero eigenvalues shared by AB and BA, counted with algebraic multiplicity (so r \leq \min(m, n)), then BA has exactly m - r zero eigenvalues and AB has n - r zero eigenvalues. Each zero eigenvalue contributes a factor of 1 to the respective products, while the shared non-zero eigenvalues \{\lambda_k\}_{k=1}^r contribute identical factors \prod_{k=1}^r (1 + \lambda_k). The differing numbers of factors equal to 1 arising from the zero eigenvalues do not affect the overall products, yielding \det(I_n + AB) = \det(I_m + BA).[13]
Applications
Random Matrix Theory
Sylvester's determinant identity finds significant application in random matrix theory, particularly within Gaussian ensembles where matrices have independent Gaussian entries. For an n \times m Gaussian matrix X, the identity establishes that \det(I_n + XX^T) = \det(I_m + X^T X), enabling the reduction of the determinant computation to the smaller m \times m matrix when m \ll n. This simplification is invaluable for handling high-dimensional data, as it avoids direct evaluation of large matrices while preserving exactness, and it underpins analyses of Wishart-like distributions central to multivariate statistics and signal processing. A key utilization of the identity arises in deriving exact formulas for partition functions in random matrix models, where determinants of the form \det(I + AB) represent normalization constants or generating functions. In the works of Tracy and Widom during the 1990s, the identity facilitates the computation of correlation functions and spacing distributions for eigenvalues in Gaussian unitary ensembles by interchanging matrix dimensions, thereby linking finite-dimensional calculations to asymptotic behaviors. This approach has been instrumental in establishing universality classes for eigenvalue statistics.[14] For instance, in the derivation of the Marchenko-Pastur law, which describes the limiting spectral density of sample covariance matrices built from Gaussian data, the identity equates the resolvents or Stieltjes transforms across dual matrix products, allowing efficient characterization of the bulk spectrum and edge fluctuations. By applying the identity to low-rank perturbations of Wishart matrices, researchers can align the empirical spectral distribution with the quarter-circle law in the large-n limit, providing a foundational tool for free probability and deformed ensemble studies.[15]
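The dimension reduction for tall Gaussian matrices is easy to demonstrate; in this hedged sketch (NumPy, with log-determinants used to avoid overflow, and sizes chosen purely for illustration), the n \times n computation collapses to an m \times m one:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 2000, 5                    # tall Gaussian data matrix: m << n
X = rng.standard_normal((n, m))

# Naive evaluation works with an n x n matrix ...
big = np.linalg.slogdet(np.eye(n) + X @ X.T)
# ... while the identity needs only the m x m Gram matrix X^T X.
small = np.linalg.slogdet(np.eye(m) + X.T @ X)
print(np.isclose(big[1], small[1]))   # log-determinants agree
```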
Numerical Linear Algebra
Sylvester's determinant identity is crucial in numerical linear algebra for handling low-rank updates to determinants. For example, in algorithms for Gaussian elimination over integers or finite fields, the identity allows efficient computation of determinants after rank-one modifications without full matrix inversion, reducing computational complexity from O(n^3) to O(n^2) in certain cases. This is particularly useful in symbolic computation and exact arithmetic systems, where preserving integrality is key.[12] The identity also supports the matrix determinant lemma, a special case for rank-one updates: \det(A + uv^T) = \det(A) (1 + v^T A^{-1} u) for invertible A, which extends to higher ranks via iterative application, aiding in Cholesky factorization updates and QR decomposition in iterative solvers. These techniques are foundational in software libraries like LAPACK for numerical stability in large-scale simulations.
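A minimal sketch of the rank-one update formula, assuming NumPy and an invertible A (the diagonal shift below is only to guarantee invertibility):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)   # shifted to ensure invertibility
u = rng.standard_normal((n, 1))
v = rng.standard_normal((n, 1))

direct = np.linalg.det(A + u @ v.T)
# Matrix determinant lemma: det(A + u v^T) = det(A) * (1 + v^T A^{-1} u);
# the update costs one linear solve instead of a fresh determinant.
lemma = np.linalg.det(A) * (1.0 + (v.T @ np.linalg.solve(A, u)).item())
print(np.isclose(direct, lemma))   # True
```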
Linear Algebra and Control Theory
In linear algebra, Sylvester's determinant identity facilitates the analysis of matrix ranks and nullities by establishing equivalence between the invertibility of operators of the form I + AB and I + BA, where A is an m \times n matrix and B is an n \times m matrix. Specifically, the identity \det(I + AB) = \det(I + BA) implies that I + AB is invertible if and only if I + BA is invertible, allowing computations on the smaller matrix when m \neq n. This equivalence directly relates to the injectivity of the associated linear operators: for finite-dimensional spaces, the operator I + gf (where g: W \to V and f: V \to W) is injective precisely when \det(I + gf) \neq 0, which by the identity holds precisely when \det(I + fg) \neq 0, enabling efficient checks without full matrix assembly.[16] Such applications simplify rank computations for low-rank perturbations of the identity, since the nullity of I + AB (the dimension of its kernel) vanishes exactly when this determinant condition holds. In control theory, the identity plays a key role in analyzing state-space models, particularly for stability and observability assessments. For instance, in the design of observers and pole-placement problems, it provides algebraic conditions for the existence of feedback gains that achieve desired system poles, by relating the rank of augmented system matrices to determinant evaluations.[17] This approach, developed in the 1970s, avoids direct solution of large Sylvester equations and instead uses the identity to verify full rank or invertibility in multivariable systems.[17] Additionally, in Kalman filtering contexts, the identity equates determinants of observability Gramians across different realizations of linear systems, aiding in the confirmation of observability without exhaustive trajectory simulations; for example, it shows that the empirical observability Gramian remains positive definite under certain input conditions, supporting filter convergence and stability analysis.[18] A practical example arises in discrete-time systems, where verifying the invertibility of I + AB (a prerequisite for positive definiteness in Lyapunov stability analyses of perturbed models) can be performed efficiently using the identity. If B represents a low-dimensional control input matrix, computing \det(I + BA) on the reduced n \times n form (assuming n < m) determines whether I + AB is nonsingular without evaluating the full spectrum of the larger matrix. This technique is particularly valuable in Rosenbrock-style system matrices for multivariable control, where it streamlines stability checks in real-time implementations.[17]
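As an illustration of the invertibility equivalence, one can certify that the large matrix I + AB is nonsingular by testing only the small matrix I + BA; a sketch following this section's dimension convention (A is m \times n, B is n \times m, with n < m; the sizes are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(9)
m, n = 200, 2                     # n < m, so BA is the small n x n product
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, m))

# By the identity, I_m + AB is invertible iff det(I_n + BA) != 0.
small_det = np.linalg.det(np.eye(n) + B @ A)
print(small_det != 0.0)           # certifies invertibility of the m x m matrix
print(np.isclose(small_det, np.linalg.det(np.eye(m) + A @ B)))
```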
Generalizations and Extensions
Bordered Determinant Form
The bordered determinant form of Sylvester's identity establishes a relationship between the determinant of a square matrix and the determinants of matrices formed by bordering one of its leading principal submatrices with selected rows and columns from the original matrix. This generalization is particularly useful in combinatorial matrix analysis and invariant theory, as it expresses higher-order determinants in terms of bordered minors. Consider an n \times n matrix M and an integer t with 0 \leq t \leq n-1. Define a_{i,j}^{(t)} as the determinant of the (t+1) \times (t+1) submatrix obtained from the leading principal submatrix of M of order t by adjoining the i-th row and j-th column of M, where i,j > t. With the convention a_{0,0}^{(-1)} = 1, the identity states: \det(M) \cdot \left[ a_{t,t}^{(t-1)} \right]^{n-t-1} = \det \begin{pmatrix} a_{t+1,t+1}^{(t)} & \cdots & a_{t+1,n}^{(t)} \\ \vdots & \ddots & \vdots \\ a_{n,t+1}^{(t)} & \cdots & a_{n,n}^{(t)} \end{pmatrix}. This formula expresses the full determinant in terms of the (n-t) \times (n-t) matrix of bordered minors.[19] A special case arises when t = n-1, corresponding to bordering the matrix with single row and column vectors v^T and u. In this rank-1 bordering, the determinant of the (n+1) \times (n+1) matrix \begin{pmatrix} M & u \\ v^T & 0 \end{pmatrix} equals -v^T \adj(M) u, where \adj(M) denotes the adjugate matrix of M. Assuming \det(M) \neq 0, this simplifies to -\det(M) \cdot v^T M^{-1} u, or equivalently, the bordered determinant divided by \det(M) yields -v^T M^{-1} u. This vector-bordered variant directly connects to bilinear forms in invariant theory and serves as a foundational tool for computing adjugate entries via cofactor expansion. Sylvester first introduced this bordered form in 1851, focusing on relations among minor determinants of linearly equivalent systems of binary quadratic forms within the framework of classical invariant theory.[20] This predates later formulations of the identity, such as the block matrix version involving non-commuting operators A and B, where the bordered structure equivalently represents the blocks A and B to derive \det(I + AB) = \det(I + BA). The bordered approach thus provides an early structural generalization rooted in enumerative combinatorics of forms.
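The bordered form can be verified numerically by assembling the minors a_{i,j}^{(t)} explicitly; a minimal NumPy sketch, with matrix size, t, and seed chosen purely for illustration (indices are 0-based in the code):

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 5, 2
M = rng.standard_normal((n, n))

def bordered_minor(M, t, i, j):
    # det of the (t+1) x (t+1) submatrix: the leading t x t block of M
    # bordered with row i and column j (0-based, with i, j >= t)
    rows = list(range(t)) + [i]
    cols = list(range(t)) + [j]
    return np.linalg.det(M[np.ix_(rows, cols)])

# (n - t) x (n - t) matrix of bordered minors a_{i,j}^{(t)}
Bm = np.array([[bordered_minor(M, t, i, j) for j in range(t, n)]
               for i in range(t, n)])

lhs = np.linalg.det(M) * np.linalg.det(M[:t, :t]) ** (n - t - 1)
rhs = np.linalg.det(Bm)
print(np.isclose(lhs, rhs))   # True, up to floating-point error
```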
Versions for Infinite-Dimensional Operators
Sylvester's determinant identity admits a natural extension to infinite-dimensional settings via Fredholm determinants on Hilbert spaces. Consider bounded linear operators A: K \to H and B: H \to K, where H and K are (possibly distinct) separable Hilbert spaces. If AB (acting on H) and BA (acting on K) belong to the trace class ideal \mathcal{S}_1, then the Fredholm determinants satisfy \det(I_H + AB) = \det(I_K + BA), provided the infinite products defining these determinants converge, which is ensured by the trace-class condition on the singular values. This equality follows from the fact that AB and BA share the same nonzero eigenvalues (with matching algebraic multiplicities), and the Fredholm determinant is the exponential of the trace of the logarithm, or equivalently, the product \prod_i (1 + \lambda_i) over eigenvalues \lambda_i. The trace-class requirement stems from the foundational work on nuclear operators in the 1950s, building on Grothendieck's thesis (1953) and Lidskii's trace formula (1959), which established that the trace of a trace-class operator equals the sum of its eigenvalues. Extensions of Lidskii's formula to Fredholm determinants underpin the identity, as the logarithm of the determinant relates directly to the trace via \log \det(I + T) = \mathrm{tr} \log(I + T) for trace-class T. These developments were crucial for handling perturbations of the identity operator in functional analysis. In applications, this generalized identity appears in quantum field theory for regularizing functional determinants of differential operators arising in path integrals, and in scattering theory for expressing the scattering matrix via determinants of resolvent operators. For instance, in one-dimensional quantum scattering, the transmission coefficient can be represented as a ratio of Fredholm determinants involving the Jost solutions. However, the identity applies only to nuclear (trace-class) perturbations; for unbounded operators, such as Dirac or Laplacian operators in QFT, regularization techniques like zeta-function or heat-kernel methods are necessary to define the determinant, as the trace may diverge without a cutoff. A finite-section numerical illustration follows the comparison table below.
| Aspect | Finite-Dimensional Case | Infinite-Dimensional Extension |
|---|---|---|
| Operators | Matrices A \in \mathbb{C}^{n \times m}, B \in \mathbb{C}^{m \times n} | Bounded A: K \to H, B: H \to K with AB \in \mathcal{S}_1(H), BA \in \mathcal{S}_1(K) |
| Determinant | Standard \det(I + AB) = \det(I + BA) | Fredholm \det(I + AB) = \det(I + BA), product over eigenvalues |
| Convergence | Always (finite) | Requires trace-class for eigenvalue summability |
| Key Reference | Sylvester (1851) | Gohberg et al. (1990); Simon (2005) for applications |
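As a crude finite-section illustration of the infinite-dimensional statement, a smooth rank-two integral kernel on [0, 1] can be discretized by quadrature, after which the large and small determinants agree exactly at every truncation level and both approximate the Fredholm determinant. Everything below (the kernel, grid size, and trapezoidal rule) is an illustrative assumption:

```python
import numpy as np

# Rank-2 kernel k(x, y) = sin(pi x) cos(pi y) + x y on [0, 1], discretized
# with the trapezoidal rule; the weights are split symmetrically so that
# the N x N matrix AB approximates the integral operator on L^2[0, 1].
N, r = 400, 2
x, h = np.linspace(0.0, 1.0, N, retstep=True)
w = np.full(N, h)
w[0] = w[-1] = h / 2.0                       # trapezoidal weights
sw = np.sqrt(w)

F = np.column_stack([np.sin(np.pi * x), x])  # factors f_i(x)
G = np.column_stack([np.cos(np.pi * x), x])  # factors g_i(y)
A = sw[:, None] * F                          # discretized A: C^r -> "L^2"
B = (sw[:, None] * G).T                      # discretized B: "L^2" -> C^r

d_big = np.linalg.det(np.eye(N) + A @ B)     # finite section of det(I_H + AB)
d_small = np.linalg.det(np.eye(r) + B @ A)   # r x r determinant on C^r
print(np.isclose(d_big, d_small))            # True; both approximate the
                                             # Fredholm determinant of k
```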