In linear algebra, a symmetric matrix is defined as a square matrix A that equals its own transpose, satisfying A = A^T.[1] This property implies that the element in the (i,j)-th position equals the element in the (j,i)-th position, making the matrix symmetric across its main diagonal.[2]

Symmetric matrices exhibit several fundamental properties that distinguish them from general matrices. All eigenvalues of a real symmetric matrix are real numbers, and the matrix is always diagonalizable.[2] Moreover, the eigenvectors corresponding to distinct eigenvalues are orthogonal, allowing the matrix to be decomposed via the spectral theorem as A = Q \Lambda Q^T, where Q is an orthogonal matrix (satisfying Q^T Q = I) and \Lambda is a diagonal matrix containing the real eigenvalues.[1] This orthogonal diagonalization highlights the role of symmetric matrices in geometry and inner product spaces, where they represent self-adjoint operators with respect to orthonormal bases.[1]

A subclass of symmetric matrices, known as positive definite matrices, has all positive eigenvalues and satisfies x^T A x > 0 for all nonzero vectors x; these matrices are essential in optimization problems, covariance matrices in statistics, and stability analysis in physics.[2] Symmetric matrices also arise naturally in quadratic forms, as any such form x^T A x (for real A and x) equals x^T S x, where S = (A + A^T)/2 is symmetric.[1] Their role extends to applications in principal component analysis, quantum mechanics (via Hermitian matrices, the complex analog), and the solution of systems of equations in numerical methods.
Definition and Examples
Definition
In linear algebra, the transpose of a matrix A, denoted A^T, is the matrix obtained by reflecting A over its main diagonal, which interchanges the rows and columns of A. Formally, if A = (a_{ij}) is an m \times n matrix, then the entry in row i, column j of A^T is (A^T)_{ij} = a_{ji}. For example, the 2 \times 2 matrix

A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}

has transpose

A^T = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}.

A symmetric matrix is defined as a square matrix A over a field (typically the real or complex numbers) that is equal to its own transpose, i.e., A = A^T. Equivalently, the entries of A satisfy a_{ij} = a_{ji} for all indices i and j, meaning the matrix is unchanged under reflection over its main diagonal. This property requires A to be square, as the transpose of a non-square matrix has different dimensions and thus cannot equal the original.[3]

Symmetric matrices arise in various contexts, such as quadratic forms and covariance matrices, with concrete examples provided in the dedicated section.[4][5]
Examples
A concrete example of a real symmetric 2×2 matrix is

A = \begin{pmatrix} a & b \\ b & c \end{pmatrix},

where a, b, and c are real numbers, ensuring the off-diagonal entries are equal.[3] To verify symmetry, the transpose is computed as

A^T = \begin{pmatrix} a & b \\ b & c \end{pmatrix} = A,

confirming A = A^T.[3] A specific numerical instance is

A = \begin{pmatrix} 4 & 1 \\ 1 & -2 \end{pmatrix},

which satisfies the symmetry condition since the (1,2) and (2,1) entries both equal 1.[3]

A trivial yet illustrative symmetric matrix is any diagonal matrix, such as the 3×3 form

D = \begin{pmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{pmatrix},

where d_1, d_2, d_3 are real numbers; here, all off-diagonal elements are zero, so D^T = D.[3]

In contrast, a skew-symmetric matrix provides a non-example of symmetry. For instance, the 2×2 matrix

K = \begin{pmatrix} 0 & b \\ -b & 0 \end{pmatrix},

with real b \neq 0, yields K^T = -K \neq K.[6]
Basic Properties
Equality to Transpose
A symmetric matrix A is defined as a square matrix that equals its own transpose, denoted A = A^T. This equality implies that each entry satisfies a_{ij} = a_{ji} for all indices i and j.[3]

For off-diagonal elements where i \neq j, the condition requires a_{ij} = a_{ji}, meaning these entries appear in symmetric pairs across the main diagonal. The diagonal elements a_{ii}, however, are unconstrained by this relation, since each is compared with itself.[3][7]

This property extends to the matrix's representation of bilinear forms. Specifically, a symmetric matrix A corresponds to a symmetric bilinear form B(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T A \mathbf{y}, satisfying B(\mathbf{x}, \mathbf{y}) = B(\mathbf{y}, \mathbf{x}). For real symmetric matrices, with the standard dot product as inner product, this yields \langle \mathbf{x}, A \mathbf{y} \rangle = \langle A \mathbf{x}, \mathbf{y} \rangle for all vectors \mathbf{x} and \mathbf{y}.[8]

The notion of symmetry is defined over an arbitrary field, including the real and complex numbers. When the field's characteristic is not 2, it is compatible with decompositions such as the splitting of a square matrix into symmetric and skew-symmetric parts. In characteristic 2, the transpose equality still defines symmetric matrices, but certain properties of the associated bilinear forms may differ.[9]
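The self-adjointness identity can be spot-checked numerically. This is a minimal NumPy sketch; the random matrix, the seed, and the symmetrization step are illustrative assumptions, not part of the source.

```python
# Illustrative sketch: a real symmetric matrix is self-adjoint with respect to
# the standard dot product, i.e. <x, A y> == <A x, y> for all x, y.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                 # symmetrize an arbitrary matrix

x = rng.standard_normal(4)
y = rng.standard_normal(4)

lhs = x @ (A @ y)                 # <x, A y>
rhs = (A @ x) @ y                 # <A x, y>
print(np.isclose(lhs, rhs))       # True, up to floating-point rounding
```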
Trace and Determinant
The trace of a symmetric matrix A, denoted \operatorname{tr}(A), is defined as the sum of its diagonal entries, \operatorname{tr}(A) = \sum_{i=1}^n a_{ii}. For any square matrix, the trace equals the trace of its transpose, \operatorname{tr}(A) = \operatorname{tr}(A^T), because the diagonal entries remain unchanged under transposition. This invariance is particularly immediate for symmetric matrices, where A = A^T by definition. Computationally, evaluating the trace requires only O(n) operations for an n \times n matrix, making it an efficient invariant for applications such as monitoring convergence in iterative methods for symmetric systems.[10]

The determinant of a symmetric matrix A, denoted \det(A), satisfies \det(A) = \det(A^T), a property that holds for all square matrices, since the expansion of the determinant over permutations is unchanged by transposition. For a 2 \times 2 symmetric matrix

A = \begin{pmatrix} a & b \\ b & c \end{pmatrix},

the determinant is given by

\det(A) = ac - b^2.

This formula illustrates how symmetry constrains the off-diagonal entries to be equal, simplifying computation while yielding the same result as for the transpose. The determinant serves as a key scalar measure of the matrix's singularity and scaling effects in linear transformations.[11]

The rank of a symmetric matrix A, defined as the dimension of its column space (or equivalently, its row space), equals the rank of its transpose, \operatorname{rank}(A) = \operatorname{rank}(A^T). This equality is a general fact for any matrix, arising from the equality of row rank and column rank, but it reinforces the structural consistency inherent to symmetric matrices, where A = A^T. By the rank–nullity theorem, this preservation implies that the nullity, or dimension of the kernel \dim \ker(A), is identical for A and A^T.[12]
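The three invariants can be checked on the 2 × 2 example with a = 4, b = 1, c = -2 (values assumed for illustration). The NumPy calls below are a sketch, not a prescribed procedure from the cited sources.

```python
# Illustrative check of trace, determinant, and rank for a symmetric matrix.
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, -2.0]])

print(np.trace(A), np.trace(A.T))                       # equal traces: 2.0 2.0
print(np.isclose(np.linalg.det(A),
                 4.0 * (-2.0) - 1.0**2))                # det(A) == ac - b^2 == -9
print(np.linalg.matrix_rank(A),
      np.linalg.matrix_rank(A.T))                       # equal ranks: 2 2
```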
Advanced Properties
Normality and Unitary Diagonalization
A symmetric matrix over the real numbers is a special case of a normal matrix. A matrix A is defined to be normal if it commutes with its adjoint, that is, A A^* = A^* A, where A^* denotes the conjugate transpose. For a real symmetric matrix, the adjoint A^* coincides with the transpose A^T, and since A = A^T, it follows that A A^* = A A^T = A^2 = A^T A = A^* A, confirming that real symmetric matrices are normal.[13][14]

The spectral theorem for normal matrices asserts that every normal matrix is unitarily diagonalizable over the complex numbers. Specifically, there exists a unitary matrix U and a diagonal matrix D such that

A = U D U^*,

where the entries of D are the eigenvalues of A. This decomposition highlights the algebraic structure preserved under unitary similarity transformations.[14][15]

This normality property underscores the commutativity of symmetric matrices with their adjoints, enabling unitary diagonalization and facilitating applications in quantum mechanics and signal processing where preservation of inner products is essential.[16]
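Both claims (normality and the unitary diagonalization A = U D U^*) can be verified numerically. The sketch below is illustrative; the random symmetric matrix and the use of np.linalg.eigh (which, for a real symmetric input, returns an orthogonal, hence unitary, eigenvector matrix) are assumptions of the demonstration.

```python
# Minimal sketch: a real symmetric matrix commutes with its adjoint, and
# np.linalg.eigh yields an orthogonal diagonalization A = U D U^T.
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))
A = (B + B.T) / 2                              # real symmetric

print(np.allclose(A @ A.T, A.T @ A))           # normality: A A* == A* A

w, U = np.linalg.eigh(A)                       # real eigenvalues, orthonormal columns
print(np.allclose(U @ np.diag(w) @ U.T, A))    # A == U D U^T
print(np.allclose(U.T @ U, np.eye(3)))         # U is orthogonal (unitary over R)
```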
Real Symmetric Eigenvalues
A fundamental property of real symmetric matrices is that all their eigenvalues are real numbers. Consider a real symmetric matrix A \in \mathbb{R}^{n \times n} with eigenvalue \lambda = \alpha + i\beta and corresponding eigenvector v = x + iy, where x, y \in \mathbb{R}^n. Because A is real and symmetric, A^* = A, so the scalar v^* A v satisfies (v^* A v)^* = v^* A^* v = v^* A v and is therefore real. On the other hand, v^* A v = \lambda \|v\|^2, and since \|v\|^2 > 0 is real, \lambda must be real, forcing \beta = 0.[16]

Real symmetric matrices are orthogonally diagonalizable, meaning there exists an orthogonal matrix Q (satisfying Q^T Q = I) and a diagonal matrix \Lambda with real entries on the diagonal such that A = Q \Lambda Q^T. This decomposition follows from the spectral theorem, which guarantees a complete set of orthonormal eigenvectors spanning \mathbb{R}^n.[17]

The eigenvectors of a real symmetric matrix corresponding to distinct eigenvalues are orthogonal. For eigenvalues \lambda_1 \neq \lambda_2 with eigenvectors u_1, u_2, the relation u_1^T A u_2 = \lambda_2 u_1^T u_2 and symmetry yield u_1^T A u_2 = (A u_1)^T u_2 = \lambda_1 u_1^T u_2, implying (\lambda_1 - \lambda_2) u_1^T u_2 = 0 and thus u_1^T u_2 = 0. In cases of repeated eigenvalues, an orthonormal basis for each eigenspace can be constructed using the Gram–Schmidt process.[15]

A real symmetric matrix A is positive definite if and only if all its eigenvalues are positive. This equivalence holds because the quadratic form x^T A x = x^T Q \Lambda Q^T x = (Q^T x)^T \Lambda (Q^T x) is a sum of terms \lambda_i y_i^2 with y = Q^T x, which is positive for all nonzero x precisely when each \lambda_i > 0. Sylvester's criterion provides an equivalent test: A is positive definite if and only if all of its leading principal minors are positive.[18]
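The following sketch illustrates these statements on an assumed example matrix (a standard tridiagonal positive definite matrix, not taken from the cited sources): the eigenvalues are real, the eigenvectors are orthonormal, and the eigenvalue test for positive definiteness agrees with Sylvester's criterion.

```python
# Illustrative sketch: real eigenvalues, orthonormal eigenvectors, and two
# equivalent positive definiteness tests for a real symmetric matrix.
import numpy as np

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])          # assumed positive definite example

w, Q = np.linalg.eigh(A)                    # real eigenvalues, orthonormal eigenvectors
print(w)                                    # all real (here: 2 - sqrt(2), 2, 2 + sqrt(2))
print(np.allclose(Q.T @ Q, np.eye(3)))      # eigenvector orthonormality

# Positive definiteness, two equivalent checks:
print(np.all(w > 0))                                        # all eigenvalues positive
minors = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]    # Sylvester's criterion
print(all(m > 0 for m in minors))                           # leading minors 2, 3, 4
```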
Complex Symmetric Cases
Over the complex numbers, a matrix A is called symmetric if it equals its transpose, A = A^T, where the transpose does not involve conjugation. This condition contrasts with the Hermitian case, A = A^* (conjugate transpose), which imposes additional structure leading to real eigenvalues and unitary diagonalizability. Complex symmetric matrices arise in areas such as quantum mechanics with PT-symmetry and certain optimization problems over the complex numbers, but they lack the self-adjointness that ensures many of the spectral properties of Hermitian matrices.[19]

Unlike their real counterparts, complex symmetric matrices can possess non-real eigenvalues, and their eigenvectors are not necessarily orthogonal with respect to the standard Hermitian inner product. Complex symmetric matrices are normal only in special cases; in general, they lack the normality that enables simultaneous unitary diagonalization with the adjoint. A representative example is the 2 \times 2 matrix

A = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix},

which is complex symmetric and has characteristic polynomial \lambda^2 + 1, yielding eigenvalues \lambda = \pm i, both non-real. The corresponding eigenvectors [1, 1]^T and [1, -1]^T happen to be orthogonal in this instance, but this is not guaranteed; for instance, the matrix \begin{pmatrix} 1 & i \\ i & 2 \end{pmatrix} has non-real eigenvalues (3 \pm i \sqrt{3})/2 with eigenvectors whose Hermitian inner product is nonzero.[19][20]

The spectral theory of complex symmetric matrices is captured by the Takagi factorization (also known as the Autonne–Takagi factorization), which decomposes any n \times n complex symmetric matrix A as A = U D U^T, where U is unitary and D is diagonal with non-negative real entries (the Takagi values, which coincide with the singular values). This factorization highlights that while eigenvalues may be complex and eigenvectors non-orthogonal, the matrix admits a symmetric-like decomposition emphasizing its singular value structure rather than an eigenvalue basis. The Takagi factorization underpins applications in numerical linear algebra and the analysis of non-Hermitian operators, with algorithmic computations relying on iterative methods such as simultaneous iteration or Newton-based approaches for stability.[21]
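The second example above can be checked directly. This sketch only verifies the eigenvalue and eigenvector claims with NumPy; it does not compute a Takagi factorization, for which no standard NumPy routine is assumed here.

```python
# Illustrative sketch for the example [[1, i], [i, 2]]: complex symmetric but not
# Hermitian, with non-real eigenvalues and non-orthogonal eigenvectors under the
# Hermitian inner product <u, v> = u^H v.
import numpy as np

A = np.array([[1.0, 1j],
              [1j, 2.0]])

print(np.allclose(A, A.T))            # True: complex symmetric (no conjugation)
print(np.allclose(A, A.conj().T))     # False: not Hermitian

w, V = np.linalg.eig(A)
print(w)                              # approximately (3 +/- i*sqrt(3)) / 2
print(np.vdot(V[:, 0], V[:, 1]))      # nonzero Hermitian inner product
```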
Decompositions
Symmetric-Skew Decomposition
Every square matrix A over a field of characteristic not equal to 2 admits a unique decomposition as the sum of a symmetric matrix and a skew-symmetric matrix, expressed as A = S + K, where S is symmetric (S^T = S) and K is skew-symmetric (K^T = -K). This decomposition arises because the vector space of all n \times n matrices over such a field is the direct sum of the subspace of symmetric matrices (dimension n(n+1)/2) and the subspace of skew-symmetric matrices (dimension n(n-1)/2), with their intersection being only the zero matrix.[9]

The explicit formulas for the components are S = \frac{A + A^T}{2} and K = \frac{A - A^T}{2}. To verify, note that S^T = \frac{A^T + (A^T)^T}{2} = \frac{A^T + A}{2} = S, confirming symmetry, and similarly K^T = \frac{A^T - A}{2} = -K, confirming skew-symmetry; their sum recovers A since S + K = \frac{A + A^T + A - A^T}{2} = A. Uniqueness follows from the direct sum property: if A = S_1 + K_1 = S_2 + K_2, then S_1 - S_2 = K_2 - K_1, which is both symmetric and skew-symmetric and hence zero, implying S_1 = S_2 and K_1 = K_2.

For a concrete example, consider the 2 \times 2 matrix

A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}.

The transpose is A^T = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}, so

S = \frac{1}{2} \begin{pmatrix} 2 & 5 \\ 5 & 8 \end{pmatrix} = \begin{pmatrix} 1 & 2.5 \\ 2.5 & 4 \end{pmatrix},

which is symmetric, and

K = \frac{1}{2} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -0.5 \\ 0.5 & 0 \end{pmatrix},

which is skew-symmetric; indeed, A = S + K.

This decomposition preserves the trace of the matrix, as \operatorname{trace}(S) = \operatorname{trace}(A) and \operatorname{trace}(K) = 0. The zero trace of the skew-symmetric part holds because \operatorname{trace}(K) = \operatorname{trace}\left( \frac{A - A^T}{2} \right) = \frac{1}{2} \left( \operatorname{trace}(A) - \operatorname{trace}(A^T) \right) = \frac{1}{2} \left( \operatorname{trace}(A) - \operatorname{trace}(A) \right) = 0, using the fact that the trace is invariant under transposition.
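The worked example translates directly into code. The following NumPy sketch reproduces S and K for the matrix above and checks the symmetry, skew-symmetry, reconstruction, and trace properties; it is illustrative only.

```python
# Minimal sketch of the decomposition A = S + K for the worked 2 x 2 example.
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

S = (A + A.T) / 2                       # symmetric part
K = (A - A.T) / 2                       # skew-symmetric part

print(S)                                # [[1.  2.5] [2.5 4. ]]
print(K)                                # [[ 0.  -0.5] [ 0.5  0. ]]
print(np.allclose(S, S.T))              # True: S symmetric
print(np.allclose(K, -K.T))             # True: K skew-symmetric
print(np.allclose(S + K, A))            # True: recovers A
print(np.trace(S), np.trace(K))         # 5.0 0.0 (trace preserved; K is traceless)
```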
Eigenvalue Decomposition
Every real symmetric matrix admits an eigenvalue decomposition, also known as the spectral decomposition, given by A = Q \Lambda Q^T, where Q is an orthogonal matrix whose columns are the orthonormal eigenvectors of A, and \Lambda is a diagonal matrix containing the real eigenvalues of A on its main diagonal.[16] This decomposition follows from the spectral theorem, which guarantees that real symmetric matrices are orthogonally diagonalizable, with all eigenvalues real and eigenvectors forming an orthonormal basis for the underlying vector space.[22] Since Q is orthogonal, Q^{-1} = Q^T; the decomposition is unique up to the ordering of the eigenvalues and the choice of orthonormal eigenvectors within each eigenspace, and it is computationally stable for many applications.[23]

A key property enabling this decomposition is that for real symmetric matrices, the algebraic multiplicity of each eigenvalue equals its geometric multiplicity, implying the matrix is always diagonalizable without defective eigenvalues.[24] This contrasts with general matrices, where Jordan canonical forms may be required for non-diagonalizable cases. The orthonormal nature of the eigenvectors arises because eigenvectors corresponding to distinct eigenvalues are orthogonal, and those for repeated eigenvalues can be chosen to form an orthonormal set within the eigenspace.[16]

For complex symmetric matrices, the situation differs: such matrices are not necessarily diagonalizable by a unitary matrix, though they may admit a diagonalization by a complex orthogonal matrix if the eigenvectors can be selected accordingly; in defective cases, which occur but are less common, a Jordan canonical form is needed instead.[25] Algorithms exist to handle diagonalization when possible, often involving the Takagi factorization as an intermediate step.[26]

Eigenvalues of symmetric matrices can be computed using the Rayleigh quotient, defined as R(x) = \frac{x^T A x}{x^T x} for a nonzero vector x, which provides an estimate of an eigenvalue; its maximum over unit vectors equals the largest eigenvalue and its minimum equals the smallest.[27] This quotient is central to iterative methods such as the power method and the Lanczos algorithm for approximating the spectral decomposition efficiently.[28]
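The sketch below illustrates the decomposition and the Rayleigh quotient on an assumed 2 × 2 example. The iteration shown is a bare-bones power method for demonstration, not a production eigenvalue algorithm.

```python
# Illustrative sketch: spectral decomposition via np.linalg.eigh, plus the Rayleigh
# quotient evaluated on a power-method iterate as an estimate of the largest eigenvalue.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                     # assumed symmetric example

w, Q = np.linalg.eigh(A)                       # eigenvalues in ascending order
print(np.allclose(Q @ np.diag(w) @ Q.T, A))    # A == Q Lambda Q^T

def rayleigh(A, x):
    """Rayleigh quotient R(x) = x^T A x / x^T x for a nonzero vector x."""
    return (x @ A @ x) / (x @ x)

x = np.array([1.0, 0.0])
for _ in range(50):                            # power method: converges to the
    x = A @ x                                  # dominant eigenvector for this A
    x /= np.linalg.norm(x)

print(rayleigh(A, x), w[-1])                   # Rayleigh quotient ~ largest eigenvalue
```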
Singular Value Decomposition
The singular value decomposition (SVD) of a real matrix A \in \mathbb{R}^{m \times n} is a factorization A = U \Sigma V^T, where U \in \mathbb{R}^{m \times m} and V \in \mathbb{R}^{n \times n} are orthogonal matrices, and \Sigma \in \mathbb{R}^{m \times n} is a rectangular diagonal matrix with non-negative real numbers \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0 (the singular values) on the diagonal and zeros elsewhere, where r = \operatorname{rank}(A) \leq \min(m, n).[29] For a real symmetric matrix A \in \mathbb{R}^{n \times n}, the SVD is A = U \Sigma V^T, where U and V are orthogonal matrices whose columns are the eigenvectors of A (up to signs), and the singular values are the absolute values of the eigenvalues of A, i.e., \sigma_i = |\lambda_i(A)|.[29] This follows because the eigenvalues of A^T A = A^2 are the squares of the eigenvalues of A, making the singular values \sigma_i = \sqrt{\lambda_i(A^2)} = |\lambda_i(A)|, with the left and right singular vectors coinciding with the eigenvectors of A up to sign flips for negative eigenvalues.[29]

When A is positive semidefinite, all eigenvalues are non-negative, so the singular values equal the eigenvalues, V = U, and the SVD coincides with the eigenvalue decomposition A = U \Lambda U^T, where \Lambda has the eigenvalues on the diagonal.[29] In general, the SVD offers greater numerical stability for computing decompositions of symmetric matrices compared to direct eigenvalue methods, particularly in ill-conditioned cases or when the matrix is nearly singular, as the algorithms (such as Golub–Reinsch bidiagonalization followed by QR iterations) ensure backward stability with small perturbations relative to the machine epsilon.[30]

Consider the 2×2 symmetric matrix

A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.

The eigenvalues are \lambda_1 = 1 and \lambda_2 = -1, with corresponding orthonormal eigenvectors \mathbf{q}_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} and \mathbf{q}_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. The singular values are both equal to 1. A valid SVD is

A = U \Sigma V^T, \quad \Sigma = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},

where U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} and V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}. Here, the second column of V is -\mathbf{q}_2 to account for the negative eigenvalue.[29]
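The relation between singular values and eigenvalues can be confirmed on the example above. This NumPy sketch is illustrative; np.linalg.svd may return a different (equally valid) choice of U and V than the one written out in the text.

```python
# Illustrative sketch: for a real symmetric matrix, the singular values equal the
# absolute values of the eigenvalues.
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

U, s, Vt = np.linalg.svd(A)
w = np.linalg.eigvalsh(A)

print(s)                                     # [1. 1.]  singular values
print(np.sort(np.abs(w)))                    # [1. 1.]  == |eigenvalues|
print(np.allclose(U @ np.diag(s) @ Vt, A))   # True: valid factorization
```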
Applications
Hessian in Optimization
In multivariable calculus, the Hessian matrix of a twice continuously differentiable function f: \mathbb{R}^n \to \mathbb{R} at a point is the symmetric n \times n matrix whose (i,j)-th entry is the second partial derivative H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}.[31] This symmetry arises from Clairaut's theorem (also known as Schwarz's theorem), which states that if the second partial derivatives are continuous in a neighborhood of the point, then the mixed partials are equal: \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}.[31] The continuity assumption ensures the Hessian is well defined and symmetric, facilitating its use in analyzing the local behavior of the function near critical points where the gradient vanishes.

The Hessian plays a central role in the second derivative test for classifying critical points of f. At such a point \mathbf{a}, if the Hessian H(\mathbf{a}) is positive definite (all eigenvalues positive), then f has a strict local minimum at \mathbf{a}; if negative definite (all eigenvalues negative), a strict local maximum; and if indefinite (eigenvalues of mixed signs), a saddle point.[32] If the Hessian is positive semidefinite or negative semidefinite but not definite, or singular, the test is inconclusive. This classification leverages the spectral properties of real symmetric matrices, where positive definiteness corresponds to the quadratic form \mathbf{h}^T H(\mathbf{a}) \mathbf{h} > 0 for all nonzero \mathbf{h} \in \mathbb{R}^n.[32]

For example, consider f(x,y) = x^2 + y^2 + xy. The critical point is at (0,0), and the Hessian is the constant matrix

H = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.

The eigenvalues are found by solving \det(H - \lambda I) = (2-\lambda)^2 - 1 = 0, yielding \lambda = 1 and \lambda = 3, both positive, so H is positive definite and (0,0) is a local minimum.[31]

In optimization, the Hessian is integral to Newton's method for finding minima of f. Starting from an initial guess \mathbf{x}^{(0)}, the update is \mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - H(\mathbf{x}^{(k)})^{-1} \nabla f(\mathbf{x}^{(k)}), where the inverse Hessian approximates the curvature to achieve quadratic convergence near the optimum under suitable conditions like positive definiteness of H.[33] This second-order approach outperforms first-order methods like gradient descent in regions of strong convexity but requires efficient computation or approximation of the inverse Hessian for large-scale problems.[33]
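The example and the Newton update can be combined into a short sketch. This is illustrative code under stated assumptions: the gradient is written out by hand for f(x,y) = x^2 + y^2 + xy, and because f is quadratic with a constant positive definite Hessian, a single Newton step reaches the minimizer from any starting point (the starting point chosen here is arbitrary).

```python
# Minimal sketch: classify the critical point of f(x, y) = x^2 + y^2 + x*y via the
# Hessian's eigenvalues, then take one Newton step toward the minimizer.
import numpy as np

def grad_f(p):
    x, y = p
    return np.array([2*x + y, 2*y + x])       # gradient of f

H = np.array([[2.0, 1.0],
              [1.0, 2.0]])                    # Hessian of f (constant for this f)

print(np.linalg.eigvalsh(H))                  # [1. 3.] -> positive definite

p = np.array([5.0, -3.0])                     # arbitrary starting point
p = p - np.linalg.solve(H, grad_f(p))         # Newton update x_{k+1} = x_k - H^{-1} grad f
print(p)                                      # [0. 0.], the strict (global) minimum
```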
Quadratic Forms
A quadratic form on a real vector space is a homogeneous polynomial of degree two, expressed as q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, where A is an n \times n symmetric matrix and \mathbf{x} \in \mathbb{R}^n.[34] The symmetric matrix A representing a given form is unique: any coefficient matrix can be replaced by its symmetric part without altering the values of the form, so every quadratic form can be written in this manner.[35]

Symmetric matrices enable the classification of quadratic forms based on their sign properties, which correspond to the eigenvalues of A. A quadratic form q is positive definite if q(\mathbf{x}) > 0 for all \mathbf{x} \neq \mathbf{0}, positive semidefinite if q(\mathbf{x}) \geq 0 for all \mathbf{x}, negative definite if q(\mathbf{x}) < 0 for all \mathbf{x} \neq \mathbf{0}, negative semidefinite if q(\mathbf{x}) \leq 0 for all \mathbf{x}, and indefinite otherwise.[36] These classifications are determined by the eigenvalues of the symmetric matrix A: A is positive definite if and only if all eigenvalues are positive, positive semidefinite if all eigenvalues are non-negative, and similarly for the negative cases; indefiniteness occurs when eigenvalues have mixed signs.[37] Sylvester's law of inertia further guarantees that the numbers of positive, negative, and zero eigenvalues (the inertia) are invariant under congruence transformations, providing a complete signature for the form.[38]

Since real symmetric matrices are orthogonally diagonalizable, there exists an orthogonal matrix Q such that A = Q D Q^T, where D = \operatorname{diag}(\lambda_1, \dots, \lambda_n) contains the eigenvalues. Substituting the change of variables \mathbf{y} = Q^T \mathbf{x} (which preserves the Euclidean norm), the quadratic form simplifies to

q(\mathbf{x}) = \mathbf{y}^T D \mathbf{y} = \sum_{i=1}^n \lambda_i y_i^2.

This diagonal representation highlights the form's structure in principal coordinates aligned with the eigenvectors.[34]

Geometrically, for a positive definite quadratic form, the level set \{\mathbf{x} \mid q(\mathbf{x}) = 1\} defines an ellipsoid centered at the origin, with principal axes along the eigenvectors of A and semi-axis lengths equal to the reciprocals of the square roots of the eigenvalues. The sublevel set \{\mathbf{x} \mid q(\mathbf{x}) \leq 1\} then forms the solid ellipsoid, illustrating how the form encodes ellipsoidal geometry in \mathbb{R}^n. For indefinite forms, the level sets are hyperboloids.[39]
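A short sketch ties these steps together: symmetrize an assumed coefficient matrix, classify the form by eigenvalue signs, and check that the diagonalized form \sum_i \lambda_i y_i^2 evaluates identically. The specific form q(x) = 2x_1^2 + 2x_1x_2 + 3x_2^2 and the test vector are assumptions of the illustration.

```python
# Illustrative sketch: symmetrize a coefficient matrix, classify the quadratic form,
# and verify the diagonalized evaluation sum(lambda_i * y_i^2).
import numpy as np

# q(x) = 2*x1^2 + 2*x1*x2 + 3*x2^2, written first with a non-symmetric coefficient matrix
B = np.array([[2.0, 2.0],
              [0.0, 3.0]])
A = (B + B.T) / 2                        # unique symmetric representative

w, Q = np.linalg.eigh(A)
print(w)                                 # both positive -> positive definite form

x = np.array([1.0, -2.0])
y = Q.T @ x                              # principal coordinates
print(x @ A @ x, np.sum(w * y**2))       # both evaluate to 10.0
```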
Related Concepts
Symmetrizable Matrices
A symmetrizable matrix is defined as a square matrix A over the real numbers for which there exists an invertible matrix D such that D^{-1} A D is symmetric.[40]

Over the reals, a matrix is symmetrizable in this sense if and only if its Jordan canonical form is a real diagonal matrix.[41] This condition is equivalent to the matrix being diagonalizable over the reals with all eigenvalues real: the spectral theorem guarantees that real symmetric matrices are orthogonally diagonalizable with real eigenvalues, and conversely a real diagonal matrix is itself symmetric.[42]

All symmetric matrices are symmetrizable, with D taken as the identity matrix.[40]

Examples of symmetrizable matrices include all real diagonal matrices, which are already symmetric. Matrices with distinct real eigenvalues also qualify, since they are diagonalizable over the reals.[42]
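Under the definition above, a matrix with distinct real eigenvalues can be exhibited as symmetrizable by taking D to be its eigenvector matrix, so that D^{-1} A D is diagonal and hence symmetric. The sketch below illustrates this for an assumed upper triangular example; it is not a general-purpose symmetrization routine.

```python
# Illustrative sketch: a matrix with distinct real eigenvalues is symmetrizable,
# with D chosen as its eigenvector matrix (D^{-1} A D is then diagonal).
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])               # distinct real eigenvalues 2 and 3

w, D = np.linalg.eig(A)                  # columns of D are eigenvectors of A
M = np.linalg.inv(D) @ A @ D             # similarity transform

print(np.allclose(M, np.diag(w)))        # True: M is diagonal
print(np.allclose(M, M.T))               # True: in particular, M is symmetric
```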
Congruence to Symmetric Matrices
In linear algebra, two square matrices A and B of the same size are said to be congruent if there exists an invertible matrix P such that B = P^T A P. This transformation preserves the essential properties of the quadratic form associated with a symmetric matrix, such as its signature. Congruence is particularly relevant for real symmetric matrices, where it allows for the classification of equivalence classes based on invariant features.[43]

Sylvester's law of inertia, named after James Joseph Sylvester who proved it in 1852, states that two real symmetric matrices are congruent if and only if they have the same inertia. The inertia of a real symmetric matrix is the triple (n_+, n_-, n_0), where n_+ is the number of positive eigenvalues, n_- the number of negative eigenvalues, and n_0 the number of zero eigenvalues (with n_+ + n_- + n_0 = n for an n \times n matrix). This law implies that the rank (which is n_+ + n_-) and the signature (typically n_+ - n_-) are also invariants under congruence. The law provides a complete classification: any real symmetric matrix is congruent to a unique (up to permutation of the diagonal entries) diagonal matrix with n_+ entries equal to +1, n_- entries equal to -1, and n_0 entries equal to 0 on the diagonal.[38][43][44]

This congruence property holds for any real symmetric matrix, achievable through a suitable choice of basis that diagonalizes the associated quadratic form while preserving the inertia. In practice, completing the square or using the Cholesky decomposition (for positive definite cases) can facilitate this reduction, though the general case relies on spectral analysis to determine the inertia first. The invariance ensures that geometric interpretations, such as the number of positive and negative directions in the quadratic form, remain unchanged across congruent representations.[38][43]

For example, consider the quadratic form q(\mathbf{x}) = x_1^2 + 2x_1 x_2 - 3x_2^2, represented by the symmetric matrix

A = \begin{pmatrix} 1 & 1 \\ 1 & -3 \end{pmatrix}.

The eigenvalues of A are \lambda_1 = -1 + \sqrt{5} > 0 and \lambda_2 = -1 - \sqrt{5} < 0, so the inertia is (1, 1, 0). By Sylvester's law, A is congruent to the diagonal matrix \operatorname{diag}(1, -1). An explicit congruence can be found by completing the square: with y_1 = x_1 + x_2 and y_2 = x_2, the form becomes y_1^2 - 4y_2^2, corresponding to P = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} (invertible) and P^T A P = \begin{pmatrix} 1 & 0 \\ 0 & -4 \end{pmatrix}; a further congruence by the invertible diagonal matrix \operatorname{diag}(1, \tfrac{1}{2}) rescales this to \operatorname{diag}(1, -1). This reduction highlights how congruence simplifies the quadratic form to a sum of squares and negative squares, revealing its hyperbolic nature.[38][43]
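The worked congruence can be verified numerically. The sketch below checks that P^T A P is the expected diagonal matrix and that the inertia is preserved; the naive sign-counting helper is an assumption of the illustration (it does not handle borderline near-zero eigenvalues robustly).

```python
# Minimal check of the worked example: P^T A P = diag(1, -4), and the inertia
# (1 positive, 1 negative, 0 zero) is preserved, as Sylvester's law requires.
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, -3.0]])
P = np.array([[1.0, -1.0],
              [0.0, 1.0]])

print(P.T @ A @ P)                        # [[ 1.  0.] [ 0. -4.]]

def inertia(S):
    """Naive inertia count (n_+, n_-, n_0) from the eigenvalues of a symmetric matrix."""
    w = np.linalg.eigvalsh(S)
    return (int(np.sum(w > 0)), int(np.sum(w < 0)), int(np.sum(np.isclose(w, 0.0))))

print(inertia(A), inertia(P.T @ A @ P))   # both (1, 1, 0)
```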