A quadratic form is a homogeneous polynomial of degree two in a finite number of variables, generalizing the notion of a quadratic equation to multiple variables and expressible as q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, where A is a symmetric matrix and \mathbf{x} is a column vector.[1][2] These forms arise naturally in diverse mathematical contexts, including linear algebra, where they are associated with symmetric matrices, and number theory, where binary quadratic forms like ax^2 + bxy + cy^2 are used to study Diophantine equations and class numbers.[1] Over the real numbers, quadratic forms can be diagonalized by orthogonal transformations, and Sylvester's law of inertia classifies them by their signature (the numbers of positive, negative, and zero eigenvalues), which determines whether they are positive definite, indefinite, or semidefinite.[1][2] Historically, quadratic forms have roots in ancient mathematics, with developments by Babylonian mathematicians around 1800 BCE, Brahmagupta in 628 CE, and Carl Friedrich Gauss in 1801.[3][1] Applications include defining conic sections and quadrics in geometry, principal component analysis in statistics via covariance matrices,[2] and modeling spacetime in physics, such as the (3,1) signature of Minkowski space.[4]
Fundamentals
Definition and basic properties
A quadratic form over a field F (of characteristic not equal to 2) is a homogeneous polynomial of degree two in n variables x_1, \dots, x_n with coefficients in F. It takes the general form Q(\mathbf{x}) = \sum_{1 \leq i \leq j \leq n} a_{ij} x_i x_j, where each coefficient a_{ij} \in F; summing over i \leq j lists each cross term exactly once, so no separate symmetry condition on the coefficients is needed in this convention.[5][6] This expression defines a map Q: F^n \to F. Homogeneity of degree two distinguishes quadratic forms from homogeneous polynomials of other degrees: for any scalar \lambda \in F and vector \mathbf{x} \in F^n, Q(\lambda \mathbf{x}) = \lambda^2 Q(\mathbf{x}). This scaling behavior enables algebraic manipulations not available to higher-degree forms.[5] An immediate consequence of this quadratic structure is the parallelogram law Q(\mathbf{x} + \mathbf{y}) + Q(\mathbf{x} - \mathbf{y}) = 2 Q(\mathbf{x}) + 2 Q(\mathbf{y}), which holds for all \mathbf{x}, \mathbf{y} \in F^n and allows the quadratic form to be reconstructed from its associated symmetric bilinear form via polarization.[5] Basic examples illustrate these properties. Consider the binary quadratic form Q(x, y) = x^2 + 2xy + y^2 over the real numbers, which simplifies to (x + y)^2 and satisfies the homogeneity Q(\lambda x, \lambda y) = \lambda^2 (x + y)^2 as well as the parallelogram law. This form demonstrates how cross terms like 2xy arise naturally in the expansion while preserving the overall quadratic structure.[6]
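A short numeric sketch can make the two displayed identities concrete; the following Python check (illustrative, not drawn from the cited sources) verifies homogeneity and the parallelogram law for the example form Q(x, y) = (x + y)^2.

```python
# Illustrative check of homogeneity and the parallelogram law for
# Q(x, y) = x^2 + 2xy + y^2 = (x + y)^2.
def Q(x, y):
    return x**2 + 2*x*y + y**2

lam = 3
u, v = (1, 2), (-2, 5)

# Homogeneity: Q(lam * u) == lam^2 * Q(u)
assert Q(lam * u[0], lam * u[1]) == lam**2 * Q(*u)

# Parallelogram law: Q(u + v) + Q(u - v) == 2 Q(u) + 2 Q(v)
lhs = Q(u[0] + v[0], u[1] + v[1]) + Q(u[0] - v[0], u[1] - v[1])
assert lhs == 2 * Q(*u) + 2 * Q(*v)
```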
Every quadratic form Q on a vector space V over a field F of characteristic not 2, such as the real or complex numbers, induces an associated symmetric bilinear form B: V \times V \to F. This association is given by the polarization identity B(x, y) = \frac{Q(x + y) - Q(x - y)}{4}, which expresses the bilinear form directly in terms of the quadratic form and establishes a bridge between quadratic and bilinear structures.[7][8] The form B is bilinear, as can be checked by expanding the right-hand side with the parallelogram law, and it is symmetric, satisfying B(x, y) = B(y, x) for all x, y \in V, since Q(x + y) - Q(x - y) = Q(y + x) - Q(y - x). Conversely, the quadratic form is recovered from the bilinear form via Q(x) = B(x, x), confirming the direct correspondence between the two.[7][8] This association is unique: for any quadratic form Q, there exists exactly one symmetric bilinear form B such that Q(x) = B(x, x) for all x \in V, and every symmetric bilinear form arises this way from its associated quadratic form. Uniqueness follows from the polarization identity, which fully determines B from Q, and symmetry ensures that no other symmetric bilinear form can produce the same values on the diagonal.[7][8][9] For a concrete illustration, consider the quadratic form Q(x, y) = x^2 + 2xy + y^2 on \mathbb{R}^2. Applying the polarization identity yields the associated symmetric bilinear form B((x_1, y_1), (x_2, y_2)) = x_1 x_2 + x_1 y_2 + y_1 x_2 + y_1 y_2, which is symmetric and satisfies Q(x, y) = B((x, y), (x, y)). This example shows how cross terms in Q distribute evenly in B, reflecting the underlying symmetry.[8] This coordinate-free construction of B from Q facilitates later representations in terms of matrices, though the bilinear form itself remains independent of any basis choice.[7]
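The polarization identity lends itself to direct symbolic verification. The following SymPy sketch (illustrative; the variable names are arbitrary) recovers the bilinear form B from Q(x, y) = x^2 + 2xy + y^2 and checks that B agrees with Q on the diagonal.

```python
import sympy as sp

x1, y1, x2, y2 = sp.symbols('x1 y1 x2 y2')
Q = lambda x, y: x**2 + 2*x*y + y**2

# Polarization: B(u, v) = (Q(u + v) - Q(u - v)) / 4
B = sp.expand((Q(x1 + x2, y1 + y2) - Q(x1 - x2, y1 - y2)) / 4)
print(B)  # x1*x2 + x1*y2 + x2*y1 + y1*y2, as in the text

# Diagonal recovery: B((x, y), (x, y)) == Q(x, y)
assert sp.expand(B.subs({x2: x1, y2: y1}) - Q(x1, y1)) == 0
```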
Matrix representation
In finite-dimensional vector spaces over the reals or complex numbers, quadratic forms are concretely represented using symmetric matrices. Consider a quadratic form Q: \mathbb{R}^n \to \mathbb{R} defined on column vectors x = (x_1, \dots, x_n)^T. It can be expressed as Q(x) = x^T A x, where A = (a_{ij}) is an n \times n symmetric matrix with real entries satisfying A = A^T, meaning a_{ij} = a_{ji} for all i, j. This representation arises from the associated symmetric bilinear form, with the off-diagonal entries capturing the cross terms: the coefficient of x_i x_j for i \neq j is 2a_{ij}. Every quadratic form on \mathbb{R}^n admits a unique such symmetric representation. To see this, suppose a quadratic form is initially given by Q(x) = x^T B x for some (possibly nonsymmetric) matrix B. Then Q(x) = x^T \left( \frac{B + B^T}{2} \right) x, since the skew-symmetric part contributes nothing: writing C = \frac{B - B^T}{2}, the scalar x^T C x equals its own transpose x^T C^T x = -x^T C x, so it vanishes for all x. Thus replacing B with its symmetric part proves existence, and uniqueness follows because a symmetric matrix is fully determined by its quadratic form via polarization. Under a change of basis, the matrix representation transforms accordingly. If P is an invertible n \times n matrix whose columns are the new basis vectors, and x = P y expresses coordinates in the new basis, then Q(x) = y^T (P^T A P) y, so the new matrix is A' = P^T A P. This congruence transformation preserves the quadratic nature of the form while reflecting the coordinate change. For a concrete example, consider the quadratic form Q(x, y) = x^2 + 2xy + y^2 in two variables. Its symmetric matrix representation is A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, since Q\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = x^2 + 2xy + y^2.
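The symmetrization argument and the congruence rule can both be checked numerically. The sketch below (illustrative, using NumPy) replaces a nonsymmetric representative B by its symmetric part and confirms that a change of basis x = P y transforms the matrix by congruence.

```python
import numpy as np

B = np.array([[1.0, 2.0], [0.0, 1.0]])  # nonsymmetric matrix for x^2 + 2xy + y^2
A = (B + B.T) / 2                        # symmetric part: [[1, 1], [1, 1]]

x = np.array([3.0, -2.0])
assert np.isclose(x @ B @ x, x @ A @ x)  # the skew-symmetric part contributes 0

P = np.array([[2.0, 1.0], [1.0, 1.0]])   # an invertible change-of-basis matrix
y = np.linalg.solve(P, x)                # coordinates with x = P y
assert np.isclose(x @ A @ x, y @ (P.T @ A @ P) @ y)  # A' = P^T A P
```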
Quadratic forms over the reals
Diagonalization and spectral theorem
The spectral theorem provides a fundamental tool for analyzing quadratic forms over the real numbers, as these forms are represented by symmetric matrices. It states that every real symmetric matrix A \in \mathbb{R}^{n \times n} is orthogonally diagonalizable, meaning there exists an orthogonal matrix O (satisfying O^T O = I) and a diagonal matrix D = \operatorname{diag}(\lambda_1, \dots, \lambda_n) with real entries \lambda_i such that A = O D O^T.[10] This decomposition arises because symmetric matrices are self-adjoint with respect to the standard Euclidean inner product, ensuring all eigenvalues are real and eigenvectors can be chosen to form an orthonormal basis.[11] In the context of a quadratic form Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, the spectral theorem enables a change of variables \mathbf{x} = O \mathbf{y} that transforms Q into a sum of squares: Q(\mathbf{x}) = \mathbf{y}^T D \mathbf{y} = \sum_{i=1}^n \lambda_i y_i^2. This orthogonal transformation preserves the Euclidean norm and simplifies the form to a diagonal expression without cross terms, facilitating geometric and analytic interpretations.[12] The eigenvalues \lambda_i on the diagonal directly reflect the scaling factors along the principal axes defined by the columns of O. A proof sketch proceeds by induction on the dimension n. For symmetric A, the characteristic polynomial has real roots, so A has at least one real eigenvalue \lambda with eigenvector \mathbf{v}, which can be normalized to unit length. The eigenspace and its orthogonal complement are invariant under A, reducing the problem to smaller symmetric matrices on these subspaces.[10] Distinct eigenvalues yield orthogonal eigenvectors: if A\mathbf{v} = \lambda\mathbf{v} and A\mathbf{u} = \mu\mathbf{u}, then \mathbf{u}^T A \mathbf{v} = \lambda \mathbf{u}^T \mathbf{v} and, by symmetry, \mathbf{u}^T A \mathbf{v} = \mu \mathbf{u}^T \mathbf{v}, so (\lambda - \mu) \mathbf{u}^T \mathbf{v} = 0. Within eigenspaces of higher multiplicity, Gram-Schmidt orthogonalization produces a full orthonormal eigenbasis, forming O.[11] For example, consider the symmetric matrix A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}. The characteristic equation \det(A - \lambda I) = \lambda(\lambda - 2) = 0 yields eigenvalues \lambda_1 = 2 and \lambda_2 = 0. The corresponding orthonormal eigenvectors are \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} and \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, forming the columns of O = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. Thus A = O D O^T with D = \operatorname{diag}(2, 0), and the quadratic form becomes Q(\mathbf{x}) = 2u^2 + 0 \cdot v^2 under \mathbf{x} = O \begin{pmatrix} u \\ v \end{pmatrix}.[12]
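In floating-point practice, this decomposition is computed with a symmetric eigensolver. The sketch below (illustrative) reproduces the worked example with numpy.linalg.eigh, which returns real eigenvalues and an orthogonal matrix of eigenvectors.

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 1.0]])
eigvals, O = np.linalg.eigh(A)           # ascending eigenvalues: [0., 2.]
assert np.allclose(O @ np.diag(eigvals) @ O.T, A)

# Change of variables x = O y turns Q into the sum of squares sum_i lambda_i y_i^2
y = np.array([1.0, 1.0])
x = O @ y
assert np.isclose(x @ A @ x, eigvals @ y**2)
```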
Positive semidefinite forms
A quadratic form Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, where A is a symmetric real matrix, is said to be positive semidefinite if Q(\mathbf{x}) \geq 0 for all \mathbf{x} \in \mathbb{R}^n. It is positive definite if Q(\mathbf{x}) > 0 for all \mathbf{x} \neq \mathbf{0}.[13][14] The matrix A associated with a positive semidefinite quadratic form has all eigenvalues \lambda_i \geq 0, while for positive definite forms, all \lambda_i > 0. Equivalently, A is positive semidefinite if and only if all its principal minors are nonnegative, and positive definite if and only if all leading principal minors are positive (Sylvester's criterion).[15][16] For positive definite matrices, the Cholesky decomposition provides a factorization A = L L^T, where L is a lower triangular matrix with positive diagonal entries, enabling efficient and numerically stable computations.[17] Positive semidefinite quadratic forms underpin convex optimization: in the positive definite case they induce norms \|\mathbf{x}\|_A = \sqrt{Q(\mathbf{x})}, providing distance metrics used in least squares and semidefinite programming formulations.[18][19]
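These criteria translate directly into numerical tests. The sketch below (illustrative; the tolerance is an arbitrary choice, and the helper is not a library API) checks definiteness via eigenvalues and verifies a Cholesky factorization for a positive definite example.

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    """Eigenvalue test for a symmetric matrix (a sketch)."""
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3: positive definite
assert is_positive_definite(A)

L = np.linalg.cholesky(A)                # A = L L^T, L lower triangular
assert np.allclose(L @ L.T, A)

S = np.array([[1.0, 1.0], [1.0, 1.0]])   # eigenvalues 0 and 2: semidefinite only
assert not is_positive_definite(S)
assert np.all(np.linalg.eigvalsh(S) >= -1e-12)
```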
Classification by signature
Over the real numbers, quadratic forms are classified up to congruence by their signature, which records the numbers of positive, negative, and zero eigenvalues in the diagonalized form. The signature of an n-dimensional real quadratic form is denoted (p, q, r), where p is the number of positive eigenvalues, q the number of negative eigenvalues, and r the nullity (the dimension of the kernel), satisfying p + q + r = n.[20][21] Two real quadratic forms are congruent if and only if they share the same signature (p, q, r). For non-degenerate forms, where r = 0 and the associated symmetric matrix has full rank n = p + q, the pair (p, q) alone determines the congruence class.[22][20] Indefinite quadratic forms, characterized by signatures with both p > 0 and q > 0, take both positive and negative values and include hyperbolic types such as the form x^2 - y^2 with signature (1, 1). For instance, a quadratic form with signature (2, 1) in three variables, such as x^2 + y^2 - z^2, corresponds geometrically to hyperboloids.[22][20]
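A small helper (illustrative; the zero tolerance is an assumption made for floating-point arithmetic) computes the triple (p, q, r) from the eigenvalues of the representing matrix.

```python
import numpy as np

def signature(A, tol=1e-9):
    """Return (p, q, r): counts of positive, negative, and zero eigenvalues."""
    ev = np.linalg.eigvalsh(A)
    p = int(np.sum(ev > tol))
    q = int(np.sum(ev < -tol))
    return p, q, len(ev) - p - q

print(signature(np.diag([1.0, 1.0, -1.0])))  # (2, 1, 0): x^2 + y^2 - z^2
print(signature(np.diag([1.0, -1.0])))       # (1, 1, 0): the hyperbolic form x^2 - y^2
```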
Equivalence and invariants
Sylvester's law of inertia
Sylvester's law of inertia asserts that two real symmetric matrices are congruent over the real numbers if and only if they have the same inertia, the triple (p, q, r), where p is the number of positive eigenvalues (counting multiplicities), q is the number of negative eigenvalues, and r = p + q is the rank of the matrix; for a fixed size, the inertia is thus determined by the pair (p, q).[23] This invariance holds because congruence transformations preserve the signature of the quadratic form, so the counts of positive, negative, and zero eigenvalues remain unchanged under change of basis.[24] The law was first enunciated by James Joseph Sylvester in 1852, in his paper demonstrating the reduction of homogeneous quadratic polynomials to sums of positive and negative squares via real orthogonal substitutions.[23] Sylvester presented it as a remark easily proved, building on earlier ideas from Gauss and unpublished work by Jacobi, though he provided a more detailed proof in subsequent publications around 1853.[24] A standard proof uses induction on the dimension n of the underlying real vector space, combined with completing the square to reduce the form step by step.[24] For the base case n = 1, the quadratic form ax^2 has inertia (1, 0, 1) if a > 0, (0, 1, 1) if a < 0, or (0, 0, 0) if a = 0, which is trivially invariant. Assuming the result holds for dimension n - 1, consider an n-dimensional nondegenerate quadratic form Q(\mathbf{x}) represented by an invertible symmetric matrix A. Select a vector \mathbf{v} \neq \mathbf{0} such that Q(\mathbf{v}) \neq 0; assume Q(\mathbf{v}) > 0 (the negative case is symmetric). Extend \{\mathbf{v}\} to a basis \{\mathbf{v}, \mathbf{e}_2, \dots, \mathbf{e}_n\} and change coordinates so that the form becomes Q(y_1, \dots, y_n) = y_1^2 + 2 \sum_{i=2}^n b_i y_1 y_i + Q'(y_2, \dots, y_n), where Q' is the induced quadratic form on the hyperplane spanned by \{\mathbf{e}_2, \dots, \mathbf{e}_n\}. Completing the square yields Q = (y_1 + \sum b_i y_i)^2 + Q''(y_2, \dots, y_n), where Q'' is obtained from Q' by congruence and involves one fewer variable. By induction, Q'' diagonalizes to a form with inertia (p - 1, q, r - 1), so the total inertia (p, q, r) matches that of the original. For degenerate cases, restrict to the nondegenerate subspace of dimension r and apply the nondegenerate result, with the kernel contributing zeros. This process preserves the signs and non-degeneracy, confirming the invariance.[25] As a corollary, real quadratic forms are classified up to congruence by their rank r and signature s = p - q, providing a complete set of invariants for the equivalence classes over the reals.[24]
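The invariance itself is easy to probe experimentally. The sketch below (illustrative; here the third component of the returned triple counts zero eigenvalues rather than the rank) applies a random invertible congruence and confirms that the counts do not change.

```python
import numpy as np

def inertia(M, tol=1e-8):
    ev = np.linalg.eigvalsh(M)
    return (int(np.sum(ev > tol)), int(np.sum(ev < -tol)),
            int(np.sum(np.abs(ev) <= tol)))

A = np.diag([1.0, 1.0, -1.0, 0.0])       # counts (2, 1, 1), rank 3

rng = np.random.default_rng(0)
P = rng.normal(size=(4, 4))              # invertible with probability 1
assert abs(np.linalg.det(P)) > 1e-6
assert inertia(P.T @ A @ P) == inertia(A)  # congruence preserves the counts
```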
Congruence and reduction
Two symmetric matrices A and B representing quadratic forms over the reals are said to be congruent if there exists an invertible matrix P such that B = P^T A P. This transformation corresponds to a change of basis in the underlying vector space, preserving the quadratic form up to equivalence. Congruence defines an equivalence relation on the space of symmetric matrices, partitioning them into classes that share the same intrinsic properties, such as rank and signature.[26] Over the reals, congruence allows reduction of any quadratic form to a canonical diagonal form, as guaranteed by the spectral theorem for symmetric matrices, which provides diagonalization via an orthogonal congruence (where P is orthogonal, so P^T = P^{-1}). Unlike the Jordan canonical form, which applies to similarity transformations (B = P^{-1} A P) of general linear operators and may involve non-diagonal Jordan blocks for defective eigenvalues, congruence transformations never produce Jordan blocks: they preserve the bilinear structure rather than the linear action, and real symmetric matrices are always diagonalizable under congruence.[27][28] Reduction algorithms exploit congruence to systematically eliminate cross terms and achieve the diagonal canonical form. Lagrange's method, introduced in 1759, proceeds iteratively by completing the square: for a quadratic form q(\mathbf{x}) = \sum_{i,j} a_{ij} x_i x_j with a_{11} \neq 0, collect the terms involving x_1 into a_{11} \left( x_1 + \sum_{j>1} b_j x_j \right)^2 plus a quadratic form in the remaining variables, substitute a new variable for the completed square, and recurse on the reduced form in n - 1 variables until the form is diagonal. This process terminates in finitely many steps, yielding a sum of squares (or differences of squares) without cross terms.[29] For binary quadratic forms q(x,y) = a x^2 + b x y + c y^2, Gauss's reduction algorithm provides an efficient path to a canonical representative by applying integer shifts and swaps via \mathrm{SL}(2,\mathbb{Z}) transformations, which are special congruences. Starting from the form, one repeatedly applies nearest-integer shifts to shrink |b| (using matrices like \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} for integer k) and swaps the variables whenever a > c, until |b| \leq a \leq c; the algorithm terminates because the leading coefficient strictly decreases at each swap, and for positive definite forms it converges to a unique reduced representative. Over the reals, the result can then be brought to diagonal form by a further orthogonal adjustment if needed.[30] A concrete example is reducing q(x,y) = x^2 + 2xy + y^2, whose matrix is \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}. The eigenvalues are 2 and 0, with orthonormal eigenvectors \frac{1}{\sqrt{2}}(1,1)^T and \frac{1}{\sqrt{2}}(1,-1)^T. The orthogonal matrix P with these columns yields P^T A P = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}, so q(x,y) = 2u^2 in the new variables u = \frac{x+y}{\sqrt{2}}, v = \frac{x-y}{\sqrt{2}} (with the zero term omitted), demonstrating rank-1 degeneracy via rotation.[29]
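For positive definite binary forms, Gauss's shift-and-swap loop takes only a few lines. The sketch below (illustrative; it assumes 4ac - b^2 > 0 and uses nearest-integer rounding) reduces a form to the canonical range |b| \leq a \leq c.

```python
def reduce_binary_form(a, b, c):
    """Gauss reduction sketch for a positive definite form a x^2 + b xy + c y^2."""
    while not (abs(b) <= a <= c):
        if a > c:
            # Swap step (x, y) -> (y, -x): (a, b, c) -> (c, -b, a)
            a, b, c = c, -b, a
        else:
            # Shift step x -> x + k y with k the nearest integer to -b / (2a)
            k = round(-b / (2 * a))
            a, b, c = a, b + 2 * a * k, a * k * k + b * k + c
    return a, b, c

# 5x^2 + 14xy + 10y^2 has discriminant -4 and reduces to x^2 + y^2:
print(reduce_binary_form(5, 14, 10))  # (1, 0, 1)
```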
Invariants over general fields
Over fields of characteristic not equal to 2, the theory of quadratic forms is governed by Witt's decomposition theorem, which decomposes any quadratic space (V, q) uniquely (up to isometry) as an orthogonal direct sum V \cong I \cdot H \oplus V_0, where H is the hyperbolic plane, I is the Witt index (the maximal number of hyperbolic planes), and V_0 is an anisotropic quadratic space (the anisotropic kernel).[6] This decomposition highlights key invariants: the dimension n = \dim V, the Witt index I, and the isometry class of the anisotropic kernel V_0. Unlike over the reals, where the signature provides a complete classification, these invariants do not always suffice globally, necessitating local invariants for broader fields. Witt's theorem provides foundational results on equivalence and structure in this setting. Specifically, the cancellation theorem states that if two quadratic spaces U_1 and U_2 satisfy U_1 \oplus W \cong U_2 \oplus W for some quadratic space W, then U_1 \cong U_2.[6] Additionally, Witt's extension theorem (often called Witt's lemma) asserts that any isometry between subspaces of two isometric quadratic spaces can be extended to an isometry of the full spaces under suitable nondegeneracy conditions.[6] Regarding isotropy, a quadratic space is isotropic if and only if it contains a hyperbolic plane as an orthogonal summand, so the Witt index measures the "size" of the isotropic part.[6] For classification over local fields (such as the p-adic numbers \mathbb{Q}_p or the reals), nondegenerate quadratic forms are determined up to isometry by three invariants: the dimension n, the discriminant d(q) \in K^\times / (K^\times)^2 (the determinant of the associated symmetric matrix modulo squares), and the Hasse invariant c(q) \in \{ \pm 1 \}, which encodes local Brauer class information via Hilbert symbols.[31] Over the rationals \mathbb{Q}, the Hasse-Minkowski theorem establishes a local-global principle: two quadratic forms are equivalent if and only if they are equivalent over \mathbb{R} (where equivalence is by signature, as discussed earlier) and over every \mathbb{Q}_p for primes p.[31] For example, the ternary form x^2 + y^2 - 2z^2 represents zero nontrivially over \mathbb{Q} (take (x, y, z) = (1, 1, 1)), consistent with its being isotropic over \mathbb{R} and over every \mathbb{Q}_p.[31] Over finite fields \mathbb{F}_q with q odd, nondegenerate quadratic forms are classified solely by dimension and discriminant: two such forms are isometric if and only if they have the same dimension and the same discriminant in \mathbb{F}_q^\times / (\mathbb{F}_q^\times)^2.[32] For instance, in three variables over \mathbb{F}_5, the forms x^2 + y^2 + z^2 and x^2 + y^2 + 4z^2 are equivalent, since their discriminants 1 and 4 lie in the same square class (both are squares modulo 5). Over the complex numbers \mathbb{C}, an algebraically closed field of characteristic zero, all nondegenerate quadratic forms of the same dimension are equivalent to the standard sum of squares \sum_{i=1}^n x_i^2, since every element is a square, so any diagonalized form can be rescaled to unit coefficients.[33]
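The finite-field criterion reduces to comparing discriminants modulo squares, which is a one-line computation. The sketch below (illustrative) confirms the \mathbb{F}_5 example by checking that the two discriminants lie in the same square class.

```python
p = 5
squares = {(x * x) % p for x in range(1, p)}  # nonzero squares mod 5: {1, 4}

d1 = (1 * 1 * 1) % p   # discriminant of diag(1, 1, 1) for x^2 + y^2 + z^2
d2 = (1 * 1 * 4) % p   # discriminant of diag(1, 1, 4) for x^2 + y^2 + 4z^2

# Same class in F_5^x / (F_5^x)^2 iff d1 / d2 is a nonzero square mod 5
ratio = (d1 * pow(d2, -1, p)) % p
print(ratio in squares)  # True: same dimension and discriminant, hence isometric
```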
Geometric and algebraic interpretations
Quadrics and conic sections
In Euclidean space \mathbb{R}^n, a quadratic form Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, where A is a symmetric matrix, defines quadric hypersurfaces as its level sets \{\mathbf{x} \in \mathbb{R}^n \mid Q(\mathbf{x}) = c\} for constants c \neq 0. These surfaces are preserved under affine transformations, which maintain the geometric type of the quadric while allowing changes in position, orientation, and scaling. Pure quadratic forms yield central quadrics (ellipsoids, hyperboloids, cones, cylinders), while general quadric hypersurfaces, which may include linear terms, encompass paraboloids as well; in both cases the signature classifies the quadratic part.[34][35] The classification of these quadrics depends on the signature of A, which counts the numbers of positive, negative, and zero eigenvalues and determines the surface type via Sylvester's law of inertia. For positive definite forms (all eigenvalues positive, signature (n,0)), the level sets for c > 0 are ellipsoids, bounded and compact surfaces like \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 in \mathbb{R}^3. Indefinite forms (mixed positive and negative eigenvalues, e.g., signature (2,1)) yield hyperboloids: one sheet for one sign of the constant, as in \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1, an unbounded surface with hyperbolic cross-sections; two sheets for the other sign, as in \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1. Degenerate cases (zero eigenvalues) produce cylindrical hypersurfaces, such as elliptic cylinders \frac{x^2}{a^2} + \frac{y^2}{b^2} = c (z arbitrary; signature (2,0,1)) or hyperbolic cylinders \frac{x^2}{a^2} - \frac{y^2}{b^2} = c (z arbitrary; signature (1,1,1)). Paraboloids, such as elliptic paraboloids \frac{x^2}{a^2} + \frac{y^2}{b^2} = z or hyperbolic paraboloids \frac{x^2}{a^2} - \frac{y^2}{b^2} = z (the latter a doubly ruled surface), arise from general quadric equations that include linear terms, with the quadratic part having the corresponding signature.[34][35] In projective geometry, quadrics generalize to homogeneous equations in real projective space \mathbb{RP}^n, where conic sections arise as one-dimensional quadrics in \mathbb{RP}^2. A conic is the zero set of a homogeneous quadratic form Q(x,y,z) = 0, such as x^2 + y^2 - z^2 = 0, which in the affine plane (setting z = 1) describes a circle but projectively unifies ellipses, parabolas, and hyperbolas under projective transformations. For instance, the quadratic form Q(x,y,z) = x^2 + y^2 - z^2 defines a hyperboloid of one sheet in \mathbb{R}^3 as the level set Q(\mathbf{x}) = 1, and in \mathbb{RP}^3 its projective completion includes points at infinity, revealing a smooth quadric surface ruled by two families of lines. Nondegenerate conics over \mathbb{R} are projectively equivalent whenever they contain real points, emphasizing the role of quadratic forms in capturing geometric invariants beyond Euclidean metrics.[36]
Inner product spaces
In finite-dimensional real vector spaces, a positive definite quadratic form Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, where A is a symmetric positive definite matrix, induces an inner product via the associated symmetric bilinear form B(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T A \mathbf{y}, such that Q(\mathbf{x}) = B(\mathbf{x}, \mathbf{x}). The inner product is then given by \langle \mathbf{x}, \mathbf{y} \rangle = B(\mathbf{x}, \mathbf{y}), which can be recovered from Q using the polarization identity \langle \mathbf{x}, \mathbf{y} \rangle = \frac{1}{2} \left[ Q(\mathbf{x} + \mathbf{y}) - Q(\mathbf{x}) - Q(\mathbf{y}) \right]. This construction ensures that \langle \mathbf{x}, \mathbf{x} \rangle = Q(\mathbf{x}) > 0 for \mathbf{x} \neq 0, satisfying the positive definiteness axiom of an inner product.[37] Conversely, every inner product on \mathbb{R}^n arises from such a quadratic form, with the matrix A representing the inner product relative to the standard basis.[38] With respect to an orthonormal basis adapted to this inner product, the quadratic form diagonalizes to Q(\mathbf{x}) = \sum_{i=1}^n x_i^2, where the x_i are the coordinates in that basis.[39] This diagonal representation highlights the geometric role of the quadratic form as the squared Euclidean norm in the induced metric, enabling the study of angles, orthogonality, and lengths directly through Q. The existence of such a basis follows from the spectral theorem for symmetric matrices, which guarantees an orthogonal diagonalization for A.[40] This framework extends naturally to infinite-dimensional settings through Hilbert spaces, where positive definite quadratic forms define sesquilinear forms on dense subspaces, often associated with self-adjoint operators via the representation theorem. In a Hilbert space \mathcal{H}, a closed positive quadratic form q(\phi) = \langle A \phi, \phi \rangle for a positive self-adjoint operator A generates the inner product structure, with the domain of q typically a dense subspace where the form is defined and continuous.[41] Such forms are closable and lower semi-bounded, ensuring the associated operator is well-defined, and they play a central role in functional analysis by linking quadratic functionals to spectral properties.[42] In physics, these quadratic forms often represent energy functionals in variational principles, such as the kinetic energy operator in quantum mechanics on a Hilbert space of wave functions, where q(\psi) = \int |\nabla \psi|^2 \, d\mathbf{x} corresponds to the Dirichlet form for the Laplacian.[43] In statistics, the covariance matrix of a random vector induces a quadratic form that defines an inner product on the space of centered random variables, with Q(X) = \operatorname{Cov}(X, X) measuring variance and facilitating principal component analysis through diagonalization.
Polarization identity
The polarization identity provides a means to recover the associated symmetric bilinear form B from a quadratic form Q on a real vector space V. For vectors x, y \in V, the identity states that B(x, y) = \frac{1}{4} \left[ Q(x + y) - Q(x - y) \right]. This formula holds on any real vector space, independent of dimension, and is equivalent to the summation form B(x, y) = \frac{1}{4} \sum_{\epsilon = \pm 1} \epsilon \, Q(\epsilon x + y), since Q is even (i.e., Q(-z) = Q(z) for all z \in V).[7][6] Geometrically, the polarization identity connects the quadratic form to the geometry of angles in inner product spaces, where Q(x) = \langle x, x \rangle induces the norm \|x\| = \sqrt{Q(x)}. It arises from applying the law of cosines to the diagonals of the parallelogram spanned by x and y: the terms Q(x + y) and Q(x - y) represent the squared lengths of these diagonals, allowing recovery of the inner product \langle x, y \rangle = \|x\| \|y\| \cos \theta, where \theta is the angle between x and y. This links the quadratic form's "energy" or "length" interpretation to directional correlations via bilinear pairings.[7] The identity extends to higher even-degree homogeneous polynomials via higher-order polarization formulas, which recover multilinear forms from the polynomial by averaging over sign changes or roots of unity in multiple variables. For instance, the complete polarization of a degree-2k form expresses a 2k-linear form in terms of differences of the original polynomial evaluated at scaled sums.[44] As an example, consider the binary quadratic form Q(x_1, x_2) = a x_1^2 + 2 h x_1 x_2 + b x_2^2 on \mathbb{R}^2. Applying the polarization identity yields the cross term B(e_1, e_2) = h, where e_1, e_2 are the standard basis vectors, thus isolating the bilinear coefficient h that captures the interaction between variables without direct matrix manipulation.[7]
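The closing example can be verified symbolically. The following SymPy sketch (illustrative) polarizes Q(x_1, x_2) = a x_1^2 + 2h x_1 x_2 + b x_2^2 and evaluates the result on the standard basis vectors to isolate the coefficient h.

```python
import sympy as sp

a, h, b, x1, y1, x2, y2 = sp.symbols('a h b x1 y1 x2 y2')
Q = lambda u, v: a*u**2 + 2*h*u*v + b*v**2   # Q of the vector (u, v)

# B(u, v) = (Q(u + v) - Q(u - v)) / 4
B = sp.expand((Q(x1 + x2, y1 + y2) - Q(x1 - x2, y1 - y2)) / 4)

# Evaluate on e1 = (1, 0) and e2 = (0, 1): the cross-term coefficient h
print(B.subs({x1: 1, y1: 0, x2: 0, y2: 1}))  # h
```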
Integral and arithmetic quadratic forms
Definitions over integers
An integral quadratic form is a quadratic form Q: \mathbb{Z}^n \to \mathbb{Z} whose associated symmetric matrix A = (a_{ij}) has integer entries, so that Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x} = \sum_{i,j=1}^n a_{ij} x_i x_j for \mathbf{x} \in \mathbb{Z}^n, with a_{ij} = a_{ji} \in \mathbb{Z}.[45] Such forms arise naturally in number theory, where they encode arithmetic properties of integer lattices.[46] A primitive integral quadratic form is one for which the greatest common divisor of all its coefficients a_{ij} is 1.[47] This condition ensures that the form is not an integer multiple (greater than 1) of another integral form, preserving certain representation properties in the theory of quadratic fields.[46] For binary integral quadratic forms, the discriminant can be defined as \det(2A), where A is the associated 2 \times 2 symmetric matrix.[48] For the standard binary form Q(x,y) = a x^2 + b x y + c y^2, this yields \det(2A) = 4ac - b^2, an integer invariant (the negative of the convention b^2 - 4ac used below) that classifies forms up to equivalence and relates to the discriminant of quadratic number fields.[46] Two integral quadratic forms Q and Q' with matrices A and A' are said to be \mathrm{SL}(n,\mathbb{Z})-equivalent if there exists a matrix U \in \mathrm{SL}(n,\mathbb{Z}) such that A' = U^T A U, corresponding to an integer change of variables with determinant 1 that preserves the lattice \mathbb{Z}^n.[49] This equivalence relation groups forms into classes that share the same arithmetic invariants, such as the discriminant.[46]
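A quick computation (illustrative; the transformation formulas follow by substituting x \mapsto px + qy, y \mapsto rx + sy) shows the discriminant surviving an \mathrm{SL}(2,\mathbb{Z}) change of variables.

```python
def transform(a, b, c, U):
    """Coefficients of a x^2 + b xy + c y^2 after (x, y) -> (px + qy, rx + sy)."""
    (p, q), (r, s) = U
    a2 = a * p * p + b * p * r + c * r * r
    c2 = a * q * q + b * q * s + c * s * s
    b2 = 2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s
    return a2, b2, c2

U = [[2, 1], [1, 1]]                     # determinant 1, so U is in SL(2, Z)
a, b, c = 5, 14, 10
a2, b2, c2 = transform(a, b, c, U)
assert 4 * a * c - b * b == 4 * a2 * c2 - b2 * b2   # det(2A) is invariant
print((a2, b2, c2))                      # (58, 82, 29), same discriminant 4
```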
Representation of numbers
In number theory, the representation problem for integral quadratic forms concerns determining which integers can be expressed by a given form. Specifically, for an integral quadratic form Q: \mathbb{Z}^n \to \mathbb{Z}, an integer m is said to be represented by Q if there exists a vector x \in \mathbb{Z}^n such that Q(x) = m. This problem is central to understanding the arithmetic properties of quadratic forms and has applications in Diophantine equations.[50] A prominent example is the representation of integers as sums of squares, corresponding to the quadratic form Q(x_1, \dots, x_k) = x_1^2 + \dots + x_k^2. Lagrange's four-square theorem asserts that every natural number can be expressed as the sum of at most four integer squares, meaning g(2) = 4, where g(k) denotes the smallest number such that every natural number is a sum of at most g(k) k-th powers. This result implies that the quaternary form in four variables represents all positive integers, though some require exactly four squares, namely the numbers of the form 4^n(8k + 7). The theorem provides a complete characterization for k = 2, contrasting with higher powers, where more terms may be needed.[51] For binary quadratic forms, which are forms Q(x, y) = ax^2 + bxy + cy^2 with integer coefficients and discriminant d = b^2 - 4ac, the representation problem is closely tied to equivalence classes under the action of \mathrm{SL}(2, \mathbb{Z}). Two such forms are equivalent if one can be obtained from the other via an integer change of variables with determinant 1, and equivalent forms represent precisely the same set of integers. The number of equivalence classes of forms with a fixed discriminant d is the class number h(d), which measures the complexity of the representation problem for that discriminant; h(d) is finite, and forms in different classes may represent disjoint sets of primes or integers.[52][50] The Hasse-Minkowski theorem provides a local-global principle that resolves many representation questions over the rationals: a quadratic form Q over \mathbb{Q} represents a nonzero rational number m if and only if the form Q - m t^2 is isotropic over \mathbb{R} and over every \mathbb{Q}_p for primes p. For integral representations over \mathbb{Z}, the situation is more delicate, as rational representability does not guarantee integer solutions; additional tools from the arithmetic theory of quadratic forms, such as the genus and spinor genus, are required to determine exact integral representability.[53][54]
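Small cases of the four-square theorem can be confirmed by exhaustive search. The sketch below (illustrative and deliberately brute-force) finds all ordered representations of a number and checks that every m up to 49 is represented.

```python
from itertools import product

def four_square_reps(m):
    """All (x, y, z, w) with x <= y <= z <= w and x^2 + y^2 + z^2 + w^2 = m."""
    bound = int(m ** 0.5) + 1
    return [(x, y, z, w)
            for x, y, z, w in product(range(bound), repeat=4)
            if x <= y <= z <= w and x*x + y*y + z*z + w*w == m]

print(four_square_reps(7))    # [(1, 1, 1, 2)]: 7 = 8*0 + 7 needs four squares
assert all(four_square_reps(m) for m in range(1, 50))  # Lagrange, empirically
```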
Universal and positive definite forms
A universal integral quadratic form is a positive definite quadratic form with integer coefficients that represents every positive integer, meaning for every positive integer m, there exist integers x_1, \dots, x_n such that Q(x_1, \dots, x_n) = m.[55] The sum of four squares, Q(x,y,z,w) = x^2 + y^2 + z^2 + w^2, is a classic example of a universal quaternary form, as established by Lagrange's four-square theorem.[55] No positive definite ternary integral quadratic form is universal, due to local obstructions at certain primes. For instance, the form Q(x,y,z) = x^2 + y^2 + 7z^2 is positive definite but fails to represent 3: a sum of two squares cannot be congruent to 3 modulo 4, and any term 7z^2 with z \neq 0 already exceeds 3.[56] The Conway-Schneeberger fifteen theorem, proved by Bhargava, states that a positive definite integral quadratic form is universal if and only if it represents all positive integers from 1 to 15.[57] Bhargava and Hanke extended this with the 290-theorem, showing that universality is equivalent to representing a specific set of 29 positive integers up to 290, providing a finite computational criterion.[55] Bhargava enumerated all 204 primitively universal positive definite quaternary integral quadratic forms up to \mathrm{SL}_4(\mathbb{Z})-equivalence, where primitive universality requires representations with \gcd(x_1,x_2,x_3,x_4)=1.[56] For five, six, and seven variables, Bhargava and Hanke classified all universal forms using "escalator" constructions that build higher-dimensional forms from lower ones, yielding finite lists: 509 for five variables, 11,804 for six, and 257,632 for seven.[55] In eight or more variables, universal forms can be systematically enumerated via escalators from lower-dimensional universal forms; the counts grow rapidly, but for each fixed dimension the number of such forms over \mathbb{Z} remains finite.[55] A positive definite integral quadratic form satisfies Q(\mathbf{x}) > 0 for all nonzero \mathbf{x} \in \mathbb{Z}^n. These forms are classified up to integral equivalence using genus theory, where the genus consists of all forms locally equivalent over \mathbb{Z}_p for every prime p (and over \mathbb{R}), and the class number is the number of global equivalence classes within a genus, which is finite for positive definite forms by the Minkowski-Siegel mass formula.[58] Genus theory distinguishes equivalence classes via local invariants, such as the Jordan decomposition or Hasse invariants, enabling the study of representation properties within genera.[59]
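The fifteen-theorem criterion is directly checkable for diagonal forms. The sketch below (illustrative; it handles only diagonal forms, whereas the theorem covers all positive definite integral forms) tests the candidate values 1 through 15 by brute force.

```python
from itertools import product

def represents(diag, m):
    """Brute force: does sum(d_i * x_i^2) = m have a nonnegative integer solution?"""
    ranges = [range(int((m // d) ** 0.5) + 1) for d in diag]
    return any(sum(d * x * x for d, x in zip(diag, xs)) == m
               for xs in product(*ranges))

def universal_by_15(diag):
    """Conway-Schneeberger criterion: representing 1..15 implies universality."""
    return all(represents(diag, m) for m in range(1, 16))

print(universal_by_15([1, 1, 1, 1]))   # True: the sum of four squares
print(universal_by_15([1, 2, 5, 5]))   # False: x^2 + 2y^2 + 5z^2 + 5w^2 misses 15
```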
Generalizations and extensions
Over arbitrary fields and rings
Quadratic forms generalize naturally to modules over commutative rings with identity. For a commutative ring R, a quadratic module is a pair (M, q), where M is an R-module (typically free or projective of finite rank) and q: M \to R is a quadratic form satisfying q(\lambda x) = \lambda^2 q(x) for \lambda \in R and x \in M, with the associated polar bilinear form b(x, y) = q(x + y) - q(x) - q(y) being R-bilinear and symmetric. This setup extends the field case by allowing non-invertible scalars, which introduces complications such as the failure of cancellation; for instance, over polynomial rings R = k[x_1, \dots, x_n] over a field k, quadratic forms may fail to cancel even when they are stably equivalent. Comprehensive treatments of such modules appear in foundational works on the subject.[60] Equivalence of quadratic modules over rings is defined via isometry: (M, q) \cong (M', q') if there exists an R-module isomorphism \phi: M \to M' such that q'(\phi(x)) = q(x) for all x \in M. For classification, stable isometry is used, where two modules are stably isometric if they become isometric after adding copies of the hyperbolic plane H = (R \oplus R, q((x,y)) = xy). The Witt group W(R) is the abelian group formed by isometry classes of non-degenerate quadratic modules modulo the subgroup generated by hyperbolic modules, with direct sum as the operation; this captures stable equivalence and has been computed for various rings, including local and regular ones. Over semi-local rings, forms are classified by rank, discriminant, and Hasse-Witt invariants, generalizing the field invariants.[61][60] Examples illustrate these concepts over specific rings. Over the Gaussian integers R = \mathbb{Z}[i], quadratic forms on free modules of rank 4, such as q(x, y, z, w) = x^2 + i y^2 + z^2 + i w^2, are universal, representing every element of \mathbb{Z}[i] non-trivially, which aids in studying representation problems in quadratic number fields. Over rings such as \mathbb{Z}_p with p prime, quadratic modules often admit cancellation under certain stability conditions, as shown in early results on projective modules.[62][63] p-adic quadratic forms arise over the ring of p-adic integers \mathbb{Z}_p, the completion of \mathbb{Z} at the prime p, where modules are free \mathbb{Z}_p-modules equipped with integer-valued quadratic forms. Classification relies on local invariants: for odd p, non-degenerate forms are determined by dimension, determinant in \mathbb{Z}_p^\times / (\mathbb{Z}_p^\times)^2, and Hasse invariant; at p = 2, additional Jordan invariants account for the dyadic structure. The Witt group W(\mathbb{Z}_p) is finite and supports the local-global principle for forms over number rings via completions. These local structures supplement the global theory by providing invariants for arithmetic equivalence.[60][64]
Hermitian and sesquilinear forms
In the context of complex vector spaces, a sesquilinear form on a vector space V over \mathbb{C} is a map B: V \times V \to \mathbb{C} that is linear in the first argument and conjugate-linear (antilinear) in the second argument, meaning B(au + bv, w) = a B(u, w) + b B(v, w) and B(u, av + bw) = \bar{a} B(u, v) + \bar{b} B(u, w) for all scalars a, b \in \mathbb{C} and vectors u, v, w \in V.[65] Such forms generalize bilinear forms to account for complex conjugation, enabling the study of structures like inner products in Hilbert spaces.[65] A sesquilinear form B is Hermitian if it satisfies B(v, u) = \overline{B(u, v)} for all u, v \in V, which ensures that the associated quadratic form Q(x) = B(x, x) takes real values.[65] This condition mirrors the symmetry of real bilinear forms but incorporates conjugation to preserve reality on the diagonal.[66] The quadratic form Q derived from a Hermitian sesquilinear form B satisfies Q(ax) = |a|^2 Q(x) for a \in \mathbb{C}, reflecting the sesquilinear nature.[65] Hermitian quadratic forms play a central role in defining inner products on \mathbb{C}^n, where a form is positive definite if Q(x) > 0 for all nonzero x \in \mathbb{C}^n, analogous to the real case but with unitary transformations preserving the structure.[65] For instance, the standard positive definite Hermitian form on \mathbb{C}^2 is given by Q(z) = |z_1|^2 + |z_2|^2, where z = (z_1, z_2), corresponding to the sesquilinear form B(z, w) = z_1 \overline{w_1} + z_2 \overline{w_2}.[65] Classification of Hermitian forms proceeds via unitary diagonalization: any Hermitian matrix representing such a form is unitarily congruent to a diagonal matrix with real entries, so the form is determined up to unitary equivalence by its signature (the numbers of positive, negative, and zero eigenvalues).[67] This parallels the spectral theorem for real symmetric forms, with unitary matrices playing the role of orthogonal ones; in both cases the eigenvalues are real.[68]
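A numerical sketch (illustrative, using NumPy) confirms that a Hermitian matrix yields real values on the diagonal and diagonalizes unitarily with real eigenvalues.

```python
import numpy as np

A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(A, A.conj().T)        # Hermitian: A equals its conjugate transpose

z = np.array([1 + 2j, -1j])
Qz = np.vdot(z, A @ z)                   # z^* A z
assert abs(Qz.imag) < 1e-12              # real-valued on the diagonal (here 7.0)

eigvals, U = np.linalg.eigh(A)           # unitary U, real eigenvalues
assert np.allclose(U @ np.diag(eigvals) @ U.conj().T, A)
print(eigvals)                           # both positive: signature (2, 0)
```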
Multivariable and higher-degree analogs
Quadratic forms in n variables, often termed n-ary quadratic forms, generalize the binary and ternary cases by associating a symmetric bilinear form to an n \times n matrix over a ring such as \mathbb{Z} or \mathbb{Q}, where the form is q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x} for \mathbf{x} \in R^n.[69] Classification of these forms up to equivalence under the action of \mathrm{GL}_n(R) relies on invariants like the discriminant, the determinant of A up to scaling by units in R, but becomes increasingly intricate as n grows, owing to the rapid growth of the space of forms and the need to account for local-global principles across primes.[69] For n \geq 5, every positive definite n-ary quadratic form over \mathbb{Z} represents all sufficiently large positive integers that it represents locally over every \mathbb{Z}_p, with the finitely many exceptions depending on the form.[69] Scaling factors in the coefficients affect local densities of representations, such as \delta_p(q) at a prime p, and can render forms anisotropic in low dimensions, but for large n this near-universality mitigates such issues.[69] Higher-degree analogs extend quadratic forms to homogeneous polynomials of degree d \geq 4, such as quartic forms q(\mathbf{x}) = \sum_{i_1 + \cdots + i_n = 4} a_{i_1 \cdots i_n} x_1^{i_1} \cdots x_n^{i_n}, which define hypersurfaces of higher degree and capture more complex algebraic structures.[70] These forms lack the direct link to inner products that quadratics enjoy, but they can be associated with symmetric multilinear forms via polarization, a process that recovers the original polynomial by evaluating the multilinear form on equal arguments, generalizing the quadratic polarization identity to higher degrees.[70] For instance, the polarization of a degree-d homogeneous polynomial yields a symmetric d-linear form, enabling the study of tensor decompositions and apolarity in commutative algebra.[70] Such higher-degree forms connect to additive problems like Waring's problem, which asks for the minimal g(k) such that every natural number is a sum of at most g(k) k-th powers, extending the quadratic case of sums of squares to representations by degree-k forms in multiple variables.[71] In the polynomial setting, analogous questions decompose homogeneous forms as sums of d-th powers of linear forms, with bounds on the number of terms depending on the number of variables and the degree, as explored in extensions over rings like \mathbb{Z}[x_1, \dots, x_n].[71] Computationally, analyzing ideals generated by higher-degree homogeneous forms often employs Gröbner bases, which provide a canonical set of generators facilitating membership tests and variety computations despite the combinatorial explosion in degrees and variables.[72] For bihomogeneous ideals arising from such forms, specialized algorithms exploit the multigrading to reduce complexity, outperforming standard methods on systems whose degrees exceed the quadratic case, though worst-case runtime remains doubly exponential.[72]
Historical development
Origins in geometry and number theory
The study of quadratic forms traces its roots to ancient mathematical inquiries into geometric figures and arithmetic representations, particularly through Diophantine equations involving sums of squares. In the 3rd century AD, Diophantus of Alexandria explored problems in his Arithmetica that amounted to expressing numbers as sums of squares, such as finding three squares summing to a given square, laying early groundwork for quadratic Diophantine analysis.[73][74] These efforts focused on integer solutions to quadratic equations, bridging geometry and number theory without modern algebraic notation.[74] Independently, in 7th-century India, Brahmagupta advanced the arithmetic side through his Brahmasphutasiddhanta (628 AD), where he solved general quadratic equations and provided methods for the Pell equation x^2 - ny^2 = 1, a binary quadratic form central to representing integers via quadratic expressions.[75][76] His compositional identity for sums of squares further connected quadratic forms to multiplicative structures in number theory, influencing later developments in Diophantine approximations.[76] By the 17th century, European mathematicians built on these foundations with a focus on prime representations. Pierre de Fermat, in a 1640 letter to Marin Mersenne, asserted that every prime of the form 4k + 1 can be uniquely expressed as the sum of two squares, a claim rooted in arithmetic investigations and later proven by Euler.[77][78] This theorem highlighted the role of quadratic forms in classifying primes, motivating systematic studies of integer solutions to equations like x^2 + y^2 = p.[79] Parallel geometric motivations emerged from René Descartes' La Géométrie (1637), which introduced analytic geometry by associating curves with algebraic equations. Conic sections, such as ellipses and hyperbolas, were shown to satisfy quadratic equations in coordinates, transforming classical Euclidean geometry into an algebraic framework where quadratic forms defined loci and intersections.[80][81] This synthesis elevated quadratic expressions from isolated problems to tools for broader geometric analysis.[80] The transition to more algebraic treatments occurred in the 18th century with Leonhard Euler's investigations into binary quadratic forms, particularly their reduction and composition for representing numbers. Euler's work, including analyses of forms like ax^2 + bxy + cy^2, extended Fermat's ideas by exploring equivalence classes and applications to Diophantine equations, paving the way for 19th-century systematization.[82][83] These efforts underscored the dual utility of quadratic forms in geometry and arithmetic, influencing fields from continued fractions to modular arithmetic.[82]
Key contributions and theorems
In the early 19th century, Carl Friedrich Gauss made foundational contributions to the theory of quadratic forms in his seminal Disquisitiones Arithmeticae (1801), where he introduced the composition of binary quadratic forms, enabling the systematic study of equivalence classes under \mathrm{SL}(2,\mathbb{Z}) transformations.[84] This composition law gave the set of forms of a given discriminant a group structure, laying the groundwork for later developments in the arithmetic of quadratic fields. Gauss's approach unified disparate results on the representation of integers by forms and influenced subsequent classifications.[85] Mid-century advancements included the inertia theorem, anticipated by Augustin-Louis Cauchy in the 1820s and formalized by James Joseph Sylvester in 1852, which states that for real quadratic forms the numbers of positive, negative, and zero eigenvalues (the signature) are invariant under congruence transformations.[23] Sylvester's proof, published in the Philosophical Magazine, emphasized this "law of inertia" for quadratic forms, connecting linear algebra to geometric interpretations and proving essential for classifying indefinite forms over the reals.[86] In the 20th century, Hermann Minkowski's geometry of numbers, detailed in his 1910 monograph Geometrie der Zahlen, applied convex body theorems to bound representations by positive definite quadratic forms, yielding results such as the finiteness of class numbers for imaginary quadratic fields.[87] Building on this, Helmut Hasse established the local-global principle for quadratic forms over the rationals in his 1923 papers, asserting that a form represents zero nontrivially over \mathbb{Q} if and only if it does so over \mathbb{R} and over all \mathbb{Q}_p, resolving equivalence questions via local invariants. Ernst Witt's 1937 cancellation theorem further advanced the algebraic theory, showing that if two quadratic spaces over a field become isometric after adjoining hyperbolic planes, their anisotropic parts are isometric, enabling decomposition into hyperbolic and anisotropic components.[61] Post-1970s computational advances, such as the algorithms of H.C. Williams and H. Zassenhaus for enumerating class groups of real quadratic fields up to large discriminants, facilitated empirical studies of class number problems and informed conjectures on the distribution of class numbers. These methods, leveraging continued fractions and regulator computations, extended Gauss's manual calculations to millions of discriminants by the 1980s. The theory's impact extends to algebraic geometry through the study of quadratic hypersurfaces and Witt rings in motivic cohomology, and to physics via quadratic forms modeling kinetic energy and spacetime metrics in special relativity.[88][89]