
Definite quadratic form

In mathematics, a definite quadratic form is a quadratic form over a real vector space that takes either strictly positive values for all non-zero vectors (positive definite) or strictly negative values for all non-zero vectors (negative definite). A quadratic form itself is a homogeneous polynomial of degree two, expressible as q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, where \mathbf{x} is a vector in \mathbb{R}^n and A is an n \times n symmetric matrix. The associated bilinear form is B(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T A \mathbf{y}, and definiteness requires the quadratic form to vanish only at the zero vector.

The classification of quadratic forms as definite, indefinite, or semidefinite relies on the eigenvalues of A: a form is positive definite if all eigenvalues are positive, negative definite if all are negative, and indefinite if there are both positive and negative eigenvalues. This spectral characterization follows from the spectral theorem for symmetric matrices, which diagonalizes A over the reals, transforming the quadratic form into a weighted sum of squares with signs determined by the eigenvalues. Positive definite forms correspond geometrically to ellipsoids in \mathbb{R}^n, while indefinite forms yield hyperboloids, influencing applications in conic sections and optimization landscapes.

Definite quadratic forms have profound applications across mathematics and related fields. In optimization, they underpin convex functions and the analysis of critical points via the Hessian matrix, ensuring local minima for positive definite cases. In statistics, they model covariance structures and appear in multivariate normal distributions, where positive definiteness guarantees well-defined probability densities. Number theory employs them to study representations of integers, such as sums of squares, with positive definite binary forms central to class number problems. They also arise in physics for energy functionals and in machine learning for kernel methods like support vector machines.
Historically, quadratic forms trace back to ancient problems like those solved by the Babylonians around 1800 BCE, but the modern theory began with Carl Friedrich Gauss's Disquisitiones Arithmeticae (1801), which provided a complete treatment of binary quadratic forms over the integers. Earlier contributions include Brahmagupta's work on x^2 - n y^2 = c in 628 CE, linking to Pell equations. Ernst Witt's 1930s advancements extended the theory to arbitrary fields, influencing the modern algebraic theory of quadratic forms. Today, definite forms remain foundational in linear algebra, with ongoing research in their computational aspects and generalizations to indefinite metrics in pseudo-Riemannian geometry.

Fundamentals

Definition

A quadratic form on the real vector space \mathbb{R}^n is a homogeneous polynomial of degree two in the variables x_1, \dots, x_n, expressed as Q(\mathbf{x}) = \sum_{i,j=1}^n a_{ij} x_i x_j, where \mathbf{x} = (x_1, \dots, x_n)^T \in \mathbb{R}^n and the coefficients a_{ij} satisfy a_{ij} = a_{ji} for all i, j, making the associated matrix A = (a_{ij}) symmetric. This representation assumes familiarity with basic notions from linear algebra, such as vector spaces over \mathbb{R}, but the quadratic form itself is defined directly through this polynomial expression without requiring inner products a priori. A quadratic form Q is said to be positive definite if Q(\mathbf{x}) > 0 for all nonzero \mathbf{x} \in \mathbb{R}^n, and negative definite if Q(\mathbf{x}) < 0 for all nonzero \mathbf{x} \in \mathbb{R}^n. A quadratic form that is neither positive definite nor negative definite is either semidefinite (positive semidefinite if Q(\mathbf{x}) \geq 0 for all \mathbf{x}, with equality for some nonzero \mathbf{x}, or negative semidefinite if Q(\mathbf{x}) \leq 0 for all \mathbf{x}, with equality for some nonzero \mathbf{x}) or indefinite (taking both positive and negative values on some nonzero vectors).

Associated symmetric bilinear form

Every quadratic form Q on a real vector space arises from a unique symmetric bilinear form B. Specifically, the associated bilinear form is given by the polarization identity: B(\mathbf{x}, \mathbf{y}) = \frac{1}{4} \left[ Q(\mathbf{x} + \mathbf{y}) - Q(\mathbf{x} - \mathbf{y}) \right], which ensures B is bilinear and symmetric, satisfying B(\mathbf{x}, \mathbf{y}) = B(\mathbf{y}, \mathbf{x}). The polarization identity derives from the binomial expansion of the quadratic form. Assuming Q(\mathbf{z}) = B(\mathbf{z}, \mathbf{z}) for a symmetric bilinear B, expand Q(\mathbf{x} + \mathbf{y}) = B(\mathbf{x}, \mathbf{x}) + 2B(\mathbf{x}, \mathbf{y}) + B(\mathbf{y}, \mathbf{y}) and Q(\mathbf{x} - \mathbf{y}) = B(\mathbf{x}, \mathbf{x}) - 2B(\mathbf{x}, \mathbf{y}) + B(\mathbf{y}, \mathbf{y}). Subtracting these equations yields Q(\mathbf{x} + \mathbf{y}) - Q(\mathbf{x} - \mathbf{y}) = 4B(\mathbf{x}, \mathbf{y}), and dividing by 4 recovers B. This holds over fields of characteristic not equal to 2, such as the reals. Over the real numbers, the quadratic form Q uniquely determines the symmetric bilinear form B via the polarization identity, and conversely, any symmetric bilinear form B defines a quadratic form by Q(\mathbf{x}) = B(\mathbf{x}, \mathbf{x}); the two constructions are mutually inverse, giving a bijection between quadratic forms and symmetric bilinear forms. For a definite quadratic form Q, the associated B is non-degenerate, meaning that if B(\mathbf{x}, \mathbf{y}) = 0 for all \mathbf{y}, then \mathbf{x} = \mathbf{0}. This follows because Q(\mathbf{x}) = B(\mathbf{x}, \mathbf{x}) > 0 (or < 0) for \mathbf{x} \neq \mathbf{0}, implying no non-trivial kernel.
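The polarization identity above is easy to verify numerically. The following is a minimal NumPy sketch (the helper names Q and B are illustrative, not from the text): it recovers \mathbf{x}^T A \mathbf{y} from evaluations of the quadratic form alone.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A = (A + A.T) / 2  # symmetrize so Q comes from a symmetric bilinear form

def Q(x):
    """Quadratic form Q(x) = x^T A x."""
    return x @ A @ x

def B(x, y):
    """Bilinear form recovered via the polarization identity."""
    return (Q(x + y) - Q(x - y)) / 4

x = rng.standard_normal(4)
y = rng.standard_normal(4)

# Polarization recovers x^T A y, and the recovered form is symmetric.
assert np.isclose(B(x, y), x @ A @ y)
assert np.isclose(B(x, y), B(y, x))
```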

Properties

Positive and negative definiteness

A quadratic form Q: V \to \mathbb{R} on a finite-dimensional real vector space V is positive definite if Q(\mathbf{x}) > 0 for all nonzero \mathbf{x} \in V, or equivalently, Q(\mathbf{x}) \geq 0 for all \mathbf{x} \in V with equality if and only if \mathbf{x} = \mathbf{0}. This strict positivity ensures that Q induces a norm on V defined by \|\mathbf{x}\| = \sqrt{Q(\mathbf{x})}, which satisfies the axioms of positive definiteness, homogeneity, and the triangle inequality. The associated symmetric bilinear form B(\mathbf{x}, \mathbf{y}) = \frac{1}{2} [Q(\mathbf{x} + \mathbf{y}) - Q(\mathbf{x}) - Q(\mathbf{y})] then defines an inner product on V, endowing the space with a Euclidean structure. A quadratic form Q is negative definite if Q(\mathbf{x}) < 0 for all nonzero \mathbf{x} \in V, which is equivalent to -Q being positive definite. For positive definiteness, a brief characterization is that all leading principal minors of the matrix representation are positive (Sylvester's criterion). Geometrically, the level sets \{\mathbf{x} \in V \mid Q(\mathbf{x}) = c \} for c > 0 under a positive definite form are compact ellipsoids centered at the origin, reflecting the bounded, convex nature of sublevel sets. In contrast, indefinite quadratic forms—those taking both positive and negative values—yield non-compact level sets such as hyperboloids, which extend to infinity. Definite quadratic forms, whether positive or negative, are non-degenerate, meaning the radical of the associated bilinear form is trivial (only the zero vector is orthogonal to the entire space). Consequently, in matrix terms, the symmetric matrix representing the form is invertible, ensuring the linear operator it defines is bijective on V. This invertibility underpins the form's role in defining unique solutions to associated equations and in applications requiring strict convexity or coercivity.

Criteria for definiteness

Sylvester's law of inertia states that for a real symmetric matrix associated with a quadratic form, the number of positive, negative, and zero eigenvalues remains invariant under congruence transformations, meaning that any two congruent matrices have the same inertia, defined as the triple (number of positive eigenvalues, number of negative eigenvalues, number of zero eigenvalues). This law provides a foundational algebraic criterion for classifying quadratic forms as positive definite (all eigenvalues positive, inertia (n,0,0)), negative definite (all eigenvalues negative, inertia (0,n,0)), or indefinite (mixed signs). A practical determinant-based test for definiteness is given by Sylvester's criterion, which applies to the symmetric matrix A of the quadratic form. For positive definiteness, all leading principal minors of A must be positive; that is, the determinant of the top-left k \times k submatrix is positive for each k = 1, \dots, n. For negative definiteness, the leading principal minors alternate in sign, starting with negative for the 1 \times 1 minor: \det(A_1) < 0, \det(A_2) > 0, \det(A_3) < 0, and so on, up to \det(A_n) > 0 if n is even or \det(A_n) < 0 if n is odd. These conditions are necessary and sufficient, allowing verification without computing eigenvalues directly. Another analytic criterion for positive definiteness is the existence of a Cholesky decomposition, where the symmetric positive definite matrix A factors as A = LL^T with L a lower triangular matrix having positive diagonal entries. This decomposition confirms definiteness computationally, as the algorithm succeeds if and only if A is positive definite, and it applies directly to the coefficient matrix A. Such tests, including Sylvester's criterion and Cholesky factorization, are essential for determining whether a quadratic form induces a norm on the vector space, linking to properties like completeness in optimization.
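Both criteria can be sketched in a few lines of NumPy (the function names are illustrative). The first checks the leading principal minors per Sylvester's criterion; the second exploits the fact that `numpy.linalg.cholesky` raises an error exactly when the symmetric input is not positive definite.

```python
import numpy as np

def is_positive_definite_sylvester(A):
    """Sylvester's criterion: all leading principal minors are positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

def is_positive_definite_cholesky(A):
    """Cholesky factorization succeeds iff the symmetric matrix A is positive definite."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 3 and 1: positive definite
B = np.array([[1.0, 0.0], [0.0, -1.0]])  # eigenvalues 1 and -1: indefinite

assert is_positive_definite_sylvester(A) and is_positive_definite_cholesky(A)
assert not is_positive_definite_sylvester(B) and not is_positive_definite_cholesky(B)
```

In practice the Cholesky test is preferred for large matrices: it costs one factorization, whereas the minor-based test computes n determinants.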

Matrix representation

Symmetric matrix form

A quadratic form Q: \mathbb{R}^n \to \mathbb{R} on an n-dimensional real vector space can be expressed in matrix notation as Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, where A is an n \times n real symmetric matrix, meaning A = A^T. This representation arises from the associated symmetric bilinear form B(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T A \mathbf{y}, with Q(\mathbf{x}) = B(\mathbf{x}, \mathbf{x}). The entries of the symmetric matrix A = (a_{ij}) directly relate to the coefficients of the quadratic form when expanded as a homogeneous polynomial Q(\mathbf{x}) = \sum_{i=1}^n a_{ii} x_i^2 + \sum_{1 \leq i < j \leq n} 2 a_{ij} x_i x_j. Specifically, the diagonal entries a_{ii} are the coefficients of the squared terms x_i^2, while the off-diagonal entries satisfy a_{ij} = a_{ji} = \frac{1}{2} times the coefficient of the cross term x_i x_j for i \neq j. This convention ensures the matrix form captures the full polynomial without redundancy, given the symmetry a_{ij} = a_{ji}. Under a change of coordinates \mathbf{x} = P \mathbf{y}, where P is an invertible n \times n matrix, the quadratic form transforms to Q(\mathbf{y}) = \mathbf{y}^T (P^T A P) \mathbf{y}, preserving the quadratic nature but yielding a new symmetric matrix A' = P^T A P. This operation, known as matrix congruence, leaves the intrinsic properties of the quadratic form invariant under linear substitutions. For a given quadratic form over the reals, the associated symmetric matrix A is unique. If another symmetric matrix B also satisfies Q(\mathbf{x}) = \mathbf{x}^T B \mathbf{x} for all \mathbf{x}, then A = B. This uniqueness follows from the one-to-one correspondence between quadratic forms and symmetric matrices in a fixed basis.
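The coefficient conventions and the congruence transformation can be verified numerically. This sketch (variable names are illustrative) encodes Q(x, y) = 2x^2 + 2xy + 2y^2, whose matrix has diagonal entries 2 and off-diagonal entries 1 (half the cross-term coefficient 2), and checks invariance under a change of coordinates.

```python
import numpy as np

# Q(x, y) = 2 x^2 + 2 x y + 2 y^2: diagonal entries are the squared-term
# coefficients; each off-diagonal entry is half the cross-term coefficient.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

x = np.array([3.0, -1.0])
# Direct polynomial evaluation agrees with the matrix form x^T A x.
poly = 2 * x[0]**2 + 2 * x[0] * x[1] + 2 * x[1]**2
assert np.isclose(x @ A @ x, poly)

# Congruence: under x = P y, the form is represented by P^T A P.
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
y = np.linalg.solve(P, x)              # y such that x = P y
assert np.isclose(y @ (P.T @ A @ P) @ y, poly)
```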

Eigenvalue characterization

The spectral theorem for real symmetric matrices states that any such matrix A \in \mathbb{R}^{n \times n} is orthogonally diagonalizable, meaning there exists an orthogonal matrix Q (with Q^T Q = I) and a diagonal matrix D = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n) containing the real eigenvalues \lambda_i of A, such that A = Q D Q^T. This decomposition arises from the fact that symmetric matrices have real eigenvalues and can be diagonalized by an orthonormal basis of eigenvectors. For a quadratic form Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}, the definiteness is fully characterized by the eigenvalues of A: the form is positive definite if and only if all \lambda_i > 0, negative definite if all \lambda_i < 0, and indefinite otherwise (with both positive and negative eigenvalues). Substituting the spectral decomposition into the quadratic form yields Q(\mathbf{x}) = \sum_{i=1}^n \lambda_i ( \mathbf{q}_i^T \mathbf{x} )^2, where \mathbf{q}_i are the columns of Q, directly showing that the sign of Q(\mathbf{x}) for \mathbf{x} \neq \mathbf{0} depends on the signs of the \lambda_i. The Rayleigh quotient provides bounds on the eigenvalues via the quadratic form: for \mathbf{x} \neq \mathbf{0}, R(\mathbf{x}) = \frac{Q(\mathbf{x})}{\mathbf{x}^T \mathbf{x}} = \frac{\mathbf{x}^T A \mathbf{x}}{\|\mathbf{x}\|^2}, and the minimum value of R(\mathbf{x}) over all such \mathbf{x} is the smallest eigenvalue \lambda_{\min}, while the maximum is the largest eigenvalue \lambda_{\max}. Thus, Q(\mathbf{x}) > 0 for all \mathbf{x} \neq \mathbf{0} if and only if \lambda_{\min} > 0, confirming positive definiteness through this variational characterization. In practice, definiteness is often checked numerically by computing the eigenvalues of A using the QR algorithm, which iteratively applies QR decompositions to converge to the Schur form (diagonal for symmetric matrices) and is highly efficient for dense symmetric matrices of moderate size.
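The eigenvalue characterization translates directly into code. A minimal sketch, using NumPy's `eigvalsh` (a symmetric eigensolver): the `classify` helper and its tolerance are illustrative choices, and the final loop spot-checks the Rayleigh-quotient bounds.

```python
import numpy as np

def classify(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    w = np.linalg.eigvalsh(A)   # real eigenvalues in ascending order
    if w[0] > tol:
        return "positive definite"
    if w[-1] < -tol:
        return "negative definite"
    if w[0] < -tol and w[-1] > tol:
        return "indefinite"
    return "semidefinite"

A = np.array([[2.0, 1.0], [1.0, 2.0]])
assert classify(A) == "positive definite"
assert classify(-A) == "negative definite"
assert classify(np.diag([1.0, -1.0])) == "indefinite"

# The Rayleigh quotient R(x) lies between the extreme eigenvalues.
w = np.linalg.eigvalsh(A)          # [1, 3] for this A
rng = np.random.default_rng(2)
for _ in range(100):
    x = rng.standard_normal(2)
    R = (x @ A @ x) / (x @ x)
    assert w[0] - 1e-12 <= R <= w[-1] + 1e-12
```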

Examples and applications

Low-dimensional examples

In one dimension, a quadratic form takes the simple form Q(x) = a x^2, where a \in \mathbb{R}. This is positive definite if and only if a > 0, in which case Q(x) > 0 for all x \neq 0. In two dimensions, the quadratic form Q(x,y) = x^2 + y^2 provides a basic positive definite example, associated with the identity matrix \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, whose eigenvalues are both 1 > 0. A less trivial case is Q(x,y) = 2x^2 + 2xy + 2y^2, corresponding to the matrix \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}; the eigenvalues of this matrix are 3 and 1, both positive, confirming positive definiteness. For contrast, the two-dimensional quadratic form Q(x,y) = x^2 - y^2 is indefinite, as it evaluates to positive values along the line y=0 (except at the origin) and negative values along x=0 (except at the origin), with eigenvalues 1 and -1. The level sets \{(x,y) \mid Q(x,y) = 1\} of a positive definite quadratic form in two dimensions are ellipses centered at the origin, reflecting the form's bounded, positive nature away from zero. In the indefinite case, such as Q(x,y) = x^2 - y^2, the level sets are hyperbolas.
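The two-dimensional examples above can be checked in a few lines of NumPy (the variable names are illustrative):

```python
import numpy as np

# Matrices for the 2-D examples above.
I2  = np.eye(2)                               # Q = x^2 + y^2
A   = np.array([[2.0, 1.0], [1.0, 2.0]])      # Q = 2x^2 + 2xy + 2y^2
Hyp = np.diag([1.0, -1.0])                    # Q = x^2 - y^2

assert np.allclose(np.linalg.eigvalsh(I2), [1.0, 1.0])    # both 1
assert np.allclose(np.linalg.eigvalsh(A), [1.0, 3.0])     # both positive
assert np.allclose(np.linalg.eigvalsh(Hyp), [-1.0, 1.0])  # mixed signs

# The indefinite form takes both signs on nonzero vectors.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
assert e1 @ Hyp @ e1 > 0 and e2 @ Hyp @ e2 < 0
```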

Optimization contexts

Definite quadratic forms play a central role in unconstrained optimization problems, where the objective is a quadratic function f(\mathbf{x}) = \frac{1}{2} \mathbf{x}^T A \mathbf{x}, with A symmetric positive definite. In this case, the unique global minimizer occurs at \mathbf{x} = \mathbf{0}, as the gradient vanishes there and the Hessian \nabla^2 f(\mathbf{x}) = A is positive definite everywhere, guaranteeing strict convexity and a well-defined minimum. This structure simplifies analysis, as the second-order Taylor expansion exactly matches the function itself, allowing Newton-type methods to reach the minimizer in a single step.

In constrained optimization, definite quadratic forms underpin quadratic programming, formulated as minimizing f(\mathbf{x}) = \frac{1}{2} \mathbf{x}^T A \mathbf{x} + \mathbf{b}^T \mathbf{x} subject to linear inequalities or equalities, where A positive definite ensures the objective is strictly convex. This convexity implies that any local minimum is global and unique, facilitating efficient solving via interior-point or active-set methods, with strong duality holding under Slater's condition. The positive definiteness of A also bounds the condition number, aiding numerical stability in solvers. The eigenvalues of A determine this condition number, influencing convergence rates in iterative algorithms.

Newton's method leverages the definiteness of the Hessian in optimization, approximating the objective locally as a quadratic form and solving for the minimizer of that model at each iteration. For twice-differentiable functions with a positive definite Hessian near the optimum, the method exhibits quadratic convergence, as the update \Delta \mathbf{x} = -H^{-1} \nabla f provides a descent direction aligned with the local curvature. This reliance on definiteness extends to modified versions with Hessian regularization or trust regions to handle cases where the Hessian may temporarily lose positive definiteness.
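For a quadratic objective with positive definite Hessian, a single Newton step from any starting point lands exactly on the global minimizer \mathbf{x}^* = -A^{-1}\mathbf{b}. A minimal sketch with illustrative data:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])   # symmetric positive definite Hessian
b = np.array([-1.0, 2.0])

def f(x):
    """Quadratic objective f(x) = (1/2) x^T A x + b^T x."""
    return 0.5 * x @ A @ x + b @ x

def grad(x):
    return A @ x + b

# One Newton step x - H^{-1} grad f from an arbitrary start
# reaches the unique global minimizer x* = -A^{-1} b.
x0 = np.array([10.0, -7.0])
x1 = x0 - np.linalg.solve(A, grad(x0))

x_star = np.linalg.solve(A, -b)
assert np.allclose(x1, x_star)
assert np.allclose(grad(x1), 0.0)   # first-order optimality holds exactly
```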
In economic applications, definite quadratic forms model agent preferences through utility functions, such as in portfolio selection where the objective maximizes expected return minus a risk penalty: U(\mathbf{x}) = \boldsymbol{\mu}^T \mathbf{x} - \frac{\lambda}{2} \mathbf{x}^T \Sigma \mathbf{x}, with \Sigma positive definite ensuring concavity and a unique optimum. This structure, introduced in Markowitz's mean-variance analysis, yields tractable solutions via quadratic optimization, capturing trade-offs between return and variance without assuming full distributional knowledge.
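Because \Sigma is positive definite, the first-order condition \boldsymbol{\mu} - \lambda \Sigma \mathbf{x} = 0 gives the unique unconstrained optimum \mathbf{x}^* = \Sigma^{-1}\boldsymbol{\mu} / \lambda. A small sketch with hypothetical return and covariance numbers (all data here are made-up illustrations, not market values):

```python
import numpy as np

mu = np.array([0.08, 0.05])                     # hypothetical expected returns
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.02]])                # positive definite covariance
lam = 4.0                                       # risk-aversion parameter

def U(x):
    """Mean-variance utility U(x) = mu^T x - (lam/2) x^T Sigma x."""
    return mu @ x - 0.5 * lam * x @ Sigma @ x

# First-order condition mu - lam * Sigma x = 0  =>  x* = Sigma^{-1} mu / lam.
x_star = np.linalg.solve(Sigma, mu) / lam

# Concavity (Sigma positive definite) makes x* the unique global maximum:
# random perturbations never improve the utility.
rng = np.random.default_rng(3)
for _ in range(50):
    assert U(x_star) >= U(x_star + 0.1 * rng.standard_normal(2))
```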

Physical and statistical uses

In physics, the kinetic energy of a mechanical system expressed in generalized coordinates takes the form of a positive definite quadratic form T(\mathbf{v}) = \frac{1}{2} \mathbf{v}^T M \mathbf{v}, where \mathbf{v} is the velocity vector and M is the symmetric positive definite mass matrix. This representation ensures that the kinetic energy is strictly positive for any non-zero velocity, reflecting the physical requirement that kinetic energy vanishes only at rest. In the theory of elasticity, the strain energy density for linear materials is modeled as a positive definite form in the components of the strain tensor. This expression, often denoted as W(\epsilon) = \frac{1}{2} \epsilon^T C \epsilon where C is the positive definite stiffness tensor, captures the restorative forces due to deformations and guarantees material stability by ensuring non-negative energy with positivity for non-trivial strains.

In statistics, the covariance matrix \Sigma of a multivariate normal distribution is required to be positive definite to ensure the distribution is non-degenerate and well-defined over the full-dimensional space. The Mahalanobis distance, defined as d(\mathbf{x}, \mu) = \sqrt{(\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)}, arises as the square root of a positive definite quadratic form that generalizes the Euclidean distance by accounting for correlations in the data. Similarly, in ordinary least squares regression, the sum of squared residuals forms a quadratic function in the parameter vector \beta, specifically (\mathbf{y} - X\beta)^T (\mathbf{y} - X\beta), whose Hessian 2 X^T X is positive definite when the design matrix X has full column rank, ensuring a unique minimizer. In these models, positive definiteness guarantees well-posedness by preventing singularities or ill-conditioned behaviors.

In number theory, positive definite quadratic forms are used to study the representation of integers as sums of squares. For example, the form Q(x,y) = x^2 + y^2 represents primes congruent to 1 modulo 4, as per Fermat's theorem on sums of two squares, and more generally, binary positive definite forms classify ideals in quadratic fields, central to computing class numbers via reduction theory.
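The Mahalanobis distance formula above is straightforward to implement; this sketch (with an illustrative covariance matrix) checks that the distance vanishes only at the mean and reduces to the Euclidean distance when \Sigma is the identity.

```python
import numpy as np

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])   # positive definite covariance (illustrative)
mu = np.array([1.0, -1.0])

def mahalanobis(x, mu, Sigma):
    """d(x, mu) = sqrt((x - mu)^T Sigma^{-1} (x - mu))."""
    d = x - mu
    return np.sqrt(d @ np.linalg.solve(Sigma, d))

# Sigma positive definite => Sigma^{-1} positive definite, so the distance
# is zero only at the mean and strictly positive elsewhere.
assert mahalanobis(mu, mu, Sigma) == 0.0
x = np.array([2.0, 0.0])
assert mahalanobis(x, mu, Sigma) > 0

# With Sigma = I the Mahalanobis distance is the Euclidean distance.
assert np.isclose(mahalanobis(x, mu, np.eye(2)), np.linalg.norm(x - mu))
```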
In machine learning, positive definite quadratic forms appear in kernel methods, such as support vector machines (SVMs), where the kernel matrix K must be positive semi-definite to define a valid inner product in the feature space, ensuring the optimization problem of maximizing the margin is convex and solvable. For instance, the radial basis function kernel K(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2) yields a positive definite matrix, facilitating non-linear classification.
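The positive definiteness of the RBF Gram matrix can be observed numerically. A minimal sketch (the helper name `rbf_kernel_matrix` is illustrative): for distinct input points the kernel matrix is symmetric with strictly positive eigenvalues, so \mathbf{z}^T K \mathbf{z} > 0 for all nonzero \mathbf{z}.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """K[i, j] = exp(-gamma * ||x_i - x_j||^2) for the rows of X."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(4)
X = rng.standard_normal((6, 3))          # six points in R^3
K = rbf_kernel_matrix(X, gamma=0.5)

# The Gram matrix is symmetric, and for distinct points its eigenvalues
# are strictly positive, i.e. the matrix is positive definite.
assert np.allclose(K, K.T)
assert np.linalg.eigvalsh(K).min() > 0
```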