Characteristic equation

In mathematics, the characteristic equation is a polynomial equation that plays a central role in determining eigenvalues of square matrices and solving linear homogeneous ordinary differential equations with constant coefficients. In linear algebra, for an n \times n square matrix A, the characteristic equation is given by \det(\lambda I_n - A) = 0, where I_n is the n \times n identity matrix and \lambda is a scalar variable; the roots of this equation are precisely the eigenvalues of A, which describe how the matrix scales vectors in certain directions. The associated polynomial \det(\lambda I_n - A) is known as the characteristic polynomial, a monic polynomial of degree n whose coefficients are functions of the entries of A, such as the trace and determinant for low dimensions. This equation underpins key theorems like the spectral theorem for symmetric matrices and the Cayley-Hamilton theorem, which states that every square matrix satisfies its own characteristic equation.

In the context of differential equations, the characteristic equation arises when solving an nth-order linear homogeneous equation a_n y^{(n)} + a_{n-1} y^{(n-1)} + \dots + a_1 y' + a_0 y = 0 by assuming a solution of the form y = e^{rt}, leading to the auxiliary equation a_n r^n + a_{n-1} r^{n-1} + \dots + a_1 r + a_0 = 0. The roots r of this equation dictate the form of the general solution: distinct real roots yield exponential terms c_i e^{r_i t}, repeated roots introduce polynomial factors like t^k e^{rt}, and complex conjugate roots \alpha \pm \beta i produce oscillatory solutions involving sines and cosines via Euler's formula. This method is essential for analyzing systems in physics, engineering, and other fields where such equations model damped oscillations or growth processes.
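The two uses can be checked side by side numerically. The following is a minimal NumPy sketch (the matrix and the ODE are illustrative choices made here, not drawn from a particular source) that computes a matrix's characteristic polynomial and eigenvalues, and the auxiliary roots of an ODE:

```python
import numpy as np

# Matrix sense: the eigenvalues of A are the roots of det(lambda*I - A) = 0.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
coeffs = np.poly(A)                # characteristic polynomial coefficients
print(coeffs)                      # [ 1. -4.  3.]  i.e. lambda^2 - 4*lambda + 3
print(np.roots(coeffs))            # [3. 1.]
print(np.linalg.eigvals(A))        # the same values (order may differ)

# ODE sense: y'' + 3y' + 2y = 0 has auxiliary equation r^2 + 3r + 2 = 0.
print(np.roots([1.0, 3.0, 2.0]))   # [-2. -1.], so y = c1*e^(-t) + c2*e^(-2t)
```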

Introduction

Definition

The characteristic equation is a fundamental polynomial equation in mathematics that arises in the analysis of linear systems, particularly in linear algebra and differential equations. In the context of linear algebra, it is defined as the equation \det(A - \lambda I) = 0, where A is an n \times n square matrix, \lambda is a scalar variable, and I is the n \times n identity matrix; the roots of this equation, known as eigenvalues, reveal essential properties of the matrix such as its spectral decomposition and stability characteristics. In the realm of ordinary differential equations (ODEs), for a linear homogeneous ODE with constant coefficients of order n, such as y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0, the characteristic equation takes the form r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0 = 0, where the roots r determine the form of the general solution and system behavior like exponential growth or decay. This equation emerges naturally from the requirement to find non-trivial solutions to linear homogeneous systems, such as A \mathbf{v} = \lambda \mathbf{v} for matrices or assuming solutions y = e^{rt} for ODEs, leading to a polynomial whose degree equals the dimension of the system or the order of the equation. The polynomial nature of the equation ensures that the roots can be analyzed using algebraic tools, providing insights into the system's behavior without solving the full original problem. A simple example illustrates this for a 2 \times 2 matrix A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, where the characteristic equation simplifies to \det(A - \lambda I) = \lambda^2 - (a+d)\lambda + (ad - bc) = 0, or equivalently \lambda^2 - \operatorname{tr}(A) \lambda + \det(A) = 0, with the roots indicating the eigenvalues.
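As a brief symbolic check of the 2 \times 2 case, the SymPy sketch below (symbol names are arbitrary) expands \det(A - \lambda I) and confirms it equals \lambda^2 - \operatorname{tr}(A)\lambda + \det(A):

```python
import sympy as sp

a, b, c, d, lam = sp.symbols('a b c d lambda')
A = sp.Matrix([[a, b],
               [c, d]])

# Expand det(A - lambda*I) for the general 2x2 matrix.
p = sp.expand(sp.det(A - lam * sp.eye(2)))
print(p)  # lambda**2 - a*lambda - d*lambda + a*d - b*c

# Difference with lambda^2 - tr(A)*lambda + det(A) simplifies to zero.
print(sp.simplify(p - (lam**2 - A.trace() * lam + A.det())))  # 0
```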

Historical Development

The origins of the characteristic equation can be traced to the early 18th century. Exponential trial solutions for second-order linear homogeneous ordinary differential equations with constant coefficients were first employed around 1730. Leonhard Euler extended this approach to higher-order equations in 1743, reducing the differential equation to an algebraic equation in r—the precursor to the modern characteristic equation—whose roots determined the general solution form. This innovation was essential for analyzing oscillatory phenomena and marked an early systematic approach to such equations. During the late 18th and early 19th centuries, Joseph-Louis Lagrange and Carl Friedrich Gauss advanced the formalization of related concepts through their studies of quadratic forms and determinants, bridging the analytic theory to algebraic structures. Lagrange's investigations into quadratic forms, particularly in his 1788 Mécanique analytique and earlier works on planetary perturbations around 1782, involved diagonalizing symmetric matrices and using what he called the "secular equation"—a determinant-based equation akin to the characteristic equation—to identify principal axes and invariants. Gauss built on this in his 1801 Disquisitiones Arithmeticae, where he systematized the theory of binary quadratic forms, employing determinant computations to classify forms and their equivalence classes, thereby providing tools that facilitated the later recognition of eigenvalues in finite-dimensional linear algebra. Augustin-Louis Cauchy played a pivotal role in standardizing the terminology and its application to matrices and differential equations. In his 1829 memoir on secular perturbations in astronomy, Cauchy utilized the equation \det(A - \lambda I) = 0 to prove that symmetric matrices possess real eigenvalues, though without yet naming it. He formally introduced the term "équation caractéristique" in 1840 in his Mémoire sur l'intégration des équations différentielles, applying it to the auxiliary polynomial for solving linear systems of differential equations and linking it explicitly to eigenvalues. The early 20th century saw extensions to infinite-dimensional settings through David Hilbert's foundational contributions to functional analysis. In his 1904 treatise Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Hilbert generalized the characteristic equation to linear integral equations, introducing the concepts of "Eigenwerte" (eigenvalues) and eigenfunctions for compact operators on infinite-dimensional spaces, which formed the basis for spectral theory in Hilbert spaces. This work, building on finite-dimensional precedents, enabled the analysis of continuous systems and profoundly influenced the development of functional analysis and quantum mechanics.

Linear Algebra Context

Characteristic Polynomial of a Matrix

The characteristic polynomial of an n \times n matrix A over the complex numbers is defined as the polynomial p_A(\lambda) = \det(\lambda I_n - A), where I_n is the n \times n identity matrix and \lambda is a scalar variable. This yields a monic polynomial of degree exactly n, with leading coefficient 1. Some texts employ the alternative convention \det(A - \lambda I_n), which differs by a factor of (-1)^n but shares the same roots. The polynomial is derived from the matrix pencil \lambda I_n - A, a parameterized family of matrices whose determinant provides a scalar-valued function of \lambda. For small n, it can be computed explicitly via cofactor expansion. For instance, expanding along the first row of \lambda I_n - A involves summing signed minors, each of which is itself a determinant of a smaller matrix. Key properties include the form p_A(\lambda) = \lambda^n - (\operatorname{tr} A) \lambda^{n-1} + \cdots + (-1)^n \det A, where the trace \operatorname{tr} A is the sum of the diagonal entries and appears as the negative coefficient of \lambda^{n-1}. More generally, the coefficients are, up to alternating signs, the elementary symmetric functions of the eigenvalues \lambda_1, \dots, \lambda_n of A: specifically, p_A(\lambda) = \prod_{i=1}^n (\lambda - \lambda_i) = \lambda^n - s_1 \lambda^{n-1} + s_2 \lambda^{n-2} - \cdots + (-1)^n s_n, where s_k is the k-th elementary symmetric sum of the eigenvalues. The constant term is thus (-1)^n \det A, reflecting the product of the eigenvalues up to sign. For a 2 \times 2 matrix A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, the characteristic polynomial is p_A(\lambda) = \lambda^2 - (a + d) \lambda + (ad - bc) = \lambda^2 - (\operatorname{tr} A) \lambda + \det A. For a 3 \times 3 matrix A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, cofactor expansion yields p_A(\lambda) = \lambda^3 - (\operatorname{tr} A) \lambda^2 + \left( \sum_{i=1}^3 \det A_{ii} \right) \lambda - \det A, where A_{ii} is the 2 \times 2 principal minor obtained by deleting row i and column i, and the coefficient of \lambda is the sum of those principal minors. By the fundamental theorem on eigenvalues, the roots of p_A(\lambda) = 0 (counted with algebraic multiplicity) are precisely the eigenvalues of A. These roots enable the identification of eigenvalues, which are subsequently used to compute corresponding eigenvectors.
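The 3 \times 3 coefficient formula can be verified numerically. The sketch below uses an arbitrary example matrix (chosen so the roots come out to 1, 2, and 4) and rebuilds the coefficients from the trace, principal minors, and determinant:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# np.poly returns the coefficients of det(lambda*I - A), highest degree first.
print(np.poly(A))  # [ 1. -7. 14. -8.]  ->  lambda^3 - 7*lambda^2 + 14*lambda - 8

# Rebuild the same coefficients from the closed form in the text:
# [1, -tr(A), sum of 2x2 principal minors, -det(A)].
tr = np.trace(A)
minors = sum(np.linalg.det(np.delete(np.delete(A, i, 0), i, 1)) for i in range(3))
print([1.0, -tr, minors, -np.linalg.det(A)])  # matches up to rounding
```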

Eigenvalues and Eigenvectors

In linear algebra, the roots of the characteristic equation of a square matrix A are precisely the eigenvalues \lambda of A, which satisfy the equation A\mathbf{v} = \lambda \mathbf{v} for some nonzero vector \mathbf{v}, known as an eigenvector corresponding to \lambda. This relationship arises directly from the definition of the characteristic equation \det(A - \lambda I) = 0, where the solutions \lambda identify the scalars by which A scales its eigenvectors during linear transformations. Geometrically, eigenvectors represent directions unchanged by the transformation except for scaling by \lambda, providing insight into the matrix's action on vector spaces. Each eigenvalue \lambda has an algebraic multiplicity, defined as its multiplicity as a root of the characteristic polynomial, and a geometric multiplicity, which is the dimension of the corresponding eigenspace (the null space of A - \lambda I). The geometric multiplicity is always less than or equal to the algebraic multiplicity, and equality holds for all eigenvalues if and only if the matrix is diagonalizable. A matrix is defective if, for some eigenvalue, the geometric multiplicity is strictly less than the algebraic multiplicity, as in the example of the matrix \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, where \lambda = 1 has algebraic multiplicity 2 but geometric multiplicity 1, yielding only one independent eigenvector. An n \times n matrix A is diagonalizable if it possesses a full set of n linearly independent eigenvectors, allowing a factorization P^{-1}AP = D where P has the eigenvectors as columns and D is a diagonal matrix with the eigenvalues on the diagonal. This simplifies computations like matrix powers, since A = PDP^{-1} implies A^k = PD^kP^{-1}. Distinct eigenvalues guarantee linear independence of the corresponding eigenvectors, but repeated eigenvalues may or may not, depending on the eigenspace dimension. For instance, consider the 2×2 rotation matrix R = \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix} for \theta \neq 0, \pi, which has characteristic equation roots \lambda = e^{\pm i\theta}, complex eigenvalues with no real eigenvectors since the real eigenspaces are trivial. Geometrically, this reflects the matrix's pure rotation without scaling or fixed directions in the real plane. When a matrix is not diagonalizable, its Jordan canonical form provides a block-diagonal representation with Jordan blocks featuring the eigenvalue on the diagonal and 1's on the superdiagonal, capturing the structure of generalized eigenvectors for defective cases. This form generalizes diagonalization while preserving the eigenvalues.
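A short NumPy check of the defective example from the text, alongside a diagonalizable matrix (chosen here purely for illustration) to demonstrate the A^k = P D^k P^{-1} computation:

```python
import numpy as np

# The defective matrix from the text: lambda = 1 has algebraic multiplicity 2.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
print(np.linalg.eigvals(A))                      # [1. 1.]

# Geometric multiplicity = dim null(A - 1*I) = n - rank(A - I) = 2 - 1 = 1,
# so A has only one independent eigenvector and is not diagonalizable.
print(2 - np.linalg.matrix_rank(A - np.eye(2)))  # 1

# A diagonalizable matrix, by contrast, allows A^k = P D^k P^(-1).
B = np.array([[4.0, 1.0],
              [2.0, 3.0]])                       # eigenvalues 5 and 2
w, P = np.linalg.eig(B)
print(P @ np.diag(w**3) @ np.linalg.inv(P))      # equals matrix_power(B, 3)
print(np.linalg.matrix_power(B, 3))
```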

Differential Equations Context

For Linear Homogeneous ODEs with Constant Coefficients

In the context of ordinary differential equations, the characteristic equation is employed to solve linear homogeneous equations with constant coefficients. The general form of such an nth-order equation is a_n y^{(n)}(x) + a_{n-1} y^{(n-1)}(x) + \cdots + a_1 y'(x) + a_0 y(x) = 0, where the coefficients a_k (for k = 0, 1, \dots, n) are real constants with a_n \neq 0. To find solutions, an exponential trial solution y(x) = e^{rx} is assumed, where r is a constant to be determined. The derivatives follow as y^{(k)}(x) = r^k e^{rx} for each order k. Substituting these into the equation yields e^{rx} \left( a_n r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0 \right) = 0. Since e^{rx} \neq 0 for all x, the equation simplifies to the characteristic equation a_n r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0 = 0, a polynomial equation of degree n in r. The roots of this characteristic equation determine the structure of the solutions to the original differential equation. Given real coefficients, the roots are either real numbers or complex numbers appearing in conjugate pairs. Real roots may be distinct or have multiplicities (repeated roots), while complex roots take the form \alpha \pm i\beta with \beta \neq 0. For higher-order equations, multiplicities can exceed two, affecting the solution basis accordingly. As an example, consider the second-order equation y''(x) + 3y'(x) + 2y(x) = 0. The corresponding characteristic equation is r^2 + 3r + 2 = 0, which factors as (r + 1)(r + 2) = 0, giving roots r = -1 and r = -2, both real and distinct.
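The worked example can be reproduced symbolically. The SymPy sketch below solves the characteristic equation directly and then lets dsolve rebuild the general solution (the constant labels C1, C2 and their pairing with the roots follow SymPy's conventions and may vary):

```python
import sympy as sp

t, r = sp.symbols('t r')
y = sp.Function('y')

# Roots of the characteristic equation r^2 + 3r + 2 = 0.
print(sp.solve(r**2 + 3*r + 2, r))  # [-2, -1]

# dsolve assembles the general solution from those roots.
ode = y(t).diff(t, 2) + 3*y(t).diff(t) + 2*y(t)
print(sp.dsolve(ode, y(t)))  # Eq(y(t), C1*exp(-2*t) + C2*exp(-t))
```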

Solution via Characteristic Roots

Once the roots of the characteristic equation are determined for an nth-order linear homogeneous ordinary differential equation (ODE) with constant coefficients, the general solution is constructed as a linear combination of basis functions derived from those roots. This approach leverages the fact that exponential functions e^{rt} form solutions corresponding to each root r, with modifications for repeated roots and complex values to ensure linear independence. The arbitrary constants in the linear combination are then fixed by initial or boundary conditions to yield a unique solution. For distinct real roots r_1, r_2, \dots, r_n, the general solution takes the form y(t) = c_1 e^{r_1 t} + c_2 e^{r_2 t} + \dots + c_n e^{r_n t}, where c_1, c_2, \dots, c_n are arbitrary constants determined by initial conditions. This structure arises because each distinct root produces a linearly independent exponential solution. When a real root r has multiplicity m > 1, the corresponding basis functions include polynomial factors to maintain linear independence, yielding terms t^k e^{rt} for k = 0, 1, \dots, m-1. The contribution to the general solution from this repeated root is thus e^{rt} (c_1 + c_2 t + \dots + c_m t^{m-1}), combined linearly with solutions from other roots. For instance, consider the second-order equation y'' - 2y' + y = 0, whose characteristic equation r^2 - 2r + 1 = 0 has a repeated root r = 1. The general solution is y(t) = (c_1 + c_2 t) e^t. For complex conjugate roots \alpha \pm \beta i (with \beta \neq 0), the solutions are expressed in real form using Euler's formula, producing oscillatory components modulated by exponential decay or growth. The pair contributes e^{\alpha t} (c_1 \cos(\beta t) + c_2 \sin(\beta t)) to the general solution, ensuring real-valued functions. If such a pair has multiplicity greater than one, additional factors of t^k (for k = 0 up to the multiplicity minus one) are included for each trigonometric term. Applying initial conditions to the full general solution determines the specific solution satisfying the problem constraints.
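A brief SymPy sketch of the repeated-root case from the text and of a complex-root case (the equation y'' + 2y' + 5y = 0 is an added illustrative example, not from the text above):

```python
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')

# Repeated root: r^2 - 2r + 1 = (r - 1)^2 gives the basis {e^t, t*e^t}.
print(sp.dsolve(y(t).diff(t, 2) - 2*y(t).diff(t) + y(t), y(t)))
# Eq(y(t), (C1 + C2*t)*exp(t))

# Complex pair: r^2 + 2r + 5 = 0 has roots -1 +/- 2i (decaying oscillation).
print(sp.dsolve(y(t).diff(t, 2) + 2*y(t).diff(t) + 5*y(t), y(t)))
# Eq(y(t), (C1*sin(2*t) + C2*cos(2*t))*exp(-t))
```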

Other Mathematical Contexts

In Control Theory

In control theory, the characteristic equation is fundamental to analyzing the stability and dynamic performance of linear time-invariant (LTI) systems, as its roots determine the locations of the system's poles in the complex plane, which govern the transient response and asymptotic behavior. For systems modeled in state-space form as \dot{x} = Ax + Bu and y = Cx + Du, the characteristic equation arises from the condition \det(sI - A) = 0, where A is the system matrix and s is the complex variable; the roots of this equation are the eigenvalues of A, representing the natural frequencies and damping characteristics of the system's modes. This formulation directly links the internal dynamics to external inputs and outputs, enabling the design of controllers that place these roots to achieve desired stability and response specifications. The roots of the state-space characteristic equation correspond precisely to the poles of the system's transfer function G(s) = C(sI - A)^{-1}B + D, which describes the input-output relationship in the Laplace domain; these poles dictate the system's stability, as all of them must lie in the open left-half s-plane for asymptotic stability. Stability analysis often employs the Routh-Hurwitz criterion, which provides necessary and sufficient conditions on the coefficients of the characteristic polynomial to ensure all roots have negative real parts without explicitly solving for them; for a quadratic characteristic equation s^2 + as + b = 0, the system is stable if and only if a > 0 and b > 0, corresponding to positive damping and stiffness terms in second-order systems. This extends to higher-order polynomials via the Routh array, facilitating rapid stability assessment in design. A practical application appears in feedback control systems, where the characteristic equation is derived from the closed-loop transfer function and analyzed using the root locus technique to visualize pole migration as a gain varies. For instance, consider a unity-feedback system with open-loop transfer function G(s) = K \frac{s+1}{s^2 + 2s + 1}, yielding the characteristic equation 1 + K \frac{s+1}{s^2 + 2s + 1} = 0; the root locus begins at the open-loop poles—a double pole at s = -1—and as K increases from 0, one branch remains fixed at s = -1, while the other moves leftward along the real axis toward negative infinity, revealing stable pole locations for all K > 0 and aiding in selecting K for the desired transient response. This method highlights how gain adjustments influence performance metrics like overshoot and settling time without repeated eigenvalue computations.
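A small numerical sketch of the root-locus example (the gain values are chosen arbitrarily): the closed-loop characteristic polynomial is s^2 + (2+K)s + (1+K), and its roots confirm the fixed pole at s = -1 and the left-moving pole at s = -(1+K).

```python
import numpy as np

# Closed-loop characteristic equation for unity feedback with
# G(s) = K(s+1)/(s^2+2s+1):
#   (s^2 + 2s + 1) + K(s + 1) = 0  =>  s^2 + (2+K)s + (1+K) = 0
for K in [0.0, 1.0, 5.0, 20.0]:
    poles = np.roots([1.0, 2.0 + K, 1.0 + K])
    print(K, np.sort(poles))  # one pole stays at -1, the other at -(1+K)

# Routh-Hurwitz for the quadratic: stable iff 2+K > 0 and 1+K > 0,
# i.e. for every K > -1, consistent with the left-moving branch.
```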

In Difference Equations

In linear homogeneous difference equations with constant coefficients, the characteristic equation plays a central role in determining the general solution, analogous to its role in ordinary differential equations but adapted to discrete-time systems. These equations take the form a_n y_{k+n} + a_{n-1} y_{k+n-1} + \cdots + a_1 y_{k+1} + a_0 y_k = 0, where the a_i are constants and n is the order of the recurrence. To solve such an equation, one assumes a solution of the form y_k = r^k, where r is a constant to be determined. Substituting this assumed form into the equation yields the characteristic equation: a_n r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0 = 0. The roots of this characteristic equation determine the form of the general solution. For an n-th order equation with n distinct roots r_1, r_2, \dots, r_n, the general solution is y_k = c_1 r_1^k + c_2 r_2^k + \cdots + c_n r_n^k, where the c_i are arbitrary constants determined by initial conditions. If there are repeated roots, say a root r with multiplicity m, the corresponding terms in the solution include polynomial factors in k: (c_1 + c_2 k + \cdots + c_m k^{m-1}) r^k. For complex roots, which occur in conjugate pairs for equations with real coefficients, the solution can be expressed using polar form: if r = \rho e^{\pm i \theta}, then the terms are \rho^k (A \cos(k \theta) + B \sin(k \theta)). A simple example illustrates this approach. Consider the second-order equation y_{k+2} - 3 y_{k+1} + 2 y_k = 0. The characteristic equation is r^2 - 3r + 2 = 0, which factors as (r-1)(r-2) = 0, yielding distinct roots r=1 and r=2. Thus, the general solution is y_k = c_1 \cdot 1^k + c_2 \cdot 2^k = c_1 + c_2 2^k. In the context of the z-transform, which is the discrete analog of the Laplace transform, the roots of the characteristic equation correspond to the poles of the transfer function for the system described by the difference equation. This connection is fundamental in digital signal processing and discrete-time control for analyzing stability and response in discrete-time systems.
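The worked example can be checked directly. In the Python sketch below, the initial values y_0 = 0 and y_1 = 1 are chosen for illustration; they fix the constants to c_1 = -1 and c_2 = 1, and the closed form is compared against iterating the recurrence:

```python
# Check of the worked example: y_{k+2} - 3*y_{k+1} + 2*y_k = 0.
# Characteristic roots 1 and 2 give y_k = c1 + c2 * 2**k; initial values
# y_0 = 0, y_1 = 1 yield c1 = -1, c2 = 1.

def by_recurrence(n):
    y0, y1 = 0, 1
    for _ in range(n):
        y0, y1 = y1, 3 * y1 - 2 * y0
    return y0

for k in range(10):
    assert by_recurrence(k) == -1 + 2**k
print("closed form matches the recurrence for k = 0..9")
```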
