
Gershgorin circle theorem

The Gershgorin circle theorem provides a way to localize the eigenvalues of an n \times n complex square matrix A = (a_{ij}) within the complex plane. It states that every eigenvalue of A lies in at least one of the closed disks (called Gershgorin disks) centered at the diagonal entry a_{ii} for i = 1, \dots, n, with radius r_i(A) = \sum_{j \neq i} |a_{ij}|, the sum of the absolute values of the off-diagonal entries in the i-th row. Named after the Soviet mathematician Semyon Gershgorin, the theorem appeared in his 1931 paper "Über die Abgrenzung der Eigenwerte einer Matrix," published in the proceedings of the Academy of Sciences of the USSR. Although the result built on earlier ideas related to diagonal dominance (such as those by Lévy in 1881, Desplanques in 1887, and Hadamard in 1903), it was the first to explicitly bound all eigenvalues using row-sum disks. The proof relies on selecting an eigenvector and identifying the component with maximum modulus to show that the corresponding eigenvalue cannot lie outside the relevant disk. A key strengthening asserts that if k of the Gershgorin disks are disjoint from the rest, then those k disks together contain exactly k eigenvalues (counting algebraic multiplicities). The theorem applies to any complex square matrix and is particularly useful for rough estimates of eigenvalue locations, proving nonsingularity for strictly diagonally dominant matrices (where |a_{ii}| > r_i(A) for all i), and analyzing the convergence of iterative methods like the Jacobi or Gauss-Seidel algorithms. Extensions include versions for generalized eigenvalue problems, operator matrices, and refined bounds using column sums or scaled radii.

Core Theorem

Statement

The Gershgorin circle theorem provides a method to localize the eigenvalues of a complex matrix within specific regions of the complex plane. For an n \times n complex matrix A = (a_{ij}), every eigenvalue \lambda of A satisfies |\lambda - a_{ii}| \leq \sum_{\substack{j=1 \\ j \neq i}}^n |a_{ij}| for at least one i = 1, \dots, n. This inequality defines the Gershgorin disks D_i = \{ z \in \mathbb{C} \mid |z - a_{ii}| \leq r_i \}, where the center of D_i is the diagonal entry a_{ii} and the radius r_i = \sum_{\substack{j=1 \\ j \neq i}}^n |a_{ij}| is the sum of the absolute values of the off-diagonal entries in the i-th row. The Gershgorin set is the union \bigcup_{i=1}^n D_i, and all eigenvalues of A lie within this set. An equivalent formulation arises by applying the theorem to the transpose A^T, which has the same eigenvalues as A. This yields disks centered at a_{ii} with radii \sum_{\substack{j=1 \\ j \neq i}}^n |a_{ji}|, corresponding to the sums of absolute values in the i-th column. The theorem was discovered by the Soviet mathematician Semyon Aronovich Gershgorin and first published in 1931 in his paper "Über die Abgrenzung der Eigenwerte einer Matrix" in Izvestiya Akademii Nauk SSSR, Otdelenie Matematicheskikh i Estestvennykh Nauk.
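The disk computation is straightforward to carry out numerically. The following NumPy sketch (the helper name gershgorin_disks and the example matrix are illustrative choices, not from the original paper) computes the centers and radii and checks that every numerically computed eigenvalue falls in at least one disk:

```python
import numpy as np

def gershgorin_disks(A):
    """Return (centers, radii) of the row-wise Gershgorin disks:
    centers are the diagonal entries a_ii, radii the sums of
    absolute off-diagonal entries in each row."""
    A = np.asarray(A, dtype=complex)
    centers = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(centers)
    return centers, radii

A = np.array([[ 4.0,  1.0, 0.5],
              [ 0.2, -3.0, 1.0],
              [ 0.1,  0.3, 1.0]])
centers, radii = gershgorin_disks(A)

# Every eigenvalue must lie in at least one disk.
for lam in np.linalg.eigvals(A):
    assert np.any(np.abs(lam - centers) <= radii + 1e-12)
```

Applying the same routine to A.T yields the column version of the disks described above.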

Proof

The proof of the Gershgorin circle theorem relies on basic properties of eigenvalues, eigenvectors, and the triangle inequality in the complex plane. Consider an n \times n complex matrix A = (a_{ij}) with eigenvalue \lambda and corresponding eigenvector x = (x_1, \dots, x_n)^T \neq 0. Without loss of generality, normalize x such that \|x\|_\infty = \max_k |x_k| = 1. Let i be an index where |x_i| = 1. The equation (A - \lambda I)x = 0 implies that the i-th component satisfies \sum_{j=1}^n (a_{ij} - \lambda \delta_{ij}) x_j = 0, where \delta_{ij} is the Kronecker delta. Rearranging gives (a_{ii} - \lambda) x_i + \sum_{j \neq i} a_{ij} x_j = 0, so \lambda - a_{ii} = \sum_{j \neq i} a_{ij} \frac{x_j}{x_i}. Taking absolute values yields |\lambda - a_{ii}| = \left| \sum_{j \neq i} a_{ij} \frac{x_j}{x_i} \right| \leq \sum_{j \neq i} |a_{ij}| \left| \frac{x_j}{x_i} \right|. Since |x_j| \leq 1 = |x_i| for all j, it follows that \left| \frac{x_j}{x_i} \right| \leq 1, and thus |\lambda - a_{ii}| \leq \sum_{j \neq i} |a_{ij}| = r_i, where r_i is the radius of the i-th Gershgorin disk. Therefore, \lambda lies in the disk centered at a_{ii} with radius r_i. The infinity norm is used here for its simplicity in bounding the ratios |x_j / x_i| directly by 1 without additional estimation. For the column version of the theorem, apply the row version to the transpose A^T. The eigenvalues of A^T coincide with those of A, and the row sums of absolute off-diagonal entries for A^T are precisely the column sums for A. Thus, every eigenvalue of A lies in at least one disk centered at a_{jj} with radius \sum_{i \neq j} |a_{ij}|.
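The key step, bounding \lambda by the disk belonging to the eigenvector's largest component, can be observed numerically. In this sketch (the random complex test matrix is an assumption for illustration), the disk selected by the argmax of |x_k| always contains the eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
lams, V = np.linalg.eig(A)      # columns of V are eigenvectors

for lam, x in zip(lams, V.T):
    i = int(np.argmax(np.abs(x)))              # index with |x_i| maximal
    r_i = np.abs(A[i]).sum() - abs(A[i, i])    # radius of the i-th disk
    # |lam - a_ii| <= r_i: exactly the inequality derived in the proof
    assert abs(lam - A[i, i]) <= r_i + 1e-9
```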

Interpretation

Disk Properties

Each Gershgorin disk D_i associated with an n \times n complex matrix A = (a_{ij}) is a closed disk in the complex plane, centered at the diagonal entry a_{ii} with radius r_i = \sum_{j \neq i} |a_{ij}|, the \ell_1-norm of the off-diagonal elements in the i-th row. The union G(A) = \bigcup_{i=1}^n D_i, known as the Gershgorin region, contains all eigenvalues of A. The disks exhibit specific invariance and scaling properties under diagonal similarity transformations. For a nonsingular diagonal matrix D = \operatorname{diag}(d_1, \dots, d_n), the transformed matrix B = D^{-1} A D has unchanged diagonal entries b_{ii} = a_{ii}, preserving the disk centers, while the new radii become r_i' = \sum_{j \neq i} |d_i^{-1} a_{ij} d_j|. Choosing the scaling factors d_k > 0 appropriately allows row and column scaling to minimize the radii, often tightening the overall bound provided by G(B), which still contains the eigenvalues of A since similarity preserves the spectrum. Disks frequently overlap, yielding conservative localization since multiple eigenvalues may lie within shared regions of G(A), potentially enlarging the effective bound beyond the precise locations. For an irreducible A, whose associated directed graph is strongly connected, the Gershgorin region G(A) is path-connected in the complex plane. If A is strictly diagonally dominant, meaning |a_{ii}| > r_i for each i, then A is nonsingular, and the Gershgorin disks bound the eigenvalues near the diagonal entries.
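The radius-scaling rule r_i' = \sum_{j \neq i} |a_{ij}| d_j / d_i can shrink the disks dramatically. In this sketch (the 2×2 matrix and the scaling vector d are hand-picked illustrations), the similarity leaves the centers fixed and even separates the disks:

```python
import numpy as np

A = np.array([[10.0, 4.0],
              [ 0.1, 1.0]])          # row radii: 4 and 0.1
d = np.array([1.0, 0.2])             # hand-picked positive scalings
B = np.diag(1 / d) @ A @ np.diag(d)  # B = D^{-1} A D

centers_A = np.diag(A); radii_A = np.abs(A).sum(1) - np.abs(centers_A)
centers_B = np.diag(B); radii_B = np.abs(B).sum(1) - np.abs(centers_B)

assert np.allclose(centers_A, centers_B)      # centers are preserved
assert np.allclose(radii_B, [0.8, 0.5])       # 4*(0.2/1) and 0.1*(1/0.2)

# The scaled disks [9.2, 10.8] and [0.5, 1.5] are disjoint, so each
# contains exactly one eigenvalue of A (similarity preserves the spectrum).
for lam in np.linalg.eigvals(A):
    assert np.sum(np.abs(lam - centers_B) <= radii_B) == 1
```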

Eigenvalue Localization

The Gershgorin circle theorem localizes all eigenvalues of a square matrix A = (a_{ij}) to the union of disks in the complex plane, each centered at a diagonal entry a_{ii} with radius r_i = \sum_{j \neq i} |a_{ij}|, thereby demonstrating that eigenvalues cluster near these diagonal entries, which often capture the primary behavior of the matrix. This clustering insight arises because the theorem exploits the row sums of absolute off-diagonal entries to bound deviations from the diagonal, providing a geometric enclosure that reflects the matrix's diagonal dominance or lack thereof. For Hermitian matrices, where all eigenvalues are real and the diagonal entries a_{ii} are real, the theorem confines the eigenvalues to the union of real line segments [a_{ii} - r_i, a_{ii} + r_i], effectively projecting the disks onto the real axis and yielding interval bounds that align directly with the spectrum's location. In general, for any matrix, the real parts of all eigenvalues lie between \min_i (\operatorname{Re}(a_{ii}) - r_i) and \max_i (\operatorname{Re}(a_{ii}) + r_i), since the horizontal extent of each disk is exactly the interval [\operatorname{Re}(a_{ii}) - r_i, \operatorname{Re}(a_{ii}) + r_i]. This has significant implications for stability analysis: if every disk lies entirely in the open left half-plane (i.e., \operatorname{Re}(a_{ii}) + r_i < 0 for all i), then all eigenvalues have negative real parts, implying that the matrix is Hurwitz stable and associated linear systems are asymptotically stable. Despite its utility, the theorem's bounds can be loose when off-diagonal entries are large relative to the diagonal, resulting in expansive radii that enclose a wide portion of the complex plane and reduce the precision of localization. Furthermore, while the disks contain the eigenvalues of the unperturbed matrix exactly, the pseudospectra, which characterize the eigenvalues of perturbed matrices and measure nonnormality, often extend beyond these boundaries, highlighting the theorem's focus on the unperturbed spectrum alone.
Compared to the spectral radius bound \rho(A) \leq \max_i (|a_{ii}| + r_i), which follows directly from the disk enclosures, the full theorem offers sharper localization by considering the union of disks rather than a single encompassing disk, particularly when diagonal entries vary substantially across rows.
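The left-half-plane condition \operatorname{Re}(a_{ii}) + r_i < 0 gives a sufficient (not necessary) Hurwitz test without computing eigenvalues. A small sketch (the function name and test matrix are illustrative choices):

```python
import numpy as np

def gershgorin_hurwitz(A):
    """Sufficient (not necessary) stability test: True when every
    Gershgorin disk lies in the open left half-plane."""
    A = np.asarray(A)
    centers = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(centers)
    return bool(np.all(centers.real + radii < 0))

A = np.array([[-5.0,  1.0,  0.5],
              [ 0.5, -4.0,  1.0],
              [ 1.0,  0.5, -3.0]])
assert gershgorin_hurwitz(A)                      # all disks end at -1.5 < 0
assert np.all(np.linalg.eigvals(A).real < 0)      # eigenvalues agree
```

The converse fails: a Hurwitz matrix can still have disks that cross the imaginary axis, which is exactly the looseness discussed above.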

Refinements

Disjoint Disk Strengthening

The disjoint disk strengthening of the Gershgorin circle theorem, originally established by Sergei Gershgorin in his 1931 paper, provides a more precise localization of eigenvalues when the disks separate into disjoint groups. Specifically, suppose the n Gershgorin disks of an n \times n matrix A separate into k disjoint unions, where the j-th union consists of the disks corresponding to a subset S_j \subseteq \{1, \dots, n\} of row indices with |S_j| = m_j and \bigcup_{j=1}^k S_j = \{1, \dots, n\}. Then the union of the disks for each S_j contains precisely m_j eigenvalues (counting multiplicities) of A. This refinement enables exact multiplicity determination within each disjoint component, which is impossible in general without separation. A proof sketch relies on the continuity of eigenvalues with respect to the matrix entries. Consider a homotopy that scales the off-block entries (those connecting different S_j) by a parameter t \in [0,1], yielding a family of matrices A(t) with A(0) block diagonal (containing the m_j \times m_j principal submatrices along the diagonal blocks) and A(1) = A. The eigenvalues of A(t) vary continuously in t, and since the Gershgorin disks of A(1) are disjoint across blocks, no eigenvalue can cross between the block-localized regions during the deformation. Thus, each block's eigenvalues remain confined to its corresponding disk union. When disks overlap, this exact partitioning fails, preventing precise multiplicity assignment to subsets. For instance, the n \times n Jordan block with characteristic polynomial (z - \lambda)^n has all n eigenvalues at \lambda (with multiplicity n), but its Gershgorin disks overlap substantially, so the theorem cannot isolate individual eigenvalues or subgroups without additional structure.
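A numerical sketch of the counting statement (the matrix below, a 2×2 block weakly coupled to a third row, is an illustrative choice): the first two disks overlap each other but are disjoint from the third, so their union must contain exactly two eigenvalues and the third disk exactly one.

```python
import numpy as np

A = np.array([[ 1.00, 0.10, 0.05],
              [ 0.10, 1.20, 0.05],
              [ 0.02, 0.03, 10.0]])
centers = np.diag(A)
radii = np.abs(A).sum(1) - np.abs(centers)   # 0.15, 0.15, 0.05

# D1 and D2 cover [0.85, 1.35]; D3 covers [9.95, 10.05]: disjoint groups.
eigs = np.linalg.eigvals(A)
in_group12 = sum(np.abs(lam - centers[0]) <= radii[0] + 1e-12 or
                 np.abs(lam - centers[1]) <= radii[1] + 1e-12 for lam in eigs)
in_group3  = sum(np.abs(lam - centers[2]) <= radii[2] + 1e-12 for lam in eigs)
assert in_group12 == 2 and in_group3 == 1
```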

Taussky's Theorem

Taussky's refinement of the Gershgorin circle theorem, introduced in 1949, addresses the precise location of eigenvalues on the boundaries of the disks for irreducible matrices. For an irreducible complex matrix A, if an eigenvalue \lambda lies on the boundary of the union of the Gershgorin disks, then \lambda lies on the boundary of every Gershgorin disk. This result applies more broadly, including to real symmetric irreducible matrices, where the Gershgorin disks provide intervals on the real line [a_{ii} - r_i, a_{ii} + r_i] with r_i = \sum_{j \neq i} |a_{ij}|, and eigenvalues at the endpoints of these intervals (corresponding to the extreme values of the spectrum) must be simple or satisfy multiplicity conditions dictated by the structure of overlapping disks sharing that endpoint. For instance, in the case of a real symmetric irreducible matrix with positive off-diagonal entries, the largest eigenvalue (Perron root) is simple, ensuring no higher multiplicity at the rightmost endpoint unless multiple disks coincide exactly there. The proof for the general case employs the eigenvector with maximum modulus component, showing that boundary equality in one row propagates via irreducibility to all rows. For the symmetric case with positive off-diagonals, arguments analogous to those in the Perron-Frobenius theorem leverage the positive eigenvector: consider \lambda as an eigenvalue with normalized eigenvector x where \max_i |x_i| = 1. The Gershgorin theorem implies |\lambda - a_{ii}| \leq r_i for all i, with equality for some row k if \lambda is on the boundary of the k-th interval. This equality requires |x_j| = |x_k| for all j with a_{kj} \neq 0. Irreducibility ensures the associated graph is strongly connected, and positivity guarantees a positive eigenvector, propagating the equality condition across all rows via contradiction: if \lambda were interior to another disk, the strict inequality would violate the eigenvector's uniformity in magnitude.
This theorem strengthens the disjoint disks result by yielding more precise boundary placements, enabling tighter localization when eigenvalues touch the disk peripheries.

Examples

Basic Application

Consider the following 4×4 diagonally dominant matrix as a basic example to illustrate the application of the Gershgorin circle theorem: A = \begin{pmatrix} 3 & 1 & 0 & 0 \\ 1 & 4 & 1 & 0 \\ 0 & 1 & 4 & 1 \\ 0 & 0 & 1 & 3 \end{pmatrix} The Gershgorin disks for this matrix are determined by the diagonal entries as centers and the sums of the absolute values of the off-diagonal entries in each row as radii. For the first row, the center is 3 with radius |1| = 1, yielding the disk D_1 = \{ z \in \mathbb{C} : |z - 3| \leq 1 \}. For the second row, the center is 4 with radius |1| + |1| = 2, so D_2 = \{ z \in \mathbb{C} : |z - 4| \leq 2 \}. Similarly, the third row gives D_3 = \{ z \in \mathbb{C} : |z - 4| \leq 2 \}, and the fourth row gives D_4 = \{ z \in \mathbb{C} : |z - 3| \leq 1 \}. The union of these disks is the region in the complex plane covered by at least one D_i; since A is real symmetric, its eigenvalues are real and therefore confined to the intersection of this union with the real axis, the interval [2, 6], whose leftmost endpoint is \min(3-1, 4-2) = 2 and rightmost endpoint is \max(3+1, 4+2) = 6. By the Gershgorin circle theorem, all eigenvalues of A must lie within this union. The actual eigenvalues of A can be found by solving the characteristic equation \det(A - \lambda I) = 0, which yields the polynomial \lambda^4 - 14\lambda^3 + 70\lambda^2 - 148\lambda + 112 = 0. Factoring gives (\lambda - 2)(\lambda - 4)(\lambda^2 - 8\lambda + 14) = 0, so the eigenvalues are exactly 2, 4, and 4 \pm \sqrt{2} (approximately 2.586 and 5.414). All these values lie within the interval [2, 6], confirming the theorem's bound, with the lower bound achieved exactly at \lambda = 2. This example demonstrates the theorem's utility in providing a quick estimate for eigenvalue locations without full computation, particularly for diagonally dominant matrices where the disks are relatively tight.
However, the significant overlap among D_1, D_2, D_3, and D_4 results in a coarser upper bound (6 versus the actual maximum of about 5.414), highlighting how disk overlaps can reduce the precision of the localization.
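The computations above can be verified mechanically; this NumPy sketch reproduces the disks and the eigenvalues of the example matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0, 0.0, 0.0],
              [1.0, 4.0, 1.0, 0.0],
              [0.0, 1.0, 4.0, 1.0],
              [0.0, 0.0, 1.0, 3.0]])
centers = np.diag(A)
radii = np.abs(A).sum(1) - np.abs(centers)          # 1, 2, 2, 1

# Real-axis extent of the Gershgorin set: [2, 6].
assert (centers - radii).min() == 2.0 and (centers + radii).max() == 6.0

eigs = np.linalg.eigvalsh(A)                        # symmetric: real spectrum
expected = np.sort([2.0, 4.0, 4.0 - np.sqrt(2), 4.0 + np.sqrt(2)])
assert np.allclose(eigs, expected)
assert eigs[0] >= 2.0 - 1e-12 and eigs[-1] <= 6.0   # inside the bound
```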

Taussky Illustration

Consider the symmetric matrix A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{pmatrix}, which is irreducible because its associated directed graph is strongly connected. The Gershgorin disks for A are centered at the diagonal entries with radii equal to the sums of the absolute values of the off-diagonal entries in each row: the first and third disks are both |z - 2| \leq 1 (corresponding to the real interval [1, 3]), and the second disk is |z - 3| \leq 2 (corresponding to [1, 5]). The union of these disks is [1, 5], so all eigenvalues of A lie in this interval. The characteristic equation of A is (\lambda - 2)(\lambda - 1)(\lambda - 4) = 0, yielding eigenvalues \lambda = 1, 2, 4. The eigenvalue \lambda = 1 lies on the boundary of the union [1, 5]. In accordance with Taussky's theorem for irreducible matrices, this boundary eigenvalue also lies on the boundary of every Gershgorin disk: |1 - 2| = 1 for the first and third disks, and |1 - 3| = 2 for the second disk. The eigenvalue \lambda = 1 is simple (multiplicity one), consistent with Taussky's observation that boundary eigenvalues of irreducible matrices cannot have full multiplicity unless the matrix is a multiple of the identity matrix. To illustrate the necessity of irreducibility in Taussky's theorem, consider the reducible symmetric block-diagonal matrix B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 2 & 2 \end{pmatrix}. The Gershgorin disks are the degenerate disk \{1\} for the first row, and |z - 2| \leq 2 (interval [0, 4]) for the second and third rows. The union is [0, 4]. The eigenvalues of B are 1 (from the first block) and 0 and 4 (from the second block, whose characteristic polynomial is \lambda^2 - 4\lambda = \lambda(\lambda - 4) = 0). Here, \lambda = 1 lies on the boundary of its own degenerate disk but in the strict interior of the other two disks containing it (|1 - 2| = 1 < 2), violating the all-disks boundary condition that Taussky's theorem guarantees for irreducible matrices.
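This boundary behavior is easy to confirm numerically (a NumPy sketch of the two matrices above):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
assert np.allclose(np.linalg.eigvalsh(A), [1.0, 2.0, 4.0])

# lambda = 1 sits on the boundary of EVERY disk, as Taussky requires:
centers = np.diag(A)
radii = np.abs(A).sum(1) - np.abs(centers)           # 1, 2, 1
assert np.allclose(np.abs(1.0 - centers), radii)

# Reducible counterexample: lambda = 1 is strictly interior to the
# second and third disks of B.
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 2.0],
              [0.0, 2.0, 2.0]])
cB = np.diag(B); rB = np.abs(B).sum(1) - np.abs(cB)  # 0, 2, 2
assert np.abs(1.0 - cB[1]) < rB[1]                   # 1 < 2: strict interior
```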

Applications

Numerical Preconditioning

In numerical analysis, the Gershgorin circle theorem plays a key role in preconditioning strategies for iterative solvers of linear systems Ax = b, particularly through row scaling techniques that enhance diagonal dominance. By applying a diagonal matrix D with positive entries, the matrix A can be equilibrated to D^{-1}AD, where the goal is to minimize the maximum row sum of the off-diagonal absolute values, thereby tightening the Gershgorin disks and improving the matrix's conditioning. Such scaling leverages the theorem's localization of eigenvalues near the diagonal, ensuring the scaled matrix is strictly diagonally dominant, a property that confines all eigenvalues to disks with radii less than the diagonal magnitudes. The preconditioning procedure typically involves selecting the diagonal entries of D to equalize row norms, often using the infinity norm, which directly applies to methods like Jacobi and Gauss-Seidel iterations for solving Ax = b. In the Jacobi method, the preconditioner is the diagonal part D of A, transforming the iteration matrix to I - D^{-1}A, while Gauss-Seidel uses the lower triangular part D - E. Convergence is guaranteed if the spectral radius \rho(I - D^{-1}A) < 1, a condition bounded by the Gershgorin theorem for diagonally dominant matrices, since the eigenvalues of D^{-1}A lie within disks centered at 1 with radii less than 1, implying those of the iteration matrix lie within the open unit disk. This approach is especially effective for sparse, non-symmetric systems where direct methods are prohibitive, and the theorem provides a posteriori verification of the preconditioner's quality without computing eigenvalues explicitly. Historically, this use of Gershgorin-based row scaling for preconditioning became a staple in 20th-century numerical analysis, appearing in foundational texts that analyzed iterative convergence for diagonally dominant matrices.
Early works established that such scaling not only accelerates Jacobi and Gauss-Seidel iterations but also extends to preconditioned methods, with the theorem serving as a theoretical basis for bounding error reduction rates.
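The convergence argument can be sketched numerically: for a strictly diagonally dominant A (the example matrix is an illustrative choice), the Gershgorin bound max_i r_i / |a_{ii}| dominates the spectral radius of the Jacobi iteration matrix I - D^{-1}A:

```python
import numpy as np

def jacobi_gershgorin_bound(A):
    """Gershgorin bound on rho(I - D^{-1}A): the iteration matrix has
    zero diagonal, so its disks are centered at 0 with radii
    sum_{j != i} |a_ij| / |a_ii|."""
    d = np.abs(np.diag(A))
    return ((np.abs(A).sum(1) - d) / d).max()

A = np.array([[5.0, 1.0, 1.0],
              [1.0, 6.0, 2.0],
              [0.5, 1.0, 4.0]])
bound = jacobi_gershgorin_bound(A)               # max(2/5, 3/6, 1.5/4) = 0.5
M = np.eye(3) - np.diag(1 / np.diag(A)) @ A      # Jacobi iteration matrix
rho = np.abs(np.linalg.eigvals(M)).max()

assert bound < 1                                 # strict diagonal dominance
assert rho <= bound + 1e-12                      # Gershgorin encloses rho
```

Since bound < 1 certifies \rho(I - D^{-1}A) < 1, the Jacobi iteration converges for this matrix without any eigenvalue computation.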

Eigenvalue Bounds

The Gershgorin circle theorem provides a computationally efficient method for obtaining rough bounds on the eigenvalues of large sparse matrices, where full eigendecomposition is prohibitive due to high dimensionality. In control theory, these bounds are particularly valuable for assessing system stability without solving the characteristic equation; for instance, if all Gershgorin disks lie in the open left half of the complex plane (Re(z) < 0), the matrix is Hurwitz stable, implying asymptotic stability of the linear system. This application is demonstrated in the analysis of coupled nonlinear oscillators, where the theorem confirms that eigenvalues of the Jacobian matrix have negative real parts, ensuring bounded trajectories and synchronization stability as validated by numerical simulations. In spectral graph theory, the theorem is applied to bound eigenvalues of the graph Laplacian, which encodes connectivity and degree properties relevant to diffusion processes and network analysis. For Laplacians of signed or unit gain graphs, Gershgorin disks yield upper bounds on the largest eigenvalue λ_n in terms of vertex degrees and gain magnitudes, such as λ_n(Φ) ≤ max_i {d_i + ∑_{j ∼ i} |c_j|/|c_i|}, where d_i is the degree of vertex i and the c_k are arbitrary positive scaling parameters; equality considerations hold under switching to unsigned graphs. These bounds facilitate quick assessments of spectral gaps, which influence convergence rates in consensus algorithms on graphs. For non-normal matrices, where eigenvalues alone do not capture sensitivity to perturbations, the theorem extends to bounding pseudospectra, the regions where eigenvalues of perturbed matrices can lie. Recent Gershgorin-type inclusion sets, derived via block decompositions of the matrix into perturbations of block-tridiagonal forms, enclose the ε-pseudospectrum within unions of disks or more refined regions like Γ_n^ε(A), offering sharper enclosures than classical disks for large Toeplitz matrices common in non-normal discretizations. Such bounds are crucial for understanding transient growth in non-normal dynamics, as in fluid flows.
Despite its simplicity, the theorem's bounds can be loose for matrices with overlapping disks or weak diagonal dominance, limiting precision in eigenvalue estimation. In the context of Markov chains, Gershgorin disks bound the eigenvalues of the transition matrix, constraining the spectral gap and thus the mixing time or relaxation rates; for continuous-time chains, the disks of the generator matrix are centered on the negative diagonal entries (the outflow rates) with radii given by the total transition rates, so the second-largest eigenvalue magnitude informs convergence to stationarity without detailed computation. These bounds complement localization results by providing algebraic enclosures that align with physical constraints like detailed balance in reversible chains.
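For a concrete Markov-chain sketch (the 3-state transition matrix is an illustrative choice), the disks of a row-stochastic matrix are |z - p_{ii}| ≤ 1 - p_{ii}; every such disk lies inside the closed unit disk, which recovers the bound |λ| ≤ 1 on the spectrum:

```python
import numpy as np

P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.3, 0.3, 0.4]])
assert np.allclose(P.sum(1), 1.0)                 # row-stochastic

centers = np.diag(P)
radii = 1.0 - centers                             # off-diagonal row sums
# Each disk |z - p_ii| <= 1 - p_ii lies in the closed unit disk:
assert np.all(np.abs(centers) + radii <= 1.0 + 1e-12)

eigs = np.linalg.eigvals(P)
assert np.all(np.abs(eigs) <= 1.0 + 1e-12)        # spectrum in the unit disk
```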
