
Schur decomposition

The Schur decomposition, also known as Schur triangulation, is a matrix decomposition in linear algebra which asserts that every square complex matrix is unitarily similar to an upper triangular matrix. Specifically, for any n \times n complex matrix A, there exist a unitary matrix Q (satisfying Q^* Q = I, where Q^* denotes the conjugate transpose) and an upper triangular matrix T such that A = Q T Q^*, with the diagonal entries of T being the eigenvalues of A. This decomposition, named after the German mathematician Issai Schur (1875–1941), guarantees the existence of such a triangular form for any square matrix over the complex numbers, providing a canonical way to reveal the eigenvalues without requiring diagonalizability.

Key properties of the Schur decomposition include its preservation of eigenvalues and algebraic multiplicities through unitary similarity, which also preserves the 2-norm and conditioning of the original matrix. The decomposition is not unique: the eigenvalues on the diagonal of T can be arranged in any order, and for eigenvalues with multiplicity greater than one, the corresponding upper triangular block can be modified by a further unitary similarity while preserving triangularity. For real matrices, a real Schur decomposition exists, where A = Q T Q^T with Q orthogonal and T upper quasi-triangular, featuring 1×1 blocks for real eigenvalues and 2×2 blocks for complex conjugate eigenvalue pairs. When A is normal (satisfying A^* A = A A^*), the Schur form reduces to the spectral decomposition, with T diagonal.

The Schur decomposition plays a central role in numerical linear algebra, enabling efficient eigenvalue computations via algorithms like the QR iteration, which typically requires O(n^3) arithmetic operations. It facilitates the evaluation of matrix functions, the solution of Sylvester and Lyapunov equations through methods like the Bartels–Stewart algorithm, and the assessment of matrix normality by measuring the norm of the strictly upper triangular part of T. As a generalization of the spectral decomposition that applies to all square matrices, it underpins stability analysis in dynamical systems and serves as a building block for more advanced decompositions in numerical linear algebra and control theory.

Background and Statement

Historical Context

The Schur decomposition theorem originated in the work of Issai Schur, a German mathematician whose contributions spanned representation theory, number theory, and linear algebra. In 1909, Schur published the seminal result in his paper "Über die charakteristischen Wurzeln einer linearen Substitution mit einer Anwendung auf die Theorie der Integralgleichungen," appearing in Mathematische Annalen. There, he established that any square matrix over the complex numbers admits a unitary similarity to an upper triangular form, with the diagonal elements corresponding to the matrix's eigenvalues. This decomposition provided a fundamental tool for analyzing linear transformations while preserving the Euclidean norm through the unitary factor.

Schur's theorem built directly on foundational advances by his doctoral advisor, Ferdinand Georg Frobenius, who had explored matrix canonical forms in the late 19th century. Frobenius's 1878 paper "Über lineare Substitutionen und bilineare Formen" demonstrated that commuting families of matrices could be simultaneously triangularized via similarity transformations, laying groundwork for understanding joint spectral properties. Earlier influences included Arthur Cayley's 1858 Cayley–Hamilton theorem, which linked matrix powers to the characteristic polynomial and informed Schur's approach to placing eigenvalues on the diagonal. These developments occurred within the broader context of invariant theory, where figures such as Frobenius investigated algebraic invariants under linear group actions during the 1890s.

The theorem's emergence aligned with rapid progress in matrix and representation theory around 1900–1920, driven by efforts to classify finite group actions on vector spaces. Schur, motivated by problems in symmetric functions and character theory (areas central to his earlier work on group characters), sought a canonical form that facilitated computations of traces and other invariants. His 1909 result not only resolved key questions in linear algebra but also supported applications to integral equations, reflecting the era's growing interplay between finite-dimensional matrix theory and infinite-dimensional operator methods. This period marked a shift toward unitary transformations, which offer better numerical and theoretical stability than general similarities.

Theorem Statement

The Schur decomposition theorem concerns square matrices over the complex numbers. Consider an n \times n complex matrix A. A unitary matrix U satisfies U^* U = I_n, where U^* denotes the conjugate (Hermitian) transpose of U and I_n is the n \times n identity matrix; an upper triangular matrix T has all entries below the main diagonal equal to zero. The theorem asserts that for every such matrix A, there exist a unitary matrix U and an upper triangular matrix T such that A = U T U^*, where the diagonal entries of T are the eigenvalues of A, counted with algebraic multiplicity. To illustrate the form, consider a general 2 \times 2 complex matrix A. Its Schur decomposition is A = U \begin{pmatrix} \lambda_1 & t_{12} \\ 0 & \lambda_2 \end{pmatrix} U^*, where \lambda_1 and \lambda_2 are the eigenvalues of A and t_{12} is some complex scalar.
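As a concrete check of the statement, the sketch below applies SciPy's scipy.linalg.schur to a small, arbitrarily chosen matrix (eigenvalues 1 and 5) and verifies each claim of the theorem:

```python
import numpy as np
from scipy.linalg import schur

# Arbitrary example matrix; its eigenvalues are 1 and 5.
A = np.array([[2.0, 1.0],
              [3.0, 4.0]], dtype=complex)
T, U = schur(A, output='complex')   # A = U T U^*

assert np.allclose(U.conj().T @ U, np.eye(2))   # U is unitary
assert np.allclose(np.tril(T, -1), 0)           # T is upper triangular
assert np.allclose(A, U @ T @ U.conj().T)       # A = U T U^*
# The diagonal of T carries the eigenvalues (here 1 and 5, in some order)
assert np.allclose(np.sort(np.diag(T).real), [1.0, 5.0])
```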

Proof and Properties

Proof Outline

The proof of the Schur decomposition theorem proceeds by induction on the dimension n of the square complex matrix A \in \mathbb{C}^{n \times n}. For the base case n=1, any 1 \times 1 matrix is already upper triangular, as it consists of a single scalar entry, which is trivially the eigenvalue of A. Assume the statement holds for all matrices of dimension n-1, where n \geq 2. By the fundamental theorem of algebra, the characteristic polynomial \det(A - \lambda I) of degree n has at least one root \lambda \in \mathbb{C}, so A possesses an eigenvalue \lambda with a corresponding nonzero eigenvector v. Normalizing v yields a unit eigenvector with \|v\|_2 = 1. To construct the unitary matrix, extend the set \{v\} to an orthonormal basis \{v, u_2, \dots, u_n\} for \mathbb{C}^n using the Gram-Schmidt process applied to v and any basis for its orthogonal complement. Let Q \in \mathbb{C}^{n \times n} be the unitary matrix whose columns are these basis vectors, so the first column is v. Consider the unitary similarity transformation T = Q^* A Q. The first column of T is Q^* A v = Q^* (\lambda v) = \lambda e_1, where e_1 is the first standard basis vector. In particular, the submatrix in rows 2 through n of column 1 is Q_1^* A v = \lambda Q_1^* v = 0, where Q_1 denotes the last n-1 columns of Q. Thus, T takes the block upper triangular form T = \begin{pmatrix} \lambda & w^* \\ 0 & A' \end{pmatrix}, where w^* \in \mathbb{C}^{1 \times (n-1)} and A' = Q_1^* A Q_1 \in \mathbb{C}^{(n-1) \times (n-1)}. By the induction hypothesis, there exists a unitary matrix U' \in \mathbb{C}^{(n-1) \times (n-1)} such that U'^* A' U' = T', where T' is upper triangular. Define the block diagonal unitary matrix U'' = \operatorname{diag}(1, U'). Then the full transformation matrix U = Q U'' is unitary, and U^* A U = \begin{pmatrix} \lambda & w^* U' \\ 0 & T' \end{pmatrix}, which is upper triangular with \lambda and the diagonal entries of T' on its main diagonal.
A key lemma ensures that the eigenvalues of A' are precisely the eigenvalues of A excluding \lambda, as unitary similarity transformations preserve eigenvalues (including multiplicities). This completes the induction, establishing the existence of a unitary U and upper triangular T such that A = U T U^*, with the diagonal of T comprising the eigenvalues of A.
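The induction translates almost line for line into code. The sketch below is purely illustrative (not a practical algorithm): np.linalg.eig stands in for the eigenvector whose existence the fundamental theorem of algebra guarantees, and a QR factorization extends it to an orthonormal basis.

```python
import numpy as np

def schur_by_induction(A):
    """Schur decomposition following the inductive proof:
    deflate one eigenpair, then recurse on the (n-1) x (n-1) trailing block."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex), A.copy()
    # A unit eigenvector v for some eigenvalue of A.
    _, V = np.linalg.eig(A)
    v = V[:, 0] / np.linalg.norm(V[:, 0])
    # Extend {v} to an orthonormal basis: QR of the n x (n+1) matrix [v | I]
    # keeps the first column of Q proportional to v.
    Q, _ = np.linalg.qr(np.column_stack([v, np.eye(n)]))
    B = Q.conj().T @ A @ Q            # block form [[lambda, w*], [0, A']]
    U1, T1 = schur_by_induction(B[1:, 1:])
    U2 = np.eye(n, dtype=complex)
    U2[1:, 1:] = U1                   # U'' = diag(1, U')
    U = Q @ U2
    T = np.triu(U.conj().T @ A @ U)   # discard rounding noise below diagonal
    return U, T

A = np.array([[0.0, 1.0], [-2.0, 3.0]])   # eigenvalues 1 and 2
U, T = schur_by_induction(A)
assert np.allclose(U.conj().T @ U, np.eye(2))
assert np.allclose(A, U @ T @ U.conj().T)
assert np.allclose(np.sort(np.diag(T).real), [1.0, 2.0])
```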

Uniqueness and Properties

The Schur decomposition of a square matrix A \in \mathbb{C}^{n \times n} is not unique, as there are multiple choices for the unitary matrix U and the upper triangular matrix T that satisfy A = U T U^H. The columns of U, known as Schur vectors, are not uniquely determined except for the first column corresponding to a specific eigenvalue; subsequent vectors depend on the selection process, leading to different possible T matrices even for the same eigenvalue ordering. However, the decomposition is unique up to the ordering of the eigenvalues on the diagonal of T and a unitary change of basis within blocks corresponding to repeated eigenvalues. The diagonal entries of T are precisely the eigenvalues of A, which can appear in any order, but their multiset is invariant. The strictly upper triangular elements of T, which capture interactions between eigenvalues, are not unique and can vary depending on the choice of Schur vectors.

Key properties of the Schur decomposition include the preservation of the spectrum of A, as the eigenvalues lie on the diagonal of T. If A is normal (satisfying A^H A = A A^H), then T is diagonal, reducing the decomposition to a unitary diagonalization in which the Schur vectors are eigenvectors. In contrast to the Jordan canonical form, which provides a block-diagonal structure revealing the full algebraic and geometric multiplicities via Jordan blocks, the Schur form achieves a weaker upper triangularization that does not expose the minimal polynomial or the generalized eigenspaces as explicitly, though it offers numerical advantages due to the unitary transformation. For practical purposes, eigenvalues are often ordered on the diagonal of T by decreasing magnitude or by argument to enhance numerical stability in subsequent computations, such as deflation or reordering.
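The non-uniqueness of the eigenvalue ordering can be observed directly with SciPy, whose sort option to scipy.linalg.schur moves selected eigenvalues to the leading diagonal positions (the example matrix, with eigenvalues -1 and 3, is arbitrary):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[-1.0, 2.0],
              [ 0.0, 3.0]], dtype=complex)   # eigenvalues -1 and 3

T1, U1 = schur(A, output='complex')
# Requesting right-half-plane eigenvalues first reorders the diagonal
T2, U2, sdim = schur(A, output='complex', sort='rhp')

# Both are valid Schur decompositions of the same matrix
assert np.allclose(A, U1 @ T1 @ U1.conj().T)
assert np.allclose(A, U2 @ T2 @ U2.conj().T)
assert np.isclose(T2[0, 0], 3.0)   # the selected eigenvalue leads
# Same spectrum as a multiset, regardless of ordering
assert np.allclose(np.sort(np.diag(T1).real), np.sort(np.diag(T2).real))
```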

Numerical Computation

Algorithms for Computation

The computation of the Schur decomposition typically begins with a reduction of the general matrix to upper Hessenberg form using Householder reflections, which requires approximately \frac{10}{3}n^3 floating-point operations for an n \times n matrix and preserves the eigenvalues. This step is followed by iterative QR steps to triangularize the Hessenberg matrix, where each iteration involves a QR factorization of the current matrix followed by a reverse multiplication to form a similarity transformation, gradually driving subdiagonal elements toward zero. The basic unshifted QR algorithm converges slowly, with an overall complexity of O(n^4) for dense matrices, but shifts accelerate convergence to O(n^3). The implicit Francis QR step, an implicit variant of the QR iteration, enhances efficiency by avoiding explicit computation of the orthogonal factor Q through bulge-chasing techniques. In this method, a double-shift strategy is employed for real matrices to handle complex conjugate eigenvalue pairs without introducing complex arithmetic: shifts are based on the eigenvalues of the trailing 2×2 submatrix, and a bulge is introduced and chased down the subdiagonal using Householder reflectors or Givens rotations. The QR iteration phase using double-shift Francis steps requires in total about \frac{20}{3}n^3 operations when computing only the Schur form (in the real case), enabling the full decomposition in O(n^3) time.

Practical implementations are provided in numerical libraries such as LAPACK, where the driver routine xGEES computes the Schur decomposition of a general matrix A, returning an orthogonal/unitary matrix Z and a (quasi-)upper triangular matrix T such that A = Z T Z^T (real case) or A = Z T Z^H (complex case), optionally ordering selected eigenvalues on the diagonal of T. The routine internally performs Hessenberg reduction via xGEHRD followed by QR iterations using xHSEQR, with parameters including the input matrix A, an optional eigenvalue selection function SELCTG, and outputs for the Schur form T, the Schur vectors in Z, and workspace for computations. Related expert routines such as xGEESX additionally provide condition estimates and balancing for improved conditioning. For illustration, the high-level steps of a basic QR iteration on a small Hessenberg matrix are as follows:
Input: Hessenberg matrix H ∈ ℂ^{n×n}, tolerance ε
Output: Approximate Schur form T ≈ upper triangular, unitary U such that H ≈ U T U*

1. Initialize U ← I_n, A ← H
2. While not converged (e.g., max |A_{i+1,i}| > ε for i=1 to n-1):
   a. Compute QR factorization: A = Q R (using Householder reflections or Givens rotations)
   b. Update: A ← R Q
   c. U ← U Q
3. Return T ← A, U
This pseudocode represents the unshifted process; in practice, shifts and the implicit Francis formulation are used for efficiency on larger matrices.
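A minimal runnable version of the unshifted iteration above, assuming a matrix whose eigenvalues have distinct moduli (the condition under which the basic iteration converges), might look like:

```python
import numpy as np

def unshifted_qr(H, tol=1e-12, max_iter=5000):
    """Unshifted QR iteration: factor A = QR, form RQ = Q* A Q, repeat.
    Converges (slowly) when the eigenvalues have distinct moduli."""
    A = np.asarray(H, dtype=complex).copy()
    n = A.shape[0]
    U = np.eye(n, dtype=complex)
    for _ in range(max_iter):
        if n == 1 or np.max(np.abs(np.diag(A, -1))) < tol:
            break
        Q, R = np.linalg.qr(A)
        A = R @ Q        # similarity transform: eigenvalues are preserved
        U = U @ Q        # accumulate the unitary factor
    return A, U          # A approximates the Schur form T

H = np.array([[3.0, 1.0],
              [1.0, 2.0]])   # eigenvalues (5 ± sqrt 5)/2, distinct moduli
T, U = unshifted_qr(H)
assert abs(T[1, 0]) < 1e-10            # subdiagonal driven to zero
assert np.allclose(U @ T @ U.conj().T, H)
```

In practice the shifted, implicit Francis formulation converges far faster; this sketch only mirrors the pseudocode above.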

Numerical Stability

The QR algorithm for computing the Schur decomposition is backward stable, meaning that the computed Schur form T and unitary matrix Q satisfy (A + E) Q = Q T, where the perturbation E satisfies \|E\|_2 \leq c n u \|A\|_2 for some modest constant c, dimension n, and unit roundoff u. This bound ensures that the algorithm introduces only small relative perturbations on the order of machine epsilon times the matrix norm, preserving the essential numerical properties of the original matrix throughout the iterations. The sensitivity of the computed Schur vectors and eigenvalues to perturbations in A depends critically on the separation of the eigenvalues; clustered eigenvalues lead to larger condition numbers, amplifying errors in the output. For instance, the condition number of a simple eigenvalue is \kappa(\lambda) = 1 / |y^H x|, where x and y are normalized right and left eigenvectors, while for invariant subspaces the angles satisfy \|\sin \Theta\|_F \leq 4 \|E_{21}\|_F / \mathrm{sep}(T_{11}, T_{22}), highlighting increased vulnerability when the diagonal blocks are poorly separated. Rounding errors in floating-point arithmetic primarily affect the strictly upper triangular elements of the computed Schur form, with perturbations bounded by O(u \|A\|_2), though these can propagate if not mitigated. Balancing the matrix prior to applying the QR algorithm (via a diagonal similarity transformation) scales the rows and columns to reduce the norm and condition number, thereby minimizing the impact of these errors and improving accuracy for ill-conditioned problems. Post-2000 advancements in LAPACK have incorporated blocked, multishift algorithms into the QR iteration, processing submatrices in parallel to enhance performance on multicore and distributed systems while maintaining backward stability with error bounds analogous to the classical case, O(\varepsilon \|A\|). The triangular structure of the Schur form also gives it superior numerical stability compared to direct computation of the Jordan form, which requires identifying sensitive Jordan chains and is prone to severe ill-conditioning even for modest multiple eigenvalues.
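The backward-stability bound can be checked empirically. The sketch below measures the relative residual \|U T U^* - A\|_2 / \|A\|_2 for a random complex matrix; the tolerance 1e-12 is a generous multiple of n u for n = 50 in double precision:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))
T, U = schur(A, output='complex')

# Relative backward error: a modest multiple of unit roundoff
err = np.linalg.norm(U @ T @ U.conj().T - A, 2) / np.linalg.norm(A, 2)
assert err < 1e-12
# The computed U remains unitary to roundoff
assert np.linalg.norm(U.conj().T @ U - np.eye(50), 2) < 1e-12
```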

Applications

Eigenvalue and Eigenvector Computation

The Schur decomposition of a square matrix A \in \mathbb{C}^{n \times n} expresses it as A = Q T Q^H, where Q is unitary and T is upper triangular, enabling direct extraction of eigenvalues from the diagonal entries of T. These diagonal elements \lambda_1, \lambda_2, \dots, \lambda_n are precisely the eigenvalues of A, listed in the order determined by the decomposition algorithm, with multiplicities preserved. This approach ensures all eigenvalues are computed simultaneously and reliably, without requiring prior knowledge of their locations. For eigenvector computation, the columns of Q, known as Schur vectors, provide an orthonormal basis for a nested family of invariant subspaces: for each k, the first k Schur vectors span an invariant subspace of A. When A is diagonalizable (e.g., all eigenvalues distinct or of sufficient geometric multiplicity), eigenvectors can be recovered from the Schur form by back-substitution in T followed by multiplication with Q. In defective cases, where the algebraic multiplicity exceeds the geometric multiplicity, the relevant Schur vectors span the generalized eigenspace; the true eigenvectors form a basis within this subspace, while additional Schur vectors serve as generalized eigenvectors satisfying chains like (A - \lambda I) v_{k} = v_{k-1}. This structure allows reconstruction of the full eigensystem by solving within the deflated blocks of T. Deflation exploits negligible subdiagonal entries that emerge during the QR iteration: when a subdiagonal entry of the intermediate Hessenberg matrix becomes effectively zero, the problem decouples into two smaller subproblems, so the eigenvalues of the converged block can be isolated while iteration continues on the remainder. This process, performed incrementally during computation, enhances efficiency by avoiding redundant iterations on converged portions.
Compared to iterative methods like the power method, which converge to dominant eigenvalues via repeated matrix-vector multiplications and may require separate runs for each eigenvalue, the Schur decomposition computes the entire spectrum of a dense matrix in a single process, with guaranteed backward stability under unitary transformations. The power method excels for extreme eigenvalues in sparse settings but struggles with interior or clustered eigenvalues, whereas Schur-based algorithms (e.g., QR iteration) handle dense cases robustly, typically in O(n^3) time. Consider the non-diagonalizable 2×2 matrix A = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}, which has a repeated eigenvalue \lambda = 3 with algebraic multiplicity 2 but geometric multiplicity 1. Its Schur decomposition is A = Q T Q^H with T = A (already upper triangular) and Q = I, so the diagonal yields \lambda = 3 twice. The first column of Q is the eigenvector v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, while the second Schur vector v_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} is a generalized eigenvector satisfying (A - 3I) v_2 = v_1. Deflation cannot split this block further, reflecting the single Jordan block spanning \mathbb{C}^2.
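The back-substitution idea can be sketched as follows: for a simple eigenvalue T_{kk}, solving the leading triangular system yields a null vector of T - T_{kk} I, which Q maps to an eigenvector of A. The 3×3 matrix and the helper eigvec_from_schur are illustrative assumptions (valid only for distinct eigenvalues):

```python
import numpy as np
from scipy.linalg import schur, solve_triangular

A = np.array([[4.0, 1.0, 2.0],
              [0.0, 2.0, 3.0],
              [1.0, 0.0, 3.0]])       # three distinct eigenvalues
T, Q = schur(A.astype(complex), output='complex')

def eigvec_from_schur(T, Q, k):
    """Eigenvector of A = Q T Q^H for the simple eigenvalue T[k, k]."""
    lam = T[k, k]
    y = np.zeros(T.shape[0], dtype=complex)
    y[k] = 1.0
    if k > 0:
        # Solve (T[:k, :k] - lam I) y[:k] = -T[:k, k] by back-substitution
        y[:k] = solve_triangular(T[:k, :k] - lam * np.eye(k), -T[:k, k])
    x = Q @ y                          # map back to the original basis
    return x / np.linalg.norm(x)

for k in range(3):
    x = eigvec_from_schur(T, Q, k)
    assert np.allclose(A @ x, T[k, k] * x, atol=1e-8)   # A x = lambda x
```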

Matrix Exponentiation and Functions

The Schur decomposition A = Q T Q^*, where Q is unitary and T is upper triangular, enables the evaluation of a matrix function f(A) via the relation f(A) = Q f(T) Q^*. The triangular structure of T simplifies the evaluation of f(T), as its entries can be determined recursively using the Parlett recurrence, or by solving small Sylvester equations for off-diagonal blocks in the Schur-Parlett algorithm. For the matrix exponential \exp(A), the Schur form allows approximation of \exp(T) through truncated Taylor series or Padé approximants applied directly to the triangular factor, with the scaling-and-squaring technique reducing computational cost while maintaining accuracy; a Padé approximant of type [q/j] costs approximately 2(q + j + 1/3) n^3 operations, where n is the matrix dimension. A key advantage of this method is that it circumvents the need for complete diagonalization, which fails for non-diagonalizable matrices; instead, the (quasi-)triangular form of T (with 2×2 blocks for complex conjugate eigenvalue pairs in the real case) accommodates defective eigenspaces while preserving numerical stability. For the matrix logarithm \log(A), particularly when A is Hermitian positive definite (yielding a diagonal T), the computation simplifies to \log(A) = Q \log(D) Q^*, where D is the diagonal of T and logarithms are taken element-wise; in general cases, inverse scaling and squaring or the Denman-Beavers iteration is applied to T. In control theory, the Schur decomposition supports stability analysis by facilitating solutions to Lyapunov equations A X + X A^* + W = 0, where the Bartels-Stewart algorithm transforms the equation into a block-triangular system solvable via back-substitution in O(n^3) operations, enabling efficient assessment of controllability and quadratic performance metrics. As an illustrative example, consider the 2×2 Jordan block J = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}, which coincides with its Schur form T = J. Writing J = \lambda I + N with the nilpotent shift N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, so that \exp(N) = I + N, gives \exp(J) = e^{\lambda} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
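The worked example can be verified numerically against scipy.linalg.expm (the value \lambda = 0.5 is arbitrary):

```python
import numpy as np
from scipy.linalg import expm

lam = 0.5
J = np.array([[lam, 1.0],
              [0.0, lam]])
# exp(J) = e^lam (I + N), since J = lam I + N with N^2 = 0
expected = np.exp(lam) * np.array([[1.0, 1.0],
                                   [0.0, 1.0]])
assert np.allclose(expm(J), expected)
```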

Generalizations

Real Schur Form

For real matrices, the Schur decomposition is adapted to maintain real arithmetic throughout the process, resulting in the real Schur form. Specifically, any real matrix A \in \mathbb{R}^{n \times n} can be decomposed as A = Q T Q^T, where Q is a real orthogonal matrix (Q^T Q = I) and T is a real block upper triangular matrix, known as quasi-triangular. The diagonal blocks of T are either 1×1 (corresponding to real eigenvalues) or 2×2 (corresponding to pairs of complex conjugate eigenvalues). The construction of the real Schur form handles complex conjugate eigenvalue pairs by incorporating 2×2 blocks on the diagonal of T. In standardized form, these blocks are \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}, where \alpha, \beta \in \mathbb{R} and \beta \neq 0, so that the eigenvalues \alpha \pm i \beta are the roots of the characteristic polynomial of the block. The off-diagonal elements above these blocks maintain the block upper triangular structure, while the entire factorization preserves the eigenvalues of A in the diagonal blocks.

Key properties of the real Schur form include its preservation of the real nature of the input matrix, avoiding any complex numbers in the decomposition. The 1×1 blocks directly give real eigenvalues, and the 2×2 blocks encapsulate complex conjugate pairs without explicit complex entries. This form relates naturally to the complex Schur decomposition: viewing the real orthogonal Q as unitary over the complex numbers, each 2×2 block can be further triangularized by a complex unitary similarity. In numerical software, the real Schur form is preferred for real input matrices to avoid complex arithmetic, which simplifies implementation and reduces floating-point operations and memory. For instance, the QR algorithm with double shifts computes this form, producing real outputs even when complex eigenvalues are present.
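A small sketch with scipy.linalg.schur(..., output='real') shows the quasi-triangular form for a matrix with a complex conjugate pair (the example matrix, with eigenvalues 1 \pm 2i, is arbitrary):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[1.0, -2.0],
              [2.0,  1.0]])       # eigenvalues 1 ± 2i
T, Q = schur(A, output='real')    # real orthogonal Q, quasi-triangular T

assert np.allclose(Q.T @ Q, np.eye(2))   # Q orthogonal; all-real arithmetic
assert np.allclose(A, Q @ T @ Q.T)
# The complex conjugate pair forces a 2x2 block: nonzero subdiagonal entry
assert abs(T[1, 0]) > 1e-8
# The block still carries the eigenvalues 1 ± 2i
assert np.allclose(np.sort_complex(np.linalg.eigvals(T)), [1 - 2j, 1 + 2j])
```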

Generalized Schur Decomposition

The generalized Schur decomposition extends the classical Schur decomposition to the generalized eigenvalue problem for a pair of square matrices A, B \in \mathbb{C}^{n \times n}. It seeks unitary matrices Q and Z such that Q^* A Z = S, \quad Q^* B Z = T, where S and T are both upper triangular. Whenever T_{ii} \neq 0, the diagonal entries yield the generalized eigenvalues \lambda_i = S_{ii} / T_{ii}, which are the roots of the equation \det(A - \lambda B) = 0; pairs with T_{ii} = 0 correspond to infinite eigenvalues of the pencil. This decomposition is computed using the QZ algorithm, a double-sided orthogonal transformation method analogous to the QR algorithm for the standard eigenvalue problem. Introduced by Moler and Stewart in 1973, the QZ algorithm proceeds iteratively by applying Givens rotations and Householder reflections from both the left and the right to progressively triangularize the pencil (A, B), handling cases where B is singular through preliminary reductions. Convergence is accelerated by shift strategies analogous to those of the QR iteration, giving rapid convergence for non-defective pencils under suitable ordering of eigenvalues.

Regarding uniqueness, the generalized Schur form is unique up to permutations of the diagonal blocks and unitary similarities within blocks corresponding to clusters of equal generalized eigenvalues, mirroring the properties of the standard Schur decomposition. This non-uniqueness arises from the freedom in ordering the eigenvalues and selecting bases for degenerate eigenspaces, but the triangular structure and diagonal ratios remain invariant under such transformations. Applications of the generalized Schur decomposition include solving the generalized eigenvalue problem in control systems, where matrix pencils model descriptor systems of the form E \dot{x} = A x + B u with E possibly singular, facilitating pole placement and stability analysis. It also underpins computations of transmission zeros and residue derivatives in linear multivariable systems, providing a robust framework for analysis and design.
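A sketch with scipy.linalg.qz illustrates the factorization and the diagonal ratios; the matrices are random, so B is invertible with probability one and the ratios can be cross-checked against the eigenvalues of B^{-1} A:

```python
import numpy as np
from scipy.linalg import qz

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# A = Q S Z^H and B = Q T Z^H, with S, T upper triangular (complex output)
S, T, Q, Z = qz(A, B, output='complex')
assert np.allclose(Q @ S @ Z.conj().T, A)
assert np.allclose(Q @ T @ Z.conj().T, B)
assert np.allclose(np.tril(S, -1), 0) and np.allclose(np.tril(T, -1), 0)

# Generalized eigenvalues from diagonal ratios (B invertible here)
gev = np.diag(S) / np.diag(T)
ref = np.linalg.eigvals(np.linalg.solve(B, A))
# Each ratio matches one eigenvalue of B^{-1} A
assert all(np.min(np.abs(g - ref)) < 1e-8 for g in gev)
```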
