
Triangular matrix

In linear algebra, a triangular matrix is a square matrix in which all entries either above or below the main diagonal are zero. There are two primary types: an upper triangular matrix, which has zeros in all positions below the main diagonal (i.e., a_{ij} = 0 for i > j), and a lower triangular matrix, which has zeros above the main diagonal (i.e., a_{ij} = 0 for i < j). Triangular matrices possess several key properties that simplify computations in linear algebra. The determinant of a triangular matrix is the product of its diagonal entries. A triangular matrix is invertible if and only if all its diagonal entries are nonzero, and its inverse is also triangular. The transpose of an upper triangular matrix is lower triangular, and vice versa. Products of triangular matrices of the same type (upper or lower) are also triangular. These matrices are fundamental in numerical linear algebra due to their role in efficient algorithms for solving systems of linear equations. For instance, triangular systems can be solved using forward or backward substitution in O(n^2) operations for an n \times n matrix, making them a building block for more complex methods. They appear prominently in matrix decompositions such as the LU decomposition, where a general matrix is expressed as the product of a lower triangular matrix and an upper triangular matrix, facilitating the solution of large-scale problems in engineering and scientific computing. Additionally, every square matrix over the complex numbers is unitarily similar to an upper triangular matrix via the Schur decomposition, which aids in eigenvalue computations.

Definition and Fundamentals

Upper and Lower Triangular Matrices

A triangular matrix is a square matrix in which either all entries above the main diagonal are zero or all entries below the main diagonal are zero. An upper triangular matrix has zeros strictly below the main diagonal, while a lower triangular matrix has zeros strictly above it. Diagonal matrices, with all off-diagonal entries zero, qualify as both upper and lower triangular. Formally, an n \times n matrix U is upper triangular if U_{ij} = 0 for all i > j, meaning entries below the diagonal vanish. Similarly, an n \times n matrix L is lower triangular if L_{ij} = 0 for all i < j, with entries above the diagonal being zero. The concept traces back to Gaussian elimination, a method that Carl Friedrich Gauss applied in 1809 to solve linear systems in celestial mechanics, though the technique has earlier origins; it transforms the coefficient matrix into upper triangular form through row operations. This process reveals a pattern of non-zero entries forming a triangle on and above the diagonal, from which the modern terminology derives. Consider a basic 2×2 upper triangular matrix: \begin{bmatrix} a & b \\ 0 & c \end{bmatrix}, where a, b, c are elements from the underlying field, such as the real numbers. A 2×2 lower triangular example is: \begin{bmatrix} a & 0 \\ d & e \end{bmatrix}. For a 3×3 upper triangular matrix, the form is: \begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix}, with zeros below the diagonal. Diagonal matrices represent a special case of triangular matrices, where both upper and lower off-diagonal entries are zero, simplifying many algebraic operations while retaining the triangular structure.
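These zero patterns are easy to verify numerically. The following sketch (assuming NumPy is available; the helper names are illustrative, not standard library functions) tests whether a matrix is upper or lower triangular by comparing it with its own triangular part.

import numpy as np

def is_upper_triangular(A):
    # Upper triangular: zeroing everything below the main diagonal changes nothing.
    return np.allclose(A, np.triu(A))

def is_lower_triangular(A):
    # Lower triangular: zeroing everything above the main diagonal changes nothing.
    return np.allclose(A, np.tril(A))

U = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])
print(is_upper_triangular(U))  # True
print(is_lower_triangular(U))  # False, since entries above the diagonal are nonzero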

Notation and Conventions

In mathematical literature, upper triangular matrices are commonly denoted by the letter U, often rendered in boldface as \mathbf{U} to distinguish them as matrices, while lower triangular matrices are denoted by L or \mathbf{L}. These notations facilitate clear representation in contexts such as factorizations and eigenvalue computations. The entries of an n \times n triangular matrix are indexed by row i and column j, where 1 \leq i, j \leq n. The main diagonal consists of the entries a_{ij} where i = j. The superdiagonal comprises the entries where j = i + 1 (directly above the main diagonal), and the subdiagonal comprises those where i = j + 1 (directly below the main diagonal). This indexing convention aligns with standard matrix notation and aids in identifying the zero patterns characteristic of triangular forms. The zero pattern of a triangular matrix is typically visualized using explicit matrix diagrams that highlight the positions of zeros. For an upper triangular matrix, all entries below the main diagonal are zero (a_{ij} = 0 for i > j), with potentially nonzero real or complex entries on the diagonal and above it: \mathbf{U} = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{pmatrix} Analogously, a lower triangular matrix has zeros above the main diagonal (a_{ij} = 0 for i < j), with potentially nonzero entries on and below the diagonal. These visualizations emphasize the structured sparsity compared to full matrices, where entries can be nonzero throughout. Triangular matrices are not symmetric in general: symmetry requires a_{ij} = a_{ji} for all i, j, which for a triangular matrix forces all off-diagonal entries to zero and reduces it to a diagonal matrix. This contrasts with full matrices, which may have nonzero entries in symmetric positions without such restrictions. Standard conventions assume triangular matrices are square of size n \times n over the field of real numbers \mathbb{R} or complex numbers \mathbb{C}, ensuring compatibility with common algebraic operations and spectral analysis.

Properties

Algebraic Properties

A triangular matrix possesses several key algebraic properties that distinguish it from general matrices. The transpose of an upper triangular matrix is lower triangular, and conversely, the transpose of a lower triangular matrix is upper triangular. This follows from the definition, since (U^T)_{ij} = u_{ji}, which swaps the zero patterns above and below the diagonal. The determinant of a triangular matrix T, whether upper or lower, is the product of its diagonal entries: \det(T) = \prod_{i=1}^n t_{ii}. This result holds because cofactor expansion along the first column (for an upper triangular matrix) or the first row (for a lower triangular one) meets a single nonzero entry, reducing the determinant recursively to the product of the diagonal entries; the claim follows by induction on the matrix size. The trace of a triangular matrix, defined as the sum of its diagonal entries \operatorname{tr}(T) = \sum_{i=1}^n t_{ii}, remains invariant under similarity transformations. For any invertible matrix S, \operatorname{tr}(S^{-1} T S) = \operatorname{tr}(T), a property that holds for all square matrices but is particularly relevant for triangular forms, where the trace equals the sum of the eigenvalues. A triangular matrix is invertible if and only if all its diagonal entries are nonzero. In this case, the inverse is also triangular of the same type (upper or lower), preserving the zero pattern off the diagonal. The rank of a triangular matrix is at least the number of its nonzero diagonal entries, since the corresponding rows (or columns) are linearly independent; it can exceed that count, however, as the strictly upper triangular matrix \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} shows, having rank 1 despite a zero diagonal.
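The determinant, inverse, and transpose properties above can be checked directly in a few lines; the following sketch assumes NumPy is available.

import numpy as np

T = np.array([[2.0, 1.0, 4.0],
              [0.0, 3.0, 5.0],
              [0.0, 0.0, 6.0]])

# The determinant equals the product of the diagonal entries.
print(np.isclose(np.linalg.det(T), np.prod(np.diag(T))))   # True

# The inverse of an upper triangular matrix is again upper triangular.
T_inv = np.linalg.inv(T)
print(np.allclose(T_inv, np.triu(T_inv)))                  # True

# The transpose of an upper triangular matrix is lower triangular.
print(np.allclose(T.T, np.tril(T.T)))                      # True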

Spectral Properties

A triangular matrix possesses particularly straightforward spectral properties due to its structure. For an n \times n triangular matrix T, whether upper or lower, the eigenvalues are precisely the diagonal entries t_{ii} for i = 1, \dots, n, each counted with algebraic multiplicity equal to the number of times it appears on the diagonal. This follows from the fact that the characteristic equation simplifies dramatically for triangular matrices: the determinant of \lambda I - T is the product of the diagonal terms (\lambda - t_{ii}). The characteristic polynomial of T is thus explicitly given by p_T(\lambda) = \det(\lambda I - T) = \prod_{i=1}^n (\lambda - t_{ii}), which directly encodes the eigenvalues and their multiplicities without requiring further computation. A direct consequence is that the trace of T, defined as the sum of its diagonal entries \operatorname{tr}(T) = \sum_{i=1}^n t_{ii}, equals the sum of the eigenvalues (with multiplicity), a property that holds for all square matrices but is immediately verifiable for triangular ones. The minimal polynomial always divides the characteristic polynomial, by the Cayley-Hamilton theorem, and thus takes the form of a product of factors (\lambda - t_{ii}) raised to powers at most the algebraic multiplicities. If all diagonal entries t_{ii} are distinct, the minimal polynomial equals the characteristic polynomial, and the matrix is diagonalizable. In relation to the Jordan canonical form, a triangular matrix with distinct diagonal entries is similar to a diagonal matrix (so its Jordan form is diagonal, a special case of upper triangular); in general, the Jordan form of a triangular matrix is upper triangular with the same diagonal eigenvalues, but the original matrix need not itself exhibit the block structure of Jordan blocks.
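A quick numerical check of these statements, assuming NumPy is available:

import numpy as np

T = np.array([[2.0, 7.0, 1.0],
              [0.0, 5.0, 3.0],
              [0.0, 0.0, 4.0]])

# The eigenvalues of a triangular matrix are exactly its diagonal entries.
eigs = np.sort(np.linalg.eigvals(T).real)
print(np.allclose(eigs, np.sort(np.diag(T))))   # True

# The trace equals the sum of the eigenvalues.
print(np.isclose(np.trace(T), eigs.sum()))      # True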

Operations and Computations

Multiplication and Powers

The product of two upper triangular matrices is upper triangular, and the product of two lower triangular matrices is lower triangular. However, the product of an upper triangular matrix and a lower triangular matrix is not necessarily triangular. The (i,j)-entry of the product C=AB of two n \times n upper triangular matrices A=(a_{ik}) and B=(b_{kj}) is zero if i > j, since the relevant summands vanish due to the zero entries below the diagonal in A and B. If i \leq j, then c_{ij} = \sum_{k=i}^{j} a_{ik} b_{kj}. Any positive integer power T^k of a triangular matrix T is triangular of the same type (upper or lower), as it can be obtained by repeated multiplication, which preserves the structure. The diagonal entries of T^k are the k-th powers of the diagonal entries of T, since off-diagonal contributions to the diagonal vanish in the expansion. A strictly triangular matrix S (upper or lower, with zeros on the main diagonal) is nilpotent, meaning S^n = 0 for an n \times n matrix S. This follows from the zero pattern: each successive power pushes the nonzero entries one position further from the diagonal, so all entries vanish after at most n multiplications. Triangular matrices do not commute under multiplication in general, though diagonal matrices (a special case) do.
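The closure of triangular matrices under multiplication and the nilpotency of strictly triangular matrices can both be illustrated numerically; the sketch below assumes NumPy is available.

import numpy as np

A = np.triu(np.random.rand(4, 4))
B = np.triu(np.random.rand(4, 4))

# The product of two upper triangular matrices is upper triangular.
C = A @ B
print(np.allclose(C, np.triu(C)))                        # True

# The diagonal of the product is the entrywise product of the diagonals.
print(np.allclose(np.diag(C), np.diag(A) * np.diag(B)))  # True

# A strictly upper triangular 4x4 matrix satisfies S^4 = 0.
S = np.triu(np.random.rand(4, 4), k=1)
print(np.allclose(np.linalg.matrix_power(S, 4), 0))      # True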

Solving Linear Systems

One of the primary advantages of triangular matrices in numerical linear algebra is their role in efficiently solving linear systems of the form Ax = b, where A is triangular and b is a known vector. Unlike general dense matrices, which require O(n^3) operations via methods like Gaussian elimination, triangular systems can be solved in O(n^2) time using specialized substitution algorithms that exploit the zero structure above or below the diagonal. These methods, known as forward and backward substitution, iteratively compute the components of the solution vector x without needing matrix inversion or factorization of the original system. For a lower triangular system Lx = b, where L is an n \times n lower triangular matrix with nonzero diagonal entries, forward substitution proceeds from the first equation to the last. The first component is x_1 = b_1 / l_{11}. For each subsequent i = 2, \dots, n, x_i = \frac{1}{l_{ii}} \left( b_i - \sum_{j=1}^{i-1} l_{ij} x_j \right). This formula isolates x_i after substituting previously computed values, ensuring each equation involves only known terms. A step-by-step implementation is as follows:
def forward_substitution(L, b):
    # Solve L x = b for a lower triangular L with nonzero diagonal entries.
    # Indices are 0-based here, whereas the formulas in the text are 1-based.
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = 0.0
        for j in range(i):              # only previously computed components enter the sum
            s += L[i][j] * x[j]
        x[i] = (b[i] - s) / L[i][i]
    return x
To illustrate, consider the 3×3 lower triangular system L = \begin{pmatrix} 2 & 0 & 0 \\ 3 & 4 & 0 \\ 1 & 5 & 6 \end{pmatrix}, \quad b = \begin{pmatrix} 8 \\ 20 \\ 33 \end{pmatrix}. First, x_1 = 8 / 2 = 4. Then, x_2 = (20 - 3 \cdot 4) / 4 = (20 - 12) / 4 = 2. Finally, x_3 = (33 - 1 \cdot 4 - 5 \cdot 2) / 6 = (33 - 4 - 10) / 6 = 19 / 6 \approx 3.1667. Thus, x \approx (4, 2, 3.1667)^T. For an upper triangular system Ux = b, where U is upper triangular with nonzero diagonal, backward substitution starts from the last equation and works upward. The last component is x_n = b_n / u_{nn}. For each i = n-1, \dots, 1, x_i = \frac{1}{u_{ii}} \left( b_i - \sum_{j=i+1}^n u_{ij} x_j \right). The algorithm mirrors forward substitution but iterates in reverse:
def backward_substitution(U, b):
    # Solve U x = b for an upper triangular U with nonzero diagonal entries.
    # Indices are 0-based here, whereas the formulas in the text are 1-based.
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):      # iterate from the last row upward
        s = 0.0
        for j in range(i + 1, n):       # components x[j] for j > i are already known
            s += U[i][j] * x[j]
        x[i] = (b[i] - s) / U[i][i]
    return x
Both algorithms require approximately n^2 / 2 multiplications and as many additions, achieving quadratic complexity compared to the cubic cost of factoring or inverting a general matrix. In the context of LU decomposition, these steps enable solving multiple right-hand sides efficiently after a single factorization. The algorithms are backward stable, meaning the computed solution is the exact solution to a slightly perturbed system, with the perturbation bounded by the machine precision times a modest function of n. For well-conditioned triangular matrices, the forward or backward error remains small. Pivoting is generally unnecessary during the substitution itself, particularly if the matrix is diagonally dominant, as this ensures the diagonal entries are sufficiently large relative to the off-diagonal elements, avoiding growth in rounding errors.
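As a quick check, the forward-substitution routine above reproduces the worked 3×3 example; the same system can also be handed to SciPy's solve_triangular (assuming SciPy is installed), which calls optimized LAPACK routines.

import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[2.0, 0.0, 0.0],
              [3.0, 4.0, 0.0],
              [1.0, 5.0, 6.0]])
b = np.array([8.0, 20.0, 33.0])

print(forward_substitution(L, b))           # [4.0, 2.0, 3.1666...]
print(solve_triangular(L, b, lower=True))   # same solution via LAPACK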

LU Decomposition and Numerical Efficiency

In numerical linear algebra, the LU decomposition expresses an invertible matrix A \in \mathbb{R}^{n \times n} as the product A = LU, where L is a lower triangular matrix with unit diagonal entries (unitriangular) and U is an upper triangular matrix; this factorization is obtained through Gaussian elimination without pivoting when possible. For improved numerical robustness, partial pivoting is employed, yielding a permutation matrix P such that PA = LU, where the elimination process records the multipliers in L and the resulting upper triangular form in U. The triangular factors L and U facilitate efficient forward and backward substitution to solve linear systems Ax = b, transforming the original O(n^3) direct solve into a two-stage process. The computational efficiency of LU decomposition stems from its asymptotic costs: the factorization requires approximately \frac{2}{3}n^3 floating-point operations (flops), while each subsequent triangular solve costs O(n^2) flops, making it particularly advantageous for systems with multiple right-hand sides, as the expensive factorization is performed only once. This structure extends naturally to matrices with banded form, where triangular matrices represent a special case with full bandwidth n, allowing tailored algorithms that exploit sparsity or band structure to reduce costs below the dense O(n^3) bound while preserving the triangular solve efficiency. Regarding numerical stability, Gaussian elimination with partial pivoting is backward stable in practice, meaning the computed factors satisfy PA + \Delta A = \tilde{L}\tilde{U} with a small relative perturbation \|\Delta A\| / \|A\| = O(\epsilon) (machine epsilon times a modest growth factor), provided the growth factor (the ratio of the largest element magnitude arising during elimination to the largest magnitude in the original matrix) is controlled, as analyzed by Wilkinson. In practice, this approach exhibits robust performance for most matrices, with the triangular factors enabling reliable solutions via substitution. LU decomposition finds key applications in iterative refinement, where residual-based corrections refine approximate solutions to higher accuracy, leveraging the fast triangular solves for repeated evaluations. It also serves, in incomplete form, as a preconditioner in iterative methods for large sparse systems, where the triangular factors approximate the inverse to accelerate convergence. For symmetric positive definite matrices, the related Cholesky decomposition A = LL^T (with L lower triangular) offers roughly twice the efficiency of LU decomposition, requiring about \frac{1}{3}n^3 flops for factorization due to the exploitation of symmetry, though LU remains more general for nonsymmetric cases.
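A minimal sketch of the two-stage workflow, assuming SciPy is available: factor once with partial pivoting, then reuse the triangular factors for several right-hand sides.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 3.0, 1.0],
              [6.0, 3.0, 2.0],
              [2.0, 5.0, 7.0]])
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([0.0, 1.0, 0.0])

lu, piv = lu_factor(A)         # one O(n^3) factorization PA = LU
x1 = lu_solve((lu, piv), b1)   # each solve reuses the factors at O(n^2) cost
x2 = lu_solve((lu, piv), b2)
print(np.allclose(A @ x1, b1), np.allclose(A @ x2, b2))   # True True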

Special Forms

Unitriangular Matrices

A unitriangular matrix is a square matrix that is triangular (either upper or lower) and has all diagonal entries equal to 1. The set of all n \times n upper unitriangular matrices over a field K, often denoted U_n(K) or UN(n,K), forms a subgroup of the general linear group GL(n,K) under matrix multiplication. The analogous set for lower unitriangular matrices is denoted L_n(K) or LN(n,K). The determinant of any unitriangular matrix is 1, since the determinant of a triangular matrix equals the product of its diagonal entries, all of which are 1 in this case. This property implies that unitriangular matrices are unimodular, preserving volume under the linear transformations they induce. Every unitriangular matrix is invertible, and its inverse is also unitriangular. The inverse can be computed via back substitution for upper unitriangular matrices or forward substitution for lower unitriangular ones, leveraging the triangular structure to solve the corresponding linear systems efficiently, with each solve costing O(n^2) operations. As a matrix group, the set of unitriangular matrices over \mathbb{R} or \mathbb{C} forms a connected nilpotent Lie group of nilpotency class n-1. Its Lie algebra consists of the strictly triangular matrices (those with zeros on the diagonal), and the matrix exponential establishes a diffeomorphism between this Lie algebra and the unitriangular group. For illustration, consider the 2 \times 2 upper unitriangular matrices over K, which take the form \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}, where a \in K. The product of two such matrices, \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & a+b \\ 0 & 1 \end{pmatrix}, remains upper unitriangular, demonstrating closure and the isomorphism of this group to the additive group (K,+).
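The determinant and inverse properties are easy to confirm numerically for a small unitriangular matrix; the sketch below assumes NumPy is available.

import numpy as np

# A 3x3 upper unitriangular matrix: ones on the diagonal, zeros below it.
M = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [0.0, 0.0, 1.0]])

print(np.isclose(np.linalg.det(M), 1.0))           # determinant is 1

M_inv = np.linalg.inv(M)
# The inverse is again upper unitriangular.
print(np.allclose(M_inv, np.triu(M_inv)))          # True
print(np.allclose(np.diag(M_inv), np.ones(3)))     # True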

Strictly Triangular Matrices

A strictly triangular matrix is defined as an upper or lower triangular matrix in which all entries on the main diagonal are zero. This structure implies that all eigenvalues of such a matrix are zero, as the eigenvalues of a triangular matrix coincide with its diagonal entries. Strictly triangular matrices exhibit nilpotency: for an n \times n strictly triangular matrix N, there exists a positive integer k \leq n such that N^k = 0, with the index of nilpotency being at most n. This property arises because repeated multiplication shifts the non-zero entries away from the diagonal, eventually leaving only zeros after at most n powers. Nilpotency ties into broader computations of matrix powers, where sufficiently high powers vanish entirely. The trace of a strictly triangular matrix is zero, being the sum of its zero diagonal entries. Similarly, its determinant is zero, as the determinant of a triangular matrix is the product of its diagonal entries, all of which are zero. The matrix exponential of a strictly triangular matrix N truncates to a finite sum due to nilpotency: \exp(N) = I + N + \frac{N^2}{2!} + \cdots + \frac{N^{n-1}}{(n-1)!}, yielding a unitriangular matrix (upper or lower triangular with ones on the diagonal). For example, consider the 3 \times 3 upper strictly triangular matrix N = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}. Here, N^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} and N^3 = 0, confirming nilpotency of index 3. The exponential is \exp(N) = I + N + \frac{N^2}{2} = \begin{pmatrix} 1 & 1 & 1/2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, a unitriangular matrix.
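The nilpotency index and the truncated exponential series in this example can be verified numerically; the sketch below assumes NumPy and SciPy are available.

import numpy as np
from scipy.linalg import expm

N = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])

print(np.allclose(np.linalg.matrix_power(N, 3), 0))        # nilpotent of index 3

# The exponential series terminates after the N^2 term.
exp_series = np.eye(3) + N + np.linalg.matrix_power(N, 2) / 2
print(np.allclose(exp_series, expm(N)))                    # True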

Atomic Triangular Matrices

An atomic triangular matrix is a lower triangular matrix with all diagonal entries equal to 1 and its non-zero off-diagonal entries confined to a single column below the main diagonal. This structure makes it the most basic form of a non-diagonal unit triangular matrix: all other subdiagonal entries are zero, and all superdiagonal entries are zero. Such matrices are also known as elementary lower triangular matrices or, in the case of a single non-zero entry, Frobenius matrices or Gauss transformation matrices, in contexts such as Gaussian elimination and joint diagonalization algorithms. The key property of atomic triangular matrices lies in their minimal non-diagonal nature, making them fundamental building blocks for more complex triangular structures. They are indecomposable in the sense that they cannot be expressed as non-trivial block triangular matrices beyond the full matrix itself, as the off-diagonal entries in the single column create an irreducible connection between the affected rows and columns without allowing further partitioning into zero off-diagonal blocks. This indecomposability allows them to serve as elementary factors in decompositions, where general unit lower triangular matrices can be constructed as products of these basic forms, for instance by successively introducing the off-diagonal entries of each column through repeated multiplication. Consider a 3×3 atomic triangular matrix whose non-zero off-diagonal entries lie in the first column; for illustration, take a single non-zero entry at position (2,1): \begin{pmatrix} 1 & 0 & 0 \\ b & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} Here, b \neq 0 is the sole subdiagonal non-zero entry, rendering the matrix lower triangular while introducing a single structural linkage between the first and second rows. Alternatively, with the non-zero entry of column 1 at position (3,1): \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ c & 0 & 1 \end{pmatrix} with c \neq 0, this configuration links the first and third rows directly, highlighting how the entries of the single column may be placed. These examples illustrate how atomic matrices embed minimal perturbations of the diagonal form. In the structural analysis of triangular matrices, atomic forms play a crucial role as the basis for analyzing reducibility, providing insight into how successive additions of off-diagonal connections lead to composite structures like block triangular matrices. By studying these indecomposable units, one can understand the overall reducibility of larger triangular systems through their factorization into such elementary components.
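The role of atomic matrices as building blocks can be sketched in a few lines; the helper atomic_lower below is a hypothetical construction (assuming NumPy), not a standard library function, and the product shows how column-by-column atomic factors assemble a general unit lower triangular matrix.

import numpy as np

def atomic_lower(n, col, entries):
    # Build an n x n unit lower triangular matrix whose only nonzero
    # subdiagonal entries sit in column `col` (0-based indices).
    # `entries` maps a row index below `col` to its value.
    M = np.eye(n)
    for row, value in entries.items():
        M[row, col] = value
    return M

A1 = atomic_lower(3, 0, {1: 3.0, 2: 1.0})   # nonzeros in column 0
A2 = atomic_lower(3, 1, {2: 5.0})           # nonzero in column 1

print(A1 @ A2)
# [[1. 0. 0.]
#  [3. 1. 0.]
#  [1. 5. 1.]]  -- a general unit lower triangular matrix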

Block Triangular Matrices

A block triangular matrix is a square matrix partitioned into submatrices (blocks) such that the blocks below (or above) the block diagonal are all zero, giving the nonzero blocks a triangular pattern. For an upper block triangular matrix with k diagonal blocks A_1, A_2, \dots, A_k of sizes n_1 \times n_1, n_2 \times n_2, \dots, n_k \times n_k where \sum n_i = n, the matrix takes the form \begin{pmatrix} A_1 & B_{12} & \cdots & B_{1k} \\ 0 & A_2 & \cdots & B_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_k \end{pmatrix}, where the B_{ij} (for i < j) are arbitrary n_i \times n_j blocks and the lower off-diagonal blocks are zero matrices of appropriate dimensions. The lower block triangular case is analogous, with zeros above the block diagonal. This structure extends the scalar triangular matrix by allowing the diagonal blocks to be arbitrary square matrices rather than scalars. The determinant of a block triangular matrix is the product of the determinants of its diagonal blocks. For the upper block triangular form above, \det(M) = \prod_{i=1}^k \det(A_i). This follows from the multiplicative property of determinants and the fact that expansion along the zero blocks reduces the computation to the block diagonal case. The eigenvalues of a block triangular matrix are the union of the eigenvalues of its diagonal blocks, counting algebraic multiplicities. If \lambda is an eigenvalue of some A_i with multiplicity m, then \lambda is an eigenvalue of the full matrix with at least that multiplicity, as the characteristic polynomial factors as \det(M - \lambda I) = \prod_{i=1}^k \det(A_i - \lambda I_{n_i}). A matrix is reducible if and only if it is permutation similar to a block triangular matrix with at least one nontrivial diagonal block, corresponding to the existence of a proper nontrivial invariant subspace spanned by standard basis vectors. Specifically, if \mathcal{U} is an invariant subspace under the linear transformation represented by the matrix, then there exists a basis in which the matrix takes block upper triangular form with the restriction to \mathcal{U} as the leading diagonal block. This equivalence links block triangular forms to the decomposition of the underlying vector space into invariant subspaces. For example, consider a 4 \times 4 upper block triangular matrix partitioned into 2 \times 2 blocks: M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}, where A = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}, B = \begin{pmatrix} 4 & 5 \\ 6 & 7 \end{pmatrix}, and C = \begin{pmatrix} 8 & 0 \\ 9 & 10 \end{pmatrix}. Here, \det(M) = \det(A) \det(C) = (1 \cdot 3) (8 \cdot 10) = 240, and the eigenvalues are \{1, 3\} \cup \{8, 10\} = \{1, 3, 8, 10\}. The lower block triangular version would place the nonzero off-diagonal block below the diagonal blocks A and C, with zeros above. Block triangular matrices with atomic triangular diagonal blocks (indecomposable over the scalars) form composites that preserve triangular properties in larger systems.
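The determinant and eigenvalue statements for the 4 \times 4 example can be confirmed directly; the sketch assumes NumPy is available.

import numpy as np

A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = np.array([[4.0, 5.0], [6.0, 7.0]])
C = np.array([[8.0, 0.0], [9.0, 10.0]])

# Assemble the upper block triangular matrix M = [[A, B], [0, C]].
M = np.block([[A, B], [np.zeros((2, 2)), C]])

print(np.isclose(np.linalg.det(M), np.linalg.det(A) * np.linalg.det(C)))   # True, both equal 240
print(np.sort(np.linalg.eigvals(M).real))                                  # [ 1.  3.  8. 10.]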

Triangularization

Schur Triangularization

Schur's theorem states that every square matrix A \in \mathbb{C}^{n \times n} is unitarily similar to an upper triangular matrix T, meaning there exists a unitary matrix U such that U^* A U = T, where the diagonal entries of T are the eigenvalues of A (counted with algebraic multiplicity). This decomposition, known as the Schur triangularization, provides a canonical triangular form that facilitates the study of matrix spectra and invariant subspaces, as the eigenvalues appear explicitly on the diagonal. A proof of Schur's theorem proceeds by induction on the matrix dimension n. For n = 1, the result is trivial. Assume it holds for dimension n-1; for an n \times n matrix A, let \lambda be an eigenvalue with corresponding unit eigenvector u_1. Extend \{u_1\} to an orthonormal basis \{u_1, \dots, u_n\}, and let U_1 = [u_1, \dots, u_n]. Then U_1^* A U_1 = \begin{pmatrix} \lambda & * \\ 0 & A_1 \end{pmatrix}, where A_1 is an (n-1) \times (n-1) matrix. By the induction hypothesis, A_1 is unitarily similar to an upper triangular matrix, completing the proof. The Schur decomposition is computed numerically using the QR algorithm, which iteratively applies QR factorizations with shifts to deflate eigenvalues from the matrix until it reaches upper triangular form. The overall cost of this process is O(n^3) floating-point operations, making it efficient for dense matrices of moderate size. Over the real numbers, for A \in \mathbb{R}^{n \times n}, the real Schur decomposition yields a quasi-triangular form in which T is block upper triangular, with 1×1 blocks for real eigenvalues and 2×2 blocks for complex conjugate eigenvalue pairs, so as to preserve real arithmetic. This real Schur form is also obtained via the QR algorithm adapted for real arithmetic. The Schur triangularization relates to the Jordan canonical form in that both represent the matrix in triangular form via similarity, but the Schur form uses a unitary similarity and is always achievable over \mathbb{C}, whereas the Jordan form may require a non-unitary basis and results in diagonal form only if A is diagonalizable; otherwise, it features Jordan blocks with ones on the superdiagonal.
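Numerically, the complex Schur form can be computed with SciPy's schur routine (a sketch assuming SciPy is available); the diagonal of the triangular factor then contains the eigenvalues.

import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])

# Complex Schur form: A = Z T Z^*, with T upper triangular and Z unitary.
T, Z = schur(A, output='complex')

print(np.allclose(T, np.triu(T)))             # T is upper triangular
print(np.allclose(Z @ T @ Z.conj().T, A))     # the factorization reconstructs A
print(np.sort_complex(np.diag(T)))            # eigenvalues of A on the diagonal of T
print(np.sort_complex(np.linalg.eigvals(A)))  # same values, possibly in a different order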

Simultaneous Triangularization

A set of matrices \{A_i\}_{i \in I} over a field k is said to be simultaneously triangularizable if there exists an invertible matrix P such that P^{-1} A_i P is upper triangular for every i \in I. Over an algebraically closed field, a finite family of commuting matrices is simultaneously triangularizable. More generally, over an algebraically closed field of characteristic zero, such a family is simultaneously triangularizable if and only if the (associative) algebra it generates is solvable when viewed as a Lie algebra under the commutator bracket. The Lie-Kolchin theorem provides a foundational result in this context: if G is a connected solvable algebraic group over an algebraically closed field of characteristic zero and \rho: G \to \mathrm{GL}(V) is a rational representation on a finite-dimensional vector space V, then there exists a basis of V in which every \rho(g) for g \in G is upper triangular. This implies simultaneous triangularization for the images of the representation. Examples include any set of simultaneously diagonalizable matrices, since diagonal matrices are triangular. Another case is the set of all polynomials in a single matrix A, as they commute and become triangular under the same change of basis that triangularizes A. In representation theory, simultaneous triangularization simplifies the study of joint eigenspaces and invariant subspaces for families of operators, facilitating the decomposition of representations into irreducible components.
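As a small illustration (assuming NumPy and SciPy are available), the unitary factor Z from the Schur decomposition of a matrix A triangularizes every polynomial in A simultaneously, since Z^* p(A) Z = p(T) is again upper triangular.

import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])
T, Z = schur(A, output='complex')

# p(A) = A^2 + 3A + I commutes with A and is triangularized by the same Z.
P = A @ A + 3 * A + np.eye(3)
T_P = Z.conj().T @ P @ Z
print(np.allclose(T_P, np.triu(T_P)))   # True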

Advanced Structures and Applications

Triangular Matrix Algebras

The set of all upper triangular n \times n matrices over a field F forms a subalgebra of the full matrix algebra M_n(F), as it is closed under addition, scalar multiplication, and matrix multiplication. This algebra, often denoted T_n(F), inherits the associative structure from M_n(F) and plays a key role in the study of solvable algebras and their representations. The dimension of T_n(F) as a vector space over F is \frac{n(n+1)}{2}, accounting for the n diagonal entries and the \frac{n(n-1)}{2} entries above the diagonal. The nilradical of this algebra, which is its largest nilpotent ideal, consists of the strictly upper triangular matrices; this ideal is nilpotent of index at most n, as any product of n of its elements is the zero matrix. As a Lie algebra equipped with the bracket defined by the commutator [A, B] = AB - BA, T_n(F) is solvable, meaning its derived series, formed by taking successive commutator subalgebras, terminates at the zero subalgebra after finitely many steps. The first term in this series is the subalgebra of strictly upper triangular matrices, and subsequent terms consist of matrices with increasingly restricted support above the diagonal, eventually reaching zero. The algebra T_n(F) admits faithful representations on spaces of flags in F^n. In particular, it acts by left multiplication on F^n, preserving the complete flag \{0\} = V_0 \subset V_1 \subset \cdots \subset V_n = F^n where each V_i is the span of the first i standard basis vectors; this action is faithful since the representation is the defining inclusion into \mathfrak{gl}(n, F).
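The statement that commutators of upper triangular matrices land in the nilradical, as well as the dimension count, can be checked in a few lines (assuming NumPy is available).

import numpy as np

n = 4
A = np.triu(np.random.rand(n, n))
B = np.triu(np.random.rand(n, n))

# The commutator of two upper triangular matrices is strictly upper triangular,
# i.e. it lies in the nilradical of T_n(F).
C = A @ B - B @ A
print(np.allclose(C, np.triu(C, k=1)))     # True: the diagonal of the commutator is zero

# Dimension of T_n(F) as a vector space: n(n+1)/2 free entries.
print(n * (n + 1) // 2)                    # 10 for n = 4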

Borel Subgroups and Lie Groups

In the context of algebraic groups, the standard Borel subgroup B of the general linear group GL(n, \mathbb{C}) consists of all invertible upper triangular n \times n matrices, forming a maximal solvable connected algebraic subgroup. This subgroup arises as the stabilizer of the standard complete flag in \mathbb{C}^n, where a complete flag is a nested chain of subspaces 0 = V_0 \subset V_1 \subset \cdots \subset V_n = \mathbb{C}^n with \dim V_i = i for each i. The unipotent radical of B, comprising the upper unitriangular matrices (those with 1's on the diagonal and zeros below), is a normal nilpotent subgroup whose derived series terminates, contributing to the solvability of B. The associated Lie algebra \mathfrak{b} of B is the subalgebra of \mathfrak{gl}(n, \mathbb{C}) consisting of all upper triangular matrices, which is also solvable. The commutator subalgebra [\mathfrak{b}, \mathfrak{b}] is precisely the nilradical, formed by the strictly upper triangular matrices. For the special linear group SL(3, \mathbb{C}), the standard Borel subgroup is the subgroup of upper triangular matrices with determinant 1, stabilizing the standard flag in \mathbb{C}^3; its Lie algebra consists of the trace-zero upper triangular matrices. A key application of Borel subgroups is the Bruhat decomposition, which expresses GL(n, \mathbb{C}) as a disjoint union \bigcup_{w \in W} B w B, where W is the Weyl group (isomorphic to the symmetric group S_n) and each w is represented by a monomial matrix. This decomposition plays a fundamental role in representation theory, where irreducible representations of GL(n, \mathbb{C}) are classified by highest weight vectors fixed (up to scalar) by a Borel subgroup, facilitating the construction of Verma modules and the study of weights. Geometrically, the quotient GL(n, \mathbb{C})/B is the flag variety, a projective manifold parametrizing complete flags, on which B acts by stabilizing points corresponding to chains of projective subspaces in \mathbb{P}(\mathbb{C}^n).