
Matrix decomposition

Matrix decomposition, also known as matrix factorization, refers to the process of factoring a matrix into a product of matrices that are simpler or more structured, such as triangular, diagonal, or orthogonal matrices, to enable efficient computation, analysis, and solution of linear systems. This approach revolutionized numerical linear algebra in the mid-20th century; for example, the LU decomposition expresses a square matrix A as A = LU where L is lower triangular and U is upper triangular. Key types of matrix decompositions include LU decomposition, used for solving general linear systems and closely related to Gaussian elimination; Cholesky decomposition, which factors symmetric positive-definite matrices as A = LL^T with L lower triangular, offering computational efficiency in statistical applications; QR decomposition, representing A = QR with orthogonal Q and upper triangular R, essential for least-squares problems and orthogonalization; singular value decomposition (SVD), decomposing A = U \Sigma V^T where U and V are orthogonal and \Sigma is diagonal with non-negative singular values, crucial for principal component analysis, low-rank approximations, and pseudo-inverses; and eigendecomposition, for diagonalizable matrices as A = B \Lambda B^{-1} with eigenvalues on the diagonal of \Lambda, aiding in stability analysis and spectral problems. These decompositions generally scale as O(n^3) for an n \times n matrix but allow subsequent operations, like solving multiple systems, to run in O(n^2), enhancing efficiency in large-scale computations such as those arising in backpropagation or large-scale matrix factorization problems. Beyond solving linear equations and eigenvalue problems, matrix decompositions underpin robust numerical software libraries such as LAPACK and LINPACK, support updating techniques for dynamic data (e.g., rank-one modifications in O(n^2)), and reveal structural properties such as rank, conditioning, and angles between subspaces, with broad applications in physics, statistics, and engineering.

Fundamentals

Definition and Motivation

In linear algebra, matrix decomposition, also known as matrix factorization, refers to the process of expressing a given matrix A \in \mathbb{R}^{m \times n} as a product of two or more simpler matrices, such as A = BC, where B and C possess special structures like triangular, orthogonal, or diagonal forms. This factorization transforms the original matrix into a canonical form that exploits inherent properties for easier manipulation. The primary motivation for matrix decomposition arises from the need to simplify complex matrix operations in numerical computations. For instance, solving linear systems Ax = b, computing matrix inverses, and addressing eigenvalue problems become more tractable when the matrix is factored into components that allow for efficient algorithms like forward or backward substitution. These decompositions reduce computational cost, for example by amortizing the O(n^3) factorization so that each subsequent solve requires only O(n^2) operations, and enhance numerical stability by avoiding ill-conditioned intermediate steps, particularly for large or sparse matrices. Beyond computation, matrix decompositions reveal intrinsic properties of the matrix, such as its rank, condition number, and overall stability, which are crucial for understanding the behavior of linear systems. In applications, they enable data compression through low-rank approximations, where only the dominant components are retained to represent high-dimensional data efficiently, as seen in techniques like principal component analysis. Additionally, they play a key role in optimization problems by facilitating gradient computations and constraint handling in fields such as machine learning and statistics. For example, decompositions such as the QR decomposition or the singular value decomposition illustrate how structured factorizations support these tasks without requiring full matrix storage or inversion.
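As a minimal illustration of the factor-once, solve-many pattern described above, the following Python sketch assumes NumPy and SciPy are available; the matrix A and right-hand sides B are random placeholders rather than data from this article.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Factor once, then reuse: the O(n^3) factorization is amortized over
# many O(n^2) triangular solves.
rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n))      # assumed nonsingular test matrix
B = rng.standard_normal((n, 10))     # ten right-hand sides sharing the same A

lu_piv = lu_factor(A)                # one LU factorization with partial pivoting
X = lu_solve(lu_piv, B)              # inexpensive solves reusing the stored factors
assert np.allclose(A @ X, B)
```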

Historical Development

The foundations of matrix decomposition trace back to the 19th century, when mathematicians began exploring determinants and minors as tools for understanding linear systems and quadratic forms. Augustin-Louis Cauchy introduced key concepts in 1826, including the use of minors and adjoints in the context of quadratic forms, laying groundwork for notions of matrix rank through the analysis of non-vanishing minors. These ideas were advanced in the 1880s, particularly in 1888, when determinant-based criteria were applied to assess the solvability of linear systems, effectively formalizing early rank concepts that would underpin later decompositions. A pivotal early milestone in spectral decompositions occurred in 1873, when Eugenio Beltrami developed a diagonalization technique for symmetric matrices via orthogonal transformations, motivated by simplifying bilinear forms; this was independently obtained by Camille Jordan in 1874, establishing the singular value decomposition (SVD) in its basic form. James Joseph Sylvester contributed a similar result for real square matrices in 1889. Erhard Schmidt formalized the Gram-Schmidt orthogonalization process in 1907, building on Jørgen Pedersen Gram's earlier work from 1883, which provided a basis for QR decompositions by constructing orthogonal bases from linearly independent vectors. The early 20th century saw further progress in triangular decompositions. André-Louis Cholesky devised a method in 1910 for factoring positive definite matrices into lower triangular factors, originally for geodetic computations, though it was published posthumously in 1924. In 1936, Carl Eckart and Gale Young demonstrated the utility of the SVD for low-rank matrix approximation, showing that the best approximation in the Frobenius or spectral norm is obtained by truncating the SVD, which highlighted its practical value. World War II accelerated developments due to the need for solving large-scale linear systems in ballistics and other military applications, with early electronic computers like the ENIAC (1945) demanding efficient algorithms. Alan Turing formalized the general LU decomposition in 1948 as a stable factorization for Gaussian elimination, building on earlier work by Tadeusz Banachiewicz in 1938 and by John von Neumann and Herman Goldstine in 1947. The rise of digital computers in the 1950s spurred numerical advancements: Alston S. Householder introduced unitary reflection-based transformations in 1958 for QR decompositions, enabling stable orthogonalization without the instability of classical Gram-Schmidt. The 1960s marked a period of rapid numerical innovation, with Gene Golub and William Kahan developing an efficient bidiagonalization algorithm for computing the SVD in 1965, making it viable for practical applications on early computers. Concurrently, researchers addressed numerical stability for ill-conditioned matrices; partial pivoting in LU decompositions, analyzed and refined by James H. Wilkinson and others around 1965, became a standard variant to mitigate the growth of rounding errors during Gaussian elimination. These innovations, tied to the computational demands of scientific simulation and data analysis, solidified matrix decompositions as cornerstones of numerical linear algebra.

General Properties

Matrix decompositions exhibit varying conditions for uniqueness depending on the specific factorization and any imposed constraints. For instance, the LU decomposition of a nonsingular matrix (with unit diagonal in L) is unique if no row interchanges are required during Gaussian elimination, but partial pivoting generally introduces non-uniqueness due to permutation choices. Similarly, the orthogonal factor in the QR decomposition is unique when the upper triangular factor is required to have positive diagonal entries and the matrix has full column rank; otherwise it is determined only up to sign changes in the columns. The singular value decomposition (SVD), however, is unique only up to the ordering of the singular values and simultaneous sign or phase changes of the singular vectors, with the singular vectors determined up to such changes when the singular values are distinct. Existence theorems underpin the reliability of these decompositions across matrix classes. The LU decomposition exists for any nonsingular matrix when partial pivoting is employed, ensuring no zero pivots disrupt the process, though without pivoting it may fail when zero or very small pivots arise. In contrast, the SVD exists for every real or complex matrix, regardless of size, rank, or conditioning, providing a universal factorization into orthogonal and diagonal components. These guarantees make decompositions applicable to a broad range of problems, including the solution of linear systems where factorizations simplify subsequent computations. Most matrix decompositions for dense n \times n matrices incur a computational cost of O(n^3) operations, typically derived from variants of Gaussian elimination or orthogonal reduction. For example, the LU decomposition requires approximately 2n^3/3 floating-point operations, while the SVD demands around 4n^3 to 22n^3 depending on the algorithm, such as the Golub-Reinsch method. This cubic scaling reflects the inherent cost of transforming matrices into structured forms like triangular or diagonal. Stability and conditioning play crucial roles in the practical utility of decompositions, as they determine error propagation in finite-precision arithmetic. Decompositions like LU with partial pivoting and QR via Householder reflections are backward stable, meaning computed factors solve a slightly perturbed version of the original problem with small relative errors bounded by machine epsilon times the matrix norm. The SVD is particularly robust, with perturbations in singular values controlled by the gap between them. Conditioning is quantified by the condition number \kappa(A) = \|A\| \cdot \|A^{-1}\|, which for the 2-norm equals the ratio of the largest to smallest singular value; well-conditioned matrices have \kappa(A) \approx 1, minimizing sensitivity to input perturbations, while ill-conditioned ones amplify errors. Matrix decompositions fundamentally equate to processes of triangularization or diagonalization, enabling efficient manipulation of linear transformations. LU and QR decompositions achieve triangularization, converting a general matrix into lower/upper triangular factors through elimination or orthogonal transformation, which simplifies solving systems or computing determinants. Spectral decompositions like the SVD and eigendecomposition pursue diagonalization, factoring the matrix into orthogonal/unitary matrices and a diagonal matrix of singular values or eigenvalues, revealing intrinsic geometric properties such as rank and norms.
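As a small companion to the conditioning discussion, this Python sketch (assuming NumPy) computes the 2-norm condition number from the singular values of an arbitrary illustrative matrix and checks it against the library's own routine.

```python
import numpy as np

# The 2-norm condition number kappa(A) = sigma_max / sigma_min, computed
# from the singular values; the matrix is chosen only for illustration.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
s = np.linalg.svd(A, compute_uv=False)   # singular values, in descending order
kappa = s[0] / s[-1]
print(kappa)                              # about 14.93 for this matrix
assert np.allclose(kappa, np.linalg.cond(A, 2))
```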

Decompositions for Linear Systems

LU Decomposition

LU decomposition, also known as lower-upper factorization, expresses a square matrix A as the product of a lower triangular matrix L and an upper triangular matrix U, such that A = LU. Typically, L is a lower triangular matrix with ones on the diagonal, while U has arbitrary diagonal entries, ensuring all entries above the diagonal in L are zero and below the diagonal in U are zero. This factorization applies to nonsingular matrices A \in \mathbb{R}^{n \times n}. For numerical stability, especially when pivots are small or zero, partial pivoting is used, yielding PA = LU where P is a permutation matrix that reorders rows to select the largest pivot in each column. The modern LU factorization emerged from Alan Turing's 1948 analysis of rounding errors in matrix processes, where he formalized Gaussian elimination as this decomposition. The algorithm to compute the factorization relies on Gaussian elimination. Starting with the matrix A, elimination proceeds column by column: for each position k, the entries below the pivot in column k are zeroed by subtracting multiples of row k from subsequent rows, with the multipliers l_{ik} = a_{ik}^{(k)} / a_{kk}^{(k)} (for i > k) stored in the lower part of L. The resulting upper triangular form after all eliminations is U, and L is completed with ones on the diagonal and zeros above. With partial pivoting, rows are swapped before each elimination step to maximize the pivot magnitude. Solving Ax = b then involves forward substitution to solve Ly = Pb for y, followed by back substitution to solve Ux = y. The factorization requires O(n^3) operations, while each subsequent solve costs O(n^2). Applications of LU decomposition center on efficiently solving linear systems Ax = b, particularly when multiple right-hand sides b share the same A, as the costly factorization is performed once. It also facilitates matrix inversion by solving Ax = e_i for each standard basis vector e_i. The method demands that A be nonsingular and that no zero pivots occur during elimination without pivoting; partial pivoting mitigates this by guaranteeing nonzero pivots and limiting the growth of rounding errors. For symmetric positive definite matrices, LU decomposition connects to Cholesky decomposition, where A = LL^T with L lower triangular. Consider the 2×2 matrix A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}. In the first step of Gaussian elimination, the multiplier for the (2,1) entry is l_{21} = 3/1 = 3. Subtract 3 times row 1 from row 2 to get the updated row 2: [3-3\cdot1, 4-3\cdot2] = [0, -2]. Thus, L = \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix}, \quad U = \begin{pmatrix} 1 & 2 \\ 0 & -2 \end{pmatrix}, and A = LU. To solve Ax = b for b = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, first solve Ly = b by forward substitution: y_1 = 1, y_2 = (2 - 3\cdot1)/1 = -1. Then back-substitute Ux = y: x_2 = -1 / -2 = 0.5, x_1 = (1 - 2\cdot0.5)/1 = 0.
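The worked 2×2 example can be checked with a short Python sketch (assuming NumPy and SciPy); note that library routines pivot by default, so their computed factors may differ from the hand-derived L and U by a row permutation.

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([1.0, 2.0])

# Factors from the worked (no-pivoting) example
L = np.array([[1.0, 0.0],
              [3.0, 1.0]])
U = np.array([[1.0, 2.0],
              [0.0, -2.0]])
assert np.allclose(L @ U, A)

# Forward substitution Ly = b, then back substitution Ux = y
y = solve_triangular(L, b, lower=True)
x = solve_triangular(U, y, lower=False)
print(y, x)                 # [ 1. -1.]  [0.  0.5]

# Library routines use partial pivoting, so A = P @ Lp @ Up with a permutation P
P, Lp, Up = lu(A)
assert np.allclose(P @ Lp @ Up, A)
```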

Cholesky Decomposition

The Cholesky decomposition, also known as Cholesky factorization, expresses a symmetric positive-definite matrix A as the product A = LL^T, where L is a lower triangular matrix with positive diagonal entries. This factorization exploits the symmetry and positive definiteness of A to produce a unique L. For complex Hermitian positive-definite matrices, the formula generalizes to A = LL^H, where ^H denotes the conjugate transpose. The method was developed by French military engineer André-Louis Cholesky around 1907–1910 for geodetic computations, and published posthumously in 1924. The decomposition exists if and only if A is symmetric (or Hermitian) and positive definite, meaning all eigenvalues of A are positive. During computation, if the quantity under any square root becomes zero or negative, this indicates that A is not positive definite. The algorithm computes L row-by-row via a variant of Gaussian elimination that avoids pivoting, leveraging the symmetry to reduce operations. For the v-th row, the diagonal entry is l_{vv} = \sqrt{a_{vv} - \sum_{u=1}^{v-1} l_{vu}^2}, and for off-diagonal entries with t > v, l_{tv} = \frac{a_{tv} - \sum_{u=1}^{v-1} l_{tu} l_{vu}}{l_{vv}}. This process requires approximately n^3/3 floating-point operations for an n \times n matrix, about half the cost of LU decomposition for the same task. Cholesky decomposition offers key advantages over general methods like LU decomposition: it stores only the lower triangular part of L (halving memory usage) and performs fewer operations by exploiting symmetry, making it particularly efficient for large-scale computations on positive-definite matrices. It serves as a special case of LU decomposition for symmetric positive-definite matrices, where the upper triangular factor equals the transpose of the lower one. Applications include solving linear systems Ax = b efficiently via forward substitution to solve Ly = b followed by back substitution for L^T x = y, which is common in optimization problems involving quadratic forms. In statistics, it decomposes covariance matrices of multivariate normal distributions to generate correlated random samples by transforming independent standard normals through L. For example, consider the 2×2 matrix A = \begin{pmatrix} 4 & 2 \\ 2 & 5 \end{pmatrix}, which is symmetric positive definite. Its Cholesky factor is L = \begin{pmatrix} 2 & 0 \\ 1 & 2 \end{pmatrix}, since LL^T = \begin{pmatrix} 4 & 2 \\ 2 & 5 \end{pmatrix}.
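A brief Python sketch (assuming NumPy) verifies the 2×2 Cholesky factor above and illustrates the covariance-sampling application; the random seed and sample count are arbitrary choices.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 5.0]])
L = np.linalg.cholesky(A)      # lower triangular Cholesky factor
print(L)                       # [[2. 0.], [1. 2.]], matching the worked example
assert np.allclose(L @ L.T, A)

# Using L to draw samples with covariance approximately A
rng = np.random.default_rng(0)
z = rng.standard_normal((2, 10000))   # independent standard normal columns
samples = L @ z                        # correlated samples
print(np.cov(samples))                 # close to A for a large sample
```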

QR Decomposition

The QR decomposition, also known as QR factorization, expresses an m \times n matrix A (with m \geq n) as the product A = QR, where Q is an m \times m orthogonal matrix satisfying Q^T Q = I_m and R is an m \times n upper triangular matrix. In this full factorization, the bottom (m - n) \times n block of R consists of zeros. This decomposition exists for any real matrix A, regardless of rank. When A has full column rank (rank n), the reduced or thin QR decomposition takes the form A = \hat{Q} \hat{R}, where \hat{Q} is an m \times n matrix with orthonormal columns (\hat{Q}^T \hat{Q} = I_n) and \hat{R} is an n \times n upper triangular matrix; this reduced form is unique if the diagonal entries of \hat{R} are required to be positive. The economy-sized variant, often used for rectangular matrices to save computation and storage, corresponds to this reduced form by omitting the unnecessary columns of the full Q and rows of R. Several algorithms compute the QR decomposition, each with trade-offs in stability and efficiency. The classical Gram-Schmidt process orthogonalizes the columns of A sequentially but suffers from numerical instability due to subtractive cancellation in finite precision arithmetic. The modified Gram-Schmidt variant improves stability by performing projections in a different order. More robust methods include Householder reflections, which apply a sequence of orthogonal transformations to triangularize A while preserving its column space, achieving backward stability at an O(m n^2) cost for the thin form; this approach was introduced by Householder in 1958. Givens rotations offer an alternative, zeroing elements pairwise via 2-by-2 orthogonal rotations, which is useful for sparse or structured matrices but generally slower for dense cases. A primary application of QR decomposition is solving overdetermined least-squares problems \min_x \| Ax - b \|_2 for m > n. Substituting A = QR yields \| QRx - b \|_2 = \| Rx - Q^T b \|_2 since Q preserves norms (\| Q y \|_2 = \| y \|_2); the upper triangular system R x = Q^T b can then be solved efficiently by back-substitution, avoiding the ill-conditioning of the normal equations A^T A x = A^T b. For symmetric matrices, the QR decomposition facilitates eigenvalue computations via iterations of the form A_{k+1} = R_k Q_k, where A_k = Q_k R_k, preserving symmetry and leading to convergence toward triangular form without explicit eigenvector calculation in the basic QR algorithm. It also supports updating QR factorizations under rank-one modifications in adaptive least squares contexts. As a representative example, consider orthogonalizing the columns of the 3×2 matrix A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix} using the modified Gram-Schmidt process. Normalize the first column to obtain the first column of \hat{Q}: q_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}. Project the second column onto q_1 and normalize the orthogonal component: the projection coefficient is q_1^T a_2 = \frac{1}{\sqrt{2}}, so the residual is a_2 - (q_1^T a_2) q_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} - \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0.5 \\ -0.5 \\ 1 \end{pmatrix}, with norm \sqrt{1.5} = \sqrt{3/2}. Thus, q_2 = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}, yielding \hat{Q} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \\ 0 & \frac{2}{\sqrt{6}} \end{pmatrix}, \quad \hat{R} = \begin{pmatrix} \sqrt{2} & \frac{1}{\sqrt{2}} \\ 0 & \sqrt{\frac{3}{2}} \end{pmatrix}, verifying A = \hat{Q} \hat{R}.
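The following Python sketch (assuming NumPy) recovers the thin QR factors of the 3×2 example; because LAPACK may flip column signs, the factors are normalized so that R has a positive diagonal before comparing with the hand computation.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

Q, R = np.linalg.qr(A)        # thin/reduced QR: Q is 3x2 with orthonormal columns, R is 2x2
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(2))

# Normalize signs so R has a positive diagonal (the product Q @ R is unchanged)
signs = np.sign(np.diag(R))
Q, R = Q * signs, signs[:, None] * R
print(R)    # approx [[1.4142, 0.7071], [0, 1.2247]] = [[sqrt(2), 1/sqrt(2)], [0, sqrt(3/2)]]
```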

Spectral Decompositions

Eigendecomposition

Eigendecomposition provides a way to factorize a square matrix into a form that reveals its spectral properties, assuming the matrix is diagonalizable. For an n \times n diagonalizable matrix A over the complex numbers, the eigendecomposition is given by A = V \Lambda V^{-1}, where V is an invertible matrix whose columns are the right eigenvectors of A, and \Lambda is a diagonal matrix containing the eigenvalues \lambda_i of A along its main diagonal. This representation diagonalizes A, transforming it into a simpler form where operations become scalar multiplications on the eigenvalues. A matrix A is diagonalizable if and only if it possesses n linearly independent eigenvectors, which occurs when the algebraic multiplicity of each eigenvalue equals its geometric multiplicity. For real symmetric matrices, the spectral theorem guarantees diagonalizability over the reals, with all eigenvalues real and the eigenvectors forming an orthonormal basis, yielding A = Q \Lambda Q^T where Q is orthogonal. More generally, for Hermitian matrices over the complex numbers, the spectral theorem states that A = Q \Lambda Q^*, where Q is unitary and \Lambda has real eigenvalues. Computing the eigendecomposition typically involves iterative algorithms that approximate the eigenvalues and eigenvectors. The power iteration method, a simple iterative technique, converges to the dominant eigenvalue and its eigenvector by repeatedly applying the matrix to an initial vector and normalizing, effective when one eigenvalue dominates in magnitude. For more general cases, the QR algorithm, independently developed by John G. F. Francis in 1959–1961 and by Vera Kublanovskaya in 1961, iteratively performs QR decompositions followed by reverse-order multiplications to drive the matrix toward a form revealing its eigenvalues, often after a Hessenberg reduction for efficiency. This algorithm forms the basis of modern implementations in numerical libraries for dense matrices. Eigendecomposition finds applications in analyzing dynamical systems and stochastic processes. For matrix powers, if A = V \Lambda V^{-1}, then A^k = V \Lambda^k V^{-1}, simplifying computations of iterates like transition probabilities in Markov chains, where the dominant eigenvalue (typically 1) determines the stationary distribution. In systems of linear differential equations of the form \mathbf{x}' = A \mathbf{x}, the solution is \mathbf{x}(t) = V e^{\Lambda t} V^{-1} \mathbf{x}(0), decoupling the system into independent scalar equations along eigenvector directions. Matrices that are not diagonalizable require alternative forms like the Jordan canonical form for similar analyses. As an illustrative example, consider the 2×2 rotation matrix by angle \theta, R = \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}. The characteristic equation \det(R - \lambda I) = 0 yields eigenvalues \lambda = e^{\pm i \theta}, which are non-real unless \theta = 0 or \pi (modulo 2\pi), reflecting the lack of real eigenvectors for non-trivial rotations in the plane. The corresponding eigenvectors involve complex entries, such as \begin{pmatrix} 1 \\ -i \end{pmatrix} for \lambda = e^{i \theta}.
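A short Python sketch (assuming NumPy, with an arbitrary angle theta = pi/3 chosen for illustration) confirms the complex eigenvalues of the rotation matrix and uses the eigendecomposition to form a matrix power.

```python
import numpy as np

theta = np.pi / 3                 # arbitrary illustrative rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

w, V = np.linalg.eig(R)           # eigenvalues e^{+i theta}, e^{-i theta}
print(w)                          # approx 0.5 +/- 0.866j (order may vary)

# Matrix powers via the eigendecomposition: R^k = V diag(w^k) V^{-1}
k = 6
Rk = (V * w**k) @ np.linalg.inv(V)
assert np.allclose(Rk.real, np.linalg.matrix_power(R, k))   # rotation by 2*pi is the identity
```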

Schur Decomposition

The Schur decomposition, named after mathematician Issai Schur, who established its existence in 1909, provides a fundamental triangularization for square matrices over the complex numbers. For any complex A \in \mathbb{C}^{n \times n}, there exists a unitary matrix Q \in \mathbb{C}^{n \times n} (satisfying Q^* Q = I, where Q^* denotes the conjugate transpose) and an upper triangular matrix T \in \mathbb{C}^{n \times n} such that A = Q T Q^*. The diagonal entries of T are the eigenvalues of A, appearing in some order, which follows from the unitary similarity preserving the spectrum of the matrix. This decomposition triangularizes A via a unitary similarity, ensuring numerical stability in computations due to the unitarity of Q. For real matrices, a variant known as the real Schur form addresses the presence of complex conjugate eigenvalue pairs while maintaining real arithmetic. Any real A \in \mathbb{R}^{n \times n} admits an orthogonal matrix U \in \mathbb{R}^{n \times n} (satisfying U^T U = I) and a block upper triangular matrix S \in \mathbb{R}^{n \times n} such that A = U S U^T, where the diagonal blocks of S are either 1×1 (for real eigenvalues) or 2×2 (for complex conjugate pairs of eigenvalues). The 2×2 blocks have the form of real companion matrices corresponding to quadratic factors of the characteristic polynomial, ensuring all entries remain real. This form is particularly useful in practical implementations to avoid complex arithmetic. The Schur decomposition is typically computed using the QR algorithm with shifts, applied after an initial reduction to upper Hessenberg form to enhance efficiency. The QR iteration iteratively decomposes the matrix into an orthogonal factor and an upper triangular factor, then recombines them via similarity; shifts accelerate convergence toward the triangular form by targeting individual eigenvalues. This process yields the desired Q and T (or S) factors, with the Hessenberg reduction performed beforehand via Householder transformations. The algorithm's backward stability stems from the unitary transformations, making it a foundation for reliable eigenvalue solvers in numerical libraries. In applications, the Schur decomposition underpins eigenvalue algorithms by reducing the problem to triangularization, from which eigenvalues are directly read off the diagonal and eigenvectors can be derived via back-substitution. Its backward stability ensures accurate computation even for ill-conditioned matrices, supporting tasks like stability analysis in control theory and model reduction in simulations. For normal matrices, the decomposition simplifies to the eigendecomposition, where T becomes diagonal. The Schur decomposition is not unique; the eigenvalues on the diagonal can be ordered arbitrarily, and for repeated eigenvalues, the corresponding invariant subspaces allow multiple choices for the columns of Q. In the real case, the block structure for complex pairs is unique up to the 2×2 block's internal orthogonal freedom, but the overall partitioning depends on eigenvalue ordering. As an illustrative example, consider the 2×2 real matrix A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}. Its eigenvalues are \lambda_1 = \frac{5 + \sqrt{33}}{2} \approx 5.372 and \lambda_2 = \frac{5 - \sqrt{33}}{2} \approx -0.372. A corresponding Schur decomposition is A = Q T Q^*, where T = \begin{pmatrix} \lambda_1 & * \\ 0 & \lambda_2 \end{pmatrix} (with the off-diagonal entry determined by the transformation), and Q is unitary with columns as Schur vectors, confirming the triangular form with eigenvalues on the diagonal.
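The 2×2 example can be reproduced with a Python sketch (assuming NumPy and SciPy); the eigenvalue ordering on the diagonal of T is not fixed, so it may differ from the order quoted above.

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

T, Q = schur(A)                     # real Schur form: A = Q @ T @ Q.T
print(np.diag(T))                   # approx 5.372 and -0.372 (order may differ)
assert np.allclose(Q @ T @ Q.T, A)
assert np.allclose(Q.T @ Q, np.eye(2))   # Q is orthogonal
```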

Singular Value Decomposition

The singular value decomposition (SVD) provides a canonical factorization for any real matrix A \in \mathbb{R}^{m \times n} (with a similar form for complex matrices), given by A = U \Sigma V^T, where U \in \mathbb{R}^{m \times m} and V \in \mathbb{R}^{n \times n} are orthogonal matrices, and \Sigma \in \mathbb{R}^{m \times n} is a rectangular diagonal matrix containing the singular values \sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_{\min(m,n)} \geq 0 along its main diagonal, with all off-diagonal entries zero. This decomposition generalizes eigendecomposition to rectangular matrices by capturing their intrinsic geometric structure through orthogonal transformations and scaling. The singular values \sigma_i satisfy \sigma_i = \sqrt{\lambda_i}, where \lambda_i are the eigenvalues of A^T A (or equivalently A A^T), ordered decreasingly; the associated right singular vectors (columns of V) are the eigenvectors of A^T A, and the left singular vectors (columns of U) are the eigenvectors of A A^T. The rank of A equals the number of strictly positive singular values, providing a direct measure of the matrix's intrinsic dimensionality. Furthermore, the SVD enables optimal low-rank approximations: for any k \leq \min(m,n), the matrix A_k = \sum_{i=1}^k \sigma_i u_i v_i^T minimizes \|A - B\|_F over all rank-k matrices B, where \| \cdot \|_F denotes the Frobenius norm, as established by the Eckart–Young theorem. Computing the SVD typically involves the Golub–Kahan algorithm, which first applies Householder reflections to bidiagonalize A (reducing it to a form with nonzeros only on the main diagonal and the adjacent superdiagonal), followed by iterative QR decompositions on the bidiagonal matrix to deflate and compute the singular values and vectors accurately and efficiently. This method, with subsequent refinements like divide-and-conquer or Jacobi rotations, forms the basis for implementations in numerical libraries such as LAPACK. Key applications of SVD include solving underdetermined or overdetermined linear systems via the Moore–Penrose pseudoinverse A^+ = V \Sigma^+ U^T, where \Sigma^+ reciprocates the nonzero \sigma_i on its diagonal and sets zeros elsewhere, enabling stable least-squares solutions even for rank-deficient matrices. In data analysis, SVD underpins principal component analysis (PCA): for a centered data matrix X \in \mathbb{R}^{m \times n} with m observations and n variables, the SVD X = U \Sigma V^T identifies the principal components as the columns of V, with the proportion of variance explained by the i-th component given by \sigma_i^2 / \sum_j \sigma_j^2, facilitating dimensionality reduction and noise filtering. SVD also supports image compression by representing an image as a matrix and retaining only the k largest singular values for a rank-k approximation, which preserves essential visual features while significantly reducing file size, as demonstrated in techniques achieving high compression ratios with minimal perceptual loss. Variants of the SVD include the compact (or economy) form, which omits zero singular values for a rank-r matrix A (where r is the number of positive \sigma_i), yielding A = U_r \Sigma_r V_r^T with U_r \in \mathbb{R}^{m \times r}, \Sigma_r \in \mathbb{R}^{r \times r}, and V_r \in \mathbb{R}^{n \times r}, thus requiring less storage and computation. The truncated SVD extends this by selecting only the first k < r terms, explicitly for approximation purposes. As a concrete illustration, consider the 3×2 matrix A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}.
The SVD computation begins with A^T A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, whose eigenvalues are \lambda_1 = 3 and \lambda_2 = 1, yielding singular values \sigma_1 = \sqrt{3} \approx 1.732 and \sigma_2 = 1. The right singular vectors v_1, v_2 are the normalized eigenvectors of A^T A: for \lambda_1 = 3, an unnormalized eigenvector is \begin{pmatrix} 1 \\ 1 \end{pmatrix}, normalized to \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}; for \lambda_2 = 1, it is \begin{pmatrix} 1 \\ -1 \end{pmatrix}, normalized to \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. The corresponding left singular vectors are then u_i = \frac{1}{\sigma_i} A v_i for i=1,2, with a third orthogonal vector completing the full U if needed. This process highlights how SVD extracts the matrix's rank (here 2) and directional scalings.
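A compact Python sketch (assuming NumPy) carries out this computation and also illustrates the Eckart–Young rank-1 approximation and the pseudoinverse discussed above.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # thin/economy SVD
print(s)                                           # approx [1.732, 1.0] = [sqrt(3), 1]
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Best rank-1 approximation (Eckart-Young): keep only the largest singular value
A1 = s[0] * np.outer(U[:, 0], Vt[0])
print(np.linalg.norm(A - A1))                      # equals the discarded singular value, 1.0

# Moore-Penrose pseudoinverse assembled from the SVD factors
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T
assert np.allclose(A_pinv, np.linalg.pinv(A))
```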

Advanced and Specialized Decompositions

Polar Decomposition

In linear algebra, the polar decomposition provides a factorization of a square matrix A \in \mathbb{C}^{n \times n} (or \mathbb{R}^{n \times n}) into the product of a unitary matrix U (orthogonal for real matrices) and a positive semidefinite Hermitian matrix P, such that A = UP. This decomposition is unique when A is invertible, with P being the unique positive semidefinite square root of A^*A, where A^* denotes the conjugate transpose. The form generalizes the polar form of complex numbers, separating the "phase" (unitary part) from the "magnitude" (positive part), and exists for any square matrix, though uniqueness requires invertibility. The right polar decomposition is A = UP, while the left form is A = PU, where now U is unitary and P = \sqrt{AA^*} is positive semidefinite. Explicitly, P = \sqrt{A^*A} and U = A (A^*A)^{-1/2} for the right form when A is invertible; the left form follows analogously with roles reversed. For noninvertible matrices, the decomposition still holds, but P may be singular, and uniqueness is guaranteed only for the positive semidefinite factor among decompositions with the same range. This structure highlights geometric properties, such as P representing a pure scaling and U a rigid rotation or isometry. A practical algorithm computes the polar decomposition via the singular value decomposition (SVD) of A = V \Sigma W^*, where V and W are unitary and \Sigma is diagonal with nonnegative entries. For the right polar form, set U = V W^* and P = W \Sigma W^*; the left form uses P = V \Sigma V^* and U = V W^*. This method is numerically stable and leverages efficient SVD routines, making it suitable for computational implementations despite the cubic cost of SVD. For rectangular matrices A \in \mathbb{C}^{m \times n} with m \geq n, a generalized right polar decomposition A = UH exists, where U \in \mathbb{C}^{m \times n} has orthonormal columns and H \in \mathbb{C}^{n \times n} is positive semidefinite, again derivable from SVD. Applications of polar decomposition span several fields. In continuum mechanics, it decomposes the deformation gradient tensor F as F = RU, where R is the rotation tensor (orthogonal) and U is the right stretch tensor (positive definite symmetric), enabling separation of rigid body motion from pure strain for analysis of material deformation. In optics, particularly polarimetry, the decomposition of Mueller matrices extracts intrinsic polarization properties: a diattenuator (linear dichroism), a retarder (phase shift), and a depolarizer, providing quantitative interpretation of light-scattering media such as biological tissues. Another use is finding the closest unitary matrix to a given A, which is the unitary factor U in the decomposition, useful in optimization and approximation problems. For example, consider the real 2 \times 2 matrix A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. First, compute A^T A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, whose eigenvalues are \frac{3 \pm \sqrt{5}}{2}; its unique positive semidefinite square root is P = \sqrt{A^T A} = \frac{1}{\sqrt{5}} \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}. Then, U = A P^{-1} = \frac{1}{\sqrt{5}} \begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix}, which is orthogonal since U^T U = I, and indeed UP = A. This illustrates how A separates into rotation and stretch factors.
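The same example can be checked numerically with a Python sketch (assuming SciPy, whose polar routine computes the factors via the SVD as described above).

```python
import numpy as np
from scipy.linalg import polar

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

U, P = polar(A, side='right')       # right polar decomposition A = U @ P
print(U)   # approx [[ 0.894, 0.447], [-0.447, 0.894]] = (1/sqrt(5)) [[2, 1], [-1, 2]]
print(P)   # approx [[ 0.894, 0.447], [ 0.447, 1.342]] = (1/sqrt(5)) [[2, 1], [ 1, 3]]
assert np.allclose(U @ P, A)
assert np.allclose(U.T @ U, np.eye(2))   # U is orthogonal
```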

Jordan Canonical Form

The Jordan canonical form provides a canonical representation for square matrices over an algebraically closed field, such as the complex numbers, particularly useful for matrices that are not diagonalizable. Introduced by Camille Jordan in his 1870 treatise on substitutions and algebraic equations, it expresses any n \times n matrix A as similar to a unique (up to permutation of blocks) block-diagonal matrix J, via J = P^{-1} A P where P is invertible. Each block in J is a Jordan block J_k(\lambda), corresponding to an eigenvalue \lambda, defined as J_k(\lambda) = \lambda I_k + N_k, where I_k is the k \times k identity matrix and N_k is the nilpotent matrix with 1's on the superdiagonal and zeros elsewhere. J_k(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix} For an eigenvalue \lambda with algebraic multiplicity m (the multiplicity in the characteristic polynomial), the Jordan form decomposes the corresponding generalized eigenspace into a direct sum of cyclic subspaces, each associated with a Jordan block. The number of blocks equals the geometric multiplicity g = \dim \ker(A - \lambda I), which is at most m, and the sizes of the blocks (the lengths of the Jordan chains) are determined by the differences in dimensions of the kernels of (A - \lambda I)^j for j = 1, \dots, m. The largest block size equals the index of \lambda in the minimal polynomial of A, which divides the characteristic polynomial and annihilates A. This structure extends the eigendecomposition to defective matrices, where g < m, by incorporating generalized eigenvectors satisfying (A - \lambda I)^k v = 0 but not for smaller k. The existence of the Jordan canonical form requires the base field to be algebraically closed, ensuring all eigenvalues lie in the field; over the reals, a real Jordan form with 2×2 blocks for complex conjugate pairs is used instead. The form is unique up to the ordering of blocks, and the minimal polynomial's degree for each \lambda governs the largest block size for that eigenvalue. Computationally, one first finds the eigenvalues via the characteristic polynomial, then for each \lambda, computes the generalized eigenspace \ker((A - \lambda I)^m) and the Weyr characteristic (dimensions of \ker((A - \lambda I)^j)) to determine block sizes. Jordan chains are constructed by solving (A - \lambda I) v_{j} = v_{j-1} starting from eigenvectors, but this process is numerically unstable for large or ill-conditioned matrices due to sensitivity to perturbations in defective cases. Practical algorithms, such as those using Schur decomposition followed by rank-revealing decompositions, mitigate this but rarely compute the exact form; instead, they approximate the structure. Applications of the Jordan canonical form include simplifying the solution of linear systems of differential equations, where e^{At} is block-diagonalized, and analyzing nilpotency in powers of (A - \lambda I)^k. In control theory, it determines the controllability indices and minimal realizations of linear systems by revealing the chain lengths for uncontrollable or unobservable modes. For instance, consider the 3×3 matrix A = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 2 \\ 0 & 0 & 4 \end{pmatrix}, which has eigenvalues \lambda = 3 (algebraic multiplicity 2, geometric multiplicity 1) and \lambda = 4 (multiplicity 1).
Its Jordan form is J = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix}, achieved via similarity with P whose columns are a generalized eigenvector chain for \lambda=3 (satisfying (A - 3I)^2 v_2 = 0, (A - 3I) v_2 = v_1, (A - 3I) v_1 = 0) and an eigenvector for \lambda=4. This reveals the single Jordan block of size 2 for the defective eigenvalue.
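Because the numerical computation of the Jordan form is unstable, the following sketch uses exact symbolic arithmetic (assuming SymPy is available) to reproduce the 3×3 example; the block ordering returned by the library may differ from the one shown above.

```python
import sympy as sp

A = sp.Matrix([[3, 1, 0],
               [0, 3, 2],
               [0, 0, 4]])

# jordan_form returns (P, J) with A = P * J * P^{-1}
P, J = A.jordan_form()
print(J)    # a 2x2 Jordan block for eigenvalue 3 plus a 1x1 block for eigenvalue 4
assert sp.simplify(P * J * P.inv() - A) == sp.zeros(3, 3)
```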

Hessenberg and Tridiagonal Forms

The Hessenberg form of a square matrix is an intermediate structure that facilitates efficient computation of eigenvalues and eigenvectors by reducing the fill-in that occurs during iterative algorithms. An upper Hessenberg matrix H is defined such that h_{ij} = 0 for all i > j+1, meaning all entries below the first subdiagonal are zero. Any n \times n matrix A can be reduced to upper Hessenberg form via a unitary similarity A = Q H Q^*, where Q is unitary and H is upper Hessenberg. This decomposition preserves the eigenvalues of A because similarity transformations do not alter the spectrum. The transformation is reversible, allowing recovery of the original matrix from H and Q. The reduction to Hessenberg form is typically achieved using Householder reflections or Givens rotations, with Householder-based methods being preferred for their efficiency in zeroing multiple elements simultaneously. The algorithm proceeds column by column: for each column k = 1 to n-2, a Householder reflector is constructed from the subvector A(k+1:n, k) to zero the entries below the subdiagonal in that column, and this reflector is applied from both the left and right to maintain similarity. The overall computational cost is approximately \frac{10}{3} n^3 floating-point operations for an n \times n matrix, though the work per column is O(n^2), making it practical for large n. The process is backward stable, meaning the computed \tilde{H} satisfies \tilde{Q} \tilde{H} \tilde{Q}^* = A + \delta A with \|\delta A\| = O(\epsilon_{\text{mach}} \|A\|), where \epsilon_{\text{mach}} is machine epsilon. For symmetric matrices, the Hessenberg reduction simplifies to a tridiagonal form, where A = Q T Q^T and T is tridiagonal (symmetric with zeros outside the main diagonal and the adjacent sub- and superdiagonals). This is because the symmetry of A implies the resulting Hessenberg matrix is also symmetric, forcing the superdiagonal structure to match the subdiagonal. The same Householder-based algorithm applies, but the right multiplications preserve symmetry, and the cost drops to about \frac{4}{3} n^3 for the symmetric case. These forms serve as preprocessing steps in eigenvalue algorithms. In the QR algorithm for computing the Schur decomposition, the initial reduction to Hessenberg or tridiagonal form ensures that subsequent iterations preserve the structure and avoid unnecessary fill-in, significantly improving efficiency. In Krylov subspace methods, such as the Arnoldi iteration, the process generates an upper Hessenberg projection of the matrix onto the Krylov subspace, which approximates extremal eigenvalues for large sparse systems.
As an example, consider the 4 \times 4 matrix A = \begin{pmatrix} 1 & 15 & -6 & 0 \\ 1 & 7 & 3 & 12 \\ 2 & -7 & -3 & 0 \\ 2 & -28 & 15 & 3 \end{pmatrix}. The reduction begins with the first column: the subvector from row 2 to 4 is [1, 2, 2]^T, with norm \sqrt{9} = 3. A Householder reflector is constructed to map this subvector to [3, 0, 0]^T (with the sign chosen for numerical stability), zeroing a_{31} and a_{41}. This reflector is applied to rows and columns 2 through 4 of A. Subsequent steps target the second column (zeroing a_{42}) and the third column (already zero below the subdiagonal). The resulting H is upper Hessenberg, similar to A, with nonzeros only on and above the main diagonal and on the first subdiagonal.
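A Python sketch (assuming NumPy and SciPy) performs the same reduction and checks the claimed structure; the sign of the (2,1) entry depends on the reflector convention used by the library.

```python
import numpy as np
from scipy.linalg import hessenberg

A = np.array([[ 1.0,  15.0,  -6.0,  0.0],
              [ 1.0,   7.0,   3.0, 12.0],
              [ 2.0,  -7.0,  -3.0,  0.0],
              [ 2.0, -28.0,  15.0,  3.0]])

H, Q = hessenberg(A, calc_q=True)       # A = Q @ H @ Q.T with H upper Hessenberg
assert np.allclose(Q @ H @ Q.T, A)
assert np.allclose(np.tril(H, -2), 0)   # entries below the first subdiagonal are zero
print(H[1, 0])                          # magnitude 3, the norm of the subvector [1, 2, 2]
```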

Variants and Generalizations

Block and Rank-Based Decompositions

Block and rank-based decompositions extend classical factorizations like LU and QR to handle structured matrices or reveal intrinsic low-rank properties, enabling efficient computation on modern hardware and low-rank approximation for data tasks. These methods partition matrices into blocks for parallelism or incorporate pivoting to identify the numerical rank, which is crucial when matrices are ill-conditioned or approximately low-rank due to noise or approximations in applications such as data analysis. By focusing on block operations, algorithms achieve better cache utilization and scalability on parallel architectures, while rank-revealing variants provide bounds on singular values to approximate the matrix's effective dimension. The block LU decomposition applies the standard LU factorization recursively to block-partitioned matrices, facilitating parallel computation by allowing independent processing of diagonal blocks. For a square matrix A partitioned as A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, where A_{11} is k \times k, the block LU takes the form A = \begin{bmatrix} L_{11} & 0 \\ L_{21} & I \end{bmatrix} \begin{bmatrix} U_{11} & U_{12} \\ 0 & S \end{bmatrix}, with L_{11} and U_{11} from the LU of A_{11}, L_{21} = A_{21} U_{11}^{-1}, U_{12} = L_{11}^{-1} A_{12}, and S = A_{22} - L_{21} U_{12} as the Schur complement for further recursion. This structure supports parallelism by computing the factorization of S independently after the initial blocks, reducing synchronization overhead in distributed systems. A simple example illustrates the block LU on a 4×4 matrix partitioned into 2×2 blocks: Let A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, with A_{11} = \begin{bmatrix} 1 & 2 \\ 5 & 6 \end{bmatrix}. The LU factorization of A_{11} is L_{11} = \begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix}, U_{11} = \begin{bmatrix} 1 & 2 \\ 0 & -4 \end{bmatrix}, so U_{11}^{-1} = \begin{bmatrix} 1 & 0.5 \\ 0 & -0.25 \end{bmatrix}, L_{21} = A_{21} U_{11}^{-1} = \begin{bmatrix} 9 & 10 \\ 13 & 14 \end{bmatrix} \begin{bmatrix} 1 & 0.5 \\ 0 & -0.25 \end{bmatrix} = \begin{bmatrix} 9 & 2 \\ 13 & 3 \end{bmatrix}. The Schur complement S is the zero matrix due to the matrix's rank deficiency. This decomposition is used in parallel solvers for large dense systems, where block sizes are tuned for load balancing across processors. Rank-revealing QR (RRQR) enhances the QR decomposition with column pivoting to expose the matrix's numerical rank, particularly useful for ill-conditioned or rank-deficient matrices. In RRQR, a permutation matrix P is chosen such that A P = Q R, where Q is orthogonal, R is upper triangular, and the entries of the trailing block of R are small, ensuring |R_{ij}| \leq f \sigma_{\min}(R_{11}) for i > r, j \geq r+1, with f a small factor and r the numerical rank. This reveals the rank by inspecting R's structure, providing a low-rank approximation A \approx Q(:,1:r) R(1:r,:) with error bounded by the tail singular values. The algorithm extends classical QR by selecting at each step the remaining column with the largest residual norm as the pivot, running in O(m n^2) time for m \times n matrices. Strong variants guarantee tighter bounds on singular values, improving reliability for subset selection in least squares problems. The URV decomposition serves as a rank-revealing analog of the SVD, factoring A = U R V^T where U and V are orthogonal, and R is triangular with its leading r \times r block well-conditioned, revealing the numerical rank through the small norms of the trailing blocks.
Introduced for adaptive subspace tracking, it approximates the SVD's rank revelation while being much cheaper to update, making it efficient for dynamic applications like signal processing. Algorithms compute URV via bidiagonalization followed by rank-revealing steps, with costs comparable to QR but providing both row and column space approximations. Rank factorization decomposes a matrix A \in \mathbb{R}^{m \times n} of rank r as A = C R, where C \in \mathbb{R}^{m \times r} has full column rank and R \in \mathbb{R}^{r \times n} has full row rank, spanning the column and row spaces exactly. This basic form enables low-rank approximations by truncating to k < r, minimizing the Frobenius norm error among rank-k matrices when combined with SVD-derived factors, though direct computation uses row echelon forms. It underpins model order reduction by projecting high-dimensional systems onto lower-dimensional subspaces while preserving block structures in partitioned models. Blocked Householder algorithms optimize QR and related decompositions for cache efficiency by applying Householder reflections in panels, reducing memory accesses through level-3 BLAS operations. In blocked QR, the matrix is divided into blocks, with each panel factorized using unblocked Householder reflections, then applied via block reflectors to the trailing matrix, achieving high performance on hierarchical memory systems at the usual O(n^3) operation count. This approach extends to block LU and URV, enhancing parallelism in multicore environments. These decompositions find applications in parallel computing, where block LU enables task-based parallelism for solving large-scale linear systems on distributed architectures, and in model order reduction, where rank-revealing QR and URV identify dominant subspaces for approximating high-fidelity simulations with reduced models. For instance, RRQR supports subset selection in data mining, selecting informative columns for compression.
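As a sketch of rank revelation in practice, the following Python code (assuming NumPy and SciPy) applies column-pivoted QR to the rank-deficient 4×4 matrix from the block LU example; the rank tolerance is an illustrative choice rather than a standard.

```python
import numpy as np
from scipy.linalg import qr

# The 4x4 matrix with entries 1..16 has rank 2.
A = np.arange(1.0, 17.0).reshape(4, 4)

Q, R, piv = qr(A, pivoting=True)          # column-pivoted QR: A[:, piv] = Q @ R
assert np.allclose(A[:, piv], Q @ R)

diag = np.abs(np.diag(R))
print(diag)                               # two sizable pivots, then values near round-off
numerical_rank = int(np.sum(diag > 1e-8 * diag[0]))   # illustrative tolerance
print(numerical_rank)                     # 2
```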

Decompositions for Nonsquare Matrices

Matrix decompositions extend naturally to nonsquare matrices, particularly rectangular ones where the number of rows m differs from the number of columns n. For tall matrices with m > n, the LU decomposition generalizes via Gaussian elimination to a trapezoidal form, consisting of an m \times n upper trapezoidal factor U (possibly with zero rows at the bottom) and permutation matrices accounting for pivoting, such that PA = LU where L is lower triangular but adapted for the rectangular case. This form facilitates solving overdetermined systems by back-substitution after forward elimination. The thin (or reduced) QR decomposition is particularly useful for tall matrices, factoring an m \times n matrix A (with m > n and full column rank) as A = QR, where Q is an m \times n matrix with orthonormal columns and R is an n \times n upper triangular matrix. This decomposition is computed using Householder reflections or Gram-Schmidt orthogonalization, preserving numerical stability for subsequent computations. For wide matrices with m < n, a similar economy QR form exists with Q as m \times m orthogonal and R as m \times n upper trapezoidal. The singular value decomposition (SVD) applies directly to rectangular matrices, yielding A = U \Sigma V^T where U is m \times m orthogonal, \Sigma is m \times n diagonal with nonnegative singular values, and V is n \times n orthogonal. For efficiency with nonsquare matrices, economy (or thin) SVD variants are used: for m > n, A = U_{m \times n} \Sigma_{n \times n} V_{n \times n}^T with U having orthonormal columns; for m < n, a similar reduced form discards unnecessary zero singular values and corresponding vectors. These compact forms reduce computational cost while retaining essential information for rank and approximation. These decompositions are central to solving overdetermined systems Ax = b with m > n, minimizing the residual \|Ax - b\|_2 via the normal equations or directly through QR or SVD. For underdetermined systems with m < n, they enable minimum-norm solutions \min \|x\|_2 subject to Ax = b, often using the pseudoinverse derived from the SVD. The connection to the pseudoinverse provides a unified framework for both cases. An intermediate step in computing the SVD of a rectangular matrix is bidiagonalization, which reduces A to an m \times n upper bidiagonal form B via Householder transformations: A = U B V^T, where U and V are orthogonal. The Golub-Kahan procedure applies alternating left and right Householder reflections to zero subdiagonal elements progressively, yielding a form amenable to iterative SVD refinement like the QR algorithm on the bidiagonal matrix. This step is crucial for numerical stability in rectangular SVD computations. For illustration, consider the least-squares problem with the 3×2 matrix A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix} and right-hand side b = \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}. The thin QR decomposition is A = QR with Q \approx \begin{pmatrix} 0.5774 & -0.7071 \\ 0.5774 & 0 \\ 0.5774 & 0.7071 \end{pmatrix}, \quad R \approx \begin{pmatrix} 1.7321 & 3.4641 \\ 0 & 1.4142 \end{pmatrix}. Solving Rx = Q^T b via back-substitution gives the solution x \approx (0.6667, 0.5000)^T, minimizing \|Ax - b\|_2 \approx 0.4082.
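The least-squares example can be reproduced with a short Python sketch (assuming NumPy), solving through the thin QR factors and cross-checking with the SVD-based pseudoinverse.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Least-squares solution via the thin QR factorization
Q, R = np.linalg.qr(A)                  # Q is 3x2 with orthonormal columns, R is 2x2
x = np.linalg.solve(R, Q.T @ b)
print(x)                                # approx [0.6667, 0.5]
print(np.linalg.norm(A @ x - b))        # approx 0.4082

# The same solution from the SVD-based pseudoinverse
assert np.allclose(x, np.linalg.pinv(A) @ b)
```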

Extensions to Other Structures

Matrix decompositions extend beyond finite-dimensional matrices to higher-order structures and infinite-dimensional operators, providing analogous tools for analysis in multilinear algebra and functional analysis. In tensor decompositions, the CANDECOMP/PARAFAC (CP) decomposition represents a higher-order tensor as a sum of rank-1 terms, where each term is the outer product of vectors along each mode of the tensor. This generalization of matrix factorization was first proposed by Hitchcock in 1927 and later developed as PARAFAC by Harshman, building on ideas of Cattell, in the 1970s. The CP model is particularly useful for capturing multiway interactions in data, with the tensor rank indicating the minimal number of such components needed. The Tucker decomposition serves as a higher-order analog to the singular value decomposition (SVD), factoring a tensor into a smaller core tensor multiplied by factor matrices along each mode, allowing for multilinear rank reduction. Introduced by Ledyard Tucker in 1963 and refined in subsequent works, it enables flexible dimensionality reduction while preserving tensor structure, often applied in multiway data analysis and data compression. For a third-order tensor \mathcal{X} \in \mathbb{R}^{I \times J \times K}, the Tucker form is \mathcal{X} = \mathcal{G} \times_1 A \times_2 B \times_3 C, where \mathcal{G} is the core tensor and A, B, C are factor matrices. In the realm of functional analysis, the spectral theorem generalizes eigendecomposition to self-adjoint operators on Hilbert spaces, asserting that such an operator T admits a spectral representation T = \int \lambda \, dE(\lambda), where E is a projection-valued measure supported on the real spectrum of T. This holds for bounded self-adjoint operators and extends to unbounded self-adjoint operators via the functional calculus, which defines functions of the operator as f(T) = \int f(\lambda) \, dE(\lambda), enabling the application of polynomials and continuous functions to infinite-dimensional settings. These extensions are foundational in quantum mechanics and partial differential equations, where operators may not be bounded. Further generalizations include the QZ decomposition for matrix pencils, which computes a generalized Schur form Q^* A Z = S and Q^* B Z = T, where Q and Z are orthogonal (or unitary), S is quasitriangular, and T is upper triangular, facilitating the solution of generalized eigenvalue problems \det(A - \lambda B) = 0. Developed in the 1970s as an extension of the Schur form, it is essential in control theory for analyzing system stability and pole placement in descriptor systems. In symplectic geometry, the Williamson normal form decomposes a positive definite symmetric matrix M via a symplectic transformation S such that S^T M S = \operatorname{diag}(D, D), where D is a diagonal matrix of symplectic eigenvalues, preserving the symplectic structure in Hamiltonian mechanics. Similarly, the Takagi factorization applies to complex symmetric matrices A = U \Sigma U^T, with U unitary and \Sigma diagonal containing non-negative singular values, generalizing the SVD for non-Hermitian cases in quantum information and vibration analysis. These extensions find applications in multidimensional data analysis, such as using CP decomposition in recommender systems to model user-item-context interactions from sparse tensors, improving prediction accuracy over matrix-based methods. In control engineering, the QZ decomposition aids in solving generalized eigenvalue problems for linear descriptor systems, enabling robust stability assessments. As an illustrative example, consider a 2 \times 2 \times 2 tensor \mathcal{X} with rank 2: \mathcal{X} \approx \sum_{r=1}^2 \mathbf{u}_r \circ \mathbf{v}_r \circ \mathbf{w}_r, where \mathbf{u}_r, \mathbf{v}_r, \mathbf{w}_r \in \mathbb{R}^2 are component vectors; for instance, with rank-1 terms capturing bilinear and trilinear dependencies in small-scale multiway data.
Recent advances in multilinear algebra, building on these foundations, have expanded tensor decompositions to handle large-scale and structured data, addressing limitations in earlier matrix-centric approaches.
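A minimal Python sketch (assuming NumPy) of the 2×2×2 CP example builds a rank-2 tensor from two rank-1 outer-product terms; the component vectors are arbitrary illustrative choices.

```python
import numpy as np

# CP model: X = sum_r u_r (outer) v_r (outer) w_r for a 2x2x2 tensor.
u = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
v = [np.array([1.0, 1.0]), np.array([1.0, -1.0])]
w = [np.array([1.0, 2.0]), np.array([2.0, 1.0])]

# Each einsum term is a rank-1 (outer-product) tensor; their sum has rank 2.
X = sum(np.einsum('i,j,k->ijk', u[r], v[r], w[r]) for r in range(2))
print(X.shape)    # (2, 2, 2)
```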

References

  1. [1]
    Linear Algebra and Matrix Decompositions - Duke People
    Matrix decompositions are an important step in solving linear systems in a computationally efficient manner.Lu Decomposition And... · Cholesky Decomposition · EigendecompositionMissing: scholarly sources
  2. [2]
    [PDF] THE DECOMPOSITIONAL APPROACH TO MATRIX COMPUTATION
    TO MATRIX COMPUTATION. The introduction of matrix decomposition into numerical linear algebra revolutionized matrix computations. This article outlines the ...
  3. [3]
    [2201.00145] Matrix Decomposition and Applications - arXiv
    Jan 1, 2022 · The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis.Missing: scholarly | Show results with:scholarly
  4. [4]
    Matrix Factorizations — Linear Algebra, Geometry, and Computation
    A factorization of a matrix A is an equation that expresses A as a product of two or more matrices. ; The essential difference with what we have done so far is ...<|control11|><|separator|>
  5. [5]
    [PDF] Matrix Decomposition - Electrical and Computer Engineering
    “Matrix decomposition refers to the transformation of a given matrix into a given canonical form.” [1], when the given matrix is transformed to a right-hand- ...Missing: motivation | Show results with:motivation
  6. [6]
    Matrices and determinants - MacTutor History of Mathematics
    The beginnings of matrices and determinants goes back to the second century BC although traces can be seen back to the fourth century BC.
  7. [7]
    On the Early History of the Singular Value Decomposition
    The first proofs of the SVD for real square matrices came out of the study of bilinear forms, first by Beltrami in 1873 and, independently, by Jordan in 1874.
  8. [8]
    Gram‐Schmidt orthogonalization: 100 years and more
    Jun 4, 2012 · In 1907, Erhard Schmidt published a paper in which he introduced an orthogonalization algorithm that has since become known as the classical ...
  9. [9]
    Andre-Louis Cholesky (1875 - 1918) - Biography - MacTutor
    Andre-Louis Cholesky was a French military officer and mathematician who invented the Method of Cholesky, a computational procedure for solving least squares ...
  10. [10]
    [PDF] A Brief History of Linear Algebra and Matrix Theory
    Turing introduced the LU decomposition of a matrix in 1948. The L is a lower triangular matrix with 1's on the diagonal and the U is an echelon matrix.
  11. [11]
    None
    Below is a merged summary of matrix decompositions from Chapters 2, 3, and 5, consolidating all information from the provided segments into a comprehensive and dense format. To maximize detail retention, I will use a combination of text and tables in CSV-like format where appropriate. The summary is organized by key properties, computational complexity, stability and conditioning, relations to triangularization/diagonalization, key quotes/theorems, page references, and useful URLs.
  12. [12]
    LU Decomposition for Solving Linear Equations - CS 357
    The forward substitution algorithm solves a lower-triangular linear system by working from the top down and solving each variable in turn.
  13. [13]
  14. [14]
    [PDF] LU Decomposition 1.0 Introduction We have seen how to construct ...
    U is an upper-triangular matrix, meaning that all elements below the diagonal are 0. •. L is a lower-triangular matrix, meaning that all elements above the ...
  15. [15]
    [PDF] 11. LU Decomposition - UC Davis Math
    From here, the process is exactly the same as for a square matrix. We create a sequence of matrices Li and Ui that is eventually the LU decomposition. Again, ...Missing: history | Show results with:history
  16. [16]
    Cholesky decomposition - StatLect
    A square matrix is said to have a Cholesky decomposition if it can be written as the product of a lower triangular matrix and its transpose.Missing: advantages | Show results with:advantages
  17. [17]
    Cholesky Factorization - an overview | ScienceDirect Topics
    Cholesky decomposition or factorization is a form of triangular decomposition that can only be applied to either a positive definite symmetric matrix or a ...
  18. [18]
    [PDF] 1. Positive Definite Matrices
    Oct 26, 2005 · The Cholesky Decomposition. The Cholesky decomposition can be computed directly from the matrix equation A = FTF. Examining this equation on ...
  19. [19]
    Cholesky factorization - Higham - Wiley Interdisciplinary Reviews
    Jun 30, 2009 · The Cholesky factorization (sometimes called the Cholesky decomposition) is named after André-Louis Cholesky (1875–1918), a French military ...Missing: original | Show results with:original
  20. [20]
    [PDF] MODELING COVARIANCE MATRICES IN TERMS OF STANDARD ...
    Another approach is to use the Cholesky decomposition of the inverse of the covariance matrix (e.g., Pourahmadi (1999, 2000)), which has a nice regres- sion ...
  21. [21]
    [PDF] Unit II: Numerical Linear Algebra Chapter II.3: QR Factorization, SVD
    How do we compute the QR Factorization? There are three main methods. ▷ Gram-Schmidt Orthogonalization. ▷ Householder Triangularization. ▷ Givens Rotations.
  22. [22]
    [PDF] Some notes on QR factorization - Alen Alexanderian
    Sep 27, 2025 · The QR factorization of A is A = QR, where Q \in \mathbb{R}^{m \times n} has orthonormal columns and R is upper triangular. We also know, if A has full column rank, ...
  23. [23]
    QR decomposition - MATLAB - MathWorks
    Use the economy-size QR decomposition of a coefficient matrix to solve the linear system Ax = b. Create a 10-by-5 coefficient matrix by using the first five columns ...
  24. [24]
    [PDF] The QR Algorithm - Ethz
    The QR algorithm computes a Schur decomposition of a matrix. It is certainly one of the most important algorithms in eigenvalue computations [9].
  25. [25]
    [PDF] Updating the QR Factorization and the Least Squares Problem
    Nov 12, 2008 · Abstract. In this paper we treat the problem of updating the QR factorization, with applications to the least squares problem.
  26. [26]
    [PDF] QR Decomposition with Gram-Schmidt
    The QR decomposition (also called the QR factorization) of a matrix is a decomposition of the matrix into an orthogonal matrix and a triangular matrix.
  27. [27]
    Eigen Decomposition - an overview | ScienceDirect Topics
    Eigen decomposition is defined as the factorization of a tensor into a canonic form using eigenvalues and eigenvectors, and it is utilized in various machine ...
  28. [28]
    Eigenvalue computation in the 20th century - ScienceDirect.com
    This paper sketches the main research developments in the area of computational methods for eigenvalue problems during the 20th century.
  29. [29]
    [PDF] A Short History of Operator Theory - NYU Stern
    Cauchy proved the spectral theorem for self-adjoint matrices, i.e., that every real, symmetric matrix is diagonable. The spectral theorem as generalized by ...
  30. [30]
    [PDF] The QR algorithm: 50 years later its genesis by John Francis and ...
    Jun 8, 2009 · Introduction. On 29 October 1959 John Francis submitted his first theoretical QR algorithm paper. On 6 June 1961.
  31. [31]
    [PDF] A Tutorial on the Spectral Theory of Markov Chains - arXiv
    Aug 19, 2022 · Finding the eigenvalues and eigenvectors of a matrix is one way to partition a linear transformation into its component parts and to reveal ...
  32. [32]
    [PDF] Why eigenvalues? 1 Nonlinear equation solving - CS@Cornell
    The purpose of this lecture is to tell you about a few applications of eigenvalue analysis, or perhaps to remind you of some applications that you've seen in ...
  33. [33]
    [PDF] Eigenvalues and eigenvectors of rotation matrices
    \cos\theta \pm \sqrt{\cos^2\theta - 1} = \cos\theta \pm i\sin\theta = e^{\pm i\theta}. Thus, we have confirmed that the eigenvalues are not real if \theta \neq 0 (mod \pi).
  34. [34]
    [PDF] 10. Schur decomposition
    \alpha_2(\lambda_2 - \lambda_1)x_2 + \cdots + \alpha_k(\lambda_k - \lambda_1)x_k = 0; since x_2, \ldots, x_k are linearly independent, \alpha_2 = \cdots = \alpha_k = 0; then \alpha_1 x_1 = -(\alpha_2 x_2 + \cdots + \alpha_k x_k) = 0, so \alpha_1 is also zero.
  35. [35]
    [PDF] Golub and Van Loan - EE IIT Bombay
    2.4 The Singular Value Decomposition. It is fitting that the first matrix decomposition that we present in the book is the singular value decomposition (SVD).
  36. [36]
    [PDF] Basic Theory of Algebraic Eigenvalue Problems
    The Schur decomposition is one of the most important tools in theoretical and computational linear algebra! A real square matrix A may have complex ...
  37. [37]
    [PDF] Basics - Ethz
    ... Schur vector is an eigenvector of A. The other columns of U, however, are in general not eigenvectors of A. Notice, that the Schur decomposition is not unique.
  38. [38]
    [PDF] Eigenvalue Problem for General Matrices 1 The Complex Schur ...
    Schur decomposition is defined as A = QUQ^*, where ^* denotes the ... important property of an unreduced Hessenberg matrix is that it can only have simple ...
  39. [39]
    [PDF] Computing the Polar Decomposition—with Applications
    Abstract. A quadratically convergent Newton method for computing the polar decomposition of a full-rank matrix is presented and analysed. Acceleration ...
  40. [40]
    [PDF] Introduction to Continuum Mechanics
    Continuum mechanics is a modern discipline that unifies solid and ... the right polar decomposition in Equation (3.92). Here, for the stretch U to ...
  41. [41]
    Traité des substitutions et des équations algébriques - Internet Archive
    Dec 3, 2008 · Traité des substitutions et des équations algébriques, by Jordan, Camille, 1838-1922.
  42. [42]
  43. [43]
    [PDF] Block LU Factorization 1 Introduction - The Netlib
    Feb 16, 1992 · Algorithm BLU. This algorithm computes a block LU factorization A = LU \in \mathbb{R}^{n \times n}. 1. U_{11} = A_{11}, U_{12} = A_{12}.
  44. [44]
    Parallel Implementation - The Netlib
    In this section we describe the parallel implementation of LU factorization, with partial pivoting over rows, for a block-partitioned matrix. The matrix, A, to ...
  45. [45]
    Efficient parallel algorithm for dense matrix LU decomposition with ...
    The algorithm employs row-, column-, as well as block-parallelisms interchangeably so that the n processors are used efficiently in the whole computation process.
  46. [46]
    Rank revealing QR factorizations - ScienceDirect.com
    For matrices of low rank deficiency, the algorithm is guaranteed to reveal the rank of A, and the cost is only slightly more than the cost of one regular QR ...
  47. [47]
    Downdating the Rank-Revealing URV Decomposition - SIAM.org
    An accurate algorithm is presented for downdating a row in the rank-revealing URV decomposition that was recently introduced by Stewart.
  48. [48]
    Some Applications of the Rank Revealing QR Factorization - SIAM.org
    The rank revealing QR factorization can be used to compute solutions to rank deficient least squares problems, to perform subset selection, to compute matrix ...
  49. [49]
    [PDF] Block Structure Preserving Model Order Reduction
    Abstract—. We propose a block structure preserving model reduction (BSMOR), which generalizes the structure preserving model order reduction (SPRIM).
  50. [50]
    [PDF] Calculating the Singular Values and Pseudo-Inverse of a Matrix
    GOLUB AND W. KAHAN. Householder's method for the solution of the algebraic eigenproblem, Comput. J., 3 (1960), pp. 23-27. Error analysis of direct methods ...
  51. [51]
    A Gentle Introduction to Matrix Factorization for Machine Learning
    Aug 9, 2019 · A matrix decomposition is a way of reducing a matrix into its constituent parts. It is an approach that can simplify more complex matrix operations.
  52. [52]
    Tensor Decompositions and Applications | SIAM Review
    Tensor Rank and the CANDECOMP/PARAFAC Decomposition. In 1927, Hitchcock [105, 106] proposed the idea of the polyadic form of a tensor, i.e., expressing a tensor ...
  53. [53]
    The spectral theorem and its converses for unbounded symmetric ...
    Dec 20, 2011 · The spectral theorem for finite-dimensional self-adjoint operators (ie Hermitian matrices, when viewed in coordinates), which provides a sequence.
  54. [54]
    On the Generalized Schur Decomposition of a Matrix Pencil for ...
    Here we introduce, as an alternative, a companion QZ algorithm that solves a generalized eigenvalue problem for a companion pencil. More importantly, we ...
  55. [55]
    Williamson theorem in classical, quantum, and statistical physics
    Dec 1, 2021 · For each real square positive-definite symmetric matrix with even dimension, there is an associated symplectic matrix that diagonalizes it ...
  56. [56]
    [PDF] Iterative Algorithms for Computing the Takagi Factorization ... - IAENG
    Aug 28, 2018 · The Takagi factorization of a complex symmetric matrix has many applications, such as the Grunsky inequalities [23], computation of the near- ...
  57. [57]
    [PDF] Tensor Methods and Recommender Systems - arXiv
    Feb 18, 2018 · rank in case of CP) of a tensor decomposition ... The TFC model: Tensor factorization and tag clustering for item recommendation in social tagging ...
  58. [58]
    Iterative refinement of the QZ decomposition for solving linear DSGE ...
    Our approach allows users to improve the initial QZ solution through repeated refinement, preserving eigenvalues while incrementally improving accuracy.