Gaussian elimination
Gaussian elimination is a systematic algorithm for solving systems of linear equations, determining the rank of a matrix, computing matrix inverses, and finding determinants. It transforms an augmented matrix into row echelon form using elementary row operations—swapping rows, multiplying a row by a nonzero scalar, and adding a multiple of one row to another—and then recovers the solution by back-substitution.[1][2] The method reduces the original system to an equivalent upper triangular form in which the variables can be solved sequentially from the bottom up; numerical stability is improved when partial pivoting is employed to select the largest available pivot element and avoid division by small numbers.[3][4] Although commonly attributed to Carl Friedrich Gauss, who formalized its use in the early 19th century for solving least squares problems in astronomy, the technique traces its origins to ancient Chinese mathematics: the text Nine Chapters on the Mathematical Art, compiled around 200 BCE, applies a similar row reduction method to linear systems.[5][6] The algorithm gained prominence in the West through contributions from figures such as Isaac Newton in the 17th century and was extended by Wilhelm Jordan in 1888 into the Gauss-Jordan elimination variant, which reduces the matrix fully to reduced row echelon form so that the solution can be read off without back-substitution.[7][8] In modern computational contexts, Gaussian elimination underpins numerical linear algebra libraries and is fundamental to fields such as engineering, physics, and data science. Its cost grows as O(n³) for an n×n system, which motivates optimizations such as LU decomposition for repeated solves with the same coefficient matrix.[9][10] Its reliability and simplicity make it a cornerstone of introductory linear algebra, and pivoting strategies enhance its accuracy in floating-point implementations.[11]
Fundamentals
Elementary row operations
Elementary row operations are the fundamental manipulations applied to matrices during Gaussian elimination to simplify systems of linear equations without altering their solutions. There are three types: (1) interchanging any two rows of the matrix, (2) multiplying all entries of a single row by a nonzero scalar c, and (3) adding a scalar multiple c of the entries of one row to the corresponding entries of another row.[12] For an m \times n matrix A = (a_{ik}) they are defined as follows: to interchange rows i and j, swap a_{ik} with a_{jk} for all k = 1 to n; to multiply row i by c \neq 0, replace a_{ik} \leftarrow c a_{ik} for k = 1 to n; to add c times row j to row i, replace a_{ik} \leftarrow a_{ik} + c a_{jk} for k = 1 to n.[12]

When applied to the augmented matrix [A \mid b] of a linear system Ax = b, these operations produce an equivalent system with the same solution set. Interchanging rows merely reorders the equations, which does not affect the solutions; multiplying a row by a nonzero scalar c scales both sides of an equation by c, maintaining equality for any solution; and adding a multiple of one row to another keeps the solution set unchanged, since the new equations are linear combinations of the originals.[12][13]

Each elementary row operation is reversible by another elementary operation, so the transformations are invertible. Specifically, interchanging rows i and j is its own inverse; multiplying row i by c \neq 0 is undone by multiplying by c^{-1}; and adding c times row j to row i is undone by adding -c times row j to row i.[14] The operations correspond to left-multiplication by elementary matrices, obtained by applying the same operation to the identity matrix I_m; for instance, the elementary matrix E for adding c times row j to row i has 1s on the diagonal, c in the (i, j) entry, and 0s elsewhere, and EA is the modified matrix. Since each elementary matrix E is invertible with E^{-1} also elementary, a sequence of operations represented by U = E_k \cdots E_1 is invertible, confirming that row-equivalent matrices share the same solution set.[14][15]

The motivation for these operations lies in their ability to simplify matrices while preserving the row space, the span of the row vectors, which is invariant under row interchanges, scalings, and additions.[16] This preservation ensures that the linear dependencies among rows are maintained, allowing Gaussian elimination to reduce a matrix to row echelon form without losing essential structural information.[12]
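As a concrete illustration, the sketch below applies the three operations using Python and NumPy; the function names (swap_rows, scale_row, add_multiple) and the example matrix are choices of this illustration rather than notation from the text. It also checks that adding c times row j to row i agrees with left-multiplication by the corresponding elementary matrix E.

import numpy as np

def swap_rows(A, i, j):
    # Operation (1): interchange rows i and j.
    B = np.asarray(A, dtype=float).copy()
    B[[i, j]] = B[[j, i]]
    return B

def scale_row(A, i, c):
    # Operation (2): multiply row i by the nonzero scalar c.
    B = np.asarray(A, dtype=float).copy()
    B[i] *= c
    return B

def add_multiple(A, i, j, c):
    # Operation (3): add c times row j to row i.
    B = np.asarray(A, dtype=float).copy()
    B[i] += c * B[j]
    return B

A = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])

# The same operation via an elementary matrix: E is the identity with
# c placed in the (i, j) entry, and E @ A adds c times row j to row i.
c, i, j = 1.5, 1, 0
E = np.eye(3)
E[i, j] = c
assert np.allclose(E @ A, add_multiple(A, i, j, c))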
Row echelon form
A matrix is in row echelon form (REF) if it satisfies three conditions: all nonzero rows are above any rows consisting entirely of zeros; the leading entry (pivot) of each nonzero row is strictly to the right of the leading entry of the row immediately above it; and all entries below each leading entry are zero.[17] This form is reached through a sequence of elementary row operations.[17]

Consider the following 3×3 matrix, which is not in REF because the leading entry of the third row (5) lies in a column to the left of the leading entry of the second row (4):
\begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 5 & 0 \end{pmatrix}
In contrast, the matrix below is in REF, with pivots at positions (1,1) and (2,3) and a zero row at the bottom; note that the leading entries need not be 1 in REF:
\begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{pmatrix}
[17]

The reduced row echelon form (RREF) builds on REF by adding two further requirements: each leading entry is 1, and each leading 1 is the only nonzero entry in its column.[17] A key property of RREF is its uniqueness: every matrix is row equivalent to exactly one matrix in RREF, which uniquely represents its row space.[18] In REF (or RREF), the pivot positions identify the basic variables of systems derived from the matrix, and the number of pivots equals the dimension of the row space.[17] The number of nonzero rows in an REF equals the row rank of the matrix, the maximum number of linearly independent rows, which is invariant under elementary row operations. This equivalence holds because the nonzero rows of an REF are linearly independent and form a basis of the row space.
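The three REF conditions can be checked mechanically. The following Python function is a minimal sketch (the name is_row_echelon and the tolerance parameter are illustrative choices); because each leading entry is the first nonzero entry of its row, requiring the pivot columns to increase strictly from one nonzero row to the next also enforces the zeros below each pivot.

import numpy as np

def is_row_echelon(A, tol=1e-12):
    # Checks: zero rows only at the bottom, and each pivot column
    # strictly to the right of the pivot column of the row above.
    A = np.asarray(A, dtype=float)
    last_pivot_col = -1
    seen_zero_row = False
    for row in A:
        nonzero = np.nonzero(np.abs(row) > tol)[0]
        if nonzero.size == 0:
            seen_zero_row = True      # every later row must also be zero
            continue
        if seen_zero_row:
            return False              # nonzero row found below a zero row
        pivot_col = nonzero[0]
        if pivot_col <= last_pivot_col:
            return False              # pivot not strictly to the right
        last_pivot_col = pivot_col
    return True

print(is_row_echelon([[1, 2, 3], [0, 0, 4], [0, 5, 0]]))  # False
print(is_row_echelon([[1, 2, 3], [0, 0, 4], [0, 0, 0]]))  # True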
The Algorithm
Forward elimination phase
The forward elimination phase of Gaussian elimination systematically transforms the augmented matrix of a system of linear equations into upper triangular form by eliminating the entries below the main diagonal using elementary row operations. The phase processes the columns from left to right, eliminating entries from top to bottom within each column; after completion, the matrix is in row echelon form, which simplifies subsequent solving.[19] The procedure begins with the augmented matrix [A \mid b], where A is the n \times n coefficient matrix and b is the right-hand side vector. For each pivot position k from 1 to n-1:
- Identify the pivot element a_{kk}; if it is zero, perform partial pivoting by swapping row k with a row i > k that has a nonzero entry in column k, so that a_{kk} \neq 0 and division by zero is avoided.[19]
- For each row i from k+1 to n, compute the multiplier m_{ik} = a_{ik} / a_{kk}.[20]
- Subtract m_{ik} times row k from row i to zero out the entry a_{ik}, updating the remaining entries in row i and the corresponding right-hand side entry b_i accordingly.[19]
By progressively zeroing out the subdiagonal entries in each column, this phase reduces the original system to an equivalent upper triangular system U \mathbf{x} = \mathbf{c}, where U has nonzero diagonal entries and zeros below the diagonal. This triangular structure removes the coupling between variables in the lower rows, enabling efficient back-substitution to find the solution vector \mathbf{x}.[19][20] The phase can be summarized in pseudocode:

for k = 1 to n-1
    if a_{kk} == 0
        swap row k with a suitable row i > k where a_{ik} ≠ 0
    for i = k+1 to n
        m = a_{ik} / a_{kk}
        for j = k to n
            a_{ij} = a_{ij} - m * a_{kj}
        end
        b_i = b_i - m * b_k
    end
end
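The pseudocode translates directly into a runnable routine. The following Python/NumPy sketch is one possible rendering; the name forward_eliminate and the choice to pivot on the entry of largest magnitude (rather than merely any nonzero entry) are assumptions of this illustration.

import numpy as np

def forward_eliminate(A, b):
    # Reduce [A | b] to upper triangular form on copies of the inputs,
    # using partial pivoting: the row with the largest |a_ik| in column k
    # (at or below row k) becomes the pivot row.
    A = np.asarray(A, dtype=float).copy()
    b = np.asarray(b, dtype=float).copy()
    n = len(b)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(A[k:, k]))
        if np.isclose(A[p, k], 0.0):
            raise ValueError("matrix is singular to working precision")
        if p != k:
            A[[k, p]] = A[[p, k]]
            b[[k, p]] = b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]       # multiplier m_ik
            A[i, k:] -= m * A[k, k:]    # zero out a_ik and update row i
            b[i] -= m * b[k]            # update the right-hand side
    return A, b

Because this sketch pivots on the largest available entry, its intermediate rows can differ from a hand computation that keeps the original row order, but the triangular system it produces has the same solution.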
Back-substitution phase
The back-substitution phase follows the forward elimination phase, in which the augmented matrix has been transformed into upper triangular form with nonzero pivots on the diagonal. This phase solves for the unknowns by starting from the bottom row and working upward, substituting previously computed values into the earlier equations to isolate each variable in turn.[21][20]

In a system of n equations in n unknowns, assuming the matrix is square and invertible after elimination, the procedure begins with the last equation, which simplifies to x_n = b_n' / a_{nn}', where b_n' and a_{nn}' are the updated right-hand side and pivot entries, respectively. For each preceding row i from n-1 down to 1, the i-th variable is obtained as
x_i = \frac{b_i' - \sum_{j=i+1}^n a_{ij}' x_j}{a_{ii}'},
where the sum subtracts the contributions of the already-determined variables x_j for j > i. This backward pass yields the unique solution vector x in O(n^2) operations.[21][22][23]

For underdetermined systems, where the number of equations m < n or rank deficiency occurs, back-substitution identifies free variables corresponding to the non-pivot columns of the row echelon form. These free variables may be assigned arbitrary values (often zero, to obtain a particular solution), while the pivot variables are expressed in terms of the free ones through the substitution process. The general solution is then the particular solution plus a linear combination of basis vectors spanning the null space, found by solving the homogeneous system with the free variables as parameters.[24][25][26]

Unlike full reduction to reduced row echelon form (RREF), which also eliminates the entries above each pivot so that the solution can be read off directly, back-substitution on the row echelon form suffices for extracting the solution of a triangular system and is computationally cheaper for this purpose.[10][27]
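For the square, nonsingular case, the backward pass described above can be sketched as follows in Python/NumPy; back_substitute is an illustrative name, and the routine expects the upper triangular U and updated right-hand side c produced by a forward elimination such as the sketch in the previous subsection.

import numpy as np

def back_substitute(U, c):
    # Solve U x = c for upper triangular U with nonzero diagonal,
    # working from the last equation upward.
    U = np.asarray(U, dtype=float)
    c = np.asarray(c, dtype=float)
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Subtract the contributions of the already-computed x_{i+1}, ..., x_n.
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

Combined with a forward elimination routine, this completes a basic Gaussian elimination solver in O(n^2) operations for the backward pass.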
Illustrative example
To illustrate the Gaussian elimination algorithm, consider the system of linear equations Ax = b, where
A = \begin{bmatrix} 2 & 1 & -1 \\ -3 & -1 & 2 \\ -2 & 1 & 2 \end{bmatrix}, \quad b = \begin{bmatrix} 8 \\ -11 \\ -3 \end{bmatrix}.
The goal is to solve for x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}. Begin by forming the augmented matrix [A \mid b]:
\begin{bmatrix} 2 & 1 & -1 & \mid & 8 \\ -3 & -1 & 2 & \mid & -11 \\ -2 & 1 & 2 & \mid & -3 \end{bmatrix}.

In the forward elimination phase, apply elementary row operations to transform this into row echelon form (upper triangular). First, use the pivot in position (1,1), which is 2, to eliminate the entries below it in column 1. The multiplier for row 2 is -3/2, so add (3/2) times row 1 to row 2:
\text{Row 2} \leftarrow \text{Row 2} + \frac{3}{2} \cdot \text{Row 1} \implies \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} & \mid & 1 \end{bmatrix}.
The multiplier for row 3 is -2/2 = -1, so add row 1 to row 3:
\text{Row 3} \leftarrow \text{Row 3} + \text{Row 1} \implies \begin{bmatrix} 0 & 2 & 1 & \mid & 5 \end{bmatrix}.
The matrix is now
\begin{bmatrix} 2 & 1 & -1 & \mid & 8 \\ 0 & \frac{1}{2} & \frac{1}{2} & \mid & 1 \\ 0 & 2 & 1 & \mid & 5 \end{bmatrix}.

Next, use the pivot in position (2,2), which is 1/2, to eliminate the entry below it in column 2. The multiplier for row 3 is 2 / (1/2) = 4, so subtract 4 times row 2 from row 3:
\text{Row 3} \leftarrow \text{Row 3} - 4 \cdot \text{Row 2} \implies \begin{bmatrix} 0 & 0 & -1 & \mid & 1 \end{bmatrix}.
The matrix is now
\begin{bmatrix} 2 & 1 & -1 & \mid & 8 \\ 0 & \frac{1}{2} & \frac{1}{2} & \mid & 1 \\ 0 & 0 & -1 & \mid & 1 \end{bmatrix}.

For clarity in back-substitution, scale row 2 by 2 and row 3 by -1 (optional normalization steps):
\text{Row 2} \leftarrow 2 \cdot \text{Row 2} \implies \begin{bmatrix} 0 & 1 & 1 & \mid & 2 \end{bmatrix}, \quad \text{Row 3} \leftarrow -1 \cdot \text{Row 3} \implies \begin{bmatrix} 0 & 0 & 1 & \mid & -1 \end{bmatrix}.
The upper triangular augmented matrix is
\begin{bmatrix} 2 & 1 & -1 & \mid & 8 \\ 0 & 1 & 1 & \mid & 2 \\ 0 & 0 & 1 & \mid & -1 \end{bmatrix}.

In the back-substitution phase, solve from the bottom up. From row 3: x_3 = -1. From row 2: x_2 + x_3 = 2 \implies x_2 - 1 = 2 \implies x_2 = 3.
From row 1: 2x_1 + x_2 - x_3 = 8 \implies 2x_1 + 3 - (-1) = 8 \implies 2x_1 + 4 = 8 \implies 2x_1 = 4 \implies x_1 = 2.
Thus, x = \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix}. To verify, compute Ax: A x = \begin{bmatrix} 2 & 1 & -1 \\ -3 & -1 & 2 \\ -2 & 1 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix} = \begin{bmatrix} 4 + 3 + 1 \\ -6 - 3 - 2 \\ -4 + 3 - 2 \end{bmatrix} = \begin{bmatrix} 8 \\ -11 \\ -3 \end{bmatrix} = b.
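As a numerical cross-check of this example, the short snippet below solves the same system with NumPy's built-in solver, which performs LU factorization with partial pivoting (i.e., Gaussian elimination); the forward_eliminate and back_substitute sketches from the earlier subsections produce the same result.

import numpy as np

A = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
b = np.array([8.0, -11.0, -3.0])

x = np.linalg.solve(A, b)     # LU factorization with partial pivoting under the hood
print(x)                      # [ 2.  3. -1.]
assert np.allclose(A @ x, b)  # confirms Ax = b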