Woodbury matrix identity
The Woodbury matrix identity is a key result in linear algebra that expresses the inverse of a matrix A + UCV—where A is an invertible matrix and U, C, and V are matrices of compatible dimensions with C invertible—in terms of the inverse of A and a correction term involving a smaller matrix inversion.[1] This identity enables efficient computation of matrix inverses when the update UCV has low rank, reducing the complexity from O(n^3) for an n \times n matrix to O(k^3 + nk^2), where k is the rank of the update and typically k \ll n.[2]
Named after mathematician Max A. Woodbury, the identity was first presented in his 1950 memorandum report on inverting modified matrices, building on earlier work by Sherman and Morrison for rank-one updates.[3] The general form of the identity is (A + UCV)^{-1} = A^{-1} - A^{-1}U(C^{-1} + VA^{-1}U)^{-1}VA^{-1}, which can be verified by direct multiplication to confirm it yields the identity matrix.[4] The Sherman-Morrison formula is the special case k=1, and the identity itself is also known as the matrix inversion lemma, a name common in statistical and engineering derivations.[1]
The identity's utility stems from its role in updating matrix inverses incrementally, avoiding full recomputation in iterative algorithms.[2] It is particularly valuable in numerical linear algebra for solving systems of linear equations with perturbations and in optimization problems where matrices evolve over time.[4]
In applied contexts, the Woodbury identity facilitates efficient implementations in fields like control theory and signal processing, notably in the Kalman filter for updating covariance matrices during state estimation.[4] It also appears in statistical modeling, such as ridge regression, where it simplifies the inversion of X^T X + \lambda I by treating the data term X^T X as a low-rank update of the regularization \lambda I.[5] Additionally, it supports scalable computations in machine learning tasks involving Gaussian processes and kernel methods, enabling predictions without inverting large kernel matrices directly.[1]
Statement
The Woodbury matrix identity provides a formula for the inverse of a matrix that has been modified by a low-rank perturbation. Specifically, let A be an invertible n \times n matrix over the real or complex numbers, U an n \times k matrix, C an invertible k \times k matrix, and V a k \times n matrix. Then,
(A + U C V)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1},
provided that C^{-1} + V A^{-1} U is invertible.[6]
This identity expresses the inverse of the perturbed matrix A + U C V in terms of the known inverse A^{-1} and a correction term involving matrices of reduced dimension k. The perturbation U C V has rank at most k, making the formula particularly valuable for low-rank updates where direct inversion of the full n \times n matrix would be inefficient.[3]
When k \ll n, applying the identity avoids the O(n^3) cost of full matrix inversion, instead requiring operations dominated by O(k^2 n + k^3) time for matrix multiplications and the inversion of the k \times k matrix C^{-1} + V A^{-1} U.
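As a concrete illustration, the following NumPy sketch (an illustrative check, not an optimized implementation) verifies the identity on random matrices and shows that only a k \times k matrix has to be inverted once A^{-1} is available:

import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 5
A = rng.standard_normal((n, n)) + n * np.eye(n)    # invertible n x n matrix
U = rng.standard_normal((n, k))
C = rng.standard_normal((k, k)) + k * np.eye(k)     # invertible k x k matrix
V = rng.standard_normal((k, n))

A_inv = np.linalg.inv(A)
inner = np.linalg.inv(C) + V @ A_inv @ U            # k x k inner matrix
woodbury = A_inv - A_inv @ U @ np.linalg.solve(inner, V @ A_inv)
direct = np.linalg.inv(A + U @ C @ V)
print(np.max(np.abs(woodbury - direct)))            # small, near machine precision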
Notation and assumptions
The Woodbury matrix identity employs standard notation for matrices over the real numbers, where A \in \mathbb{R}^{n \times n} is an invertible matrix, U \in \mathbb{R}^{n \times k} and V \in \mathbb{R}^{k \times n} are rectangular matrices, and C \in \mathbb{R}^{k \times k} is an invertible square matrix, with k \leq n often chosen to be small for computational purposes.[Golub and Van Loan (2013). Matrix Computations (4th ed.). Johns Hopkins University Press, p. 70.] These dimensions ensure that the low-rank perturbation U C V \in \mathbb{R}^{n \times n} is compatible with A, forming the modified matrix A + U C V \in \mathbb{R}^{n \times n}.
Key assumptions include the invertibility of A and C, which are necessary for the direct application of the identity; without these, the expressions involving A^{-1} and C^{-1} are undefined.[Petersen, K. B., & Pedersen, M. S. (2012). The Matrix Cookbook. Technical University of Denmark, Section 3.2.2.] Additionally, the inner matrix C^{-1} + V A^{-1} U \in \mathbb{R}^{k \times k} must be invertible, as its singularity would prevent the formula from yielding the correct inverse of A + U C V, even if the latter is invertible.[Golub and Van Loan (2013). Matrix Computations (4th ed.). Johns Hopkins University Press, p. 70.] For cases where A or C are singular, extensions using Moore-Penrose pseudoinverses exist but require separate treatment beyond the classical assumptions.[Afra, S., & Ascher, U. M. (2022). A singular Woodbury matrix identity and application to Gaussian processes. arXiv preprint arXiv:2207.08038.]
In terms of compatibility, the matrix products are well-defined under the specified dimensions, ensuring A + U C V remains square and the operations align.[Petersen, K. B., & Pedersen, M. S. (2012). The Matrix Cookbook. Technical University of Denmark, Section 3.2.2.] Edge cases highlight the identity's scope: when k=0, U and V effectively vanish, reducing the identity to the trivial (A)^{-1} = A^{-1}.[Golub and Van Loan (2013). Matrix Computations (4th ed.). Johns Hopkins University Press, p. 70.] If C is singular, C^{-1} is undefined and the classical formula does not apply directly, potentially requiring alternative approaches such as pseudoinverse-based extensions.[Afra, S., & Ascher, U. M. (2022). A singular Woodbury matrix identity and application to Gaussian processes. arXiv preprint arXiv:2207.08038.]
History
Early developments
The foundations of what would later contribute to the Woodbury matrix identity trace back to 19th-century developments in determinant theory and linear systems, particularly through Augustin-Louis Cauchy's work on adjugates and cofactor expansions. In 1812, Cauchy formalized the concept of the adjugate matrix (then referred to in the context of reciprocal determinants), providing a method to express the inverse of a matrix as the adjugate divided by the determinant.[7] This approach, rooted in solving systems via minors, laid essential groundwork for handling perturbations to matrix inverses without recomputing from scratch.[7]
Early 20th-century advancements built on these ideas through studies of partitioned and block matrices. Ferdinand Georg Frobenius in 1908 derived a determinant formula for bordered matrices, establishing key identities for block structures that anticipated low-rank modifications.[8] Issai Schur's 1917 work on power series further developed these concepts, introducing a lemma on block determinants—now known as the Schur complement—that provided an indirect foundation for identities involving low-rank updates by relating the determinant of a block matrix to its submatrices.[9] Schur's result, while focused on bounded power series, highlighted connections between submatrices and overall structure.[8]
In the 1930s and 1940s, several researchers derived special cases of inverse updates in specific contexts, such as numerical analysis and statistics. Tadeusz Banachiewicz in 1937 explicitly presented the partitioned matrix inverse formula, later termed the Schur-Banachiewicz inverse, which facilitated computations for bordered or updated systems.[8] Harold Hotelling in 1943 contributed methods for matrix calculations involving nonsingular blocks, while William J. Duncan in 1944 provided representations of inverses in partitioned systems, including forms akin to rank-k perturbations, often in the context of solving large linear equations from engineering applications.[3] These efforts, along with Alexander Aitken's 1937 work on bordered determinants building on earlier determinant theory, produced scattered results for rank-1 and low-rank updates, particularly in interpolation and statistical estimation, but lacked a cohesive framework.[8]
Despite these contributions, early developments remained fragmented, with no unified general low-rank update identity until the mid-20th century; for instance, the 1949 Sherman-Morrison formula emerged as a prominent special case for rank-1 updates in adjustment computations.[3]
Woodbury's contribution and naming
The general form of the Woodbury matrix identity, extending the rank-1 perturbation case to arbitrary low-rank updates of rank k, was first formally stated by Max A. Woodbury in his 1950 memorandum report titled "Inverting Modified Matrices." This work, prepared as part of the Statistical Research Group at Princeton University, emphasized applications in statistical computations, such as efficiently updating matrix inverses for modified covariance structures in multivariate analysis. Around the same time, R. L. Plackett independently derived the identity in 1950 for updating least-squares estimates.[3]
The identity bears Woodbury's name and is widely known as the Woodbury matrix identity, reflecting his pivotal role in presenting the general rank-k version. Alternative names include the matrix inversion lemma, highlighting its utility in avoiding full matrix inversions, and the Woodbury-Sherman-Morrison formula, which acknowledges the foundational rank-1 case developed by Jack Sherman and Winifred J. Morrison in their 1949 paper "Adjustment of an Inverse Matrix Corresponding to Changes in the Elements of a Given Column or a Given Row of the Original Matrix." Attribution often pairs Woodbury with Sherman and Morrison due to the latter's earlier specific contribution, but Woodbury's generalization marked a significant unification for broader numerical and statistical use.[10][11]
Although Woodbury's report synthesized and explicitly stated the general identity for statistical purposes, similar ideas appeared in scattered forms prior to 1950, such as W.J. Duncan's 1944 derivation via block Jordan elimination for partitioned matrices and L. Guttman's 1946 presentation in the context of matrix enlargement methods. Initially disseminated through technical reports rather than peer-reviewed journals, the identity's impact grew in the 1970s and 1980s, as it became a standard tool in numerical linear algebra, featured prominently in seminal texts like Gene H. Golub and Charles F. Van Loan's "Matrix Computations" (first edition, 1983).[3][10]
Special Cases
The Sherman–Morrison formula is a special case of the Woodbury matrix identity that addresses rank-1 updates to an invertible matrix A \in \mathbb{R}^{n \times n}. It provides an efficient way to compute the inverse of A + uv^T, where u, v \in \mathbb{R}^n are column vectors, without fully inverting the perturbed matrix from scratch. The formula states:
(A + uv^T)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u},
assuming the denominator is nonzero to ensure invertibility. This expression allows for the update using only matrix-vector multiplications with the existing inverse A^{-1}, making it computationally advantageous for scenarios involving sequential modifications.
This rank-1 form arises directly from the general Woodbury identity by setting the perturbation rank k=1, with U = u (an n \times 1 matrix), C = 1 (a 1 \times 1 matrix, i.e., a scalar), and V = v^T (a 1 \times n matrix). The resulting simplification highlights how a single outer product perturbation can be handled via a low-rank correction to the original inverse, preserving much of the prior computational work.[10]
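A minimal NumPy sketch of this rank-1 update (assuming the current inverse A^{-1} is already stored) is shown below; each call costs O(n^2) operations instead of the O(n^3) required to re-invert from scratch:

import numpy as np

def sherman_morrison_update(A_inv, u, v):
    # Returns (A + u v^T)^{-1} given A_inv = A^{-1}; u and v are length-n vectors.
    Au = A_inv @ u                       # A^{-1} u
    vA = v @ A_inv                       # v^T A^{-1}
    denom = 1.0 + v @ Au                 # 1 + v^T A^{-1} u, assumed nonzero
    return A_inv - np.outer(Au, vA) / denom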
Historically, the formula was introduced in the late 1940s by Jack Sherman and Winifred J. Morrison through their work on adjusting matrix inverses for changes in specific elements or rows/columns, with the general rank-1 version appearing in their 1950 publication. In applications, it serves as a foundational tool for incremental updates in online learning algorithms, where models adapt to streaming data by efficiently maintaining inverses of covariance-like matrices under sequential rank-1 perturbations.
Inverse of a sum
The inverse of the sum of two square invertible matrices A and B of the same dimension n \times n can be expressed as a special case of the Woodbury matrix identity by substituting U = I_n, C = B, and V = I_n into the general form, resulting in
(A + B)^{-1} = A^{-1} - A^{-1} (A^{-1} + B^{-1})^{-1} A^{-1},
provided that A^{-1} + B^{-1} is invertible.[3] This expression assumes A and B are invertible and requires the invertibility of their sum of inverses to ensure the overall formula holds.[3]
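The reduction is immediate: substituting U = V = I_n and C = B into the general identity turns the inner matrix into

C^{-1} + V A^{-1} U = B^{-1} + A^{-1},

so the correction term A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1} becomes A^{-1} (A^{-1} + B^{-1})^{-1} A^{-1}, recovering the expression above.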
Unlike the general Woodbury identity, which is optimized for low-rank perturbations where the computational cost can be reduced to O(n k^2 + k^3) for rank k \ll n, this full-rank case for equal-sized matrices A and B involves inverting an n \times n matrix A^{-1} + B^{-1}, leading to an overall complexity of O(n^3).[3] Consequently, it offers no asymptotic efficiency gains over direct inversion methods like Gaussian elimination but can be advantageous when A^{-1} and B^{-1} are already available or when analyzing structured sums.[3]
This formula finds application in deriving theoretical bounds within matrix perturbation theory, particularly for analyzing the sensitivity of inverses to full-rank additive perturbations, such as estimating \|(A + B)^{-1} - A^{-1}\| in terms of norms of A^{-1} and B.[12] However, it requires B to be invertible, limiting its use to scenarios without singular perturbations, and does not leverage rank deficiency, making it less practical for large-scale numerical computations compared to low-rank variants.[3]
Variations and Generalizations
Binomial inverse theorem
The binomial inverse theorem is an alternative name for the Woodbury matrix identity in the form (A + U B V)^{-1} = A^{-1} - A^{-1} U (B^{-1} + V A^{-1} U)^{-1} V A^{-1}, where A is an n \times n invertible matrix, B is a k \times k invertible matrix, and U, V are matrices of compatible dimensions n \times k and k \times n.[8] This formulation highlights the structure for low-rank updates and is equivalent to the general Woodbury identity.
A related concept is the binomial series expansion for perturbed matrix powers, applicable when the matrices commute. For an invertible matrix A and a perturbation B that commutes with A, with the spectral radius of A^{-1} B less than 1,
(A + B)^{-m} = A^{-m} \sum_{k=0}^{\infty} \binom{-m}{k} (A^{-1} B)^k,
where \binom{-m}{k} is the generalized binomial coefficient. This infinite series provides an approximation when the closed-form Woodbury is not directly applicable, such as for higher powers or non-low-rank perturbations, but requires verification of convergence conditions.
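The following NumPy sketch illustrates the convergence requirement on a toy example in which A and B commute by construction (B is a polynomial in A) and the spectral radius of A^{-1} B is well below 1; the partial sums of the series approach the exact inverse power:

import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 2
A = np.diag(rng.uniform(1.0, 2.0, n))           # invertible; eigenvalues in [1, 2]
B = 0.1 * A @ A                                 # commutes with A; spectral radius of A^{-1} B is at most 0.2
A_inv = np.linalg.inv(A)
exact = np.linalg.inv(np.linalg.matrix_power(A + B, m))

A_neg_m = np.linalg.matrix_power(A_inv, m)
X = A_inv @ B
approx = np.zeros_like(A)
coeff, power = 1.0, np.eye(n)
for j in range(30):
    approx = approx + coeff * A_neg_m @ power   # add binom(-m, j) * A^{-m} (A^{-1} B)^j
    coeff *= (-m - j) / (j + 1)                 # next generalized binomial coefficient
    power = power @ X
print(np.max(np.abs(approx - exact)))           # decreases toward zero as terms are added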
In applications, such expansions appear in deriving moments for perturbed stochastic processes and in combinatorial matrix analysis, though non-commuting generalizations are more complex.[13]
Pseudoinverse with semidefinite matrices
Generalizations of the Woodbury identity extend to the Moore-Penrose pseudoinverse, particularly useful for positive semidefinite matrices that may be singular. When the standard identity fails due to singularity in the inner matrix, a singular Woodbury formula applies, allowing computation of pseudoinverses for low-rank perturbations of semidefinite matrices.[14]
This is valuable in scenarios where A is positive semidefinite but rank-deficient, and the update U C U^* (with C positive semidefinite) adjusts the rank. Conditions such as range inclusion ensure the pseudoinverse properties hold, preventing violations of the Penrose equations.
In kernel methods and Gaussian process models, these pseudoinverse variants enable efficient updates to covariance or precision matrices, especially with indefinite or singular kernels. For example, in regression tasks with rank-deficient Gram matrices, the formula supports incremental computations without full pseudoinverse recalculation.[14]
Proofs
Direct algebraic proof
The direct algebraic proof verifies the Woodbury matrix identity by direct multiplication, confirming that pre-multiplying the proposed expression for the inverse by A + UCV yields the identity matrix. This approach relies solely on the axioms of matrix algebra and assumes that A and C are invertible square matrices of compatible dimensions, with U and V being rectangular matrices such that all products are defined, and that the inner matrix C^{-1} + VA^{-1}U is also invertible to ensure the expression is well-defined.[15]
Consider the proposed inverse
X = A^{-1} - A^{-1}U \left(C^{-1} + VA^{-1}U\right)^{-1} VA^{-1}.
For brevity, denote K = \left(C^{-1} + VA^{-1}U\right)^{-1}, so
X = A^{-1} - A^{-1}UKVA^{-1}.
Now compute the product
(A + UCV)X = A(A^{-1} - A^{-1}UKVA^{-1}) + UCV(A^{-1} - A^{-1}UKVA^{-1}).
The first component simplifies as
AA^{-1} - AA^{-1}UKVA^{-1} = I - UKVA^{-1},
since AA^{-1} = I. The second component expands to
UCVA^{-1} - UCVA^{-1}UKVA^{-1}.
Combining both, the full product is
I - UKVA^{-1} + UCVA^{-1} - UCVA^{-1}UKVA^{-1}.
The terms following I must sum to zero:
-UKVA^{-1} + UCVA^{-1} - UCVA^{-1}UKVA^{-1} = U\left(CVA^{-1} - KVA^{-1} - CVA^{-1}UKVA^{-1}\right).
Focus on the expression inside the parentheses:
CVA^{-1} - KVA^{-1} - C(VA^{-1}U)KVA^{-1}.
Let S = VA^{-1}U for compactness, so this becomes
CVA^{-1} - KVA^{-1} - CSKVA^{-1} = \left(C - K - CSK\right)VA^{-1}.
It remains to show that C - K - CSK = 0. By definition of K,
K^{-1} = C^{-1} + S,
which rearranges to
S = K^{-1} - C^{-1}.
Multiplying on the left by C gives
CS = CK^{-1} - I,
and post-multiplying by K yields
CSK = (CK^{-1} - I)K = C - K.
Substituting back,
C - K - CSK = C - K - (C - K) = 0.
Thus, the parenthetical expression is the zero matrix, the terms following I vanish, and (A + UCV)X = I. To complete the verification, one can similarly confirm X(A + UCV) = I, though the above already suffices because a one-sided inverse of a square matrix is automatically a two-sided inverse. This establishes the identity under the stated assumptions.[4]
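The same verification can be carried out symbolically; the short SymPy sketch below checks the n = 2, k = 1 instance with all entries left as free symbols (an illustrative check, not part of the proof itself):

import sympy as sp

a11, a12, a21, a22, u1, u2, v1, v2, c = sp.symbols('a11 a12 a21 a22 u1 u2 v1 v2 c')
A = sp.Matrix([[a11, a12], [a21, a22]])
U = sp.Matrix([[u1], [u2]])
V = sp.Matrix([[v1, v2]])
C = sp.Matrix([[c]])

K = (C.inv() + V * A.inv() * U).inv()        # inner 1 x 1 matrix
X = A.inv() - A.inv() * U * K * V * A.inv()  # proposed inverse
print(sp.simplify((A + U * C * V) * X))      # simplifies to the 2 x 2 identity matrix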
Derivation via block matrices
One approach to deriving the Woodbury matrix identity utilizes the inversion formula for block matrices, providing insight into how low-rank updates affect the inverse through the structure of Schur complements. Consider the augmented block matrix
M = \begin{pmatrix}
A & U \\
-V & C^{-1}
\end{pmatrix},
where A is an n \times n invertible matrix, U is n \times k, V is k \times n, and C is a k \times k invertible matrix. The Schur complement of the bottom-right block C^{-1} in M is A - U (C^{-1})^{-1} (-V) = A + U C V, assuming the necessary invertibility conditions hold. Thus, the top-left block of M^{-1} is precisely (A + U C V)^{-1}.[16]
To express this inverse in terms of A^{-1}, apply the general block matrix inversion formula. For a block matrix \begin{pmatrix} P & Q \\ R & S \end{pmatrix} with P invertible, the top-left block of its inverse is P^{-1} + P^{-1} Q (S - R P^{-1} Q)^{-1} R P^{-1}, where the term S - R P^{-1} Q is the Schur complement of the top-left block and is assumed invertible. Substituting P = A, Q = U, R = -V, and S = C^{-1} yields the Schur complement C^{-1} - (-V) A^{-1} U = C^{-1} + V A^{-1} U. Therefore,
(A + U C V)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}.
[16]
This derivation highlights the geometric interpretation of the Woodbury identity as a bordering process, where the rank-k update U C V corresponds to adding "borders" to the matrix A, akin to steps in Gaussian elimination on bordered systems. The Schur complement S = C^{-1} + V A^{-1} U encapsulates the interaction between the original matrix and the update, reducing the problem to inverting a smaller k \times k matrix. This block perspective extends naturally to indefinite or singular cases by relaxing invertibility assumptions on A or C, provided the overall matrix remains invertible, offering advantages over purely algebraic manipulations for understanding low-rank perturbations.[17]
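The block construction can also be checked numerically; the NumPy sketch below (an illustrative example) assembles the augmented matrix M, inverts it directly, and compares its top-left block with the Woodbury expression:

import numpy as np

rng = np.random.default_rng(2)
n, k = 8, 3
A = rng.standard_normal((n, n)) + n * np.eye(n)
U = rng.standard_normal((n, k))
V = rng.standard_normal((k, n))
C = rng.standard_normal((k, k)) + k * np.eye(k)

M = np.block([[A, U], [-V, np.linalg.inv(C)]])      # augmented (n + k) x (n + k) matrix
top_left = np.linalg.inv(M)[:n, :n]                 # top-left block of M^{-1}

A_inv = np.linalg.inv(A)
inner = np.linalg.inv(C) + V @ A_inv @ U
woodbury = A_inv - A_inv @ U @ np.linalg.solve(inner, V @ A_inv)
print(np.max(np.abs(top_left - woodbury)))          # both equal (A + U C V)^{-1}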
Applications
In estimation and filtering
The Woodbury matrix identity is essential for efficient covariance updates in the Kalman filter, a cornerstone algorithm in state estimation for dynamic systems. In the information filter formulation, which operates on the inverse covariance matrix, the update step computes the posterior inverse covariance as P_t^{-1} = P_{t-1}^{-1} + H^T R^{-1} H, where H is the observation model matrix and R is the measurement noise covariance. This form leverages the identity to perform low-rank updates, reducing the computational cost from O(n^3) for full matrix inversion to O(n^2 + m^3), with n the state dimension and m the measurement dimension typically much smaller than n. This efficiency is particularly valuable in real-time applications like navigation and control, where recursive processing of sequential data is required.[4][18]
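Written out, the Woodbury identity turns the information-form update P_t^{-1} = P_{t-1}^{-1} + H^T R^{-1} H into the familiar covariance-form update P_t = P_{t-1} - P_{t-1} H^T (R + H P_{t-1} H^T)^{-1} H P_{t-1}, in which only an m \times m system is solved. A minimal NumPy sketch (variable names are illustrative, not tied to any particular filtering library):

import numpy as np

def covariance_update(P_prior, H, R):
    # Posterior covariance (P_prior^{-1} + H^T R^{-1} H)^{-1}, evaluated via the
    # Woodbury identity so that only the m x m innovation covariance is factorized.
    S = R + H @ P_prior @ H.T                 # m x m innovation covariance
    K = np.linalg.solve(S, H @ P_prior)       # S^{-1} H P_prior, an m x n block
    return P_prior - P_prior @ H.T @ K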
In recursive least squares (RLS) methods for adaptive filtering, the Sherman-Morrison formula—a rank-1 special case of the Woodbury identity—enables incremental updates to the inverse correlation matrix as new observations arrive. The gain vector is updated via K_t = P_{t-1} \phi_t (\lambda + \phi_t^T P_{t-1} \phi_t)^{-1}, followed by a rank-1 correction to the inverse covariance P_t = \frac{1}{\lambda} (P_{t-1} - K_t \phi_t^T P_{t-1}), where \phi_t is the regressor vector and \lambda the forgetting factor. This avoids repeated full inversions, allowing the algorithm to track slowly varying system parameters in signal processing tasks such as echo cancellation and channel equalization with quadratic complexity per iteration.[19]
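A compact NumPy sketch of one such RLS step (illustrative only; the symbols follow the formulas above, and P is kept symmetric) is:

import numpy as np

def rls_step(P, w, phi, y, lam=0.99):
    # One recursive least squares update with forgetting factor lam.
    # P is the current inverse correlation matrix, w the parameter estimate,
    # phi the new regressor vector, y the new scalar observation.
    P_phi = P @ phi
    k = P_phi / (lam + phi @ P_phi)           # gain vector; denominator is the Sherman-Morrison scalar
    w = w + k * (y - phi @ w)                 # correct the estimate by the prediction error
    P = (P - np.outer(k, P_phi)) / lam        # rank-1 update of the inverse correlation matrix
    return P, w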
The identity also supports scalable Bayesian inference in Gaussian processes by incorporating low-rank structure into priors and likelihoods, circumventing the need to invert large full-rank covariance matrices. For instance, inducing point methods approximate the noisy training covariance by a low-rank-plus-diagonal form K_y \approx \sigma^2 I + K_{fu} K_{uu}^{-1} K_{uf}, where the subscripts u and f denote inducing and full data points; applying the Woodbury identity yields the efficient inverse K_y^{-1} = \sigma^{-2} I - \sigma^{-4} K_{fu} (K_{uu} + \sigma^{-2} K_{uf} K_{fu})^{-1} K_{uf}, with complexity dominated by the much smaller inducing set size. This approach is critical for high-dimensional regression and uncertainty quantification in machine learning.[14]
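A minimal NumPy sketch of this computation, applying K_y^{-1} to a vector of training targets without forming any n \times n inverse (names such as low_rank_gp_solve are illustrative):

import numpy as np

def low_rank_gp_solve(K_fu, K_uu, sigma2, y):
    # Solves (sigma2 * I + K_fu K_uu^{-1} K_fu^T) v = y via the Woodbury identity.
    # K_fu is n x m (training rows, inducing columns), K_uu is m x m, y has length n.
    K_uf = K_fu.T
    inner = K_uu + (K_uf @ K_fu) / sigma2     # m x m matrix; only this is factorized
    return y / sigma2 - K_fu @ np.linalg.solve(inner, K_uf @ y) / sigma2**2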
A practical example arises in GPS signal processing, where the Kalman filter fuses pseudorange measurements from multiple satellites, introducing low-rank noise updates to the covariance; the Woodbury identity efficiently incorporates these via the measurement model H^T R^{-1} H, maintaining real-time performance in receiver tracking loops amid multipath and ionospheric disturbances.[20]
In numerical optimization
The Woodbury matrix identity plays a key role in numerical optimization by enabling efficient updates to matrix inverses under low-rank perturbations, which is essential for handling large-scale problems where direct inversion would be computationally prohibitive.
In interior-point methods, low-rank updates to approximate slack matrices arising from barrier functions and Lagrange multipliers reduce the per-iteration cost in semidefinite programming from O(n^\omega) to sums involving smaller matrices of size r, improving the overall runtime of interior-point solvers for convex optimization to \tilde{O}(\sqrt{n} (mn^2 + m^\omega + n^\omega)) \log(1/\epsilon), making it viable for high-dimensional problems.
Quasi-Newton methods, such as BFGS, leverage the Woodbury identity to update approximations of the inverse Hessian with rank-2 corrections based on curvature information from gradient differences. This applies the identity to avoid full reinversion, achieving O(n^2) cost per iteration while preserving positive definiteness for convergence in unconstrained optimization.
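For illustration, the standard closed-form update of the inverse Hessian approximation used in BFGS, classically derived via the Sherman-Morrison-Woodbury formula applied to the rank-2 secant update, fits in a few lines of NumPy (a sketch, not a complete optimizer):

import numpy as np

def bfgs_inverse_update(H, s, y):
    # H approximates the inverse Hessian; s = x_{k+1} - x_k, y = grad_{k+1} - grad_k.
    rho = 1.0 / (y @ s)                        # requires the curvature condition y^T s > 0
    I = np.eye(len(s))
    W = I - rho * np.outer(s, y)
    return W @ H @ W.T + rho * np.outer(s, s)  # rank-2 update preserving symmetric positive definiteness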
In support vector machines (SVMs), the dual formulation requires inverting a kernel matrix K = X X^T + \lambda J J^T, where J is a vector of ones; the Woodbury identity computes this inverse efficiently as (K)^{-1} = (X X^T)^{-1} - (X X^T)^{-1} J ( \lambda^{-1} I + J^T (X X^T)^{-1} J )^{-1} J^T (X X^T)^{-1}, enabling low-rank approximations for large-scale kernel methods and facilitating quadratic optimization of the margin. This is particularly useful in high-dimensional settings like neuroimaging classification, where it supports inference on support vectors without direct large-matrix inversion.
For example, in large-scale logistic regression with vertically partitioned data, the Woodbury identity handles rank-k updates from feature additions by inverting modified Gram matrices, such as (I + \eta I_{p+1})^{-1}, via (A + U C V)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}, allowing distributed optimization of parameters and standard errors across nodes with minimal communication. This approach scales to high-dimensional datasets by incorporating new covariates X^{(k)} incrementally, reducing the need for full matrix operations in privacy-preserving settings.[21]
Numerical Considerations
Computational stability
The numerical stability of the Woodbury matrix identity arises primarily from the computation of the inner matrix inverse (C^{-1} + V A^{-1} U)^{-1}, which can become ill-conditioned for certain choices of the low-rank factors U and V, even when the full matrix A + U C V is well-conditioned. In such cases, the errors associated with inverting or solving systems involving this k \times k inner matrix can dominate the overall computation, leading to a worthless approximate solution. The condition number of the inner matrix is bounded by \kappa(C + V A^{-1} U) \leq \kappa(A) \kappa(A + U C V), highlighting how the conditioning of A and the full perturbed matrix propagates to the inner term.
Severe cancellation errors can occur in floating-point arithmetic when V A^{-1} U \approx -C^{-1}, causing subtractive cancellation in forming the inner matrix and amplifying rounding errors. This instability is particularly pronounced in applications where the rank k is small but the matrices are large and nearly singular, as the formula's efficiency relies on accurate inner computations. If the original matrices A and A + U C V are well-conditioned, the overall procedure remains backward stable, but the inner term's potential ill-conditioning necessitates careful monitoring.
To mitigate these issues, especially for symmetric positive semidefinite matrices, the Cholesky factorization of A is recommended, as it exploits symmetry, halves the computational cost compared to general methods, and avoids the need for pivoting, thereby enhancing stability. For symmetric indefinite cases, the LDL^T factorization is preferred over LU decomposition to circumvent pivoting-related instabilities and avoid square root operations that may introduce errors from negative arguments due to rounding. Orthogonal preconditioning of U and V can also improve conditioning in some settings.
The computational cost breakdown underscores the trade-off with stability: forming A^{-1} U and V A^{-1} U requires operations costing O(n^2 k + n k^2), while inverting the k \times k inner matrix costs O(k^3); this yields substantial savings over direct O(n^3) inversion when k \ll n, but the inner inversion's stability must be verified to ensure overall reliability.
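In code, these costs are usually realized by solving linear systems rather than forming explicit inverses; the NumPy sketch below (a hedged example assuming A, C, and the inner matrix are well-conditioned; in the symmetric positive definite case a Cholesky factorization of A would be computed once and reused) applies (A + U C V)^{-1} to a right-hand side:

import numpy as np

def woodbury_solve(A, U, C, V, b):
    # Solves (A + U C V) x = b without forming any explicit n x n inverse.
    sol = np.linalg.solve(A, np.column_stack([b, U]))   # one factorization of A for all right-hand sides
    A_inv_b, A_inv_U = sol[:, 0], sol[:, 1:]
    inner = np.linalg.inv(C) + V @ A_inv_U              # k x k inner matrix, O(k^3) to factorize
    return A_inv_b - A_inv_U @ np.linalg.solve(inner, V @ A_inv_b)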
Implementations of the Woodbury identity appear in numerical libraries like LAPACK for symmetric matrices, often integrated into routines for low-rank updates, with built-in checks for near-singularity to alert users to potential instability.[22]
Error bounds and implementation
The backward error analysis of the Woodbury matrix identity reveals that the computed inverse \widetilde{(A + UCV)^{-1}} satisfies a perturbed equation where the perturbation \Delta B on B = A + UCV is bounded by \|\Delta B\|_2 \leq 2\epsilon_1 \|A\|_2^2 + 8\epsilon_2, with \epsilon_1 denoting the absolute error in the 2-norm of the approximate inverse of A and \epsilon_2 the absolute error in the 2-norm of the inversion of the capacitance matrix C^{-1} + V A^{-1} U, under assumptions that \epsilon_1, \epsilon_2 < 1/(2\lambda \alpha) and \sigma_{\min}(A) \leq 1, where \lambda relates to the update size and \alpha to the norm of A^{-1}.[23] This bound indicates that the backward error is generally small, scaling with machine precision \epsilon times factors involving \|A\|_2^2 and the update term, but it can amplify if the condition number of A is high or if the approximate inverses are poorly computed. In practice, the error remains controlled provided the original matrix A and the low-rank factors are well-conditioned, aligning with broader results showing backward stability when \kappa(A) and \kappa(B) are moderate.[24]
Forward error bounds for the computed inverse quantify the deviation \|\widetilde{B^{-1}} - B^{-1}\|_2 \leq 2\epsilon_2 \|A^{-1}\|_2 + 12\epsilon_1, again under similar small-error assumptions and with the forward error dominated by terms like \epsilon \alpha^2 \lambda^2 \|A^{-1}\|_2^2 for larger updates.[23] These estimates highlight sensitivity to the condition number of B = A + UCV, as the error grows with \kappa(B) and inversely with the singular value gap of B, potentially leading to larger relative errors if singular values cluster near zero. Numerical experiments confirm these bounds hold tightly for small-rank updates, with observed errors within a factor of 10 of the theoretical predictions in double precision.[23]
For practical implementation, precomputing the inverse A^{-1} (or a factorization) is advisable when A is fixed across multiple updates, as it reduces repeated expensive operations; however, in high dimensions without sparsity, direct application should be avoided due to O(k^3 + n^2 k + n k^2) complexity potentially exceeding full inversion costs for large k. Iterative refinement can enhance precision by solving residual-corrected systems, mitigating amplification from ill-conditioned capacitance matrices. In the symmetric case where A and C are symmetric positive definite and V = U, the identity simplifies to B^{-1} = A^{-1} - A^{-1} U (C^{-1} + U^T A^{-1} U)^{-1} U^T A^{-1}, which preserves symmetry if implemented carefully.
The following pseudocode illustrates a stable matrix-multiply-based implementation for the symmetric case, assuming access to A^{-1}:
function symmetric_woodbury(A_inv, U, C_inv):
    # Inputs: A_inv = A^{-1} (n x n symmetric), U (n x k), C_inv = C^{-1} (k x k symmetric)
    # Output: B_inv ≈ (A + U C U^T)^{-1} (n x n symmetric)
    tmp = A_inv * U                       # n x k intermediate, A^{-1} U
    S = C_inv + U^T * tmp                 # k x k capacitance matrix
    S_inv = invert(S)                     # invert small k x k matrix
    B_inv = A_inv - tmp * S_inv * tmp^T   # n x n update; tmp^T equals U^T A^{-1} by symmetry of A_inv
    return B_inv
This approach leverages BLAS-level operations for efficiency and numerical robustness, with error controlled by the bounds above when using high-quality linear algebra libraries.[23]
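For reference, a direct NumPy transcription of the pseudocode above (a sketch under the same assumptions, namely that A_inv and C_inv are available and symmetric) could read:

import numpy as np

def symmetric_woodbury(A_inv, U, C_inv):
    # Computes (A + U C U^T)^{-1} from A^{-1} and C^{-1}, preserving symmetry.
    tmp = A_inv @ U                           # n x k intermediate, A^{-1} U
    S = C_inv + U.T @ tmp                     # k x k capacitance matrix
    B_inv = A_inv - tmp @ np.linalg.solve(S, tmp.T)
    return 0.5 * (B_inv + B_inv.T)            # re-symmetrize against rounding errors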