QR algorithm

The QR algorithm is an iterative procedure in numerical linear algebra for computing the eigenvalues of a square matrix, which repeatedly decomposes the current iterate into an orthogonal matrix Q and an upper triangular matrix R via QR decomposition, then forms the product RQ as the next iterate, converging to a form where the eigenvalues appear on the diagonal (for symmetric matrices) or in Schur triangular form (for general matrices). Developed independently in the late 1950s and early 1960s by John G. F. Francis in the United Kingdom and Vera N. Kublanovskaya in the Soviet Union, the algorithm built upon earlier methods like the LR transformation and addressed stability issues in eigenvalue computation, with Francis's foundational papers published in The Computer Journal in 1961 and 1962. For symmetric matrices, the shifted QR algorithm exhibits quadratic to cubic convergence rates near the eigenvalues, while for nonsymmetric matrices convergence can be linear and erratic without modifications, necessitating reduction to upper Hessenberg form prior to iteration for computational efficiency. Practical implementations, such as the implicitly shifted QR algorithm introduced by Francis and later refined with Wilkinson shifts or multishift strategies, accelerate convergence by applying shifts without explicitly forming the shifted matrix, making the method robust and widely applicable to dense eigenvalue problems in fields like physics, engineering, and statistics. Recognized as one of the top 10 algorithms of the 20th century by the IEEE Computer Society for its transformative impact on scientific computing, the QR algorithm underpins modern software libraries like LAPACK and EISPACK, enabling reliable solutions to the eigenvalue problem that is central to stability analysis, vibration modes, and quantum mechanics. Key variants, including the double-shift QR for real matrices and the QZ algorithm for generalized eigenvalue problems, extend its utility while preserving backward stability, ensuring small perturbations in input yield correspondingly small errors in computed eigenvalues.

Background

The Eigenvalue Problem

In linear algebra, the eigenvalue problem for an n \times n matrix A seeks scalars \lambda (eigenvalues) and corresponding non-zero vectors v (eigenvectors) satisfying the equation A v = \lambda v. These values reveal intrinsic properties of the matrix, such as its spectral structure, and form the basis for understanding linear transformations. The eigenvalues \lambda are precisely the roots of the characteristic polynomial defined by \det(A - \lambda I) = 0, where I is the identity matrix; solving this yields the spectrum of A. For matrices over the complex numbers, there are always n eigenvalues counting multiplicities, though they may be complex even for real A. Eigenvalues have broad applications across disciplines. In stability analysis of dynamical systems, the signs of the real parts of eigenvalues determine whether equilibria are stable or unstable. In structural engineering, they identify vibration modes and natural frequencies of structures, aiding in design against resonance. In quantum mechanics, eigenvalues represent discrete energy levels of physical systems, fundamental to solving the Schrödinger equation. Direct computation of eigenvalues presents formidable challenges, particularly for large-scale or non-symmetric matrices. The characteristic polynomial approach becomes impractical beyond small n because root-finding is numerically unstable and polynomial roots are extremely sensitive to perturbations of the coefficients. Non-symmetric matrices often exhibit clustered or defective eigenvalues, rendering them highly sensitive to perturbations, where small changes in A can cause large shifts in the spectrum. A key theoretical resolution to the eigenvalue problem is the Schur decomposition, which asserts that every square matrix A over the complex numbers admits a unitary matrix Q and an upper triangular matrix T such that A = Q T Q^*, with the eigenvalues of A on the diagonal of T. This triangular form simplifies eigenvalue extraction and underpins numerical algorithms, though computing it requires iterative methods leveraging orthogonal transformations.
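As a concrete check of the defining relation, the following NumPy sketch (the 3×3 matrix is an arbitrary illustrative choice) computes eigenpairs and verifies A v = \lambda v up to rounding:
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)
for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    # Residual ||A v - lambda v|| should be near machine precision.
    print(lam, np.linalg.norm(A @ v - lam * v))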

QR Decomposition

The QR decomposition of an n \times n matrix A with linearly independent columns is a factorization A = QR, where Q is an n \times n orthogonal matrix satisfying Q^T Q = I and R is an n \times n upper triangular matrix. This decomposition exists for any such matrix and can be extended to rectangular matrices, though the focus here is on the square case relevant to eigenvalue computations. One method to construct the factorization is the classical Gram-Schmidt orthogonalization process, which builds the columns of Q sequentially by subtracting from each column of A its projections onto the previous orthonormal columns and normalizing the remainder. Specifically, set q_1 = a_1 / \|a_1\|, and for j = 2, \dots, n, q_j = \frac{a_j - \sum_{i=1}^{j-1} (a_j^T q_i) q_i}{\left\| a_j - \sum_{i=1}^{j-1} (a_j^T q_i) q_i \right\|}, with the entries of R given by r_{ij} = q_i^T a_j for i \leq j and r_{ij} = 0 otherwise; this yields A = QR directly. However, the classical Gram-Schmidt process can suffer from numerical instability due to rounding errors in ill-conditioned matrices, leading to loss of orthogonality in Q. More stable alternatives include Householder reflections and Givens rotations. Householder reflections apply a series of orthogonal transformations to introduce zeros below the diagonal in each column of A, progressively triangularizing it while accumulating the product of reflectors as Q. A single Householder reflector is defined as H = I - \frac{2 u u^T}{u^T u}, where u is chosen such that H a = \|a\| e_1 for a target vector a, with e_1 = (1, 0, \dots, 0)^T; applying n-1 such reflectors H_1, \dots, H_{n-1} gives H_{n-1} \cdots H_1 A = R, so Q = H_1^T \cdots H_{n-1}^T. Givens rotations, which are orthogonal matrices differing from the identity in a 2 \times 2 block, zero out one element at a time by rotating pairs of rows or columns; a sequence of such rotations can triangularize A, with Q as the product of the rotations. Householder reflections are generally preferred for dense matrices due to better data locality and fewer operations, while Givens rotations are useful for sparse or banded matrices to avoid fill-in. The cost of forming the QR decomposition of a general n \times n matrix is O(n^3) floating-point operations, dominated by the matrix-vector products and rank-one updates in either the Householder or the Givens method. A key property underlying the QR algorithm is that orthogonal similarity transformations preserve eigenvalues: if B = Q^T A Q, then B and A share the same eigenvalues since their characteristic polynomials coincide, \det(\lambda I - B) = \det(\lambda I - A). Additionally, the QR decomposition plays a central role in solving overdetermined least-squares problems \min_x \|A x - b\|_2, where the normal equations A^T A x = A^T b are avoided in favor of the more stable triangular system R x = Q^T b, reducing conditioning issues. This factorization thus serves as a foundational tool for numerical linear algebra tasks, including those addressing the eigenvalue problem.
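To make the Householder construction concrete, the following NumPy sketch (a minimal unpivoted implementation for square real matrices, not a library routine; the signs of Q and R may differ from the Gram-Schmidt convention) triangularizes A with reflectors and accumulates Q:
import numpy as np

def householder_qr(A):
    # QR factorization via Householder reflections, as described above.
    n = A.shape[0]
    R = A.astype(float).copy()
    Q = np.eye(n)
    for k in range(n - 1):
        x = R[k:, k]
        # Reflector u chosen so that H x = -sign(x_0) ||x|| e_1 (the sign
        # choice avoids cancellation).
        u = x.copy()
        u[0] += np.copysign(np.linalg.norm(x), x[0])
        norm_u = np.linalg.norm(u)
        if norm_u == 0.0:
            continue
        u /= norm_u
        R[k:, k:] -= 2.0 * np.outer(u, u @ R[k:, k:])   # apply H_k from the left
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ u, u)     # accumulate Q = H_1 ... H_{n-1}
    return Q, R

A = np.array([[4.0, 1.0], [1.0, 3.0]])
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(2)))  # True True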

Basic QR Algorithm

Algorithm Description

The QR algorithm is an iterative method for computing the eigenvalues of a square matrix A \in \mathbb{C}^{n \times n}. It begins with the initial matrix A_0 = A. At each iteration k = 0, 1, 2, \dots, a QR decomposition is performed on A_k = Q_k R_k, where Q_k is unitary (orthogonal if real) and R_k is upper triangular. The next iterate is then formed as the orthogonal similarity transformation A_{k+1} = R_k Q_k = Q_k^* A_k Q_k, where Q_k^* denotes the conjugate transpose of Q_k. The goal of this process is to generate a sequence of matrices \{A_k\} that converges to an upper triangular matrix T, whose diagonal entries are the eigenvalues of A. The accumulated product of the unitary factors Q_k yields the Schur vectors in the Schur decomposition A = Q T Q^*, where Q = \prod_{k=1}^\infty Q_k. This triangularization reveals all eigenvalues simultaneously without requiring explicit deflation in the basic form. Each iteration preserves the eigenvalues of the original matrix because A_{k+1} is unitarily similar to A_k, and thus to A_0 = A. Specifically, the characteristic polynomial \det(A - \lambda I) remains unchanged under unitary similarity transformations. The unshifted QR iteration can be expressed in pseudocode as follows:
function QR_algorithm(A, tol, maxiter)
    n = size(A, 1)
    Q_total = eye(n)
    A_k = copy(A)
    for k = 1 to maxiter
        Q, R = qr(A_k)  // QR decomposition
        A_k = R * Q
        Q_total = Q_total * Q
        if norm(subdiagonal(A_k)) < tol  // Check convergence
            break
    end
    return A_k, Q_total  // T ≈ A_k, eigenvalues on diag(T)
end
This pseudocode assumes a QR decomposition routine and a tolerance for subdiagonal norms to test convergence. Early observations noted that the unshifted QR algorithm converges only linearly: with eigenvalues ordered by decreasing modulus, the (i+1, i) subdiagonal entry decays like |\lambda_{i+1}/\lambda_i|^k, so progress is slow whenever some ratio |\lambda_{i+1}|/|\lambda_i| is close to unity, as happens for eigenvalues of nearly equal magnitude.
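A direct NumPy transcription of the pseudocode is shown below; being unshifted, it converges slowly when eigenvalue magnitudes are close and can fail to converge when distinct eigenvalues share the same modulus:
import numpy as np

def qr_algorithm(A, tol=1e-10, maxiter=1000):
    # Unshifted QR iteration: A_k = Q R, then A_{k+1} = R Q = Q^T A_k Q.
    A_k = A.astype(float).copy()
    n = A_k.shape[0]
    Q_total = np.eye(n)
    for _ in range(maxiter):
        Q, R = np.linalg.qr(A_k)
        A_k = R @ Q
        Q_total = Q_total @ Q
        if np.linalg.norm(np.tril(A_k, -1)) < tol:  # subdiagonal test
            break
    return A_k, Q_total

A = np.array([[4.0, 1.0], [1.0, 3.0]])
T, Q = qr_algorithm(A)
print(np.sort(np.diag(T)))   # approximately [2.382, 4.618]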

Simple Iteration Example

To illustrate the basic QR algorithm, consider the symmetric 2×2 matrix A_0 = \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}. This matrix has trace 7 and determinant 11, so its eigenvalues are the roots of the characteristic polynomial \lambda^2 - 7\lambda + 11 = 0, which are \lambda_1 = \frac{7 + \sqrt{5}}{2} \approx 4.618 and \lambda_2 = \frac{7 - \sqrt{5}}{2} \approx 2.382. The first iteration begins with the QR decomposition of A_0 using the Gram-Schmidt process. The first column of A_0 is (4, 1)^T with norm \sqrt{17} \approx 4.123. The first column of Q_0 is thus \mathbf{q}_1 = (4/\sqrt{17}, 1/\sqrt{17})^T \approx (0.970, 0.243)^T. The (1,1) entry of R_0 is \sqrt{17}. The projection of the second column onto \mathbf{q}_1 gives the (1,2) entry of R_0 as 7/\sqrt{17} \approx 1.698. The second column of Q_0 is then obtained by orthogonalizing the second column of A_0 against \mathbf{q}_1 and normalizing, yielding Q_0 \approx \begin{pmatrix} 0.970 & -0.243 \\ 0.243 & 0.970 \end{pmatrix} and R_0 \approx \begin{pmatrix} 4.123 & 1.698 \\ 0 & 2.668 \end{pmatrix}. The updated matrix is A_1 = R_0 Q_0 \approx \begin{pmatrix} 4.412 & 0.647 \\ 0.647 & 2.588 \end{pmatrix}. For the second iteration, apply QR decomposition to A_1. The norm of its first column is approximately 4.459, so the first column of Q_1 is approximately (0.989, 0.145)^T. The resulting R_1 \approx \begin{pmatrix} 4.459 & 1.016 \\ 0 & 2.467 \end{pmatrix} and Q_1 \approx \begin{pmatrix} 0.989 & -0.145 \\ 0.145 & 0.989 \end{pmatrix}. Thus, A_2 = R_1 Q_1 \approx \begin{pmatrix} 4.560 & 0.358 \\ 0.358 & 2.441 \end{pmatrix}. A third iteration on A_2 yields A_3 \approx \begin{pmatrix} 4.602 & 0.188 \\ 0.188 & 2.398 \end{pmatrix}, computed similarly via Gram-Schmidt QR decomposition and reconstruction. In these iterations, the off-diagonal entries decrease from 1 to 0.647, then to 0.358, and to 0.188, while the diagonal entries shift toward the exact eigenvalues: the (1,1) entry increases from 4 toward 4.618, and the (2,2) entry decreases from 3 toward 2.382. This demonstrates how repeated QR iterations drive the matrix toward upper triangular (here diagonal) form, with the diagonal revealing the eigenvalues.
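The iterates above can be reproduced in a few lines of NumPy; note that np.linalg.qr may choose different signs for the factors, so off-diagonal entries can differ in sign from the hand computation while magnitudes and diagonals agree:
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
for k in range(1, 4):
    Q, R = np.linalg.qr(A)
    A = R @ Q                           # similarity transform Q^T A Q
    print(f"A_{k} =\n{np.round(A, 3)}")
# Diagonals approach 4.618 and 2.382 as the off-diagonal entries shrink.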

Practical Implementations

Hessenberg Reduction

In numerical linear algebra, the Hessenberg reduction is a preprocessing step for the QR algorithm that transforms a general square matrix into upper Hessenberg form while preserving its eigenvalues. An upper Hessenberg matrix H of order n is defined such that h_{ij} = 0 for all i > j + 1, meaning all elements below the first subdiagonal are zero. The reduction process employs a sequence of orthogonal transformations to systematically zero out the elements below the subdiagonal in each column, starting from the first. These transformations are applied from both the left and right to the matrix to maintain similarity, ensuring the eigenvalues remain unchanged. Specifically, for each column k from 1 to n-2, a Householder reflector P_k is constructed to eliminate the entries in positions (k+2:n, k), and the update is performed as A \leftarrow P_k^T A P_k. The algorithm requires exactly n-2 such Householder reflectors, one for each targeted column. The overall transformation yields a Hessenberg matrix H = P^T A P, where P = P_1 P_2 \cdots P_{n-2} is the product of these orthogonal matrices, and P^T denotes the transpose (or conjugate transpose for complex cases). This similarity transformation guarantees that H and the original matrix A share the same eigenvalues. The computational complexity of the Hessenberg reduction is \frac{10}{3} n^3 floating-point operations (flops) without accumulating the orthogonal matrix P, which is on the order of O(n^3); accumulating P requires an additional \frac{4}{3} n^3 flops. A key benefit of this reduction is that it sparsifies the matrix, allowing each subsequent QR iteration—based on QR decomposition followed by remultiplication—to be performed in O(n^2) time rather than the full O(n^3) for a dense matrix.
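The reduction can be sketched in NumPy as follows (an unblocked illustration of the reflector sequence described above, rather than the blocked kernels used in production libraries):
import numpy as np

def hessenberg_reduce(A):
    # Reduce a square real matrix to upper Hessenberg form by n-2
    # Householder reflectors, accumulating P so that H = P^T A P.
    n = A.shape[0]
    H = A.astype(float).copy()
    P = np.eye(n)
    for k in range(n - 2):
        x = H[k+1:, k]
        u = x.copy()
        u[0] += np.copysign(np.linalg.norm(x), x[0])
        norm_u = np.linalg.norm(u)
        if norm_u == 0.0:
            continue
        u /= norm_u
        # Two-sided update keeps the eigenvalues (similarity transformation).
        H[k+1:, k:] -= 2.0 * np.outer(u, u @ H[k+1:, k:])   # left
        H[:, k+1:] -= 2.0 * np.outer(H[:, k+1:] @ u, u)     # right
        P[:, k+1:] -= 2.0 * np.outer(P[:, k+1:] @ u, u)     # accumulate P
    return H, P

rng = np.random.default_rng(0)
A = rng.random((5, 5))
H, P = hessenberg_reduce(A)
print(np.allclose(P.T @ A @ P, H))                    # True: H = P^T A P
print(np.allclose(np.tril(H, -2), 0.0, atol=1e-12))   # zeros below subdiagonal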

Shifting Strategies

Shifting strategies in the QR algorithm involve subtracting a scalar multiple of the identity matrix, known as a shift s_k, from the current iterate A_k to accelerate convergence toward the eigenvalues. These techniques exploit the similarity invariance of the QR iteration, preserving the eigenvalues while targeting specific ones more effectively. Typically applied to the upper Hessenberg form of the matrix, shifts adjust the spectral properties iteratively without altering the overall structure. The simplest choice is the zero shift, s_k = 0, which maintains theoretical simplicity but leads to slow linear convergence. In contrast, the Rayleigh quotient shift selects s_k = A_k(n,n), the bottom-right entry of the current iterate, approximating the eigenvalue converging there and accelerating its isolation in the last row. This choice inherits the cubic local convergence of Rayleigh quotient iteration for symmetric matrices, though it can stagnate in certain symmetric configurations. For enhanced robustness, particularly to avoid such stagnation when eigenvalues sit symmetrically about the shift, the Wilkinson shift refines the selection by considering the 2×2 trailing principal submatrix of A_k. The eigenvalues of this submatrix are computed exactly, and the shift s_k is chosen as the one closer to A_k(n,n). This strategy, introduced by J. H. Wilkinson, breaks symmetry effectively while maintaining stability, and it guarantees convergence for symmetric tridiagonal matrices. The explicit shift iteration proceeds as follows: compute the QR decomposition of the shifted matrix (A_k - s_k I) = Q_k R_k, where Q_k is orthogonal and R_k is upper triangular, then form the next iterate as A_{k+1} = R_k Q_k + s_k I. This preserves the Hessenberg structure and similarity to A_k. (A_k - s_k I) = Q_k R_k, \quad A_{k+1} = R_k Q_k + s_k I With appropriate shifts like the Rayleigh quotient or Wilkinson choices, the algorithm achieves quadratic convergence for simple eigenvalues, meaning the error reduces quadratically per iteration near convergence; for symmetric matrices the rate is typically cubic, significantly reducing the number of iterations required in practice. Shift selection balances theoretical elegance with computational robustness; while the zero shift suffices for analysis, the Wilkinson shift is preferred in implementations for its proven convergence and resistance to pathological cases.
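A minimal NumPy sketch of the explicit Wilkinson-shifted iteration follows, assuming real symmetric input so that the trailing 2×2 eigenvalues, and hence the shift, stay real:
import numpy as np

def wilkinson_shift(A):
    # Eigenvalue of the trailing 2x2 block closest to the (n, n) entry.
    a, b = A[-2, -2], A[-2, -1]
    c, d = A[-1, -2], A[-1, -1]
    tr = a + d
    det = a * d - b * c
    disc = np.sqrt(tr * tr - 4.0 * det)   # real for symmetric input
    mu1, mu2 = (tr + disc) / 2.0, (tr - disc) / 2.0
    return mu1 if abs(mu1 - d) < abs(mu2 - d) else mu2

def shifted_qr(A, tol=1e-12, maxiter=100):
    # Explicit shifted iteration (assumes n >= 2): factor A_k - s I = Q R,
    # then A_{k+1} = R Q + s I, which is similar to A_k.
    A_k = A.astype(float).copy()
    n = A_k.shape[0]
    I = np.eye(n)
    for _ in range(maxiter):
        s = wilkinson_shift(A_k)
        Q, R = np.linalg.qr(A_k - s * I)
        A_k = R @ Q + s * I
        if np.linalg.norm(np.tril(A_k, -1)) < tol:
            break
    return A_k

A = np.array([[4.0, 1.0], [1.0, 3.0]])
print(np.sort(np.diag(shifted_qr(A))))   # [2.382..., 4.618...] in a few steps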

Deflation Techniques

Deflation techniques in the QR algorithm enable the detection and isolation of converged eigenvalues during iterations, allowing the problem size to be reduced progressively for improved efficiency. When a subdiagonal element becomes sufficiently small relative to the adjacent diagonal entries, it is treated as negligible, effectively decoupling the matrix into a block upper triangular form whose diagonal blocks correspond to invariant subspaces. This process focuses subsequent iterations solely on the unreduced active submatrix, avoiding unnecessary computations on already converged parts. The standard deflation criterion, as implemented in practical algorithms, declares a subdiagonal entry h_{j+1,j} of the Hessenberg matrix to be zero if |h_{j+1,j}| < \varepsilon (|h_{j,j}| + |h_{j+1,j+1}|), where \varepsilon is a small tolerance, typically on the order of the unit roundoff. This relative measure ensures numerical robustness by accounting for the scale of the local entries, preventing premature deflation in ill-conditioned cases. Once identified, the subdiagonal is set to zero, partitioning the matrix into a converged upper triangular block (containing isolated eigenvalues) and a smaller active block. Iterations then proceed only on this active submatrix, with the overall transformation accumulated via orthogonal factors. In Hessenberg form, deflation typically occurs from the bottom (or top) as eigenvalues converge to the diagonal; for instance, in a 4×4 unreduced Hessenberg matrix, if the bottom subdiagonal falls below the threshold after several shifted steps, the matrix decouples into a 1×1 block (the converged eigenvalue) and a 3×3 active block, reducing the next QR step's cost. This example illustrates how deflation exploits the Hessenberg structure, where only the subdiagonal needs monitoring, as larger off-diagonals are absent. To enhance numerical stability and maximize deflation opportunities, converged eigenvalues may be reordered along the diagonal by magnitude after each deflation step, grouping small subdiagonals to potentially deflate multiple eigenvalues at once. Such reordering, performed via a unitary similarity transformation on the triangular block, ensures that peripheral eigenvalues (often the easiest to converge) are isolated without disturbing the active subspace. This step is particularly beneficial in multishift variants, where it can augment the number of deflatable entries. The computational savings from deflation are significant: each QR iteration on an n \times n Hessenberg matrix costs approximately 4n^2 flops without deflation, but as the active size shrinks below n over successive deflations, the effective per-iteration cost drops correspondingly. For matrices with clustered eigenvalues, fewer deflations occur, but the technique still avoids full-matrix operations on converged portions. For matrices with multiple eigenvalues, deflation decouples the problem into smaller invariant subspaces corresponding to distinct eigenvalue clusters, allowing independent processing of each block. In cases involving Jordan blocks (defective eigenvalues), the algorithm does not deflate within the block—where subdiagonals remain nonzero—but isolates the entire block once the connecting subdiagonal to the rest of the matrix is small, preserving the structure in the real Schur form. When complex conjugate pairs or near-degeneracies arise, additional unitary transformations may be applied after deflation to clarify the block structure without altering the eigenvalues.
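A minimal sketch of the relative deflation test described above (the helper names are illustrative, not a library API; eps plays the role of the unit roundoff):
import numpy as np

def negligible_subdiagonals(H, eps=np.finfo(float).eps):
    # Relative criterion: |h_{j+1,j}| < eps * (|h_{j,j}| + |h_{j+1,j+1}|).
    n = H.shape[0]
    return [j for j in range(n - 1)
            if abs(H[j+1, j]) < eps * (abs(H[j, j]) + abs(H[j+1, j+1]))]

def deflate(H):
    # Zero negligible subdiagonals, decoupling H into diagonal blocks whose
    # eigenvalues can then be computed independently.
    for j in negligible_subdiagonals(H):
        H[j+1, j] = 0.0
    return H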

Implicit QR Algorithm

Overview and Motivation

The implicit QR algorithm addresses key limitations in the explicit shifted QR iteration for computing eigenvalues of matrices reduced to Hessenberg form. In the explicit approach, applying a shift involves forming the matrix A - sI and computing its full QR decomposition, which, if the Hessenberg structure is not exploited, incurs an O(n^3) computational cost per iteration, making it inefficient for large matrices. This inefficiency motivated the development of an implicit variant that performs the shifted QR step without explicitly constructing A - sI, thereby preserving the Hessenberg form and reducing the cost to O(n^2) operations per iteration. The core of the implicit method lies in bulge formation and chasing techniques, where a small perturbation (the bulge) is introduced into the Hessenberg matrix to initiate the shifted transformation, followed by a series of unitary similarity transformations that chase the bulge down the matrix and restore the structure. This approach leverages the implicit Q theorem: the unitary similarity that returns an unreduced Hessenberg matrix to Hessenberg form is essentially determined by its first column, so applying the correct initial rotation and then restoring the form reproduces the explicit shifted step. Developed by John G. F. Francis in the late 1950s amid the constraints of early computers like the Ferranti Pegasus, the implicit algorithm was designed to incorporate acceleration via shifts while maintaining the low per-iteration complexity of the basic Hessenberg iteration, enabling practical eigenvalue computations for real-world problems such as aircraft stability analysis. A significant advantage of the implicit QR algorithm is its numerical stability and enhanced speed, as the unitary transformations ensure backward stability without fill-in beyond the bulge. For real matrices with complex conjugate eigenvalue pairs, Francis extended the method with a double-shift strategy, applying two simultaneous shifts to process the pair in real arithmetic and produce 2×2 blocks in the resulting quasi-triangular form, further improving efficiency and avoiding complex operations.

Algorithm Steps

The implicit QR iteration performs a single step on an upper Hessenberg matrix H \in \mathbb{R}^{n \times n} without explicitly forming the shifted matrix H - sI, where s is the shift, to maintain numerical stability and efficiency. Instead, it simulates the effect of the QR decomposition of H - sI = QR followed by H \leftarrow RQ + sI through a sequence of orthogonal transformations that introduce and chase a "bulge" along the subdiagonal. This bulge arises from an initial rotation determined by the first column of the shifted matrix, temporarily disrupting the Hessenberg form, and is then pursued downward until it exits the matrix, restoring the form. The procedure for a single iteration consists of four main steps. First, compute the shift s, typically the Wilkinson shift, which is the eigenvalue of the bottom-right 2 \times 2 submatrix of H closest to h_{nn}. Second, form the initial bulge by applying a rotation determined by the first column of the implicitly shifted matrix, zeroing the entry below the (1,1) position while introducing a nonzero just below the subdiagonal (e.g., in position (3,1) for a single shift). Third, chase the bulge to the bottom by applying a sequence of trailing Givens rotations from both sides of the matrix: each rotation annihilates a bulge entry in the current column while propagating the bulge one row downward, requiring approximately n-2 such rotations. Fourth, apply the accumulated left and right similarities to update the matrix and any associated eigenvector information, ensuring the overall transformation is a similarity. For real arithmetic when complex conjugate eigenvalue pairs are expected, a double-shift strategy is employed to avoid complex operations. This uses two shifts s_1 and s_2 (the eigenvalues of the 2 \times 2 bottom block), effectively performing a double QR step equivalent to the QR decomposition of the quadratic matrix polynomial (H - s_1 I)(H - s_2 I). The initial bulge is larger (typically 3×3), formed by computing the first three entries of this polynomial applied to the first standard basis vector and applying the corresponding rotations, followed by chasing as before. This maintains real arithmetic throughout and is standard in implementations like LAPACK. The following pseudocode outlines a single double-shifted implicit QR step on an n \times n upper Hessenberg matrix H, using Givens rotations for bulge formation and chasing (assuming a routine Givens(a, b) that returns the rotation parameters zeroing b):
function ImplicitQRStep(H, s1, s2)
    // Compute first three entries of v = (H^2 - (s1 + s2) H + s1 s2 I) e1
    v1 = H(1,1)^2 + H(1,2) * H(2,1) - (s1 + s2) * H(1,1) + s1 * s2
    v2 = H(2,1) * (H(1,1) + H(2,2) - (s1 + s2))
    v3 = H(2,1) * H(3,2)
    v = [v1, v2, v3]

    // Step 1: Form initial bulge with 2-3 Givens rotations on columns 1-3
    for k = 1 to 2
        (c, s) = Givens(v(k), v(k+1))
        Apply rotation to rows k, k+1 of H from left (columns 1 to end)
        Apply same rotation to columns k, k+1 of H from right (trailing submatrix)

    // Step 2: Chase bulge down
    for j = 3 to n-1
        // Annihilate bulge entry at (j+1, j-1)
        (c, s) = Givens(H(j, j-1), H(j+1, j-1))
        Apply to rows j, j+1 from left (columns j-1 to end)
        Apply to columns j, j+1 from right (trailing submatrix)
        // Bulge now at (j+2, j)

    // Final rotations to clean up if needed
    // The accumulated Q is the product of all rotations; H is updated via Q^T H Q
    return updated H
end
This pseudocode represents the core loop; in practice, optimizations like delayed updates reduce fill-in, and the rotations simulate the implicit shift. During bulge chasing, the matrix temporarily loses its strict Hessenberg form due to the bulge's nonzeros below the subdiagonal, but the process restores the upper Hessenberg structure by the end of the iteration, as each trailing rotation eliminates the offending entries while preserving the form above the bulge. This restoration is exact in exact arithmetic and numerically stable under orthogonal transformations. Each iteration requires O(n) Givens rotations, leading to O(n^2) arithmetic operations per step, since each rotation updates only two rows and two columns. Over all iterations until convergence, the total complexity on a Hessenberg matrix is O(n^3), compared with the O(n^4) total cost of explicit QR iteration applied to a full matrix without Hessenberg reduction.
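As a runnable counterpart to the pseudocode, the following NumPy sketch performs one Francis double-shift step using small Householder reflectors instead of pairs of Givens rotations (an equivalent way to form and chase the bulge); it omits the deflation tests, exceptional shifts, and scaling safeguards of production codes such as LAPACK's DHSEQR:
import numpy as np

def householder(v):
    # Unit vector u and beta = 2 such that (I - beta u u^T) v is a
    # multiple of e1; the sign choice avoids cancellation.
    u = v.astype(float).copy()
    alpha = -np.copysign(np.linalg.norm(u), u[0])
    u[0] -= alpha
    nu = np.linalg.norm(u)
    if nu == 0.0:
        return np.zeros_like(u), 0.0
    return u / nu, 2.0

def francis_step(H):
    # One implicit double-shift step on an unreduced upper Hessenberg
    # matrix H (assumes n >= 3); shifts enter only via trace and
    # determinant of the trailing 2x2 block, so arithmetic stays real.
    n = H.shape[0]
    H = H.copy()
    s = H[n-2, n-2] + H[n-1, n-1]                          # s1 + s2
    t = H[n-2, n-2]*H[n-1, n-1] - H[n-2, n-1]*H[n-1, n-2]  # s1 * s2
    # First column of (H - s1 I)(H - s2 I) = H^2 - s H + t I
    x = H[0, 0]*H[0, 0] + H[0, 1]*H[1, 0] - s*H[0, 0] + t
    y = H[1, 0]*(H[0, 0] + H[1, 1] - s)
    z = H[1, 0]*H[2, 1]
    for k in range(n - 2):
        u, beta = householder(np.array([x, y, z]))
        q = max(0, k - 1)
        H[k:k+3, q:] -= beta * np.outer(u, u @ H[k:k+3, q:])   # left
        r = min(k + 4, n)
        H[:r, k:k+3] -= beta * np.outer(H[:r, k:k+3] @ u, u)   # right
        x, y = H[k+1, k], H[k+2, k]
        if k < n - 3:
            z = H[k+3, k]
    # Final 2x2 reflector cleans up the last bulge entry.
    u, beta = householder(np.array([x, y]))
    H[n-2:, n-3:] -= beta * np.outer(u, u @ H[n-2:, n-3:])
    H[:, n-2:] -= beta * np.outer(H[:, n-2:] @ u, u)
    return H

rng = np.random.default_rng(0)
H0 = np.triu(rng.random((6, 6)), -1)   # random upper Hessenberg test matrix
H1 = francis_step(H0)
print(np.allclose(np.tril(H1, -2), 0.0, atol=1e-12))        # form restored
print(np.allclose(np.sort_complex(np.linalg.eigvals(H0)),
                  np.sort_complex(np.linalg.eigvals(H1))))  # eigenvalues kept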

Convergence Analysis

Theoretical Interpretation

The QR algorithm can be geometrically interpreted as a sequence of orthogonal transformations that progressively align the invariant subspaces of the matrix with the standard basis vectors, thereby revealing the eigenvalues. Each iteration applies a unitary matrix Q_k such that the accumulated transformation U_k = Q_1 Q_2 \cdots Q_k rotates the coordinate system toward the Schur basis, with the off-diagonal elements diminishing as the subspaces decouple. In the unshifted case, the QR algorithm bears a close relation to the power method, functioning as a simultaneous power iteration applied to the columns and a simultaneous inverse power iteration reflected in the trailing rows of the matrix. This duality arises because the R_k factors capture the scaling associated with dominant eigenvalues, while the Q_k factors refine the directions, leading to asymptotic behavior where the columns of the accumulated Q converge to Schur vectors ordered by decreasing eigenvalue modulus. The algorithm converges to the Schur form, an upper triangular matrix with eigenvalues on the diagonal, for general matrices under suitable conditions on eigenvalue separations. For normal matrices, the limiting Schur form is diagonal if the matrix is Hermitian, due to orthogonal diagonalizability and the preservation of normality under unitary similarities. A key element of the convergence proof involves showing that subdiagonal elements decrease in magnitude during iterations, particularly in Hessenberg form. Specifically, for the shifted QR algorithm, the subdiagonal entry satisfies |a_{i+1,i}^{(k+1)}| \leq |a_{i+1,i}^{(k)}| / \rho, where \rho > 1 depends on the shift strategy and the eigenvalue separations, ensuring quadratic or cubic convergence rates near the diagonal. This bound relies on the eigenvalue separations of the relevant submatrices and on the unitary invariance of the Frobenius norm, which bounds the perturbation of off-diagonal terms while the similarity preserves the eigenvalues. Visualizations of the QR iterations often depict the trajectories of approximate eigenvalues in the complex plane, where points representing diagonal or corner entries migrate toward the true eigenvalues under repeated transformations, illustrating the algorithm's deflationary progress as subdiagonals vanish.
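The power-method connection can be checked numerically: in exact arithmetic A^k = (Q_1 \cdots Q_k)(R_k \cdots R_1), so the accumulated orthogonal factor after k unshifted steps matches the Q factor of A^k up to column signs. A short NumPy verification on a random test matrix:
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 4))
k, A_k, Q_tot = 8, A.copy(), np.eye(4)
for _ in range(k):
    Q, R = np.linalg.qr(A_k)
    A_k = R @ Q
    Q_tot = Q_tot @ Q
Q_pow, _ = np.linalg.qr(np.linalg.matrix_power(A, k))
# Columns agree up to sign, so compare entrywise absolute values.
print(np.allclose(np.abs(Q_tot), np.abs(Q_pow), atol=1e-6))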

Factors Affecting Convergence

The choice of shift in the QR algorithm significantly influences its convergence behavior. The Wilkinson shift, which selects the eigenvalue of the trailing 2×2 submatrix closer to the bottom-right entry of the current iterate, guarantees local quadratic convergence near the end of the process. However, the overall global convergence rate remains dependent on the clustering of eigenvalues, where closely spaced eigenvalues can lead to slower progress across multiple iterations. Matrix conditioning plays a critical role in the reliability and speed of convergence. Ill-conditioned eigenvalues, often arising in matrices with small eigenvalue gaps or strong non-normality, exhibit slower convergence due to increased sensitivity to perturbations, with relative error bounds governed by the eigenvalue condition numbers and machine epsilon. In such cases, the algorithm may require additional iterations to resolve closely clustered or perturbed eigenvalues accurately. Deflation techniques further modulate convergence by allowing converged eigenvalues to be removed, thereby reducing the effective problem size. Frequent early deflation accelerates the process by focusing subsequent iterations on unresolved subproblems, potentially halving the computational effort in practice. Conversely, infrequent or missed deflations can extend the number of iterations, as unresolved components continue to influence the entire iteration. Numerical stability is essential for robust convergence, yet rounding errors accumulated during Givens rotations or Householder reflections can disrupt the Hessenberg form or stall progress. Refined shifts, such as those originally proposed by Rutishauser, mitigate these issues by enhancing stability in non-positive definite cases, preventing failure from accumulated floating-point errors. In terms of iteration requirements, the QR algorithm typically demands O(n) iterations for dense n×n matrices under standard shifting, with each iteration costing O(n²) after Hessenberg reduction. This count deteriorates for clustered eigenvalues, where separation is minimal, or defective matrices, where non-normality amplifies the number of steps needed to reach the Schur form. Contemporary implementations often incorporate strategies to address outliers or tiny submatrices, switching to specialized computations or small-scale QR iterations for subproblems smaller than a crossover size (e.g., explicit formulas for 1×1 or 2×2 blocks, and unblocked iteration when N ≤ 11 in LAPACK's DHSEQR), thereby handling edge cases more efficiently without relying solely on blocked multishift QR steps.

History and Development

Early Contributions

The foundational work leading to the QR algorithm emerged in the 1950s amid efforts to compute eigenvalues more reliably on early computers. A key predecessor was the LR algorithm, developed by Heinz Rutishauser in the mid-1950s and published in 1958, which applied successive lower-upper (LU) decompositions followed by reverse-order multiplication, notably to symmetric tridiagonal matrices, often incorporating shifts to accelerate convergence. This method addressed limitations in earlier iterative techniques like the power method but suffered from potential numerical instability due to the non-orthogonality of the L and U factors, which could amplify rounding errors in finite-precision arithmetic. The QR algorithm itself originated independently in the late 1950s through the work of John G. F. Francis at the National Research Development Corporation (NRDC) in the United Kingdom. Between 1957 and 1961, Francis, motivated by the need to stabilize eigenvalue computations for applications such as aircraft flutter analysis, devised the QR transformation as a unitary analogue to Rutishauser's LR method. His approach replaced LU factorization with QR factorization—where Q is orthogonal and R is upper triangular—ensuring that the transformations preserved the Euclidean norm of vectors and avoided the growth of errors inherent in non-orthonormal bases. Francis initially tested the basic unshifted version on the Ferranti Pegasus computer, a vacuum-tube machine, before extending it to include shifts for improved convergence. Concurrently, in the Soviet Union, Vera N. Kublanovskaya developed a similar method in 1960–1961, independently of Francis, drawing inspiration from Rutishauser's LR transformation. She formulated it using LQ decompositions (with L lower triangular and Q orthogonal) and reverse multiplication, proving convergence for Hessenberg matrices under the assumption of distinct eigenvalue moduli. Lacking access to computers, Kublanovskaya verified her method through manual calculations on electromechanical devices. Francis's contributions were detailed in two seminal papers, submitted in 1959 and resubmitted in 1961, and published in 1961–1962 as "The QR Transformation" Parts 1 and 2 in The Computer Journal, covering the basic algorithm, shifted variants, and convergence proofs for unreduced Hessenberg matrices. Kublanovskaya's early results appeared in 1961, with her initial summary in Doklady Akademii Nauk followed by expanded treatments in Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki. These publications, produced in an era of manual programming on machines like the Pegasus, highlighted the QR algorithm's potential for practical implementation despite the computational constraints of the time.

Key Milestones

In the early 1960s, the implicit QR algorithm emerged as a pivotal efficiency enhancement to the basic QR iteration. John G. F. Francis introduced the implicit shift mechanism in 1962, enabling the computation of shifted QR steps without explicitly forming the shifted matrix, thus avoiding dense operations on Hessenberg forms. This approach relied on similarity transformations that preserved the Hessenberg structure while accelerating convergence, incorporating bulge chasing to propagate and eliminate non-Hessenberg elements through Givens rotations and reducing the cost to O(n^2) per iteration. During the 1960s, Gene Golub and William Kahan extended QR-based methods to bidiagonalization for computing the singular value decomposition (SVD), transforming general matrices into bidiagonal form via Householder reflections and applying subsequent QR-type iterations to the bidiagonal matrix. These innovations, detailed in their 1965 paper, laid foundational techniques for robust SVD computation and directly influenced later implementations in numerical libraries like LAPACK, where QR iterations remain central to SVD routines. Standardization efforts in the 1970s solidified the QR algorithm's practical utility. James H. Wilkinson and Christian Reinsch published a comprehensive treatment in their 1971 Handbook for Automatic Computation, outlining stable implementations of the QR algorithm for Hessenberg and tridiagonal matrices, including shift strategies and deflation, which became the blueprint for the EISPACK library's eigenvalue routines. The 1980s and 1990s saw the QR algorithm integrated into major numerical software packages, enhancing scalability for high-performance computing. LINPACK, released in 1979, organized its routines around matrix factorizations including QR and the singular value decomposition, while LAPACK, developed from 1992 onward, introduced blocked variants of the QR algorithm using Level 3 BLAS operations to exploit parallelism and cache efficiency on vector and parallel processors. These blocked algorithms partitioned matrices into blocks, performing QR updates via matrix-matrix multiplications, which significantly reduced communication overhead in distributed-memory systems. Post-2000 developments addressed sparse and large-scale challenges. The ARPACK library, introduced in the 1990s and refined in subsequent decades, incorporated deflation within its implicitly restarted Arnoldi method, identifying and locking converged Ritz values early to accelerate eigenvalue computation for large sparse nonsymmetric matrices. By the 2010s, GPU accelerations further boosted performance; libraries like MAGMA implemented blocked QR algorithms on GPUs, achieving up to 10x speedups over CPU counterparts for dense eigenvalue problems through kernel fusion and tensor core utilization. Recent advances as of 2025 include enhanced shifted QR variants for non-Hermitian matrices and randomized approaches improving efficiency for structured problems. The QR algorithm's enduring impact is evident in recognitions and applications. In 2019, a tribute to John G. F. Francis highlighted his contributions during 60th-anniversary celebrations at conferences, underscoring the algorithm's transformation of matrix computations. Its role in quantum chemistry simulations is notable, where QR-based diagonalization via libraries like LAPACK supports electronic structure calculations in packages such as Gaussian and Psi4 for smaller matrices.

Variants and Extensions

For Symmetric Matrices

For real symmetric matrices, the QR algorithm is adapted by first reducing the matrix to symmetric tridiagonal form, which simplifies subsequent iterations and improves efficiency. This reduction is achieved using a sequence of Householder transformations, orthogonal reflections that zero out the elements below the first subdiagonal while preserving symmetry. The reduction itself costs approximately \frac{4n^3}{3} floating-point operations for an n \times n matrix, as the transformations are applied from both sides to maintain the symmetric structure, with a comparable additional cost to accumulate the orthogonal factor when eigenvectors are required. Once in tridiagonal form, the QR iterations are applied directly to this condensed matrix, where each iteration requires only O(n) operations due to the banded structure, and the tridiagonal form is preserved throughout the process. In practice, shifts are incorporated to accelerate convergence, such as the Wilkinson shift, which selects the eigenvalue of the trailing 2×2 block closest to the bottom-right entry of the tridiagonal matrix. To compute eigenvectors, the orthogonal matrices from each QR step are accumulated as Q = Q_1 Q_2 \cdots Q_k, where Q_i is the orthogonal factor from the i-th iteration, and the eigenvectors of the original matrix are obtained by composing this product with the tridiagonalizing transformation. The symmetric case exhibits faster convergence than the general case because all eigenvalues are real, avoiding complex arithmetic and the handling of complex conjugate pairs that can slow iterations in nonsymmetric problems. This property keeps the algorithm within real arithmetic, leading to more predictable deflation and isolation of eigenvalues. A notable variant for symmetric tridiagonal matrices is the divide-and-conquer approach, which enhances parallelism by recursively splitting the tridiagonal matrix into smaller subproblems—at positions where the subdiagonal elements are negligible or via a deliberate rank-one tear—solving the subproblems independently, and then combining the results using rank-one updates. This method, originally proposed in the early 1980s, reduces the overall complexity for eigenvector computation to O(n^2) in favorable cases and is particularly suited to parallel architectures. In applications such as principal component analysis (PCA), where the input is typically a symmetric covariance or correlation matrix, the QR algorithm on the tridiagonal form efficiently computes the principal components as the eigenvectors corresponding to the largest eigenvalues, leveraging the inherent symmetry to ensure stability and accuracy.
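The tridiagonal specialization is easy to observe with SciPy's Hessenberg reduction, which yields a tridiagonal matrix for symmetric input (SciPy is used here purely for illustration):
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(1)
S = rng.random((5, 5))
A = (S + S.T) / 2.0                     # symmetric test matrix
T, Q = hessenberg(A, calc_q=True)       # T = Q^T A Q, tridiagonal here
print(np.allclose(np.triu(T, 2), 0.0, atol=1e-12))   # banded above as well
print(np.allclose(np.sort(np.linalg.eigvalsh(T)),
                  np.sort(np.linalg.eigvalsh(A))))   # spectrum preserved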

Generalized Eigenvalue Problems

The QR algorithm, originally developed for standard eigenvalue problems, has been extended to address the generalized eigenvalue problem Ax = \lambda Bx, where A and B are square matrices and B may be singular. This extension, known as the QZ algorithm, computes a generalized Schur decomposition of the matrix pencil (A, B), yielding unitary matrices Q and Z such that A = QSZ^H and B = QTZ^H, with S upper quasi-triangular and T upper triangular; the generalized eigenvalues are then the ratios of corresponding diagonal elements of S and T. The algorithm begins with a reduction phase that transforms the pair (A, B) into generalized Hessenberg-triangular form, where A is upper Hessenberg and B is upper triangular, using a series of orthogonal transformations applied simultaneously to both matrices. These transformations preserve the generalized eigenvalues and require approximately 14n^3 floating-point operations for an n \times n pair, which is of the same order as the initial reduction in the standard QR algorithm (though higher by a factor of about 4). Following this reduction, the iterative phase applies a sequence of implicit double-shifted QZ steps, analogous to the shifted QR iterations, which chase bulges in the Hessenberg structure of A while maintaining the triangular form of B; each double iteration typically costs about 13n^2 operations, and convergence requires roughly 1.2n to 1.3n steps for dense pairs. The overall computational complexity of the QZ algorithm is O(n^3), similar to the standard QR algorithm, but involves coupled decompositions that account for the interaction between A and B. To handle potential singularities in B, the algorithm incorporates deflation techniques: when a diagonal element of the triangularized B becomes negligible (indicating an infinite eigenvalue), the corresponding row and column are deflated, reducing the problem size without requiring inversion. This approach finds applications in fields such as structural dynamics, where it solves for vibration modes via K\phi = \lambda M\phi with stiffness matrix K and mass matrix M, and in control theory for analyzing descriptor systems or state-space models of the form E\dot{x} = Ax, where the generalized eigenvalues determine system stability and poles.
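In SciPy the QZ decomposition is available as scipy.linalg.qz; the sketch below (arbitrary small matrices, finite real eigenvalues assumed so the diagonal ratios are well defined) recovers the generalized eigenvalues from the diagonals of S and T:
import numpy as np
from scipy.linalg import qz, eigvals

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
S, T, Q, Z = qz(A, B)               # A = Q S Z^T, B = Q T Z^T
lam = np.diag(S) / np.diag(T)       # generalized eigenvalues
print(np.sort(lam))                              # [1.0, 2.5]
print(np.sort(np.real(eigvals(A, B))))           # agrees with direct solver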

  46. [46]