
Preconditioner

A preconditioner is a matrix or operator in numerical linear algebra that transforms a linear system Ax = b into an equivalent form, such as M^{-1}Ax = M^{-1}b for left preconditioning, to accelerate the convergence of iterative methods by improving the spectral properties of the system, such as reducing the condition number or clustering eigenvalues around unity. The preconditioner M must approximate the original matrix A closely enough to enhance efficiency while remaining computationally inexpensive to apply or invert, often through sparse approximations or factorizations. The term "preconditioning" originated in the late 1940s, when Alan Turing first used it in the context of direct methods to address round-off errors related to matrix conditioning, but the techniques gained widespread adoption in the 1970s alongside the rise of Krylov subspace methods like the conjugate gradient (CG) algorithm and incomplete Cholesky (IC) preconditioning. By the 1980s and 1990s, advancements included incomplete LU (ILU) factorizations and algebraic multigrid (AMG) methods, addressing challenges in solving large sparse systems arising from partial differential equations (PDEs) in scientific computing, engineering simulations, and optimization. These developments were driven by the need for robust solvers on increasingly large-scale problems, where direct methods become prohibitive due to memory and time constraints.

Common types of preconditioners include stationary methods like Jacobi (using the diagonal of A) and Gauss-Seidel (using the lower triangular part), which are simple but limited for ill-conditioned systems; incomplete factorizations such as ILU(0) or IC(0), which drop small elements during LU or Cholesky factorization to maintain sparsity; and advanced approaches like domain decomposition (e.g., Schwarz methods) or sparse approximate inverses (SPAI), which exploit problem structure for parallelism and scalability on modern hardware. The choice of preconditioner depends on matrix properties such as symmetry and definiteness, and it is crucial for Krylov methods like GMRES or BiCGSTAB, often reducing iteration counts from hundreds to a few dozen in practical applications. Ongoing research focuses on adaptive and machine-learning-enhanced preconditioners to handle the demands of exascale computing and multiphysics simulations.

Introduction

Definition and Purpose

In numerical linear algebra, the condition number of a matrix A, denoted \kappa(A), is defined as \kappa(A) = \|A\| \cdot \|A^{-1}\|, where \|\cdot\| denotes a consistent matrix norm. This scalar measures the sensitivity of the solution of a linear system Ax = b to small perturbations in the input data or to roundoff errors in computations, with higher values indicating greater amplification of errors, potential instability in direct solvers, and slower convergence rates in iterative algorithms. A preconditioner is a matrix or operator P (often approximating A or A^{-1}) that transforms the original system Ax = b into an equivalent form with more favorable spectral properties, such as the left-preconditioned system P^{-1}Ax = P^{-1}b or the right-preconditioned system AP^{-1}y = b with x = P^{-1}y. The goal is to reduce the condition number of the transformed system, either \kappa(P^{-1}A) for left preconditioning or \kappa(AP^{-1}) for right preconditioning, thereby improving numerical stability and efficiency. The primary purpose of a preconditioner is to accelerate the convergence of iterative methods, such as the conjugate gradient method, by clustering the eigenvalues of the preconditioned matrix near unity, which minimizes the effective condition number and reduces the number of iterations required for a desired accuracy. This benefit involves a fundamental trade-off: the computational cost of constructing and applying P must be outweighed by the savings from fewer iterations. When A is symmetric positive definite, a symmetric positive definite preconditioner M is normally used; left and right preconditioning then lead to equivalent methods, commonly written as the single preconditioned system M^{-1}Ax = M^{-1}b.
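
As a concrete illustration of these definitions, the sketch below (illustrative NumPy code, not taken from any particular library) compares the condition number of a badly scaled SPD test matrix with that of its Jacobi-preconditioned counterpart M^{-1}A, where M = diag(A); the matrix and scaling are arbitrary choices made for the example.

```python
import numpy as np

n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1-D Laplacian, SPD
scales = np.logspace(0, 4, n)                            # widely varying variable scales
A = np.diag(scales) @ A @ np.diag(scales)                 # still SPD, now badly conditioned

M_inv = np.diag(1.0 / np.diag(A))                         # Jacobi preconditioner: M^{-1} = D^{-1}

print("kappa(A)        =", np.linalg.cond(A))
print("kappa(M^{-1} A) =", np.linalg.cond(M_inv @ A))     # much smaller: scaling is equalized
```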

Historical Overview

The concept of preconditioning was first named by Alan Turing in 1948, who used the term to describe transformations that improve the conditioning of matrices in iterative and direct solvers. The origins of preconditioning techniques trace back to early iterative methods for solving linear systems, with foundational developments in the mid-20th century. In the 1950s, David M. Young introduced successive over-relaxation (SOR) in his doctoral thesis, establishing a framework for accelerating iterative methods like Jacobi and Gauss-Seidel, which implicitly preconditioned the system by adjusting relaxation parameters to improve convergence. Around the same time, Richardson iteration, originally proposed in 1910 but adapted for modern numerical contexts, served as a simple explicit preconditioner by incorporating dynamic scaling to reduce the spectral radius of the iteration matrix. These methods laid the groundwork for preconditioning as a means to cluster eigenvalues and enhance solver efficiency, particularly for elliptic partial differential equations discretized on grids.

The 1970s marked a pivotal advancement with the formalization of preconditioned Krylov subspace methods. In 1976, Paul Concus, Gene H. Golub, and Dianne P. O'Leary developed the preconditioned conjugate gradient (PCG) method, extending the classical conjugate gradient algorithm to incorporate left or right preconditioners and significantly accelerating convergence for symmetric positive definite systems arising in finite difference and finite element discretizations. The following year, J. A. Meijerink and H. A. van der Vorst introduced incomplete LU (ILU) factorization as an algebraic preconditioner, providing stable incomplete decompositions for sparse matrices, especially M-matrices, and enabling robust preconditioning for nonsymmetric problems. Gene Golub's contributions to PCG and related eigenvalue analyses further solidified these techniques' theoretical foundations.

During the 1980s and 1990s, preconditioning evolved toward scalable methods for large-scale problems, driven by advances in parallel computing and multilevel techniques. Multigrid methods, pioneered by Achi Brandt in the late 1970s and refined throughout the 1980s, emerged as powerful preconditioners by leveraging hierarchical grids to eliminate error components across scales, achieving mesh-independent convergence rates. Domain decomposition preconditioners, developed concurrently by several research groups, partitioned problems into subdomains for parallel solution, with overlapping variants like Schwarz methods enhancing robustness. In 1997, Marcus J. Grote and Thomas Huckle introduced sparse approximate inverse (SPAI) preconditioners, optimizing sparse matrices to approximate the inverse via least-squares minimization and offering explicit, parallelizable alternatives to factorization-based approaches. Anne Greenbaum's spectral analyses during this period provided critical insights into convergence bounds and preconditioner efficacy for Krylov methods.

In the 2000s, preconditioning extended beyond linear systems to nonlinear and time-dependent contexts, with variable preconditioners adapting to evolving problem structures. For nonlinear problems, Xiao-Chuan Cai and David E. Keyes proposed additive Schwarz preconditioned inexact Newton (ASPIN) methods in 2002, combining domain decomposition with nonlinear elimination to precondition Jacobian approximations in optimization and PDE solvers. This evolution facilitated applications in constrained optimization, where block preconditioners addressed indefinite KKT systems, and in time-dependent simulations, where variable metric updates maintain effectiveness amid changing coefficients.

Preconditioning for Linear Systems

Core Concepts and Description

Preconditioning addresses the challenge of solving large-scale sparse linear systems of the form Ax = b, where A is an n \times n sparse matrix with n potentially reaching millions, and direct methods such as Gaussian elimination become infeasible due to high computational cost and storage requirements, particularly when A is ill-conditioned. These systems arise frequently in scientific computing, such as discretizations of partial differential equations, where sparsity preserves structure but ill-conditioning amplifies numerical errors and slows convergence of iterative solvers. The core idea of preconditioning is to introduce a matrix P, typically nonsingular and easier to invert than A, that approximates A (so that P^{-1} approximates A^{-1}) such that the preconditioned matrix P^{-1}A has eigenvalues clustered near unity, effectively reducing the condition number \kappa(P^{-1}A) and improving solver efficiency. In the ideal case, P^{-1}A approaches the identity matrix, minimizing the eigenvalue spread and accelerating convergence.

Preconditioning can be applied in left, right, or split forms: left preconditioning transforms the system to P^{-1}Ax = P^{-1}b; right preconditioning solves AP^{-1}y = b followed by x = P^{-1}y; and split preconditioning uses P = P_1 P_2 to form P_1^{-1} A P_2^{-1} y = P_1^{-1} b with x = P_2^{-1} y, offering flexibility for symmetric or structured problems. For stationary iterative methods, such as those based on splittings like Jacobi or successive over-relaxation (SOR), preconditioning reduces the spectral radius of the iteration matrix, ensuring faster asymptotic convergence rates. In Krylov subspace methods, it clusters eigenvalues to diminish the effective dimension of the Krylov subspace needed for convergence. When A is symmetric positive definite (SPD), preconditioners are often chosen to be SPD as well, enabling methods like the preconditioned conjugate gradient (PCG) that exploit symmetry for efficiency. For general nonsymmetric matrices, preconditioners must handle potential indefiniteness, with methods like GMRES accommodating left, right, or split variants to maintain stability. The transformed system emphasizes minimization in a preconditioned inner product defined as \langle x, y \rangle_M = x^T M y, where M is SPD (often M = P^T P for nonsymmetric P), altering the geometry of the space to favor rapid convergence. This inner product induces a norm \| z \|_M = \sqrt{\langle z, z \rangle_M} under which the preconditioned residual or error is minimized, providing a theoretically grounded framework for convergence analysis.
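
The three application modes can be checked on a tiny dense system. The sketch below uses plain NumPy with a Jacobi-type diagonal standing in for a realistic P (an arbitrary illustrative choice); it verifies that left, right, and split preconditioning all recover the same solution, differing only in how the transformation is organized.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-posed nonsymmetric test matrix
b = rng.standard_normal(n)
x_ref = np.linalg.solve(A, b)

# Preconditioner P: simply the diagonal of A, split as P = P1 P2 with P1 = P2 = sqrt(D).
D = np.diag(np.diag(A))
P1 = np.sqrt(D)
P2 = np.sqrt(D)

# Left:  P^{-1} A x = P^{-1} b
x_left = np.linalg.solve(np.linalg.solve(D, A), np.linalg.solve(D, b))
# Right: A P^{-1} y = b,  x = P^{-1} y
y = np.linalg.solve(A @ np.linalg.inv(D), b)
x_right = np.linalg.inv(D) @ y
# Split: P1^{-1} A P2^{-1} z = P1^{-1} b,  x = P2^{-1} z
z = np.linalg.solve(np.linalg.inv(P1) @ A @ np.linalg.inv(P2), np.linalg.inv(P1) @ b)
x_split = np.linalg.inv(P2) @ z

for name, x in [("left", x_left), ("right", x_right), ("split", x_split)]:
    print(name, np.linalg.norm(x - x_ref))        # all differences ~ machine precision
```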

Preconditioned Iterative Methods

Iterative methods for solving large sparse linear systems Ax = b often rely on Krylov subspace techniques, which generate solution approximations within the Krylov subspace \mathcal{K}_k(A, r_0) = \operatorname{span}\{r_0, Ar_0, \dots, A^{k-1}r_0\}, where r_0 = b - Ax_0 is the initial residual. For symmetric positive definite (SPD) matrices A, the conjugate gradient (CG) method minimizes the A-norm of the error over the Krylov subspace at each step, ensuring orthogonality of residuals and conjugacy of search directions. For nonsymmetric systems, the generalized minimal residual (GMRES) method minimizes the Euclidean norm of the residual over the Krylov subspace, while the biconjugate gradient stabilized (BiCGSTAB) method provides a computationally efficient alternative with smoother convergence behavior by combining elements of biconjugate gradients with stabilizing polynomials.

Preconditioning integrates into these Krylov methods by transforming the system to improve its spectral properties, typically via left preconditioning P^{-1}Ax = P^{-1}b or right preconditioning A P^{-1}y = b with x = P^{-1}y, where P approximates A. In the preconditioned conjugate gradient (PCG) method for SPD systems, the search directions incorporate preconditioned residuals z_k = P^{-1}r_k, leading to an error bound in the A-norm (energy norm): \frac{\|e_k\|_{A}}{\|e_0\|_{A}} \leq 2 \left( \frac{\sqrt{\kappa(P^{-1}A)} - 1}{\sqrt{\kappa(P^{-1}A)} + 1} \right)^k, where e_k = x - x_k is the error and \kappa(M) = \lambda_{\max}(M)/\lambda_{\min}(M) is the spectral condition number. For symmetric indefinite systems, the preconditioned MINRES method extends this framework by minimizing the residual norm over the Krylov subspace while handling indefinite spectra through Lanczos tridiagonalization. These preconditioned variants maintain the short-recurrence properties of their unpreconditioned counterparts, enabling efficient implementation for large-scale problems.

A common approach to constructing preconditioners is matrix splitting, where A = P - Q with P chosen for easy inversion, serving as an approximation to A since P^{-1} \approx A^{-1} when Q is small relative to P in an appropriate norm. For incomplete LU factorization with no fill (ILU(0)), P = LU, where L and U are sparse lower and upper triangular factors computed by Gaussian elimination with all fill-in outside the sparsity pattern of A discarded, preserving sparsity at the cost of an inexact factorization. Similarly, successive over-relaxation (SOR) defines a splitting-based preconditioner, and its symmetric version (SSOR) combines forward and backward sweeps to give P = (D - \omega L)D^{-1}(D - \omega U) (up to a scalar factor) for relaxation parameter \omega, preserving symmetry for SPD matrices and acting as P^{-1} \approx A^{-1} through an inexpensive approximation of the inverse. In practice, the residual is computed as r_{k+1} = b - A x_{k+1}, and the search direction is updated using the preconditioned residual z_k = P^{-1} r_k to accelerate convergence within the Krylov subspace.

Convergence theory for these methods emphasizes the condition number \kappa(P^{-1}A); for PCG on SPD systems, the number of iterations needed to reduce the relative error below a tolerance \epsilon scales as O(\sqrt{\kappa(P^{-1}A)} \log(1/\epsilon)), reflecting slower convergence for ill-conditioned matrices due to eigenvalue spread. Effective preconditioning targets \kappa(P^{-1}A) \approx 1 by clustering eigenvalues near unity, ideally achieving mesh-independent convergence rates in discretized PDEs. For nonsymmetric cases like preconditioned GMRES or BiCGSTAB, convergence is often superlinear but lacks sharp worst-case bounds, though preconditioning typically reduces effective condition numbers and iteration counts significantly in practice.
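
The following sketch shows a preconditioned CG solve in SciPy on a 2-D Poisson matrix. SciPy's spilu (a thresholded incomplete LU) stands in for the ILU(0)/IC(0) factorizations discussed above, and the callback simply counts iterations so the preconditioned and unpreconditioned runs can be compared; the grid size and drop tolerance are arbitrary illustrative choices.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

nx = 40
I = sp.identity(nx)
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(nx, nx))
A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()          # 5-point Laplacian, SPD
b = np.ones(A.shape[0])

counts = {}
def make_counter(key):
    counts[key] = 0
    def cb(xk):                                      # CG calls this once per iteration
        counts[key] += 1
    return cb

x0, _ = spla.cg(A, b, callback=make_counter("plain CG"))

ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)   # approximate sparse LU factors of A
M = spla.LinearOperator(A.shape, matvec=ilu.solve)   # action of P^{-1} on a vector
x1, _ = spla.cg(A, b, M=M, callback=make_counter("ILU-preconditioned CG"))

print(counts)   # the preconditioned run typically needs far fewer iterations
```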

Geometric and Spectral Interpretations

Preconditioning can be interpreted geometrically as a change of variables that alters the geometry of error propagation in iterative methods for solving linear systems Ax = b. In the original A-norm, defined as \|x\|_A = \sqrt{x^T A x} for a symmetric positive definite (SPD) matrix A, the set of errors with unit A-norm forms an ellipsoid aligned with the eigenvectors of A. Applying a preconditioner P introduces the preconditioned norm \|x\|_P = \sqrt{x^T P x}, under which the elongated error ellipsoid of the original system becomes close to a unit ball (a sphere in the ideal case), simplifying the convergence analysis of methods like the conjugate gradient algorithm. This transformation is often realized via a Cholesky factorization of the preconditioner, which scales and rotates the error contours to make them more isotropic, thereby reducing the distortion caused by the ill-conditioning of A.

From a spectral perspective, preconditioning aims to cluster the eigenvalues of the preconditioned matrix P^{-1}A tightly around 1, which promotes robust and rapid convergence of iterative solvers. For SPD matrices A, the effective condition number \kappa(P^{-1}A) is determined by the ratio of the largest to smallest eigenvalues, \kappa(P^{-1}A) = \lambda_{\max}(P^{-1}A) / \lambda_{\min}(P^{-1}A), and an ideal preconditioner minimizes this ratio by pushing eigenvalues toward unity and away from extremes. In structured problems, such as those arising from discretized differential operators, optimal preconditioners like circulant approximations achieve eigenvalue clustering near 1 (or near \{-1, 1\} for indefinite cases), ensuring the eigenvalue distribution remains bounded independently of the problem size. This clustering directly bounds the number of iterations required for convergence, as the error reduction per step depends on the eigenvalue spread.

For non-Hermitian matrices, the field of values (numerical range) \mathcal{W}(P^{-1}A) = \{ z \in \mathbb{C} : z = u^* (P^{-1}A) u, \, \|u\|=1 \} provides a convex set containing the eigenvalues, and preconditioning reduces its spread to improve the convergence of methods like GMRES. A well-chosen preconditioner confines \mathcal{W}(P^{-1}A) to a small disk or region around 1, keeping it well separated from the origin and thus bounding residual reduction rates independently of problem parameters. In nonsymmetric saddle-point systems, where zero may lie in the field of values, robust preconditioners must control the skew-symmetric part so that the numerical range remains well behaved and the resulting convergence estimates are independent of the discretization dimension.
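
A quick numerical illustration of this clustering view follows (illustrative NumPy only; the "preconditioner" is a slightly perturbed copy of A, a stand-in for an incomplete factorization). The symmetrically preconditioned matrix L^{-1} A L^{-T}, which shares its spectrum with M^{-1}A, has eigenvalues packed near 1 even though the spectrum of A spans four decades.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 80
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.logspace(0, 4, n)) @ Q.T          # ill-conditioned SPD matrix

M = A + 0.01 * np.diag(np.diag(A))                   # stand-in preconditioner, M ~ A
L = np.linalg.cholesky(M)
Linv = np.linalg.inv(L)
B = Linv @ A @ Linv.T                                # same spectrum as M^{-1} A

lam_A = np.sort(np.linalg.eigvalsh(A))
lam_B = np.sort(np.linalg.eigvalsh(B))
print("eigenvalues of A       span", lam_A[0], "to", lam_A[-1])
print("eigenvalues of M^{-1}A span", lam_B[0], "to", lam_B[-1])    # clustered near 1
print("kappa(A) =", lam_A[-1] / lam_A[0], "  kappa(M^{-1}A) =", lam_B[-1] / lam_B[0])
```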

Advanced Preconditioning Techniques

Variable preconditioning extends standard techniques by allowing the preconditioner P to vary across iterations, denoted P_k at the k-th step, which is particularly useful for solving sequences of linear systems arising in time-dependent or space-varying problems, such as those from discretized partial differential equations (PDEs). This flexibility accommodates evolving system matrices, as in unsteady simulations where the operator changes with time, enabling more accurate approximations tailored to each system without recomputing a fixed preconditioner from scratch. Seminal work introduced variable preconditioning through nested GMRES methods like GMRESR, where an inner GMRES iteration approximates the preconditioner action at each outer step, demonstrating reduced computational cost for unsymmetric systems in practical applications.

Nonlinear preconditioning addresses challenges in solving nonlinear equations by enhancing fixed-point iterations, such as Picard iterations, with techniques that accelerate convergence without linearizing the full system. Anderson acceleration, a prominent method, modifies the fixed-point update by minimizing a residual combination over a window of previous iterates, effectively acting as a nonlinear preconditioner that improves the robustness of outer solvers like Newton-Krylov methods for systems derived from nonlinear PDEs. In this context, left preconditioning integrates the nonlinear fixed-point step to precondition the linearized subproblems within Newton iterations, leading to faster convergence for stiff nonlinear systems in areas like non-isothermal flow simulations. The approach originates from Anderson's 1965 mixing technique but was formalized for fixed-point iterations in modern analysis, showing equivalence to GMRES on linear problems and quadratic convergence under suitable conditions.

Random preconditioning leverages randomized algorithms to construct approximate inverses or low-rank updates efficiently, particularly for high-dimensional linear systems where exact factorization is prohibitive. Randomized sketches, such as those based on Gaussian projections, reduce the matrix to a lower-dimensional subspace, enabling low-rank approximations that serve as preconditioners for methods like conjugate gradients. A key example is randomized Nyström preconditioning, which approximates the matrix via sketching and a truncated eigenvalue decomposition to form a low-rank preconditioner, achieving a bounded condition number independently of the system size when the sketch size matches the effective numerical rank. This technique is especially effective for kernel matrices in machine learning, where it accelerates iterative solvers by capturing the dominant spectral components with minimal overhead.

Spectrally equivalent preconditioning constructs a matrix P such that the preconditioned matrix P^{-1}A has eigenvalues confined to a small interval, ensuring robust convergence of iterative methods regardless of the mesh size in discretized PDEs. Formally, this requires bounds of the form \alpha\, v^T P v \leq v^T A v \leq \beta\, v^T P v \quad \text{for all } v \neq 0, with \beta / \alpha close to 1, which confines the spectrum of P^{-1}A to the interval [\alpha, \beta]. Construction typically involves approximate decompositions, such as those from multilevel methods or domain decomposition, where additive Schwarz preconditioners provide spectral equivalence with constants independent of mesh parameters. In additive Schwarz frameworks, the preconditioner aggregates local solvers over overlapping subdomains, yielding bounded spectra via partition-of-unity arguments and stable decomposition properties of finite element spaces. This approach underpins robust solvers for elliptic PDEs, with the condition number scaling at most logarithmically with subdomain count in optimized variants.
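
As a sketch of the randomized ideas above, the snippet below builds a rank-k Nyström approximation of a synthetic PSD matrix and uses it, in the spirit of Nyström-PCG, as a preconditioner for the regularized system (A + \mu I)x = b. All sizes, the regularization \mu, and the sketch rank are arbitrary illustrative choices, and the explicit condition-number check is only feasible because the example is small.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, mu = 300, 80, 1e-2

# Synthetic PSD matrix with rapid spectral decay (typical of kernel matrices).
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.logspace(0, -8, n)) @ Q.T

# Randomized Nystrom approximation A ~ B B^T from a Gaussian sketch.
Omega = rng.standard_normal((n, k))
Y = A @ Omega
C = Omega.T @ Y
L = np.linalg.cholesky(C + 1e-12 * np.eye(k))        # tiny shift for numerical safety
B = np.linalg.solve(L, Y.T).T                        # B = Y L^{-T}, so B B^T = Y C^{-1} Y^T
U, s_half, _ = np.linalg.svd(B, full_matrices=False)
s = s_half**2                                        # approximate top-k eigenvalues of A

def apply_Pinv(v):
    """Apply the Nystrom preconditioner inverse without forming an n x n matrix."""
    coeff = (s[-1] + mu) / (s + mu)
    Utv = U.T @ v
    return U @ (coeff * Utv) + (v - U @ Utv)

# Crude check: condition number of P^{-1}(A + mu I) versus that of A + mu I.
Areg = A + mu * np.eye(n)
PinvAreg = np.column_stack([apply_Pinv(Areg[:, j]) for j in range(n)])
print("kappa(A + mu I)         =", np.linalg.cond(Areg))
print("kappa(P^{-1}(A + mu I)) =", np.linalg.cond(PinvAreg))   # typically much smaller
```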

Specific Examples

One prominent example of a classical preconditioner is the Jacobi preconditioner, which uses the diagonal part of the matrix A to form P = D, where D = \operatorname{diag}(A). This approach is simple to implement and highly parallelizable, as it requires only local operations on each diagonal entry, making it suitable for distributed computing environments. It is particularly effective for matrices that are diagonally dominant, where the magnitude of each diagonal entry exceeds the sum of the absolute values of the off-diagonal entries in the corresponding row. The inverse preconditioner is then P^{-1} = D^{-1}, and in the context of preconditioned Richardson iteration (a basic form of preconditioned gradient descent), the update step is given by x_{k+1} = x_k + P^{-1} (b - A x_k), which clusters eigenvalues near unity and accelerates convergence for well-conditioned problems.

Incomplete factorizations represent another foundational class of preconditioners, approximating the LU or Cholesky factorization of A while controlling computational cost through sparsity patterns. The ILU(k) method, for instance, computes lower and upper triangular factors L and U such that P = LU \approx A, but discards (drops) elements beyond a fill level k, where fill refers to new nonzeros introduced during elimination. The algorithm proceeds by performing Gaussian elimination on A while dropping entries that are smaller than a threshold or that fall outside the sparsity pattern allowed by level k; for k=0, it preserves only the original nonzero structure (ILU(0)). This technique is robust for symmetric positive definite matrices and M-matrices, providing a good approximation to the exact factorization when drop tolerances are tuned appropriately.

Sparse approximate inverse (SPAI) preconditioners construct P directly as an approximation to A^{-1} by minimizing the Frobenius norm \|I - A P\|_F subject to sparsity constraints on P. The algorithm uses column-by-column projections: for each column j of P, it solves a small least-squares problem \min \|A p_j - e_j\|_2, where e_j is the j-th unit vector and p_j is the j-th column of P restricted to a predefined sparse pattern, often derived from the sparsity of A. This yields a sparse, explicit approximate inverse that is easily parallelizable, as columns can be computed independently, and it performs well for nonsymmetric matrices where factorization-based methods may fail. SPAI often requires fewer iterations than diagonal preconditioners in Krylov methods, especially for ill-conditioned systems from PDE discretizations.

Other notable examples include the symmetric successive over-relaxation (SSOR) preconditioner, which applies two sweeps in forward and backward directions with relaxation parameter \omega, yielding P = (D + \omega L) D^{-1} (D + \omega U) (up to a scalar factor), where L and U are the strict lower and upper triangular parts of A; it preserves symmetry and is effective for elliptic PDEs. Polynomial and block preconditioners provide further options: polynomial preconditioners approximate A^{-1} by low-degree polynomials \sum_{i=0}^m c_i A^i, with coefficients fitted, for example, via least-squares minimization of residuals, while block approaches such as the least-squares commutator (LSC) exploit approximate commutator relations between auxiliary operators to build preconditioners for coupled systems; both can reduce condition numbers by up to an order of magnitude in targeted applications.
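
The Jacobi-preconditioned Richardson update quoted above takes only a few lines of code. The sketch below (plain NumPy, with an artificially generated diagonally dominant matrix) is an illustration, not a production solver.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
# Diagonally dominant test matrix: the regime where the Jacobi choice works well.
A = rng.standard_normal((n, n))
A += np.diag(np.abs(A).sum(axis=1) + 1.0)
b = rng.standard_normal(n)

P_inv = 1.0 / np.diag(A)            # Jacobi: P^{-1} = D^{-1}, stored as a vector
x = np.zeros(n)
for k in range(200):
    r = b - A @ x                   # residual
    x = x + P_inv * r               # preconditioned Richardson update
    if np.linalg.norm(r) < 1e-10 * np.linalg.norm(b):
        break

print(k, np.linalg.norm(b - A @ x))   # converges in a modest number of sweeps
```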

Preconditioning for Eigenvalue Problems

Spectral Transformation Methods

Spectral transformation methods play a crucial role in solving large-scale eigenvalue problems of the form A \mathbf{x} = \lambda \mathbf{x}, where A is a large sparse matrix, by reshaping the spectrum to make target eigenvalues more accessible to iterative algorithms. These techniques shift or invert the eigenvalues, compressing unwanted parts of the spectrum while isolating those of interest, such as interior or extreme values, thereby accelerating the convergence of projection methods. Preconditioners are central to these transformations, approximating the inverses of shifted operators to mitigate ill-conditioning and reduce the cost of the linear solves performed within the iterations.

One prominent approach is the shift-and-invert transformation, particularly effective for targeting interior eigenvalues near a shift \sigma. In this method, the original problem is reformulated by solving linear systems of the form (A - \sigma I) \mathbf{y} = \mathbf{x}, which transforms the eigenvalues such that the new eigenvalues \mu satisfy \mu = 1/(\lambda - \sigma); thus, eigenvalues \lambda closest to \sigma become the largest in magnitude in the transformed spectrum. A preconditioner P approximates (A - \sigma I)^{-1}, enabling efficient solution of these systems via iterative solvers like GMRES, which is essential for sparse matrices where direct inversion is infeasible. This preconditioned shift-and-invert strategy enhances the clustering of interior eigenvalues, making them dominant and easy to extract.

For non-Hermitian problems, the Cayley transformation offers a robust alternative. The transformation is defined as (A - \sigma I)(A + \overline{\sigma} I)^{-1}, where \sigma is a complex shift and \overline{\sigma} its conjugate; it maps the original spectrum through a Möbius transformation that sends the imaginary axis to the unit circle, so eigenvalues on the targeted side of the axis are separated from the rest. Preconditioning targets the resolvent factor (A + \overline{\sigma} I), with P approximating its inverse to stabilize the solves, which improves robustness over the pure shift-invert transformation when the linear systems are solved inexactly.

In the preconditioned eigensolver framework, the solver works either with the shift-invert operator (A - \sigma I)^{-1}, whose eigenvalues \mu = 1/(\lambda - \sigma) make the targets extremal, or with a preconditioned operator such as P^{-1}(A - \sigma I), where P clusters the spectrum around unity to accelerate convergence. This setup is typically solved using preconditioned Lanczos iterations for symmetric A (targeting exterior or interior spectra via shifts) or Arnoldi iterations for nonsymmetric cases, where each matrix-vector product involves a preconditioned solve with the shifted system. These methods, such as implicitly restarted variants, efficiently compute a few eigenpairs by deflating converged modes and restarting to focus on the transformed extremal eigenvalues.
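
A minimal SciPy sketch of shift-and-invert for interior eigenvalues follows. For simplicity, a complete sparse LU factorization supplies the (A - \sigma I)^{-1} action through ARPACK's OPinv hook, standing in for the preconditioned iterative solve (e.g., ILU-preconditioned GMRES) that would replace it on very large problems; the shift and mesh size are arbitrary.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

nx = 40
I = sp.identity(nx)
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(nx, nx))
A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()            # 2-D Laplacian (symmetric)

sigma = 1.5                                            # target interior eigenvalues near 1.5
lu = spla.splu((A - sigma * sp.identity(A.shape[0])).tocsc())
OPinv = spla.LinearOperator(A.shape, matvec=lu.solve)  # action of (A - sigma I)^{-1}

# Shift-invert Lanczos: largest-magnitude eigenvalues of (A - sigma I)^{-1}
# correspond to the eigenvalues of A closest to sigma.
vals, vecs = spla.eigsh(A, k=4, sigma=sigma, which='LM', OPinv=OPinv)
print(np.sort(vals))                                   # the four eigenvalues of A nearest 1.5
```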

Ideal and Practical Preconditioning

In the context of eigenvalue problems, the ideal preconditioner is defined as P = (A - \sigma I)^{-1}, where A is the system matrix, I is the identity, and \sigma is a shift chosen near the target eigenvalues to enhance the convergence of iterative methods. This exact inverse transforms the original eigenvalues \lambda_i of A into \mu_i = 1 / (\lambda_i - \sigma) for the preconditioned operator, making those near \sigma extremal and well separated while compressing the rest, thereby improving spectral separation and accelerating methods like Arnoldi or Lanczos. However, computing this preconditioner exactly requires a full factorization of A - \sigma I, which is computationally prohibitive for large sparse matrices due to high setup costs and storage demands. The primary goals of such ideal preconditioning include deflation of known eigenpairs to avoid recomputation and targeted isolation of eigenvalue clusters, though these remain theoretical benchmarks rather than direct implementations.

Practical preconditioners approximate this ideal form to balance accuracy and efficiency, often tailoring approximations to the spectrum of A. For instance, subspace recycling reuses approximate invariant subspaces from prior iterations or related solves to augment Krylov bases, reducing the effective problem dimension and speeding up convergence in sequences of shifted systems. Coarse-grid corrections, drawn from multigrid techniques, project the problem onto lower-resolution grids to approximate low-frequency eigenmodes, providing robust preconditioning for symmetric problems without inner-outer iterations. Other approximations include polynomial-based inverses, such as Chebyshev or least-squares polynomials fitted to (A - \sigma I)^{-1}, which avoid explicit factorizations while preserving spectral tailoring.

A generalized ideal preconditioner P seeks to minimize the spectral spread of the preconditioned operator P^{-1} A, ideally clustering all relevant eigenvalues near 1 to ensure uniform convergence bounds across iterative eigensolvers. Robustness is particularly crucial for handling multiple shifts or clustered eigenvalues, where methods like shift-and-invert combined with deflation separate clusters by removing converged eigenpairs, preventing stagnation in block solvers. Preconditioned subspace iterations further enhance cluster robustness by analyzing convergence in terms of eigenvalue groups rather than individual eigenvalues, yielding tighter bounds for dense clusters.

Key challenges in practical preconditioning include trading off setup costs, such as building multigrid hierarchies or recycling bases, against per-solve speed, where overly complex approximations may not amortize over a small number of iterations. Adaptive preconditioners address varying shifts by dynamically updating approximations, for example through shift and tolerance adjustments in inexact iterations, though they require careful selection to maintain stability. These trade-offs underscore the need for problem-specific tuning to achieve reliable performance in large-scale eigenvalue computations.
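
The effect of the ideal choice P = (A - \sigma I)^{-1} is easy to see numerically. The toy computation below simply applies the map \mu = 1/(\lambda - \sigma) to a made-up spectrum and lists the transformed values by magnitude; the spectrum and shift are invented for illustration.

```python
import numpy as np

lam = np.linspace(0.1, 10.0, 12)      # stand-in spectrum of A
sigma = 4.05                          # shift placed near the interior targets
mu = 1.0 / (lam - sigma)              # spectrum of the ideal operator (A - sigma I)^{-1}

for l, m in sorted(zip(lam, mu), key=lambda t: -abs(t[1])):
    print(f"lambda = {l:6.2f}   ->   mu = {m:8.3f}")
# The two eigenvalues of A closest to sigma (about 3.7 and 4.6) map to the largest
# |mu|, so a Lanczos/Arnoldi run on the transformed operator isolates them first.
```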

Preconditioning in Optimization

Role and Description

In nonlinear optimization, preconditioning plays a crucial role in accelerating gradient-based methods for minimizing an objective function f(\mathbf{x}) by rescaling the search directions to mitigate issues arising from poor variable scaling or ill-conditioned Hessians. The standard preconditioned update takes the form \mathbf{x}_{k+1} = \mathbf{x}_k - \alpha P^{-1} \nabla f(\mathbf{x}_k), where P is a positive definite preconditioning matrix that approximates the Hessian (so that P^{-1} approximates its inverse) or simply rescales the coordinates to improve the conditioning of the problem. This approach transforms the optimization landscape, making the level sets more isotropic and enabling faster convergence than plain gradient descent, especially in high-dimensional settings where variables have disparate scales.

In second-order methods, preconditioning enhances the efficiency of Newton's method, which solves for the step \delta \mathbf{x} in the system H \delta \mathbf{x} = -\nabla f(\mathbf{x}_k), where H is the Hessian or an approximation to it. The preconditioned variant solves the modified system P^{-1} H \delta \mathbf{x} = -P^{-1} \nabla f(\mathbf{x}_k), effectively using P^{-1} as a scaling that clusters the eigenvalues of the preconditioned Hessian near unity, thus reducing the number of iterations needed by the inner linear solver. This is particularly valuable in quasi-Newton methods like BFGS, where limited-memory approximations serve as both the update and a form of adaptive preconditioning to handle large-scale problems without full Hessian storage.

For constrained problems, preconditioning extends to methods like sequential quadratic programming (SQP), where it targets the Jacobian of the constraints to improve the conditioning of the Karush-Kuhn-Tucker (KKT) systems solved at each iteration. In active-set SQP variants, preconditioners based on block factorizations of the Schur complement help stabilize the reduced solves, leading to robust performance in PDE-constrained optimization problems. Overall, these techniques yield faster convergence near the minimum by addressing ill-conditioning from mismatched variable scales, often reducing computational cost in practice without altering the problem's stationary points.
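
A small sketch of the preconditioned update above on a badly scaled convex quadratic follows; the matrix, step sizes, and iteration count are illustrative choices, with P taken as the diagonal of the Hessian.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
scales = np.logspace(0, 4, n)                    # curvatures spanning four decades
H = np.diag(scales)
off = 0.1 * np.sqrt(scales[:-1] * scales[1:])    # weak coupling between neighbours
H += np.diag(off, k=1) + np.diag(off, k=-1)      # H stays symmetric positive definite
b = rng.standard_normal(n)
x_star = np.linalg.solve(H, b)                   # minimizer of f(x) = 0.5 x^T H x - b^T x
grad = lambda x: H @ x - b

def descend(P_inv, alpha, iters=500):
    x = np.zeros(n)
    for _ in range(iters):
        x = x - alpha * (P_inv * grad(x))        # diagonal P^{-1} stored as a vector
    return np.linalg.norm(x - x_star) / np.linalg.norm(x_star)

lam_max = np.linalg.eigvalsh(H)[-1]
print("plain GD       rel. error:", descend(np.ones(n), alpha=1.0 / lam_max))
print("preconditioned rel. error:", descend(1.0 / np.diag(H), alpha=0.9))
```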

Connections to Linear and Eigenvalue Problems

In the context of optimization, preconditioning techniques draw heavily from methods developed for solving large-scale linear systems, particularly in Newton-type methods where each iteration requires solving the Newton system H \delta x = -\nabla f(x), with H denoting the Hessian of the objective function f. Preconditioners such as incomplete LU (ILU) factorization, originally designed for sparse linear systems, are directly applicable to these Hessian equations, improving the convergence of iterative solvers like GMRES by reducing the condition number without a full factorization. This reuse leverages the structural similarities between general linear systems and the quadratic models used in optimization, enabling efficient handling of ill-conditioned Hessians in large-scale problems.

Preconditioning in optimization also connects to eigenvalue problems through the spectral properties of the Hessian, whose eigenvalues characterize the local curvature of f. Large eigenvalue spreads indicate poor conditioning, leading to slow convergence; preconditioners shift or scale the spectrum to promote clustering, keeping the Hessian well-conditioned for subsequent steps. Preconditioning can also help manage indefinite Hessians by promoting positive definiteness of the system actually solved, aiding local convergence in nonconvex settings such as saddle-point problems. Such insights, derived from eigenvalue solvers, guide preconditioner selection to cluster eigenvalues around unity, mirroring techniques used in generalized eigenvalue problems.

Hybrid approaches further bridge these domains, with the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method acting as an approximate preconditioner that builds an implicit approximation to the inverse Hessian from gradient history, linking quasi-Newton updates to Krylov solvers. This limited-memory variant maintains efficiency for high-dimensional problems by storing only recent curvature information, effectively preconditioning Newton-like steps without explicit Hessian storage or factorization. In trust-region methods, preconditioning transforms the subproblem \min_{\delta x} \frac{1}{2} \delta x^T H \delta x + \nabla f^T \delta x subject to \|\delta x\| \leq \Delta, often via a scaling matrix P that yields the preconditioned Hessian P^{-1} H P^{-T}, which facilitates faster solution of the constrained quadratic using preconditioned conjugate gradients.

Techniques from eigenvalue problems also transfer to optimization preconditioner design, particularly through spectral clustering of eigenvalues, where the eigenvalues of graph Laplacians or Hessian-like matrices are grouped to inform partitioning and preconditioning strategies. In optimization, this aids in constructing preconditioners that cluster the spectrum of the preconditioned Hessian, accelerating iterative solvers by mimicking the ideal shifts observed in eigenvalue decompositions. For instance, scaled spectral preconditioners capture outlier eigenvalues and cluster the remainder, drawing on ideas from multigrid and deflation that are rooted in eigenvalue analysis.
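
The L-BFGS connection can be made concrete with the standard two-loop recursion, which applies the implicit inverse-Hessian approximation to a vector and can therefore serve as a matrix-free preconditioner inside a solver. The sketch below is a generic textbook version in NumPy, and the quadratic check at the end only illustrates that the result roughly approximates H^{-1} q for stored pairs with y_i = H s_i; it is not claimed to be exact for arbitrary pairs.

```python
import numpy as np

def lbfgs_apply(q, s_list, y_list):
    """Two-loop recursion: approximate H^{-1} @ q from curvature pairs
    s_i = x_{i+1} - x_i, y_i = grad_{i+1} - grad_i (most recent pair last)."""
    q = q.copy()
    alphas, rhos = [], []
    for s, y in zip(reversed(s_list), reversed(y_list)):    # first loop: newest to oldest
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        alphas.append(a)
        rhos.append(rho)
    s_new, y_new = s_list[-1], y_list[-1]
    q *= (s_new @ y_new) / (y_new @ y_new)                  # initial scaling gamma_k * I
    for (s, y), a, rho in zip(zip(s_list, y_list),          # second loop: oldest to newest
                              reversed(alphas), reversed(rhos)):
        b = rho * (y @ q)
        q += (a - b) * s
    return q

# Rough check on a quadratic f(x) = 0.5 x^T H x: with y_i = H s_i the recursion
# approximates H^{-1} q, and the approximation improves as more pairs are stored.
rng = np.random.default_rng(5)
n, m = 20, 15
M = rng.standard_normal((n, n))
H = M @ M.T + n * np.eye(n)
S = [rng.standard_normal(n) for _ in range(m)]
Y = [H @ s for s in S]
q = rng.standard_normal(n)
approx, exact = lbfgs_apply(q, S, Y), np.linalg.solve(H, q)
print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```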

Variable and Nonlinear Approaches

In variable preconditioning for nonlinear optimization, the preconditioner P_k is updated at each iteration k to incorporate local information about the problem's curvature, such as approximations to the Hessian or its inverse, rather than using a fixed preconditioner throughout the process. This approach enhances convergence by adapting to the changing curvature of non-quadratic objectives, often through quasi-Newton updates that maintain spectral equivalence in the underlying space. For instance, diagonal preconditioners can be derived directly from the gradient \nabla f(x_k) to normalize step directions, reducing sensitivity to ill-conditioning in dynamic landscapes.

Nonlinear preconditioning techniques extend these ideas to handle the full nonlinearity of the objective, particularly in methods like the preconditioned inexact Newton algorithm, which solves an approximate Newton system P_k \delta x = -\nabla f(x_k) together with a globalization strategy (line search or trust region) to ensure global convergence. Here, the preconditioner P_k approximates the local Hessian or Jacobian, and inexact solves (e.g., via Krylov methods) balance computational cost with accuracy. Quasi-Newton updating of Broyden type provides an efficient way to approximate this operator by applying low-rank updates that satisfy the secant condition B_{k+1} (x_{k+1} - x_k) = \nabla f(x_{k+1}) - \nabla f(x_k), avoiding full recomputation at every step. The nonlinear preconditioned update takes the form \delta x_k = -P_k^{-1} \nabla f(x_k), where P_k is constructed to satisfy secant conditions from prior iterations, enabling superlinear convergence under suitable assumptions on the objective's smoothness.

In stochastic settings, such as those encountered in machine learning, preconditioners adapt to gradient noise through methods like Adam and RMSProp, which maintain exponential moving averages of gradient moments to form a diagonal preconditioner that scales updates inversely with the root-mean-square of recent gradients. Recent advances include nonlinear Schwarz preconditioning for bound-constrained nonlinear optimization, which decomposes the problem into subdomains for parallelizable updates with improved scalability (as of 2023), and stochastic preconditioning developed in 2025 to handle expensive nonlinear least-squares problems in neural field optimization. Despite these advances, challenges persist in nonconvex landscapes, where indefinite Hessians can lead to unstable preconditioned directions that fail to decrease the objective, necessitating inertia-revealing techniques to detect and correct negative curvature. Additionally, frequent updates to P_k risk amplifying noise or causing divergence, prompting adaptive freezing strategies that halt recomputation when preconditioner quality metrics (e.g., residual norms) indicate sufficient accuracy.
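
The diagonal stochastic preconditioners mentioned above amount to a few lines of code. The RMSProp-style sketch below (illustrative hyperparameters and a noisy quadratic objective invented for the example) scales each coordinate of the step by the inverse root-mean-square of recent gradients.

```python
import numpy as np

rng = np.random.default_rng(6)
n, lr, beta, eps = 20, 0.05, 0.9, 1e-8
scales = np.logspace(0, 3, n)
grad = lambda x: scales * x + 0.1 * rng.standard_normal(n)   # noisy, badly scaled gradient

x = np.ones(n)
v = np.zeros(n)                           # running estimate of the squared-gradient scale
for k in range(2000):
    g = grad(x)
    v = beta * v + (1 - beta) * g**2      # exponential moving average of g^2
    x = x - lr * g / (np.sqrt(v) + eps)   # diagonally preconditioned update
print("final ||x|| =", np.linalg.norm(x))  # driven near the minimizer x = 0 despite the scaling
```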

Applications and Recent Developments

Practical Applications

Preconditioners play a crucial role in computational fluid dynamics (CFD), particularly for solving the discretized Stokes equations in simulations of fluid flow. Multigrid preconditioners are widely employed to accelerate iterative solvers for these systems, enabling efficient handling of large-scale problems in engineering design and analysis. For instance, matrix-free monolithic multigrid methods precondition the discretized Stokes equations, supporting scalable solves for finite element discretizations.

In machine learning, preconditioners enhance the efficiency of training algorithms by addressing the ill-conditioned matrices that arise in the underlying optimization and regression problems. For Gaussian processes, preconditioning techniques combined with conjugate gradients, re-orthogonalization, and mixed precision accelerate the solution of large kernel (covariance) systems, which is essential for scalable probabilistic predictions. Similarly, in neural network training, preconditioners integrated into stochastic gradient and Hessian-free methods improve convergence by regulating the curvature of the loss landscape and stabilizing the inner conjugate gradient computations.

Engineering applications, such as structural analysis, rely on preconditioners to solve finite element systems arising from partial differential equations like the Poisson equation, which underlies simplified elasticity models. Low-order finite element preconditioners are effective for high-order discretizations in these contexts, providing scalable solutions for predictions in complex geometries. Domain decomposition preconditioners further enable parallel computing for large-scale finite element models, partitioning the domain into subdomains to facilitate distributed solves while maintaining robustness.

In medical imaging, preconditioners support reconstruction algorithms for computed tomography (CT) and magnetic resonance imaging (MRI). Ordered subset preconditioned methods, such as relaxed ordered subset preconditioned alternating projection, accelerate the expectation-maximization process in positron emission tomography (PET)/CT reconstruction, improving image quality and reducing computational time for clinical diagnostics.

Performance benefits of preconditioners are evident in case studies from porous media flow simulations, where they significantly reduce iteration counts in iterative solvers. For example, multi-stage preconditioners in compositional flow models achieve parallel speedups on GPUs compared to sequential approaches, enabling faster simulations of oil reservoir dynamics.

Modern and Emerging Techniques

In the 2020s, the integration of machine learning has revolutionized preconditioning by enabling data-driven approximations tailored to complex partial differential equations (PDEs). Neural network-based preconditioners leverage learned models, such as graph neural networks, to approximate matrix factorizations that accelerate methods like conjugate gradient for solving sparse linear systems arising from PDE discretizations. These approaches train on ensembles of similar systems to predict effective preconditioners, addressing limitations of traditional methods in high-dimensional or parametric settings.

Scalable preconditioning techniques have advanced through GPU acceleration and hybrid machine learning-iterative frameworks, enabling efficient handling of massive problems in modern computing environments. GPU-accelerated randomized preconditioners, like those constructing approximate Cholesky factors for Laplacian matrices, parallelize the sampling and factorization processes to achieve near-linear speedup on sparse systems with millions of degrees of freedom. Mixed-precision variants further optimize memory and computation by using lower precision for randomization steps in LSQR solvers, outperforming double-precision baselines by factors of 2-3 on dense least-squares problems. Hybrid methods combine deep operator networks, such as DeepONet, with iterative solvers like GMRES, where the neural component preconditions the system to reduce residual norms faster, particularly for parametric PDEs with up to 10^5 parameters. Another example is neural preconditioned conjugate gradient (neuralPCG), which learns subspace projections to cut solver iterations by 30-70% on time-dependent PDEs.

Quantum-inspired preconditioners draw from variational quantum algorithms to tackle linear systems on noisy intermediate-scale quantum (NISQ) devices, extending classical ideas to quantum linear algebra. Variational preconditioners enhance the variational quantum linear solver by incorporating classical preconditioning steps, such as incomplete factorizations, to condition the quantum phase estimation and improve solution accuracy on systems with condition numbers exceeding 10^4. Hybrid HHL variants integrate variational quantum eigensolvers with classical iterative refinement, reducing circuit complexity and enabling solutions on current hardware for sparse matrices of up to 2^10 dimensions. These methods precondition the input to cluster eigenvalues, achieving exponential speedup over classical solvers for well-conditioned subsystems while mitigating noise through error-corrected post-processing.

Emerging techniques address critical gaps in preconditioning for exascale computing and nonlinear problems through adaptive and physics-aware strategies. In exascale environments, libraries like hypre provide scalable algebraic multigrid preconditioners with adaptive coarsening, supporting simulations on 10^6+ cores while maintaining convergence rates below 1.5 iterations per time step for fluid dynamics. Adaptive deep learning for nonlinear preconditioning employs nonlinear elimination within Newton frameworks, bounding errors to 10^{-6} for transport equations by dynamically selecting subspaces based on residual growth. Post-2023 developments apply matrix preconditioning frameworks to physics-informed neural networks (PINNs), using adjoint-based preconditioning to stabilize training and improve convergence on ill-conditioned PDEs like 3D aerodynamics by factors of 10 in loss reduction.
As of 2025, further advances include neural-operator preconditioned Newton methods for parametric nonlinear systems and machine learning techniques like PCGBandit for one-shot acceleration of transient PDE solvers. A key recent innovation involves learning approximate inverses via autoencoders, where the preconditioner P is trained to satisfy P \approx A^{-1} by encoding matrix sparsity patterns and decoding low-rank approximations, achieving sparsity levels of 99% with relative errors under 10^{-3} for elliptic operators. Generative models further refine sparse approximate inverse (SPAI) preconditioners using diffusion processes on graph representations of A, enabling deployment on heterogeneous architectures for seismic imaging tasks. These learned inverses integrate seamlessly with iterative solvers, demonstrating robustness across out-of-distribution matrices in large-scale applications.
