Fixed-point iteration
Fixed-point iteration is a fundamental method in numerical analysis for computing fixed points of a function g, defined as points p where g(p) = p.[1] This technique is widely used to approximate solutions to nonlinear equations of the form f(x) = 0 by reformulating them into an equivalent fixed-point problem, such as g(x) = x - f(x), where the roots of f correspond to the fixed points of g.[2] The algorithm begins with an initial approximation x_0 and generates a sequence of iterates via x_{n+1} = g(x_n) until the values stabilize within a specified tolerance or a maximum number of iterations is reached.[3] This iterative process relies on the function g being well-behaved, and practical implementations often include stopping criteria based on the difference between consecutive iterates, such as |x_{n+1} - x_n| < \epsilon.[1]

Convergence of the method to a unique fixed point is ensured if g is continuous on a closed interval [a, b], maps [a, b] into itself, and satisfies |g'(x)| \leq k < 1 for some constant k and all x \in [a, b].[3] Under these conditions, the error decreases linearly, with bounds like |x_n - p| \leq k^n |x_0 - p|, making the method reliable for sufficiently contractive mappings.[2] The choice of g is critical, as different rearrangements of the original equation can affect the convergence rate and stability.[1]

Fundamentals
Definition
Fixed-point iteration is a fundamental numerical method in applied mathematics used to approximate solutions to equations by identifying fixed points of a suitably chosen function. Given a function g: D \to D, where D is a subset of a metric space (X, d), a fixed point x^* \in D satisfies g(x^*) = x^*.[4] Here, the metric space (X, d) consists of a set X with a distance function d: X \times X \to [0, \infty) that obeys the properties of non-negativity (with d(x, y) = 0 if and only if x = y), symmetry (d(x, y) = d(y, x)), and the triangle inequality (d(x, z) \leq d(x, y) + d(y, z)).[5] This setup generalizes the concept beyond real numbers, allowing application in abstract spaces while assuming familiarity with basic functions and sequences.[3]

The iteration process starts with an initial guess x_0 \in D and constructs a sequence \{x_n\}_{n=0}^\infty via the recursive formula x_{n+1} = g(x_n), \quad n = 0, 1, 2, \dots The objective is for this sequence to converge to a fixed point x^*, providing an approximation to the solution of the original problem.[1] Fixed points often arise from reformulating equations; for instance, to solve f(x) = 0, one may define g(x) = x - f(x), so that roots of f correspond to fixed points of g.[6]

The origins of fixed-point iteration trace to the method of successive approximations, first introduced by Joseph Liouville in 1837 and systematically advanced by Émile Picard around 1890, particularly for establishing existence of solutions to differential equations through iterative schemes.[7] As a dedicated numerical technique for root-finding, it emerged prominently in the early 20th century, bolstered by theoretical advancements such as L. E. J. Brouwer's 1912 theorem guaranteeing the existence of fixed points for continuous self-maps on compact convex subsets of Euclidean space.[8]

Basic Examples
To illustrate fixed-point iteration, consider the general process of generating a sequence where each term is obtained by applying a function g to the previous one, such that the limit satisfies the desired fixed-point equation if it converges.[9] A classic introductory example is the Babylonian method for computing square roots, which reformulates the equation x^2 = 2 (or more generally x^2 = s) as the fixed-point problem x = g(x) with g(x) = \frac{1}{2} \left( x + \frac{2}{x} \right). This method, known since ancient Babylonian times around 1500 BC, provides rapid approximations through iteration.[10] Starting with an initial guess x_0 = 1, x_1 = \frac{1}{2} \left( 1 + \frac{2}{1} \right) = 1.5, x_2 = \frac{1}{2} \left( 1.5 + \frac{2}{1.5} \right) \approx 1.4167, x_3 = \frac{1}{2} \left( 1.4167 + \frac{2}{1.4167} \right) \approx 1.4142. The sequence approaches \sqrt{2} \approx 1.414213562, demonstrating how the iterates refine the solution.[10]

For a linear equation ax + b = 0, the fixed-point form can be constructed by selecting a suitable g(x) to promote convergence, such as through relaxation techniques. Consider the equation 5x - 10 = 0, which has solution x = 2. One reformulation is x = g(x) with g(x) = 0.5x + 1, derived from x_{n+1} = x_n - \omega (5x_n - 10) using relaxation parameter \omega = 0.1. Starting from x_0 = 0, x_1 = 0.5 \cdot 0 + 1 = 1, x_2 = 0.5 \cdot 1 + 1 = 1.5, x_3 = 0.5 \cdot 1.5 + 1 = 1.75, x_4 = 0.5 \cdot 1.75 + 1 = 1.875. The terms steadily approach 2, illustrating the iterative refinement for linear problems.

The choice of g significantly affects the iteration for a given equation.
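The examples above can be reproduced with a short, generic iteration loop. The sketch below (the function name, tolerance, and iteration cap are illustrative choices, not from the cited sources) applies such a loop to the Babylonian map g(x) = (x + 2/x)/2:

```python
def fixed_point_iteration(g, x0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = g(x_n) until consecutive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Babylonian map for sqrt(2): g(x) = (x + 2/x) / 2
root = fixed_point_iteration(lambda x: 0.5 * (x + 2.0 / x), 1.0)
```

Starting from x_0 = 1 this runs through the iterates 1.5, 1.4167, 1.4142, ... listed above, stopping once consecutive iterates agree to within the tolerance.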
For instance, the transcendental equation x = \cos x (in radians) can be solved directly via x_{n+1} = \cos x_n, starting from x_0 = 0, yielding iterates that converge to the fixed point known as the Dottie number, approximately 0.739085.[9] Alternative g functions for the same equation might involve trigonometric identities or approximations, potentially altering the sequence's path. To build intuition for these iterations, plotting the sequence values against n reveals the approach to the fixed point, while a basic cobweb diagram, constructed by drawing vertical lines from the line y = x to the curve y = g(x) and horizontal lines from the curve back to y = x, visually traces the trajectory of the iterates.[11]

Convergence Analysis
Attracting and Repelling Fixed Points
In fixed-point iteration, fixed points of a function g are classified as attracting, repelling, or neutral based on the local behavior of iterates near the fixed point x^*, where g(x^*) = x^*, assuming g is differentiable at x^*. This classification relies on the magnitude of the derivative g'(x^*) to determine stability.[12]

An attracting fixed point occurs when |g'(x^*)| < 1. In this case, iterates starting sufficiently close to x^* converge to it, as small perturbations diminish over iterations. To derive this condition, consider the error e_n = x_n - x^* in the iteration x_{n+1} = g(x_n). Using the first-order Taylor expansion around x^*, g(x_n) = g(x^*) + g'(x^*)(x_n - x^*) + o(|x_n - x^*|), which simplifies to x_{n+1} - x^* \approx g'(x^*)(x_n - x^*), or e_{n+1} \approx g'(x^*) e_n. Iterating this yields e_n \approx [g'(x^*)]^n e_0, so the error decays geometrically to zero as n \to \infty if |g'(x^*)| < 1.[12][13]

Conversely, a repelling fixed point arises when |g'(x^*)| > 1. Here, the same Taylor approximation shows that errors amplify, |e_{n+1}| \approx |g'(x^*)| |e_n|, causing iterates to diverge from x^* unless starting exactly at it. This instability makes repelling fixed points unsuitable for reliable convergence in fixed-point iteration.[12][14]

When |g'(x^*)| = 1, the fixed point is neutral, and the behavior is more subtle, depending on higher-order terms in the Taylor expansion or the specific form of g. For instance, it may be weakly attracting from one side and repelling from the other, or exhibit slower convergence without clear attraction or repulsion. Analysis requires examining the sign of g'(x^*) and subsequent derivatives to resolve these cases.[14]

For attracting fixed points, the convergence is linear, with asymptotic error constant \mu = |g'(x^*)|, meaning the error is reduced by a factor of \mu < 1 per iteration.
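The error recurrence e_{n+1} \approx g'(x^*) e_n can be checked numerically. The sketch below is illustrative: it uses g(x) = \cos x, with its fixed point (the Dottie number) supplied as a precomputed reference value, and tracks the ratio of successive errors, which approaches |g'(x^*)| = \sin x^*:

```python
import math

g = math.cos
x_star = 0.7390851332151607  # Dottie number, precomputed reference value

x = 0.5
for _ in range(20):
    x_new = g(x)
    ratio = abs(x_new - x_star) / abs(x - x_star)  # |e_{n+1}| / |e_n|
    x = x_new

# ratio approaches |g'(x_star)| = |-sin(x_star)| ~ 0.6736, so roughly
# -log10(0.6736) ~ 0.17 decimal digits of accuracy are gained per step
```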
The number of accurate decimal digits gained per step is approximately -\log_{10} \mu, providing a practical measure of efficiency. Smaller \mu yields faster convergence, though values too close to 1 result in sluggish progress.[13][12]

Banach Fixed-Point Theorem
The Banach fixed-point theorem, also known as the contraction mapping theorem, states that if (X, d) is a complete metric space and g: X \to X is a contraction mapping—meaning there exists a constant k with 0 \leq k < 1 such that d(g(x), g(y)) \leq k \, d(x, y) for all x, y \in X—then g has a unique fixed point x^* \in X (i.e., g(x^*) = x^*), and the sequence defined by x_{n+1} = g(x_n) for any initial x_0 \in X converges to x^* as n \to \infty.[15] This result provides a powerful guarantee of global convergence for fixed-point iterations under the contraction condition, distinguishing it from local convergence analyses.[15]

The theorem was formulated by Stefan Banach in his 1922 paper "Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales," published in Fundamenta Mathematicae, building on earlier ideas of contractions in the context of integral equations by Émile Picard and others.[16][17] Banach's work, stemming from his 1920 doctoral dissertation, extended these concepts to abstract complete metric spaces, providing both existence and uniqueness with a constructive iterative proof, a significant advancement over prior existential fixed-point results like those of Henri Poincaré (1886) and Luitzen E. J. Brouwer (1909–1912).[17]

To outline the proof, consider the sequence \{x_n\} generated by x_{n+1} = g(x_n) starting from arbitrary x_0 \in X. The contraction property implies d(x_{n+1}, x_n) \leq k^n d(x_1, x_0) for all n \geq 1, so for n > m the triangle inequality gives d(x_m, x_n) \leq \sum_{i=m}^{n-1} d(x_{i+1}, x_i) \leq \frac{k^m}{1-k} d(x_1, x_0), showing \{x_n\} is Cauchy; by completeness of X, it converges to some x^* \in X. Continuity of g (as a contraction) yields g(x^*) = x^*.
For uniqueness, suppose y^* \neq x^* is another fixed point; then d(x^*, y^*) = d(g(x^*), g(y^*)) \leq k \, d(x^*, y^*), implying d(x^*, y^*) (1 - k) \leq 0, so x^* = y^* since k < 1.[15]

A key application is the Picard–Lindelöf theorem on existence and uniqueness of solutions to initial value problems for ordinary differential equations. For y' = f(t, y) with y(t_0) = y_0, where f is continuous and Lipschitz in y on a suitable rectangle, the integral operator T(y)(t) = y_0 + \int_{t_0}^t f(s, y(s)) \, ds on a closed ball in the space of continuous functions is a contraction for small time intervals, yielding a unique fixed point as the solution by the Banach theorem.[18]

General Convergence Conditions
A sufficient condition for the convergence of fixed-point iteration x_{n+1} = g(x_n) to a fixed point x^* is that g is differentiable and |g'(x)| \leq k < 1 for all x in a closed interval [a, b] containing x^* and the initial guess x_0.[19] This follows from the mean value theorem, which implies that the error satisfies |x_{n+1} - x^*| \leq k |x_n - x^*|, ensuring linear convergence with rate k from any starting point in the interval.[2] For local convergence near x^*, it suffices that |g'(x)| \leq k < 1 in a small neighborhood around x^*, provided x_0 lies within this region.[19]

If g is monotonically increasing on the interval and maps it into itself with |g'(x)| < 1, the iterates converge monotonically to x^* without overshooting (for decreasing g with |g'(x)| < 1, the iterates instead alternate around x^*).[19] For example, for g(x) = (x^3 + 2)/7 on [0, 1], which is increasing and satisfies the derivative condition, the sequence approaches the fixed point monotonically, from below for initial guesses beneath it.[19]

Convergence fails if |g'(x^*)| > 1, leading to divergence as errors amplify. For instance, with g(x) = 3 + 2 \sin x, the fixed point is approximately x^* \approx 3.094, but |g'(x^*)| \approx 1.998 > 1, causing the iterates to diverge from any initial guess.[20] If g'(x^*) < -1, the sequence may oscillate with increasing amplitude.
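In practice, the bound |g'(x)| \leq k < 1 can be estimated numerically. This sketch (the helper name, grid size, and finite-difference step are illustrative choices) applies a central-difference estimate to the convergent and divergent examples above:

```python
import math

def max_abs_deriv(g, a, b, n=1000, h=1e-6):
    """Estimate max |g'(x)| on [a, b] via central differences on a uniform grid."""
    xs = (a + i * (b - a) / n for i in range(n + 1))
    return max(abs(g(x + h) - g(x - h)) / (2 * h) for x in xs)

bound_good = max_abs_deriv(lambda x: (x**3 + 2) / 7, 0.0, 1.0)      # exact max is 3/7 < 1
bound_bad = max_abs_deriv(lambda x: 3 + 2 * math.sin(x), 2.5, 3.5)  # exact max is 2 > 1
```

Here bound_good comes out near 3/7, so iteration converges on [0, 1], while bound_bad is near 2 on an interval around the fixed point x^* \approx 3.094, consistent with the observed divergence.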
Even when |g'(x^*)| = 1, convergence is not guaranteed and often fails due to insufficient error reduction.[19] Higher-order convergence occurs if g'(x^*) = 0, resulting in at least quadratic rates where the error e_{n+1} \approx C e_n^2 for some constant C > 0.[21] This is derived from the Taylor expansion around x^*, where the linear term vanishes and the quadratic term dominates.[21]

To verify these conditions practically, one can analytically compute or bound |g'(x)| over the relevant interval to check if it is less than 1.[2] Numerically, iterate the method from test initial points and monitor the error or residual |x_{n+1} - x_n| for stabilization, while plotting g'(x) to assess the derivative behavior near suspected fixed points.[19]

Iterative Methods
Construction of Iterative Methods
Fixed-point iteration is a fundamental technique for solving nonlinear equations of the form f(x) = 0, where the goal is to find a root x^* such that f(x^*) = 0. The general strategy involves reformulating the equation as x = g(x), where g is a suitably chosen function whose fixed point coincides with the root of f. This rearrangement transforms the root-finding problem into an iterative process defined by x_{n+1} = g(x_n), starting from an initial guess x_0 sufficiently close to x^*.[22]

A critical aspect of constructing an effective g is selecting a form that promotes convergence. For local convergence near the fixed point x^*, the derivative must satisfy |g'(x)| < 1 throughout a neighborhood of x^*, ensuring that each iteration reduces the error. Additionally, the choice of g should avoid inducing periodic cycles, where the sequence oscillates between points without approaching x^*, which can occur if |g'(x)| \geq 1 at certain points or if the mapping is not contractive.[2]

For polynomial equations, rearrangement techniques focus on isolating the variable x in a way that emphasizes dominant terms, such as the highest-degree or linear components, to define g(x). For instance, given a polynomial f(x) = a_k x^k + \cdots + a_1 x + a_0 = 0, one may solve for x by dividing through by the leading coefficient and expressing lower-order terms as a function of the higher ones, or vice versa, while verifying the contraction condition on g'. This approach balances computational simplicity with the need for |g'(x)| < 1 near the anticipated root.[23]

In the multivariable case, fixed-point iteration extends naturally to systems \mathbf{F}(\mathbf{x}) = \mathbf{0} where \mathbf{x} \in \mathbb{R}^n. The construction rewrites the system as \mathbf{x} = \mathbf{g}(\mathbf{x}), with \mathbf{g}: \mathbb{R}^n \to \mathbb{R}^n, and proceeds by vector iteration \mathbf{x}_{n+1} = \mathbf{g}(\mathbf{x}_n).
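As a minimal sketch of the multivariable case, consider the illustrative system x = (x^2 + y^2 + 8)/10, y = (xy^2 + x + 8)/10, chosen here for demonstration: it has the fixed point (1, 1), where the Jacobian of \mathbf{g} has spectral radius 0.4 < 1, so the vector iteration converges from nearby starting points.

```python
def G(v):
    """One step of the vector iteration for the illustrative 2-D system."""
    x, y = v
    return ((x**2 + y**2 + 8.0) / 10.0, (x * y**2 + x + 8.0) / 10.0)

v = (0.0, 0.0)      # initial guess
for _ in range(60):
    v = G(v)        # v approaches the fixed point (1, 1)
```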
Convergence requires that the spectral radius of the Jacobian matrix D\mathbf{g}(\mathbf{x}) be less than 1 in a neighborhood of the fixed point, often checked using a suitable vector norm.[24]

A common pitfall in constructing g is selecting a rearrangement where |g'(x^*)| > 1, leading to divergence as errors amplify with each step. For example, consider the quadratic equation f(x) = x^2 - 4x + 3.5 = 0, with roots approximately 1.293 and 2.707. Rearranging as x = g(x) = x - f(x) = -x^2 + 5x - 3.5 yields fixed points at these roots, but |g'(1.293)| \approx 2.414 > 1, so iterations starting near 1.293 diverge, while those near 2.707 converge since |g'(2.707)| \approx 0.414 < 1. This illustrates how the same equation can yield both attracting and repelling fixed points depending on the local derivative behavior.[2]

Specific Iterative Method Examples
One prominent example of a fixed-point iteration for root-finding is the Newton-Raphson method, which reformulates the root-finding problem f(x) = 0 as finding a fixed point of the function g(x) = x - \frac{f(x)}{f'(x)}.[25] This iteration generates a sequence x_{k+1} = g(x_k), and under the condition that f'(x^*) \neq 0 at the root x^*, it exhibits quadratic convergence, meaning the error typically squares with each iteration once sufficiently close to the root.[26]

A derivative-free alternative is the secant method, which approximates the derivative using finite differences and can be viewed as a fixed-point iteration g(x_n, x_{n-1}) = x_n - f(x_n) \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}, requiring two initial guesses and updating with previous iterates.[27] This method achieves superlinear convergence of order approximately 1.618 (the golden ratio), offering efficiency without explicit derivative computations; it needs only one new function evaluation per step, though it typically requires more iterations than Newton-Raphson to reach a given accuracy.[27]

For solving systems of linear equations Ax = b, where A = L + D + U with D diagonal, L strictly lower triangular, and U strictly upper triangular, the Jacobi method defines the iteration x^{(k+1)} = D^{-1} (b - (L + U) x^{(k)}), treating it as a fixed-point problem with the iteration matrix -D^{-1}(L + U).[28] The Gauss-Seidel method modifies this by incorporating newly computed components immediately, yielding x^{(k+1)} = D^{-1} (b - L x^{(k+1)} - U x^{(k)}), which often converges faster for diagonally dominant matrices due to reduced information lag.[28]

Iteration counts for these methods vary by problem conditioning; for instance, Newton-Raphson typically requires 5-10 iterations for machine precision on well-behaved nonlinear functions, while secant may need 8-15 due to its slightly slower order.[29] Error estimates follow from convergence theory: for Newton-Raphson, the asymptotic error constant involves \frac{|f''(x^*)|}{2|f'(x^*)|}, enabling quadratic
reduction; secant errors scale with a constant related to the second derivative.[26] For Jacobi and Gauss-Seidel, convergence occurs if the spectral radius of the iteration matrix is less than 1, with error norms decreasing geometrically at rate equal to that radius, often yielding 20-50 iterations for sparse, diagonally dominant systems.[30]

| Method | Pros | Cons |
|---|---|---|
| Newton-Raphson | Quadratic convergence; efficient near root | Requires derivative computation; sensitive to poor initial guesses |
| Secant | No derivatives needed; superlinear order | Needs two initial points; irregular convergence behavior |
| Jacobi | Simple parallelizable implementation | Slower convergence; requires storing previous iterate fully |
| Gauss-Seidel | Faster than Jacobi for many matrices | Sequential updates limit parallelism; may diverge if not diagonally dominant |
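The Jacobi/Gauss-Seidel contrast in the table can be demonstrated with a small pure-Python sketch (the 3×3 system, tolerance, and function names are illustrative choices). Both solvers report how many sweeps they needed; on this diagonally dominant system Gauss-Seidel stops sooner:

```python
def jacobi(A, b, tol=1e-10, max_iter=500):
    """Jacobi iteration: every component is updated from the previous iterate."""
    n = len(b)
    x = [0.0] * n
    for it in range(1, max_iter + 1):
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new, it
        x = x_new
    return x, max_iter

def gauss_seidel(A, b, tol=1e-10, max_iter=500):
    """Gauss-Seidel: newly computed components are used immediately."""
    n = len(b)
    x = [0.0] * n
    for it in range(1, max_iter + 1):
        x_old = list(x)
        for i in range(n):
            x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        if max(abs(x[i] - x_old[i]) for i in range(n)) < tol:
            return x, it
    return x, max_iter

A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]   # strictly diagonally dominant
b = [5.0, 6.0, 5.0]      # exact solution is (1, 1, 1)

xj, itj = jacobi(A, b)
xg, itg = gauss_seidel(A, b)   # typically reaches tol in fewer sweeps than Jacobi
```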