
Steffensen's method

Steffensen's method is a derivative-free iterative method for solving nonlinear equations of the form f(x) = 0, where f is a sufficiently smooth function, achieving quadratic convergence similar to Newton's method but without requiring derivative computations. The method, proposed by Danish mathematician Johan Frederik Steffensen in his 1933 paper "Remarks on Iteration," approximates the Newton step using a finite-difference scheme based on function values alone, specifically through the iteration x_{n+1} = x_n - \frac{[f(x_n)]^2}{f(x_n + f(x_n)) - f(x_n)}, which requires two function evaluations per step. This approach stems from applying Aitken's \Delta^2 acceleration technique to fixed-point iterations, enhancing convergence for problems where explicit derivatives are unavailable or costly to obtain. Developed amid early 20th-century advances in numerical analysis, Steffensen's method addressed limitations of plain iteration schemes by combining three successive approximations to extrapolate toward the fixed point, as detailed in Steffensen's original work published in Skandinavisk Aktuarietidskrift.

Its quadratic convergence, meaning the error roughly squares with each iteration under suitable conditions (for instance, f twice continuously differentiable near the root r with f'(r) \neq 0), makes it efficient for scalar equations, though it demands careful initial-guess selection to avoid divergence or slow progress. Compared to the secant method, which is also derivative-free but converges more slowly (order approximately 1.618), Steffensen's method attains a higher order of convergence, although its efficiency index (about \sqrt{2} \approx 1.414, reflecting the two function evaluations per step) is slightly lower than the secant method's; it remains robust for a wide class of nonlinear problems in science and engineering. Over time, the method has inspired numerous extensions, including multipoint variants of higher order (up to eighth order in some families) and multivariate adaptations for systems of equations, often incorporating memory or weight functions to preserve the derivative-free character. These developments, explored in the modern literature, highlight its versatility, for example in derivative-free optimization where gradients may be noisy or absent, and in applications requiring stable root-finding without derivative information. Despite its strengths, challenges persist, including potential numerical instability if the denominator nears zero, prompting safeguards such as a fallback to simpler iterations in practical implementations. Overall, Steffensen's method remains a cornerstone of numerical root-finding, valued for its balance of simplicity, efficiency, and broad applicability.

Introduction

Overview

Steffensen's method is an iterative numerical algorithm designed to find roots of the equation f(x) = 0 by accelerating the convergence of fixed-point iteration applied to a function g(x) satisfying f(x) = x - g(x). This approach transforms the root-finding problem into locating a fixed point where x = g(x), and leverages sequence acceleration to improve efficiency. The basic workflow starts with an initial guess x_0 close to the root. Each subsequent iterate is generated by computing auxiliary values based on g and using a finite-difference extrapolation to refine the estimate, effectively achieving quadratic convergence without requiring derivative evaluations of f. Unlike Newton's method, which relies on explicit derivatives for its quadratic convergence, Steffensen's method operates in a fully derivative-free manner, making it particularly useful when derivatives are unavailable or costly to evaluate. As a brief illustration, consider solving x^2 - 2 = 0, whose positive root is \sqrt{2} \approx 1.414. One possible fixed-point formulation is g(x) = 2/x. Beginning with x_0 = 1, standard fixed-point iteration simply oscillates between 1 and 2 and never converges. Steffensen's method, by incorporating an acceleration step that extrapolates the limit from these early iterates, produces a refined estimate of 1.5 after the first step and approaches \sqrt{2} rapidly thereafter, demonstrating its ability to overcome the limitations of plain fixed-point iteration.
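A minimal Python sketch of this single acceleration step (using the fixed-point form of the update introduced later in this article) shows the computation:
python
# One Steffensen (Aitken-accelerated) step for the fixed point of g(x) = 2/x,
# whose fixed point is sqrt(2); plain iteration from x0 = 1 just cycles between 1 and 2.
def steffensen_step(g, x):
    gx = g(x)          # first fixed-point value
    ggx = g(gx)        # second fixed-point value
    # Aitken's delta-squared extrapolation of the three values x, gx, ggx
    return x - (gx - x) ** 2 / (ggx - 2 * gx + x)

g = lambda x: 2 / x
x = 1.0
for _ in range(3):
    x = steffensen_step(g, x)
    print(x)           # 1.5, then 1.4166..., then 1.41421..., approaching sqrt(2)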

Historical Context

Steffensen's method was developed by the Danish mathematician and actuary Johan Frederik Steffensen (1873–1961), who introduced it in his 1933 paper "Remarks on Iteration," published in the Skandinavisk Aktuarietidskrift. Steffensen, a professor of insurance mathematics at the University of Copenhagen from 1923 to 1943, made significant contributions to numerical analysis during the early twentieth century, building on his earlier work in interpolation and actuarial computation as detailed in his 1927 book Interpolation. His research in actuarial science emphasized practical numerical techniques for solving equations in insurance and probability contexts, where efficient approximations were essential. The method emerged amid broader advances in convergence acceleration during the 1920s and 1930s, a period marked by efforts to improve iterative processes for numerical computation. Steffensen's approach was influenced by earlier work, notably Alexander Aitken's delta-squared process introduced in 1926, which aimed to accelerate slowly converging sequences. The initial motivation was to improve the convergence of fixed-point iterations used for finding roots of equations, addressing limitations of manual calculation where derivative evaluations were impractical or unavailable. With the advent of digital computers after World War II, Steffensen's method saw increased use in numerical computation owing to its simplicity and quadratic convergence. It has since become a standard topic in numerical analysis, with the core algorithm essentially unchanged, and it remains relevant wherever derivative information is unavailable.

Background Concepts

Fixed-Point Iteration

Fixed-point iteration is a fundamental technique in numerical analysis for solving nonlinear equations of the form f(x) = 0. The method reformulates the problem by constructing a function g(x) such that a root of f(x) corresponds to a fixed point of g, where g(p) = p. A common choice is g(x) = x - f(x), and the iteration proceeds as x_{n+1} = g(x_n) starting from an initial guess x_0 sufficiently close to the fixed point p. This generates a sequence \{x_n\} that ideally converges to p. Convergence of the iteration requires that |g'(x)| < 1 in a neighborhood of the fixed point p, ensuring the sequence approaches p at a linear rate, meaning the error satisfies e_{n+1} \approx k e_n for some constant 0 < k < 1. If |g'(p)| \geq 1, the iteration may fail to converge or may diverge. A simple example illustrates the method applied to f(x) = x^2 - 2 = 0, whose positive root is \sqrt{2} \approx 1.414213562. Choosing g(x) = \frac{x + 2/x}{2} (a variant of the Babylonian method for square roots), start with x_0 = 1:
x_1 = \frac{1 + 2/1}{2} = 1.5,
x_2 = \frac{1.5 + 2/1.5}{2} \approx 1.416667,
x_3 = \frac{1.416667 + 2/1.416667}{2} \approx 1.414216,
x_4 \approx 1.414213562.
The sequence converges rapidly to \sqrt{2} because this particular g happens to satisfy g'(p) = 0; in general, fixed-point iterations exhibit only slow linear convergence unless the choice of g yields g'(p) = 0.
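A minimal Python sketch of this plain fixed-point iteration, using the Babylonian g above, reproduces these iterates:
python
# Plain fixed-point iteration x_{n+1} = g(x_n) for g(x) = (x + 2/x)/2,
# whose fixed point is sqrt(2).
def fixed_point(g, x0, tol=1e-12, max_iter=50):
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:   # stop when successive iterates agree
            return x_new
        x = x_new
    return x

g = lambda x: (x + 2 / x) / 2
print(fixed_point(g, 1.0))         # 1.4142135623730951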
The method's limitations include its linear order of convergence (order 1), which may require many iterations for high precision, especially when |g'(p)| is close to 1. Convergence is also highly sensitive to the choice of g(x); poor selections lead to divergence when |g'(p)| \geq 1. Acceleration techniques such as Aitken's \Delta^2 process can speed up such linearly convergent sequences. The roots of fixed-point iteration trace back to Picard's method of successive approximations in the 1890s, which uses repeated substitution to prove existence and uniqueness of solutions to ordinary differential equations; practical use in numerical root-finding became prominent in the early 20th century.

Aitken's Delta-Squared Process

Aitken's delta-squared process, developed by Alexander C. Aitken in 1926, serves as a foundational technique for accelerating the convergence of sequences in statistical and numerical analysis. This method targets sequences exhibiting linear convergence and employs forward differences to extrapolate a more accurate approximation to the limit in a single step. For a sequence \{x_n\} assumed to converge linearly to a limit L, the process defines the first forward difference as \Delta x_n = x_{n+1} - x_n and the second forward difference as \Delta^2 x_n = \Delta x_{n+1} - \Delta x_n = x_{n+2} - 2x_{n+1} + x_n. The accelerated value \hat{x}_n is then computed using the formula \hat{x}_n = x_n - \frac{(\Delta x_n)^2}{\Delta^2 x_n}. This expression can also be rewritten in the algebraically equivalent form \hat{x}_n = \frac{x_n x_{n+2} - x_{n+1}^2}{x_{n+2} - 2x_{n+1} + x_n}, although in finite-precision arithmetic the difference-based form above is usually preferred because it suffers less from cancellation. The method relies on the assumption that the sequence converges linearly, meaning the error satisfies e_n = x_n - L = \alpha r^n + o(r^n) for some constants \alpha \neq 0 and 0 < |r| < 1.

To derive the formula under the linear convergence assumption, substitute x_k = L + \alpha r^k + o(r^k) for k = n, n+1, n+2. The first difference becomes \Delta x_n = x_{n+1} - x_n = \alpha r^n (r - 1) + o(r^n), and the second difference is \Delta^2 x_n = \alpha r^n (r - 1)^2 + o(r^n). Then \frac{(\Delta x_n)^2}{\Delta^2 x_n} = \frac{[\alpha r^n (r - 1)]^2 + o(r^{2n})}{\alpha r^n (r - 1)^2 + o(r^n)} = \alpha r^n + o(r^n), since the higher-order terms o(r^{2n}) are negligible compared to o(r^n) for |r| < 1. Thus \hat{x}_n = x_n - [\alpha r^n + o(r^n)] = L + \alpha r^n + o(r^n) - \alpha r^n - o(r^n) = L + o(r^n). This eliminates the leading error term, so the accelerated sequence converges to L faster than the original; when the extrapolation is reapplied at every step using freshly generated iterates, as in Steffensen's method, the result is quadratic convergence. The process assumes essentially linear error behavior; if the error deviates strongly from this model, the acceleration may fail or introduce inaccuracies.

An illustrative example is the sequence x_n = 1 - (1/2)^n for n \geq 0, which converges linearly to L = 1 with error -(1/2)^n. The first few terms are x_0 = 0, x_1 = 0.5, x_2 = 0.75. Here \Delta x_0 = 0.5, \Delta^2 x_0 = 0.75 - 2 \cdot 0.5 + 0 = -0.25, so \hat{x}_0 = 0 - \frac{(0.5)^2}{-0.25} = 0 - \frac{0.25}{-0.25} = 1, yielding the exact limit in one application. Subsequent applications to the accelerated sequence maintain this precision, demonstrating the method's effectiveness for errors of purely geometric form.
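A minimal Python sketch of the process, applied to the three terms of the example sequence above, performs the single-step acceleration:
python
# Aitken's delta-squared extrapolation from three consecutive terms of the
# linearly convergent sequence x_n = 1 - (1/2)**n, whose limit is 1.
def aitken(x0, x1, x2):
    dx = x1 - x0               # first forward difference
    d2x = x2 - 2 * x1 + x0     # second forward difference
    return x0 - dx ** 2 / d2x

seq = [1 - 0.5 ** n for n in range(3)]   # [0.0, 0.5, 0.75]
print(aitken(*seq))                      # 1.0, the exact limit in one step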

Method Formulation

Iteration Formula

Steffensen's method updates the iterate x_n to x_{n+1} by applying Aitken's \Delta^2 extrapolation to the fixed-point iteration x_{n+1} = g(x_n), where g is a continuous function satisfying g(\alpha) = \alpha at the desired fixed point \alpha, often constructed as g(x) = x - f(x) for solving f(\alpha) = 0 with f continuous near \alpha. The core iteration formula is x_{n+1} = x_n - \frac{(g(x_n) - x_n)^2}{g(g(x_n)) - 2g(x_n) + x_n}, which requires two evaluations of g per step (or equivalently, two of f when using the g(x) = x - f(x) form), since the three values x_n, g(x_n), and g(g(x_n)) enter the formula. This formula arises from Aitken's \Delta^2 process applied to the triple x_n, y = g(x_n), z = g(y), yielding the extrapolated value without computing additional points. An equivalent and commonly used formulation directly in terms of f, likewise requiring two evaluations of f per iteration, is x_{n+1} = x_n - \frac{[f(x_n)]^2}{f(x_n + f(x_n)) - f(x_n)}. This approximates the Newton-Raphson step by replacing the derivative with a divided difference: \frac{f(x_n + f(x_n)) - f(x_n)}{f(x_n)} \approx f'(x_n). Substituting g(x) = x - f(x) into the fixed-point form yields an expression of the same type, except that the difference is taken with the backward step x_n - f(x_n) rather than the forward step x_n + f(x_n); both variants are derivative-free and converge quadratically under the same assumptions. The method assumes an initial guess x_0 close to \alpha, with f continuous and g satisfying suitable smoothness conditions in a neighborhood of \alpha so that the approximations remain valid. As an illustrative numerical example, consider solving f(x) = e^x - 3 = 0 (root \ln 3 \approx 1.0986) using the direct formula with initial guess x_0 = 1:
  • f(x_0) = e^1 - 3 \approx -0.2817,
  • x_0 + f(x_0) \approx 0.7183,
  • f(0.7183) \approx e^{0.7183} - 3 \approx -0.9491,
  • Denominator: -0.9491 - (-0.2817) = -0.6674,
  • x_1 = 1 - \frac{(-0.2817)^2}{-0.6674} \approx 1 + 0.1189 \approx 1.1189.
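A short Python sketch of this single direct-form step reproduces the numbers above:
python
import math

# One step of Steffensen's direct formula for f(x) = exp(x) - 3, starting at x0 = 1.
f = lambda x: math.exp(x) - 3

x0 = 1.0
fx = f(x0)                      # ~ -0.2817
fu = f(x0 + fx)                 # ~ -0.9491
x1 = x0 - fx ** 2 / (fu - fx)   # Steffensen update
print(x1)                       # ~ 1.1189, moving toward ln(3) ~ 1.0986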

Algorithm Steps

To implement Steffensen's method for solving the nonlinear equation f(x) = 0, the efficient direct formulation is recommended. The algorithm proceeds as follows:
  1. Select an initial approximation x_0 close to the expected root.
  2. Specify a tolerance \epsilon > 0 and a maximum number of iterations N.
  3. Set iteration counter n = 0.
  4. While n < N:
    • Compute f(x_n).
    • If |f(x_n)| < \epsilon, output x_n as the approximate root and terminate.
    • Compute u = x_n + f(x_n).
    • Compute f(u).
    • If the denominator f(u) - f(x_n) is zero or smaller in magnitude than a small threshold (e.g., 10^{-10}), indicating potential numerical instability, apply a safeguard such as the relaxed step x_{n+1} = x_n + 0.5 f(x_n) or a fallback to bisection if a bracketing interval is available, then set n = n + 1 and continue the loop.
    • Otherwise, compute the next approximation: x_{n+1} = x_n - \frac{[f(x_n)]^2}{f(u) - f(x_n)}.
    • If |x_{n+1} - x_n| < \epsilon or |f(x_{n+1})| < \epsilon, output x_{n+1} as the approximate root and terminate.
    • Set n = n + 1.
  5. If the maximum iterations N is reached without convergence, output a failure message.
The choice of initial x_0 and ensuring the denominator does not vanish are crucial; the method converges quadratically if x_0 is sufficiently close to the root \alpha and f'(\alpha) \neq 0. For cases where the underlying fixed-point iteration with g(x) = x - f(x) diverges or converges very slowly (|g'(\alpha)| \geq 1 or close to 1), one can reformulate g as g(x) = x - \lambda f(x) for a small constant \lambda > 0, or use algebraic manipulation to find a g with |g'(\alpha)| < 1. The direct formulation requires two evaluations of f per iteration for the update, with an additional evaluation for the |f(x_{n+1})| check if needed; implementations often count two per iteration for efficiency comparisons.
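A compact Python sketch of these steps, including the safeguarded denominator check, might look as follows (the relaxed fallback step and its 0.5 factor are illustrative choices rather than part of the original method):
python
def steffensen_safeguarded(f, x0, eps=1e-8, max_iter=100):
    """Direct-form Steffensen iteration with a relaxed fallback step
    when the finite-difference denominator is too small."""
    x = x0
    for n in range(max_iter):
        fx = f(x)
        if abs(fx) < eps:                    # residual already small enough
            return x, n
        u = x + fx
        denom = f(u) - fx
        if abs(denom) < 1e-10:               # near-zero denominator safeguard
            x_next = x + 0.5 * fx            # relaxed step (illustrative fallback)
        else:
            x_next = x - fx ** 2 / denom     # Steffensen update
        if abs(x_next - x) < eps:            # step size small enough
            return x_next, n + 1
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Example: root of x**3 + 4*x**2 - 10 near 1.3652, starting from 1.5
root, iters = steffensen_safeguarded(lambda x: x**3 + 4*x**2 - 10, 1.5)
print(root, iters)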

Derivation

From Aitken Extrapolation

Steffensen's method can be derived by applying Aitken's \Delta^2 extrapolation to accelerate the convergence of a linearly converging fixed-point iteration. Consider the fixed-point iteration defined by x_{n+1} = g(x_n), where the sequence \{x_n\} converges linearly to the fixed point \rho satisfying g(\rho) = \rho, assuming |g'(\rho)| < 1. Aitken's delta-squared process extrapolates a linearly converging sequence \{x_n\} to obtain an improved estimate \hat{x}_n using three consecutive terms: \hat{x}_n = x_n - \frac{(x_{n+1} - x_n)^2}{x_{n+2} - 2x_{n+1} + x_n}. This formula assumes the errors in the sequence are asymptotically reduced by a constant factor at each step, which is what makes the extrapolation effective. In the context of fixed-point iteration, substitute x_{n+1} = g(x_n) and x_{n+2} = g(x_{n+1}) = g(g(x_n)) into the Aitken formula to yield \hat{x}_n = x_n - \frac{(g(x_n) - x_n)^2}{g(g(x_n)) - 2g(x_n) + x_n}. Steffensen's method takes this extrapolated value as the next iterate and repeats the process with freshly computed values of g, so the method is precisely the repeated application of Aitken extrapolation to the fixed-point sequence. Under the assumption that g is twice continuously differentiable near \rho with g'(\rho) \neq 1 and the initial guess is sufficiently close, this substitution raises the linear convergence of the original fixed-point iteration to quadratic convergence. The resulting method, first proposed by Steffensen in 1933, avoids explicit computation of derivatives while achieving the same order of convergence as Newton's method for root-finding.
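As an illustration of this fixed-point form, a minimal Python sketch using g(x) = cos x (whose fixed point is approximately 0.739085, and for which plain iteration converges only linearly) reads:
python
import math

# Steffensen's method in its fixed-point (Aitken) form for x = g(x) with g = cos.
def steffensen_fixed_point(g, x0, tol=1e-12, max_iter=50):
    x = x0
    for _ in range(max_iter):
        gx = g(x)
        ggx = g(gx)
        denom = ggx - 2 * gx + x
        if denom == 0:                        # already at the fixed point (to precision)
            return x
        x_new = x - (gx - x) ** 2 / denom     # Aitken-extrapolated iterate
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

print(steffensen_fixed_point(math.cos, 0.5))  # ~0.739085133..., the Dottie number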

Using Finite Differences

An alternative derivation of Steffensen's method approximates the derivative in Newton's method by a finite-difference quotient, underscoring its derivative-free character. Consider the equation f(x) = 0 and Newton's iteration x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. To avoid computing f'(x_n), approximate it with the first-order divided difference using the step size h = f(x_n): f'(x_n) \approx \frac{f(x_n + h) - f(x_n)}{h} = \frac{f(x_n + f(x_n)) - f(x_n)}{f(x_n)}. Substituting this approximation yields x_{n+1} = x_n - \frac{f(x_n)^2}{f(x_n + f(x_n)) - f(x_n)}, which requires only two function evaluations per iteration. Because the step h = f(x_n) shrinks as the iterate approaches the root, the divided difference becomes an increasingly accurate derivative estimate, which is what preserves the quadratic rate. This approach can be linked to the fixed-point formulation by setting g(x) = x - f(x), where the root satisfies g(x^*) = x^*; the divided-difference approximation then mimics a derivative-free Newton step for the equivalent equation x - g(x) = 0. To see the connection to the Aitken-based derivation, note that the denominator f(x_n + f(x_n)) - f(x_n) plays the role of the second difference in the iterate sequence, although the forward step h = f(x_n) differs from the backward step x_n - f(x_n) that arises when iterating g directly. The finite-difference perspective highlights how Steffensen's method extends the secant method's use of first-order difference quotients to achieve quadratic convergence without derivatives.
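A small sketch comparing this derivative substitute with the true derivative (using f(x) = e^x - 3, for which f'(x) = e^x is known; the helper name slope is chosen here purely for illustration) makes the behavior concrete:
python
import math

# Compare Steffensen's divided-difference slope with the true derivative
# for f(x) = exp(x) - 3 (so f'(x) = exp(x)) at points approaching the root ln 3.
f = lambda x: math.exp(x) - 3
fprime = lambda x: math.exp(x)

def slope(x):
    fx = f(x)
    return (f(x + fx) - fx) / fx   # divided difference with step h = f(x)

for x in (1.5, 1.2, 1.1, math.log(3) + 1e-3):
    print(x, slope(x), fprime(x))  # the two values agree better as x approaches ln 3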

Convergence Properties

Order of Convergence

Steffensen's method exhibits quadratic convergence under appropriate conditions on the fixed-point function g. Specifically, if g is three times continuously differentiable in a neighborhood of the fixed point \rho where g(\rho) = \rho, g'(\rho) \neq 1, and the initial approximation is sufficiently close to \rho, then the method converges with order 2. The asymptotic error relation is e_{n+1} \approx \frac{g'(\rho)\, g''(\rho)}{2\,(g'(\rho) - 1)} e_n^2, where e_n = x_n - \rho denotes the error at iteration n. To establish this, consider the Taylor expansion of g around \rho: g(x) = \rho + g'(\rho) (x - \rho) + \frac{1}{2} g''(\rho) (x - \rho)^2 + O((x - \rho)^3). Applying the iteration formula x_{n+1} = x_n - \frac{(g(x_n) - x_n)^2}{g(g(x_n)) - 2g(x_n) + x_n} and substituting the expansions for g(x_n), g(g(x_n)), and the differences shows that the linear error term is eliminated, leaving a dominant quadratic term proportional to e_n^2 with the constant above; higher-order terms confirm the order is exactly 2 whenever this constant is nonzero. In comparison to standard fixed-point iteration, which typically converges linearly with order 1 when 0 < |g'(\rho)| < 1, Steffensen's method achieves order 2, akin to Newton's method, but without requiring derivative evaluations. The order can change in special cases: if g'(\rho) = 0 or g''(\rho) = 0, the asymptotic constant vanishes and the method attains cubic or higher-order convergence, depending on higher derivatives such as g'''(\rho); conversely, when g'(\rho) = 1 (which in the root-finding formulation g(x) = x - f(x) corresponds to f'(\rho) = 0, i.e., a multiple root), the analysis breaks down and convergence degrades.
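A numerical check of this error relation (a minimal sketch using g(x) = cos x, whose fixed point is \rho \approx 0.7390851332 and for which the constant above evaluates to roughly -0.149) can be written as:
python
import math

# Ratios e_{n+1} / e_n**2 for Steffensen's method on x = cos(x); they should
# approach g'(rho) * g''(rho) / (2 * (g'(rho) - 1)) ~ -0.149 at rho ~ 0.7390851332.
rho = 0.7390851332151607
g = math.cos

x = 0.9
errors = []
for _ in range(4):
    gx, ggx = g(x), g(g(x))
    x = x - (gx - x) ** 2 / (ggx - 2 * gx + x)
    errors.append(x - rho)

for e_prev, e_next in zip(errors, errors[1:]):
    # early ratios settle near -0.149; the last one is dominated by rounding
    # once the error reaches machine precision
    print(e_next / e_prev ** 2)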

Error Analysis

Steffensen's method exhibits quadratic convergence under suitable conditions on the fixed-point function g(x), where the root \rho satisfies g(\rho) = \rho. Specifically, if |g'(x)| \leq \lambda < 1 for x in an interval containing the initial guess and the root, then the error satisfies |e_{n+1}| \leq K |e_n|^2, where K depends on a bound for g''(x) on that interval. This a priori bound ensures that the method accelerates the underlying fixed-point iteration to a quadratic rate, provided the initial approximation is sufficiently close to \rho. In practice, each iteration of Steffensen's method requires two evaluations of the fixed-point function g (or of f in the direct form), which can amplify floating-point rounding errors, particularly when g is ill-conditioned or evaluated near machine epsilon. To mitigate this, double-precision arithmetic is recommended, especially for problems where small changes in input lead to large variations in g, as the denominator in the iteration formula may exacerbate relative errors. Convergence may fail through oscillation if |g'(\rho)| is close to 1, causing the iterates to cycle without approaching the root, a behavior inherited from the base fixed-point iteration. Additionally, the basin of attraction for Steffensen's method is generally smaller than that of Newton's method, limiting its robustness for distant initial guesses due to reduced stability in the divided-difference approximation. A numerical illustration of the error behavior appears in the solution of f(x) = x^3 + 4x^2 - 10 = 0, with root \rho \approx 1.36523 and initial guess x_0 = 1.5, using the fixed-point form g(x) = \sqrt{10 / (x + 4)}. The table below shows the iterates and absolute errors |x_n - \rho|, demonstrating quadratic decay until limited by machine precision.

| Iteration n | x_n | |x_n - \rho| |
|---|---|---|
| 0 | 1.50000 | 0.13477 |
| 1 | 1.36527 | 3.97 \times 10^{-5} |
| 2 | 1.36523 | 2.81 \times 10^{-12} |

This example confirms the rapid error reduction characteristic of quadratic convergence.
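The following sketch, using the fixed-point form with the same g, reproduces this error behavior (the reference root value is obtained here by running the plain fixed-point iteration to machine precision, which converges since |g'(\rho)| \approx 0.13):
python
import math

# Error decay of Steffensen's method for f(x) = x**3 + 4*x**2 - 10 = 0,
# using the fixed-point form g(x) = sqrt(10 / (x + 4)).
g = lambda x: math.sqrt(10.0 / (x + 4.0))

def steffensen_step(x):
    gx, ggx = g(x), g(g(x))
    return x - (gx - x) ** 2 / (ggx - 2 * gx + x)

# Reference value of the root from the plain (linearly convergent) iteration.
rho = 1.5
for _ in range(30):
    rho = g(rho)

x = 1.5
for n in range(3):
    print(n, x, abs(x - rho))   # errors shrink roughly quadratically
    x = steffensen_step(x)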

Advantages and Limitations

Computational Benefits

Steffensen's method is a derivative-free root-finding algorithm, relying solely on evaluations of the function f rather than its derivatives, making it particularly suitable for black-box functions where analytical or numerical differentiation is unavailable or costly. Unlike Newton's method, which requires both f and f' per iteration, Steffensen approximates the derivative using function values alone through an Aitken-like acceleration, enabling its use in scenarios such as simulation-based models or legacy software without gradient access. The method achieves quadratic convergence with two function evaluations per iteration, yielding an efficiency index of \sqrt{2} \approx 1.414, comparable to Newton's but without derivative computation overhead. This makes it faster than linear methods for well-conditioned problems near the root, as the quadratic order reduces error rapidly once sufficiently close. Steffensen's quadratic convergence generally requires fewer iterations than the secant method (which has superlinear order \approx 1.618 but only one evaluation per iteration) or bisection and fixed-point iterations (linear convergence) for achieving high accuracy in smooth functions. In practice, Steffensen excels over fixed-point iteration (linear convergence, often requiring many more steps) and bisection (guaranteed but slowly halving error) for local convergence, while matching secant's derivative-free nature but with superior order for faster resolution in well-behaved problems. Its modest per-iteration cost—two f calls versus secant's one—translates to overall efficiency gains due to fewer iterations needed. Key applications include optimization problems without gradient information, such as parameter tuning in black-box simulations, and resource-constrained environments like embedded systems where derivative approximations would increase computational load. For instance, in real-time control systems, its low overhead supports rapid root-finding for stability analysis without dedicated differentiation routines.

Potential Drawbacks

The method can fail when the denominator in the iteration formula approaches zero, leading to division by zero or numerical instability; for example, in solving f(x) = x^3 - x - 1 = 0 with initial guess x_0 = 1, the formula yields a zero denominator and the iteration breaks down. Steffensen's method exhibits local quadratic convergence similar to Newton's method, but its basin of attraction is relatively small, limiting reliability from distant initial guesses and often necessitating hybrid approaches, such as combining it with the bisection method for global root location. Numerical experiments on test functions have reported convergence basins as small as 4.35% of starting points for Steffensen versus nearly 100% for Newton. Ill-conditioning poses another challenge, particularly when the derivative f'(x) is near zero close to the root, as the finite-difference approximation amplifies rounding errors in the update formula. For multiple roots of multiplicity greater than one, the method degrades to linear convergence, unlike the quadratic rate for simple roots; techniques such as deflation or modified weight functions are required to mitigate this, though they add complexity. Steffensen's method is best avoided for stiff nonlinear equations, where Newton's method (with available derivatives) handles ill-conditioned Jacobians more robustly, and in high-dimensional systems, where extensions demand on the order of n^2 scalar function evaluations per iteration, rendering it inefficient compared to specialized solvers.
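The zero-denominator failure mentioned above is easy to reproduce with a few lines:
python
# Zero denominator in Steffensen's update for f(x) = x**3 - x - 1 with x0 = 1:
# f(1) = -1, so x0 + f(x0) = 0 and f(0) = -1, giving f(u) - f(x0) = 0.
f = lambda x: x**3 - x - 1
x0 = 1.0
fx = f(x0)
denominator = f(x0 + fx) - fx
print(denominator)   # 0.0, so the update x0 - fx**2 / denominator is undefined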

Implementation

Pseudocode

Steffensen's method is typically implemented in a derivative-free manner to approximate roots of the equation f(x) = 0. The algorithm requires an initial guess x_0, a tolerance \mathrm{tol} for checking |f(x)| < \mathrm{tol} or |x_{new} - x| < \mathrm{tol}, and a maximum number of iterations \mathrm{maxiter} (default 100) to prevent infinite loops. Safeguards include verifying that the denominator in the update formula exceeds a small epsilon (e.g., 10^{-14}) in magnitude to avoid division by zero, and logging the iteration count for monitoring. The pseudocode below outlines the procedure in a language-agnostic form, assuming f is a function handle that evaluates the given function at a scalar input. It uses the standard iteration formula and checks convergence on both |f(x)| and the step size.
Algorithm: Steffensen's Method
Input: Function f, initial approximation x0, tolerance tol, maximum iterations maxiter (default 100)
Output: Approximate root x, or failure message
1. Set epsilon = 1e-14
2. Set i = 0
3. Set x = x0
4. While i < maxiter do
   5.   Compute fx = f(x)
   6.   If |fx| < tol then
         6a.  Output x (approximate root)
         6b.  Output "Converged after" i "iterations"
         6c.  Stop
   7.   Set y = x + fx
   8.   Compute fy = f(y)  // Second evaluation
   9.   Set denominator = fy - fx
   10.  If |denominator| < epsilon then
         10a. Output "Division by zero error"
         10b. Stop
   11.  Set x_new = x - (fx)^2 / denominator
   12.  If |x_new - x| < tol then
         12a. Output x_new (approximate root)
         12b. Output "Converged after" i+1 "iterations"
         12c. Stop
   13.  Set x = x_new
   14.  Set i = i + 1
15. Output "Failure after maxiter iterations"
    Output x (best approximation)
This implementation evaluates f twice per iteration, ensuring efficiency while incorporating error handling for numerical stability.

Programming Examples

Steffensen's method can be implemented in programming languages such as Python and MATLAB to solve nonlinear equations numerically. These implementations typically follow the iterative formula x_{n+1} = x_n - \frac{[f(x_n)]^2}{f(x_n + f(x_n)) - f(x_n)}, where convergence is checked against a tolerance on |f(x)| or the change in x. A common test problem is finding the positive root of f(x) = x^2 - 2 = 0, which is \sqrt{2} \approx 1.41421356237, starting from an initial guess x_0 = 1.

Python Example

The following Python function implements Steffensen's method for solving f(x) = 0. It includes iteration counting and stops upon convergence or reaching the maximum iterations.
python
def steffensen(f, x0, tol=1e-6, maxiter=50):
    x = x0
    for i in range(maxiter):
        fx = f(x)
        if abs(fx) < tol:
            return x, i + 1  # Converged within tol on f(x)
        fxpfx = f(x + fx)
        denominator = fxpfx - fx
        if abs(denominator) < tol:
            raise ValueError("Division by small number")
        x_new = x - fx**2 / denominator
        if abs(x_new - x) < tol:
            return x_new, i + 1
        x = x_new
    raise ValueError("Max iterations reached")

# Test example: f(x) = x^2 - 2, x0 = 1
f = lambda x: x**2 - 2
root, iterations = steffensen(f, 1.0)
print(f"Root: {root:.6f}, Iterations: {iterations}")
When executed, this code yields:
Root: 1.414214, Iterations: 8
The method converges in 8 iterations to the root with the given tolerance, demonstrating its quadratic convergence for this quadratic function.

MATLAB Example

A similar implementation in MATLAB is provided below, returning the root and number of iterations.
matlab
function [root, iter] = steffensen(f, x0, tol, maxiter)
    if nargin < 3, tol = 1e-6; end
    if nargin < 4, maxiter = 50; end
    x = x0;
    iter = 0;
    while iter < maxiter
        fx = f(x);
        if abs(fx) < tol
            root = x;
            return;
        end
        fxpfx = f(x + fx);
        denominator = fxpfx - fx;
        if abs(denominator) < tol
            error('Division by small number');
        end
        x_new = x - fx^2 / denominator;
        if abs(x_new - x) < tol
            root = x_new;
            iter = iter + 1;
            return;
        end
        x = x_new;
        iter = iter + 1;
    end
    error('Max iterations reached');
end

% Test example: f(x) = x^2 - 2, x0 = 1
f = @(x) x.^2 - 2;
[root, iter] = steffensen(f, 1.0);
fprintf('Root: %.6f, Iterations: %d\n', root, iter);
Running this produces:
Root: 1.414214, Iterations: 8
This confirms convergence in 8 iterations, consistent with the Python example. For improved performance in higher-dimensional or repeated applications, implementations can be vectorized using NumPy in Python or built-in vector operations in MATLAB, though the scalar version suffices for one-dimensional problems.

Generalizations

To Systems of Equations

Steffensen's method extends naturally to systems of nonlinear equations F(\mathbf{x}) = \mathbf{0}, where \mathbf{x} \in \mathbb{R}^m and F: \mathbb{R}^m \to \mathbb{R}^m is a nonlinear mapping. To apply the method, reformulate the problem as a fixed-point equation \mathbf{x} = G(\mathbf{x}), with G(\mathbf{x}) = \mathbf{x} - F(\mathbf{x}). The root \boldsymbol{\rho} satisfies G(\boldsymbol{\rho}) = \boldsymbol{\rho}, and the iteration accelerates the fixed-point process using finite differences to approximate higher-order terms without derivatives. A common extension applies the Aitken \delta^2 process component-wise to the fixed-point sequence, yielding a Jacobian-free scheme. Compute \mathbf{g} = G(\mathbf{x}_n) and \mathbf{h} = G(\mathbf{g}), requiring two evaluations of F. For each component i = 1, \dots, m, form the second finite difference s_i = h_i - 2 g_i + x_{n,i}, which approximates a scaled second derivative along the iteration path. The updated iterate is then x_{n+1,i} = h_i - \frac{(h_i - g_i)^2}{s_i}, which is algebraically equivalent to the scalar Aitken update applied in each coordinate. This component-wise acceleration mimics the scalar case and preserves the structure of the one-dimensional method. The method achieves quadratic convergence near the root \boldsymbol{\rho} if the spectral radius of G'(\boldsymbol{\rho}) is less than 1, ensuring the underlying fixed-point iteration is locally attractive while the acceleration eliminates the linear error term. For illustration, consider the 2D system F(x,y) = \begin{pmatrix} x^2 + y - 2 \\ x + y^2 - 2 \end{pmatrix} = \mathbf{0}, with root (1,1). The corresponding fixed-point function is G(x,y) = (x - (x^2 + y - 2),\; y - (x + y^2 - 2)). Starting from an initial guess, compute G(\mathbf{x}_n) and G(G(\mathbf{x}_n)), then apply the component-wise update to both coordinates; the error near the solution decreases rapidly, as shown in the sketch below. An alternative formulation uses matrix approximations to capture inter-component coupling, generalizing the second differences into full matrices built from chained fixed-point iterates \mathbf{f}^{(j)}(\mathbf{x}) = G(\mathbf{f}^{(j-1)}(\mathbf{x})) for j = 1, \dots, m, with \mathbf{f}^{(0)}(\mathbf{x}) = \mathbf{x}. The first-difference matrix has columns \mathbf{f}^{(j)} - \mathbf{f}^{(j-1)}, and the second-difference matrix has columns \mathbf{f}^{(j)} - 2\mathbf{f}^{(j-1)} + \mathbf{f}^{(j-2)}. The update solves a linear system using these matrices to approximate the inverse mapping, requiring m evaluations of F but providing a more accurate approximation for coupled systems. Quadratic convergence holds under similar spectral-radius conditions on G'(\boldsymbol{\rho}). The component-wise variant incurs low computational cost with a fixed number of evaluations per iteration, suitable for large m with weak coupling. In contrast, matrix-based approaches require O(m^2) operations to build the difference matrices, plus the cost of a linear solve, while still being derivative-free; this overhead can limit scalability in high dimensions despite the method's efficiency relative to full finite-difference Jacobians.
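A minimal NumPy sketch of the component-wise variant applied to this 2D example (with G(x) = x - F(x) as above) might read:
python
import numpy as np

# Component-wise Steffensen (Aitken) acceleration for the 2D system
# F(x, y) = (x**2 + y - 2, x + y**2 - 2) = 0 with root (1, 1),
# using the fixed-point map G(x) = x - F(x).
def F(v):
    x, y = v
    return np.array([x**2 + y - 2.0, x + y**2 - 2.0])

def G(v):
    return v - F(v)

v = np.array([0.5, 0.5])
for _ in range(5):
    g = G(v)
    h = G(g)
    s = h - 2.0 * g + v              # component-wise second differences
    v = v - (g - v) ** 2 / s         # Aitken update applied in each coordinate
    print(v, np.linalg.norm(F(v)))   # residual norm shrinks rapidly toward zero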

In Banach Spaces

Steffensen's method extends to Banach spaces for solving nonlinear operator equations of the form F(x) = 0, where F: \Omega \subseteq X \to X and X is a Banach space with \Omega an open convex subset. The generalization relies on divided difference operators to approximate the Fréchet derivative of F without requiring its explicit computation, making it suitable for abstract settings where derivatives may be difficult to evaluate. To initiate the iteration, select an initial point x_0 \in \Omega. For each step n \geq 0, compute y_n = x_n + F(x_n), then define a divided difference operator A_n \in L(X) satisfying A_n (y_n - x_n) = F(y_n) - F(x_n). The next iterate is obtained by solving the linear equation A_n z_n = F(x_n) for z_n and setting x_{n+1} = x_n - z_n. This formulation keeps the method derivative-free while mimicking Newton's method as in finite dimensions. In the fixed-point context, the equation F(x) = 0 is equivalent to finding a fixed point \rho of the operator G(x) = x - F(x), where G: \Omega \to X. If G is Lipschitz continuous with constant L < 1 in a neighborhood of \rho, the Banach fixed-point theorem guarantees the existence of a unique fixed point there. The Steffensen iteration accelerates the standard fixed-point scheme x_{n+1} = G(x_n) by incorporating the divided difference approximation, leading to the same update formula as above since F(x_n) = x_n - G(x_n). This setup is particularly useful when the basic fixed-point iteration converges slowly, as the method achieves higher-order convergence under additional regularity assumptions. Quadratic convergence holds locally if F is Fréchet differentiable at the root \rho with F'(\rho) invertible, and F' satisfies a Lipschitz condition \|F'(x) - F'(y)\| \leq K \|x - y\| in a neighborhood of \rho, provided the initial guess x_0 is sufficiently close to \rho and conditions such as K \|(F'(\rho))^{-1}\| < 1/2 hold. Under these conditions, the error satisfies \|x_{n+1} - \rho\| \leq C \|x_n - \rho\|^2 for some constant C > 0 depending on F'(\rho) and K. This semilocal convergence analysis, adapted from the scalar case using majorizing functions, ensures that the divided difference A_n approximates F'(\rho) well enough to sustain the quadratic rate; unified treatments appear in the literature on Newton-like methods in Banach spaces. Applications of this generalization arise in solving functional equations and problems involving integral operators, where the infinite-dimensional nature of the space precludes direct finite-dimensional techniques. For instance, boundary value problems for nonlinear differential equations can be reformulated as fixed-point equations for operators on spaces such as C[0,1], and Steffensen's method provides efficient numerical solutions with quadratic convergence when the operator satisfies the required regularity conditions. Modern extensions incorporate these ideas into predictor-corrector schemes to broaden the convergence basin.
