Bisection method
The bisection method is a root-finding algorithm in numerical analysis that approximates solutions to equations of the form f(x) = 0 for a continuous function f by repeatedly dividing an initial interval in half and selecting the subinterval containing a root, based on the intermediate value theorem.[1] It requires an initial bracketing interval [a, b] where f(a) and f(b) have opposite signs, ensuring at least one root exists within the interval.[2]
One of the earliest developed numerical methods for solving nonlinear equations, the bisection method operates iteratively: it computes the midpoint m = (a + b)/2, evaluates f(m), and if f(m) = 0, terminates with m as the root; otherwise, it replaces either a or b with m based on the sign of f(m) to form a new interval half the size of the previous one.[1] The process continues until the interval width falls below a specified tolerance \epsilon or a maximum number of iterations is exceeded, providing an approximation to the root with guaranteed error bounds.[3]
The method's primary advantages include its simplicity in implementation, minimal assumptions on the function beyond continuity, and assured convergence to a root within the initial bracket, with the error halving at each step such that |p_n - p| \leq (b - a)/2^n, where p_n is the approximation after n iterations and p is the true root.[4] However, it exhibits linear convergence with an asymptotic rate of 1/2, making it slower than alternatives like Newton's method for well-behaved functions, and it requires prior knowledge of a bracketing interval, which may not always be readily available.[3] Despite these limitations, its reliability has made it a foundational tool in computational mathematics, often used for verification or in hybrid algorithms.[1]
Foundations
Mathematical Prerequisites
The bisection method relies on fundamental concepts from real analysis, particularly the behavior of continuous functions defined on closed and bounded intervals. A function f is continuous on a closed interval [a, b] if it is continuous at every point in the open interval (a, b), continuous from the right at a, and continuous from the left at b. This means that for every point c in [a, b], the limit of f(x) as x approaches c from within [a, b] exists and equals f(c). Continuous functions on such intervals exhibit key properties, including the preservation of connectedness (mapping connected sets to connected sets) and the attainment of maximum and minimum values, as guaranteed by the Extreme Value Theorem.[5] Polynomials and rational functions are examples of continuous functions where defined, making them suitable for analysis on closed intervals.[5]
A root, or zero, of a function f is a value x = r such that f(r) = 0. In the context of root-finding methods like bisection, identifying such roots is essential for solving equations of the form f(x) = 0.[6]
Central to the bisection method is the Intermediate Value Theorem (IVT), which provides a guarantee for the existence of roots under certain conditions. The IVT states: If f is continuous on the closed interval [a, b] with f(a) < 0 and f(b) > 0, then there exists at least one c \in (a, b) such that f(c) = 0. Intuitively, this theorem captures the idea that a continuous function cannot "jump" over values between f(a) and f(b) without attaining them; it must pass through every intermediate value, reflecting the dense and complete nature of the real numbers. For instance, a function defined only on the rationals can step across \sqrt{2} and thereby skip the value zero entirely; over the reals, continuity and completeness rule out such gaps.[7]
A proof sketch of the IVT relies on the completeness axiom of the real numbers, which asserts that every nonempty subset of \mathbb{R} bounded above has a least upper bound. Consider the set A = \{ t \in [a, b] : f(t) \leq 0 \}, which is nonempty (as a \in A) and bounded above by b. By completeness, A has a supremum x = \sup A. To show f(x) = 0, assume otherwise: if f(x) < 0, continuity implies a neighborhood around x where f remains negative, contradicting x as the supremum; if f(x) > 0, a neighborhood where f is positive implies points left of x in A, again a contradiction. Thus, f(x) = 0.[7]
For the IVT to apply in root-finding, the function must change signs at the endpoints of the initial interval, meaning f(a) \cdot f(b) < 0. This condition ensures one endpoint is negative and the other positive (or vice versa), bracketing a root whose existence is guaranteed by the IVT.[7]
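For a concrete illustration of this bracketing condition, consider f(x) = x^2 - 2 on the interval [1, 2]:
f(1) = -1 < 0, \qquad f(2) = 2 > 0, \qquad \text{so } f(1) \cdot f(2) < 0,
and the IVT guarantees a point c \in (1, 2) with f(c) = 0, namely c = \sqrt{2} \approx 1.4142.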
Initial Setup and Assumptions
To apply the bisection method, the function f(x) must be continuous on the closed interval [a_0, b_0], ensuring the existence of at least one root within the interval by the Intermediate Value Theorem if f(a_0) and f(b_0) have opposite signs.[8][3][9]
The initial bracketing interval [a_0, b_0] is selected such that f(a_0) \cdot f(b_0) < 0, confirming a sign change that brackets a root; this requires evaluating f at the endpoints to verify the condition before proceeding.[8][3] If f(a_0) = 0, then a_0 is the root and the process terminates immediately; similarly, if f(b_0) = 0, b_0 is the root.[8][3][9]
A tolerance parameter \epsilon > 0 is specified as a stopping criterion, typically to control the precision of the root approximation, such as ensuring the interval width falls below \epsilon.[3][9] The initial interval length b_0 - a_0 serves as the starting error estimate for the root's location, providing a baseline for assessing convergence progress.[8][3]
Function evaluation cost is also a consideration: the setup and each subsequent iteration require computing f(x) at specific points, with the initial evaluations at a_0 and b_0 establishing the feasibility of the bracket.[8][9]
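A minimal Python sketch of these setup checks (the helper name and return convention are illustrative, not part of any standard library) might look as follows:

```python
def check_bracket(f, a0, b0):
    """Validate the starting conditions for bisection on [a0, b0]."""
    fa, fb = f(a0), f(b0)
    if fa == 0:
        return a0            # a0 is already a root; terminate immediately
    if fb == 0:
        return b0            # b0 is already a root; terminate immediately
    if fa * fb > 0:
        raise ValueError("f(a0) and f(b0) must have opposite signs")
    return None              # valid bracket: a root lies in (a0, b0) by the IVT
```

For instance, check_bracket(lambda x: x**3 - x - 2, 1.0, 2.0) returns None, confirming a valid bracket for that function on [1, 2].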
Algorithm
Step-by-Step Procedure
The bisection method proceeds iteratively by repeatedly bisecting an initial interval [a_0, b_0] that contains a root of the continuous function f, where f(a_0) and f(b_0) have opposite signs, guaranteeing the existence of at least one root by the intermediate value theorem.[10][4]
At each iteration n, compute the midpoint c_n = \frac{a_n + b_n}{2}.[11][4] Then, evaluate f(c_n). If f(a_n) \cdot f(c_n) < 0, the sign change indicates the root lies in the left subinterval, so update the bounds to a_{n+1} = a_n and b_{n+1} = c_n. Otherwise, if f(a_n) \cdot f(c_n) > 0, the root must be in the right subinterval, so set a_{n+1} = c_n and b_{n+1} = b_n. If f(c_n) = 0, then c_n is exactly the root, and the algorithm terminates.[10][11][4]
This process halves the length of the interval at every step, progressively reducing the uncertainty in the root's location from the initial width |b_0 - a_0| to \frac{|b_0 - a_0|}{2^n} after n iterations.[10][4] The iteration continues until a stopping criterion is met, typically when the interval width satisfies |b_n - a_n| < \epsilon for a prescribed tolerance \epsilon > 0, or when the function value at the midpoint is sufficiently small, such as |f(c_n)| < \delta for some small \delta > 0. At termination, the midpoint c_n serves as the approximation to the root.[11][4]
Pseudocode and Implementation
The bisection method translates the step-by-step procedure into a straightforward iterative algorithm that repeatedly halves the search interval until the desired precision is achieved.[12]
The following pseudocode outlines the full algorithm. It takes as inputs the initial interval endpoints a and b (with f(a) \cdot f(b) < 0), a tolerance \epsilon > 0, and a maximum number of iterations N to prevent infinite loops, and it returns an approximate root.
    function bisection(f, a, b, ε, N)
        if f(a) * f(b) >= 0:
            return error: "No sign change in interval"
        i ← 1
        fa ← f(a)
        while i <= N:
            c ← a + (b - a) / 2       // Midpoint, avoiding potential overflow
            fc ← f(c)
            if |b - a| < ε or |fc| < ε:
                return c              // Approximate root
            if fa * fc < 0:
                b ← c
            else:
                a ← c
                fa ← fc
            i ← i + 1
        return c                      // Approximate root after max iterations
In programming languages such as Python, C++, or MATLAB, this algorithm is typically implemented using double-precision floating-point arithmetic to minimize rounding errors, with the midpoint computed as c = a + (b - a)/2 rather than (a + b)/2 to reduce the risk of overflow when |a| and |b| are large.[12] A maximum iteration limit N, often set to around \lceil \log_2((b - a)/\epsilon) \rceil + 1, ensures termination even if floating-point precision causes the loop to stall due to underflow or exact zero evaluations.
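A minimal Python sketch along these lines, mirroring the pseudocode above (the function name, defaults, and error handling are illustrative rather than a fixed API), is:

```python
import math

def bisection(f, a, b, eps=1e-12, max_iter=None):
    """Approximate a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa = f(a)
    if fa * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    if max_iter is None:
        # Iteration cap suggested in the text: ceil(log2((b - a)/eps)) + 1
        max_iter = math.ceil(math.log2((b - a) / eps)) + 1
    c = a
    for _ in range(max_iter):
        c = a + (b - a) / 2        # overflow-safe midpoint form
        fc = f(c)
        if abs(b - a) < eps or abs(fc) < eps:
            return c               # interval or residual is small enough
        if fa * fc < 0:            # sign change on [a, c]: root lies there
            b = c
        else:                      # otherwise the root lies in [c, b]
            a, fa = c, fc
    return c                       # best estimate after the iteration cap
```

For example, bisection(lambda x: x**3 - x - 2, 1.0, 2.0) returns approximately 1.5214, the root treated in the polynomial example later in this article.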
The primary computational cost of the bisection method arises from evaluating the function f at the midpoint in each iteration, requiring one such evaluation per step, while arithmetic operations like midpoint calculation and sign checks are negligible in comparison.[12] After n iterations, the length of the enclosing interval satisfies
|b_n - a_n| = \frac{b_0 - a_0}{2^n},
providing a predictable reduction in uncertainty regardless of the function's behavior within the interval.
Applications and Examples
Root-Finding for Continuous Functions
The bisection method applies effectively to finding roots of arbitrary continuous functions, including non-polynomial or transcendental equations where analytical solutions are unavailable. These functions often arise in scientific and engineering contexts, such as modeling physical systems with trigonometric or exponential terms. For example, the equation \cos x = x defines a transcendental equation f(x) = \cos x - x = 0, which has a unique real root known as the Dottie number, approximately 0.739085. This root cannot be expressed in closed form using elementary functions, highlighting the method's utility for such cases.[13][12]
To demonstrate, consider applying the bisection method to f(x) = \cos x - x = 0 with an initial interval [a_0, b_0] = [0, 1], where f(0) = 1 > 0 and f(1) \approx -0.4597 < 0, ensuring a sign change. The procedure repeatedly computes the midpoint c_n = (a_n + b_n)/2 and updates the interval based on the sign of f(c_n): if f(c_n) > 0, set a_{n+1} = c_n; otherwise, set b_{n+1} = c_n. The following table shows the first three iterations, illustrating the sign changes and interval halvings:
| Iteration | a_n | b_n | c_n | f(c_n) | Update |
|---|---|---|---|---|---|
| 0 | 0 | 1 | - | - | Initial |
| 1 | 0 | 1 | 0.5 | ≈0.3776 (>0) | a_1 = 0.5 |
| 2 | 0.5 | 1 | 0.75 | ≈-0.0183 (<0) | b_2 = 0.75 |
| 3 | 0.5 | 0.75 | 0.625 | ≈0.1860 (>0) | a_3 = 0.625 |
After these steps, the interval [0.625, 0.75] contains the root, with width reduced to 0.125. Continuing iterations further refines the approximation to any desired precision.[13]
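These iterations can be reproduced with a few lines of Python (a minimal sketch with illustrative variable names):

```python
import math

f = lambda x: math.cos(x) - x     # f(0) = 1 > 0, f(1) ≈ -0.4597 < 0
a, b = 0.0, 1.0
for n in range(1, 4):
    c = a + (b - a) / 2
    fc = f(c)
    print(f"n={n}: c={c:.4f}, f(c)={fc:+.4f}")
    if fc > 0:                    # root lies to the right of c
        a = c
    else:                         # root lies to the left of c
        b = c
# Bracket after three steps: [0.625, 0.75]; further iterations approach ~0.739085.
```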
A key advantage of the bisection method for continuous functions is its guaranteed convergence under the stated assumptions, relying solely on the Intermediate Value Theorem to ensure a root exists in the initial interval. Unlike derivative-based methods, it requires only function evaluations, making it reliable for functions where derivatives are computationally expensive or undefined. However, its linear convergence rate, which halves the error interval at each step, results in slower progress than quadratically convergent methods such as Newton's method for well-behaved functions.[12][14]
Polynomial Root Example
The bisection method is well-suited for finding roots of polynomials, as these functions are continuous on the real line and satisfy the intermediate value theorem, allowing the identification of intervals containing roots where the function changes sign. Polynomials may have multiple roots, but the method reliably locates one root per bracketing interval by repeatedly halving it.[12]
Consider the cubic polynomial f(x) = x^3 - x - 2. Descartes' rule of signs provides a hint for selecting an initial interval by indicating that the number of positive real roots equals the number of sign changes in the coefficients of f(x) or is less by an even integer; here, there is one sign change, suggesting one positive real root.[15] Evaluation confirms f(1) = -2 < 0 and f(2) = 4 > 0, so a root lies in [1, 2]. The method proceeds by computing the midpoint c_n = (a_n + b_n)/2 at each step and updating the interval based on the sign of f(c_n).
The following table illustrates the first several iterations, showing the left endpoint a_n, right endpoint b_n, midpoint c_n, function value f(c_n), and the narrowing interval length b_n - a_n:
| Iteration n | a_n | b_n | c_n | f(c_n) | Interval length |
|---|---|---|---|---|---|
| 0 | 1.0000 | 2.0000 | 1.5000 | -0.1250 | 1.0000 |
| 1 | 1.5000 | 2.0000 | 1.7500 | 1.6094 | 0.5000 |
| 2 | 1.5000 | 1.7500 | 1.6250 | 0.6660 | 0.2500 |
| 3 | 1.5000 | 1.6250 | 1.5625 | 0.2522 | 0.1250 |
| 4 | 1.5000 | 1.5625 | 1.5313 | 0.0591 | 0.0625 |
| 5 | 1.5000 | 1.5313 | 1.5156 | -0.0341 | 0.0313 |
| 6 | 1.5156 | 1.5313 | 1.5234 | 0.0123 | 0.0156 |
| 7 | 1.5156 | 1.5234 | 1.5195 | -0.0110 | 0.0078 |
| 8 | 1.5195 | 1.5234 | 1.5215 | 0.0006 | 0.0039 |
| 9 | 1.5195 | 1.5215 | 1.5205 | -0.0052 | 0.0020 |
After 14 iterations, the interval narrows to length less than 10^{-4}, yielding an approximate root of 1.5214 to four decimal places. Verification shows f(1.5214) \approx 0.0001, confirming the approximation is close to the true root.[12]
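A short Python sketch (illustrative rather than drawn from a particular library) reproduces this computation:

```python
f = lambda x: x**3 - x - 2
a, b = 1.0, 2.0                   # f(1) = -2 < 0, f(2) = 4 > 0
for n in range(14):
    c = a + (b - a) / 2
    if f(a) * f(c) < 0:           # sign change on [a, c]: move the right endpoint
        b = c
    else:                         # sign change on [c, b]: move the left endpoint
        a = c
print(f"{c:.4f}  {f(c):+.4f}")    # ≈ 1.5214, with f(c) close to zero
```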
Analysis
Convergence Proof
The bisection method is well-defined under the assumptions that the function f is continuous on the closed interval [a_0, b_0] and satisfies f(a_0) f(b_0) < 0, ensuring the existence of at least one root r \in (a_0, b_0) by the intermediate value theorem.[10] At each iteration n, the algorithm selects the subinterval [a_n, b_n] where f(a_n) f(b_n) < 0, preserving the sign-change condition and thus containing the root r by the same theorem.[16]
The length of the interval halves with each step, so b_n - a_n = (b_0 - a_0)/2^n, which approaches zero as n \to \infty.[10] The midpoint approximation c_n = (a_n + b_n)/2 therefore satisfies |r - c_n| \leq (b_n - a_n)/2 \leq (b_0 - a_0)/2^{n+1}, implying c_n \to r as n \to \infty.[16]
The sequences \{a_n\} and \{b_n\} converge monotonically to r, with a_n non-decreasing and bounded above by r, and b_n non-increasing and bounded below by r.[16] This monotonicity, combined with the shrinking interval length, guarantees that the method reliably brackets and approaches the root from both sides.[10]
Error Bounds and Precision
The bisection method provides a reliable bound on the error in the approximation of the root after a finite number of iterations. Suppose the initial interval is [a_0, b_0] with length b_0 - a_0, and c_n denotes the midpoint approximation after n iterations. At each step, the interval containing the root is halved, so the length of the interval [a_n, b_n] after n iterations is (b_0 - a_0)/2^n. Since the true root x lies within this interval, the maximum possible distance from c_n to x is half the interval length, yielding the error bound |x - c_n| \leq (b_0 - a_0)/2^{n+1}. A looser but commonly used bound is |x - c_n| \leq (b_0 - a_0)/2^n, which follows directly from the full interval length serving as an upper estimate for the error.[17]
To achieve a desired absolute precision \epsilon > 0, where the error should satisfy |x - c_n| < \epsilon, the minimum number of iterations required is determined by solving the inequality (b_0 - a_0)/2^n < \epsilon, which rearranges to n > \log_2((b_0 - a_0)/\epsilon). Thus, the number of iterations is n \geq \lceil \log_2((b_0 - a_0)/\epsilon) \rceil. This bound ensures the algorithm terminates with the specified accuracy, independent of the function's behavior beyond continuity and the sign change.[17][18]
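This iteration count is easy to evaluate directly; the following Python sketch (helper name illustrative) computes the bound and reproduces, for example, the roughly 50 iterations needed for \epsilon = 10^{-15} on an interval of length 1, as noted below:

```python
import math

def iterations_needed(a0, b0, eps):
    """Smallest n with (b0 - a0) / 2**n below eps, per the bound above."""
    return math.ceil(math.log2((b0 - a0) / eps))

print(iterations_needed(0.0, 1.0, 1e-15))   # 50
print(iterations_needed(1.0, 2.0, 1e-4))    # 14, matching the polynomial example
```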
In practice, stopping criteria often rely on either absolute or relative error estimates. Absolute error can be controlled by monitoring the interval width b_n - a_n < \epsilon, which directly bounds |x - c_n| as above. Alternatively, checking |f(c_n)| < \epsilon approximates near-root behavior, though it may fail if the function is flat near the root. Relative error, defined as |x - c_n|/|x| < \epsilon or approximated via |c_n - c_{n-1}|/|c_n| < \epsilon, provides scale-invariant precision but requires care when the root is near zero. These criteria balance computational efficiency with reliability.[17]
The bisection method exhibits linear convergence with order 1 and asymptotic error constant 1/2, meaning the error satisfies |x - c_{n+1}| \leq (1/2) |x - c_n| in the worst case. This halving of the error per iteration guarantees steady progress but renders the method slow for high-precision requirements, as the number of iterations grows logarithmically with the inverse of the desired accuracy—for instance, achieving \epsilon = 10^{-15} on an initial interval of length 1 demands roughly 50 iterations.[18]
Generalizations
Higher-Dimensional Extensions
The bisection method extends to higher dimensions for solving systems of nonlinear equations F: \mathbb{R}^n \to \mathbb{R}^n, where zeros are sought within hyper-rectangular domains, analogous to intervals in one dimension. These extensions rely on the Poincaré-Miranda theorem, a vector-valued generalization of the intermediate value theorem, which guarantees the existence of a zero if the function takes values with opposite signs (or zeros) on opposite faces of the hyper-rectangle.[19] The process involves subdividing the initial hyper-rectangle and selecting subdomains where sign conditions are satisfied, ensuring guaranteed convergence to a root.
In two dimensions, the method can halve rectangles by bisecting along coordinate axes, focusing on regions where F exhibits a sign change across boundaries, such as through component-wise evaluations on faces. Each successful bisection reduces the area of the search rectangle by half, progressively contracting the domain around a zero. This approach maintains the robustness of the one-dimensional bisection while adapting to planar geometry.
Key challenges arise from the curse of dimensionality, where the number of potential subdomains grows exponentially as 2^n per iteration in n dimensions, increasing computational demands significantly beyond low dimensions. Additionally, the method requires a carefully chosen initial bounding hyper-rectangle that encloses at least one zero, often determined through prior analysis or grid searches.
For illustration, consider a simple linear system in two dimensions, F(x, y) = \begin{pmatrix} x + y - 1 \\ 2x - y \end{pmatrix} = \mathbf{0}, with solution at (1/3, 2/3). Starting from an initial rectangle [0, 1] \times [0, 1], where sign changes occur on the boundaries (e.g., F(0,0) = (-1, 0)^T, F(1,0) = (0, 2)^T), bisection along the x-direction retains the sub-rectangle [0, 0.5] \times [0, 1] containing the solution, and a further halving along the y-direction contracts the domain to [0, 0.5] \times [0.5, 1], isolating the root within a smaller area.
Variant Methods
The Regula Falsi method, also known as the false position method, modifies the standard bisection approach by replacing the midpoint selection with a linear interpolation between the bracketing endpoints to estimate the next approximation. This interpolation point is calculated as the x-intercept of the secant line connecting the two points where the function changes sign, potentially accelerating convergence for functions that are approximately linear over the interval. Unlike pure bisection, which guarantees linear convergence regardless of function shape, Regula Falsi can exhibit superlinear convergence when the root is closer to one endpoint but may slow down if the function is concave toward that side, retaining the bracketing property throughout.
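Concretely, for bracket endpoints a and b the false-position point is the x-intercept of the secant line through (a, f(a)) and (b, f(b)):
c = \frac{a\,f(b) - b\,f(a)}{f(b) - f(a)} = b - f(b)\,\frac{b - a}{f(b) - f(a)},
which replaces the midpoint (a + b)/2 used by bisection while keeping the sign-based interval update.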
The Illinois algorithm addresses a key limitation of Regula Falsi by modifying the function value at the non-updating endpoint after each iteration, scaling it by the ratio of the current interval length to the previous one to better balance the approximation. This adjustment enhances bracketing reliability and restores faster convergence rates, achieving an order of approximately 1.442 in many cases, while maintaining the simplicity and guaranteed convergence of bisection-like methods. Introduced as a practical improvement for numerical root-finding, it is particularly effective for unimodal functions where standard Regula Falsi stagnates.[20]
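A minimal Python sketch of the Illinois variant (the function name and tolerances are illustrative, not a standard library interface) follows the common rule of halving the stored function value at an endpoint that has been retained for two successive iterations:

```python
def illinois(f, a, b, tol=1e-12, max_iter=100):
    """Illinois-modified false position on a bracket [a, b] with f(a)*f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    side = 0                               # which endpoint was retained last time
    c = a
    for _ in range(max_iter):
        c = (a * fb - b * fa) / (fb - fa)  # secant (false-position) point
        fc = f(c)
        if abs(fc) < tol or abs(b - a) < tol:
            return c
        if fa * fc < 0:                    # root in [a, c]: replace b, retain a
            b, fb = c, fc
            if side == -1:                 # a retained twice in a row: halve fa
                fa *= 0.5
            side = -1
        else:                              # root in [c, b]: replace a, retain b
            a, fa = c, fc
            if side == 1:                  # b retained twice in a row: halve fb
                fb *= 0.5
            side = 1
    return c
```

Compared with plain false position, this adjustment prevents one endpoint from stagnating, so a given tolerance is typically reached in far fewer iterations than plain bisection requires.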
The characteristic bisection method extends the bisection framework to higher dimensions by using only the signs of the function values at selected points to guide subdivision, applicable to systems f: \mathbb{R}^d \to \mathbb{R}^d for d \geq 2, without requiring derivative information or specific polynomial structure. For polynomials, related techniques incorporate Sturm sequences to count the number of real roots within an interval via sign variations in the sequence, enabling efficient isolation by bisecting only intervals containing roots, as confirmed by the variation in sign counts at the endpoints.
Methods based on degree computation extend bisection principles to algebraic functions by refining intervals through the degree of the greatest common divisor (GCD) between the defining polynomial and a representation of the interval, such as a quadratic bounding the disk or line segment. For a polynomial p(x), the GCD with (x - a)(x - b) or a resultant-based test detects root presence by degree drop, enabling targeted bisection without sign evaluations. These techniques, rooted in subresultant computations, provide robust isolation for real algebraic numbers, with complexity scaling favorably for sparse or structured polynomials.[21]