Newton polynomial
In numerical analysis, a Newton polynomial, also known as a Newton interpolation polynomial, is a polynomial of degree at most n-1 that exactly interpolates a function at n distinct points (x_0, y_0), (x_1, y_1), \dots, (x_{n-1}, y_{n-1}), where y_i = f(x_i) for some function f.[1] It is constructed using divided differences, which serve as the coefficients in its nested form: P_n(x) = a_0 + a_1 (x - x_0) + a_2 (x - x_0)(x - x_1) + \cdots + a_{n-1} (x - x_0)(x - x_1) \cdots (x - x_{n-2}),
where a_k = f[x_0, x_1, \dots, x_k] is the k-th divided difference, computed recursively from the data points.[1][2] The method originated with Isaac Newton in 1675, who developed it as part of his foundational work on finite differences for approximating functions from tabular data, particularly in astronomy and physics.[3] This approach built on earlier interpolation ideas, such as those by Brahmagupta in the 7th century for unequal intervals, but Newton's formulation established the classical theory of polynomial interpolation using forward differences for equally spaced points and divided differences for arbitrary spacing.[3][4] Newton polynomials offer computational advantages over alternatives like Lagrange interpolation, as adding a new data point requires only updating the higher-order divided differences without altering prior coefficients, enhancing efficiency and numerical stability in iterative computations.[1] They are widely applied in scientific computing, data fitting, and approximation theory, where exact interpolation through discrete samples is essential for modeling continuous functions.[2]
Introduction
Definition
The Newton polynomial, also known as the Newton interpolation polynomial, is a unique polynomial of degree at most n-1 that interpolates a given set of n distinct points (x_i, f(x_i)) for i = 0, 1, \dots, n-1, meaning it exactly passes through each of these points.[5][6] This uniqueness holds because the difference of any two such interpolants would be a polynomial of degree at most n-1 with n distinct roots, hence identically zero, so no other polynomial of the same degree or lower can satisfy the interpolation conditions simultaneously.[5] Unlike the standard monomial basis representation (e.g., P(x) = c_0 + c_1 x + c_2 x^2 + \dots + c_{n-1} x^{n-1}), which requires solving a system of linear equations for the coefficients, the Newton form expresses the polynomial in a nested or "divided difference" basis. This basis leverages finite divided differences, which generalize finite differences to unequally spaced points, providing a more computationally efficient and stable way to construct the interpolant, especially for incremental additions of data points.[6][5] The basic form of the Newton polynomial is P_n(x) = a_0 + a_1 (x - x_0) + a_2 (x - x_0)(x - x_1) + \dots + a_{n-1} \prod_{i=0}^{n-2} (x - x_i), where the coefficients a_k = f[x_0, x_1, \dots, x_k] are the divided differences of order k, defined recursively from the function values at the interpolation points.[6][5] This formulation assumes the interpolation points x_i are distinct, as coincident points would require modifications such as higher-order divided differences or alternative approaches to handle multiplicity.[5]
Historical Development
Building on earlier interpolation ideas such as those by Brahmagupta in the 7th century for unequal intervals, the Newton polynomial originated in the late 17th century as part of early efforts in numerical analysis to approximate functions from discrete data points using finite differences. Isaac Newton developed the foundational ideas for interpolation using finite differences around 1675–1676, presenting a general formula for equally spaced points in a letter to Henry Oldenburg, the secretary of the Royal Society, and later incorporating related concepts in his Philosophiæ Naturalis Principia Mathematica (1687).[7] These methods allowed for the construction of interpolating polynomials by iteratively applying forward differences, laying the groundwork for systematic table interpolation in astronomy and computation. Newton's general formulation also introduced divided differences for unequally spaced points.[8] Prior to Newton's explicit formulation, James Gregory had independently derived a similar interpolation series for equidistant data points in 1670, as detailed in a letter to John Collins, which anticipated the Newton-Gauss forward difference formula.[9] In the 18th century, Leonhard Euler advanced these ideas by applying Newton's finite difference approach to logarithmic tables in a 1734 letter to Daniel Bernoulli and extending it through generalizations involving the q-binomial theorem in the 1750s, which bridged finite differences toward more abstract series expansions.[8] Euler's contributions, including a variant of the divided-differences formula in 1783, helped popularize the method among continental mathematicians.[10] By the 19th century, further refinements to Newton's framework appeared, such as Carl Friedrich Gauss's development of the Newton-Gauss interpolation formula in lectures from 1812 (noted by Encke and published in 1830), which adapted the method for practical astronomical computations.[7] In the early 20th century, numerical methods texts further solidified and disseminated these developments; for instance, Edmund Taylor Whittaker and George Robinson's The Calculus of Observations: A Treatise on Numerical Mathematics (1924) provided comprehensive treatments of Newton-Gauss formulae and central differences, emphasizing practical computation for observational data.[11] This evolution marked the transition from Newton's original finite difference schemes for uniform grids to the more versatile divided differences suited for arbitrary data distributions, an approach that remains standard in modern numerical analysis and scientific computing.[8]
Mathematical Foundations
Divided Differences
Divided differences provide a systematic way to compute the coefficients required for the Newton form of interpolating polynomials, applicable to arbitrarily spaced data points. The zeroth-order divided difference at a point x_i is defined as the function value itself:
f[x_i] = f(x_i).
This serves as the starting point for higher-order computations.[12] Higher-order divided differences are defined recursively. For the first order,
f[x_i, x_{i+1}] = \frac{f[x_{i+1}] - f[x_i]}{x_{i+1} - x_i}.
In general, the k-th order divided difference is given by
f[x_i, x_{i+1}, \dots, x_{i+k}] = \frac{f[x_{i+1}, \dots, x_{i+k}] - f[x_i, \dots, x_{i+k-1}]}{x_{i+k} - x_i}.
This recursion allows computation of divided differences of any order from lower-order ones, with the denominator reflecting the span of the points involved. The process is symmetric, meaning the value of a divided difference depends only on the set of points, not their order.[12] For a set of n+1 distinct points \{x_0, x_1, \dots, x_n\}, the divided differences are organized into a table to facilitate recursive calculation and avoid redundant computations. The table begins with the points and their function values in the first two columns. Subsequent columns fill in the higher-order differences using the recursive formula, typically computed diagonally from top-left to bottom-right. For four points (n=3), the table layout is as follows:
| x | f | f[\cdot,\cdot] | f[\cdot,\cdot,\cdot] | f[\cdot,\cdot,\cdot,\cdot] |
|---|---|---|---|---|
| x_0 | f[x_0] | f[x_0, x_1] | f[x_0, x_1, x_2] | f[x_0, x_1, x_2, x_3] |
| x_1 | f[x_1] | f[x_1, x_2] | f[x_1, x_2, x_3] | |
| x_2 | f[x_2] | f[x_2, x_3] | | |
| x_3 | f[x_3] | | | |
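The recursion can be implemented in a few lines. The following is a minimal Python sketch, assuming distinct nodes; the function names divided_differences and newton_coefficients are illustrative rather than standard library routines.

```python
# Minimal sketch of the divided-difference recursion; names are illustrative.

def divided_differences(xs, ys):
    """Return the triangular table: table[k][i] = f[x_i, ..., x_{i+k}]."""
    n = len(xs)
    table = [list(ys)]                      # order 0: f[x_i] = f(x_i)
    for k in range(1, n):
        prev = table[-1]
        table.append([(prev[i + 1] - prev[i]) / (xs[i + k] - xs[i])
                      for i in range(n - k)])
    return table

def newton_coefficients(xs, ys):
    """Coefficients a_k = f[x_0, ..., x_k] of the Newton form."""
    return [row[0] for row in divided_differences(xs, ys)]

xs = [1.0, 2.0, 4.0, 5.0]                   # arbitrarily spaced points
ys = [x**3 for x in xs]                     # samples of f(x) = x^3
print(newton_coefficients(xs, ys))          # highest coefficient is f[x_0,...,x_3] = 1.0
```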
For equally spaced points x_i = x_0 + i h with uniform spacing h, the divided differences reduce to scaled forward differences:
f[x_0, x_1, \dots, x_k] = \frac{\Delta^k f(x_0)}{k! \, h^k},
where \Delta^k f(x_0) is the k-th forward difference at x_0. This reduction connects the general divided difference framework to finite difference methods, which are particularly useful for tabular data with uniform intervals.[13]
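A quick numerical check of this identity, in the same sketch style (helper names are again illustrative): for equally spaced samples of \sin, the divided differences f[x_0,\dots,x_k] and the scaled forward differences \Delta^k f(x_0)/(k!\,h^k) agree to rounding error.

```python
# Check f[x_0,...,x_k] == Δ^k f(x_0) / (k! h^k) for equally spaced samples of sin.
from math import factorial, sin

h = 0.1
xs = [0.2 + i * h for i in range(5)]
ys = [sin(x) for x in xs]

# leading forward differences Δ^k f(x_0)
fwd, row = [ys[0]], list(ys)
while len(row) > 1:
    row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    fwd.append(row[0])

# leading divided differences f[x_0,...,x_k]
table, row = [list(ys)], list(ys)
for k in range(1, len(xs)):
    row = [(row[i + 1] - row[i]) / (xs[i + k] - xs[i]) for i in range(len(row) - 1)]
    table.append(row)

for k in range(len(xs)):
    print(k, table[k][0], fwd[k] / (factorial(k) * h**k))   # the two columns agree
```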
Forward and Backward Difference Formulas
The forward difference operator, denoted by \Delta, is defined for a function f at equally spaced points x_i = x_0 + i h (where h is the constant spacing) as \Delta f(x_i) = f(x_{i+1}) - f(x_i), with higher-order differences given recursively by \Delta^k f(x_i) = \Delta^{k-1} f(x_{i+1}) - \Delta^{k-1} f(x_i).[14] Similarly, the backward difference operator, denoted by \nabla, is \nabla f(x_i) = f(x_i) - f(x_{i-1}), with \nabla^k f(x_i) = \nabla^{k-1} f(x_i) - \nabla^{k-1} f(x_{i-1}).[15] For interpolation using n+1 points starting from x_0, the Newton forward difference formula expresses the interpolating polynomial P_n(x) as P_n(x) = \sum_{k=0}^n \binom{u}{k} \Delta^k f(x_0), where u = (x - x_0)/h and \binom{u}{k} = u(u-1)\cdots(u-k+1)/k! is the binomial coefficient, or equivalently, P_n(x) = \sum_{k=0}^n \frac{\Delta^k f(x_0)}{k!} \prod_{j=0}^{k-1} (u - j).[16] This form is particularly suited for equally spaced data and leverages the forward differences computed at the initial point x_0. The Newton backward difference formula, applied to the same set of points but anchored at the final point x_n, is P_n(x) = \sum_{k=0}^n \binom{u + k - 1}{k} \nabla^k f(x_n), where u = (x - x_n)/h and \binom{u + k - 1}{k} = [u(u+1)\cdots(u+k-1)]/k!, or equivalently, P_n(x) = \sum_{k=0}^n \frac{\nabla^k f(x_n)}{k!} \prod_{j=0}^{k-1} (u + j).[17] This uses backward differences at x_n and the rising factorial in the coefficients. The forward formula is best suited to interpolation near the beginning of the table (x close to x_0), where u is small and the binomial coefficients remain well-behaved, minimizing loss of accuracy.[18] Conversely, the backward formula is preferred near the end of the table (x close to x_n), where anchoring the expansion at x_0 would involve large values of u and less stable coefficients.[18] Choosing the formula whose anchor point lies nearest the evaluation point therefore improves numerical accuracy.
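As a concrete sketch of the forward difference formula (again with illustrative, non-library function names), the coefficient \binom{u}{k} can be updated multiplicatively while summing the series:

```python
# Illustrative implementation of P_n(x) = Σ_k C(u, k) Δ^k f(x_0), u = (x - x_0)/h.
from math import sin

def newton_forward(x, x0, h, ys):
    # leading forward differences Δ^k f(x_0)
    fwd, row = [ys[0]], list(ys)
    while len(row) > 1:
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
        fwd.append(row[0])
    u = (x - x0) / h
    value, binom = 0.0, 1.0            # binom = C(u, k), starting with C(u, 0) = 1
    for k, dk in enumerate(fwd):
        value += binom * dk
        binom *= (u - k) / (k + 1)     # update C(u, k) -> C(u, k+1)
    return value

x0, h = 0.0, 0.25
ys = [sin(x0 + i * h) for i in range(6)]
print(newton_forward(0.3, x0, h, ys), sin(0.3))   # the two values nearly coincide
```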
Formulation and Derivation
Newton Interpolation Formula
The Newton interpolation formula expresses a polynomial P_n(x) of degree at most n that interpolates a given function f at n+1 distinct points x_0, x_1, \dots, x_n, ensuring P_n(x_i) = f(x_i) for i = 0, 1, \dots, n. This form leverages divided differences as coefficients, providing a structured basis for the polynomial. The coefficients, denoted f[x_0, \dots, x_k] for k = 0, 1, \dots, n, are computed recursively from the function values and represent the k-th order divided differences, with f[x_0] = f(x_0) as the zeroth-order case.[5][19] The explicit formula is P_n(x) = f[x_0] + f[x_0, x_1](x - x_0) + f[x_0, x_1, x_2](x - x_0)(x - x_1) + \cdots + f[x_0, \dots, x_n] \prod_{i=0}^{n-1} (x - x_i). This representation employs the Newton basis, where each term builds upon the previous through nested products \pi_k(x) = \prod_{i=0}^{k-1} (x - x_i) for k \geq 1 (with \pi_0(x) = 1), facilitating incremental construction as additional points are incorporated. The nested structure inherently supports efficient evaluation using Horner's method, which rewrites the polynomial as a sequence of multiplications and additions, achieving O(n) complexity per evaluation after the coefficients are determined.[5][20] By the uniqueness theorem of polynomial interpolation, for any set of n+1 distinct points, there exists exactly one polynomial of degree at most n that passes through the corresponding function values, and the Newton form is identical to the Lagrange interpolation polynomial, though expressed in a different basis.[21][22] A primary computational benefit arises from precomputing the divided difference coefficients once, allowing rapid repeated evaluations at arbitrary points without recomputation, which is particularly advantageous in numerical applications requiring multiple assessments.[23]
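The nested evaluation mentioned above can be sketched as follows (a minimal illustration, with ad hoc names; the coefficients are assumed to have been obtained from a divided-difference table):

```python
# Horner-style evaluation of the Newton form in O(n) operations per point.

def newton_eval(x, xs, coeffs):
    """coeffs[k] = f[x_0,...,x_k]; computes a_0 + (x-x_0)(a_1 + (x-x_1)(a_2 + ...))."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

# Coefficients of f(x) = x^2 through the nodes 0, 1, 3: a_0 = 0, a_1 = 1, a_2 = 1.
xs, coeffs = [0.0, 1.0, 3.0], [0.0, 1.0, 1.0]
print([newton_eval(x, xs, coeffs) for x in (0.0, 1.0, 3.0, 2.0)])   # [0.0, 1.0, 9.0, 4.0]
```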
Derivation from Divided Differences
The Newton interpolation polynomial P_n(x) of degree at most n can be derived inductively from the divided difference table, ensuring it interpolates the function f at the distinct points x_0, x_1, \dots, x_n. The form is given by P_n(x) = \sum_{k=0}^n f[x_0, \dots, x_k] \prod_{j=0}^{k-1} (x - x_j), where the empty product for k=0 is 1, and f[x_0, \dots, x_k] denotes the k-th divided difference.[12] For the base case n=0, P_0(x) = f[x_0] = f(x_0), which is the constant polynomial interpolating f at x_0. For n=1, the linear polynomial is P_1(x) = f[x_0] + f[x_0, x_1](x - x_0), where f[x_0, x_1] = \frac{f(x_1) - f(x_0)}{x_1 - x_0}, and it satisfies P_1(x_0) = f(x_0) and P_1(x_1) = f(x_1).[24] Assume the statement holds for n-1, so P_{n-1}(x) interpolates f at x_0, \dots, x_{n-1}. Define P_n(x) = P_{n-1}(x) + f[x_0, \dots, x_n] \prod_{j=0}^{n-1} (x - x_j). At points x_i for i = 0 to n-1, the product term vanishes, yielding P_n(x_i) = P_{n-1}(x_i) = f(x_i). At x_n, the coefficient f[x_0, \dots, x_n] is chosen via the divided difference recursion f[x_0, \dots, x_n] = \frac{f[x_1, \dots, x_n] - f[x_0, \dots, x_{n-1}]}{x_n - x_0} to ensure P_n(x_n) = f(x_n), completing the induction.[25][12] To verify the construction rigorously, note that P_n(x) is a polynomial of degree at most n whose error e(x) = f(x) - P_n(x) vanishes at every node x_0, \dots, x_n. Any other polynomial of degree at most n satisfying the same interpolation conditions would differ from P_n by a polynomial of degree at most n with n+1 distinct roots, hence identically zero, so P_n is the unique interpolant. Moreover, since divided differences of order n+1 annihilate polynomials of degree at most n, P_n[x_0, \dots, x_n, x_{n+1}] = 0 for any additional point x_{n+1}.[24] The leading coefficient of P_n(x) is the n-th divided difference f[x_0, \dots, x_n], which scales the highest-degree term \prod_{j=0}^{n-1} (x - x_j). For sufficiently smooth f, this generalizes the n-th derivative via the mean value theorem for divided differences: f[x_0, \dots, x_n] = \frac{f^{(n)}(\xi)}{n!} for some \xi in the interval spanning the points, analogous to Taylor expansion coefficients.[26] The interpolation error follows directly from the construction: consider the auxiliary polynomial Q(x) of degree at most n+1 that interpolates f at x_0, \dots, x_n and also at x, so Q(x) = P_n(x) + f[x_0, \dots, x_n, x] \prod_{j=0}^n (x - x_j). Then f(x) - P_n(x) = f[x_0, \dots, x_n, x] \prod_{j=0}^n (x - x_j), since Q(x) = f(x). This holds as the divided difference f[x_0, \dots, x_n, x] captures the discrepancy.[26]
Comparisons and Properties
Relation to Taylor Polynomials
The Newton divided-difference interpolation polynomial serves as a discrete analog to the Taylor polynomial, recovering the latter in the continuous limit. Specifically, when the interpolation points x_0, x_1, \dots, x_n all approach a common value x_0, the divided differences f[x_0, \dots, x_k] converge to \frac{f^{(k)}(x_0)}{k!} for each k = 0, 1, \dots, n, assuming f is sufficiently differentiable near x_0. This limit follows from the mean value theorem for divided differences, which states that f[x_0, \dots, x_n] = \frac{f^{(n)}(\xi)}{n!} for some \xi in the interval spanned by the points; as the points coalesce, \xi \to x_0. Consequently, the Newton form p_n(x) = \sum_{k=0}^n f[x_0, \dots, x_k] \prod_{j=0}^{k-1} (x - x_j) approaches the Taylor polynomial T_n(x) = \sum_{k=0}^n \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k. In the case of equally spaced points x_j = x_0 + j h (forward differences), this convergence is explicit in the basis functions. The product term \prod_{j=0}^{k-1} (x - x_0 - j h) equals h^k \, s(s-1)\cdots(s-k+1) = k! \, h^k \binom{s}{k} with s = (x - x_0)/h, so the k-th term of the forward Newton series is \binom{s}{k} \Delta^k f(x_0); since \Delta^k f(x_0) \approx h^k f^{(k)}(x_0) and \binom{s}{k} \sim \frac{s^k}{k!} as h \to 0 with x fixed, each term approaches \frac{(x - x_0)^k}{k!} f^{(k)}(x_0), transforming the forward Newton series into the Taylor expansion. This equivalence highlights the Newton polynomial's role in bridging discrete data interpolation to continuous approximation. A key advantage of the Newton form lies in numerical differentiation, where divided differences provide derivative estimates directly from tabular data without presupposing the function's smoothness beyond the given points. For instance, the first divided difference f[x_0, x_1] approximates f'(x_0) as the points approach each other, and higher-order differences extend this to higher derivatives, enabling approximations even when f is known only through discrete samples. This utility stems from the polynomial's additive structure, allowing incremental computation of derivatives via the difference table. Historically, Isaac Newton's development of finite difference methods in the 1660s–1670s, as detailed in his unpublished manuscripts on interpolation, explicitly connected discrete differences to infinite series expansions, laying foundational groundwork for viewing the Taylor series as the continuum limit of Newton interpolation.[27]
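The coalescence limit is easy to observe numerically. In the sketch below (illustrative names; f = \exp and x_0 = 0 are chosen so that every Taylor coefficient is 1/k!), the quantity k!\,f[x_0,\dots,x_k] tends to f^{(k)}(0) = 1 as the spacing h shrinks.

```python
# Divided differences over coalescing points approach f^(k)(x_0)/k!; here f = exp, x_0 = 0.
from math import exp, factorial

def leading_divided_difference(xs, ys):
    row = list(ys)
    for k in range(1, len(xs)):
        row = [(row[i + 1] - row[i]) / (xs[i + k] - xs[i]) for i in range(len(row) - 1)]
    return row[0]                       # f[x_0, ..., x_n]

for h in (0.5, 0.1, 0.01):
    for k in (1, 2, 3):
        xs = [j * h for j in range(k + 1)]
        dd = leading_divided_difference(xs, [exp(x) for x in xs])
        print(f"h={h:<5} k={k}: k! * f[x_0..x_k] = {factorial(k) * dd:.6f}")   # tends to 1
```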
Comparison with Lagrange Interpolation
The Newton and Lagrange forms both represent the unique polynomial of degree at most n that interpolates n+1 given points (x_i, f(x_i)) for i = 0, \dots, n, ensuring exact agreement at these points. However, they differ fundamentally in structure: the Lagrange form expresses the polynomial as a linear combination of basis functions centered at each data point, while the Newton form builds it hierarchically using cumulative products based on divided differences.[28] The Lagrange interpolating polynomial is given by P_n(x) = \sum_{i=0}^n f(x_i) \, l_i(x), where each basis polynomial is l_i(x) = \prod_{\substack{j=0 \\ j \neq i}}^n \frac{x - x_j}{x_i - x_j}. This form is explicit and does not require intermediate computations like divided differences, making it straightforward to implement for a fixed set of points. In contrast, the Newton form, P_n(x) = f[x_0] + \sum_{i=1}^n f[x_0, \dots, x_i] \prod_{j=0}^{i-1} (x - x_j), leverages divided differences for coefficients, allowing a nested structure amenable to efficient evaluation via Horner's method.[28] Computationally, both methods require a similar number of arithmetic operations to construct the polynomial coefficients (approximately O(n^2) in total) but differ in evaluation and updates. Evaluating the Lagrange polynomial at a point demands O(n^2) operations due to the products in each l_i(x), whereas the Newton form evaluates in O(n) steps using nested multiplication, akin to Horner's scheme. Moreover, adding a new data point to the Lagrange form necessitates recomputing all basis functions from scratch, at O(n^2) cost, while the Newton form allows incremental updates in O(n) time by appending a single new divided difference and term. This makes Newton preferable in scenarios involving sequential data addition, such as adaptive interpolation.[29][28] Regarding numerical stability, the Newton form is generally less susceptible to rounding errors, particularly when data points are clustered, as the divided differences propagate errors more controllably than the potentially ill-conditioned products in Lagrange basis functions. For equispaced points, higher-order Lagrange interpolation can amplify round-off due to alternating signs in the denominators, whereas Newton's recursive divided differences mitigate this for moderate n. Empirical comparisons confirm Newton's superior accuracy in such cases, with average errors often 17-20% lower than Lagrange's for test functions over varied intervals.[28][30]
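The following sketch (illustrative helper names, not a published benchmark) constructs both forms for the same data and confirms that they produce identical values; the Newton form is the one whose coefficients can be tabulated once and then evaluated by O(n) nested multiplications.

```python
# Newton vs. Lagrange on the same data: identical interpolant, different bookkeeping.

def newton_coefficients(xs, ys):
    row, coeffs = list(ys), [ys[0]]
    for k in range(1, len(xs)):
        row = [(row[i + 1] - row[i]) / (xs[i + k] - xs[i]) for i in range(len(row) - 1)]
        coeffs.append(row[0])
    return coeffs

def newton_eval(x, xs, coeffs):
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

def lagrange_eval(x, xs, ys):
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        li = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                li *= (x - xj) / (xi - xj)
        total += yi * li
    return total

xs, ys = [0.0, 1.0, 2.0, 4.0], [1.0, 2.0, 5.0, 17.0]    # samples of f(x) = x^2 + 1
coeffs = newton_coefficients(xs, ys)
for x in (0.5, 3.0):
    print(newton_eval(x, xs, coeffs), lagrange_eval(x, xs, ys))   # pairs of equal values
```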
Analysis and Extensions
Error Analysis and Accuracy
The error in the Newton polynomial approximation P_n(x) to a function f(x) at n+1 distinct points x_0, \dots, x_n is expressed as f(x) - P_n(x) = f[x_0, \dots, x_n, x] \prod_{i=0}^n (x - x_i), where f[x_0, \dots, x_n, x] denotes the divided difference of order n+1.[31] If f is n+1 times continuously differentiable on an interval containing x_0, \dots, x_n, x, then by the generalized mean value theorem, f[x_0, \dots, x_n, x] = \frac{f^{(n+1)}(\xi)}{(n+1)!} for some \xi in the smallest interval enclosing these points.[4] Substituting yields the standard error term f(x) - P_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^n (x - x_i).[24] For a smooth function f on a closed interval [a, b] containing the interpolation points and evaluation point x, the absolute error is bounded by |f(x) - P_n(x)| \leq \frac{M_{n+1}}{(n+1)!} \prod_{i=0}^n |x - x_i|, where M_{n+1} = \max_{[a,b]} |f^{(n+1)}(t)|.[24] A coarser uniform bound follows as |f(x) - P_n(x)| \leq \frac{M_{n+1} (b - a)^{n+1}}{(n+1)!}, since each |x - x_i| \leq b - a. This bound decreases rapidly with n due to the factorial denominator, provided the higher derivatives remain controlled, enabling convergence to f(x) as n \to \infty under suitable conditions on f.[24] A key limitation arises with equispaced interpolation points, where high-degree Newton polynomials exhibit the Runge phenomenon: large oscillations near the interval endpoints that grow with n, preventing uniform convergence for certain functions like f(x) = 1/(1 + 25x^2) on [-1, 1].[32] This instability stems from the exponential growth of the Lebesgue constant, which measures the worst-case amplification of data errors. To mitigate it, Chebyshev nodes, defined as x_k = \cos\left( \frac{(2k+1)\pi}{2(n+1)} \right) for k = 0, \dots, n, should be used instead, as they cluster toward the endpoints and bound the Lebesgue constant by O(\log n), ensuring stable approximation.[33] Regarding numerical conditioning, the Newton form exhibits favorable sensitivity to perturbations in function values compared to explicit forms like Lagrange, particularly in sequential computations where additional points are incorporated. The divided difference coefficients are computed hierarchically, allowing reuse of prior values without recomputation, which minimizes propagation of rounding errors and data perturbations across iterations.[5] This property enhances overall accuracy in iterative refinement scenarios, though the global condition number still depends on point distribution and can degrade for ill-conditioned sets like equispaced points at high degrees.[34]
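The Runge effect and its Chebyshev remedy can be reproduced with a short experiment (a sketch under the assumptions above; the helpers mirror the earlier ones and the degree n = 14 is an arbitrary illustrative choice):

```python
# Max interpolation error for f(x) = 1/(1 + 25 x^2) on [-1, 1]:
# equispaced nodes vs. Chebyshev nodes, both through the Newton form.
from math import cos, pi

def newton_coefficients(xs, ys):
    row, coeffs = list(ys), [ys[0]]
    for k in range(1, len(xs)):
        row = [(row[i + 1] - row[i]) / (xs[i + k] - xs[i]) for i in range(len(row) - 1)]
        coeffs.append(row[0])
    return coeffs

def newton_eval(x, xs, coeffs):
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

f = lambda x: 1.0 / (1.0 + 25.0 * x * x)
n = 14
equi = [-1.0 + 2.0 * i / n for i in range(n + 1)]
cheb = [cos((2 * k + 1) * pi / (2 * (n + 1))) for k in range(n + 1)]
grid = [-1.0 + 2.0 * i / 1000 for i in range(1001)]

for name, nodes in (("equispaced", equi), ("Chebyshev", cheb)):
    c = newton_coefficients(nodes, [f(x) for x in nodes])
    err = max(abs(f(x) - newton_eval(x, nodes, c)) for x in grid)
    print(f"{name:10s} max error: {err:.3e}")   # equispaced error is orders of magnitude larger
```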
Handling Additional Data Points
One key advantage of the Newton interpolation polynomial is its ability to incorporate additional data points incrementally without recomputing the entire interpolant from scratch. Given an existing polynomial P_{n-1}(x) of degree n-1 based on points (x_0, f(x_0)), \dots, (x_{n-1}, f(x_{n-1})), adding a new point (x_n, f(x_n)) yields the updated polynomial P_n(x) = P_{n-1}(x) + f[x_0, \dots, x_n] \prod_{i=0}^{n-1} (x - x_i), where f[x_0, \dots, x_n] is the divided difference of order n.[24] This structure reuses all prior coefficients, appending only a single new term.[35] The underlying divided difference table can be updated efficiently to compute the new coefficient. Starting from the existing table for the first n points, the new point is added as an additional row (or column, depending on orientation), and the highest-order divided difference is calculated recursively using previously computed values: f[x_{i}, \dots, x_n] = \frac{f[x_{i+1}, \dots, x_n] - f[x_i, \dots, x_{n-1}]}{x_n - x_i} for i = 0 to n-1. This process requires only O(n) arithmetic operations, as each of the n new entries depends on two prior ones.[35] In contrast, updating a Lagrange interpolant typically demands O(n^2) operations to recompute all basis polynomials incorporating the new point.[24] To illustrate, consider beginning with two points, say (x_0, y_0) and (x_1, y_1), yielding the linear interpolant P_1(x) = y_0 + f[x_0, x_1](x - x_0), where f[x_0, x_1] = \frac{y_1 - y_0}{x_1 - x_0}. Adding a third point (x_2, y_2) first computes the intermediate f[x_1, x_2] = \frac{y_2 - y_1}{x_2 - x_1}, then the quadratic coefficient f[x_0, x_1, x_2] = \frac{f[x_1, x_2] - f[x_0, x_1]}{x_2 - x_0}, resulting in P_2(x) = P_1(x) + f[x_0, x_1, x_2](x - x_0)(x - x_1). The original linear coefficients are preserved, demonstrating direct reuse.[36] This incremental capability finds application in adaptive interpolation, where new points are dynamically added in regions of high approximation error to refine the polynomial locally, and in online data processing, such as real-time signal fitting where data arrives sequentially.[35]
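A sketch of the O(n) update (the bookkeeping, which keeps only the most recent diagonal of the table, is one possible arrangement and the names are illustrative):

```python
# Incremental update: append one point, compute one new diagonal of the table,
# and extend the coefficient list by a single new term.

def add_point(xs, diagonal, coeffs, x_new, y_new):
    """diagonal[k] holds f[x_{m-k}, ..., x_m] for the current last index m."""
    xs.append(x_new)
    new_diag = [y_new]                                   # f[x_new]
    for k in range(1, len(xs)):
        i = len(xs) - 1 - k
        new_diag.append((new_diag[k - 1] - diagonal[k - 1]) / (x_new - xs[i]))
    coeffs.append(new_diag[-1])                          # the new f[x_0, ..., x_new]
    return xs, new_diag, coeffs

# Two-point start, as in the text: P_1(x) = y_0 + f[x_0, x_1](x - x_0), data from x^2 + x + 1.
xs, ys = [0.0, 1.0], [1.0, 3.0]
diagonal = [ys[1], (ys[1] - ys[0]) / (xs[1] - xs[0])]    # [f[x_1], f[x_0, x_1]]
coeffs = [ys[0], diagonal[1]]                            # [f[x_0], f[x_0, x_1]]

xs, diagonal, coeffs = add_point(xs, diagonal, coeffs, 2.0, 7.0)
print(coeffs)    # [1.0, 2.0, 1.0]: earlier coefficients untouched, one term appended
```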
Variants and Generalizations
Bessel and Stirling Forms
The Bessel and Stirling forms represent specialized adaptations of the Newton interpolation polynomial for equally spaced data points, leveraging central finite differences to facilitate accurate interpolation near the midpoint of the tabulated interval. These variants center the difference table around a reference point x_0, with the normalized variable u = (x - x_0)/h where h is the uniform spacing, and employ averaged differences to symmetrize the approximation, thereby minimizing asymmetry errors inherent in endpoint-biased forward or backward forms. By incorporating central differences (for example, the second central difference \delta^2 y_0 = \Delta y_0 - \Delta y_{-1}, with higher orders defined recursively), these forms enhance stability for interior evaluations and are derived from averaging Gauss forward and backward interpolation expressions.[37][38] In the Bessel form, central differences are evaluated about the midpoint of the interval [x_0, x_1] to support symmetric interpolation around the interval center, making it particularly effective for points where u \approx 1/2; the even-order differences on either side of the midpoint are averaged, and the series can equivalently be rearranged to begin with the midpoint average (y_0 + y_1)/2. An explicit series expansion is
\begin{align*} P(u) &= y_0 + u \Delta y_0 + \frac{u(u-1)}{2!} \left[ \frac{\Delta^2 y_0 + \Delta^2 y_{-1}}{2} \right] \\ &\quad + \frac{(u - 1/2) u (u - 1)}{3!} \Delta^3 y_{-1} + \frac{u(u-1)(u-2)(u+1)}{4!} \left[ \frac{\Delta^4 y_{-2} + \Delta^4 y_{-1}}{2} \right] + \cdots, \end{align*}
where higher terms alternate between odd-order central differences and averages of even-order ones, making the form naturally suited to odd-degree truncations centered on the interval. This structure reduces sensitivity to endpoint extrapolations compared to non-centered Newton forms.[37][39][38] The Stirling form combines forward and backward differences in a centered hybrid about x_0, averaging the odd-order central differences while using the even-order ones directly, which improves conditioning at interior points, especially where u \approx 0. Its explicit series is
\begin{align*} P(u) &= y_0 + u \left[ \frac{\Delta y_0 + \Delta y_{-1}}{2} \right] + \frac{u^2}{2!} \Delta^2 y_{-1} + \frac{u (u^2 - 1)}{3!} \left[ \frac{\Delta^3 y_{-1} + \Delta^3 y_{-2}}{2} \right] \\ &\quad + \frac{u^2 (u^2 - 1)}{4!} \Delta^4 y_{-2} + \frac{u (u^2 - 1)(u^2 - 4)}{5!} \left[ \frac{\Delta^5 y_{-2} + \Delta^5 y_{-3}}{2} \right] + \cdots, \end{align*}
where even powers use direct central differences and odd powers average adjacent ones, yielding superior convergence for even-degree approximations near the center.[37][38][40] Compared to forward and backward baselines, both forms exhibit lower oscillation due to their central symmetrization, with the Bessel form excelling in odd-degree cases (an even number of nodes placed symmetrically about the interval midpoint) for central accuracy and the Stirling form in even-degree scenarios (an odd number of nodes symmetric about x_0) for interior stability; notably, Stirling's coefficients decay more rapidly than Bessel's beyond initial terms, enhancing efficiency for higher-order fits.
These properties stem from the averaging of differences, which dampens endpoint influences and promotes uniform error distribution across the interval.[37][39]
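As a small check of the Stirling series above, the following sketch truncates it at the fourth difference over five centered, equally spaced values; for f(x) = x^4 with h = 1 the truncated formula is exact, so it returns u^4 for any u (the indexing and names are illustrative).

```python
# Stirling's central-difference form truncated at Δ^4, using y = [y_{-2}, ..., y_2].

def stirling(u, y):
    d1 = [y[i + 1] - y[i] for i in range(4)]      # Δy_{-2}, Δy_{-1}, Δy_0, Δy_1
    d2 = [d1[i + 1] - d1[i] for i in range(3)]    # Δ²y_{-2}, Δ²y_{-1}, Δ²y_0
    d3 = [d2[i + 1] - d2[i] for i in range(2)]    # Δ³y_{-2}, Δ³y_{-1}
    d4 = d3[1] - d3[0]                            # Δ⁴y_{-2}
    return (y[2]
            + u * (d1[2] + d1[1]) / 2             # averaged odd-order difference
            + u**2 / 2 * d2[1]
            + u * (u**2 - 1) / 6 * (d3[1] + d3[0]) / 2
            + u**2 * (u**2 - 1) / 24 * d4)

y = [float(j**4) for j in (-2, -1, 0, 1, 2)]      # samples of f(x) = x^4, x_0 = 0, h = 1
print([stirling(u, y) for u in (0.25, -0.4, 1.5)])   # equals u**4 at each point
```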
General Case for Arbitrary Points
The Newton interpolation polynomial for arbitrary distinct points x_0, x_1, \dots, x_n is constructed using divided differences, which fully accommodate unequal spacing without the uniform step-size h simplifications of equally spaced cases. The interpolating polynomial takes the nested form p(x) = f[x_0] + f[x_0,x_1](x-x_0) + f[x_0,x_1,x_2](x-x_0)(x-x_1) + \cdots + f[x_0,\dots,x_n]\prod_{i=0}^{n-1}(x-x_i), where the coefficients are the divided differences f[x_{i_0},\dots,x_{i_k}], computed recursively as f[x_{i_0},\dots,x_{i_k}] = \frac{f[x_{i_1},\dots,x_{i_k}] - f[x_{i_0},\dots,x_{i_{k-1}}]}{x_{i_k} - x_{i_0}} for k \geq 1, with base case f[x_i] = f(x_i).[41] The denominators in this recursion vary as x_{i_k} - x_{i_0}, reflecting the arbitrary positions of the points and enabling the same recursive table-based computation as in the equal-spacing case, but without factorizations involving powers of a fixed h.[41] In multivariate settings, the Newton form generalizes to tensor product structures for interpolation on rectangular grids in multiple dimensions, such as 2D or 3D. For a bivariate example with univariate grids \{x_i\}_{i=0}^m and \{y_j\}_{j=0}^k, the interpolant is p(x,y) = \sum_{r=0}^m \sum_{s=0}^k f[x_0,\dots,x_r; y_0,\dots,y_s] \left( \prod_{i=0}^{r-1} (x - x_i) \right) \left( \prod_{j=0}^{s-1} (y - y_j) \right), where the coefficients are mixed divided differences generalizing the univariate recursion to partial differences in each variable.[42] This tensor product approach allows efficient computation via successive univariate divided difference tables, preserving the hierarchical addition of terms for higher-degree approximation on the grid.[42] For data at irregularly spaced points, the Newton form can be adapted using barycentric weights to improve numerical stability during evaluation and coefficient computation. This reformulation expresses the interpolant as a weighted sum incorporating second-kind barycentric weights \lambda_k = 1 / \prod_{j \neq k} (x_k - x_j), which mitigates cancellation errors in divided differences for irregularly distributed points, akin to the stable barycentric Lagrange variant. Despite these extensions, the general case shows heightened sensitivity to point clustering, as closely grouped x_i cause near-zero denominators in divided differences, amplifying rounding errors and ill-conditioning in the recursion.[43] Furthermore, the Lebesgue constant, which bounds the maximum error magnification in polynomial interpolation, grows roughly like O(2^n / n) for degree-n interpolation at equispaced nodes, and poorly distributed or clustered nodes can behave even worse; in higher dimensions the tensor product construction multiplies the univariate constants, compounding the effect.[44]
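The bivariate tensor-product construction can be sketched by applying the univariate divided-difference routine first along x and then along y (the helper names are illustrative; the test function is a low-degree polynomial, so the grid interpolant reproduces it exactly):

```python
# Tensor-product Newton interpolation on a rectangular grid.

def newton_coefficients(xs, vals):
    row, coeffs = list(vals), [vals[0]]
    for k in range(1, len(xs)):
        row = [(row[i + 1] - row[i]) / (xs[i + k] - xs[i]) for i in range(len(row) - 1)]
        coeffs.append(row[0])
    return coeffs

def tensor_newton_coefficients(xs, ys, F):
    """F[i][j] = f(xs[i], ys[j]); returns C with C[r][s] = f[x_0..x_r; y_0..y_s]."""
    A = [newton_coefficients(xs, [F[i][j] for i in range(len(xs))])
         for j in range(len(ys))]                       # divided differences in x
    return [newton_coefficients(ys, [A[j][r] for j in range(len(ys))])
            for r in range(len(xs))]                    # then in y

def tensor_newton_eval(x, y, xs, ys, C):
    total, px = 0.0, 1.0
    for r in range(len(xs)):
        py = 1.0
        for s in range(len(ys)):
            total += C[r][s] * px * py
            py *= (y - ys[s])
        px *= (x - xs[r])
    return total

xs, ys = [0.0, 1.0, 3.0], [0.0, 2.0, 5.0]
f = lambda x, y: x * x + x * y + 2.0 * y
F = [[f(x, y) for y in ys] for x in xs]
C = tensor_newton_coefficients(xs, ys, F)
print(tensor_newton_eval(2.0, 1.0, xs, ys, C), f(2.0, 1.0))   # both print 8.0
```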
Applications
Numerical Examples
To illustrate the construction of a Newton polynomial using divided differences, consider interpolating the function f(x) = \sin(x) at the points x_0 = 0, x_1 = 0.5, x_2 = 1.0, and x_3 = 1.5. The function values are f(x_0) = 0, f(x_1) \approx 0.4794, f(x_2) \approx 0.8415, and f(x_3) \approx 0.9975. The divided difference table is constructed as follows:
| x_i | f[x_i] | First divided differences | Second divided differences | Third divided differences |
|---|---|---|---|---|
| 0 | 0 | | | |
| | | 0.9589 | | |
| 0.5 | 0.4794 | | -0.2348 | |
| | | 0.7241 | | -0.1182 |
| 1.0 | 0.8415 | | -0.4120 | |
| | | 0.3120 | | |
| 1.5 | 0.9975 | | | |
Reading the coefficients from the top diagonal of the table gives P_3(x) = 0 + 0.9589\,x - 0.2348\,x(x - 0.5) - 0.1182\,x(x - 0.5)(x - 1.0). Evaluating directly at x = 0.75 yields P_3(0.75) \approx 0.6807, which compares well with the true value \sin(0.75) \approx 0.6816.
For equally spaced data the forward difference form applies directly. For example, interpolating f(x) = x^2 at the points x = 0, 1, 2 (spacing h = 1) uses the forward difference table:
| x_i | f(x_i) | \Delta f | \Delta^2 f |
|---|---|---|---|
| 0 | 0 | 1 | 2 |
| 1 | 1 | 3 | |
| 2 | 4 | | |
With u = (x - 0)/1 = x, the Newton forward difference formula gives P_2(x) = 0 + 1 \cdot u + \frac{u(u-1)}{2!} \cdot 2 = u^2 = x^2, recovering the underlying quadratic exactly.
Alternatively, P_3(x) from the sine example can be evaluated at x = 0.75 in nested (Horner) form, P_3(x) = a_0 + (x - x_0)\bigl(a_1 + (x - x_1)(a_2 + (x - x_2) a_3)\bigr), working outward from the innermost bracket:
- Innermost: -0.1182 (0.75 - 1.0) + (-0.2348) = -0.1182 (-0.25) - 0.2348 = 0.02955 - 0.2348 = -0.20525,
- Next: -0.20525 (0.75 - 0.5) + 0.9589 = -0.20525 (0.25) + 0.9589 = -0.0513125 + 0.9589 = 0.9075875,
- Outermost: 0.9075875 (0.75 - 0) + 0 = 0.9075875 \times 0.75 = 0.680690625 \approx 0.6807,
yielding the same result as the direct evaluation with fewer multiplications (see the sketch below).
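The worked example can be reproduced end to end with the same style of sketch (illustrative code, not taken from the cited sources):

```python
# Interpolate sin at 0, 0.5, 1.0, 1.5 and evaluate the Newton form at x = 0.75.
from math import sin

xs = [0.0, 0.5, 1.0, 1.5]
ys = [sin(x) for x in xs]

# divided-difference coefficients a_k = f[x_0, ..., x_k]
row, coeffs = list(ys), [ys[0]]
for k in range(1, len(xs)):
    row = [(row[i + 1] - row[i]) / (xs[i + k] - xs[i]) for i in range(len(row) - 1)]
    coeffs.append(row[0])
print([round(c, 4) for c in coeffs])          # [0.0, 0.9589, -0.2348, -0.1182]

# nested (Horner) evaluation at x = 0.75
x, result = 0.75, coeffs[-1]
for k in range(len(coeffs) - 2, -1, -1):
    result = result * (x - xs[k]) + coeffs[k]
print(round(result, 4), round(sin(0.75), 4))  # 0.6807 vs 0.6816
```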