Polynomial interpolation
Polynomial interpolation is a fundamental method in numerical analysis for constructing a polynomial of the lowest possible degree that passes exactly through a given set of distinct data points (x_i, y_i) for i = 0, 1, \dots, n, thereby providing an approximation to an underlying continuous function f(x).[1] This unique interpolating polynomial p_n(x) of degree at most n satisfies p_n(x_i) = y_i = f(x_i) for all i; its existence and uniqueness follow from the invertibility of the Vandermonde matrix at distinct nodes, or equivalently from the fact that a nonzero polynomial of degree at most n can have at most n roots.[2]

Common techniques for deriving the interpolating polynomial include the Lagrange form, which expresses p_n(x) as a weighted sum \sum_{i=0}^n y_i \ell_i(x), where each basis polynomial \ell_i(x) = \prod_{j \neq i} \frac{x - x_j}{x_i - x_j} equals 1 at x_i and 0 at the other nodes, and the Newton divided-difference form, which builds the polynomial incrementally from divided differences, allowing efficient computation and the addition of new points.[3][4] These methods originated in the 17th and 18th centuries: Isaac Newton developed the divided-difference approach around 1675 for applications in finite differences and quadrature, Edward Waring published a Lagrange-like formula in 1779, Leonhard Euler rediscovered it in 1783, and Joseph-Louis Lagrange presented it in 1795 as a simple tool for astronomical computations.[5]

The importance of polynomial interpolation lies in its versatility for approximating functions from discrete data, enabling tasks such as numerical differentiation, integration (e.g., via quadrature rules like Simpson's), and the solution of ordinary differential equations, with further applications in computer-aided design, signal processing, and machine learning for curve fitting and data smoothing.[2] The accuracy of the approximation is governed by the error term f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^n (x - x_i) for some \xi in the interval spanned by x and the nodes x_i. This term explains Runge's phenomenon, in which high-degree interpolants through equispaced points oscillate wildly near the ends of the interval, often necessitating clustered nodes such as Chebyshev points for stability.[2]
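The effect of node placement can be checked numerically. The following minimal sketch (assuming NumPy is available; the test function, degrees, and grid size are illustrative choices, not from the source) interpolates Runge's function 1/(1 + 25x^2) at equispaced and at Chebyshev nodes and compares the maximum errors, which grow with n in the equispaced case but stay controlled with Chebyshev nodes:

```python
import numpy as np
from numpy.polynomial import Polynomial

def max_interp_error(nodes, f, grid):
    """Fit the unique degree-(len(nodes)-1) polynomial through
    (nodes, f(nodes)) and return its max error over a fine grid."""
    p = Polynomial.fit(nodes, f(nodes), deg=len(nodes) - 1)
    return float(np.max(np.abs(f(grid) - p(grid))))

f = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # Runge's function
grid = np.linspace(-1.0, 1.0, 2001)        # fine evaluation grid

for n in (5, 10, 20):
    equi = np.linspace(-1.0, 1.0, n + 1)                             # equispaced nodes
    cheb = np.cos((2 * np.arange(n + 1) + 1) * np.pi / (2 * n + 2))  # Chebyshev nodes
    print(f"n={n:2d}  equispaced error: {max_interp_error(equi, f, grid):.3e}"
          f"   Chebyshev error: {max_interp_error(cheb, f, grid):.3e}")
```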
Fundamentals
Definition and Motivation
Polynomial interpolation is a fundamental technique in numerical analysis for constructing a polynomial that exactly matches a given set of data points. Given a collection of n+1 points (x_0, y_0), \dots, (x_n, y_n), where the x_i are distinct real or complex numbers, the task is to determine a polynomial P(x) of degree at most n satisfying P(x_i) = y_i for each i = 0, \dots, n. This polynomial reproduces the data exactly at the nodes x_i while providing an approximation, at other points, to any underlying function that generated the y_i values.[6]

The foundation of polynomial interpolation rests on the basic properties of polynomials, which are finite sums of non-negative integer powers of the variable, expressed as P(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n, where the a_k are constant coefficients. Interpolation leverages this structure to achieve exact agreement at the nodes, distinguishing it from general function approximation methods that seek only closeness without exact matches. The motivation stems from practical needs in computation and science, where discrete observations must be extended to continuous models, such as filling gaps in tabulated data or estimating function values between measurements.

Historically, polynomial interpolation emerged from efforts to facilitate the creation and use of mathematical tables, with early roots in ancient astronomy for predicting celestial positions. The modern development is credited to Isaac Newton, who began work on interpolation using finite differences around 1675 to address irregularities in observational data, though his methods were not published until 1711. In the 18th century, Joseph-Louis Lagrange advanced the field with an explicit formula for the interpolating polynomial, published in 1795 and building on prior contributions such as Edward Waring's of 1779; this formulation emphasized direct construction from the data points without relying on differences.[7][5]

A simple illustration occurs for n = 1, known as linear interpolation, where the polynomial is the straight line connecting two points: P(x) = y_0 + \frac{y_1 - y_0}{x_1 - x_0} (x - x_0). This formula interpolates linearly between (x_0, y_0) and (x_1, y_1), providing an intuitive entry point to the general case.[8]
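As a concrete rendering of the n = 1 case, the following minimal sketch (the function name and sample values are illustrative, not from the source) evaluates the linear interpolant between two points:

```python
def linear_interpolate(x0: float, y0: float, x1: float, y1: float, x: float) -> float:
    """Evaluate the unique degree-<=1 polynomial through (x0, y0) and (x1, y1) at x."""
    if x0 == x1:
        raise ValueError("interpolation nodes must be distinct")
    return y0 + (y1 - y0) / (x1 - x0) * (x - x0)

# The line through (0, 0) and (2, 4) passes through (1, 2).
print(linear_interpolate(0.0, 0.0, 2.0, 4.0, 1.0))  # 2.0
```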
Basic Applications
Polynomial interpolation serves as a foundational tool in numerical analysis for approximating integrals and derivatives from discrete data sets. By constructing a polynomial that passes through given points, one can integrate or differentiate that polynomial to estimate the integral or derivative of the underlying function, a practical method when analytical forms are unavailable. This approach underpins techniques like Simpson's rule for numerical quadrature, where the interpolating polynomial enables accurate approximation of definite integrals over intervals defined by sample points.[9][10]

In data smoothing and curve fitting, polynomial interpolation generates continuous representations from scattered measurements, which is essential in engineering and computer graphics for producing smooth plots and visualizations. It provides an exact fit to the data points, making it suitable for clean datasets where trends must be preserved, as in trajectory prediction in design software. For noisy data, however, regression-based curve fitting with lower-degree polynomials is preferred, since it approximates the underlying trend while damping noise. With appropriate node distributions, interpolation avoids the wild oscillations associated with high-degree fits at poorly placed nodes, making it suitable for initial curve approximations in fields requiring visual or analytical continuity.[11][12]

Polynomial interpolation plays a key role in signal processing by reconstructing continuous signals from discrete samples, for example in audio and image resampling, where it estimates values between known points to avoid aliasing. Lagrange interpolation filters and polynomial-derived finite impulse response filters enable efficient upsampling while preserving signal integrity in digital systems. In scientific computing, it supports the solution of differential equations through methods like the method of lines, in which spatial derivatives are approximated via interpolation before time-stepping; such uses surged in the mid-20th century with the rise of electronic computers, enabling numerical simulations in physics and engineering.[13][14][15][16] Polynomial interpolation also finds use in fields such as meteorology for estimating values from sparse observations, though specialized methods like kriging are common for spatial data.
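To make the quadrature connection concrete, here is a minimal sketch of Simpson's rule (the integrand and interval are illustrative choices): integrating the quadratic polynomial that interpolates f at the endpoints and midpoint of [a, b] yields exactly the familiar weights 1, 4, 1.

```python
import math

def simpson(f, a: float, b: float) -> float:
    """Integrate the quadratic interpolant of f at a, (a+b)/2, and b over [a, b].
    Integrating that interpolant in closed form gives Simpson's rule."""
    m = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b))

# Exact integral of sin over [0, pi] is 2; one Simpson step gives ~2.094.
print(simpson(math.sin, 0.0, math.pi))
```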
Existence and Uniqueness
Interpolation Theorem
The interpolation theorem, also known as the existence and uniqueness theorem for polynomial interpolation, asserts that given n+1 distinct points x_0, x_1, \dots, x_n in the real numbers and arbitrary values y_0, y_1, \dots, y_n, there exists a unique polynomial P(x) of degree at most n such that P(x_i) = y_i for each i = 0, 1, \dots, n.[17] This result guarantees that the interpolation problem is well-posed under these conditions, providing a foundational basis for constructing polynomial approximations to functions.[8]

From a vector space perspective, the space of polynomials of degree at most n over the reals has dimension n+1, with the standard monomial basis \{1, x, x^2, \dots, x^n\}. Interpolating given values at n+1 distinct points corresponds to solving a linear system V \mathbf{c} = \mathbf{y}, where V is the (n+1) \times (n+1) Vandermonde matrix with entries V_{ij} = x_i^j (for i, j = 0, \dots, n), \mathbf{c} is the coefficient vector of the polynomial, and \mathbf{y} = (y_0, \dots, y_n)^T. The Vandermonde matrix is invertible precisely when the x_i are distinct, ensuring a unique solution for \mathbf{c}.[18] This linear algebraic formulation extends naturally to the complex numbers, where the theorem holds for distinct points in \mathbb{C}, as the Vandermonde determinant remains nonzero.[19]

Intuitively, the theorem holds because the evaluations at n+1 distinct points define n+1 linearly independent linear functionals on the (n+1)-dimensional space of polynomials of degree at most n, so the interpolation operator has trivial kernel, guaranteeing existence and uniqueness.[20]
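The linear-system view translates directly into code. This minimal sketch (assuming NumPy; the sample nodes and values are arbitrary) builds the Vandermonde matrix for distinct nodes and solves for the coefficient vector:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 4.0])   # distinct nodes x_i
y = np.array([1.0, 3.0, 2.0, 5.0])   # prescribed values y_i

# V[i, j] = x_i ** j, so V @ c evaluates the polynomial at every node.
V = np.vander(x, increasing=True)
c = np.linalg.solve(V, y)            # unique solution since the x_i are distinct

# Verify the interpolation conditions P(x_i) = y_i.
assert np.allclose(V @ c, y)
print("coefficients (a_0, ..., a_n):", c)
```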
Proofs of Existence and Uniqueness
The existence and uniqueness of a polynomial interpolant of degree at most n for n+1 distinct points (x_0, y_0), \dots, (x_n, y_n) in \mathbb{R} can be established through both non-constructive and constructive arguments.

A non-constructive proof relies on the linear algebra of the vector space of polynomials. The space \mathcal{P}_n of polynomials of degree at most n has dimension n+1, with the monomial basis \{1, x, \dots, x^n\}. The evaluation map from \mathcal{P}_n to \mathbb{R}^{n+1}, which sends a polynomial p to (p(x_0), \dots, p(x_n)), is linear and is represented by the (n+1) \times (n+1) Vandermonde matrix V with entries V_{ij} = x_i^j (indexing from 0). Since the x_i are distinct, \det(V) = \prod_{0 \leq i < j \leq n} (x_j - x_i) \neq 0, so V is invertible. The map is therefore bijective, implying that there exists a unique p \in \mathcal{P}_n with p(x_i) = y_i for all i = 0, \dots, n.[19]

Uniqueness can also be proven directly without reference to matrices. Suppose p, q \in \mathcal{P}_n both satisfy p(x_i) = y_i = q(x_i) for i = 0, \dots, n. Then r = p - q \in \mathcal{P}_n vanishes at n+1 distinct points. By the factor theorem, each root x_i contributes a factor (x - x_i), so a nonzero r with n+1 distinct roots would have degree at least n+1, contradicting r \in \mathcal{P}_n. Thus r \equiv 0, so p = q.[8]

For a constructive proof of existence, the Lagrange form provides an explicit interpolating polynomial, though its full derivation is omitted here. The basis polynomials l_i(x) (for i = 0, \dots, n) are defined so that l_i(x_j) = \delta_{ij} (the Kronecker delta), ensuring their linear independence in \mathcal{P}_n. The interpolant p(x) = \sum_{i=0}^n y_i l_i(x) then satisfies p(x_j) = y_j by construction and lies in \mathcal{P}_n. Combined with the uniqueness argument above, this yields both existence and uniqueness.[8][19]

An alternative existence proof proceeds by induction on n, yielding a recursive construction. For the base case n = 0, the constant polynomial p_0(x) = y_0 uniquely interpolates the single point (x_0, y_0). Assume existence and uniqueness hold for n = k \geq 0: there is a unique p_k \in \mathcal{P}_k interpolating any k+1 points with distinct nodes. For n = k+1, let p_k^{(0)} interpolate the first k+1 points (x_0, y_0), \dots, (x_k, y_k), and let p_k^{(1)} interpolate the last k+1 points (x_1, y_1), \dots, (x_{k+1}, y_{k+1}). Define p_{k+1}(x) = \frac{x - x_0}{x_{k+1} - x_0} p_k^{(1)}(x) + \frac{x_{k+1} - x}{x_{k+1} - x_0} p_k^{(0)}(x). This polynomial lies in \mathcal{P}_{k+1}, and direct substitution shows p_{k+1}(x_i) = y_i for i = 0, \dots, k+1: at x_0 and x_{k+1} one weight vanishes, while at the shared intermediate nodes both interpolants give y_i and the weights sum to 1. Uniqueness follows from the earlier root argument, so by induction existence and uniqueness hold for all n.[21]
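The inductive construction above is exactly the recursion used by Neville's algorithm. The following sketch (the point data are chosen for illustration) evaluates the interpolant at a single x by repeatedly blending lower-order interpolants:

```python
def neville(xs, ys, x):
    """Evaluate the unique interpolating polynomial at x via the recursion
    p_{i,j}(x) = ((x - x_i) p_{i+1,j}(x) + (x_j - x) p_{i,j-1}(x)) / (x_j - x_i)."""
    n = len(xs)
    p = list(ys)                      # p[i] starts as the degree-0 interpolant y_i
    for k in range(1, n):             # k = degree of the current interpolants
        for i in range(n - k):        # p[i+1] still holds the previous round's value
            p[i] = ((x - xs[i]) * p[i + 1] + (xs[i + k] - x) * p[i]) / (xs[i + k] - xs[i])
    return p[0]

# The cubic through these points is x**3, so the value at 1.5 is 3.375.
print(neville([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 8.0, 27.0], 1.5))
```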
Uniqueness Corollary
A key consequence of the uniqueness theorem is that any two polynomials of degree at most n agreeing at n+1 distinct points must be identical, so the interpolation conditions determine a single polynomial of degree at most n.[22] Furthermore, if the given data points arise from a polynomial of degree at most n, the unique interpolant of degree at most n coincides exactly with that original polynomial.[23] In the context of Hermite interpolation, where the data include both function values and derivatives at the nodes, uniqueness holds for the polynomial of appropriate degree satisfying these higher-order conditions, provided the total number of conditions equals the degree plus one.[24]

The uniqueness of the interpolating polynomial underpins the stability of numerical representations such as the barycentric form, which exploits this property to enable efficient and robust evaluation without explicit coefficient computation, particularly for well-distributed nodes such as Chebyshev points.[25] Extending to multiple variables, uniqueness can be achieved via tensor products of univariate interpolants over product grids, yielding a unique polynomial in the tensor-product space; for spaces of total degree at most n, however, uniqueness is more restricted and requires carefully poised point sets to avoid singularities.[26]
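The barycentric form mentioned above can be sketched briefly (assuming NumPy; the node and value choices are illustrative). It precomputes weights w_i = 1 / \prod_{j \neq i} (x_i - x_j) once, after which each evaluation costs only O(n):

```python
import numpy as np

def bary_weights(xs: np.ndarray) -> np.ndarray:
    """w_i = 1 / prod_{j != i} (x_i - x_j), computed once in O(n^2)."""
    diff = xs[:, None] - xs[None, :]
    np.fill_diagonal(diff, 1.0)          # avoid the zero self-difference
    return 1.0 / diff.prod(axis=1)

def bary_eval(xs, ys, w, x):
    """Second (true) barycentric formula; O(n) per evaluation point."""
    d = x - xs
    if np.any(d == 0.0):                 # x coincides with a node
        return ys[np.argmin(np.abs(d))]
    t = w / d
    return (t @ ys) / t.sum()

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = xs**3
w = bary_weights(xs)
print(bary_eval(xs, ys, w, 1.5))         # 3.375, since the interpolant is x^3
```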
Construction Methods
Lagrange Polynomial
The Lagrange form provides an explicit expression for the unique polynomial of degree at most n that interpolates n+1 given distinct points (x_0, y_0), \dots, (x_n, y_n) in the plane. It is constructed as P(x) = \sum_{i=0}^n y_i \ell_i(x), where the Lagrange basis polynomials \ell_i(x) are defined by \ell_i(x) = \prod_{\substack{j=0 \\ j \neq i}}^n \frac{x - x_j}{x_i - x_j}.[27][28]

Each basis polynomial \ell_i(x) has degree n and satisfies \ell_i(x_k) = \delta_{ik} for k = 0, \dots, n, where \delta_{ik} is the Kronecker delta (equal to 1 if i = k and 0 otherwise).[27][28] This ensures that P(x_k) = y_k at each interpolation point, since \ell_i(x) vanishes at every other node x_j with j \neq i.[28] A primary advantage of the Lagrange form is its direct construction, which requires no solution of linear systems or prior computation of coefficients, making it particularly useful for theoretical purposes such as proving the existence of an interpolating polynomial.[27]

To illustrate, consider quadratic interpolation through the points (0, 0), (1, 1), and (2, 4). The basis polynomials are \ell_0(x) = \frac{(x-1)(x-2)}{(0-1)(0-2)} = \frac{(x-1)(x-2)}{2}, \ell_1(x) = \frac{(x-0)(x-2)}{(1-0)(1-2)} = -x(x-2), and \ell_2(x) = \frac{(x-0)(x-1)}{(2-0)(2-1)} = \frac{x(x-1)}{2}. Thus P(x) = 0 \cdot \ell_0(x) + 1 \cdot \ell_1(x) + 4 \cdot \ell_2(x) = -x(x-2) + 2x(x-1) = x^2, which matches the underlying quadratic function.[29] Evaluating P(x) at a new point via the Lagrange form costs O(n^2) arithmetic operations, since each of the n+1 basis polynomials requires O(n) products.[27]
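A direct transcription of the Lagrange form (a minimal sketch; the quadratic example data are taken from the text above) makes the O(n^2) evaluation cost visible as two nested loops:

```python
def lagrange_eval(xs, ys, x):
    """Evaluate P(x) = sum_i y_i * l_i(x), where
    l_i(x) = prod_{j != i} (x - x_j) / (x_i - x_j).  O(n^2) per point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        li = 1.0                          # build the basis polynomial l_i at x
        for j, xj in enumerate(xs):
            if j != i:
                li *= (x - xj) / (xi - xj)
        total += yi * li
    return total

# The quadratic through (0, 0), (1, 1), (2, 4) is x**2, so P(3) = 9.
print(lagrange_eval([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 3.0))
```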
Newton Polynomial
The Newton polynomial provides an alternative representation of the interpolating polynomial, expressed in a nested or hierarchical form that facilitates efficient computation, particularly when data points are added incrementally.[30] The polynomial P_n(x) of degree at most n that interpolates the function values f(x_i) = y_i at distinct nodes x_0, x_1, \dots, x_n is given by P_n(x) = a_0 + a_1 (x - x_0) + a_2 (x - x_0)(x - x_1) + \cdots + a_n \prod_{i=0}^{n-1} (x - x_i), where the coefficients a_k = f[x_0, x_1, \dots, x_k] are the divided differences of order k.[31] This form, also known as Newton's divided difference interpolation formula, exploits a recursive structure for the coefficients, distinguishing it from the direct product form of Lagrange interpolation.[31]

Divided differences are defined recursively. The zeroth-order divided difference is simply the function value, f[x_i] = y_i, and higher orders are obtained via f[x_i, x_{i+1}, \dots, x_{i+k}] = \frac{f[x_{i+1}, \dots, x_{i+k}] - f[x_i, \dots, x_{i+k-1}]}{x_{i+k} - x_i} for k \geq 1, assuming x_{i+k} \neq x_i.[31] These values are typically organized into a divided difference table, where each entry in a column is calculated from the difference of two adjacent entries in the previous column divided by the spacing of the corresponding nodes; the top diagonal of the table provides the coefficients a_k.[30] For smooth functions, the divided difference of order k equals the k-th derivative at some intermediate point divided by k!, revealing information about the function's higher-order behavior.[30]

A key advantage of the Newton form is its computational efficiency when a new data point is added: only the new divided differences involving that point need to be computed, without recomputing prior terms.[30] This hierarchical structure also supports adaptive interpolation strategies and provides insight into the function's smoothness through the magnitude of higher-order differences.[30]

To illustrate, consider interpolating the points (0, 0), (1, 1), (2, 8), and (3, 27), which correspond to f(x) = x^3. The divided difference table is constructed as follows:

| x_i | f[x_i] | First-order | Second-order | Third-order |
|---|---|---|---|---|
| 0 | 0 | | | |
| | | 1 | | |
| 1 | 1 | | 3 | |
| | | 7 | | 1 |
| 2 | 8 | | 6 | |
| | | 19 | | |
| 3 | 27 | | | |

Reading the coefficients a_k = f[x_0, \dots, x_k] from the top diagonal gives a_0 = 0, a_1 = 1, a_2 = 3, and a_3 = 1, so P_3(x) = x + 3x(x-1) + x(x-1)(x-2) = x^3, recovering the original cubic.
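The table can also be generated programmatically. The following sketch (using the cubic data from the example above) computes the divided-difference coefficients in place, column by column, and evaluates the Newton form with Horner-style nesting:

```python
def divided_differences(xs, ys):
    """Return the Newton coefficients a_k = f[x_0, ..., x_k].
    Each outer pass k overwrites the array with the k-th column of the table."""
    a = list(ys)
    n = len(xs)
    for k in range(1, n):
        for i in range(n - 1, k - 1, -1):   # go backwards to reuse lower orders
            a[i] = (a[i] - a[i - 1]) / (xs[i] - xs[i - k])
    return a

def newton_eval(xs, a, x):
    """Evaluate a_0 + a_1 (x - x_0) + ... by nested multiplication."""
    result = a[-1]
    for k in range(len(a) - 2, -1, -1):
        result = result * (x - xs[k]) + a[k]
    return result

xs = [0.0, 1.0, 2.0, 3.0]
a = divided_differences(xs, [0.0, 1.0, 8.0, 27.0])
print(a)                          # [0.0, 1.0, 3.0, 1.0], the table's top diagonal
print(newton_eval(xs, a, 1.5))    # 3.375 = 1.5**3
```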