Interpolation
Interpolation is a fundamental method in numerical analysis and mathematics for estimating unknown values within a range of known discrete data points by constructing a continuous function that passes exactly through those points.[1] This process, often involving polynomials or piecewise functions, enables the approximation of smooth curves from sampled data, distinguishing it from broader approximation techniques, which need not fit the data exactly at the given points.[1]

The practice of interpolation has ancient origins, with Babylonian astronomers employing it around 300 BC for tabular computations, followed by the Greek scholar Hipparchus around 150 BC in astronomical predictions.[2] The term "interpolation" first appeared in 1655 in a Latin text by John Wallis, but systematic development began in the 17th century with Thomas Harriot's work in 1611 and Isaac Newton's foundational contributions starting in 1675, including finite difference methods for polynomial interpolation.[2] In the 18th century, Edward Waring discovered the Lagrange interpolation formula in 1779; it was independently published by Joseph-Louis Lagrange in 1795, providing an explicit basis for polynomial forms.[3]

Key methods include polynomial interpolation, in which a single polynomial of degree at most n is fitted to n+1 distinct points; the solution is unique because a nonzero polynomial of degree at most n cannot have n+1 distinct roots.[1] The Lagrange form expresses the interpolant as a weighted sum of basis polynomials, each associated with one data point, while the Newton form uses divided differences for efficient computation and extension to additional points.[1] To improve stability and avoid the oscillations of high-degree polynomials known as Runge's phenomenon, piecewise interpolation methods such as splines divide the domain into subintervals, fitting low-degree polynomials (e.g., linear or cubic) on each while enforcing continuity conditions.[1]

Interpolation finds wide application in data analysis, computer graphics for curve rendering, signal processing for resampling, and numerical methods such as quadrature (integration) and differentiation, where it derives approximation rules from fitted polynomials.[1] Error estimation is essential; the error is typically bounded in terms of a higher derivative of the underlying function and the distribution of the nodes, guiding the choice of method for accuracy and computational efficiency.[1]

Fundamentals
Definition
In numerical analysis, interpolation is the process of constructing a function that passes exactly through a given set of discrete data points in order to estimate values at intermediate locations.[1] Specifically, given n+1 distinct points (x_0, y_0), (x_1, y_1), \dots, (x_n, y_n), where the x_i are the nodes and y_i = f(x_i) are the values of some underlying function f, interpolation seeks a function p such that p(x_i) = y_i for i = 0, 1, \dots, n, allowing evaluation of p(x) at points x that do not coincide with any node.[1][4]

A key distinction from function approximation is that interpolation requires an exact fit at the specified nodes, whereas approximation methods, such as least-squares fitting, seek to minimize overall error without necessarily passing through every point, typically when the number of data points exceeds the degree of the approximating function.[1] For instance, in polynomial interpolation of degree at most n through n+1 points, the solution is unique, ensuring precise reproduction at the nodes.[1]

Common formulations include the Lagrange interpolating polynomial, expressed as p(x) = \sum_{i=0}^n y_i \ell_i(x), where \ell_i(x) = \prod_{j=0, j \neq i}^n \frac{x - x_j}{x_i - x_j} are the basis polynomials, and the Newton form, which builds the interpolant incrementally using divided differences. Both forms represent the same unique interpolating polynomial and satisfy the exact-fit requirement.[5]
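As a concrete illustration of the Lagrange form, the following minimal sketch evaluates p(x) = \sum_i y_i \ell_i(x) directly from the definition of the basis polynomials. The function name lagrange_eval and the sample data are illustrative assumptions, not drawn from the sources or from any particular library.

```python
# Minimal sketch of evaluating the Lagrange interpolating polynomial.
# The data points and the helper name are illustrative.

def lagrange_eval(x, xs, ys):
    """Evaluate the Lagrange interpolant at x for distinct nodes xs with values ys."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Basis polynomial l_i(x): equal to 1 at xs[i] and 0 at every other node.
        basis = 1.0
        for j in range(n):
            if j != i:
                basis *= (x - xs[j]) / (xs[i] - xs[j])
        total += ys[i] * basis
    return total

xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 2.0]
print(lagrange_eval(1.0, xs, ys))   # 3.0: the interpolant reproduces y_1 exactly
print(lagrange_eval(0.5, xs, ys))   # 2.375: an estimate between the nodes
```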
Motivation and History

Interpolation serves as a fundamental tool in numerical analysis for estimating unknown values within discrete datasets, enabling the approximation of continuous functions from sampled points in fields such as scientific measurement, engineering design, and data visualization.[6] This need arises because real-world data is often collected at irregular or finite intervals, requiring interpolation to fill gaps, smooth curves, or predict intermediate values for practical applications like modeling physical phenomena or generating graphical representations.[7]

The practice of interpolation dates back to ancient astronomy, where Claudius Ptolemy employed linear interpolation techniques in the 2nd century AD to construct tables of chord functions in his Almagest, facilitating the prediction of planetary positions from discrete observations.[8] In the 17th century, Isaac Newton laid foundational work on finite differences and interpolation in a 1675 letter, establishing methods for polynomial approximation that influenced classical theory.[9] Joseph-Louis Lagrange advanced this in 1795 with his explicit formula for polynomial interpolation, providing a systematic way to construct unique polynomials passing through given points.[9]

By the early 20th century, Carl Runge highlighted limitations in 1901, demonstrating through examples that high-degree polynomial interpolation on equispaced points could lead to oscillatory errors, known as Runge's phenomenon, which underscored the need for careful node selection in approximations.[10] Key advancements in the mid-20th century included the mathematical formalization of splines by I. J. Schoenberg in 1946, inspired by flexible wooden or metal strips used in shipbuilding to draw smooth hull curves, leading to piecewise polynomial methods that avoid global oscillations.[11] In geostatistics, D. G. Krige developed early statistical interpolation techniques in 1951 for estimating ore grades in South African mining, which Georges Matheron formalized as kriging in 1960, introducing optimal unbiased prediction under spatial correlation assumptions.[12]

The advent of digital computing in the 1950s propelled interpolation into numerical methods for solving differential equations and data processing on early machines like ENIAC, enabling efficient implementation of algorithms for engineering simulations. In the modern era, interpolation has integrated with artificial intelligence, particularly post-2020, for data imputation in large-scale machine learning datasets, where methods like Gaussian process regression variants enhance missing value estimation while preserving statistical properties in high-dimensional data.[13]

Univariate Interpolation Methods
Nearest-Neighbor Interpolation
Nearest-neighbor interpolation, also known as piecewise constant interpolation or zero-order hold, is the simplest method for estimating values between known data points in univariate interpolation. It assigns to a query point x the value y_i of the closest data point x_i from a given set of pairs (x_j, y_j) for j = 1 to n, without any blending or smoothing.[14][15] This approach is particularly suited for categorical data or scenarios where smoothness is not required, producing a step-like function with constant values in intervals defined by midpoints between neighboring points.[16]

The mathematical formulation is

f(x) = y_i \quad \text{where} \quad i = \arg\min_{j=1,\dots,n} |x - x_j|.

In cases of ties (equal distances to multiple points), a convention such as selecting the leftmost or rightmost point, or rounding toward even indices, is typically applied to ensure determinism.[15]

The algorithm involves, for a query point x, computing the Euclidean distance (the absolute difference in one dimension) to each x_j and selecting the y_j with the minimum distance; this naive implementation runs in O(n) time per query. With preprocessing, such as sorting the x_j or using a search structure like a binary search tree, the time complexity can be reduced to O(\log n), or O(1) for uniform grids.[15][16]

Consider a simple example with data points (0, 1), (1, 3), and (2, 2). For a query at x = 0.6, the distances are |0.6 - 0| = 0.6, |0.6 - 1| = 0.4, and |0.6 - 2| = 1.4, so the closest point is (1, 3) and f(0.6) = 3. This results in a discontinuous step function: constant at 1 for x \in [-0.5, 0.5), 3 for x \in [0.5, 1.5), and 2 for x \in [1.5, 2.5), visualizing as a staircase with jumps at the decision boundaries.[16]

Nearest-neighbor interpolation offers significant computational efficiency, requiring minimal operations beyond distance comparisons, making it ideal for real-time applications or large datasets where speed trumps accuracy. It also preserves the exact range of input values, avoiding extrapolation beyond the data. However, it produces non-differentiable, discontinuous outputs that poorly approximate smooth underlying functions, leading to visual artifacts like blockiness in images or aliasing in signals.[14][15][16]
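A minimal sketch of one-dimensional nearest-neighbor lookup under these conventions is shown below; it assumes the nodes are pre-sorted, uses binary search for the O(\log n) lookup, and breaks ties toward the left node. The function name nearest_neighbor and the example data are illustrative assumptions.

```python
# Illustrative 1-D nearest-neighbor interpolation with an O(log n) lookup
# via binary search on sorted nodes; ties break toward the left node here.
from bisect import bisect_left

def nearest_neighbor(x, xs, ys):
    """Return the y value of the node closest to x; xs must be sorted ascending."""
    pos = bisect_left(xs, x)            # insertion point of x among the sorted nodes
    if pos == 0:
        return ys[0]
    if pos == len(xs):
        return ys[-1]
    left, right = xs[pos - 1], xs[pos]
    # A tie (x exactly midway) goes to the left node via the <= comparison.
    return ys[pos - 1] if x - left <= right - x else ys[pos]

xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 2.0]
print(nearest_neighbor(0.6, xs, ys))   # 3.0: x = 0.6 is closest to the node at 1
print(nearest_neighbor(0.5, xs, ys))   # 1.0: tie broken toward the left node
```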
Linear Interpolation

Linear interpolation is a fundamental method in numerical analysis for estimating values between known data points by constructing a piecewise linear function that connects consecutive points with straight line segments, assuming the data points are ordered by their independent variable values x_i. This approach provides a simple, first-order approximation that is computationally efficient and preserves monotonicity within each interval.[17]

The interpolated value f(x) for a query point x lying within the interval [x_i, x_{i+1}], where the known points are (x_i, y_i) and (x_{i+1}, y_{i+1}) with x_i < x_{i+1}, is given by the formula

f(x) = y_i + (y_{i+1} - y_i) \frac{x - x_i}{x_{i+1} - x_i}.

This expression reproduces the data values exactly at the interval endpoints.[17]

The formula derives from the concept of a weighted average, where f(x) is a convex combination of y_i and y_{i+1}. The weight for y_{i+1} is the relative distance \frac{x - x_i}{x_{i+1} - x_i}, which ranges from 0 to 1 across the interval, and the weight for y_i is the complement 1 - \frac{x - x_i}{x_{i+1} - x_i}. This linear weighting follows directly from the equation of the straight line passing through the two points, parameterized by the slope m = \frac{y_{i+1} - y_i}{x_{i+1} - x_i}.[18]

To illustrate, consider the data points (0, 0), (1, 1), and (2, 0). The piecewise linear interpolant consists of two segments: one from x = 0 to x = 1 and one from x = 1 to x = 2. The following table shows interpolated values at selected points within these intervals:

| x | Interval | f(x) |
|---|---|---|
| 0.0 | [0,1] | 0.0 |
| 0.25 | [0,1] | 0.25 |
| 0.5 | [0,1] | 0.5 |
| 0.75 | [0,1] | 0.75 |
| 1.0 | [0,1] or [1,2] | 1.0 |
| 1.25 | [1,2] | 0.75 |
| 1.5 | [1,2] | 0.5 |
| 1.75 | [1,2] | 0.25 |
| 2.0 | [1,2] | 0.0 |
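Under the same assumptions (sorted nodes, with queries outside the data range clamped to the endpoint values), a minimal piecewise-linear interpolation sketch of the formula above looks as follows; the function name lerp and the example data mirror the table and are purely illustrative.

```python
# Minimal piecewise-linear interpolation over sorted nodes; out-of-range
# queries are clamped to the endpoint values as a simple convention.
from bisect import bisect_right

def lerp(x, xs, ys):
    """Piecewise-linear interpolation at x for sorted nodes xs and values ys."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1              # index of the interval [xs[i], xs[i+1]]
    t = (x - xs[i]) / (xs[i + 1] - xs[i])    # relative position in the interval, in [0, 1]
    return ys[i] + (ys[i + 1] - ys[i]) * t   # convex combination of the endpoint values

xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
print(lerp(0.25, xs, ys))   # 0.25, matching the table above
print(lerp(1.75, xs, ys))   # 0.25
```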
Polynomial Interpolation
Polynomial interpolation constructs a unique polynomial p(x) of degree at most n-1 that passes exactly through n given distinct points (x_i, y_i) for i = 0, 1, \dots, n-1, where y_i = f(x_i) for some underlying function f. This global method applies the same polynomial across the entire domain, making it suitable for exact fitting but prone to instability for high degrees or ill-conditioned points.[22][23]

One common representation is the Lagrange form, which expresses p(x) directly in terms of the data points without solving a system of equations:

p(x) = \sum_{i=0}^{n-1} y_i \ell_i(x), \qquad \ell_i(x) = \prod_{\substack{j=0 \\ j \neq i}}^{n-1} \frac{x - x_j}{x_i - x_j}.

Each \ell_i(x) is 1 at x = x_i and 0 at the other points x_j (j \neq i), ensuring the interpolation conditions are satisfied. This form is intuitive for theoretical analysis but computationally inefficient for large n due to the product evaluations.[22][23]

An alternative is the Newton form, which builds the polynomial incrementally using divided differences and is more efficient for adding points or evaluating at multiple locations:

p(x) = f[x_0] + f[x_0, x_1](x - x_0) + f[x_0, x_1, x_2](x - x_0)(x - x_1) + \cdots + f[x_0, \dots, x_{n-1}] \prod_{k=0}^{n-2} (x - x_k),

where the coefficients are the divided differences defined recursively: f[x_i] = y_i, and for k \geq 1,

f[x_i, \dots, x_{i+k}] = \frac{f[x_{i+1}, \dots, x_{i+k}] - f[x_i, \dots, x_{i+k-1}]}{x_{i+k} - x_i}.

These can be computed via a divided-difference table, facilitating numerical stability and error estimation. For equispaced points, forward differences simplify the process, but the general form handles arbitrary spacing.[24][23]

Example: Cubic Interpolation for Rocket Velocity

Consider interpolating the upward velocity v(t) of a rocket at times t = 0, 2, 4, 6 seconds, with the following data:

| t (s) | v(t) (m/s) |
|---|---|
| 0 | 0 |
| 2 | 227 |
| 4 | 362 |
| 6 | 517 |
The divided differences for this data, arranged in the standard staggered table with each entry placed between the rows it is computed from, are:

| t_i | f[t_i] | First-order | Second-order | Third-order |
|---|---|---|---|---|
| 0 | 0 | | | |
| | | 113.5 | | |
| 2 | 227 | | -11.5 | |
| | | 67.5 | | 2.333 |
| 4 | 362 | | 2.5 | |
| | | 77.5 | | |
| 6 | 517 | | | |

Reading the coefficients off the top diagonal gives the Newton-form interpolant v(t) \approx 0 + 113.5 t - 11.5 t(t - 2) + 2.333 t(t - 2)(t - 4), which reproduces all four tabulated velocities exactly and can be used to estimate the velocity at intermediate times.
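The coefficients on the top diagonal are exactly the Newton coefficients, and they can be computed and evaluated programmatically. The sketch below applies Newton's divided differences to the rocket data above; the function names divided_differences and newton_eval are illustrative assumptions rather than a standard API.

```python
# Sketch of Newton's divided-difference interpolation for the rocket data above.
# Function names are illustrative, not from a particular library.

def divided_differences(xs, ys):
    """Return the Newton coefficients f[x_0], f[x_0,x_1], ..., f[x_0,...,x_{n-1}]."""
    coeffs = list(ys)
    n = len(xs)
    for k in range(1, n):
        # After pass k, coeffs[i] holds the k-th order difference f[x_{i-k}, ..., x_i].
        for i in range(n - 1, k - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - k])
    return coeffs

def newton_eval(x, xs, coeffs):
    """Evaluate the Newton-form polynomial at x using nested (Horner-like) multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

ts = [0.0, 2.0, 4.0, 6.0]          # times (s)
vs = [0.0, 227.0, 362.0, 517.0]    # velocities (m/s)
c = divided_differences(ts, vs)
print(c)                            # [0.0, 113.5, -11.5, 2.333...], the top diagonal of the table
print(newton_eval(4.0, ts, c))      # 362.0: the interpolant reproduces the data exactly
print(newton_eval(3.0, ts, c))      # ~299: estimated velocity (m/s) at t = 3 s
```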