
Multivariate interpolation

Multivariate interpolation is a fundamental technique in numerical analysis and approximation theory that involves constructing a function of two or more variables to pass exactly through a prescribed set of data points in higher-dimensional space, thereby estimating function values at unobserved locations. This process generalizes univariate interpolation, where a single-variable function interpolates one-dimensional data, to multidimensional settings, such as surfaces in three dimensions or hypersurfaces in higher ones, and relies on the data points being poised, meaning they are not contained in any algebraic hypersurface of lower degree. The origins of multivariate interpolation trace back to the mid-19th century, with Leopold Kronecker's 1865 work introducing polynomial ideals for interpolating at arbitrary point configurations in multiple variables, laying the groundwork for systematic approaches despite early challenges in unisolvence (the unique existence of an interpolating polynomial). Subsequent developments in the late 19th and early 20th centuries, including Shizuo Narumi's 1927 extension of Newton formulas to tensor-product grids and Herbert Salzer's 1944 adaptation of Gregory-Newton formulas for triangular lattices, shifted focus toward general scattered data sets by the 1950s, influenced by computational advances and applications in cubature formulas. Key theoretical frameworks emerged in the late 20th century, such as multivariate divided difference calculus, which connects interpolation to jet spaces and submanifold derivatives via block Vandermonde matrices and quasi-determinants, enabling recursive computation of coefficients.

Common methods for multivariate interpolation include polynomial approaches like Lagrange and Newton forms adapted to multiple variables, spline-based techniques for smooth approximations over scattered or gridded data, and radial basis functions (RBFs) for flexible, meshfree interpolation that handles irregular point distributions effectively. These methods address the curse of dimensionality, in which the number of required points grows exponentially with the dimension, through strategies like tensor-product structures for rectangular grids or symmetry exploitation to reduce computational cost, though challenges persist in ensuring stability and avoiding Runge-like oscillations in higher dimensions.

Multivariate interpolation finds broad applications in scientific computing, including data visualization on rectilinear or unstructured grids, surface reconstruction in computer-aided design, and modeling physical phenomena such as precipitation patterns via regularized splines with tension for accurate geospatial estimation. In engineering, it enables smooth 3D surface generation for vibrating structures using parametric cubic polynomials that maintain continuity of first- and second-order derivatives, supporting projections in various coordinate systems for resonance analysis. Its role extends to machine learning for scattered data approximation and numerical simulations, where efficient implementations on GPUs can yield up to 10-fold speedups for large datasets.

Fundamentals

Definition and motivation

Multivariate interpolation is a technique in numerical analysis used to approximate the values of a function f: \mathbb{R}^n \to \mathbb{R} (or more generally to \mathbb{R}^m) at arbitrary points in n-dimensional space, where n > 1, based on its known values at a finite set of points. This process constructs an interpolating function P that estimates f(x) for any x \in \mathbb{R}^n using data from points \{x_i\} \subset \mathbb{R}^n and corresponding values f(x_i), often requiring the solution of a linear system whose sample matrix, formed by evaluating basis functions at these points, must be nonsingular. A key distinction exists between exact interpolation, where P(x_i) = f(x_i) for all i, and approximation, where P provides a close but not necessarily exact match at the data points to improve stability or smoothness. The motivation for multivariate interpolation arises from the need to extend univariate interpolation, the one-dimensional case where functions are approximated along a line, to higher dimensions, enabling the reconstruction of complex multidimensional phenomena from sparse data. Historically, its foundations trace back to Leopold Kronecker's work on polynomial ideals for interpolating over arbitrary point sets in multiple variables, with significant developments in the early 20th century through tensor-product and finite-difference methods and later extensions to scattered data. In practice, it addresses critical challenges in fields requiring multidimensional function estimation, such as computer graphics, where values are interpolated across sampled surface coordinates; scientific computing, where physical fields such as temperature distributions are simulated on discrete models; and engineering and the geosciences, where surfaces are reconstructed from scattered measurements. These applications underscore its role in bridging discrete observations to continuous models, essential for prediction, visualization, and analysis in science and engineering.
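The linear-system view described above can be made concrete with a small sketch: choose a set of basis functions, evaluate them at the sample points to build the sample (collocation) matrix, and solve for the coefficients. The bivariate quadratic basis, the six sample points, and the test function below are illustrative choices, not prescribed by any particular method.

```python
# A minimal sketch of the generic linear-system formulation of interpolation;
# the basis, sample points, and test function are illustrative assumptions.
import numpy as np

def basis(p):
    """Six bivariate monomials: 1, x, y, xy, x^2, y^2 (full quadratic basis)."""
    x, y = p
    return np.array([1.0, x, y, x * y, x * x, y * y])

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [1.0, 1.0], [0.5, 0.2], [0.2, 0.7]])
f = lambda p: np.sin(p[0]) + p[1] ** 2          # function being sampled
values = np.array([f(p) for p in points])

V = np.array([basis(p) for p in points])        # sample (Vandermonde-like) matrix
coeffs = np.linalg.solve(V, values)             # nonsingular iff the points are poised

query = np.array([0.3, 0.4])
print(basis(query) @ coeffs, f(query))          # interpolant vs. true value
```

If the six points happened to lie on a common conic, the matrix V would be singular and the quadratic interpolation problem would have no unique solution, which is exactly the poisedness condition mentioned above.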

Relation to univariate interpolation

Univariate interpolation addresses the approximation of a function f: \mathbb{R} \to \mathbb{R} using a polynomial or piecewise polynomial that passes through a finite set of data points (x_i, f(x_i)) for i = 1, \dots, n+1. Key methods include Newton's divided difference interpolation, introduced by Isaac Newton in the late 17th century for arbitrary data spacing, and Lagrange interpolation, formalized by Joseph-Louis Lagrange in 1795, which constructs the polynomial directly as a linear combination of basis polynomials. Spline interpolation, developed by Isaac J. Schoenberg in 1946, extends these by using piecewise polynomials to ensure smoothness while mitigating oscillations in global polynomials. Extending these techniques to multivariate interpolation for functions f: \mathbb{R}^n \to \mathbb{R} introduces significant challenges, primarily due to the curse of dimensionality. A direct polynomial generalization of degree d in n variables requires approximately d^n terms in the full tensor product basis, leading to exponential growth in computational complexity and sensitivity to data configuration as n increases. This combinatorial explosion, first highlighted in multivariate contexts during the 20th century, contrasts sharply with the linear O(d) scaling in one dimension and often renders global polynomial interpolants impractical for high dimensions. A primary extension bridging univariate and multivariate cases is the tensor product construction, which combines univariate interpolants separably along each dimension to form a multivariate interpolant on rectangular grids. Developed as early as 1927 by Narumi through extensions of Newton's formulas using finite differences, this approach preserves univariate convergence properties but inherits the exponential term growth. Piecewise methods, analogous to univariate splines, offer alternatives by localizing the interpolation to avoid global high-degree issues, though they require careful handling of continuity across subdomains. The roots of multivariate interpolation trace to 19th-century univariate advancements, such as August Crelle's 1821 work on polynomial ratios building on Cauchy's contributions, which laid groundwork for uniqueness in low dimensions. Leopold Kronecker's 1865 paper pioneered multivariate extensions using ideal theory for arbitrary points in two variables. Significant progress to two and three dimensions occurred in the mid-20th century, driven by grid-based methods for numerical solutions of partial differential equations, with contributions from Salzer in 1944 on multivariate Gregory-Newton formulas and Radon in 1948 on general point distributions.

Regular grid interpolation

General framework for any dimension

In multivariate interpolation on regular grids, the data points are arranged on a structured lattice in \mathbb{R}^n, formed as the Cartesian product (tensor product) of one-dimensional grids along each coordinate axis, often uniformly spaced to facilitate computations in numerical simulations and imaging. This setup ensures that the grid points are poised for unique interpolation of degree up to the specified order in each coordinate direction, avoiding issues like ill-conditioning that arise in unstructured configurations. The general interpolation operator in this framework emphasizes methods with local support, where the estimated value at an arbitrary point x \in \mathbb{R}^n is computed solely from function values at a compact set of neighboring points, determined by a predefined stencil or neighborhood. Such stencils typically consist of a small, fixed number of points surrounding x, enabling efficient evaluation while maintaining smoothness and accuracy; for example, the size of the stencil scales as the univariate stencil size raised to the power of the dimension (that is, exponentially with the overall grid dimension), since it is the tensor product of univariate stencils. This locality contrasts with global methods and is crucial for scalability in high dimensions. A core principle underlying these operators is the tensor product construction, which derives the n-dimensional interpolant by successively applying univariate kernels along each axis, yielding a separable form where the multivariate basis functions are products of their one-dimensional counterparts. In abstract terms, the interpolant P(x) takes the form P(x) = \sum_{i} w_i f(x_i), with weights w_i computed as products of 1D kernel evaluations at the corresponding coordinates. For n=2, this manifests as bilinear interpolation when using linear 1D kernels, though explicit forms are deferred to dimension-specific discussions. Building briefly on univariate foundations, this tensorization preserves properties like exact reproduction of polynomials up to the kernel degree. The advantages of this framework lie in its implementation simplicity, particularly on uniform grids prevalent in fields like scientific computing and image processing, where tensor product structures allow straightforward extension from 1D codes without redesigning core algorithms. Moreover, local support minimizes memory usage and enables parallelization, making it viable for large-scale simulations despite the curse of dimensionality in full tensor grids.
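The following sketch illustrates the tensor-product principle with linear 1D kernels (multilinear interpolation) on an integer lattice in any dimension. Function names, the test field, and the unit grid spacing are illustrative assumptions; query points are assumed to lie strictly inside the grid.

```python
# A minimal sketch of tensor-product (multilinear) interpolation on a uniform
# regular grid in n dimensions; the field and query point are illustrative.
import numpy as np
from itertools import product

def multilinear_interpolate(values, x):
    """Interpolate `values` (an n-dimensional array sampled on the integer
    lattice) at a real-valued query point `x` of length n, assumed to lie
    inside the grid."""
    x = np.asarray(x, dtype=float)
    base = np.floor(x).astype(int)           # lower corner of the enclosing cell
    frac = x - base                           # fractional offsets along each axis
    result = 0.0
    # Sum over the 2^n corners of the cell; each weight is the product of
    # one-dimensional linear kernel evaluations (tensor-product structure).
    for corner in product((0, 1), repeat=values.ndim):
        weight = 1.0
        for axis, bit in enumerate(corner):
            weight *= frac[axis] if bit else 1.0 - frac[axis]
        result += weight * values[tuple(base + np.array(corner))]
    return result

# Example: a 3-D grid sampling f(x, y, z) = x + 2*y + 3*z, which multilinear
# interpolation reproduces exactly.
grid = np.fromfunction(lambda i, j, k: i + 2 * j + 3 * k, (4, 4, 4))
print(multilinear_interpolate(grid, [1.5, 0.25, 2.0]))   # 8.0
```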

Methods in two dimensions

In two dimensions, multivariate interpolation on regular grids often employs tensor product constructions, where univariate interpolants are extended separably across each coordinate direction. Bilinear interpolation provides a simple and efficient method for approximating function values within a rectangular cell defined by four grid points, assuming a uniform or nonuniform grid. For a point (x, y) inside the cell with corner values f(0,0), f(1,0), f(0,1), and f(1,1) on a unit grid, the interpolated value is given by P(x,y) = (1-a)(1-b)f(0,0) + a(1-b)f(1,0) + (1-a)bf(0,1) + abf(1,1), where a = x - \lfloor x \rfloor and b = y - \lfloor y \rfloor are the fractional parts. This formula arises from the tensor product of two linear univariate interpolants: first interpolating linearly along one axis to obtain intermediate values, then along the other axis, yielding a bilinear surface that is continuous (C^0) but not differentiable at grid lines. Bicubic interpolation extends this approach for smoother approximations, using a 4×4 stencil of 16 neighboring grid points to fit a cubic polynomial in each direction. The general separable form, known as cubic convolution, computes the interpolated value g(x, y) by convolving the grid values with a two-dimensional kernel derived from the one-dimensional kernel u(s) = \begin{cases} (a + 2)|s|^3 - (a + 3)|s|^2 + 1 & 0 \leq |s| < 1, \\ a|s|^3 - 5a|s|^2 + 8a|s| - 4a & 1 \leq |s| < 2, \\ 0 & |s| \geq 2, \end{cases} with parameter a = -0.5 for optimal third-order accuracy, applied separately along x and y: g(x, y) = \sum_{j=-1}^{2} \sum_{m=-1}^{2} f(x_j, y_m) \, u\left(\frac{x - x_j}{h_x}\right) u\left(\frac{y - y_m}{h_y}\right), where h_x and h_y are grid spacings. This construction provides a smoother approximation than bilinear, with continuous first derivatives, though second derivatives may be discontinuous across grid lines, at the cost of increased computation. These methods find widespread use in image resizing within computer graphics, where bilinear offers fast resampling for real-time applications and bicubic preserves finer details in high-quality scaling. In 2D finite element approximations, bilinear interpolation defines shape functions for quadrilateral elements, enabling efficient solutions to partial differential equations on structured meshes. Compared to bicubic, bilinear interpolation is simpler and faster, requiring only four points per evaluation and achieving C^0 continuity suitable for coarse approximations, while bicubic demands 16 points for higher-order smoothness and greater accuracy in gradient-sensitive tasks, though with higher computational overhead.
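A small sketch of the separable cubic-convolution scheme above, using the Keys kernel with a = -0.5 on a unit-spaced grid, is shown below. The clamped boundary handling, the grid data, and the function names are illustrative assumptions rather than part of any standard API.

```python
# A minimal sketch of separable cubic convolution (Keys kernel, a = -0.5) on a
# uniform 2-D grid with unit spacing; border handling is simple clamping.
import numpy as np

def keys_kernel(s, a=-0.5):
    """One-dimensional cubic convolution kernel u(s)."""
    s = abs(s)
    if s < 1:
        return (a + 2) * s**3 - (a + 3) * s**2 + 1
    if s < 2:
        return a * s**3 - 5 * a * s**2 + 8 * a * s - 4 * a
    return 0.0

def bicubic_convolution(values, x, y):
    """Interpolate the 2-D array `values` at the real-valued point (x, y)."""
    ix, iy = int(np.floor(x)), int(np.floor(y))
    result = 0.0
    for j in range(ix - 1, ix + 3):            # 4x4 stencil around the query
        for m in range(iy - 1, iy + 3):
            jj = min(max(j, 0), values.shape[0] - 1)   # clamp at the borders
            mm = min(max(m, 0), values.shape[1] - 1)
            result += values[jj, mm] * keys_kernel(x - j) * keys_kernel(y - m)
    return result

grid = np.fromfunction(lambda i, j: np.sin(0.3 * i) * np.cos(0.2 * j), (8, 8))
print(bicubic_convolution(grid, 3.4, 2.7))
```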

Methods in three dimensions

Trilinear interpolation extends bilinear interpolation to three dimensions, providing a first-order approximation for values within a cubic cell of a regular 3D grid. It uses an 8-point stencil corresponding to the vertices of the unit cube enclosing the query point (x, y, z), where fractional parts determine the weights. The interpolated value P(x, y, z) is given by P(x, y, z) = \sum_{i=0}^{1} \sum_{j=0}^{1} \sum_{k=0}^{1} w_i^x w_j^y w_k^z f(i, j, k), with w_0^\cdot = 1 - \mathrm{frac}(\cdot) and w_1^\cdot = \mathrm{frac}(\cdot) for each coordinate, ensuring C^0 continuity across cell boundaries. Tricubic interpolation, analogous to bicubic methods in two dimensions, provides higher-order approximation using function values from a 64-point stencil over a 4 \times 4 \times 4 neighborhood, expressed as a tensor-product polynomial with coefficients determined via a local 64 \times 64 matrix relating vertex data to the tricubic basis, enabling accurate approximation of gradients and curvatures for smoother volumetric reconstructions with C^0 continuity across cells. For explicit C^1 smoothness, Hermite-based methods incorporate derivatives at cell vertices. These methods find extensive use in medical imaging, such as interpolating voxel data from CT and MRI scans to enhance fiber tracking in diffusion tensor imaging, yielding longer and smoother white matter fiber visualizations with reduced noise compared to nearest-neighbor approaches. In fluid dynamics simulations, trilinear interpolation estimates disturbed velocities at particle locations in Euler-Lagrange point-particle models, supporting accurate drag force computations in gas-solid flows like volcanic ash transport, though correction schemes can reduce particle velocity errors by 70% or more compared to standard trilinear interpolation in such models. In three dimensions, anisotropic grids, which are common in simulations where resolution varies by axis, exacerbate aliasing artifacts during interpolation, as uneven sampling rates distort high-frequency components in the reconstructed field. Adaptive stencils, which adjust the interpolation kernel based on local grid spacing, help mitigate these issues by preserving isotropy and reducing oscillatory errors in such scenarios.
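In practice, trilinear interpolation on a regular 3D grid is readily available in numerical libraries. The sketch below uses SciPy's RegularGridInterpolator with method="linear", which applies the 8-point tensor-product scheme described above; the grid, sampled field, and query points are illustrative.

```python
# A minimal sketch of trilinear interpolation on a regular 3-D grid using
# scipy.interpolate.RegularGridInterpolator; data and queries are illustrative.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

x = np.linspace(0.0, 1.0, 11)
y = np.linspace(0.0, 2.0, 21)
z = np.linspace(0.0, 3.0, 31)
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
values = np.sin(X) * np.cos(Y) + Z            # field sampled on the grid

interp = RegularGridInterpolator((x, y, z), values, method="linear")
queries = np.array([[0.25, 1.30, 2.05],
                    [0.80, 0.40, 0.90]])
print(interp(queries))                        # interpolated values at 2 points
```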

Extensions to higher dimensions

Extending regular grid interpolation methods to dimensions greater than three involves generalizing the tensor product construction, where univariate basis functions are combined across each dimension to form a multivariate interpolant. In this framework, the interpolating spline P(\mathbf{x}) = P(x_1, \dots, x_n) for \mathbf{x} \in \mathbb{R}^n is expressed as P(\mathbf{x}) = \sum_{\alpha} c_{\alpha} \prod_{k=1}^n \phi_k(x_k - t_{k, \alpha_k}), where \alpha = (\alpha_1, \dots, \alpha_n) is a multi-index, c_{\alpha} are coefficients determined by the data, \phi_k are univariate B-spline basis functions in the k-th dimension, and t_{k, j} denote the knots in that dimension. This tensor product structure preserves the smoothness and local support properties of univariate splines while enabling approximation on Cartesian product grids, with the basis functions constructed recursively from lower-dimensional cases to avoid explicit enumeration in high dimensions. However, scaling this approach to high n encounters the curse of dimensionality, where the number of degrees of freedom grows exponentially with the dimension. For a uniform grid resolution h in each direction, the total number of grid points required is on the order of O(h^{-n}), leading to prohibitive storage and computational demands even for moderate n; for n = 10, a mere ten points per dimension already yields 10^{10} grid points in total. This exponential scaling limits direct tensor product methods to low dimensions, necessitating reduced-order approximations to maintain feasibility. To address these challenges, techniques like sparse grids and hierarchical tensor formats have been developed for efficient high-dimensional interpolation. Sparse grids, based on Smolyak's algorithm, construct the approximant as a linear combination of tensor products over anisotropic grids that omit high-order interactions across all dimensions simultaneously, achieving near-optimal convergence rates of O(N^{-r} (\log N)^{(n-1)(r+1)}) for smooth functions with N points, where r relates to the univariate order. For even higher dimensions (n \geq 4), tensor train (TT) decomposition represents the coefficient tensor in a low-rank chain format, reducing storage to O(n r^2 m), where r is the TT-rank and m the univariate basis size, enabling interpolation via cross approximation or sampling without full tensor materialization. These methods mitigate the curse by exploiting low-rank structure in the data. Such extensions find critical applications in solving high-dimensional partial differential equations (PDEs) in physics, particularly in quantum mechanics, where the many-body Schrödinger equation involves wave functions in 3N dimensions for N particles. Tensor-based interpolation facilitates dimension reduction and efficient evolution in time-dependent simulations, as demonstrated in tensor train methods for approximating solution manifolds in quantum dynamics.
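The contrast between full tensor-product grids and Smolyak-type sparse grids can be made concrete by simply counting nodes. The sketch below assumes nested one-dimensional rules with 1, 3, 5, 9, 17, ... points at increasing levels (a common Clenshaw-Curtis-style choice); it only counts points and is not an implementation of sparse-grid interpolation itself.

```python
# A minimal sketch contrasting node counts of full tensor-product grids and
# Smolyak sparse grids, assuming nested 1-D rules with 1, 3, 5, 9, ... nodes.
from itertools import product

def points_1d(level):
    """Number of nodes of the assumed nested 1-D rule at a given level."""
    return 1 if level == 0 else 2**level + 1

def full_grid_size(n, level):
    """Full tensor product: the level-`level` 1-D rule in every dimension."""
    return points_1d(level) ** n

def sparse_grid_size(n, level):
    """Smolyak construction: keep multi-indices with |l|_1 <= level, counting
    only the points newly added at each 1-D level (nested rules)."""
    def new_points(l):
        return points_1d(l) - (points_1d(l - 1) if l > 0 else 0)
    total = 0
    for levels in product(range(level + 1), repeat=n):
        if sum(levels) <= level:
            count = 1
            for l in levels:
                count *= new_points(l)
            total += count
    return total

for n in (2, 4, 6, 8):
    print(n, full_grid_size(n, 4), sparse_grid_size(n, 4))
```

Already for n = 8 at level 4 the full grid needs roughly 7 billion nodes, while the sparse grid uses only a few thousand, which is the practical motivation for Smolyak-type constructions.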

Scattered data interpolation

Nearest-neighbor and inverse distance weighting

Nearest-neighbor interpolation is a simple local method for estimating values at scattered data points in multivariate settings, where the interpolated value at a query point \mathbf{x} is set equal to the value at the nearest data point \mathbf{x}_i from the set \{\mathbf{x}_1, \dots, \mathbf{x}_m\} \subset \mathbb{R}^d. The nearest point is determined by minimizing the Euclidean distance \|\mathbf{x} - \mathbf{x}_i\|_2 = \sqrt{\sum_{k=1}^d (x_k - x_{i,k})^2}, making the method computationally efficient for high-dimensional scattered data with no underlying smoothness assumption beyond proximity. This approach produces piecewise constant surfaces, which can lead to discontinuities but is valuable for quick approximations in applications requiring minimal smoothing, such as initial data visualization. Inverse distance weighting (IDW) extends this locality by blending values from multiple nearby points, weighted inversely by their distances to the query point \mathbf{x}. The interpolated value is given by P(\mathbf{x}) = \frac{\sum_{i=1}^m w_i f(\mathbf{x}_i)}{\sum_{i=1}^m w_i}, where w_i = 1 / \|\mathbf{x} - \mathbf{x}_i\|_p^r uses the p-norm distance (typically p=2 for Euclidean) raised to power r (often r=2), allowing tuning of the influence decay for desired smoothness. Introduced originally for two dimensions, the method generalizes naturally to higher dimensions via the multivariate distance metric, providing a deterministic, non-parametric estimator suitable for irregularly spaced points. IDW satisfies the interpolation property, reproducing exact data values at the sample points \mathbf{x}_i since weights concentrate there, and the weighting decreases monotonically with distance, emphasizing local structure over global trends. In geostatistics, it is widely applied to interpolate sparse sensor measurements, such as environmental variables from limited monitoring stations, due to its simplicity and ability to handle uneven distributions without assuming stationarity. Parameter r controls smoothness: higher values yield sharper local variations, while lower values promote broader averaging, though optimal selection often requires cross-validation based on data density. A key limitation of IDW arises in uneven data distributions, where it can generate "bull's-eye" artifacts—concentric rings of exaggerated values around isolated points—due to over-reliance on distant weights when local data is sparse. These artifacts reduce accuracy in regions of clustering or gaps, prompting modifications like adaptive power selection or hybrid approaches in practice, though the core method remains favored for its interpretability and low computational overhead in multivariate scattered data scenarios.
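A compact sketch of the IDW estimator defined above is given below; the power parameter, the random scattered sites, and the test function are illustrative assumptions, and coincident query points are handled by returning the stored data value to preserve the interpolation property.

```python
# A minimal sketch of inverse distance weighting (Shepard-type) for scattered
# data in any dimension; parameters and test data are illustrative.
import numpy as np

def idw_interpolate(points, values, query, r=2.0, eps=1e-12):
    """points: (m, d) sample sites; values: (m,) data; query: (d,) point."""
    d = np.linalg.norm(points - query, axis=1)
    exact = d < eps
    if np.any(exact):                  # query coincides with a data site
        return values[exact][0]
    w = 1.0 / d**r                     # weights decay with distance
    return np.sum(w * values) / np.sum(w)

rng = np.random.default_rng(0)
pts = rng.random((50, 3))              # 50 scattered points in 3-D
vals = np.sum(pts, axis=1)             # f(x) = x1 + x2 + x3
print(idw_interpolate(pts, vals, np.array([0.5, 0.5, 0.5])))
```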

Radial basis function methods

Radial basis function (RBF) interpolation provides a flexible, meshless technique for constructing smooth approximations to multivariate functions from scattered data points in arbitrary dimensions. The core idea is to represent the interpolant as a weighted sum of radial basis functions centered at the given data sites \mathbf{x}_i \in \mathbb{R}^d, i = 1, \dots, m: P(\mathbf{x}) = \sum_{i=1}^m \lambda_i \phi(\|\mathbf{x} - \mathbf{x}_i\|), where \phi: [0, \infty) \to \mathbb{R} is a univariate radial kernel function, \|\cdot\| denotes the Euclidean norm, and the coefficients \lambda_i \in \mathbb{R} are chosen to ensure interpolation, i.e., P(\mathbf{x}_j) = f_j for function values f_j at each \mathbf{x}_j. To achieve affine invariance and enable polynomial reproduction, an additional low-degree polynomial term q(\mathbf{x}) of degree at most k is often included: P(\mathbf{x}) = \sum_{i=1}^m \lambda_i \phi(\|\mathbf{x} - \mathbf{x}_i\|) + q(\mathbf{x}), with orthogonality conditions \sum_{i=1}^m \lambda_i p(\mathbf{x}_i) = 0 for any polynomial p of degree at most k. Common radial kernels include the multiquadric function \phi(r) = \sqrt{1 + (\epsilon r)^2}, originally proposed by Hardy in 1971 for geophysical surface modeling, and the Gaussian \phi(r) = e^{-(\epsilon r)^2}, where \epsilon > 0 is a shape parameter controlling the kernel's decay. These choices yield infinitely smooth interpolants, with the multiquadric giving a nonsingular interpolation matrix in any dimension and the Gaussian offering rapid decay for localized influence. The coefficients \lambda = (\lambda_1, \dots, \lambda_m)^T are determined by solving the symmetric linear system A \lambda = \mathbf{f}, where \mathbf{f} = (f_1, \dots, f_m)^T collects the data values and the matrix entries are A_{ij} = \phi(\|\mathbf{x}_i - \mathbf{x}_j\|) (augmented with a polynomial block for the extended form). For kernels covered by Micchelli's 1986 theorem, such as the Gaussian and the multiquadric, the matrix A is guaranteed to be invertible, ensuring a unique solution. However, as the number of points m grows large, the system often becomes severely ill-conditioned, with condition numbers growing rapidly as the data sites become denser, due to near-linear dependence among the basis functions; this necessitates preconditioning or iterative solvers for practical computation. RBF methods encompass both exact interpolation, which exactly reproduces the data, and approximate variants that introduce regularization to mitigate ill-conditioning while allowing controlled approximation error. A key variant involves compactly supported RBFs, introduced by Wendland in 1995, which are piecewise polynomials with finite support radius, yielding sparse interpolation matrices and promoting local approximation properties that scale better for large datasets. These compact kernels, such as Wendland's \phi_{3,1}(r) = (1-r)_+^4 (4r + 1) for smoothness order 2, maintain positive definiteness while restricting interactions to nearby points. In applications like surface reconstruction from 3D point clouds, RBFs excel by implicitly defining watertight surfaces through level-set representations, as demonstrated in Carr et al.'s 2001 work on blending and repairing geometric models. The primary advantages of RBF interpolation lie in its dimension-independent formulation, requiring no underlying mesh or grid, and its ability to produce infinitely differentiable, globally smooth approximants suitable for scattered data modeling and meshfree numerical PDE solving.
Polyharmonic spline interpolation, which includes the thin plate spline, emerges as a special case of RBF methods when using conditionally positive definite kernels of the form \phi(r) = r^{2k-d} (with a logarithmic factor in even dimensions) from the polyharmonic family.
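A direct sketch of the Gaussian-kernel formulation described above follows: assemble the symmetric matrix A, solve for the coefficients, and evaluate the interpolant. The shape parameter, the scattered sites, and the test function are illustrative assumptions (no polynomial augmentation is included in this sketch).

```python
# A minimal sketch of global RBF interpolation with a Gaussian kernel,
# following the linear-system formulation above; epsilon and the data
# are illustrative choices.
import numpy as np

def rbf_fit(points, values, epsilon=3.0):
    """Solve A @ lam = f with A_ij = exp(-(eps * ||x_i - x_j||)^2)."""
    diff = points[:, None, :] - points[None, :, :]
    A = np.exp(-(epsilon * np.linalg.norm(diff, axis=2)) ** 2)
    return np.linalg.solve(A, values)

def rbf_eval(points, lam, query, epsilon=3.0):
    d = np.linalg.norm(points - query, axis=1)
    return np.sum(lam * np.exp(-(epsilon * d) ** 2))

rng = np.random.default_rng(1)
pts = rng.random((80, 2))                        # scattered sites in the plane
vals = np.sin(4 * pts[:, 0]) * pts[:, 1]         # sampled test function
lam = rbf_fit(pts, vals)
q = np.array([0.4, 0.6])
print(rbf_eval(pts, lam, q), np.sin(4 * 0.4) * 0.6)   # interpolant vs. truth
```

For larger point sets the dense solve becomes the bottleneck and the matrix grows ill-conditioned, which is where compactly supported kernels, preconditioning, or regularized variants mentioned above come into play.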

Triangulation-based approaches

Triangulation-based approaches to multivariate interpolation decompose a set of scattered points in \mathbb{R}^n into a simplicial mesh, enabling piecewise polynomial interpolation over each simplex while adapting to the irregularity of the data distribution. A key structure for this decomposition is the Delaunay triangulation, which is defined as the dual of the Voronoi diagram for the point set, ensuring that the circumsphere of each simplex contains no other points in its interior. In two dimensions, this property maximizes the minimum angle among all triangles in the triangulation, promoting well-shaped elements that enhance numerical stability and approximation quality. The Delaunay triangulation can be computed in \mathbb{R}^n using algorithms such as incremental insertion, where points are added sequentially and the triangulation is locally updated by flipping faces or retriangulating the affected simplices to maintain the empty circumsphere criterion. This process starts with an initial simplex enclosing the points and proceeds by locating the insertion point within an existing simplex, then retriangulating the affected region. The resulting structure consists of n-simplices that tile the convex hull of the data points without overlaps or gaps. For piecewise linear interpolation on this triangulation, a query point x within an n-simplex with vertices x_0, x_1, \dots, x_n is interpolated using barycentric coordinates \lambda_j, which satisfy \sum_{j=0}^n \lambda_j = 1 and \lambda_j \geq 0, representing the relative areas (or volumes in higher dimensions) of the opposite sub-simplices. The interpolant is then given by P(x) = \sum_{j=0}^n \lambda_j f(x_j), where f(x_j) are the data values at the vertices; this affine combination ensures exact reproduction of the data at the nodes. In two dimensions, the barycentric coordinates for a triangle are \lambda_j = A_j / A, with A_j the area of the sub-triangle opposite vertex j and A the total area. This method yields a continuous C^0 surface that is linear within each simplex. Extensions to higher-order interpolation maintain the triangulation while increasing polynomial degree for smoother approximants. The Clough-Tocher scheme, for instance, subdivides each triangle into three cubic Bézier patches in two dimensions, using data values and estimated normals to achieve C^1 continuity across edges; it interpolates functional data with cubic precision and extends inductively to higher dimensions by refining simplices. Generalizations to n dimensions involve similar macro-element subdivisions, supporting C^1 interpolation with gradient data. These higher-order methods address the derivative discontinuities inherent in basic piecewise linear interpolation. Triangulation-based interpolation finds extensive application in finite element methods for solving partial differential equations on irregular domains, where the triangulation provides a conforming mesh that adapts to boundaries and internal features without structured assumptions. This approach ensures monotonicity and stability in simulations, such as those in fluid dynamics or structural mechanics. Key properties include a local maximum principle: because the barycentric weights are nonnegative and sum to one, the interpolant remains within the range of the data values on each simplex. However, the global interpolant may exhibit discontinuities in the first derivatives at simplex boundaries, potentially leading to less smooth visualizations.
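The sketch below illustrates the barycentric scheme on a Delaunay triangulation of scattered 2D points using SciPy; LinearNDInterpolator triangulates the sites and applies the piecewise linear formula above, and the second half extracts the barycentric coordinates of a query point explicitly. The random data and test function are illustrative assumptions.

```python
# A minimal sketch of piecewise linear interpolation over a Delaunay
# triangulation of scattered 2-D points; data and query are illustrative.
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(2)
pts = rng.random((30, 2))                 # scattered sites in the unit square
vals = pts[:, 0] ** 2 + pts[:, 1]         # sampled test function

tri = Delaunay(pts)                       # simplicial decomposition
interp = LinearNDInterpolator(tri, vals)  # C^0, linear on each triangle
print(interp(0.5, 0.5))                   # value (NaN if outside the hull)

# Barycentric coordinates of the query point in its containing triangle:
q = np.array([0.5, 0.5])
simplex = tri.find_simplex(q)
T = tri.transform[simplex]                # affine map to barycentric coords
b = T[:2].dot(q - T[2])
lam = np.append(b, 1.0 - b.sum())         # weights sum to 1, nonnegative inside
print(lam, lam @ vals[tri.simplices[simplex]])   # same interpolated value
```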

Challenges and considerations

Error analysis and accuracy

In multivariate interpolation, error analysis typically relies on theoretical bounds that quantify the difference between the true function f and its interpolant P. For sufficiently smooth functions f \in C^k(\Omega), where \Omega \subset \mathbb{R}^d is a bounded domain, the error satisfies |f(\mathbf{x}) - P(\mathbf{x})| \leq C h^k \|D^k f\|_{\infty, \Omega}, with C a constant independent of h (the maximum fill distance or grid spacing), k the order of the interpolating polynomials or splines, and D^k f the tensor of k-th partial derivatives. This bound arises from extensions of the Bramble-Hilbert lemma to multivariate settings, ensuring optimal convergence rates under minimal smoothness assumptions. For polynomial interpolants, the total error can also be expressed as \|f - P\| \leq (1 + \Lambda_n) E_n(f), where E_n(f) is the best approximation error by polynomials of degree at most n-1, and \Lambda_n is the Lebesgue constant measuring the conditioning of the interpolation operator; \Lambda_n grows logarithmically with n in one dimension but accelerates in higher dimensions. High-dimensional settings exacerbate error amplification due to the curse of dimensionality and ill-conditioning. In dimensions d \geq 3, the Lebesgue constant often scales as O((\log n)^d) or worse for tensor-product nodes, leading to unstable interpolants even for moderate n. The multivariate Runge phenomenon manifests similarly, where high-degree polynomial interpolation on equispaced or certain unisolvent nodes produces oscillatory errors that diverge as n increases, particularly for analytic functions like the Runge function f(\mathbf{x}) = 1/(1 + 25 \|\mathbf{x}\|^2) extended to \mathbb{R}^d. For instance, piecewise linear interpolation in two dimensions illustrates this sensitivity, with error bounds of O(h^2) holding only under uniform smoothness and degrading near boundaries in scattered configurations. Practical validation of interpolation accuracy, especially for scattered data, employs techniques like cross-validation to estimate prediction errors without additional samples. In leave-one-out cross-validation, each data point is omitted in turn, interpolated from the rest, and the residual compared to the true value, yielding metrics such as the root-mean-square error to rank methods such as radial basis functions or inverse distance weighting. Theoretical guarantees often invoke Sobolev norms to incorporate smoothness assumptions; for f \in H^{k+1}(\Omega) (the Sobolev space of functions with square-integrable derivatives up to order k+1), error estimates take the form \|f - P\|_{H^m(\Omega)} \leq C h^{k+1-m} |f|_{H^{k+1}(\Omega)} for m \leq k+1, with the semi-norm | \cdot | controlling higher derivatives. Kernel-based interpolants may exhibit inconsistencies in higher Sobolev indices if eigenvalue decay is slow, with the error bounded below by \Omega(n^{-\epsilon}) for certain \epsilon > 0. Post-2000 advances in adaptive methods have targeted error reduction in anisotropic or high-dimensional regions by dynamically refining node sets or polynomial degrees. Sparse grid schemes, using nested anisotropic grids, achieve near-optimal rates O(n^{-s}) (with s depending on the smoothness of the target function) while mitigating the exponential growth in node counts, as demonstrated for parametric PDEs in dimensions up to 100. These approaches, often combined with a posteriori estimators, ensure convergence to any prescribed tolerance without initial preconditioning, extending classical adaptive finite element strategies to scattered multivariate data.
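The leave-one-out cross-validation procedure described above can be sketched as follows; the two simple interpolants being compared (IDW and nearest-neighbor), the random scattered data, and the test function are illustrative assumptions, not a prescribed benchmark.

```python
# A minimal sketch of leave-one-out cross-validation for comparing scattered
# data interpolants; the methods and test data are illustrative.
import numpy as np

def idw(points, values, q, r=2.0):
    d = np.linalg.norm(points - q, axis=1)
    w = 1.0 / np.maximum(d, 1e-12) ** r
    return np.sum(w * values) / np.sum(w)

def nearest(points, values, q):
    return values[np.argmin(np.linalg.norm(points - q, axis=1))]

def loo_rmse(points, values, method):
    """Leave each sample out, predict it from the rest, report the RMSE."""
    errs = []
    for i in range(len(points)):
        mask = np.arange(len(points)) != i
        pred = method(points[mask], values[mask], points[i])
        errs.append(pred - values[i])
    return np.sqrt(np.mean(np.square(errs)))

rng = np.random.default_rng(3)
pts = rng.random((60, 2))
vals = np.exp(-np.sum((pts - 0.5) ** 2, axis=1))   # smooth test function
print("IDW LOO-RMSE:", loo_rmse(pts, vals, idw))
print("NN  LOO-RMSE:", loo_rmse(pts, vals, nearest))
```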

Computational complexity

The computational complexity of multivariate interpolation varies significantly depending on the data structure and method employed. For interpolation on regular grids, local methods such as bilinear interpolation in two dimensions require evaluating four neighboring points, resulting in O(1) time complexity per query after preprocessing. Bicubic interpolation extends this by incorporating 16 points, still achieving O(1) query time but with a higher constant factor due to the additional weighted summations. Preprocessing for these grid-based approaches involves storing the data points, which takes O(N) time and space, where N is the total number of grid points, enabling efficient access for subsequent evaluations. In scattered data scenarios, complexities are generally higher due to the irregular point distribution. Nearest-neighbor interpolation, when augmented with a KD-tree for spatial indexing, permits preprocessing in O(m \log m) time and O(\log m) query time, where m denotes the number of scattered points, making it suitable for large data sets. Radial basis function (RBF) methods, as discussed under scattered data interpolation, involve solving a dense linear system of size m \times m, incurring O(m^3) time for the solve phase followed by O(m) per query evaluation. This cubic bottleneck can be alleviated using fast multipole methods, which approximate the kernel interactions to achieve an overall complexity of O(m \log m) for both solving and evaluation. High-dimensional interpolation exacerbates these costs through the curse of dimensionality. Full tensor-product schemes on regular grids with m points per dimension demand O(m^n) time and space for preprocessing and storage in n dimensions, rendering them impractical beyond low dimensions. Sparse grid alternatives mitigate this by focusing on anisotropic tensor products, achieving O(m (\log m)^{n-1}) costs for storage and evaluation, which scales more favorably for moderate n. To address efficiency trade-offs, optimizations such as precomputation are commonly applied for scenarios with repeated queries, where initial setup costs are amortized over multiple evaluations to enable near-constant time access. Additionally, GPU acceleration has proven effective for interpolation, utilizing hardware texture units for trilinear or cubic schemes and yielding speedups of 10-100x over CPU implementations depending on volume size. For RBF-based deformations in meshes, GPU implementations further enhance solvability for large m by parallelizing matrix assembly and solves.
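The KD-tree acceleration of nearest-neighbor queries mentioned above can be sketched with SciPy's cKDTree, which builds the tree once in O(m log m) time and then answers each query in roughly logarithmic time in low dimensions; the data sizes and test function here are illustrative.

```python
# A minimal sketch of nearest-neighbor interpolation accelerated by a k-d tree;
# build once, then answer many queries cheaply. Data are illustrative.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(4)
pts = rng.random((10_000, 3))            # m scattered sites in 3-D
vals = np.linalg.norm(pts, axis=1)       # sampled test function

tree = cKDTree(pts)                      # one-time O(m log m) preprocessing
queries = rng.random((5, 3))
_, idx = tree.query(queries)             # nearest site index per query point
print(vals[idx])                         # piecewise constant interpolant
```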