
Approximation theory

Approximation theory is a branch of mathematical analysis concerned with the approximation of functions and other objects by simpler forms, such as polynomials or rational functions, while quantifying the associated error in various norms. It seeks to find the "best" approximation from a given set, often in the sense of minimizing the maximum deviation (uniform norm) or the integral of the squared error (L² norm). This field provides foundational tools for numerical analysis, enabling the replacement of complex computations with more tractable ones.

The origins of approximation theory trace back to the 18th century, with early contributions from Leonhard Euler on series expansions and interpolation in 1777. Significant developments occurred in the mid-19th century through the work of Pafnuty Chebyshev, who established principles for best uniform approximation by polynomials between 1854 and 1859, including the equioscillation theorem that characterizes unique best approximations. Karl Weierstrass later proved in 1885 that continuous functions on compact intervals can be uniformly approximated arbitrarily closely by polynomials, a cornerstone result known as the Weierstrass approximation theorem. The 20th century saw expansions by figures like Sergei Bernstein and further generalizations to abstract spaces, solidifying the field's role in functional analysis.

Key concepts include Jackson and Bernstein theorems, which relate the degree of approximation to the smoothness of the function, providing rates of convergence for polynomial approximations. In normed linear spaces, the Haar condition ensures uniqueness of best approximations in certain subspaces, while Stone-Weierstrass theorems extend density results to algebras of continuous functions. Modern extensions encompass multivariate approximations, spline functions for piecewise polynomial fits, and rational approximations, often analyzed in Banach spaces.

Approximation theory finds broad applications in engineering and computational science, including signal processing for Fourier and wavelet approximations, computer-aided design via B-splines, and solving partial differential equations through finite element methods. In optimization and machine learning, it underpins least-squares regression and kernel methods for data fitting. These techniques also support asymptotic analysis and numerical integration, bridging theoretical mathematics with practical algorithms.

Overview

Definition and Scope

Approximation theory is the study of how to approximate complicated functions by simpler ones, such as polynomials, rational functions, or splines, while controlling the error in a precise manner. The goal is to find a simple function g that closely mimics a target function f over a specified domain, minimizing the error \|f - g\| according to a chosen norm, such as the uniform norm or the L^2 norm. This field bridges pure mathematics, numerical analysis, and applied sciences by providing theoretical guarantees on the quality and feasibility of such approximations.

The scope of approximation theory encompasses both continuous settings, where functions are approximated over intervals or compact sets in \mathbb{R}^d, and discrete settings, such as approximating data points with finite-dimensional models. It differs fundamentally from interpolation, which requires exact matching at a finite set of points, whereas approximation focuses on minimizing the error over the entire domain to achieve better overall fidelity. Common error measures include the uniform norm \|f - g\|_\infty = \sup |f(x) - g(x)|, the L^2 norm \|f - g\|_2 = \sqrt{\int |f(x) - g(x)|^2 \, dx}, and other L^p norms, allowing flexibility based on the application's needs, such as smoothness preservation or computational efficiency.

Central objectives include solving best approximation problems (finding the infimum error achievable by a given class of approximants) and analyzing convergence rates under varying function smoothness. For continuous functions on compact sets, the Weierstrass approximation theorem guarantees that polynomials are dense in the space of continuous functions under the uniform norm, but quantitative rates are given by Jackson's theorem: if f is Lipschitz continuous with constant K on [-1, 1], then the best uniform polynomial approximation error E_n(f) of degree n satisfies E_n(f) \leq 6K / n, or more generally O(1/n). The inverse direction is captured by Bernstein's theorem, which links approximation rates back to smoothness: if E_n(f) = O(n^{-\alpha}) for some 0 < \alpha < 1, then f satisfies a Hölder condition of order \alpha (the endpoint case \alpha = 1 yields the slightly larger Zygmund class rather than Lipschitz continuity). These results establish that smoother functions admit faster convergence, guiding the choice of approximants in practice.

A classic example is approximating \sin x on [-\pi/2, \pi/2] using Taylor polynomials centered at 0, such as the third-degree polynomial P_3(x) = x - x^3/6, which provides a good uniform approximation within this bounded interval but exhibits larger errors outside it due to the local nature of the expansion, highlighting the importance of domain restriction for effective polynomial approximations.
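As a quick illustration of the domain-restriction point, the following sketch (a minimal example assuming only NumPy; the helper name `taylor_sin3` is illustrative, not standard) estimates the uniform error of P_3(x) = x - x^3/6 against \sin x on widening intervals:

```python
import numpy as np

def taylor_sin3(x):
    """Degree-3 Taylor polynomial of sin about 0."""
    return x - x**3 / 6.0

for half_width in (np.pi / 2, np.pi, 2 * np.pi):
    x = np.linspace(-half_width, half_width, 10001)
    # discrete proxy for the sup-norm error on [-half_width, half_width]
    err = np.max(np.abs(np.sin(x) - taylor_sin3(x)))
    print(f"half-width {half_width:5.2f}: max |sin x - P_3(x)| = {err:.3e}")
```

On [-\pi/2, \pi/2] the error stays below about 0.08, while on wider intervals it grows quickly, reflecting the local character of the Taylor expansion.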

Historical Development

The foundations of approximation theory were laid in the 19th century, with Karl Weierstrass establishing a cornerstone result in 1885 by proving that any continuous function defined on a compact interval can be uniformly approximated by polynomials to any desired degree of accuracy. This theorem, often regarded as the starting point of modern approximation theory, highlighted the dense nature of polynomials in the space of continuous functions. Earlier influences included Pafnuty Chebyshev's pioneering work in the mid-19th century (1850s–1870s) on minimax approximation, where he developed methods for finding polynomials that minimize the maximum deviation from a target function, laying groundwork for equioscillation principles in uniform approximation.

The early 20th century saw significant advancements, particularly through constructive approaches. In 1912, Sergei Bernstein provided the first explicit constructive proof of Weierstrass's theorem using what are now known as Bernstein polynomials, which approximate continuous functions via probabilistic interpretations and binomial expansions. Concurrently, Dunham Jackson developed key theorems in the 1910s to 1930s quantifying approximation rates, showing that the error in best polynomial approximations decreases at rates depending on the function's smoothness, such as Lipschitz constants or modulus of continuity. These results shifted focus toward quantitative estimates and direct constructions, influencing subsequent theoretical developments.

Prominent figures like Émile Borel, who in 1905 offered a rigorous treatment of Chebyshev's minimax methods using series expansions, alongside Bernstein and Mark Krein, advanced the field through functional analysis and operator theory. Krein's collaborations, notably with Akhiezer in the 1930s, explored extremal problems in approximation within normed spaces, contributing to the Soviet school's emphasis on constructive function theory. Post-World War II, this Soviet tradition flourished, driven by mathematicians like Akhiezer and Krein, fostering rapid progress in approximation amid broader mathematical growth in the USSR during the 1940s and 1950s.

The mid-20th century marked a transition to computational and extended methods, with Evgeny Remez's 1934 algorithm for minimax approximations gaining practical implementations in the 1950s alongside the rise of digital computers, enabling numerical solutions to complex approximation problems. Akhiezer's influential 1956 monograph, Theory of Approximation, synthesized progress in uniform and rational approximations, emphasizing structural properties of functions in linear spaces. The 1960s and 1970s saw expansions into rational functions and splines, with seminal works on rational minimax approximations by researchers like Gonchar and Stahl, and spline theory developing from de Boor's B-spline formulations for flexible curve fitting. This era culminated in the founding of the Journal of Approximation Theory in 1967 by Oved Shisha, providing a dedicated venue for ongoing advancements.

Fundamental Concepts

Types of Function Approximation

Function approximation in approximation theory encompasses several structural classes, each tailored to specific function properties and domains. These classes differ primarily in their form (global versus local, algebraic versus transcendental) and in their ability to capture features like smoothness, periodicity, or singularities. The choice of approximation type depends on the target function's analytic properties, such as continuity, periodicity, or the presence of poles, which influence convergence rates and computational feasibility.

Polynomial approximation uses finite sums of the form p(x) = a_0 + a_1 x + \cdots + a_n x^n, where the degree n is fixed, to approximate continuous functions on compact intervals like [-1, 1]. These approximants are infinitely differentiable and leverage the finite-dimensional vector space of polynomials, facilitating explicit computations via linear algebra. They excel for smooth, analytic functions without singularities, as guaranteed by the Weierstrass approximation theorem, which ensures uniform convergence on closed intervals. However, polynomial approximations lose accuracy near singularities of the target function, limiting global accuracy for functions with poles or branch points.

Rational approximation employs ratios of polynomials, r(x) = \frac{p(x)}{q(x)}, where p and q have degrees m and k, respectively, often constructed as Padé approximants that match the Taylor series of the target function up to order m + k. This form is particularly effective for meromorphic functions with poles, as the denominator can model singularities, providing superior convergence compared to polynomials near such points; an example is approximating \frac{1}{1 + x^2}, which has poles at x = \pm i. Rational functions maintain computational tractability through polynomial operations but can introduce spurious poles if not carefully controlled, and their region of good convergence is governed by the pole locations; for entire functions, which have no poles to model, they offer little advantage over polynomials.

Piecewise approximation constructs approximants from local polynomials over subintervals, joined smoothly via splines (typically cubic polynomials ensuring C^2 continuity) to mimic non-smooth or rapidly varying functions. Splines reduce global error by confining high-degree wiggles to small regions, offering better stability and adaptability for functions with discontinuities in higher derivatives, such as in data interpolation. This local structure enhances numerical efficiency and avoids the Runge phenomenon of global polynomials. Limitations arise from the need to select knot points, which can amplify errors if poorly placed, and the approximant's accuracy degrades near knots without sufficient refinement.

Other notable types include trigonometric polynomials for periodic functions and exponential sums for positive or decaying ones. Trigonometric polynomials, of the form t(x) = a_0 + \sum_{k=1}^n (a_k \cos kx + b_k \sin kx), are ideal for 2\pi-periodic continuous functions on [-\pi, \pi], achieving uniform approximation via finite-dimensional spaces analogous to algebraic polynomials. They converge uniformly for continuous periodic functions but are confined to periodic domains, with slower rates for non-analytic cases. Exponential sums, s(x) = \sum_{k=1}^n c_k e^{-\lambda_k x} with positive coefficients c_k, approximate positive functions on [0, \infty) and are dense in suitable classes of continuous functions under Müntz-type conditions on the exponents \{\lambda_k\}. They are suited to functions such as decay profiles but require careful selection of exponents to avoid ill-conditioning. The table below summarizes these classes.
| Type | Suitability | Key advantages | Limitations |
| --- | --- | --- | --- |
| Polynomial | Smooth, analytic functions on compact intervals | Differentiable, easy algebraic computation | Diverges near singularities |
| Rational | Meromorphic functions with poles | Models singularities effectively | Spurious poles, restricted convergence domains |
| Piecewise (splines) | Non-smooth, locally varying functions | Local error control, stability | Knot selection sensitivity |
| Trigonometric | Periodic continuous functions | Uniform convergence on the circle | Restricted to periodic domains |
| Exponential sums | Positive, decaying functions on [0, \infty) | Dense for positive functions, flexible decay | Exponent conditioning |
Each type's domain of convergence is inherently limited: polynomials and trigonometric polynomials succeed on compact or periodic sets but fail globally near singularities, while rational and exponential forms extend to unbounded domains at the cost of potential instability.

Error Measures and Norms

In approximation theory, error measures quantify the deviation between a target function f and its approximating function p, providing a metric for assessing the quality of the approximation. These measures are typically derived from norms on function spaces, enabling the formulation of optimization problems to minimize the error over a specified domain. Common norms include the uniform norm for worst-case scenarios and L_p norms for integral-based averages, with discrete variants for finite data sets.

The uniform norm, also known as the supremum or \infty-norm, is defined as \|f - p\|_\infty = \sup_{x \in D} |f(x) - p(x)|, where D is a compact domain such as [a, b]. This norm captures the maximum absolute error over the domain, making it suitable for applications requiring control of the worst-case deviation, such as in Chebyshev approximation where equioscillation ensures minimax optimality. More generally, L_p norms for 1 \leq p < \infty are given by \|e\|_p = \left( \int_D |e(x)|^p \, dx \right)^{1/p}, where e(x) = f(x) - p(x) is the error function. The L_1 norm emphasizes the total absolute deviation, akin to a median-based measure that is robust to outliers, while the L_2 norm (with p = 2) corresponds to the root-mean-square error and facilitates orthogonal projections in least-squares settings. The case p = \infty recovers the uniform norm, and all of these norms satisfy the triangle inequality \|e_1 + e_2\|_p \leq \|e_1\|_p + \|e_2\|_p.

For discrete data consisting of function values at a finite set of points \{x_i\}_{i=1}^m \subset D, discrete norms adapt the continuous forms: the \ell_p norm is \|e\|_{\ell_p} = \left( \sum_{i=1}^m |e(x_i)|^p \right)^{1/p} for 1 \leq p < \infty, and \|e\|_{\ell_\infty} = \max_i |e(x_i)| for p = \infty. These are essential in numerical fitting problems, such as interpolating scattered data, where the \ell_2 norm aligns with least-squares regression, the \ell_1 norm handles noisy measurements via linear programming, and the \ell_\infty norm minimizes peak errors.

Key properties of these norms underpin convergence results in approximation theory. All norms on the finite-dimensional space of polynomials of fixed degree are equivalent, meaning there exist constants c_1, c_2 > 0 such that c_1 \| \cdot \|_a \leq \| \cdot \|_b \leq c_2 \| \cdot \|_a for norms \| \cdot \|_a and \| \cdot \|_b; this ensures that convergence in one norm implies convergence in others for such subspaces. In broader contexts, density in the uniform norm on compact sets, as established by the Stone-Weierstrass theorem for subalgebras separating points, implies density in L_p as well, since on a domain of finite measure \|g\|_p \leq |D|^{1/p} \|g\|_\infty; uniform approximation guarantees for continuous functions by polynomials therefore carry over to the integral norms.

A representative example is the approximation of f(x) = e^x on [0,1] by its Taylor polynomial p_n(x) = \sum_{k=0}^n \frac{x^k}{k!} of degree n centered at 0. The uniform error satisfies \|e^x - p_n\|_\infty \leq \frac{e}{(n+1)!}, derived from the Lagrange remainder form of Taylor's theorem, where the (n+1)-th derivative is bounded by e on [0,1]; this bound decreases faster than any geometric rate as n grows, illustrating the rapid convergence attainable for entire functions.
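The remainder bound and the different error measures can be checked numerically; the following minimal sketch (assuming only NumPy) evaluates the discrete sup-norm and an approximate L_2 error of the Taylor polynomials of e^x on [0, 1] against the bound e/(n+1)!:

```python
import numpy as np
from math import e, factorial

x = np.linspace(0.0, 1.0, 10001)
dx = x[1] - x[0]
f = np.exp(x)

for n in (2, 4, 6, 8):
    # Taylor partial sum p_n(x) = sum_{k=0}^n x^k / k!
    p = sum(x**k / factorial(k) for k in range(n + 1))
    err = f - p
    sup_err = np.max(np.abs(err))                  # discrete proxy for the uniform norm
    l2_err = np.sqrt(np.sum(err**2) * dx)          # Riemann-sum proxy for the L2 norm
    bound = e / factorial(n + 1)                    # Lagrange remainder bound
    print(f"n={n}: sup error {sup_err:.2e} <= bound {bound:.2e}, L2 error {l2_err:.2e}")
```

The computed sup-norm errors sit below the bound for every n, and both error measures shrink rapidly as the degree increases.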

Uniform Approximation

Best Uniform Approximation

In approximation theory, the best uniform approximation to a continuous function f on a compact interval [a, b] from a finite-dimensional subspace S of C[a, b] (the space of continuous functions equipped with the uniform norm \|g\|_\infty = \sup_{x \in [a, b]} |g(x)|) is the element s^* \in S that minimizes \|f - s\|_\infty. This minimax problem is central to uniform approximation, often with S taken as the space of polynomials of degree at most n, denoted \Pi_n. Existence of such an s^* is guaranteed by the finite dimensionality of S, since the minimization can be restricted to a closed, bounded (hence compact) subset of S on which the continuous function s \mapsto \|f - s\|_\infty attains its infimum.

Uniqueness of the best uniform approximant holds under the Haar condition on S: every nonzero element of S has at most \dim(S) - 1 zeros in [a, b]. This condition ensures that the metric projection onto S in the uniform norm is single-valued for every f \in C[a, b]. The space \Pi_n satisfies the Haar condition on any interval, since a nonzero polynomial of degree at most n has at most n roots. Without the Haar condition, non-uniqueness can occur, but the condition is satisfied by many standard approximating families, such as polynomials or rational functions with fixed poles outside the interval.

A classic example illustrates the problem for f(x) = |x| on [-1, 1] using linear polynomials (S = \Pi_1 = \{ax + b \mid a, b \in \mathbb{R}\}, \dim(S) = 2). Due to the even symmetry of f, the optimal approximant is the constant polynomial p^*(x) = \frac{1}{2}, yielding \|f - p^*\|_\infty = \frac{1}{2}. The error function e(x) = |x| - \frac{1}{2} attains its maximum magnitude of \frac{1}{2} at x = -1, 0, 1 with alternating signs (+\frac{1}{2}, -\frac{1}{2}, +\frac{1}{2}), consistent with the equioscillation characterization of the unique best approximant. By contrast, a piecewise linear approximant, which lies outside \Pi_1, can reproduce f exactly with zero error, underscoring how the kink at x = 0 limits what low-degree polynomials can achieve.

Theoretical results quantify convergence rates for smooth functions. For f \in C^k[a, b] (functions with k continuous derivatives), Jackson's theorem establishes that the best uniform approximation error satisfies E_n(f) := \inf_{p \in \Pi_n} \|f - p\|_\infty = O(1/n^k), where the constant depends on \|f^{(k)}\|_\infty and the interval length. More precisely, E_n(f) \leq C_k \, \|f^{(k)}\|_\infty / n^k for some C_k > 0, with sharper bounds involving the modulus of continuity of f^{(k)}. This rate highlights the trade-off: higher smoothness yields faster convergence, but the uniform norm penalizes worst-case deviations more severely than L^2 norms. For less regular classes, such as Lipschitz continuous functions (no derivatives, Hölder exponent \alpha = 1), the rate is O(1/n).
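The |x| example is easy to verify numerically; the brute-force sketch below (assuming only NumPy) scans constant approximants, confirms that c = 1/2 minimizes the uniform error, and shows the alternating error values at x = -1, 0, 1:

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 20001)
f = np.abs(x)

# scan constant approximants c in [0, 1] and record each uniform error
cs = np.linspace(0.0, 1.0, 2001)
sup_errors = [np.max(np.abs(f - c)) for c in cs]
best = cs[int(np.argmin(sup_errors))]
print("best constant:", best, " minimax error:", min(sup_errors))   # 0.5 and 0.5

err = f - 0.5
print("error at x = -1, 0, 1:", err[0], err[len(x) // 2], err[-1])   # +0.5, -0.5, +0.5
```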

Chebyshev Approximation and Equioscillation

Chebyshev polynomials of the first kind, denoted T_n(x), are defined for x \in [-1, 1] and integer n \geq 0 by the formula T_n(x) = \cos(n \arccos x). These polynomials satisfy the recurrence relation T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x) with initial conditions T_0(x) = 1 and T_1(x) = x, and they form an orthogonal basis on [-1, 1] with respect to the weight 1/\sqrt{1 - x^2}. The leading coefficient of T_n(x) is 2^{n-1} for n \geq 1, so the monic version \tilde{T}_n(x) = T_n(x) / 2^{n-1} has uniform norm \|\tilde{T}_n\|_\infty = 1/2^{n-1} on [-1, 1]. This property establishes \tilde{T}_n as the monic polynomial of degree n with minimal maximum deviation from zero among all such polynomials on the interval, a result known as the minimax property.

Chebyshev's equioscillation theorem (1854) characterizes best uniform approximations in the space of polynomials of degree at most n. It states that a polynomial p^* of degree at most n is the unique best uniform approximation to a continuous function f on a compact interval if and only if the error f - p^* equioscillates, attaining its maximum absolute value with alternating signs, at no fewer than n+2 points in the interval. More precisely, there exist points a \leq t_1 < t_2 < \cdots < t_{n+2} \leq b such that f(t_i) - p^*(t_i) = \sigma (-1)^i \|f - p^*\|_\infty for each i and some fixed sign \sigma = \pm 1. A related result by de la Vallée Poussin (1919) provides a lower bound: if a polynomial p of degree at most n has an error f - p that alternates in sign at n+2 points t_1 < \cdots < t_{n+2}, then the minimal approximation error satisfies E_n(f) \geq \min_i |f(t_i) - p(t_i)|. This equioscillation condition ensures uniqueness and optimality in the uniform norm, distinguishing Chebyshev approximations from other methods.

In practice, expanding a function in the Chebyshev basis offers an "economy of approximation" by leveraging the small uniform norm of the monic Chebyshev polynomials. When a power series truncated at degree n is economized, that is, re-expressed in the Chebyshev basis and its leading term dropped, the error introduced is \max_{-1 \leq x \leq 1} |P_n(x) - P_{n-1}(x)| = |a_n| / 2^{n-1}, where a_n is the coefficient of the highest power; this is a factor of 2^{1-n} smaller than the size |a_n| of the discarded monomial term, thanks to the minimax property of the monic Chebyshev polynomial. Repeating the reduction lowers the degree while preserving accuracy, reducing computational cost and round-off error in numerical implementations.

A representative example is the approximation of e^x on [-1, 1] by a truncated Chebyshev series. The degree-n approximation p_n(x) = \sum_{k=0}^n c_k T_k(x), with coefficients c_0 = \frac{1}{\pi} \int_{-1}^1 \frac{e^x T_0(x)}{\sqrt{1 - x^2}} \, dx and c_k = \frac{2}{\pi} \int_{-1}^1 \frac{e^x T_k(x)}{\sqrt{1 - x^2}} \, dx for k \geq 1, achieves near-minimax error. The error of the true best uniform approximation of degree n equioscillates at exactly n+2 points in [-1, 1], alternating between positive and negative maxima as the theorem requires, and the truncated Chebyshev series comes very close to this behavior; for n = 3, these points occur near the Chebyshev extrema, with the error curve oscillating and reaching \pm \|e^x - p_3\|_\infty \approx \pm 0.0055 at five distinct locations. Visualizing this, the error plot shows nearly equal ripples, confirming near-optimality and illustrating how the approximation hugs the function with balanced deviations.

Extensions of these ideas apply to more general settings through Chebyshev systems, also known as generalized Haar spaces. A Chebyshev system of dimension n+1 on an interval spans a linear subspace of continuous functions in which any nonzero element has at most n zeros, ensuring that the equioscillation theorem holds for best uniform approximations from that space. Examples include the exponential functions \{ e^{\lambda_k x} \} for distinct \lambda_k, as well as the algebraic polynomials, and the theory guarantees uniqueness and characterization by alternation at n+2 points. In higher dimensions, multidimensional Chebyshev systems generalize this by requiring unique interpolation and approximation properties on subdomains, though constructing them is challenging due to the lack of simple bases like the powers.
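The e^x example can be reproduced with a short computation. The sketch below (assuming only NumPy; the quadrature helper `cheb_coeffs` is illustrative) builds the truncated Chebyshev series via Gauss-Chebyshev quadrature for the coefficient integrals and reports a near-minimax error for n = 3:

```python
import numpy as np

def cheb_coeffs(f, n, nquad=200):
    """First n+1 Chebyshev coefficients of f, using Gauss-Chebyshev quadrature
    to approximate c_k = (2/pi) * integral of f(x) T_k(x) / sqrt(1 - x^2)."""
    theta = (np.arange(nquad) + 0.5) * np.pi / nquad
    fx = f(np.cos(theta))
    c = np.array([2.0 / nquad * np.sum(fx * np.cos(k * theta)) for k in range(n + 1)])
    c[0] /= 2.0                         # k = 0 term uses the 1/pi normalization
    return c

n = 3
c = cheb_coeffs(np.exp, n)
x = np.linspace(-1.0, 1.0, 5001)
p = np.polynomial.chebyshev.chebval(x, c)   # evaluate sum_k c_k T_k(x)
err = np.exp(x) - p
print("max |error|:", np.max(np.abs(err)))  # about 0.006, close to the minimax value ~0.0055
```

Plotting `err` over [-1, 1] shows the nearly equal ripples described above, with about n+2 = 5 extreme points of alternating sign.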

Least Squares Approximation

Continuous Least Squares

In continuous least squares approximation, the goal is to find a function s in a given subspace S of square-integrable functions that minimizes the L^2 error \int_a^b [f(x) - s(x)]^2 \, dx for a target function f \in L^2[a,b]. This minimization problem has a unique solution, which is the orthogonal projection of f onto S, ensuring that the error f - s is orthogonal to every element of S. The setting is the Hilbert space L^2[a,b] equipped with the inner product \langle f, g \rangle = \int_a^b f(x) g(x) \, dx and the norm \|f\| = \sqrt{\langle f, f \rangle}. The best approximation s satisfies the orthogonality condition \langle f - s, v \rangle = 0 for all v \in S, a direct consequence of the Hilbert projection theorem, which guarantees the existence and uniqueness of such a projection onto any closed subspace. This framework extends to infinite-dimensional subspaces spanned by orthogonal bases, where the coefficients of s are given by inner products with the basis functions.

A classic example is the Fourier series approximation on [-\pi, \pi], where S_n is the subspace of trigonometric polynomials of degree at most n, spanned by \{1, \cos(kx), \sin(kx) \mid k=1,\dots,n\}. The partial sum s_n(x) = \frac{a_0}{2} + \sum_{k=1}^n (a_k \cos(kx) + b_k \sin(kx)), with coefficients a_k = \frac{1}{\pi} \int_{-\pi}^\pi f(x) \cos(kx) \, dx and b_k = \frac{1}{\pi} \int_{-\pi}^\pi f(x) \sin(kx) \, dx, is the orthogonal projection of f onto S_n. For f \in L^2[-\pi, \pi], the Fourier series converges to f in the L^2 norm as n \to \infty, meaning \|f - s_n\|_{L^2} \to 0. For an orthonormal basis \{\phi_k\} of the subspace, Parseval's identity relates the energy of f to its Fourier coefficients: \|f\|^2 = \sum_k | \langle f, \phi_k \rangle |^2, reflecting conservation of energy in the expansion and confirming that the projection captures the full L^2 content when the basis is complete.

However, continuous least squares minimizes the average squared error and does not bound the maximum deviation, so the approximant can overshoot near discontinuities of the target function. In Fourier series this manifests as the Gibbs phenomenon, in which the partial sums exhibit oscillations of about 9% of the jump height adjacent to jump discontinuities, persisting regardless of n.
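The Gibbs phenomenon is easy to observe numerically. The minimal sketch below (assuming only NumPy) forms the L^2-optimal Fourier partial sums of a square wave and measures the overshoot relative to the jump, which settles near 9%:

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 200001)
f = np.sign(x)                                   # square wave with a jump of height 2 at x = 0

for n in (10, 50, 200):
    # Fourier series of sign(x) contains only sine terms: b_k = 4/(pi*k) for odd k
    s = sum(4.0 / (np.pi * k) * np.sin(k * x) for k in range(1, n + 1, 2))
    overshoot = (np.max(s) - 1.0) / 2.0           # overshoot as a fraction of the jump height
    print(f"n={n}: max partial sum {np.max(s):.4f}, overshoot {overshoot:.1%} of the jump")
```

As n grows, the peak moves closer to the discontinuity but its relative height does not decrease, consistent with the roughly 9% figure quoted above.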

Discrete Least Squares and Orthogonal Polynomials

In discrete least squares approximation, the goal is to find a polynomial p(x) = \sum_{k=0}^n c_k x^k of degree at most n that minimizes the weighted sum of squared errors at m distinct data points x_i with corresponding values f(x_i) and weights w_i > 0, given by \sum_{i=1}^m w_i \left( f(x_i) - p(x_i) \right)^2. This problem arises when approximating a function from finite samples, such as in data fitting or interpolation from measurements. The solution involves solving the normal equations A^T W A \mathbf{c} = A^T W \mathbf{f}, where A is the m \times (n+1) Vandermonde-like matrix with entries A_{ik} = x_i^k, W is the diagonal matrix of weights w_i, \mathbf{c} is the coefficient vector, and \mathbf{f} contains the data values f(x_i). Direct solution in the monomial basis is often numerically unstable because the Vandermonde matrix is ill-conditioned, with condition numbers growing exponentially in n.

Orthogonal polynomials address this instability by providing a basis in which the approximation decouples into independent components. For the interval [-1, 1] with weight function w(x) = 1, the Legendre polynomials P_n(x) form such an orthogonal family, satisfying \int_{-1}^1 P_j(x) P_k(x) \, dx = \frac{2}{2k+1} \delta_{jk}. In the discrete setting, orthogonality is taken with respect to the discrete inner product \langle p, q \rangle = \sum_{i=1}^m w_i p(x_i) q(x_i), allowing the least squares coefficients to be computed as c_k = \frac{\langle f, \phi_k \rangle}{\langle \phi_k, \phi_k \rangle}, where \{\phi_k\} are the orthogonal polynomials. The Legendre polynomials satisfy the three-term recurrence relation x P_n(x) = \frac{n+1}{2n+1} P_{n+1}(x) + \frac{n}{2n+1} P_{n-1}(x), which enables stable recursive computation and avoids the accumulation of rounding errors inherent in higher-order relations.

The orthogonal basis can be constructed from the monomial basis \{1, x, x^2, \dots\} by applying the Gram-Schmidt process with the discrete inner product. Starting with \phi_0(x) = 1, subsequent polynomials are \phi_k(x) = x^k - \sum_{j=0}^{k-1} \frac{\langle x^k, \phi_j \rangle}{\langle \phi_j, \phi_j \rangle} \phi_j(x), normalized as needed. This construction is equivalent to a QR factorization of the weighted Vandermonde matrix, with the columns of the Q factor containing the orthogonal polynomials evaluated at the data points; the least squares coefficients then follow componentwise from c_k = \langle f, \phi_k \rangle / \langle \phi_k, \phi_k \rangle without solving a coupled linear system. For numerical implementation, the three-term recurrence is preferred over direct Gram-Schmidt to maintain stability, especially for high degrees.

A representative example is fitting discrete data on [-1,1] with a Legendre series: for data points (x_i, f(x_i)), a least squares fit in the monomial basis may produce coefficients with relative errors exceeding 10^6 for n = 20 because of conditioning, whereas using Legendre polynomials generated by the recurrence reduces errors to near machine precision, since the orthogonal projection isolates each mode without interference. This approach extends to other families such as the Chebyshev polynomials, which are orthogonal with respect to w(x) = (1-x^2)^{-1/2} and relate to the continuous L^2 projection discussed elsewhere. Orthogonal polynomials also underpin applications such as Gaussian quadrature, where the nodes are the roots of P_n(x) and the weights derive from the Christoffel-Darboux formula, enabling exact integration of polynomials up to degree 2n-1.
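The conditioning contrast between the monomial and orthogonal bases can be seen directly. The sketch below (assuming only NumPy; the noise level and degree are illustrative) compares the condition numbers of the Vandermonde and Legendre design matrices for a degree-20 fit and then solves the least squares problem in the Legendre basis:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 20
x = np.linspace(-1.0, 1.0, m)
y = np.exp(x) + 1e-8 * rng.standard_normal(m)     # smooth data with a little noise

V = np.vander(x, n + 1, increasing=True)          # monomial (Vandermonde) design matrix
L = np.polynomial.legendre.legvander(x, n)        # Legendre design matrix
print("cond(V) = %.1e, cond(L) = %.1e" % (np.linalg.cond(V), np.linalg.cond(L)))

c_leg, *_ = np.linalg.lstsq(L, y, rcond=None)     # stable least squares in the Legendre basis
fit = np.polynomial.legendre.legval(x, c_leg)
print("max residual of Legendre fit = %.1e" % np.max(np.abs(fit - y)))
```

The Vandermonde condition number is many orders of magnitude larger, which is the source of the coefficient errors mentioned above, while the Legendre fit recovers the data to roughly the noise level.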

Constructive Algorithms

Remez Algorithm

The Remez algorithm, introduced by E. Ya. Remez in 1934, is an iterative procedure for computing the best uniform (minimax) polynomial approximation of degree at most n to a continuous function f on a compact interval [a, b]. It leverages the equioscillation property: starting from an initial set of n+2 reference points, it alternately solves for polynomial coefficients that achieve alternating errors at those points and then updates the points to the local extrema of the current error function. This process constructs a sequence of polynomials whose maximum errors converge to the minimax error, providing a practical method to realize the theoretical guarantees of unique best approximation.

The algorithm proceeds in two main substeps per iteration. First, given ordered reference points x_1 < x_2 < \cdots < x_{n+2} in [a, b], solve the linear system for the polynomial coefficients c_0, \dots, c_n and the levelled error h: \sum_{j=0}^n c_j x_i^j = f(x_i) + (-1)^i h, \quad i = 1, \dots, n+2, yielding a trial polynomial p(x) = \sum_{j=0}^n c_j x^j such that the error f(x_i) - p(x_i) alternates in sign and equals \pm h at the reference points. This system is solved via standard linear algebra techniques, such as Gaussian elimination. Second, compute the error function e(x) = f(x) - p(x) and identify n+2 of its local extrema at which the error alternates in sign; these become the new reference points for the next iteration. The process repeats until the change in h or in the reference points falls below a tolerance threshold, at which point p(x) approximates the minimax polynomial.

For continuous f on a compact interval, the algorithm converges to the unique best uniform approximation: the levelled error |h_k| at iteration k increases monotonically toward the minimax error E_n(f) from below, while the maximum error \|f - p_k\|_\infty of the trial polynomials approaches it from above. Under additional smoothness assumptions, such as f being twice continuously differentiable, the convergence is quadratic in the sense that the error reduces quadratically every few iterations. Numerical stability requires careful handling of the linear solves, particularly for high degrees n, where conditioning can degrade.

A classic application illustrates the algorithm on a function prone to the Runge phenomenon. For the degree-2 minimax approximation to f(x) = 1/(1 + 25x^2) on [-1, 1], starting from equispaced initial points and iterating yields the polynomial p_2(x) \approx 0.677 - 0.962 x^2 with maximum error approximately 0.323, and the error equioscillates at five points (x = 0, \pm 0.405, \pm 1); the even symmetry of f produces one more alternation point than the minimum n+2 = 4 required by the characterization. Unlike interpolation at equispaced points, which diverges near the endpoints of [-1, 1] for this function as the degree grows, the minimax approximations of increasing degree converge uniformly, demonstrating how the Remez algorithm produces stable approximations even for functions prone to oscillation.

Remez proposed two variants of the algorithm; the second, which replaces the entire reference set at once by the extrema of the current error so that equal ripple is enforced simultaneously, achieves faster convergence, often quadratic per iteration for smooth f, compared to the first variant's slower single-point exchanges. This second algorithm is preferred in modern implementations for its efficiency in locating the full equioscillation set directly.
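A compact illustration of the multi-point exchange idea is given below (assuming only NumPy). It is a simplified sketch rather than a production implementation: extrema are located by a dense grid search, and the reference update uses a simple selection heuristic that suffices for this example, the degree-3 minimax approximation of e^x on [-1, 1], whose error level is about 0.0055:

```python
import numpy as np

def remez(f, n, a=-1.0, b=1.0, iters=10, grid=20001):
    """Simplified multi-point-exchange Remez iteration for a degree-n minimax fit."""
    xg = np.linspace(a, b, grid)
    # initial reference set: n+2 Chebyshev extremum points mapped to [a, b]
    ref = np.sort(0.5 * (a + b) + 0.5 * (b - a) * np.cos(np.pi * np.arange(n + 2) / (n + 1)))
    c, err = np.zeros(n + 1), f(xg)
    for _ in range(iters):
        # Solve p(x_i) + (-1)^i h = f(x_i) for coefficients c and the level h.
        A = np.hstack([np.vander(ref, n + 1, increasing=True),
                       ((-1.0) ** np.arange(n + 2))[:, None]])
        sol = np.linalg.solve(A, f(ref))
        c = sol[:-1]
        err = f(xg) - np.polynomial.polynomial.polyval(xg, c)
        # New references: endpoints plus interior stationary points of the error,
        # keeping the n+2 of largest magnitude (adequate for this smooth example).
        interior = np.where(np.diff(np.sign(np.diff(err))) != 0)[0] + 1
        cand = np.unique(np.concatenate(([0], interior, [grid - 1])))
        cand = cand[np.argsort(-np.abs(err[cand]))][:n + 2]
        ref = np.sort(xg[cand])
    return c, float(np.max(np.abs(err)))

c, E = remez(np.exp, 3)
print("coefficients:", np.round(c, 5))
print("estimated minimax error:", round(E, 5))   # about 0.00553 for e^x, degree 3
```

A robust implementation would locate extrema by local root finding on e'(x) and enforce sign alternation explicitly when selecting the new reference set.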

Other Iterative Methods

Exchange algorithms extend the principles of point exchange seen in uniform-norm methods to other norms, such as the L_1 and general L_p norms, by iteratively selecting and swapping evaluation points to minimize the approximation error. In these approaches, an initial set of points is chosen, the best approximation is computed over that set under the target norm, and then points are exchanged based on error levels until convergence to the global optimum. For L_p norms with 1 \leq p < \infty, the process often involves solving weighted least squares problems iteratively, with the weights updated to emphasize larger residuals, effectively steering the solution toward the desired norm minimization. A prominent example is the Lawson algorithm, which starts with a weighted least squares approximation and iteratively adjusts the weights in proportion to the absolute errors so as to converge to the best uniform (L_\infty) approximation; this method achieves linear convergence and is particularly effective for discrete data sets.

The best uniform approximation problem can also be reformulated as a linear program (LP), especially over discrete point sets, allowing solution via standard optimization techniques like the simplex method. Specifically, for approximating a function f by a polynomial p of degree at most n at m points x_i (e.g., Chebyshev nodes for better conditioning), the problem is to minimize \delta subject to -\delta \leq f(x_i) - \sum_{k=0}^n a_k x_i^k \leq \delta for all i, with variables a_k (the coefficients) and \delta \geq 0; a sketch of this formulation is given below. This LP has 2m constraints and n+2 variables, and its dual provides insights into the equioscillation property; in practice it is solved efficiently for moderate n and m, and the simplex method's worst-case exponential time is mitigated by its polynomial average-case performance.

For rational function approximation, the differential correction algorithm provides an iterative perturbation method for refining an initial approximation toward the minimax solution. Starting from a preliminary rational function r(x) = p(x)/q(x), the algorithm introduces small corrections \delta p and \delta q to the numerator and denominator polynomials, solving a sequence of linearized problems to minimize the uniform error \|f - (r + \delta r)\|_\infty, with convergence typically quadratic near the optimum under suitable conditions such as non-degeneracy. This approach is valued for its stability in handling poles and has been extended to generalized rational families.

An illustrative application of LP in L_1 approximation is median regression, where the goal is to find coefficients minimizing the sum of absolute residuals \sum_i |y_i - \sum_k a_k x_{i,k}|, equivalent to L_1 polynomial fitting when a continuous function is discretized at sample points. The formulation introduces auxiliary variables for positive and negative deviations, yielding a standard LP solvable by the simplex method, and the solution corresponds to a conditional median, robust to outliers unlike L_2 methods. Regarding computational complexity, exchange-based methods like the Lawson algorithm require O(n^2) operations per iteration for the weighted least squares solve (using structured matrices for polynomials), with the number of iterations often linear in n. In contrast, the LP approach for uniform or L_1 norms depends on simplex efficiency, which is empirically polynomial (O(m n^2) or better with interior-point methods) but lacks strong worst-case guarantees, making it suitable for n up to a few hundred.
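The LP formulation of the discrete minimax fit referenced above can be written out directly. The sketch below (assuming NumPy and SciPy's `linprog`; the degree and sample points are illustrative) minimizes \delta subject to |f(x_i) - p(x_i)| \leq \delta for a cubic fit of e^x on Chebyshev-type sample points:

```python
import numpy as np
from scipy.optimize import linprog

n, m = 3, 60
x = np.cos(np.pi * (np.arange(m) + 0.5) / m)      # Chebyshev-type sample points in [-1, 1]
fvals = np.exp(x)
V = np.vander(x, n + 1, increasing=True)

# variables: [a_0, ..., a_n, delta]; objective: minimize delta
c = np.zeros(n + 2)
c[-1] = 1.0
A_ub = np.vstack([np.hstack([V, -np.ones((m, 1))]),     #  p(x_i) - delta <= f(x_i)
                  np.hstack([-V, -np.ones((m, 1))])])   # -p(x_i) - delta <= -f(x_i)
b_ub = np.concatenate([fvals, -fvals])
bounds = [(None, None)] * (n + 1) + [(0, None)]         # coefficients free, delta >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
a, delta = res.x[:-1], res.x[-1]
print("coefficients:", np.round(a, 4), " minimax error on the grid:", round(delta, 5))
```

The optimal \delta agrees closely with the minimax error produced by the Remez algorithm on the same problem, since the grid is fine enough to resolve the equioscillation points.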

Advanced Topics

Rational Function Approximation

Rational functions provide a powerful tool for approximating functions that exhibit singularities or poles, where polynomial approximations often fail or converge slowly. A rational approximant takes the form r(x) = \frac{p(x)}{q(x)}, where p(x) is a polynomial of degree at most m and q(x) is a polynomial of degree at most n, typically normalized so that q(0) = 1 to ensure uniqueness. This structure allows rational approximants to capture asymptotic behavior near poles more effectively than polynomials; for instance, functions like \tan x, which has poles at odd multiples of \pi/2, can be approximated with greater accuracy over intervals avoiding these singularities using rationals that explicitly model the pole locations.

Padé approximants are a specific class of rational approximants, denoted [m/n], which match the Taylor series expansion of the target function up to order m + n at a point, usually x = 0. They are constructed by solving a system of linear equations derived from the power series coefficients, ensuring the rational function agrees with the first m + n + 1 terms of the expansion. For the exponential function e^x, the Padé table illustrates rapid convergence: the [0/1] approximant is \frac{1}{1 - x}, the [1/1] approximant is \frac{2 + x}{2 - x}, and higher-order entries like [2/2] = \frac{12 + 6x + x^2}{12 - 6x + x^2} provide increasingly accurate representations, often outperforming Taylor polynomials away from the origin. Convergence of the Padé approximants of e^x is geometric on compact regions, with the diagonal entries [n/n] converging especially rapidly.

The problem of best uniform rational approximation, minimizing the maximum error over a compact interval, is inherently nonlinear due to the ratio structure, in contrast with the linear nature of polynomial approximation. Best rational approximants are characterized by the error f(x) - r(x) attaining its maximum deviation with alternating signs at least at m + n + 2 points, assuming no defect in the approximation space. This characterization, established through extensions of the classical alternation theorems, enables algorithms like the Remez method adapted for rationals, though convergence can be slower because of the nonlinearity.

A notable example is the approximation of \log(1 + x), whose Taylor series converges only for |x| < 1 because of the branch point at x = -1. The [2/1] Padé approximant \frac{6x + x^2}{6 + 4x} matches the series up to order 3 yet remains accurate well beyond the radius of convergence; on [0,1] its maximum error is around 0.007, far smaller than the error of the degree-3 Taylor polynomial near x = 1 (about 0.14), and it continues to give sensible values for x > 1 where the Taylor series diverges. This demonstrates the enlarged region of validity typical of Padé approximants for functions with nearby singularities.

Key challenges in rational approximation include the stability of pole locations: small perturbations in the coefficients can move the poles dramatically, potentially producing ill-conditioned approximants or loss of accuracy. Multipoint Padé approximants address global approximation needs by matching expansions at several points rather than a single one, improving uniformity over wider domains at the price of more complicated nonlinear pole-zero interactions. These issues underscore the need for robust numerical methods to ensure reliable pole placement and convergence.
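The \log(1+x) example is reproducible in a few lines. The sketch below (assuming NumPy and SciPy, whose `scipy.interpolate.pade` builds a Padé approximant from Taylor coefficients) compares the [2/1] approximant with the degree-3 Taylor polynomial on [0, 1]:

```python
import numpy as np
from scipy.interpolate import pade

an = [0.0, 1.0, -0.5, 1.0 / 3.0]          # Taylor coefficients of log(1+x) through x^3
p, q = pade(an, 1)                         # denominator degree 1 -> numerator degree 2
print("numerator coefficients:", p.coeffs, " denominator coefficients:", q.coeffs)

x = np.linspace(0.0, 1.0, 2001)
f = np.log1p(x)
err_pade = np.max(np.abs(f - p(x) / q(x)))
err_taylor = np.max(np.abs(f - np.polyval(an[::-1], x)))   # polyval expects highest power first
print(f"max error on [0,1]: Pade [2/1] = {err_pade:.4f}, Taylor degree 3 = {err_taylor:.4f}")
```

The Padé error of roughly 0.007 versus the Taylor error of roughly 0.14 matches the figures quoted above.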

Spline and Piecewise Approximation

Spline functions are piecewise polynomials of degree k defined on a partition of an interval into subintervals separated by knots, with C^{k-1} continuity enforced at the knots to maintain smoothness. This piecewise structure allows splines to provide local approximations, adapting flexibly to the function's behavior in different regions without the global oscillations often seen in high-degree polynomial approximations. B-splines serve as a local basis for representing splines, offering compact support and numerical stability in computations, as introduced in the foundational work on spline theory.

Among spline types, cubic splines with k = 3 are particularly prominent because they balance smoothness and computational efficiency, providing C^2 continuity suitable for modeling smooth curves. Interpolating splines pass exactly through given data points, while approximating splines, such as least squares splines, minimize a global error metric like the L^2 norm to fit noisy or overdetermined data, offering robustness in practical applications. The approximation power of splines is optimal for sufficiently smooth functions: for f \in C^{k+1} and a spline s of degree k on a uniform mesh with spacing h, the error satisfies \|f - s\|_\infty = O(h^{k+1}), the best possible rate for piecewise polynomial methods of that degree. This convergence rate holds under mild regularity assumptions on f, making splines well suited to capturing local features compared with global polynomials.

A representative example is the natural cubic spline approximation of noisy data on [0,1] with uniform knots. For data points (x_i, y_i) with x_i = i/n for i = 0, \dots, n, the natural cubic spline imposes zero second derivatives at the endpoints, yielding a smooth curve that filters noise while preserving underlying trends; for instance, with n = 10 and Gaussian noise added to samples of \sin(2\pi x), the spline captures the local variation of the signal without the oscillation and overfitting a single high-degree polynomial through the same points would exhibit.

Construction of cubic splines typically reduces to a tridiagonal linear system for the second derivatives (moments) m_i at the knots, arising from continuity of the first and second derivatives across subintervals. For knots t_0 < t_1 < \dots < t_n, the system is h_{i-1} m_{i-1} + 2(h_{i-1} + h_i) m_i + h_i m_{i+1} = 6 \left( \frac{f_{i+1} - f_i}{h_i} - \frac{f_i - f_{i-1}}{h_{i-1}} \right), \quad i = 1, \dots, n-1, with h_i = t_{i+1} - t_i and boundary conditions m_0 = m_n = 0 in the natural case; the banded structure permits an efficient O(n) solution via the Thomas algorithm. Once the coefficients are obtained, evaluation of a spline in B-spline form at any point uses the de Boor algorithm, a stable recursive procedure that computes the value by iteratively refining control points within the relevant knot span; it generalizes the de Casteljau method for Bézier curves to B-splines and avoids explicit computation of the basis functions.
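Construction and accuracy of natural cubic splines can be illustrated with SciPy's `CubicSpline`, which assembles and solves the banded system internally. The sketch below (assuming NumPy and SciPy) interpolates \sin(2\pi x) on uniform knots and shows the error shrinking roughly sixteenfold per mesh halving, consistent with O(h^4); the natural end conditions cost nothing here because the second derivative of the target happens to vanish at both endpoints:

```python
import numpy as np
from scipy.interpolate import CubicSpline

xf = np.linspace(0.0, 1.0, 5001)                     # fine evaluation grid
for n in (10, 20, 40, 80):
    t = np.linspace(0.0, 1.0, n + 1)                 # uniform knots, spacing h = 1/n
    s = CubicSpline(t, np.sin(2 * np.pi * t), bc_type='natural')
    err = np.max(np.abs(s(xf) - np.sin(2 * np.pi * xf)))
    print(f"n={n:3d} knots: max error = {err:.2e}")  # drops by about 16x per doubling of n
```

For genuinely noisy data, a least squares or smoothing spline with fewer knots than data points would be used in place of interpolation, following the approximating-spline variant described above.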

Applications

In Numerical Analysis

Approximation theory plays a fundamental role in numerical analysis by providing the foundation for deriving and analyzing methods that approximate solutions to continuous problems through discrete computations. In numerical integration, polynomial approximations of the integrand lead to quadrature rules that estimate definite integrals. For instance, Simpson's rule arises from quadratic interpolation of the function over subintervals, yielding an approximation of order four for smooth functions. Error bounds for such rules can be obtained using the Peano kernel theorem, which expresses the error in terms of higher derivatives of the integrand and a kernel function associated with the quadrature formula.

Numerical differentiation relies on local polynomial approximations, particularly through finite difference formulas derived from Taylor series expansions. The forward difference approximates the first derivative as f'(x) \approx \frac{f(x+h) - f(x)}{h}, with truncation error of order O(h), while central differences achieve order O(h^2). Higher-order schemes, built from more points, extend this accuracy by incorporating additional Taylor terms, enabling precise approximations for solving differential equations or analyzing function behavior.

In root finding, the secant method replaces f by the straight line through the two most recent iterates and takes the root of that line as the next estimate, which amounts to linear interpolation of the inverse of f near the solution of f(x) = 0. The update requires no derivative evaluations and achieves superlinear convergence under suitable conditions.

For solving ordinary and partial differential equations (ODEs and PDEs), approximation theory underpins collocation methods, in which the solution is approximated by splines or Chebyshev polynomials required to satisfy the equation at selected points. Spectral methods further leverage orthogonal polynomial bases, such as Chebyshev or Legendre expansions, to represent solutions globally, offering exponential convergence for smooth problems on bounded domains. Error analysis in these contexts often involves composite rules, which divide the domain into subintervals and apply local approximations, yielding global errors of order O(h^r), where r reflects the approximation order and h is the step size. This scaling ensures that refinement improves accuracy predictably, guiding practical implementation and adaptive strategies.
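As a concrete instance of a quadrature rule derived from local quadratic interpolation, the sketch below (assuming only NumPy) implements composite Simpson's rule and exhibits the expected O(h^4) error decay:

```python
import numpy as np

def simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) subintervals on [a, b]."""
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    return h / 3.0 * (y[0] + y[-1] + 4.0 * y[1:-1:2].sum() + 2.0 * y[2:-1:2].sum())

exact = 1.0 - np.cos(1.0)                        # integral of sin over [0, 1]
for n in (4, 8, 16, 32):
    err = abs(simpson(np.sin, 0.0, 1.0, n) - exact)
    print(f"n={n:3d}: |error| = {err:.2e}")       # shrinks by about 16x per doubling of n
```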

In Signal Processing and Data Fitting

In signal processing, approximation theory is central to the design of finite impulse response (FIR) filters. Least-squares methods minimize the integrated squared error between the desired and actual frequency responses to achieve optimal energy-based performance; this approach is effective for applications requiring smooth frequency characteristics, such as audio equalization, and reduces to a system of equations obtained from discrete least-squares optimization over sampled frequencies. Complementing this, minimax design via the Parks-McClellan algorithm, an adaptation of the Remez exchange method, equioscillates the error to minimize the maximum deviation, producing equiripple FIR filters well suited to sharp transitions in bandpass designs.

For data fitting, nonlinear least-squares techniques extend approximation principles to models with nonlinear parameters, such as fitting Gaussian peaks to spectroscopic data by iteratively minimizing the sum of squared residuals with algorithms like Levenberg-Marquardt, which blend gradient-descent and Gauss-Newton steps for reliable convergence. To mitigate overfitting in high-dimensional datasets, regularization methods like ridge regression add a penalty term to the least-squares objective, constraining the model coefficients and improving generalization by trading bias for reduced variance.

In data compression, wavelet approximations exploit orthogonal bases of L^2 to represent signals sparsely, allowing small coefficients to be thresholded for efficient encoding while preserving essential features, as in the JPEG 2000 standard. Similarly, the discrete cosine transform (DCT) used in JPEG compression approximates image blocks with cosine basis functions closely related to Chebyshev polynomials, concentrating energy in low-frequency components and yielding substantial file-size reduction with minimal perceptual loss.

A practical example is electrocardiogram (ECG) denoising, where approximation-based methods such as unbiased FIR filtering smooth noise while retaining the QRS complexes, achieving significant improvements in signal-to-noise ratio without distorting diagnostic features. In machine learning, classical approximation theory informs kernel methods for support vector machines (SVMs), where random Fourier features approximate nonlinear kernels so that scalable linear solvers can be used, reducing computational cost from quadratic to roughly linear in the number of samples.
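The two FIR design criteria described above are both available in SciPy. The sketch below (assuming NumPy and SciPy; the sampling rate, band edges, and filter length are illustrative) designs a least-squares lowpass with `firls` and an equiripple lowpass with `remez`, then compares their worst-case stopband ripple:

```python
import numpy as np
from scipy import signal

fs = 1000.0                                  # sampling rate in Hz (illustrative)
numtaps = 101                                # odd length, as required by firls
bands = [0.0, 100.0, 150.0, fs / 2]          # passband edge 100 Hz, stopband edge 150 Hz

h_ls = signal.firls(numtaps, bands, [1, 1, 0, 0], fs=fs)   # least-squares design
h_eq = signal.remez(numtaps, bands, [1, 0], fs=fs)         # Parks-McClellan equiripple design

w, H_ls = signal.freqz(h_ls, worN=2048, fs=fs)
_, H_eq = signal.freqz(h_eq, worN=2048, fs=fs)
stop = w >= 150.0
print("max stopband ripple: least-squares = %.2e, equiripple = %.2e"
      % (np.max(np.abs(H_ls[stop])), np.max(np.abs(H_eq[stop]))))
```

The least-squares design has smaller average stopband energy, while the equiripple design has the smaller worst-case ripple, mirroring the L^2 versus uniform-norm trade-off discussed throughout this article.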

    Abstract. The algorithm of Charles L. Lawson determines uniform approximations of functions as limits of weighted L2 approximations.Missing: polynomial original
  56. [56]
    [PDF] On best uniform approximation of finite sets by linear combinations ...
    Sep 15, 2022 · The idea of this paper is to apply linear programming to the classical approximation problem of best uniform approxi- mation. More specifically, ...
  57. [57]
    The differential correction algorithm for generalized rational functions
    The differential correction algorithm for generalized rational functions is described, and two theorems on convergence and order of convergence are given.
  58. [58]
    [PDF] Simultaneous estimation and variable selection in median ...
    Jul 10, 2008 · Although there is no explicit analytic form for ˆβL1 , the minimization may be carried out easily via linear programming (see for example, ...
  59. [59]
    [PDF] Linear Programming: Chapter 12 Regression - Robert Vanderbei
    Oct 17, 2007 · Median's Connection with Optimization x = argminx m. X i=1. |x − bi ... m + n and the L1 and L2 regression lines.
  60. [60]
    [PDF] Padé approximation
    The Padé coefficients are normally best found from a Taylor expansion: . c0 + c1x + c2x2 + ... = a0 + a1x + a2x2 + ... 1 + ...<|control11|><|separator|>
  61. [61]
    [PDF] An Introduction to the Convergence - Theory of Pade Approximants
    Our purpose is to introduce the reader to the convergence theory of interpolating rational functions known as Padé Approximants. Though the subject originated ...
  62. [62]
    [PDF] A Newton's method for best uniform rational approximation - RICAM
    Feb 1, 2022 · The classical algorithm for best uniform rational approximation is the Remez algorithm (see, e.g., [6, 7, 27]), which is based on the idea of ...
  63. [63]
    [PDF] ON THE POLE STABILITY OF RATIONAL APPROXIMATION
    Abstract. In this paper we investigated the stability of the system pa- rameters of the adaptive rational transformation, i.e. how specific pertur-.Missing: challenges Padé
  64. [64]
    [PDF] On the Stability and Instability of Padé Approximants - ResearchGate
    We call G ∈ Rat∗(N) stable if all of its poles lie in the open left half plane C− and completely unstable if all of it poles lie in the open right half plane C ...Missing: challenges multipoint
  65. [65]
    A Practical Guide to Splines - SpringerLink
    Free delivery 14-day returnsNov 29, 2001 · This book is based on the author's experience with calculations involving polynomial splines. It presents those parts of the theory that are especially useful ...
  66. [66]
    Contributions to the problem of approximation of equidistant data by ...
    SOME PROBLEM OF SPLINE'S EXACTNESS · Zhaoxia Wang. Mathematics. 2008. Spline's exactness is firstly mentioned in [20] by Schoenberg in 1946. In this paper, we ...
  67. [67]
    [PDF] Numerical Integration - Jason Cantarella
    Dec 28, 2007 · A strict error estimate for Simpson's rule is more difficult to obtain. ... 5.1.2 (a) Show that Simpson's rule is the unique quadrature formula of ...
  68. [68]
    [PDF] Numerical Analysis – Lecture 51 3 The Peano kernel theorem
    The error function for a quadrature formula is. L(f) = Z b a w(x)f(x)dx ... This allows us to bound the approximation error for Simpson's rule. Indeed ...Missing: rules | Show results with:rules
  69. [69]
    [PDF] Numerical differentiation: finite differences
    If we need to estimate the rate of change of y with respect to x in such a situation, we can use finite difference formulas to compute approximations of f0(x).
  70. [70]
    [PDF] Section 1 Polynomial interpolation
    Exercise 1.4 Implement this idea with linear approximation. You should get the same method as the secant method, since the inverse of a linear function is ...
  71. [71]
    [PDF] FINITE DIFFERENCE AND SPECTRAL METHODS FOR ORDINARY ...
    The other main parts of the book are. Chapter 2, which emphasizes the crucial relationship between smoothness of a function and decay of its Fourier transform ...
  72. [72]
    [PDF] CompositeQuadrature
    Jan 28, 2021 · To generate more accurate quadrature rule Q(a, b) we have in principle two possibilities: * Increase the order of the interpolation polynomial ...
  73. [73]
    Revisit of the Eigenfilter Method for the Design of FIR Filters and ...
    Sep 19, 2018 · The least squares based eigenfilter method has been applied to the design of both finite impulse response (FIR) filters and wideband beamformers ...Missing: approximation | Show results with:approximation
  74. [74]
    An Algorithm for Least-Squares Estimation of Nonlinear Parameters
    The modified Gauss-Newton method for the fitting of non-linear regression functions by least squares, Technometrics, 3 (1961), 269–280
  75. [75]
    [PDF] signal processing and compression with wavelet pac&ets
    This paper describes new algorithms for signal processing and data compression using orthogonal functions, including wavelet packets, which can reduce ...
  76. [76]
    ECG Signal Denoising and Features Extraction Using Unbiased FIR ...
    It is shown that the adaptive UFIR algorithm developed in such a way provides better denoising and suboptimal features extraction in terms of the output signal- ...