Bernstein polynomial
In the field of approximation theory, Bernstein polynomials are a family of polynomials that form a basis for the space of all polynomials of degree at most n on the unit interval [0, 1], defined by the basis functions b_{k,n}(u) = \binom{n}{k} u^k (1-u)^{n-k} for k = 0, 1, \dots, n, where \binom{n}{k} denotes the binomial coefficient.[1] For a continuous function f on [0, 1], the nth Bernstein polynomial of f is defined as B_n f(u) = \sum_{k=0}^n f\left(\frac{k}{n}\right) b_{k,n}(u). Any polynomial p of degree at most n can be uniquely expressed as p(u) = \sum_{k=0}^n c_k b_{k,n}(u); the coefficients c_k play the role of control points, and because c_0 = p(0) and c_n = p(1), the representation interpolates p at the endpoints u = 0 and u = 1.[1][2]

Introduced by the Russian mathematician Sergei Natanovich Bernstein (1880–1968) in his 1912 paper "Démonstration du Théorème de Weierstrass fondée sur le calcul des Probabilités," these polynomials provided the first explicit constructive proof of the Weierstrass approximation theorem, which states that any continuous function on a closed interval can be uniformly approximated by polynomials.[1] Bernstein's probabilistic approach, drawing on the law of large numbers via the binomial distribution, also yielded quantitative information about the rate of convergence and marked a significant advance in the Russian school of approximation theory initiated by Chebyshev.[1] The basis gained renewed prominence in the 1960s through independent work by Paul de Faget de Casteljau and Pierre Bézier, who adapted it for parametric curve and surface design in computer-aided geometric design (CAGD), enabling smooth free-form modeling in industries such as automotive and aerospace engineering.[1]

Key properties of the Bernstein basis include non-negativity on [0, 1], so that combinations with non-negative coefficients remain non-negative; the partition of unity, \sum_{k=0}^n b_{k,n}(u) = 1 for all u in [0, 1]; and symmetry, b_{k,n}(u) = b_{n-k,n}(1-u).[1] Each basis function is unimodal on [0, 1], attaining its maximum at u = k/n, and the basis is variation diminishing: the number of sign changes of \sum_{k=0}^n c_k b_{k,n}(u) on [0, 1] does not exceed the number of sign changes in the coefficient sequence (c_0, \dots, c_n), a behavior characteristic of totally positive kernels.[1] These attributes give the Bernstein form good numerical stability, with lower condition numbers than the monomial power basis, making it preferable for computations such as root finding and polynomial evaluation.[1] Beyond approximation and design, Bernstein polynomials find applications in optimization, robust control analysis, finite-element methods, and probabilistic modeling due to their ties to the binomial distribution.[1]

Introduction and Definition
Overview and motivation
Bernstein polynomials emerged in the field of approximation theory as a pivotal tool for constructing explicit polynomial approximations to continuous functions. In 1885, Karl Weierstrass established that every continuous function on a compact interval can be uniformly approximated by polynomials, but his proof was non-constructive, relying on abstract limits rather than specific formulas.[3] To address this gap and provide a concrete method, Sergei Natanovich Bernstein introduced these polynomials in 1912, offering the first explicit construction that demonstrates the theorem's validity.[1] The primary motivation behind Bernstein's innovation was to bridge the theoretical assurance of Weierstrass's result with practical computability, enabling mathematicians to generate approximating polynomials directly from function values. This constructive approach not only affirmed the density of polynomials in the space of continuous functions but also highlighted their utility in numerical analysis and later applications. By focusing on the interval [0,1], Bernstein's work laid a foundation for broader extensions while emphasizing the theorem's implications for representing complex functions through simpler algebraic forms.[1]

At their core, Bernstein polynomials can be understood intuitively as weighted averages of a function's values at equidistant points in [0,1], where the weights derive from binomial probabilities. This probabilistic perspective, which views the approximation as an expected value under a binomial distribution, merges algebraic manipulation with statistical intuition, and it ensures that the weights remain non-negative and sum to unity, much like a probability distribution. Such a blend underscores their robustness for approximation tasks and sets the stage for a deeper exploration of their properties and generalizations.[1]

Bernstein basis polynomials
The Bernstein basis polynomials of degree n, denoted b_{k,n}(x) for k = 0, 1, \dots, n, are defined by the formula

b_{k,n}(x) = \binom{n}{k} x^k (1-x)^{n-k}, \quad x \in [0,1].

This definition was introduced by Sergei Bernstein in his 1912 proof of the Weierstrass approximation theorem using probabilistic methods.[4]

These basis polynomials exhibit several fundamental properties on the interval [0,1]. They are non-negative for all x \in [0,1] and k = 0, \dots, n, since the binomial coefficient \binom{n}{k} is positive and x^k (1-x)^{n-k} \geq 0 on this domain.[1] They form a partition of unity, satisfying \sum_{k=0}^n b_{k,n}(x) = \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} = [x + (1-x)]^n = 1 for all x \in [0,1], as a direct consequence of the binomial theorem.[5] Moreover, each b_{k,n} is strictly positive on the open interval (0,1) and vanishes at the endpoints only as forced by its factors: b_{k,n}(0) = 0 unless k = 0, and b_{k,n}(1) = 0 unless k = n. These features make the basis well suited to approximation on this closed interval.[1]

Combinatorially, the coefficients \binom{n}{k} are binomial coefficients, counting the number of ways to obtain k successes in n independent trials. This links the Bernstein basis to probability: b_{k,n}(x) is the probability of exactly k successes in n independent Bernoulli trials with success probability p = x, i.e., the value at k of the probability mass function of a binomial random variable with parameters n and x.[4] These basis elements are used to construct Bernstein polynomials that approximate continuous functions on [0,1].
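These properties are straightforward to check numerically. The following Python sketch (an illustration only, not drawn from the cited sources; the helper name bernstein_basis is ad hoc) evaluates the basis at a few sample points and verifies non-negativity, the partition of unity, and the symmetry b_{k,n}(x) = b_{n-k,n}(1-x) noted in the article's lead.

```python
from math import comb

def bernstein_basis(k, n, x):
    """Evaluate the Bernstein basis polynomial b_{k,n}(x) = C(n,k) * x^k * (1-x)^(n-k)."""
    return comb(n, k) * x**k * (1 - x)**(n - k)

n = 5
for x in (0.0, 0.25, 0.5, 0.9, 1.0):
    values = [bernstein_basis(k, n, x) for k in range(n + 1)]
    # Non-negativity on [0, 1] and the partition of unity (the values sum to 1).
    assert all(v >= 0.0 for v in values)
    assert abs(sum(values) - 1.0) < 1e-12
    # Symmetry: b_{k,n}(x) = b_{n-k,n}(1 - x).
    assert all(abs(bernstein_basis(k, n, x) - bernstein_basis(n - k, n, 1 - x)) < 1e-12
               for k in range(n + 1))
print("non-negativity, partition of unity, and symmetry hold at the sample points")
```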
Bernstein polynomials of a function

Given a function f that is continuous on the interval [0,1], the nth Bernstein polynomial of f is the polynomial B_n(f; x) of degree at most n defined by

B_n(f; x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) b_{k,n}(x),

where b_{k,n}(x) denotes the Bernstein basis polynomials of degree n.[4] This construction evaluates f at the equidistant points k/n for k = 0, 1, \dots, n and weights these values by the basis functions, yielding the explicit summation form

B_n(f; x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) \binom{n}{k} x^k (1-x)^{n-k}.[4]

A key probabilistic interpretation arises from the binomial distribution: B_n(f; x) equals the expected value \mathbb{E}\left[ f\left( \frac{K}{n} \right) \right], where K follows a binomial distribution with parameters n and success probability x.[4] In this view, K counts the number of successes in n independent Bernoulli trials, each with success probability x; since K/n concentrates around x by the law of large numbers, the expectation \mathbb{E}[f(K/n)] approximates f(x) increasingly well as n grows.[4]

The operator reproduces constants and linear functions exactly for all n \geq 1: B_n(1; x) = 1 and B_n(t; x) = x, reflecting the facts that the basis sums to unity and that the mean of K/n is x.[6]
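As a concrete illustration (a sketch, not taken from the cited sources; the choice f(t) = e^t and the sample point are arbitrary), the following Python snippet evaluates B_n(f; x) directly from the summation formula and shows the approximation improving as n grows.

```python
from math import comb, exp

def bernstein_poly(f, n, x):
    """Evaluate B_n(f; x) = sum_k f(k/n) * C(n,k) * x^k * (1-x)^(n-k)."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = exp          # any continuous function on [0, 1] works here
x = 0.3
for n in (5, 20, 80):
    approx = bernstein_poly(f, n, x)
    print(f"n={n:3d}  B_n(f; {x}) = {approx:.6f}  |error| = {abs(approx - f(x)):.2e}")
# For smooth f the error shrinks roughly like 1/n, in line with the
# saturation behavior discussed under "Key Properties" below.
```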
Key Properties

Algebraic properties
The Bernstein operator B_n, which maps a function f: [0,1] \to \mathbb{R} to its Bernstein polynomial B_n(f; x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) \binom{n}{k} x^k (1-x)^{n-k}, possesses several fundamental algebraic properties arising from its construction as a positive linear operator. Chief among these is its linearity: for any scalars a, b \in \mathbb{R} and functions f, g \in C[0,1],

B_n(af + bg; x) = a B_n(f; x) + b B_n(g; x)
for all x \in [0,1]. This follows directly from the finite linear combination in the definition, which distributes over addition and scalar multiplication.[7]

A related property is the preservation of positivity: if f(x) \geq 0 for all x \in [0,1], then B_n(f; x) \geq 0 for all x \in [0,1]. This holds because the Bernstein basis polynomials b_{k,n}(x) = \binom{n}{k} x^k (1-x)^{n-k} are nonnegative on [0,1] and satisfy the partition of unity \sum_{k=0}^n b_{k,n}(x) = 1, so that B_n(f; x) is a convex combination of the nonnegative values f(k/n). Consequently, B_n maps the cone of nonnegative continuous functions to itself.[7][8]

The output B_n(f) is always a polynomial of degree at most n, whatever the continuous function f, since each basis polynomial has degree at most n. Constant and affine functions are reproduced exactly: if f is a polynomial of degree at most 1, then B_n(f) = f for every n \geq 1. Polynomials of higher degree are generally not reproduced exactly, although B_n maps the space of polynomials of degree at most m into itself for every m \leq n.[7]
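These properties are easy to confirm numerically. The short Python check below (illustrative only; the particular functions and evaluation point are arbitrary choices, and the direct-summation evaluator mirrors the one in the earlier sketch) verifies linearity, positivity, and exact reproduction of an affine function.

```python
from math import comb

def bernstein_poly(f, n, x):
    """Evaluate B_n(f; x) = sum_k f(k/n) * C(n,k) * x^k * (1-x)^(n-k)."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

n, x = 12, 0.37
f = lambda t: t * t
g = lambda t: 1.0 / (1.0 + t)
a, b = 2.5, -0.75

# Linearity: B_n(a*f + b*g) = a*B_n(f) + b*B_n(g).
lhs = bernstein_poly(lambda t: a * f(t) + b * g(t), n, x)
rhs = a * bernstein_poly(f, n, x) + b * bernstein_poly(g, n, x)
assert abs(lhs - rhs) < 1e-12

# Positivity: a nonnegative f yields a nonnegative value of B_n(f; x).
assert bernstein_poly(f, n, x) >= 0.0

# Affine reproduction: B_n(c0 + c1*t; x) = c0 + c1*x (up to rounding).
assert abs(bernstein_poly(lambda t: 3.0 - 2.0 * t, n, x) - (3.0 - 2.0 * x)) < 1e-12
print("linearity, positivity, and affine reproduction verified")
```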
The Bernstein basis admits a simple recurrence relation that facilitates algebraic manipulations and computations:

b_{k,n}(x) = x\, b_{k-1,n-1}(x) + (1-x)\, b_{k,n-1}(x)
for k = 0, 1, \dots, n, with the convention that b_{j,m}(x) = 0 whenever j < 0 or j > m; in particular, b_{0,n}(x) = (1-x)^n and b_{n,n}(x) = x^n. This forward recurrence, which follows from the Pascal identity for binomial coefficients, allows higher-degree bases to be built iteratively from lower ones and underpins de Casteljau's algorithm for Bézier curves.[9][8]
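The recurrence is what makes evaluation in Bernstein form numerically convenient. The Python sketch below (illustrative, not taken from the cited sources) performs de Casteljau-style evaluation of a polynomial given in Bernstein form p(x) = \sum_k c_k b_{k,n}(x) by repeated convex combinations, which rests on the recurrence above.

```python
def de_casteljau(coeffs, x):
    """Evaluate p(x) = sum_k c_k * b_{k,n}(x) from its Bernstein coefficients
    via repeated convex combinations (de Casteljau's algorithm)."""
    c = list(coeffs)
    n = len(c) - 1
    for r in range(n):
        # Each pass replaces c_k by (1-x)*c_k + x*c_{k+1}.
        c = [(1 - x) * c[k] + x * c[k + 1] for k in range(n - r)]
    return c[0]

# Example: coefficients (0, 0, 1) of degree 2 give p(x) = b_{2,2}(x) = x^2.
print(de_casteljau([0.0, 0.0, 1.0], 0.5))   # 0.25
# All coefficients equal to 1 recover the partition of unity: p(x) = 1.
print(de_casteljau([1.0, 1.0, 1.0], 0.3))   # 1.0
```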
Finally, the operator exhibits a saturation property concerning its approximation behavior: the uniform convergence rate of B_n(f) to f is O(1/n) for twice continuously differentiable functions f that are not linear, as captured by Voronovskaya's theorem, which states

\lim_{n \to \infty} n \left( B_n(f; x) - f(x) \right) = \frac{1}{2} x(1-x) f''(x).
Moreover, if the error satisfies \| B_n(f) - f \|_\infty = o(1/n), then f must be linear on [0,1]. This saturation order distinguishes Bernstein polynomials from operators achieving faster rates without such restrictions.[7][10]
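As a quick numerical check of Voronovskaya's theorem (a sketch only; the test function f(t) = sin t and the sample point are arbitrary choices), one can compute n\,(B_n(f; x) - f(x)) for increasing n and compare it with \tfrac{1}{2} x(1-x) f''(x).

```python
from math import comb, sin

x = 0.4
# Voronovskaya limit (1/2) x (1-x) f''(x) with f = sin, so f'' = -sin.
limit = 0.5 * x * (1 - x) * (-sin(x))
for n in (10, 100, 1000):
    # Direct summation of B_n(sin; x).
    bn = sum(sin(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
             for k in range(n + 1))
    print(f"n={n:5d}  n*(B_n(f) - f)(x) = {n * (bn - sin(x)):+.8f}  (limit {limit:+.8f})")
```

The scaled error approaches the limit rather than tending to zero faster, which is the saturation phenomenon described above.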