Bernstein polynomial
In the field of approximation theory, Bernstein polynomials are a family of polynomials that form a basis for the space of all polynomials of degree at most n on the unit interval [0, 1], defined by the basis functions b_{k,n}(u) = \binom{n}{k} u^k (1-u)^{n-k} for k = 0, 1, \dots, n, where \binom{n}{k} denotes the binomial coefficient.[1] For a continuous function f on [0, 1], the nth Bernstein polynomial of f is defined as B_n f(u) = \sum_{k=0}^n f\left(\frac{k}{n}\right) b_{k,n}(u). Any polynomial p of degree at most n can be uniquely expressed as p(u) = \sum_{k=0}^n c_k b_{k,n}(u); the coefficients c_k play the role of control points, and because c_0 = p(0) and c_n = p(1), the representation interpolates p at the endpoints u = 0 and u = 1.[1][2]

Introduced by the Russian mathematician Sergei Natanovich Bernstein (1880–1968) in his 1912 paper "Démonstration du Théorème de Weierstrass fondée sur le calcul des Probabilités," these polynomials provided the first explicit constructive proof of the Weierstrass approximation theorem, which states that any continuous function on a closed interval can be uniformly approximated by polynomials.[1] Bernstein's probabilistic approach, drawing on the law of large numbers via the binomial distribution, also yielded quantitative information about the rate of convergence and marked a significant advance in the Russian school of approximation theory initiated by Chebyshev.[1] The basis gained renewed prominence in the 1960s through independent work by Paul de Faget de Casteljau and Pierre Bézier, who adapted it for parametric curve and surface design in computer-aided geometric design (CAGD), enabling smooth free-form modeling in industries such as automotive and aerospace engineering.[1]

Key properties of the Bernstein basis include non-negativity on [0, 1], so that combinations with non-negative coefficients remain non-negative; the partition of unity, \sum_{k=0}^n b_{k,n}(u) = 1 for all u in [0, 1]; and symmetry, b_{k,n}(u) = b_{n-k,n}(1-u).[1] Each basis function is unimodal on [0, 1], attaining its maximum at u = k/n, and the basis is variation diminishing: the number of sign changes of \sum_{k=0}^n c_k b_{k,n}(u) on [0, 1] does not exceed the number of sign changes in the coefficient sequence (c_0, \dots, c_n), a behavior characteristic of totally positive kernels.[1] These attributes give the Bernstein form good numerical stability, with lower condition numbers than the monomial power basis, making it preferable for computations such as root finding and polynomial evaluation.[1] Beyond approximation and design, Bernstein polynomials find applications in optimization, robust control analysis, finite-element methods, and probabilistic modeling due to their ties to the binomial distribution.[1]

Introduction and Definition
Overview and motivation
Bernstein polynomials emerged in the field of approximation theory as a pivotal tool for constructing explicit polynomial approximations to continuous functions. In 1885, Karl Weierstrass established that every continuous function on a compact interval can be uniformly approximated by polynomials, but his proof was non-constructive, relying on abstract limits rather than specific formulas.[3] To address this gap and provide a concrete method, Sergei Natanovich Bernstein introduced these polynomials in 1912, offering the first explicit construction that demonstrates the theorem's validity.[1] The primary motivation behind Bernstein's innovation was to bridge the theoretical assurance of Weierstrass's result with practical computability, enabling mathematicians to generate approximating polynomials directly from function values. This constructive approach not only affirmed the density of polynomials in the space of continuous functions but also highlighted their utility in numerical analysis and later applications. By focusing on the interval [0,1], Bernstein's work laid a foundation for broader extensions while emphasizing the theorem's implications for representing complex functions through simpler algebraic forms.[1]

At their core, Bernstein polynomials can be understood intuitively as weighted averages of a function's values at equidistant points in [0,1], where the weights derive from binomial probabilities. This probabilistic perspective, which views the approximation as an expected value under a binomial distribution, merges algebraic manipulation with statistical intuition, and it ensures that the weights remain non-negative and sum to unity, much like a probability distribution. Such a blend underscores their robustness for approximation tasks and sets the stage for a deeper exploration of their properties and generalizations.[1]

Bernstein basis polynomials
The Bernstein basis polynomials of degree n, denoted b_{k,n}(x) for k = 0, 1, \dots, n, are defined by the formula

b_{k,n}(x) = \binom{n}{k} x^k (1-x)^{n-k}, \quad x \in [0,1].

This definition was introduced by Sergei Bernstein in his 1912 proof of the Weierstrass approximation theorem using probabilistic methods.[4]

These basis polynomials exhibit several fundamental properties on the interval [0,1]. They are non-negative for all x \in [0,1] and k = 0, \dots, n, since the binomial coefficient \binom{n}{k} is positive and x^k (1-x)^{n-k} \geq 0 on this domain.[1] They form a partition of unity, satisfying \sum_{k=0}^n b_{k,n}(x) = \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} = [x + (1-x)]^n = 1 for all x \in [0,1], as a direct consequence of the binomial theorem.[5] Moreover, each b_{k,n} is strictly positive on the open interval (0,1) and vanishes at the endpoints only as forced by its factors: b_{k,n}(0) = 0 unless k = 0, and b_{k,n}(1) = 0 unless k = n. These features make the basis well suited to approximation on this closed interval.[1]

Combinatorially, the coefficients \binom{n}{k} are binomial coefficients, counting the number of ways to obtain k successes in n independent trials. This links the Bernstein basis to probability: b_{k,n}(x) is the probability of exactly k successes in n independent Bernoulli trials with success probability p = x, i.e., the value at k of the probability mass function of a binomial random variable with parameters n and x.[4] These basis elements are used to construct Bernstein polynomials that approximate continuous functions on [0,1].
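These properties are straightforward to check numerically. The following Python sketch (an illustration only, not drawn from the cited sources; the helper name bernstein_basis is ad hoc) evaluates the basis at a few sample points and verifies non-negativity, the partition of unity, and the symmetry b_{k,n}(x) = b_{n-k,n}(1-x) noted in the article's lead.

```python
from math import comb

def bernstein_basis(k, n, x):
    """Evaluate the Bernstein basis polynomial b_{k,n}(x) = C(n,k) * x^k * (1-x)^(n-k)."""
    return comb(n, k) * x**k * (1 - x)**(n - k)

n = 5
for x in (0.0, 0.25, 0.5, 0.9, 1.0):
    values = [bernstein_basis(k, n, x) for k in range(n + 1)]
    # Non-negativity on [0, 1] and the partition of unity (the values sum to 1).
    assert all(v >= 0.0 for v in values)
    assert abs(sum(values) - 1.0) < 1e-12
    # Symmetry: b_{k,n}(x) = b_{n-k,n}(1 - x).
    assert all(abs(bernstein_basis(k, n, x) - bernstein_basis(n - k, n, 1 - x)) < 1e-12
               for k in range(n + 1))
print("non-negativity, partition of unity, and symmetry hold at the sample points")
```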
Bernstein polynomials of a function

Given a function f that is continuous on the interval [0,1], the nth Bernstein polynomial of f is the polynomial B_n(f; x) of degree at most n defined by

B_n(f; x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) b_{k,n}(x),

where b_{k,n}(x) denotes the Bernstein basis polynomials of degree n.[4] This construction evaluates f at the equidistant points k/n for k = 0, 1, \dots, n and weights these values by the basis functions, yielding the explicit summation form

B_n(f; x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) \binom{n}{k} x^k (1-x)^{n-k}.[4]

A key probabilistic interpretation arises from the binomial distribution: B_n(f; x) equals the expected value \mathbb{E}\left[ f\left( \frac{K}{n} \right) \right], where K follows a binomial distribution with parameters n and success probability x.[4] In this view, K counts the number of successes in n independent Bernoulli trials, each with success probability x; since K/n concentrates around x by the law of large numbers, the expectation \mathbb{E}[f(K/n)] approximates f(x) increasingly well as n grows.[4]

The operator reproduces constants and linear functions exactly for all n \geq 1: B_n(1; x) = 1 and B_n(t; x) = x, reflecting the facts that the basis sums to unity and that the mean of K/n is x.[6]
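As a concrete illustration (a sketch, not taken from the cited sources; the choice f(t) = e^t and the sample point are arbitrary), the following Python snippet evaluates B_n(f; x) directly from the summation formula and shows the approximation improving as n grows.

```python
from math import comb, exp

def bernstein_poly(f, n, x):
    """Evaluate B_n(f; x) = sum_k f(k/n) * C(n,k) * x^k * (1-x)^(n-k)."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = exp          # any continuous function on [0, 1] works here
x = 0.3
for n in (5, 20, 80):
    approx = bernstein_poly(f, n, x)
    print(f"n={n:3d}  B_n(f; {x}) = {approx:.6f}  |error| = {abs(approx - f(x)):.2e}")
# For smooth f the error shrinks roughly like 1/n, in line with the
# saturation behavior discussed under "Key Properties" below.
```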
Key Properties

Algebraic properties
The Bernstein operator B_n, which maps a function f: [0,1] \to \mathbb{R} to its Bernstein polynomial B_n(f; x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) \binom{n}{k} x^k (1-x)^{n-k}, possesses several fundamental algebraic properties arising from its construction as a positive linear operator. Chief among these is its linearity: for any scalars a, b \in \mathbb{R} and functions f, g \in C[0,1],

B_n(af + bg; x) = a B_n(f; x) + b B_n(g; x)
for all x \in [0,1]. This follows directly from the finite linear combination in the definition, which distributes over addition and scalar multiplication.[7]

A related property is the preservation of positivity: if f(x) \geq 0 for all x \in [0,1], then B_n(f; x) \geq 0 for all x \in [0,1]. This holds because the Bernstein basis polynomials b_{k,n}(x) = \binom{n}{k} x^k (1-x)^{n-k} are nonnegative on [0,1] and satisfy the partition of unity \sum_{k=0}^n b_{k,n}(x) = 1, so that B_n(f; x) is a convex combination of the nonnegative values f(k/n). Consequently, B_n maps the cone of nonnegative continuous functions to itself.[7][8]

The output B_n(f) is always a polynomial of degree at most n, whatever the continuous function f, since each basis polynomial has degree at most n. Constant and affine functions are reproduced exactly: if f is a polynomial of degree at most 1, then B_n(f) = f for every n \geq 1. Polynomials of higher degree are generally not reproduced exactly, although B_n maps the space of polynomials of degree at most m into itself for every m \leq n.[7]
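These properties are easy to confirm numerically. The short Python check below (illustrative only; the particular functions and evaluation point are arbitrary choices, and the direct-summation evaluator mirrors the one in the earlier sketch) verifies linearity, positivity, and exact reproduction of an affine function.

```python
from math import comb

def bernstein_poly(f, n, x):
    """Evaluate B_n(f; x) = sum_k f(k/n) * C(n,k) * x^k * (1-x)^(n-k)."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

n, x = 12, 0.37
f = lambda t: t * t
g = lambda t: 1.0 / (1.0 + t)
a, b = 2.5, -0.75

# Linearity: B_n(a*f + b*g) = a*B_n(f) + b*B_n(g).
lhs = bernstein_poly(lambda t: a * f(t) + b * g(t), n, x)
rhs = a * bernstein_poly(f, n, x) + b * bernstein_poly(g, n, x)
assert abs(lhs - rhs) < 1e-12

# Positivity: a nonnegative f yields a nonnegative value of B_n(f; x).
assert bernstein_poly(f, n, x) >= 0.0

# Affine reproduction: B_n(c0 + c1*t; x) = c0 + c1*x (up to rounding).
assert abs(bernstein_poly(lambda t: 3.0 - 2.0 * t, n, x) - (3.0 - 2.0 * x)) < 1e-12
print("linearity, positivity, and affine reproduction verified")
```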
The Bernstein basis admits a simple recurrence relation that facilitates algebraic manipulations and computations:

b_{k,n}(x) = x\, b_{k-1,n-1}(x) + (1-x)\, b_{k,n-1}(x)
for k = 0, 1, \dots, n, with the convention that b_{j,m}(x) = 0 whenever j < 0 or j > m; in particular, b_{0,n}(x) = (1-x)^n and b_{n,n}(x) = x^n. This forward recurrence, which follows from the Pascal identity for binomial coefficients, allows higher-degree bases to be built iteratively from lower ones and underpins de Casteljau's algorithm for Bézier curves.[9][8]
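The recurrence is what makes evaluation in Bernstein form numerically convenient. The Python sketch below (illustrative, not taken from the cited sources) performs de Casteljau-style evaluation of a polynomial given in Bernstein form p(x) = \sum_k c_k b_{k,n}(x) by repeated convex combinations, which rests on the recurrence above.

```python
def de_casteljau(coeffs, x):
    """Evaluate p(x) = sum_k c_k * b_{k,n}(x) from its Bernstein coefficients
    via repeated convex combinations (de Casteljau's algorithm)."""
    c = list(coeffs)
    n = len(c) - 1
    for r in range(n):
        # Each pass replaces c_k by (1-x)*c_k + x*c_{k+1}.
        c = [(1 - x) * c[k] + x * c[k + 1] for k in range(n - r)]
    return c[0]

# Example: coefficients (0, 0, 1) of degree 2 give p(x) = b_{2,2}(x) = x^2.
print(de_casteljau([0.0, 0.0, 1.0], 0.5))   # 0.25
# All coefficients equal to 1 recover the partition of unity: p(x) = 1.
print(de_casteljau([1.0, 1.0, 1.0], 0.3))   # 1.0
```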
Finally, the operator exhibits a saturation property concerning its approximation behavior: the uniform convergence rate of B_n(f) to f is O(1/n) for twice continuously differentiable functions f that are not linear, as captured by Voronovskaya's theorem, which states

\lim_{n \to \infty} n \left( B_n(f; x) - f(x) \right) = \frac{1}{2} x(1-x) f''(x).
Moreover, if the error satisfies \| B_n(f) - f \|_\infty = o(1/n), then f must be linear on [0,1]. This saturation order distinguishes Bernstein polynomials from operators achieving faster rates without such restrictions.[7][10]
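As a quick numerical check of Voronovskaya's theorem (a sketch only; the test function f(t) = sin t and the sample point are arbitrary choices), one can compute n\,(B_n(f; x) - f(x)) for increasing n and compare it with \tfrac{1}{2} x(1-x) f''(x).

```python
from math import comb, sin

x = 0.4
# Voronovskaya limit (1/2) x (1-x) f''(x) with f = sin, so f'' = -sin.
limit = 0.5 * x * (1 - x) * (-sin(x))
for n in (10, 100, 1000):
    # Direct summation of B_n(sin; x).
    bn = sum(sin(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
             for k in range(n + 1))
    print(f"n={n:5d}  n*(B_n(f) - f)(x) = {n * (bn - sin(x)):+.8f}  (limit {limit:+.8f})")
```

The scaled error approaches the limit rather than tending to zero faster, which is the saturation phenomenon described above.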