Sard's theorem, also known as the Morse–Sard theorem, is a fundamental result in differential topology that asserts: for a smooth map f: M \to N between smooth manifolds of finite dimension, the set of critical values of f (that is, the image under f of the critical points, where the differential df is not surjective) has Lebesgue measure zero in N.[1] This implies that almost every point in the codomain N is a regular value, for which the preimage f^{-1}(y) is either empty or a smooth submanifold of M of dimension \dim M - \dim N.[2]
Proved by the American mathematician Arthur Sard in 1942, the theorem generalizes earlier work by Anthony P. Morse from 1939 on scalar-valued functions and builds on results by A. B. Brown from 1935 showing the density of regular values.[1] Sard's original proof, published in the Bulletin of the American Mathematical Society, applies to C^k maps with k sufficiently large relative to the dimensions (specifically, k \geq \max(m - n + 1, 1) for maps from \mathbb{R}^m to \mathbb{R}^n), and it extends to manifolds via charts.[1] The result holds for second-countable Hausdorff manifolds and uses tools like Fubini's theorem and Taylor expansions to show that critical sets map to measure-zero subsets.[3]
In differential topology, Sard's theorem is essential for the concept of genericity and general position, ensuring that "typical" smooth maps avoid pathological behaviors like unwanted singularities in preimages.[2] It underpins the transversality theorem, which states that maps transverse to submanifolds are dense in the space of smooth maps, facilitating the study of intersections and homotopies.[3] Applications include degree theory (e.g., the Hopf degree theorem), embedding theorems (such as Whitney's embedding theorem), and proofs of fixed-point theorems like Brouwer's, by guaranteeing the existence of regular values for defining topological invariants.[2] Variants extend to infinite-dimensional settings under stronger regularity assumptions, though counterexamples exist without them.[3]
Introduction and Background
Historical Development
The development of Sard's theorem traces its roots to the early 20th century, amid advances in the calculus of variations, where mathematicians sought to characterize the measure-theoretic properties of critical sets to better understand extremal problems and the behavior of smooth functions near singularities. A. B. Brown's 1935 result showed the density of regular values for sufficiently smooth maps, providing an early foundation for controlling exceptional sets in such analyses. An unpublished collaboration between Marston Morse and Arthur Sard around 1935 laid further groundwork, investigating conditions under which critical values of functions from \mathbb{R}^n to \mathbb{R} form sets of Lebesgue measure zero, motivated by the need to control the "size" of exceptional sets in variational analysis.[4]
In 1939, Anthony P. Morse advanced this line by proving a special case for scalar-valued functions, demonstrating that if f: \mathbb{R}^n \to \mathbb{R} is sufficiently smooth, the image of its critical set has measure zero; this result, building directly on variational techniques, appeared under the title "The behavior of a function on its critical set" in the Annals of Mathematics. Morse's work highlighted the role of critical points in restricting the dimensionality and measure of function images, providing essential tools for subsequent generalizations.
The full theorem emerged in 1942 through Arthur Sard's seminal paper "The measure of the critical values of differentiable maps," published in the Bulletin of the American Mathematical Society, where he established the result for C^k mappings between Euclidean spaces of arbitrary dimensions with k \geq \max\{1, m - n + 1\} for maps from \mathbb{R}^m to \mathbb{R}^n, showing that critical values form a set of Lebesgue measure zero. This proof extended Morse's ideas using iterative differentiation and measure-theoretic arguments, confirming Sard's motivation from critical point analysis in the calculus of variations while resolving the general finite-dimensional case.[5]
Motivational Examples
To build intuition for the role of critical values in smooth mappings, consider the one-dimensional function f: \mathbb{R} \to \mathbb{R} given by f(x) = x^3. The derivative f'(x) = 3x^2 vanishes solely at the critical point x = 0, and the image f(0) = 0 is a single point, which has Lebesgue measure zero in \mathbb{R}.[6] More generally, in one dimension the critical values of any smooth function form a set of measure zero.[6]
A similar phenomenon arises in projections of curves in the plane. For the curve \gamma: \mathbb{R} \to \mathbb{R}^2 parametrized by \gamma(t) = (t, t^3), which traces a cubic graph, the projection onto the y-axis is the map \pi_y(\gamma(t)) = t^3. This projection has a critical point at t = 0, where the tangent vector \gamma'(t) = (1, 3t^2) is horizontal and so has no component in the y-direction, yielding the critical value 0, a single point of measure zero in \mathbb{R}. Folds in such projections create isolated critical values, but the overall set remains negligible in measure, ensuring that generic smooth functions exhibit well-behaved (regular) behavior almost everywhere, avoiding pathological issues at these points.[7]
This low-dimensional perspective highlights why critical values constitute a "small" set: even when the image covers the target space, the exceptional loci where the mapping fails to be locally invertible are confined to measure-zero subsets.[7]
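The f(x) = x^3 example can be checked numerically. The sketch below is only an illustration, not part of any proof: the grid, the interval [-2, 2], and the helper name near_critical_image_width are assumptions made for this example. It collects the grid points whose derivative is below a tolerance and measures how wide their image is; as the tolerance shrinks, the image collapses toward the single critical value 0.

```python
def f(x):
    return x ** 3


def df(x):
    return 3 * x ** 2


def near_critical_image_width(tol, lo=-2.0, hi=2.0, samples=20001):
    """Width of the image of the grid points where |f'(x)| < tol.

    Hypothetical helper: the grid and interval are arbitrary choices.
    """
    step = (hi - lo) / (samples - 1)
    xs = [lo + i * step for i in range(samples)]
    images = [f(x) for x in xs if abs(df(x)) < tol]
    return max(images) - min(images)


# |3x^2| < tol forces |x| < sqrt(tol / 3), so the image of the
# near-critical points lies in an interval of length 2 * (tol / 3) ** 1.5.
for tol in (1e-2, 1e-4):
    print(tol, near_critical_image_width(tol), 2 * (tol / 3) ** 1.5)
```

The printed widths stay below the analytic bound in the last column, which tends to zero with the tolerance.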
Mathematical Prerequisites
Smooth Mappings and Derivatives
In the context of Sard's theorem, smooth mappings form the foundational framework for analyzing the behavior of functions between Euclidean spaces or manifolds. A mapping f: U \to \mathbb{R}^m, where U \subseteq \mathbb{R}^n is open, is said to be of class C^k for a positive integer k if all partial derivatives of f up to order k exist and are continuous on U.[8] If f is C^k for every positive integer k, then f is smooth, or C^\infty. These definitions extend naturally to mappings between open subsets of Euclidean spaces, ensuring that smoothness captures higher-order differentiability essential for measure-theoretic properties in the theorem.[8]
The derivative of a smooth mapping f: \mathbb{R}^n \to \mathbb{R}^m at a point x \in \mathbb{R}^n is represented by the Jacobian matrix Df(x), an m \times n matrix whose entries are the first-order partial derivatives:
[Df(x)]_{ij} = \frac{\partial f_i}{\partial x_j}(x).
This matrix encodes the best linear approximation to f near x, and its rank, which is the dimension of the image of the associated linear map, satisfies \operatorname{rank}(Df(x)) \leq \min(m, n) by properties of matrix rank.[8] Higher-order derivatives follow analogously, with smoothness requiring continuity of all such partials.
For mappings between smooth manifolds, the concept of smoothness is defined locally via coordinate charts. A smooth manifold M of dimension n is equipped with an atlas \mathcal{A} = \{(U_\alpha, \phi_\alpha)\}, where each U_\alpha \subseteq M is open, \phi_\alpha: U_\alpha \to \mathbb{R}^n is a homeomorphism onto an open set, and transition maps \phi_\beta \circ \phi_\alpha^{-1} are smooth for overlapping charts. A mapping f: M \to N between smooth manifolds M and N is smooth if, for every pair of charts (U, \phi) on M and (V, \psi) on N with f(U) \subseteq V, the coordinate representation \psi \circ f \circ \phi^{-1}: \phi(U) \to \psi(V) is a smooth mapping between open subsets of Euclidean spaces.[8] This local Euclidean structure allows global properties of f to be studied through these chart-induced maps.
The differential of a smooth mapping f: M \to N at a point x \in M, denoted df_x: T_x M \to T_{f(x)} N, is the linear map between tangent spaces induced by f. In local coordinates, df_x is represented by the Jacobian matrix of the coordinate expression of f, providing a precise linearization that preserves the rank condition locally.[8] Points where the rank of df_x drops below the dimension of the codomain are of particular interest, as they relate to the critical points underlying Sard's theorem.[8]
Critical Points and Regular Values
In the context of a smooth mapping f: \mathbb{R}^n \to \mathbb{R}^m, the Jacobian matrix Df(x) at a point x \in \mathbb{R}^n plays a central role in classifying points in the domain.[1] A point x is defined as a critical point if the rank of Df(x) is less than the dimension m of the codomain; otherwise, it is a regular point.[1] The set of all critical points, denoted C_f, is formally given by
C_f = \{ x \in \mathbb{R}^n \mid \operatorname{rank}(Df(x)) < m \}.[1]
The critical values of f are the points in the codomain that arise as images of critical points, specifically the set f(C_f).[1] In contrast, a regular value is any point in \mathbb{R}^m that does not lie in f(C_f), meaning it is either outside the image of f or is the image solely of regular points.[1]
These concepts extend to smooth mappings f: M \to N between smooth manifolds of dimensions n and m, respectively. A point x \in M is critical if the rank of df_x is less than m (i.e., df_x is not surjective); otherwise, it is regular. The set of critical points is C_f = \{ x \in M \mid \operatorname{rank}(df_x) < m \}, critical values are f(C_f) \subseteq N, and a regular value is any point in N not in f(C_f).[8]
A notable case occurs when the domain dimension n is less than the codomain dimension m: here, the rank of Df(x) is at most n < m for every x, so every point in \mathbb{R}^n is critical.[1] The analogous situation holds for manifolds with \dim M < \dim N.
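The rank test in the definition above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the two sample maps (the complex squaring map written in real coordinates, and the curve t \mapsto (t, t^3)), the tolerance, and the helper names matrix_rank and is_critical are choices made for this example and do not come from the sources cited here.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a small dense matrix via Gaussian elimination with pivoting."""
    a = [[float(v) for v in row] for row in rows]
    n_rows, n_cols = len(a), len(a[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= tol:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n_rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n_cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


def is_critical(jacobian, m):
    """A point is critical when rank(Df(x)) < m, the codomain dimension."""
    return matrix_rank(jacobian) < m


def jacobian_square_map(x, y):
    # Assumed example: f(x, y) = (x^2 - y^2, 2xy), a map R^2 -> R^2.
    return [[2 * x, -2 * y], [2 * y, 2 * x]]


def jacobian_cubic_curve(t):
    # Assumed example: gamma(t) = (t, t^3), a map R -> R^2 (so n < m).
    return [[1.0], [3 * t ** 2]]


print(is_critical(jacobian_square_map(0.0, 0.0), m=2))  # origin: rank 0
print(is_critical(jacobian_square_map(1.0, 2.0), m=2))  # rank 2
print(is_critical(jacobian_cubic_curve(0.5), m=2))      # rank 1 < 2
```

For the squaring map only the origin is critical, while every point of the curve is critical because its Jacobian has a single column and so rank at most 1.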
Core Statement
General Formulation
Sard's theorem provides a fundamental result in real analysis concerning the size of the image of critical points under sufficiently smooth mappings between Euclidean spaces. Specifically, consider a mapping f: U \subseteq \mathbb{R}^n \to \mathbb{R}^m, where U is open and f belongs to the class C^k with k \geq \max(n - m + 1, 1). A point x \in U is critical if the rank of the Jacobian matrix Df(x) is strictly less than m, and the critical set is denoted C_f \subseteq U. The theorem states that the set of critical values f(C_f) has Lebesgue measure zero in \mathbb{R}^m.[1]
The differentiability condition on k is essential, as it guarantees that higher-order derivatives provide the necessary flatness near critical points to ensure their images do not contribute positive measure. For C^\infty mappings, this holds unconditionally since infinite differentiability exceeds any finite k. In equation form, if \mu denotes the Lebesgue measure on \mathbb{R}^m, then
\mu(f(C_f)) = 0.
This formulation captures the theorem's core assertion in the finite-dimensional setting.[9]
A trivial case arises when n < m, where the rank of Df(x) is at most n < m for all x, making the entire domain critical. Consequently, the whole image f(U) has Lebesgue measure zero in \mathbb{R}^m as soon as f is C^1; continuity alone is not enough, as space-filling curves show. This underscores how the theorem aligns with dimensional constraints on mappings.[1]
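The smoothness threshold k \geq \max(n - m + 1, 1) is easy to tabulate. The helper below is a hypothetical convenience function written for this illustration (its name and the input validation are assumptions); it simply evaluates the bound stated above for given domain and codomain dimensions.

```python
def min_smoothness(n, m):
    """Smallest k for which the C^k hypothesis of the theorem holds
    for maps from R^n to R^m, i.e. max(n - m + 1, 1)."""
    if n < 1 or m < 1:
        raise ValueError("dimensions must be positive integers")
    return max(n - m + 1, 1)


for n, m in [(1, 1), (2, 1), (3, 3), (5, 2), (1, 3)]:
    print(f"R^{n} -> R^{m}: needs C^{min_smoothness(n, m)}")
```

For maps from the plane to the line the bound is 2, and whenever n \leq m a single continuous derivative already suffices.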
Finite-Dimensional Case
In the finite-dimensional case, Sard's theorem applies to smooth mappings between Euclidean spaces \mathbb{R}^n and \mathbb{R}^m. A key corollary states that for a C^\infty map f: \mathbb{R}^n \to \mathbb{R}^m with n \geq m, the set of critical values has Lebesgue measure zero in \mathbb{R}^m.[10] This follows from the general formulation of the theorem, specialized to Euclidean domains. More generally, the result holds for maps that are merely C^k with k \geq n - m + 1, ensuring the critical values form a set of measure zero.[10]
Whitney showed in 1935 that the smoothness condition is sharp by constructing a C^1 map from \mathbb{R}^2 to \mathbb{R} whose critical values form a set of positive Lebesgue measure.[11]
A concrete example illustrates the theorem for f: \mathbb{R}^2 \to \mathbb{R}^1, such as the height function on a smooth surface embedded in \mathbb{R}^3, where the projection onto the z-axis yields critical values corresponding to horizontal tangent planes; these form a set of measure zero in \mathbb{R}.[10]
For the case m=1, the critical values of f: \mathbb{R}^n \to \mathbb{R} can be organized by the order to which the derivatives vanish:
f\left( \bigcup_{k=1}^\infty S_k \right),
where S_k = \{ x \in \mathbb{R}^n \mid D^j f(x) = 0 \text{ for all } 1 \leq j \leq k \} denotes the set of points where the first k derivatives vanish. Since these sets are nested (S_1 \supseteq S_2 \supseteq \cdots), the union is S_1, the full critical set; the proof treats the layers S_k \setminus S_{k+1} one at a time and shows that the image of each has measure zero, so that the images collectively have measure zero.[10]
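A small computation makes the scalar-valued case concrete. The function f(x, y) = x^3 - 3x + y^2 used below is an assumed example chosen for this sketch, as are the starting points and helper names; its gradient (3x^2 - 3, 2y) vanishes only at (1, 0) and (-1, 0). The code locates these points with Newton's method applied to the gradient and lists the resulting critical values.

```python
def f(x, y):
    return x ** 3 - 3 * x + y ** 2


def newton_critical_point(x, y, steps=60):
    """Newton iteration on grad f = (3x^2 - 3, 2y).

    The Hessian is diagonal (6x, 2), so each coordinate updates
    independently. Starting points must have x != 0.
    """
    for _ in range(steps):
        x = x - (3 * x ** 2 - 3) / (6 * x)
        y = y - (2 * y) / 2
    return round(x, 9), round(y, 9)


# Assumed starting grid; it happens to reach both critical points.
starts = [(sx, sy) for sx in (-2.0, -0.5, 0.5, 2.0) for sy in (-1.0, 1.0)]
critical_points = sorted({newton_critical_point(x, y) for x, y in starts})
critical_values = sorted(f(x, y) for x, y in critical_points)


def is_regular_value(c, tol=1e-9):
    """True when c differs from every critical value found above."""
    return all(abs(c - v) > tol for v in critical_values)


print(critical_points)
print(critical_values)
print(is_regular_value(0.0), is_regular_value(2.0))
```

Here the critical values form a two-point set, so every other real number is a regular value; the theorem guarantees measure zero in general, not finiteness.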
Proof Techniques
Measure-Theoretic Foundations
The Lebesgue measure \mu on \mathbb{R}^m is the standard measure that generalizes the notion of volume, assigning to each measurable set its m-dimensional content.[12] A subset E \subseteq \mathbb{R}^m has Lebesgue measure zero, denoted \mu(E) = 0, if for every \epsilon > 0, E can be covered by a countable collection of open balls whose total volume is less than \epsilon.[12] Such sets are considered negligible in the context of integration and geometric analysis, as they do not contribute to the value of Lebesgue integrals over \mathbb{R}^m.[13]
A key property of Lebesgue measure is that the countable union of sets of measure zero also has measure zero.[12] This follows from the subadditivity of the outer measure: if \{E_k\}_{k=1}^\infty are sets with \mu(E_k) = 0 for each k, then for any \epsilon > 0, each E_k can be covered by balls with total volume less than \epsilon / 2^k, and the union is covered by balls with total volume less than \epsilon.[14] In the setting of Sard's theorem, this property ensures that images of critical sets under smooth mappings, which can often be expressed as countable unions of measure-zero sets, remain negligible.[13]
Sard's theorem establishes that the set of critical values of a smooth mapping has Lebesgue measure zero in the codomain, rendering these values negligible for purposes such as integration, where functions can be altered on such sets without affecting integrals, or for genericity arguments in differential topology, where "almost all" points behave regularly.[13] For instance, a straight line in \mathbb{R}^2 has Lebesgue measure zero, as it can be covered by thin rectangular strips of arbitrarily small total area, providing an intuitive analogy for the "thinness" of critical value sets in higher dimensions.[12]
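The \epsilon / 2^k covering argument can be carried out exactly for a finite initial segment of a countable set. The sketch below is an illustration under assumptions: it uses the rationals in [0, 1] with bounded denominator as the sample set, and the function names are invented for this example. The k-th point receives an open interval of length \epsilon / 2^k, so the total length stays strictly below \epsilon however many points are covered.

```python
from fractions import Fraction


def rationals_in_unit_interval(max_den):
    """Distinct rationals p/q in [0, 1] with denominator at most max_den."""
    seen = []
    for q in range(1, max_den + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.append(r)
    return seen


def cover_countable_set(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by an interval of length eps / 2**k."""
    intervals = []
    for k, p in enumerate(points, start=1):
        half_width = Fraction(eps) / 2 ** (k + 1)
        intervals.append((p - half_width, p + half_width))
    return intervals


def total_length(intervals):
    return sum(b - a for a, b in intervals)


eps = Fraction(1, 1000)
points = rationals_in_unit_interval(10)
cover = cover_countable_set(points, eps)
print(len(points), float(total_length(cover)), float(eps))
```

With exact rational arithmetic the total length equals \epsilon (1 - 2^{-N}) for N points, which is below \epsilon for every N.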
Iterative Differentiation Approach
The iterative differentiation approach to proving Sard's theorem relies on a finite induction on the dimension of the domain, decomposing the critical set into layers defined by the order of vanishing of higher-order derivatives and applying the inductive hypothesis to lower-dimensional slices or restrictions of the map.[3] For a smooth map f: \mathbb{R}^m \to \mathbb{R}^n with m \geq n, the proof begins by assuming the theorem holds for maps from \mathbb{R}^{m-1} to \mathbb{R}^n, and proceeds by considering the critical set C_f = \{x \in \mathbb{R}^m \mid Df(x) \text{ is not surjective}\}.[15] The critical set is partitioned as C_f = \bigcup_{k=0}^\infty (C_k \setminus C_{k+1}) \cup \bigcap_{k=0}^\infty C_k, where C_k = \{x \in C_f \mid D^j f(x) = 0 \text{ for all } 1 \leq j \leq k\} denotes the points where the derivatives vanish up to order k (so that C_0 = C_f), and the image of each part is shown to have measure zero.[16]
For points in C_f \setminus C_1, where the first derivative Df(x) has rank less than n but is non-zero in some direction, a change of coordinates can be made so that the map is regular in the first variable, allowing the use of the inverse function theorem to parametrize the domain via a diffeomorphism and reduce the problem to slices of dimension m-1.[3] By the inductive hypothesis, the image under f of the critical set in each such slice has measure zero, and integrating over the parameter space via Fubini's theorem yields that the measure of f(C_f \setminus C_1) is zero.[15] Specifically, for a suitable decomposition, the measure satisfies
\mu(f(C_f \setminus C_1)) \leq \int \mu(f(C_f \cap \pi^{-1}(t))) \, d\mu_{\mathbb{R}}(t),
where \pi projects onto a one-dimensional factor, each slice \pi^{-1}(t) has dimension m-1, and the inner measures are zero by induction.[16]
For the layers C_k \setminus C_{k+1}, the (k+1)-th derivative D^{k+1} f(x) does not vanish, enabling a similar coordinate adjustment where the map restricted to a hyperplane transverse to the kernel behaves like a lower-order critical map, again reducing to the inductive case on dimension m-1 after applying the inverse function theorem around points where a partial derivative of order k+1 is non-zero.[3] This step ensures that the image f(C_k \setminus C_{k+1}) is a set of measure zero, as the restriction inherits the smoothness and the critical values lie in a lower-dimensional fiber whose image has measure zero by hypothesis.[15]
Finally, for the residual set \bigcap_{k=0}^\infty C_k of points where all derivatives vanish to infinite order (flat points), Taylor's theorem with remainder bounds the image size: on a small cube Q of side length \delta, the diameter of f(C_k \cap Q) is at most C \delta^{k+1}, and subdividing Q into l^m subcubes with l > 1/\delta shows that the measure \mu(f(C_k \cap Q)) \leq C l^{m - n(k+1)} \to 0 as l \to \infty for sufficiently large k > m/n - 1. Since the residual set is contained in every C_k, its image has measure zero.[16] This completes the induction, as the finite-dimensional case follows from local charts, and the full theorem holds for smooth maps since finite-order approximations suffice locally.[3]
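The cube-counting estimate in the last step can be written out as a short derivation. This is a sketch under stated assumptions: Q' denotes one of the l^m subcubes of side \delta / l, and the constants C_0, C_1, C_2 are introduced here for illustration and depend on f, Q, m, n and k but not on l.

```latex
% Assumption: Taylor's theorem with remainder gives, for x \in C_k \cap Q and y \in Q,
%   |f(y) - f(x)| \le C_0 \, |y - x|^{k+1}.
\begin{align*}
\operatorname{diam} f(C_k \cap Q')
  &\le C_1 \left(\frac{\delta}{l}\right)^{k+1}
  && \text{for each subcube } Q' \text{ meeting } C_k, \\
\mu\bigl(f(C_k \cap Q)\bigr)
  &\le l^{m} \cdot C_2 \left(\frac{\delta}{l}\right)^{n(k+1)}
  = C \, l^{\,m - n(k+1)}
  && \text{with } C = C_2 \, \delta^{\,n(k+1)}, \\
m - n(k+1) < 0
  &\iff k > \frac{m}{n} - 1 .
\end{align*}
```

The second line uses that a set of diameter d in \mathbb{R}^n lies in a ball of radius d, whose volume is proportional to d^n; the exponent of l is negative exactly under the stated condition on k, so the bound tends to zero as l \to \infty.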
Variants and Extensions
Manifold Versions
Sard's theorem extends naturally to smooth mappings between finite-dimensional smooth manifolds. Consider smooth manifolds M^n and N^m of dimensions n and m respectively, equipped with atlases of coordinate charts. For a smooth map f: M \to N, a point x \in M is defined as critical if the rank of the differential df_x: T_x M \to T_{f(x)} N is strictly less than m, meaning df_x fails to be surjective. The critical values are the points in the image f(C_f) \subseteq N, where C_f \subset M denotes the set of critical points. The theorem asserts that this set has measure zero in N with respect to any smooth volume form on N.[17][18]
The proof relies on reducing the problem to the Euclidean case via local coordinate charts. Since M and N are locally Euclidean, for any point p \in M, there exist charts (U, \phi) around p and (V, \psi) around f(p) such that the composition \psi \circ f \circ \phi^{-1}: \phi(U) \subset \mathbb{R}^n \to \psi(V) \subset \mathbb{R}^m is a smooth map between open subsets of Euclidean spaces. By the classical Sard's theorem, the image of the critical points under this local map has Lebesgue measure zero in \mathbb{R}^m. Thus, in each chart on N, the local contribution to the critical values has measure zero.[17][18]
To establish the global result, cover N with a countable atlas of charts \{(V_i, \psi_i)\}. The set of critical values is contained in the union over i of \psi_i^{-1} applied to the measure-zero sets in each \psi_i(V_i). Since the volume form on N induces Lebesgue measure in local charts (up to diffeomorphism), and measure-zero sets in charts pull back to measure-zero sets with respect to the volume form, the total set f(C_f) has measure zero in N. Some presentations embed N into a high-dimensional Euclidean space using a weak form of Whitney's embedding theorem and apply the Euclidean Sard's theorem to the composed map, yielding the same conclusion.[15][18]
This manifold version, central to differential topology since the mid-20th century, underpins key results such as the existence of regular values and transversality, facilitating proofs in embedding and immersion theory.[17]
Infinite-Dimensional Generalizations
Extending Sard's theorem to infinite-dimensional spaces, such as Banach or Hilbert spaces, encounters fundamental obstacles, as the classical measure-theoretic conclusion fails in general. For C^1 maps between Banach spaces, the set of critical values may not be of measure zero even when projected to finite-dimensional subspaces, due to the lack of a canonical translation-invariant measure in infinite dimensions. Counterexamples demonstrate that this issue persists for C^\infty maps; for instance, there exist smooth functions from the separable Hilbert space \ell^2 to \mathbb{R} whose critical values comprise the entire codomain.[19]
Partial generalizations succeed under restrictive conditions, particularly for maps with Fredholm derivatives. In 1965, Stephen Smale proved a version for C^q Fredholm maps (q > \max(\mathrm{index}(f), 0)) between manifolds modeled on Banach spaces, where the derivative at each point is a Fredholm operator of constant index. Here, the set of critical values (images of points where the derivative fails to be surjective) is of first category (meager), meaning it is a countable union of nowhere dense sets, by the Baire category theorem. This holds robustly in separable Hilbert spaces, where the separability ensures the Baire property applies effectively, replacing the measure-zero condition with category-theoretic smallness.[20][21]
Weaker notions of differentiability also yield limited analogs. For maps that are Gâteaux differentiable almost everywhere in separable Banach spaces (with respect to Gaussian measures or prevalence), the critical set can be controlled to have complement of full measure in certain senses, though the image of critical points may still be dense without further assumptions like continuity of the derivative. The full C^\infty analog requires additional structure, such as Fredholm properties, to avoid pathological behavior observed in counterexamples.[22]
In the early 2000s, David Preiss and collaborators established that Lipschitz maps between separable Banach spaces are Fréchet differentiable Γ-almost everywhere, provided they are regularly Gâteaux differentiable Γ-almost everywhere.[23] This differentiability result facilitates Sard-type conclusions in infinite dimensions using notions like prevalence (sets of "measure zero" with respect to Gaussian measures). A prevalent transversality theorem for Lipschitz functions between infinite-dimensional spaces was proved by Hunt, Sauer, and Yorke in 2006.[24] More recent work explores Sard properties for polynomial maps in infinite dimensions (Tiberio, 2024) and relaxed versions in one dimension (as of March 2025).[25][26]
Such infinite-dimensional extensions underpin applications in partial differential equations (PDEs), where solution spaces are often Banach or Hilbert spaces of functions. Smale's theorem, for example, ensures the genericity of regular values for nonlinear Fredholm operators arising in elliptic PDEs, facilitating proofs of local uniqueness and existence of solutions with prescribed properties in variational settings.[21][27]
Applications
Role in Transversality Theory
Sard's theorem provides the measure-theoretic foundation for transversality theory in differential topology by establishing that the image of the critical set under a smooth map has measure zero. This property ensures that small perturbations of a map can avoid pathological behaviors, making transverse intersections the generic case for smooth mappings between manifolds.
The transversality theorem, originally due to René Thom, states that given smooth manifolds M^m and N^n, a submanifold S \subset N, and any smooth map f: M \to N, there exists a smooth map g arbitrarily close to f such that g is transverse to S. In the standard proof, one includes f in a finite-dimensional family of maps F: M \times P \to N that is transverse to S as a whole, and considers the projection map from F^{-1}(S) to the parameter space P, whose critical values are controlled by Sard's theorem; this shows that the set of parameters yielding non-transverse maps has measure zero, hence transversality holds for a dense set of parameters.
This framework is pivotal in embedding theorems, such as Whitney's, where an immersion f: M \to \mathbb{R}^k (with k \geq 2\dim M + 1) can be perturbed to an embedding by ensuring transversality to the diagonal in \mathbb{R}^k \times \mathbb{R}^k. Specifically, the self-intersection set corresponds to the preimage of the diagonal under f \times f: M \times M \to \mathbb{R}^k \times \mathbb{R}^k, taken away from the diagonal of M \times M, and Sard's theorem guarantees that a generic perturbation is transverse to this submanifold. Since 2\dim M < k, transversality forces this preimage to be empty, allowing elimination of double points through small adjustments.
Sard's theorem thus resolves apparent paradoxes in higher-dimensional topology, such as Smale's demonstration of the eversion of the 2-sphere, by enabling regular homotopies that avoid singularities via transverse perturbations in dimensions three and above.
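The dimension bookkeeping behind transversality arguments can be sketched as follows. The function below is a hypothetical helper written for this illustration; it relies on the standard fact that when f: M \to N is transverse to a submanifold S, the preimage f^{-1}(S) is either empty or a submanifold of M whose codimension equals the codimension of S in N. When that expected dimension is negative, transversality can only hold if the preimage is empty.

```python
def transverse_preimage_dim(dim_m, dim_n, dim_s):
    """Expected dimension of f^{-1}(S) for f: M -> N transverse to S.

    Returns None when the count is negative, in which case a transverse
    map must miss S entirely.
    """
    if not (0 <= dim_s <= dim_n) or dim_m < 0:
        raise ValueError("need 0 <= dim S <= dim N and dim M >= 0")
    codim_s = dim_n - dim_s
    expected = dim_m - codim_s
    return expected if expected >= 0 else None


# Regular value of a map between surfaces: S is a point in N^2.
print(transverse_preimage_dim(2, 2, 0))
# Two surfaces in R^3 meeting transversely: a curve.
print(transverse_preimage_dim(2, 3, 2))
# A curve against a curve in R^3: generically disjoint.
print(transverse_preimage_dim(1, 3, 1))
# Double points of a curve in R^3: M x M (dim 2) against the
# diagonal (dim 3) in R^3 x R^3 (dim 6).
print(transverse_preimage_dim(2, 6, 3))
```

The last case is the n = 1, k = 3 instance of the double-point count 2n - k, which is negative exactly when k \geq 2n + 1.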
Implications for Embeddings and Immersions
Sard's theorem underpins the proofs of the Whitney immersion and embedding theorems, which establish the existence of smooth immersions and embeddings of manifolds into Euclidean spaces of low dimension. The Whitney immersion theorem states that any smooth n-dimensional manifold M (with n \geq 2) admits a smooth immersion into \mathbb{R}^{2n-1}.
Sard's theorem supplies the first step, an immersion of M into \mathbb{R}^{2n}. Starting from an immersion into \mathbb{R}^{2n+1}, the directions of linear projection to \mathbb{R}^{2n} for which the projected map fails to be an immersion are exactly the directions tangent to the image. These form the image of the (2n-1)-dimensional unit tangent bundle in the 2n-dimensional sphere of directions, which has measure zero by Sard's theorem. Thus, a generic choice of projection direction yields an immersion into \mathbb{R}^{2n}. The further reduction to \mathbb{R}^{2n-1} requires Whitney's finer analysis of singularities and does not follow from Sard's theorem alone.
For embeddings, the Whitney embedding theorem asserts that any smooth n-dimensional manifold M can be smoothly embedded into \mathbb{R}^{2n}. The Sard-based part of the proof starts with an embedding into some \mathbb{R}^N and repeatedly applies generic linear projections. By Sard's theorem, the projection directions that would introduce singularities, either by collapsing a tangent direction or by identifying two distinct points, form a set of measure zero as long as the target dimension is at least 2n+1. A generic projection therefore avoids them, resulting in an embedding into \mathbb{R}^{2n+1}; the final step down to \mathbb{R}^{2n} uses the Whitney trick to cancel the remaining double points.[28]
Self-intersections in the image, a primary obstacle to embeddings, are analyzed via the difference map associated to a smooth map f: M \to \mathbb{R}^k given by (x,y) \mapsto f(x) - f(y) on M \times M. Double points arise when this map attains the value 0 for x \neq y. Sard's theorem, applied to a suitable family of perturbations of f, shows that the set of parameters for which 0 is a critical value of the difference map away from the diagonal has measure zero, ensuring that for generic f, any double points are transverse. In the context of the Whitney theorems, this transversality, combined with a dimension count, implies that generic immersions into \mathbb{R}^{2n} have only isolated transverse double points, while generic maps into \mathbb{R}^{2n+1} have none at all.
These applications of Sard's theorem demonstrate that, for compact M, the immersions into \mathbb{R}^{2n} and the embeddings into \mathbb{R}^{2n+1} form open and dense subsets of the space of all smooth maps from M to the respective Euclidean space, with respect to the C^\infty topology.