
Prokhorov's theorem

Prokhorov's theorem is a cornerstone result in measure-theoretic probability that characterizes the relative compactness of families of probability measures on complete separable metric spaces (also known as Polish spaces) in terms of a property called tightness. Specifically, it states that a set \Gamma of Borel probability measures on such a space X is relatively compact in the space of probability measures \mathcal{P}(X), equipped with the topology of weak convergence, if and only if \Gamma is tight, meaning that for every \epsilon > 0 there exists a compact subset K \subseteq X such that \mu(K) \geq 1 - \epsilon for all \mu \in \Gamma. This equivalence provides a practical criterion for the existence of weakly convergent subsequences, which is essential for proving limit theorems for stochastic processes.

The theorem is named after the Soviet mathematician Yurii Vasilevich Prokhorov (1929–2013), a student of Andrey Kolmogorov and a prominent figure in probability theory, and it appeared in his 1956 paper on the convergence of random processes. Prokhorov proved the result in the context of metric spaces. The theorem's proof relies on the Prokhorov metric, which metrizes the weak topology on \mathcal{P}(X); in this metric, a family has compact closure precisely when it is tight.

The significance of Prokhorov's theorem lies in its foundational role in the modern theory of weak convergence of measures, as developed in classic texts such as Patrick Billingsley's Convergence of Probability Measures. It underpins limit theorems for empirical measures, the study of diffusion processes, and large deviation principles, where verifying tightness allows one to extract convergent subsequences from infinite families of distributions. Extensions of the theorem to non-Polish spaces and to signed measures have been explored, but the original version remains the most widely used because it applies to standard spaces such as \mathbb{R}^d.

Background Concepts

Tightness of Measures

In measure theory, a family K \subset \mathcal{P}(S) of probability measures on a metric space (S, \rho) is said to be tight if, for every \epsilon > 0, there exists a compact set K_\epsilon \subset S such that \mu(S \setminus K_\epsilon) < \epsilon for all \mu \in K. This condition ensures that the measures in the family concentrate their mass on compact subsets, uniformly across the family.

Intuitively, tightness captures the idea that no significant portion of the probability mass "escapes to infinity" as one ranges over the family. This property is particularly useful in spaces where compactness is not automatic, such as infinite-dimensional metric spaces, and it relates to weak convergence by providing a criterion for the existence of convergent subsequences.

On the real line \mathbb{R}, which is \sigma-compact, every individual probability measure is tight. A family of Gaussian measures \mathcal{N}(m_n, \sigma_n^2) is tight whenever the means m_n and variances \sigma_n^2 are bounded (e.g., |m_n| \leq M and \sigma_n^2 \leq V for fixed M, V > 0), since Chebyshev's inequality bounds the mass outside [-M - r, M + r] by V/r^2 uniformly in n. In contrast, the family of Dirac measures \{\delta_n : n \in \mathbb{N}\}, where \delta_n places unit mass at the point n, is not tight: for any compact interval [-R, R] and \epsilon = 1/2, \delta_n([-R, R]^c) = 1 > \epsilon for all n > R.

A key property of tightness is that it passes to weak limits: if a sequence \{\mu_n\} \subset K converges weakly to some \mu on S, and K is tight, then \mu satisfies the same concentration bounds. To see this, fix \epsilon > 0; by tightness of K, there exists a compact K_\epsilon such that \mu_n(S \setminus K_\epsilon) < \epsilon/2 for all n. Since K_\epsilon is compact, hence closed, its complement is open, and the portmanteau theorem gives \mu(S \setminus K_\epsilon) \leq \liminf_{n \to \infty} \mu_n(S \setminus K_\epsilon) \leq \epsilon/2 < \epsilon. The same argument applies to limits of nets, so weak limits inherit the concentration property; in particular, the weak closure of a tight family is tight.
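The two examples on the real line can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper names (`gaussian_tail_mass`, `chebyshev_radius`, `dirac_tail_mass`) are hypothetical, and the radius is chosen via Chebyshev's inequality under the assumption that the means are bounded by `mean_bound` and the variances by `var_bound`.

```python
import math

def gaussian_tail_mass(mean, var, radius):
    """Exact mass that N(mean, var) places outside [-radius, radius]."""
    sd = math.sqrt(var)
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
    return 1.0 - (cdf(radius) - cdf(-radius))

def chebyshev_radius(mean_bound, var_bound, eps):
    """Radius R such that every N(m, v) with |m| <= mean_bound and v <= var_bound
    has mass at most eps/2 (hence < eps) outside [-R, R], by Chebyshev's inequality."""
    return mean_bound + math.sqrt(2.0 * var_bound / eps)

def dirac_tail_mass(n, radius):
    """Mass that the Dirac measure at n places outside [-radius, radius]."""
    return 1.0 if abs(n) > radius else 0.0

if __name__ == "__main__":
    eps = 0.05
    R = chebyshev_radius(2.0, 3.0, eps)
    worst = max(gaussian_tail_mass(m, v, R)
                for m in (-2.0, 0.0, 2.0) for v in (0.1, 1.0, 3.0))
    print("Gaussian family: radius", R, "worst tail mass", worst)
    # No single radius works for all Dirac measures delta_n:
    print("Dirac family: tail mass of delta_11 outside [-10, 10] =", dirac_tail_mass(11, 10.0))
```

One compact interval serves the whole Gaussian family, whereas for the Dirac family any fixed interval is eventually left behind.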

Relative Compactness in Probability Measures

In the context of probability theory, the space \mathcal{P}(S) consists of all Borel probability measures on a topological space S. The weak topology on \mathcal{P}(S) is the coarsest topology such that the maps \mu \mapsto \int f \, d\mu are continuous for every bounded continuous function f: S \to \mathbb{R}. This induces weak convergence: a sequence \{\mu_n\} in \mathcal{P}(S) converges weakly to \mu \in \mathcal{P}(S) if \int f \, d\mu_n \to \int f \, d\mu for all such f.

A subset K \subset \mathcal{P}(S) is relatively compact in the weak topology if its closure is compact. Relative compactness thus captures the topological property that K is "precompact," ensuring the existence of weak limit points without requiring the full space to be compact. When S is a separable metric space, the weak topology on \mathcal{P}(S) is metrizable, for instance by the Prokhorov metric. In such metrizable settings, relative compactness is equivalent to sequential relative compactness: every sequence in K admits a subsequence converging weakly to some limit in \mathcal{P}(S), where the limit need not belong to K. Tightness of measures provides a key criterion for establishing relative compactness in these spaces.
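As a small illustration of the definition, the Dirac measures \delta_{1/n} converge weakly to \delta_0: integrals of bounded continuous functions converge, even though the mass assigned to the single point 0 does not. The sketch below is only illustrative; the test function `f` and the helper `integrate_dirac` are hypothetical choices, not part of any library.

```python
import math

def integrate_dirac(f, point):
    """Integral of f against the Dirac measure at `point` (just evaluation)."""
    return f(point)

def f(x):
    """A bounded continuous test function on the real line."""
    return math.cos(x) / (1.0 + x * x)

ns = (1, 10, 100, 1000)
limit = integrate_dirac(f, 0.0)
# Gaps |int f d(delta_{1/n}) - int f d(delta_0)| shrink to zero.
gaps = [abs(integrate_dirac(f, 1.0 / n) - limit) for n in ns]

# The indicator of {0} is not continuous, so its integrals need not converge:
# delta_{1/n}({0}) = 0 for every n, while delta_0({0}) = 1.
mass_at_zero = [1.0 if 1.0 / n == 0.0 else 0.0 for n in ns]

if __name__ == "__main__":
    print("gaps:", gaps)
    print("mass at {0} along the sequence:", mass_at_zero)
```

This is why weak convergence is tested against bounded continuous functions rather than against indicators of arbitrary Borel sets.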

Formal Statement

Version for Separable Metric Spaces

Prokhorov's theorem in the setting of separable metric spaces characterizes relatively compact families of probability measures in terms of tightness. Let (S, \rho) be a separable metric space, and let \mathcal{B}(S) denote its Borel \sigma-algebra generated by the open sets. The space \mathcal{P}(S) consists of all probability measures on (S, \mathcal{B}(S)), endowed with the weak topology, in which a sequence \{\mu_n\}_{n=1}^\infty \subset \mathcal{P}(S) converges to \mu \in \mathcal{P}(S) if \int_S f \, d\mu_n \to \int_S f \, d\mu for every bounded continuous function f: S \to \mathbb{R}. A subset K \subset \mathcal{P}(S) is said to be tight if, for every \epsilon > 0, there exists a compact subset C \subset S such that \mu(S \setminus C) < \epsilon for all \mu \in K.

The theorem states that if K is tight, then K is relatively compact in (\mathcal{P}(S), \text{weak}), meaning that its closure is compact. The converse holds if the space S is complete (i.e., Polish); see the version for Polish spaces below.

Separability of S ensures that the Borel \sigma-algebra is generated by a countable collection of open sets and that the weak topology on \mathcal{P}(S) admits a compatible metric, such as the Prokhorov metric d_P(\mu, \nu) = \inf \{ \epsilon > 0 : \mu(A) \leq \nu(A^\epsilon) + \epsilon \ \text{and} \ \nu(A) \leq \mu(A^\epsilon) + \epsilon \ \forall A \in \mathcal{B}(S) \}, where A^\epsilon = \{ x \in S : \inf_{y \in A} \rho(x,y) < \epsilon \}. Relative compactness in the weak topology coincides with relative compactness in the Prokhorov metric under these assumptions, and can therefore be tested on sequences. The result was originally established by Yuri V. Prokhorov in 1956 for complete separable metric spaces, but the implication from tightness to relative compactness extends to the more general separable case without requiring completeness of the metric.
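To make the definition of the Prokhorov metric concrete, the following sketch evaluates it by brute force for finitely supported measures on the real line. It is an illustration under stated assumptions, not an efficient or standard algorithm: measures are represented as hypothetical dictionaries `{point: mass}`, only subsets of the supports are tested (which suffices because enlarging a set can only increase the mass of its \epsilon-enlargement), and the infimum is located by bisection on [0, 1], using the fact that the defining condition is monotone in \epsilon and always holds at \epsilon = 1.

```python
from itertools import combinations

def _nonempty_subsets(points):
    """All nonempty subsets of a finite list of points."""
    for r in range(1, len(points) + 1):
        for combo in combinations(points, r):
            yield combo

def _mass(measure, subset):
    """Mass of a finite set of support points."""
    return sum(measure[p] for p in subset)

def _mass_of_enlargement(measure, subset, eps):
    """Mass of the open eps-enlargement {x : dist(x, subset) < eps}."""
    return sum(w for x, w in measure.items()
               if any(abs(x - a) < eps for a in subset))

def _one_sided_ok(mu, nu, eps, tol=1e-12):
    """Check mu(A) <= nu(A^eps) + eps for every A inside the support of mu."""
    return all(_mass(mu, A) <= _mass_of_enlargement(nu, A, eps) + eps + tol
               for A in _nonempty_subsets(list(mu)))

def prokhorov_distance(mu, nu, iters=60):
    """Approximate Prokhorov distance between two finitely supported measures."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if _one_sided_ok(mu, nu, mid) and _one_sided_ok(nu, mu, mid):
            hi = mid
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    print(prokhorov_distance({0.0: 1.0}, {0.3: 1.0}))            # two nearby point masses
    print(prokhorov_distance({0.0: 1.0}, {5.0: 1.0}))            # two distant point masses
    print(prokhorov_distance({0.0: 0.5, 1.0: 0.5}, {0.0: 1.0}))  # half the mass moved by 1
```

The enumeration is exponential in the support size, so this is only suitable for very small examples.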

Version for Polish Spaces

A Polish space is a topological space that admits a complete separable metric, that is, one homeomorphic to a complete metric space with a countable dense subset. Examples include Euclidean spaces \mathbb{R}^n, separable Hilbert spaces, and the space of continuous functions on a compact metric space equipped with the supremum norm. These spaces have convenient measure-theoretic properties; for example, every uncountable Polish space is Borel isomorphic to the Baire space (which is homeomorphic to the irrational numbers), which facilitates the study of probability measures.

In the context of Prokhorov's theorem, when the underlying space S is Polish, the space \mathcal{P}(S) of probability measures on S, endowed with the weak topology, admits a complete metric d_0 that generates this topology; a standard choice is the Prokhorov metric built from a complete metric on S, defined as d_0(\mu, \nu) = \inf\left\{\varepsilon > 0 : \mu(A) \leq \nu(A^\varepsilon) + \varepsilon \text{ and } \nu(A) \leq \mu(A^\varepsilon) + \varepsilon \ \forall A \in \mathcal{B}(S)\right\}, where A^\varepsilon denotes the open \varepsilon-enlargement of the set A. Consequently the weak topology is separable and completely metrizable, making \mathcal{P}(S) itself a Polish space.

The theorem then reads as follows: a subset K \subseteq \mathcal{P}(S) is tight if and only if its closure in (\mathcal{P}(S), d_0) is compact. Tightness here means that for every \varepsilon > 0, there exists a compact subset C \subseteq S such that \mu(S \setminus C) < \varepsilon for all \mu \in K. The implication from compactness to tightness uses completeness; it generalizes the fact that, in Polish spaces, every single Borel probability measure is tight (i.e., inner regular with respect to compact sets). Because the topology is metrizable, relative compactness coincides with sequential relative compactness, so every sequence in a tight family K has a subsequence converging weakly to some \mu \in \mathcal{P}(S).

For instance, if C \subseteq S is compact, then the set \mathcal{P}(C) of probability measures concentrated on C is compact in the weak topology: it is tight (with C itself serving as the concentrating compact set) and it is closed, since any weak limit \mu of measures \mu_n \in \mathcal{P}(C) satisfies \mu(C) \geq \limsup_n \mu_n(C) = 1 by the portmanteau theorem. This compactness property underpins many results in empirical process theory and stochastic approximation on Polish spaces.
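The fact that a single Borel probability measure on a Polish space is tight can be sketched in a few lines. The derivation below is the standard covering argument, written under the assumption that \rho is a complete metric on S and \{x_j\} is a countable dense subset; the symbols \bar{B}(x, r) (closed ball) and N_k are introduced here for illustration.

```latex
% Sketch: a Borel probability measure \mu on a Polish space (S, \rho) is tight.
% \{x_j\}_{j \ge 1} is dense in S, \bar{B}(x, r) is the closed ball, \varepsilon > 0 is fixed.
\begin{align*}
&\text{For each } k \ge 1:\quad
  S = \bigcup_{j \ge 1} \bar{B}(x_j, 1/k)
  \;\Longrightarrow\;
  \exists\, N_k:\ \mu\Big(\bigcup_{j \le N_k} \bar{B}(x_j, 1/k)\Big) \ge 1 - \varepsilon\, 2^{-k}, \\
&C := \bigcap_{k \ge 1} \, \bigcup_{j \le N_k} \bar{B}(x_j, 1/k)
  \quad \text{is closed and totally bounded, hence compact because } \rho \text{ is complete}, \\
&\mu(S \setminus C)
  \le \sum_{k \ge 1} \mu\Big(S \setminus \bigcup_{j \le N_k} \bar{B}(x_j, 1/k)\Big)
  \le \sum_{k \ge 1} \varepsilon\, 2^{-k} = \varepsilon .
\end{align*}
```

Replacing \varepsilon by \varepsilon/2 throughout gives the strict inequality used in the definition of tightness.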

Corollaries

In Euclidean Spaces

In finite-dimensional Euclidean spaces \mathbb{R}^m, Prokhorov's theorem provides a direct characterization of relative compactness for sequences of probability measures. A sequence (\mu_n) in \mathcal{P}(\mathbb{R}^m) is relatively compact in the topology of weak convergence if and only if it is tight. Consequently, every tight sequence admits a weakly convergent subsequence \mu_{n_k} \to \mu for some \mu \in \mathcal{P}(\mathbb{R}^m). The proof relies on the general statement of Prokhorov's theorem, noting that \mathbb{R}^m equipped with the Euclidean metric is a complete separable metric space. This result extends the one-dimensional Helly selection theorem, which guarantees convergent subsequences for tight families of distribution functions on \mathbb{R}, to higher dimensions.

A practical sufficient condition for tightness in \mathcal{P}(\mathbb{R}^m) is the uniform boundedness of first moments: the sequence (\mu_n) is tight if M := \sup_n \int_{\mathbb{R}^m} \|x\| \, d\mu_n(x) < \infty. Indeed, Markov's inequality gives \mu_n(\{x : \|x\| > R\}) \leq M/R for every n, and closed balls in \mathbb{R}^m are compact, so no probability mass can escape to infinity. The condition is sufficient but not necessary. This corollary applies, for instance, to sequences of distributions of random vectors whose norms have uniformly bounded expectations, ensuring the existence of weakly convergent subsequences. Another example arises with empirical measures \mu_n = n^{-1} \sum_{i=1}^n \delta_{X_i} from i.i.d. samples X_i drawn from a distribution with finite first moment: by the strong law of large numbers, \int \|x\| \, d\mu_n = n^{-1} \sum_{i=1}^n \|X_i\| converges almost surely and is therefore almost surely bounded in n, so the sequence (\mu_n) is almost surely tight and admits weakly convergent subsequences.
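The first-moment criterion translates directly into a recipe for the compact set: a uniform bound on first moments gives a ball radius through Markov's inequality. The sketch below checks this on a family of exponential distributions, chosen only because their tails are available in closed form; the function names and the factor 2 (used to obtain a strict inequality) are illustrative assumptions.

```python
import math

def markov_radius(first_moment_bound, eps):
    """Radius R with mu(|x| > R) <= eps/2 < eps for every measure whose
    first absolute moment is at most first_moment_bound (Markov's inequality)."""
    return 2.0 * first_moment_bound / eps

def exponential_tail(scale, radius):
    """Exact P(X > radius) for X exponentially distributed with mean `scale`."""
    return math.exp(-radius / scale)

if __name__ == "__main__":
    M, eps = 3.0, 0.01            # all members of the family have mean at most M
    R = markov_radius(M, eps)
    for scale in (0.1, 1.0, 3.0):
        print(scale, exponential_tail(scale, R), "Markov bound:", scale / R)
```

The Markov bound is usually far from sharp, as the exponential tails show, but it has the advantage of depending only on the moment bound and not on the shape of the distributions.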

Implications for Weak Convergence

A key implication of Prokhorov's theorem for weak convergence arises in the context of sequences of probability measures on \mathbb{R}^m. Specifically, if a sequence (\mu_n) in \mathcal{P}(\mathbb{R}^m) is tight and every weakly convergent subsequence has the same limit \mu, then \mu_n \to \mu weakly. This follows because tightness ensures relative compactness by Prokhorov's theorem, so every subsequence has a further subsequence converging weakly to some limit, and the uniqueness condition forces all such limits to coincide with \mu; a sequence all of whose subsequences have further subsequences converging to \mu converges to \mu.

Prokhorov's theorem plays a foundational role in the proof of Lévy's continuity theorem, which characterizes weak convergence through the pointwise convergence of characteristic functions. In the form used here, the theorem states that if the characteristic functions \phi_n(t) = \int_{\mathbb{R}^m} e^{i \langle t, x \rangle} \, d\mu_n(x) converge pointwise to \phi(t) = \int_{\mathbb{R}^m} e^{i \langle t, x \rangle} \, d\mu(x) for all t \in \mathbb{R}^m, and if (\mu_n) is tight, then \mu_n \to \mu weakly. Here, Prokhorov's theorem converts tightness into relative compactness: every subsequential weak limit \nu has characteristic function \phi, and since characteristic functions determine measures, \nu = \mu.

This framework extends to the method of moments in \mathbb{R}^m: if all moments of \mu_n converge to those of a measure \mu that is determined by its moments, then \mu_n \to \mu weakly, with tightness following from the boundedness of second moments. Characteristic functions provide a more direct criterion, since they always determine the measure. For instance, in the setup of the central limit theorem, the normalized partial sums have bounded variance and are therefore tight by Chebyshev's inequality, while the Lindeberg condition on the independent summands yields pointwise convergence of their characteristic functions to the Gaussian characteristic function; together these imply weak convergence to the normal distribution.
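The characteristic-function route can be illustrated with the simplest case of the central limit theorem: sums of independent signs (values +1 and -1 with probability 1/2 each), whose normalized sums have the explicit characteristic function \cos(t/\sqrt{n})^n. The sketch below, with hypothetical function names, compares it with the standard Gaussian characteristic function; tightness is automatic here because every normalized sum has variance 1.

```python
import math

def rademacher_sum_cf(t, n):
    """Characteristic function of (X_1 + ... + X_n) / sqrt(n) for independent
    X_i taking the values +1 and -1 with probability 1/2 each."""
    return math.cos(t / math.sqrt(n)) ** n

def gaussian_cf(t):
    """Characteristic function of the standard normal distribution."""
    return math.exp(-0.5 * t * t)

def cf_errors(t, ns=(10, 100, 1000, 10000)):
    """Absolute gaps between the two characteristic functions at the point t."""
    return [abs(rademacher_sum_cf(t, n) - gaussian_cf(t)) for n in ns]

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, cf_errors(t))
```

Pointwise convergence at each fixed t, combined with tightness, is exactly the input required by the continuity theorem.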

Extensions

To Complex Measures

The extension of Prokhorov's theorem to complex measures applies to families of complex Borel measures on Polish spaces. Specifically, for a Polish space S, a family \Pi of complex Borel measures on S is sequentially precompact in the weak topology if and only if \Pi is tight and uniformly bounded in the total variation norm, that is, \sup_{\mu \in \Pi} \|\mu\| < \infty, where \|\mu\| = |\mu|(S) denotes the total variation of \mu. Tightness in this context is defined using the total variation measure: for every \epsilon > 0, there exists a compact set K_\epsilon \subset S such that |\mu|(S \setminus K_\epsilon) < \epsilon for all \mu \in \Pi, with |\mu| being the total variation measure associated to \mu. This condition ensures that the mass of the measures is concentrated on compact subsets, analogous to the probability case but adapted to the non-normalized nature of complex measures.

Unlike the original theorem for probability measures, where uniform boundedness is automatic because the total mass is 1, the extension to complex measures requires the explicit uniform bound on the total variation norm. For example, for a fixed point x \in S the family \{n \delta_x : n \in \mathbb{N}\} is tight but has no weakly convergent subsequence, since the integrals of the constant function 1 diverge. Boundedness combined with tightness guarantees the existence of weakly convergent subsequences in the space of complex measures equipped with the weak topology.

To Non-Separable Metric Spaces

In non-separable metric spaces S, the weak topology on the space of probability measures \mathcal{P}(S) is generally not metrizable, which complicates the notion of relative compactness compared to the separable case. The implication from tightness to sequential relative compactness does not itself rely on separability; what is lost is the converse, because individual Borel measures need no longer be tight, so the uniform control over supports that holds in Polish spaces is unavailable. A partial extension addresses this by showing that a family \mathcal{K} \subset \mathcal{P}(S) is weak*-precompact if, for every \epsilon > 0, there exists a compact K \subset S such that \mu(K) \geq 1 - \epsilon for all but finitely many \mu \in \mathcal{K}. For families of bounded positive measures supported on separable subspaces of S, relative compactness holds provided the family is tight and satisfies additional regularity conditions. This leverages the fact that a Borel probability measure on a complete metric space that is concentrated on a separable closed subspace can be treated with the standard theorem on that subspace.

The situation in fully non-separable settings depends on set-theoretic assumptions. For instance, in an uncountable discrete space the compact sets are exactly the finite sets, so a measure is tight only if it is concentrated on a countable set; whether every Borel probability measure has this property is tied to the existence of nontrivial measures defined on all subsets, a question studied by Banach and Kuratowski under the continuum hypothesis.

Modern treatments often circumvent these issues by restricting measures to their separable supports: tightness in the original space then implies relative compactness in the separable subspace, preserving weak convergence properties. The Prokhorov metric has also been generalized to settings involving embeddings of metric measure spaces, as in the Gromov-Hausdorff-Prokhorov distance, enabling comparisons and compactness results across different underlying spaces.

In Central Limit Theorems

Prokhorov's theorem plays a pivotal role in the classical central limit theorem (CLT) for sums of independent and identically distributed (i.i.d.) random vectors in \mathbb{R}^d, where it converts tightness of the sequence of distributions into relative compactness. Consider i.i.d. random vectors X_1, X_2, \dots with mean zero and finite positive definite covariance matrix \Sigma. The normalized partial sums S_n / \sqrt{n}, where S_n = X_1 + \cdots + X_n, have distributions \mu_n that are tight because their second moments are uniformly bounded: for every \epsilon > 0, there exists a compact set K \subset \mathbb{R}^d such that \mu_n(K^c) < \epsilon for all n. By Prokhorov's theorem, the family is relatively compact in the weak topology, so every subsequence of \{\mu_n\} has a further subsequence converging weakly to some limit measure. Combined with the pointwise convergence of characteristic functions to \exp(-\frac{1}{2} t^\top \Sigma t), the Gaussian characteristic function, this identifies every subsequential limit as the d-dimensional normal distribution \mathcal{N}(0, \Sigma) and yields the full weak convergence of \mu_n.

The theorem's utility extends to the Lindeberg-Feller theorem, which generalizes the CLT to sums of independent but not necessarily identically distributed random vectors under the Lindeberg condition: for every \eta > 0, \frac{1}{s_n^2} \sum_{k=1}^n \mathbb{E}[ \|X_k\|^2 \mathbf{1}_{\{\|X_k\| > \eta s_n\}} ] \to 0 as n \to \infty, where s_n^2 = \sum_{k=1}^n \mathbb{E}[\|X_k\|^2]. In finite dimensions, for centered variables the normalized sums satisfy \mathbb{E}\|S_n / s_n\|^2 = 1 for every n, so their laws are tight by Chebyshev's inequality and relatively compact by Prokhorov's theorem. The Lindeberg condition then yields convergence of characteristic functions to a Gaussian form, and hence weak convergence to a centered normal distribution \mathcal{N}(0, \Sigma), provided the normalized covariance matrices s_n^{-2} \sum_{k=1}^n \operatorname{Cov}(X_k) converge to a limit \Sigma. This tie-in underscores how Prokhorov's theorem bridges moment assumptions to the structural requirements for weak limits in non-i.i.d. settings.

A related quantitative result is the Berry-Esseen theorem, which bounds the CLT convergence rate for i.i.d. variables with finite third moments. For real-valued variables with \mathbb{E}[X] = 0, \mathbb{E}[X^2] = 1, and \rho = \mathbb{E}[|X|^3] < \infty, the theorem bounds the supremum distance \sup_{x \in \mathbb{R}} | \mu_n((-\infty, x]) - \Phi(x) | \leq C \rho / \sqrt{n}, where \Phi is the standard normal cdf and C is an absolute constant; multivariate versions hold with dimension-dependent constants. Whereas the relative compactness from Prokhorov's theorem delivers only qualitative convergence along subsequences, the finite third moment controls tail behavior strongly enough to give an error estimate that is uniform in x and explicit in n.

While the focus remains on finite-dimensional cases, Prokhorov's theorem also underpins infinite-dimensional CLTs in separable Hilbert spaces, such as those for empirical processes, where tightness is verified through conditions on the covariance operator and on moments, ensuring weak convergence to Gaussian measures on the space of square-integrable functions.
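The tightness claim for the i.i.d. case reduces to a second-moment computation followed by Chebyshev's inequality. The following worked derivation is a sketch under the stated assumptions (mean zero, covariance \Sigma, independence); the radius R is a symbol introduced here for illustration.

```latex
% Tightness of the laws \mu_n of S_n / \sqrt{n}; cross terms vanish by independence and mean zero.
\begin{align*}
\mathbb{E}\,\big\| S_n/\sqrt{n} \big\|^2
  &= \frac{1}{n} \sum_{j,k=1}^{n} \mathbb{E}\,\langle X_j, X_k \rangle
   = \frac{1}{n} \sum_{k=1}^{n} \mathbb{E}\,\|X_k\|^2
   = \operatorname{tr}(\Sigma), \\
\mu_n\big(\{x \in \mathbb{R}^d : \|x\| > R\}\big)
  &\le \frac{\operatorname{tr}(\Sigma)}{R^2} < \epsilon
  \quad \text{for all } n, \text{ whenever } R > \sqrt{\operatorname{tr}(\Sigma)/\epsilon}.
\end{align*}
```

The closed ball of radius R is compact in \mathbb{R}^d and does not depend on n, which is exactly the uniformity that tightness requires.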

In Large Deviation Principles

Prokhorov's theorem is instrumental in large deviation theory, where families of probability measures (\mu_n) describing rare events are studied on an exponential scale in order to establish large deviation principles (LDPs). Compactness arguments in the spirit of Prokhorov's theorem, typically phrased through an exponential form of tightness, ensure that subsequential limits exist and that the rate function has compact level sets (a good rate function); the rate function can then be identified via Varadhan's lemma or transferred to other settings by the contraction principle. For instance, for the empirical measures L_n = n^{-1} \sum_{i=1}^n \delta_{X_i} of i.i.d. samples with common law \mu, Sanov's theorem gives an LDP with good rate function I(\nu) = H(\nu \mid \mu), the relative entropy with respect to \mu, governing the asymptotics \mathbb{P}(L_n \in A) \approx \exp(-n \inf_{\nu \in A} I(\nu)) for suitable sets A of probability measures. This application extends the theorem's utility beyond CLTs to exponential rare-event analysis in stochastic systems.

Connection to Arzelà–Ascoli Theorem

Prokhorov's theorem establishes that tightness of a family of probability measures on a metric space implies relative compactness in the weak topology, providing a foundational tool for weak convergence. In the specific context of the space C[0,1] of continuous functions on [0,1] equipped with the supremum metric, the Arzelà–Ascoli theorem complements this by characterizing the compact subsets as those that are closed, uniformly bounded, and equicontinuous. This characterization directly informs tightness criteria for families of measures in \mathcal{P}(C[0,1]): a sequence \{P_n\} is tight if, for every \epsilon > 0, there exists a compact set K \subset C[0,1] such that P_n(K) \geq 1 - \epsilon for all n, which translates to uniform boundedness (e.g., \lim_{a \to \infty} \limsup_n P_n(\|x\|_\infty \geq a) = 0) and control on the modulus of continuity (e.g., \lim_{\delta \to 0} \limsup_n P_n(w(x, \delta) \geq \epsilon) = 0 for all \epsilon > 0, where w(x, \delta) = \sup_{|t_1 - t_2| \leq \delta} |x(t_1) - x(t_2)|).

The synergy arises because Arzelà–Ascoli supplies the structural conditions for compactness in function spaces, allowing Prokhorov's theorem to yield relative compactness of tight families on C[0,1]. Specifically, equicontinuity ensures that functions do not oscillate wildly, while uniform boundedness prevents divergence to infinity, enabling the identification of compact sets that carry most of the mass. This connection is pivotal in applications such as Donsker's invariance principle, where tightness of measures on C[0,1] is verified using these criteria to establish weak convergence to Brownian motion. A concrete example is the weak convergence of approximations to Brownian paths, such as scaled random walks, where tightness is checked by bounding the probability that the modulus of continuity exceeds a threshold, leveraging Arzelà–Ascoli to confirm tightness and then applying Prokhorov's theorem to obtain convergence to the Wiener measure.

Unlike Prokhorov's theorem, which applies generally to any complete separable metric space without specifying the structure of its compact sets, the Arzelà–Ascoli theorem is tailored to spaces of continuous functions, providing explicit functional-analytic criteria that are essential for probabilistic arguments in infinite-dimensional settings.
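The modulus of continuity that drives the tightness criterion on C[0,1] is easy to evaluate on a grid. The sketch below is an illustration only: `grid_modulus` and `scaled_random_walk` are hypothetical helpers, the path is sampled at the points i/n, and the maximum is taken over grid points rather than over all of [0,1], so it is an approximation of w(x, \delta) for a general continuous path.

```python
import math
import random

def grid_modulus(values, delta):
    """Grid version of w(x, delta): the maximum of |x(s) - x(t)| over grid
    points s, t with |s - t| <= delta, where values[i] = x(i / n)."""
    n = len(values) - 1
    max_lag = int(math.floor(delta * n + 1e-9))
    best = 0.0
    for lag in range(1, max_lag + 1):
        for i in range(len(values) - lag):
            best = max(best, abs(values[i + lag] - values[i]))
    return best

def scaled_random_walk(n, rng):
    """Values at the grid points i / n of a simple +1/-1 random walk
    rescaled by 1 / sqrt(n), as in Donsker-type approximations."""
    path, s = [0.0], 0
    for _ in range(n):
        s += rng.choice((-1, 1))
        path.append(s / math.sqrt(n))
    return path

if __name__ == "__main__":
    rng = random.Random(0)
    walk = scaled_random_walk(400, rng)
    for delta in (0.2, 0.05, 0.01):
        print(delta, grid_modulus(walk, delta))
```

In a tightness proof one would bound the probability that this quantity exceeds a threshold uniformly in n; the code only evaluates it along a single simulated path.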

References

  1. Yu. V. Prokhorov. Convergence of Random Processes and Limit Theorems in Probability Theory. https://doi.org/10.1137/1101016
  2. Probability measures on metric spaces (PDF).
  3. Weak Convergence of Measures. UC Davis Math (PDF).
  4. Weak Convergence of Probability Measures. arXiv (PDF), Jul 20, 2020.
  5. Convergence of Probability Measures. CERMICS (PDF).
  6. Probability theory II, Exercise Sheet 6 (PDF), Nov 13, 2019.
  7. An expository note on Prohorov metric and Prohorov Theorem (PDF), Jan 20, 2021.
  8. Exploring the Foundations of the Central Limit Theorem (PDF).
  9. Weak Convergence of Probabilities on Nonseparable ...
  10. A note on the Gromov-Hausdorff-Prokhorov distance ... CERMICS (PDF).
  11. Lecture 21: Tightness of measures. MIT OpenCourseWare (PDF), Nov 27, 2013.