The moment problem is a central question in mathematical analysis and probability theory. It asks whether a given infinite sequence of real numbers \{s_n\}_{n=0}^\infty can be realized as the moments of a positive Borel measure \mu on the real line, meaning s_n = \int_{-\infty}^{\infty} x^n \, d\mu(x) for each nonnegative integer n, and seeks to characterize the conditions under which such a representing measure exists and is unique.[1]

The problem originated in the late 19th century with Thomas Joannes Stieltjes's work on continued fractions, where he investigated the representation of sequences as moments of measures supported on the nonnegative reals, laying the groundwork for what is now known as the Stieltjes moment problem.[2] In the early 1920s, Hans Hamburger extended the framework to measures on the entire real line, formulating the Hamburger moment problem, while Felix Hausdorff addressed the case of compact support intervals, such as [0,1], in the Hausdorff moment problem.[3][1]

The solvability of these problems hinges on the positive semidefiniteness of the Hankel matrices formed from the sequence s_n, which guarantees the existence of at least one representing measure; the Hamburger case requires this condition for the original sequence, while the Stieltjes case requires it for both the original and the shifted sequence. A key distinction arises between determinate and indeterminate cases: a moment sequence is determinate if it corresponds to a unique measure, often verified by Carleman's criterion \sum_{n=1}^\infty s_{2n}^{-1/(2n)} = \infty, whereas indeterminate sequences admit infinitely many measures, leading to rich structures such as the N-extremal solutions.[4]

Beyond existence and uniqueness, the moment problem connects deeply to orthogonal polynomials: the moments define the inner product for polynomials orthogonal with respect to \mu, facilitating applications in quadrature formulas, spectral theory of operators, and approximation theory. It also plays a crucial role in modern optimization, such as semidefinite programming for polynomial inequalities and moment-based relaxations in control theory and finance, as well as in reconstructing distributions from partial moment data in statistics and physics.[5][6]
Overview and History
Definition and Basic Concepts
The moment problem in mathematics concerns the inversion of the mapping that associates a measure \mu on a suitable space with its sequence of moments, defined as m_n = \int x^n \, d\mu(x) for nonnegative integers n, where the integral is taken over the support of \mu, such as the real line \mathbb{R} or a subset thereof. The central goal is to determine whether a given sequence (m_n)_{n=0}^\infty of real numbers corresponds to the moments of some measure \mu, and if so, to identify or characterize such measures. This inversion problem arises in probability theory, approximation theory, and the theory of orthogonal polynomials, where recovering the underlying distribution from its power moments provides insight into the measure's properties.

A key prerequisite is the notion of a Borel measure, a positive measure on the Borel \sigma-algebra of a topological space; the measure must have compact support or sufficiently fast decay at infinity for the integrals defining the moments to be finite and well-defined. For a sequence (m_n) to qualify as a moment sequence, it must exhibit positive definiteness, meaning that for every finite n and all real coefficients \xi_0, \dots, \xi_n, the quadratic form \sum_{k,l=0}^n m_{k+l} \xi_k \xi_l \geq 0. This condition reflects the positivity of the measure and ensures the sequence can potentially arise from an integral representation. Classical variants of the moment problem, such as the Hamburger problem on \mathbb{R} or the Stieltjes problem on [0, \infty), impose domain restrictions on the support of \mu.

Moment problems are classified as determinate or indeterminate based on the uniqueness of the representing measure. A problem is determinate if the given moment sequence corresponds to exactly one measure \mu, implying that the moments fully characterize the distribution; otherwise, it is indeterminate, allowing multiple distinct measures to share the same moments. This distinction is crucial, as indeterminacy can lead to a continuum of solutions, often parameterized by their behavior at infinity or via orthogonal polynomial expansions.

A representative example is the moment sequence of the standard normal distribution \mu with density \frac{1}{\sqrt{2\pi}} e^{-x^2/2} on \mathbb{R}, where m_n = 0 for odd n by symmetry, and for even n = 2k, m_{2k} = (2k-1)!! = 1 \cdot 3 \cdot 5 \cdots (2k-1), so that m_0 = 1, m_2 = 1, and m_4 = 3. This sequence is determinate: the normal distribution is uniquely recovered from its moments.
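The forward direction of this mapping is easy to compute directly. As a minimal sketch in Python (assuming numpy and scipy are available), the following compares the closed-form normal moments (n-1)!! against numerical integration, illustrating the map that the moment problem seeks to invert:

```python
import numpy as np
from scipy.integrate import quad

# Moments of the standard normal: m_n = 0 for odd n, m_n = (n-1)!! for even n.
def normal_moment_closed_form(n: int) -> float:
    if n % 2 == 1:
        return 0.0
    return float(np.prod(np.arange(1, n, 2)))  # (n-1)!! = 1*3*...*(n-1); empty product is 1

def normal_moment_numeric(n: int) -> float:
    integrand = lambda x: x**n * np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    value, _ = quad(integrand, -np.inf, np.inf)
    return value

for n in range(7):
    print(n, normal_moment_closed_form(n), round(normal_moment_numeric(n), 6))
```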
Historical Development
The moment problem traces its origins to the late 19th century, when Pafnuty Chebyshev and Andrey Markov investigated inequalities involving sequences of moments, laying foundational groundwork for understanding the conditions under which such sequences could arise from probability distributions. Chebyshev's work in the 1860s and 1870s on probabilistic inequalities, extended by Markov in the 1880s and 1890s, focused on bounds for higher-order moments and their implications for extremal problems in approximation theory.[7]

A pivotal advancement came in 1894 with Thomas Joannes Stieltjes' formulation of the problem in the context of continued fractions, where he sought to determine whether a given sequence of positive real numbers could serve as moments of a positive measure supported on the non-negative real line. Stieltjes provided necessary and sufficient conditions for the existence and uniqueness of such measures, linking the problem to the convergence of associated continued fractions. His results were posthumously compiled in a 1918 memoir within his collected works, which became a cornerstone reference for subsequent developments.[8]

In the early 1920s, the problem expanded to broader domains. Hans Hamburger generalized Stieltjes' approach in 1920–1921 to measures on the entire real line, establishing conditions via positive definiteness of Hankel matrices for the existence of representing measures. Concurrently, Felix Hausdorff in 1921 addressed the case of measures on the unit interval [0,1], introducing criteria based on completely monotonic sequences that ensured the sequence was a moment sequence for a unique measure on that bounded support.[9]

The 1943 monograph by J. A. Shohat and J. D. Tamarkin synthesized these classical results and delved into indeterminate cases, where multiple measures share the same moments, providing tools for parametrizing all solutions and advancing the theory of orthogonal polynomials associated with such problems.[10]

Post-World War II, Mark Grigorievich Krein made significant contributions in the 1940s and 1950s, developing canonical representations for solutions in indeterminate scenarios through the lens of self-adjoint extensions of differential operators and the theory of canonical systems, which facilitated deeper insights into the structure of all possible measures.

In the 2000s, computational approaches gained prominence, with Jean B. Lasserre establishing connections between the moment problem and semidefinite programming hierarchies, enabling numerical solutions to generalized moment problems and polynomial optimization via relaxations that converge to exact representations under suitable conditions.
Classical Formulations
Hamburger Moment Problem
The Hamburger moment problem concerns the recovery of a positive Borel measure \mu on the entire real line \mathbb{R} from its sequence of moments (m_n)_{n=0}^\infty, where m_n = \int_{-\infty}^{\infty} x^n \, d\mu(x) for each n \geq 0. Formulated by Hans Hamburger in 1920 as an extension of the Stieltjes moment problem to unbounded domains, it seeks to determine whether such a measure exists for a given real sequence and, if so, whether it is unique. A necessary and sufficient condition for existence is the positive semidefiniteness of the associated Hankel matrices \Gamma_n = (m_{i+j})_{0 \leq i,j \leq n}.[11]

This problem is intimately connected to the theory of orthogonal polynomials on unbounded intervals, where the moments define the inner product \langle p, q \rangle = \int_{-\infty}^{\infty} p(x) q(x) \, d\mu(x) for polynomials p and q. The monic orthogonal polynomials (p_n)_{n=0}^\infty satisfy a three-term recurrence relation derived from the moments, and their roots provide Gaussian quadrature nodes for approximating integrals with respect to \mu. In contrast to bounded-support cases, the lack of compactness on \mathbb{R} allows for the possibility of indeterminate problems, where multiple measures share the same moments.[12]

A classic determinate example is the standard Gaussian measure d\mu(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx, whose moments m_n = \int_{-\infty}^{\infty} x^n \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx satisfy m_n = 0 for odd n and m_{2k} = (2k-1)!! for even n = 2k; these uniquely determine \mu, as verified by Carleman's condition on moment growth.[13]

The absence of compact support in the Hamburger setting introduces significant challenges, including the potential for non-uniqueness even when a representing measure exists, as the unbounded domain permits measures with tails that produce identical moments but differ in their distributions. This indeterminacy complicates applications in approximation theory and spectral analysis, where identifying the correct measure requires additional constraints beyond the moments alone.[12]
Stieltjes Moment Problem
The Stieltjes moment problem seeks to determine whether there exists a positive Borel measure \mu on the non-negative real line [0, \infty) such that a given real sequence \{m_n\}_{n=0}^\infty satisfies

m_n = \int_0^\infty x^n \, d\mu(x)

for all n \geq 0. This formulation restricts the support of \mu to the half-line, distinguishing it from problems allowing negative values; when probability measures are considered, \mu is normalized by m_0 = \int_0^\infty d\mu(x) = 1. The problem originates from Thomas Joannes Stieltjes' investigations into the convergence of continued fractions, where he sought conditions under which sequences of coefficients correspond to integrals against positive measures on [0, \infty).[14]

A key aspect of the Stieltjes moment problem lies in its connections to integral transforms. The exponential generating function \sum_{n=0}^\infty m_n s^n / n! coincides with the moment generating function \int_0^\infty e^{sx} \, d\mu(x), which is the Laplace transform \mathcal{L}\mu(s) = \int_0^\infty e^{-sx} \, d\mu(x) evaluated at -s, providing a tool for inversion and uniqueness analysis. Stieltjes further linked the problem to continued fractions of the form

\frac{1}{1 + \cfrac{a_1 s}{1 + \cfrac{a_2 s}{1 + \ddots}}},

where the partial numerators a_k > 0 are derived from the moments via recurrence relations, and convergence of these fractions yields approximations to the Stieltjes transform \int_0^\infty \frac{d\mu(x)}{x + s} for \Re(s) > 0. These connections facilitate numerical solutions and theoretical extensions, such as quadrature formulas for approximating integrals.[14][15]

Existence of a solution \mu is characterized by the positive semidefiniteness of the Hankel matrices formed by the moments and by their shifts, ensuring the sequence admits a representing measure on [0, \infty). For instance, the moments of the exponential distribution with rate parameter \lambda = 1, given by m_n = n!, arise from the measure d\mu(x) = e^{-x} \, dx on [0, \infty). This case is determinate, meaning the measure is uniquely recovered from the moments, as verified by the Stieltjes form of Carleman's condition, \sum_{n=1}^\infty m_n^{-1/(2n)} = \infty.[16][17]

Despite the restriction to non-negative support, the Stieltjes moment problem admits indeterminate cases, where infinitely many distinct measures share the same moments. Stieltjes himself identified early examples of such indeterminacy through continued fraction analysis. A prominent illustration involves the log-normal distribution, whose moments m_n = \exp(n^2 \sigma^2 / 2 + n \mu) for parameters \mu \in \mathbb{R}, \sigma > 0 can be shared by perturbed measures; in the standard case \mu = 0, \sigma = 1, multiplying the density by 1 + \varepsilon \sin(2\pi \ln x) for \varepsilon \in [-1, 1] yields a family of measures \mu_\varepsilon all matching the original moments but differing as distributions. This indeterminacy highlights the problem's subtlety, even on the half-line, and underscores the role of entire functions in parametrizing all solutions via Nevanlinna theory.[14][18]
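For the determinate exponential example above, a short numerical check (a sketch assuming numpy and scipy) confirms m_n = n! and the divergence of the Stieltjes-Carleman sum:

```python
import numpy as np
from math import factorial, lgamma, exp
from scipy.integrate import quad

# Moments of d mu(x) = e^{-x} dx on [0, inf): m_n = n!.
for n in range(6):
    m_n, _ = quad(lambda x, n=n: x**n * np.exp(-x), 0, np.inf)
    print(n, round(m_n, 4), factorial(n))

# Stieltjes-Carleman sum: sum_n (n!)^{-1/(2n)}.  Since (n!)^{1/(2n)} ~ sqrt(n/e),
# the terms decay like n^{-1/2}, so the partial sums diverge (determinacy).
terms = [exp(-lgamma(n + 1) / (2 * n)) for n in range(1, 2001)]  # log-space avoids overflow
partial = np.cumsum(terms)
print(partial[[9, 99, 999, 1999]])  # keeps growing without bound
```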
Hausdorff Moment Problem
The Hausdorff moment problem seeks to determine a positive Borel measure \mu on the compact interval [0,1] such that its power moments satisfy m_n = \int_0^1 x^n \, d\mu(x) for n = 0, 1, 2, \dots, where typically m_0 = 1 for a probability measure.[3] This formulation, introduced by Felix Hausdorff in 1921 in a two-part paper published in Mathematische Zeitschrift, addresses the case of bounded support, distinguishing it from problems on unbounded domains.[19][20] A solution exists if and only if the sequence (m_n)_{n \geq 0} is completely monotonic, meaning that (-1)^k \Delta^k m_n \geq 0 for all k, n \geq 0, where \Delta denotes the forward difference operator \Delta m_n = m_{n+1} - m_n and \Delta^{k+1} m_n = \Delta(\Delta^k m_n).[3]

Unlike moment problems on unbounded intervals, the Hausdorff problem is always determinate: if a solution exists, the measure \mu is unique. This follows from the Weierstrass approximation theorem, which guarantees that polynomials are dense in the space C[0,1] of continuous functions with the uniform norm, combined with the Riesz representation theorem, ensuring that two measures agreeing on all polynomials must coincide.[3][21]

Explicit recovery of the measure often proceeds through finite differences of the moments. Bernstein polynomials also play a key role in approximations: for a density g on [0,1], the Bernstein approximants B_{n,g}(x) = \sum_{j=0}^n g(j/n) \binom{n}{j} x^j (1-x)^{n-j} converge uniformly to g(x) as n \to \infty, facilitating moment-based reconstructions.[19]

Representative examples include the uniform measure on [0,1], with moments m_n = \int_0^1 x^n \, dx = \frac{1}{n+1}, which satisfies the complete monotonicity condition since (-1)^k \Delta^k m_0 = 1/(k+1).[3] More generally, the beta distributions on [0,1] with shape parameters \alpha > 0, \beta > 0 provide a parametric family, where m_n = \frac{B(\alpha + n, \beta)}{B(\alpha, \beta)} = \prod_{k=0}^{n-1} \frac{\alpha + k}{\alpha + \beta + k}, and their moments uniquely determine the parameters via the determinate nature of the problem.[19]
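The complete monotonicity criterion is straightforward to test numerically. A minimal sketch (assuming numpy) checks it for the uniform-measure moments m_n = 1/(n+1) and verifies the closed form (-1)^k \Delta^k m_0 = 1/(k+1) quoted above:

```python
import numpy as np

# Moments of the uniform measure on [0, 1]: m_n = 1/(n+1).
N = 12
m = np.array([1.0 / (n + 1) for n in range(2 * N + 1)])

firsts = []
row = m.copy()
for k in range(N + 1):
    # Complete monotonicity: (-1)^k * Delta^k m_n >= 0 for every n.
    assert np.all((-1) ** k * row >= -1e-12), f"fails at k={k}"
    firsts.append((-1) ** k * row[0])
    row = np.diff(row)  # apply the forward difference Delta once more

# Closed form for the uniform measure: (-1)^k Delta^k m_0 = 1/(k+1).
print(np.allclose(firsts, [1.0 / (k + 1) for k in range(N + 1)]))  # True
```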
Trigonometric Moment Problem
The trigonometric moment problem seeks to determine whether a given bi-infinite sequence of complex numbers (m_n)_{n \in \mathbb{Z}} can be represented as the Fourier coefficients of a positive Borel measure \mu on the interval [0, 2\pi), satisfying

m_n = \int_0^{2\pi} e^{in\theta} \, d\mu(\theta)

for all integers n, where \mu has finite total variation. By the Herglotz (Carathéodory-Toeplitz) theorem, such a measure exists if and only if every finite Toeplitz matrix (m_{j-k})_{0 \leq j,k \leq N} is positive semidefinite. This formulation arises naturally in contexts involving periodic phenomena, as the moments m_n capture the correlation structure of the measure through its exponential transforms. Unlike the real-line moment problems, this setup incorporates the periodicity of the domain, leading to a closed, compact support on the circle.

A key feature of the trigonometric moment problem is its equivalence to the power moment problem on the unit circle in the complex plane. Setting z = e^{i\theta} maps the integral to

m_n = \int_{|z|=1} z^n \, d\nu(z),

where \nu is the image measure induced by \mu under the exponential map, supported on the unit circle \mathbb{T} = \{ z \in \mathbb{C} : |z| = 1 \}. This perspective highlights the problem's connection to complex analysis and Laurent polynomials, facilitating the use of tools from Hardy spaces and analytic continuation. The equivalence preserves the positivity and finite moment conditions, allowing solutions on the circle to inform trigonometric representations and vice versa.

Central to solving the trigonometric moment problem are Szegő polynomials, the orthogonal polynomials associated with the measure \mu on the unit circle. These polynomials \Phi_n(z) satisfy the orthogonality relations \int_0^{2\pi} \Phi_m(e^{i\theta}) \overline{\Phi_n(e^{i\theta})} \, d\mu(\theta) = \delta_{mn} h_n, and their coefficients are recursively determined from the moments m_n via the Szegő recurrence. They provide a canonical basis for approximation and enable the construction of the reproducing kernel for the associated Hardy space, which is essential in truncated settings, where only finitely many moments are prescribed and many extensions exist. Szegő polynomials thus bridge the moment sequence to explicit quadrature and prediction formulas.

The trigonometric moment problem has significant applications in modeling cyclic processes, such as periodic stationary time series or circular data in signal processing, where the moments correspond to autocovariances of cyclostationary signals. In these contexts, recovering \mu from m_n allows for spectral estimation and filtering of periodic components, with Szegő polynomials facilitating minimum-phase representations in digital signal processing tasks like autoregressive modeling on the circle.

Uniqueness of the representing measure is automatic in the full trigonometric problem: trigonometric polynomials are dense in the continuous functions on the circle, so two measures with the same Fourier coefficients must coincide. The Fejér-Riesz theorem complements this picture by guaranteeing that every positive trigonometric polynomial factors as the squared modulus of the boundary values of a polynomial analytic in the unit disk, which yields the spectral (outer) factorization of the associated Toeplitz form and underlies minimum-phase constructions. This determinacy on the circle contrasts with the indeterminate cases that can occur on the real line.
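The Toeplitz positivity condition can be checked directly. The sketch below (assuming numpy and scipy) computes the trigonometric moments of the measure d\mu(\theta) = (1 + \cos\theta)\,d\theta/(2\pi) on a periodic grid and verifies that the resulting Hermitian Toeplitz matrix is positive semidefinite, as the Carathéodory-Toeplitz criterion requires:

```python
import numpy as np
from scipy.linalg import toeplitz

# d mu(theta) = (1 + cos(theta)) d theta / (2 pi): m_0 = 1, m_{+-1} = 1/2, m_n = 0 otherwise.
K = 4096
theta = np.linspace(0, 2 * np.pi, K, endpoint=False)
weight = 1 + np.cos(theta)

N = 5
# Equal-weight averaging on a periodic grid is spectrally accurate here.
m = np.array([np.mean(np.exp(1j * n * theta) * weight) for n in range(N + 1)])

T = toeplitz(m)  # Hermitian Toeplitz matrix (m_{j-k}); scipy fills m_{-n} with conj(m_n)
print(np.round(m.real, 6))              # ~ [1, 0.5, 0, 0, 0, 0]
print(np.linalg.eigvalsh(T) >= -1e-10)  # all True: positive semidefinite
```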
Existence Conditions
Positive Definiteness of Hankel Matrices
In the context of the Hamburger moment problem, the Hankel matrix of order n+1 associated with a sequence of real numbers (m_k)_{k=0}^\infty is defined as

H_n = (m_{i+j})_{i,j=0}^n,

where the entry in row i and column j (starting from 0) is the moment m_{i+j}. This matrix captures the bilinear form induced by the moments on the space of polynomials of degree at most n.[22]

A necessary and sufficient condition for the existence of a positive Borel measure \mu on \mathbb{R} such that m_k = \int_{-\infty}^{\infty} x^k \, d\mu(x) for all k \geq 0 is that every Hankel matrix H_n (for n = 0, 1, 2, \dots) is positive semi-definite. That is, for every n, \mathbf{v}^T H_n \mathbf{v} \geq 0 for all vectors \mathbf{v} \in \mathbb{R}^{n+1}. This algebraic condition ensures the moments are compatible with a representing measure and was established by Hamburger in his foundational work on the extension of the Stieltjes moment problem to the entire real line.[22]

The proof of this condition proceeds by constructing an inner product on the space of polynomials via the quadratic forms defined by the Hankel matrices. Specifically, the positive semidefiniteness of H_n for all n allows a Cholesky decomposition H_n = L_n L_n^T, where L_n is a lower triangular matrix with positive diagonal entries (assuming strict positivity for simplicity in the initial steps). This decomposition corresponds to the Gram matrix of the monomial basis \{1, x, \dots, x^n\} under an inner product \langle p, q \rangle = \int p(x) q(x) \, d\mu(x), and the entries of L_n yield the coefficients for orthogonal polynomials P_k(x) via the Gram-Schmidt process applied to the monomials. These orthogonal polynomials P_n(x) satisfy \langle P_k, P_m \rangle = \delta_{km}, with leading coefficients and norms derived from ratios of determinants of the Hankel matrices, such as \|P_n\|^2 = \det H_n / \det H_{n-1}. The consistency across all degrees ensures the inner product extends to a positive linear functional on continuous functions, which by the Riesz representation theorem (as referenced in Haviland's topological criterion) corresponds to integration against a representing measure.[22][23]

For an illustrative example, consider a discrete measure with finite support, such as \mu = \frac{1}{2} \delta_{-1} + \frac{1}{2} \delta_{1}, which has moments

m_n = \int x^n \, d\mu(x) = \begin{cases} 1 & n \text{ even}, \\ 0 & n \text{ odd}. \end{cases}

The corresponding Hankel matrices are H_0 = (1), H_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, H_2 = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}, and so on; each H_n is positive semi-definite with non-negative eigenvalues, while the rank stays at most 2 for higher n, reflecting the finite support. This verifies the existence condition while demonstrating how the matrices capture the measure's structure without requiring infinite rank.[22]
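This worked example can be reproduced in a few lines (a sketch assuming numpy and scipy), building each H_n from the moment sequence and inspecting its spectrum and rank:

```python
import numpy as np
from scipy.linalg import hankel

# Moments of mu = (1/2) delta_{-1} + (1/2) delta_{+1}: m_n = 1 for even n, 0 for odd n.
m = np.array([((-1) ** n + 1) / 2 for n in range(9)])  # m_0 .. m_8

for n in range(4):
    H = hankel(m[: n + 1], m[n : 2 * n + 1])  # H_n = (m_{i+j})_{i,j=0..n}
    eig = np.linalg.eigvalsh(H)
    print(n, np.round(eig, 6), "rank", np.linalg.matrix_rank(H))
# Every H_n is positive semidefinite, and the rank saturates at 2,
# matching the two support points of the discrete measure.
```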
Haviland's Theorem and Representations
Haviland's theorem provides a functional analytic criterion for the existence of a representing measure for a positive linear functional defined on the space of polynomials, serving as a cornerstone for solving the moment problem in its general form. Specifically, let K \subseteq \mathbb{R}^n be a closed subset, and let L: \mathbb{R}[X_1, \dots, X_n] \to \mathbb{R} be a linear functional with L(1) > 0. Then there exists a positive Radon measure \mu on K such that L(p) = \int_K p \, d\mu for all polynomials p if and only if L(p) \geq 0 whenever p \geq 0 on K. This characterization, established by E. K. Haviland in 1935 and 1936, generalizes earlier one-dimensional results and applies to multivariate settings, ensuring the functional arises from moments of a measure supported on K.[24]

The theorem is intimately connected to the Riesz representation theorem, which asserts that every positive linear functional on the space of continuous functions with compact support C_c(K) can be represented by integration against a unique positive Radon measure on K. Haviland's result bridges the gap by working with the polynomials inside C(K) (or C_c(K) for non-compact K), where the positivity condition on K guarantees a continuous extension of L to C_c(K), thereby invoking the Riesz representation to yield the measure. This extension requires growth control implicit in the moment sequence, distinguishing it from mere positivity on all polynomials.[24]

In the context of the moment problem, Haviland's theorem also covers indeterminate cases, where a given moment sequence admits multiple representing measures. Here, the positive linear functional L defined by the moments satisfies the theorem's condition for some support K, but the extension to C_c(K) may not be unique, leading to distinct measures that agree on polynomials yet differ on higher-order continuous functions. For instance, in the classical Stieltjes moment problem on [0, \infty), indeterminate sequences like those from the log-normal distribution allow multiple extensions of L from polynomials to continuous functions, each corresponding to a different representing measure supported on [0, \infty). This multiplicity underscores that existence does not imply uniqueness, with the theorem providing only the foundational guarantee of at least one such measure.[24]

A concrete example arises in the Hausdorff moment problem, where K is a compact interval, say [0, 1], and the functional L is initially defined on polynomials via a moment sequence. If L(p) \geq 0 for all polynomials p \geq 0 on [0, 1] (a condition generalizing the positive definiteness of Hankel matrices from earlier considerations), Haviland's theorem ensures L extends to a positive functional on C[0, 1], represented by a unique probability measure on [0, 1] by the Riesz representation theorem on compact sets. This extension process highlights how moment sequences on compact supports can be realized as integrals against atomic or absolutely continuous measures, facilitating numerical approximations in applications.
Uniqueness and Indeterminacy
Criteria for Determinacy
The determinacy of the moment problem refers to the existence of a unique probability measure consistent with a given sequence of moments, assuming existence has been established. Various criteria provide sufficient conditions for this uniqueness, differing by the support of the measure: unbounded for the Hamburger and Stieltjes problems, or compact for the Hausdorff problem. These criteria often involve the growth rate of the moments, balancing rapid growth, which can lead to indeterminacy, against slower growth favoring uniqueness.

A prominent sufficient condition for determinacy in both the Hamburger and Stieltjes moment problems is Carleman's condition, which assesses the divergence of a series derived from the moments. For the Hamburger moment problem on \mathbb{R}, if \sum_{n=1}^\infty m_{2n}^{-1/(2n)} = \infty, where m_{2n} are the even moments, then the problem is determinate. This condition, introduced by Carleman in 1922, ensures that the moments do not grow too rapidly, preventing the proliferation of distinct measures with the same moments. For the Stieltjes moment problem on [0, \infty), the analogous condition is \sum_{n=1}^\infty m_n^{-1/(2n)} = \infty, adapted to the half-line setting. Carleman's criterion is particularly effective because it is checkable directly from the moment sequence and applies broadly to distributions with subexponential tail behavior.[25]

In contrast, conditions for indeterminacy highlight scenarios where multiple measures exist, often tied to rapid moment growth. Krein's condition provides a sufficient criterion for indeterminacy in the Hamburger case: if the distribution has a density f satisfying \int_{-\infty}^\infty \frac{-\log f(x)}{1 + x^2} \, dx < \infty, then the moment problem is indeterminate. This integral measures the "entropy-like" flatness of the density, allowing perturbations that preserve moments but alter the measure. An analogous condition holds for the Stieltjes case, involving \int_0^\infty \frac{-\log f(x^2)}{1 + x^2} \, dx < \infty, indicating growth rates at which uniqueness fails, as in heavy-tailed distributions. These growth-related criteria, developed by Krein in 1945, complement Carleman's by identifying boundaries where determinacy breaks down.[26]

For the Stieltjes moment problem specifically, Heyde's theorem offers insights into uniqueness on the half-line, particularly for distributions without mass at zero. If a determinate Stieltjes measure is perturbed by adding a small mass at the origin and renormalizing, the resulting sequence may become indeterminate if the smallest eigenvalues of the associated shifted Hankel matrices converge to a positive limit. This result, from Heyde's 1963 analysis, underscores how boundary behavior near zero can tip the balance toward non-uniqueness in half-line problems, even when the original measure is determinate. Heyde's work also famously showed that the log-normal distribution on [0, \infty) is indeterminate, providing a concrete example where the moments grow too fast to determine the measure uniquely.[27][28]

The criteria vary significantly across problem domains due to support constraints. In the Hausdorff moment problem on a compact interval like [0,1], the problem is always determinate whenever a representing measure exists, as compactness ensures that polynomials densely approximate continuous functions via the Stone-Weierstrass theorem, uniquely pinning down the measure from the moments.
This contrasts sharply with the unbounded cases, where indeterminacy arises from tail behavior and rapid moment growth, requiring explicit growth conditions like Carleman's for guarantees. No such additional criteria are needed for Hausdorff, making it the most robust for uniqueness among classical formulations.[27]
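To see Carleman's criterion separate these regimes numerically, the sketch below (assuming numpy) compares partial sums of m_{2n}^{-1/(2n)} for the determinate Gaussian moments m_{2n} = (2n-1)!! and the indeterminate log-normal moments m_{2n} = e^{2n^2}; the first diverges while the second converges:

```python
import numpy as np
from math import lgamma, exp, log

def gaussian_term(n: int) -> float:
    # m_{2n} = (2n-1)!! = (2n)! / (2^n n!), handled in log space to avoid overflow
    log_m = lgamma(2 * n + 1) - n * log(2) - lgamma(n + 1)
    return exp(-log_m / (2 * n))

def lognormal_term(n: int) -> float:
    # m_k = e^{k^2/2}, so m_{2n} = e^{2 n^2} and the Carleman term is e^{-n}
    return exp(-n)

ns = range(1, 5001)
print("gaussian :", sum(gaussian_term(n) for n in ns))   # ~ O(sqrt(N)): diverges
print("lognormal:", sum(lognormal_term(n) for n in ns))  # ~ 1/(e-1) ~ 0.582: converges
```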
Indeterminate Cases and Examples
In the context of the moment problem, a sequence of moments is said to be indeterminate if there exist multiple distinct positive measures (or probability distributions) that possess exactly those moments. This non-uniqueness typically arises on unbounded domains, such as in the Hamburger or Stieltjes formulations, where the growth of the moments is sufficiently rapid to allow for a continuum of solutions.[8]

A seminal example of an indeterminate Stieltjes moment problem is the log-normal distribution, first identified by Stieltjes in 1894. The standard log-normal density is f(x) = \frac{1}{x \sqrt{2\pi}} \exp\left( -\frac{(\ln x)^2}{2} \right) for x > 0, with all moments finite and equal to s_n = e^{n^2 / 2}. These moments grow too fast to satisfy uniqueness criteria such as Carleman's condition, and the log-normal indeed admits infinitely many alternative measures, such as the perturbed densities d\mu_c(x) = \left[1 + c \sin(2\pi \ln x)\right] f(x) \, dx for c \in [-1, 1], for which the density remains nonnegative and all moments coincide with those of f.[8]

Stieltjes also provided an explicit family of indeterminate measures on [0, \infty) resembling perturbed exponentials. Densities proportional to e^{-x^\alpha} for 0 < \alpha < 1/2 generate moment sequences that are indeterminate, as multiple measures (including ones with discrete components or further perturbations) reproduce the same moments. For \alpha = 1, the standard exponential density e^{-x} yields a determinate problem with moments s_n = n!, but reducing \alpha below 1/2 slows the decay just enough to allow alternative representations while preserving the moments.[8]

Another notable indeterminate example is Nörlund's continuous analog of the Poisson distribution, constructed via generalized hypergeometric functions and associated orthogonal polynomials, which produces a moment sequence on [0, \infty) shared by multiple densities due to the underlying continued fraction structure leading to non-unique inversions.[29]

In such indeterminate cases, the full set of representing measures can be systematically described using canonical representations, as developed by M. G. Krein, which parameterize the solutions via analytic functions in the complex plane and their boundary values.[8]
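The log-normal perturbation can be verified directly: after the substitution t = \ln x, each moment integral of the factor \sin(2\pi \ln x) against the standard log-normal density becomes a Gaussian integral that vanishes. A minimal numerical check (assuming numpy and scipy):

```python
import numpy as np
from scipy.integrate import quad

# Standard log-normal moments: s_n = e^{n^2/2}.  With t = ln x, the n-th moment of
# [1 + c sin(2 pi ln x)] f(x) dx becomes an integral of e^{nt} [1 + c sin(2 pi t)]
# against the standard normal density in t.
g = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

for n in range(4):
    base, _ = quad(lambda t, n=n: np.exp(n * t) * g(t), -np.inf, np.inf)
    pert, _ = quad(lambda t, n=n: np.exp(n * t) * np.sin(2 * np.pi * t) * g(t), -np.inf, np.inf)
    print(n, round(base, 4), round(float(np.exp(n**2 / 2)), 4), f"{pert:.1e}")
# base matches e^{n^2/2}; pert is ~0, so every mu_c shares the same moment sequence.
```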
Solution Methods
Orthogonal Polynomials and Recurrence
In the context of determinate moment problems, orthogonal polynomials offer a constructive method to approximate the underlying measure from its moments. Given a sequence of moments \mu_n = \int x^n \, d\mu(x) associated with a positive measure \mu on the real line, the monic orthogonal polynomials p_n(x) of degree n are constructed by applying the Gram-Schmidt orthogonalization process to the basis \{1, x, x^2, \dots\} with respect to the inner product \langle f, g \rangle = \int f(x) g(x) \, d\mu(x). This process yields polynomials satisfying \langle p_m, p_n \rangle = 0 for m \neq n and \langle p_n, p_n \rangle > 0, with the norms determined recursively from the moments via the Cholesky factorization of the Hankel matrix whose entries are the moments.[30]

The orthogonal polynomials satisfy a three-term recurrence relation, which provides an efficient way to generate them from the moments. Specifically, the monic versions obey

x p_n(x) = p_{n+1}(x) + a_n p_n(x) + b_n p_{n-1}(x),

where the coefficients a_n and b_n are real numbers with b_n \geq 0, expressible in terms of ratios of Hankel determinants: b_n = \Delta_{n-1} \Delta_{n+1} / (\Delta_n)^2, where \Delta_n = \det(H_n) and H_n is the n \times n Hankel matrix of moments. This recurrence, via Favard's theorem in its converse form, ensures that any such sequence arises from a unique measure when the moment problem is determinate.[30][31]

The Christoffel-Darboux formula further connects these polynomials to kernel functions useful for quadrature. For orthonormal polynomials q_n(x) = p_n(x) / \sqrt{\langle p_n, p_n \rangle} with leading coefficients k_n, the kernel is

K_n(x, y) = \sum_{k=0}^n q_k(x) q_k(y) = \frac{k_n}{k_{n+1}} \cdot \frac{q_{n+1}(x) q_n(y) - q_n(x) q_{n+1}(y)}{x - y},

which reproduces the orthogonal projection onto polynomials of degree at most n. This kernel plays a key role in quadrature by enabling the representation of weights and nodes.[30][32]

Orthogonal polynomials underpin Gaussian quadrature rules for approximating integrals against the measure \mu, which is particularly valuable in moment problems for numerical solution. The n-point Gaussian quadrature uses the zeros \tau_{G,\nu} of p_n(x) as nodes and weights \lambda_{G,\nu} = 1 / \sum_{k=0}^{n-1} q_k^2(\tau_{G,\nu}) (the Christoffel numbers), ensuring exactness for polynomials up to degree 2n-1. Thus, the first 2n moments determine a quadrature that exactly recovers them, providing a practical approximation to \mu even when the full measure is unknown.[33][34]
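These steps (Cholesky factorization of the Hankel matrix, recurrence coefficients, Jacobi matrix, nodes and weights) can be assembled into a compact Golub-Welsch-style routine. A sketch assuming numpy and scipy; the test moments are those of Lebesgue measure on [-1, 1], for which the two-point rule must reproduce Gauss-Legendre:

```python
import numpy as np
from scipy.linalg import cholesky

def gauss_from_moments(m):
    """n-point Gauss rule from moments m_0..m_{2n} (Golub-Welsch via Hankel Cholesky)."""
    n = (len(m) - 1) // 2
    H = np.array([[m[i + j] for j in range(n + 1)] for i in range(n + 1)], dtype=float)
    R = cholesky(H)                  # upper triangular, H = R^T R
    a = np.zeros(n)                  # diagonal recurrence coefficients
    b = np.zeros(max(n - 1, 0))      # off-diagonal coefficients
    a[0] = R[0, 1] / R[0, 0]
    for k in range(1, n):
        a[k] = R[k, k + 1] / R[k, k] - R[k - 1, k] / R[k - 1, k - 1]
        b[k - 1] = R[k, k] / R[k - 1, k - 1]
    J = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)   # Jacobi matrix
    nodes, V = np.linalg.eigh(J)
    weights = m[0] * V[0, :] ** 2    # Christoffel numbers from first eigenvector components
    return nodes, weights

# Moments of Lebesgue measure on [-1, 1]: m_k = 2/(k+1) for even k, 0 for odd k.
m = [2, 0, 2 / 3, 0, 2 / 5]
nodes, weights = gauss_from_moments(m)
print(np.round(nodes, 6), np.round(weights, 6))  # ~ [-0.57735, 0.57735], [1, 1]
```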
Formal Inversion Formulas
One approach to inverting the moment sequence to recover the underlying measure is through formal series expansions, particularly in the context of the Hamburger moment problem on the real line. A classic formal inversion formula, attributed to Borel, expresses the density of the measure as an infinite sum involving higher-order derivatives of the Dirac delta distribution. Specifically, if m_n = \int_{-\infty}^{\infty} x^n \, d\mu(x) are the moments, the density \rho(x) can be formally represented as

\rho(x) = \sum_{n=0}^{\infty} \frac{(-1)^n m_n}{n!} \delta^{(n)}(x),

where \delta^{(n)}(x) denotes the nth distributional derivative of the Dirac delta distribution. This expression derives from the formal Taylor series inversion of the characteristic function \phi(t) = \mathbb{E}[e^{itX}] = \sum_{n=0}^{\infty} \frac{(it)^n m_n}{n!}, which, when inverted term by term, yields the above sum in the sense of distributions. The formula provides a conceptual tool for understanding the relationship between moments and the measure but remains formal, as the series generally diverges and requires summation methods like Borel summation for interpretation in indeterminate cases.

For the Stieltjes moment problem, where the support is restricted to [0, \infty), inversion techniques leverage the connection to the Laplace transform. The moments m_n = \int_0^{\infty} x^n \, d\mu(x) determine the Laplace-Stieltjes transform F(s) = \int_0^{\infty} e^{-sx} \, d\mu(x) for \Re(s) > 0, whose asymptotic expansion as s \to 0^+ is \sum_{n=0}^{\infty} \frac{(-1)^n m_n}{n!} s^n. A key explicit inversion method is the Post-Widder formula, which recovers the density f(x) = \frac{d\mu}{dx} (assuming absolute continuity) via successive derivatives:

f(x) = \lim_{k \to \infty} \frac{(-1)^k}{k!} \left( \frac{k}{x} \right)^{k+1} F^{(k)} \left( \frac{k}{x} \right),

where F^{(k)} is the kth derivative of F. This formula, developed as part of the operational calculus for Laplace transforms, allows pointwise recovery of f(x) under suitable regularity conditions; note that F is completely monotone, i.e., (-1)^k F^{(k)}(s) \geq 0 for all k. It provides a practical, though computationally intensive, means to reconstruct the measure from its transform.

In determinate moment problems, where the measure is uniquely determined by the moments, analytic continuation offers another inversion strategy. The generating function, such as the Stieltjes transform G(z) = \int \frac{d\mu(x)}{z - x} for z \notin \mathbb{R}, admits an analytic continuation from its moment-derived power series to a larger domain in the complex plane. For determinate cases, this continuation uniquely identifies the measure via the inversion formula

d\mu(x) = -\frac{1}{\pi} \lim_{\epsilon \to 0^+} \Im\, G(x + i\epsilon) \, dx,

exploiting the boundary values on the real line. This method relies on the uniqueness criterion, ensuring the continued function's singularities correspond precisely to the support of \mu. It is particularly effective for Hamburger and Stieltjes problems where the moments grow slowly enough to permit continuation beyond the radius of convergence.

These formal inversion techniques, while elegant, exhibit significant limitations in indeterminate cases, where multiple measures share the same moments. The Borel series diverges pointwise and fails to select a unique measure, as the distributional sum does not converge in the space of tempered distributions without additional regularization.
Similarly, the Post-Widder limit need not single out a distinguished solution, and analytic continuations of the generating function extend non-uniquely, reflecting the N-extremal solutions parameterized by Pick-Nevanlinna functions. Thus, these methods succeed only when determinacy holds, often requiring supplementary conditions such as Carleman's criterion to guarantee convergence and uniqueness.
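As a worked instance of the Post-Widder formula (a sketch assuming numpy), take d\mu(x) = e^{-x}\,dx, whose transform F(s) = 1/(1+s) has F^{(k)}(s) = (-1)^k k!/(1+s)^{k+1}; substituting into the formula and simplifying gives the approximants f_k(x) = (1 + x/k)^{-(k+1)}, which visibly converge to e^{-x}:

```python
import numpy as np

# Post-Widder approximants for F(s) = 1/(1+s):
#   f_k(x) = (-1)^k/k! * (k/x)^{k+1} * F^{(k)}(k/x) = (1 + x/k)^{-(k+1)}
def post_widder_exponential(x: float, k: int) -> float:
    return (1.0 + x / k) ** (-(k + 1))

x = 1.0
for k in [1, 10, 100, 10_000]:
    print(k, post_widder_exponential(x, k))
print("target e^{-x}:", np.exp(-x))
```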
Variations and Generalizations
Truncated Moment Problem
The truncated moment problem arises in scenarios where only a finite number of moments, say the first 2n+1 or 2n+2, are given, and the goal is to determine the existence of measures supported on \mathbb{R} (Hamburger case), on [0, \infty) (Stieltjes case), or on a compact interval (Hausdorff case) that match these moments exactly.[35] This finite-dimensional variant is particularly relevant in computational mathematics and optimization, as it avoids the challenges of infinite sequences while enabling practical approximations.[36]

Existence of such a representing measure is characterized by the positive semidefiniteness of the associated Hankel (or moment) matrix M_n(y) = (\gamma_{i+j})_{0 \leq i,j \leq n}, where \gamma_k are the given moments, along with additional rank or range conditions to ensure consistency for odd or even truncation degrees.[35] For the even case up to degree 2n, this requires M_n(y) \succeq 0 and \operatorname{rank} M_n(y) = \operatorname{rank} [M_{n-1}(y) \mid c_n(y)], where c_n(y) is the column of shifted moments; similar conditions hold for the odd case.[35] If these hold, there exists an atomic representing measure with at most n+1 support points, corresponding to the roots of an orthogonal polynomial associated with the moments, which serves as a canonical discrete solution akin to Gaussian quadrature.[35]

Unlike the full infinite moment problem, where uniqueness may hold under suitable growth conditions, the truncated version generally lacks uniqueness, admitting infinitely many representing measures when the Hankel matrix has full rank.[35] To navigate this non-uniqueness in applications like polynomial optimization, moment hierarchies extend the truncated problem by iteratively refining moment sequences through semidefinite constraints, as developed by Lasserre, providing convergent approximations to global optima.[36]

Practical algorithms for the truncated problem often employ semidefinite programming to compute bounds on expectations, such as \inf \int p(t) \, d\mu(t) over measures \mu matching the given moments, by solving moment relaxations \inf \langle p, y \rangle subject to M_k(y) \succeq 0 for increasing k \geq n.[36] These SDP-based methods yield tight lower bounds that converge to the true infimum as the relaxation order increases, especially on compact supports, and are implemented in tools like GloptiPoly for numerical solution.[36]
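When the data come from a measure with finitely many atoms, the canonical representing measure can be recovered explicitly by a Prony-type computation: solve a Hankel system for the monic polynomial annihilating the support, take its roots as the atoms, then solve a Vandermonde system for the weights. A minimal sketch (assuming numpy):

```python
import numpy as np

def recover_atoms(m, n):
    """Recover an n-atom measure sum_i w_i delta_{t_i} from moments m_0..m_{2n-1}."""
    m = np.asarray(m, dtype=float)
    # The monic annihilator p(x) = x^n + c_{n-1} x^{n-1} + ... + c_0 satisfies,
    # for every shift j:  m_{n+j} + sum_k c_k m_{k+j} = 0,  i.e.  H c = -(m_n..m_{2n-1}).
    H = np.array([[m[i + j] for j in range(n)] for i in range(n)])
    c = np.linalg.solve(H, -m[n : 2 * n])
    atoms = np.roots(np.concatenate(([1.0], c[::-1])))   # coefficients highest degree first
    V = np.vander(atoms, n, increasing=True).T            # V[k, i] = t_i^k
    weights = np.linalg.solve(V, m[:n])
    order = np.argsort(atoms)
    return atoms[order], weights[order]

# Example: mu = 0.3 delta_{-1} + 0.7 delta_{2}
t, w = np.array([-1.0, 2.0]), np.array([0.3, 0.7])
m = [float(np.sum(w * t**k)) for k in range(4)]           # m_0..m_3
print(recover_atoms(m, 2))  # atoms ~ [-1, 2], weights ~ [0.3, 0.7]
```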
Multivariate and Infinite-Dimensional Cases
The multivariate moment problem seeks to determine a positive Borel measure \mu on \mathbb{R}^d from its sequence of moments m_\alpha = \int_{\mathbb{R}^d} x^\alpha \, d\mu(x), where \alpha = (\alpha_1, \dots, \alpha_d) \in \mathbb{N}^d is a multi-index and x^\alpha = x_1^{\alpha_1} \cdots x_d^{\alpha_d}. This generalization introduces tensor-product structures for the moments, leading to multivariate Hankel matrices (or moment matrices) whose entries are m_{\alpha + \beta} for multi-indices \alpha, \beta up to a given degree n. Positive semidefiniteness of these matrices is a necessary condition for the existence of a representing measure, analogous to the univariate case, but the higher-dimensional setting amplifies computational and theoretical complexities due to the exponential growth in the number of monomials.

Key challenges in the multivariate framework revolve around flat extensions of these moment matrices, which occur when a rank-preserving extension to higher degrees maintains positive semidefiniteness and reveals the variety (zero set) associated with the measure. Such extensions are pivotal for solving truncated problems, as they enable the identification of atomic measures and the resolution of the representing measure when the original matrix data is flat (i.e., the rank is preserved under extension).[37] In higher dimensions, determinacy, meaning uniqueness of the representing measure, proves elusive compared to one dimension, lacking simple criteria like Carleman's condition; instead, it often hinges on geometric properties of the support, such as virtual compactness when the support dimension is one, or the density of polynomials in weighted L^1 spaces for broader cases. These difficulties underscore the need for tools from real algebraic geometry, including Positivstellensätze, to certify existence and uniqueness.

The infinite-dimensional case extends the problem to operator-valued moments in Hilbert spaces, where moments are defined via expectations \operatorname{tr}(A_k \rho) or integrals against operator-valued measures, with A_k self-adjoint operators on a separable Hilbert space \mathcal{H}.[38] This formulation arises naturally in quantum mechanics, where it characterizes the spectral distribution of observables, such as position or momentum operators, through their moments in the state space, facilitating the reconstruction of quantum states from operator traces.[39] Operator moment problems in this context often involve non-commutative variables and von Neumann algebras, with determinacy linked to the essential self-adjointness of multiplication operators on the associated L^2 space.

A foundational result for these generalizations is Tchakaloff's theorem (with later extensions due to Richter and others), which asserts that a truncated moment functional of degree m on a compact semi-algebraic set in \mathbb{R}^d admits a representing measure supported on at most \binom{d + m}{d} atoms, ensuring finite atomic approximations for quadrature and optimization applications. This theorem, proven via convex separation and Carathéodory's principle, extends to the multivariate and operator settings, providing a bridge to numerical methods while highlighting the atomic nature of solutions in finite-dimensional approximations.[37]
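To illustrate the tensor-product structure in the bivariate case, the sketch below (assuming numpy) assembles the degree-1 moment matrix, indexed by the monomial basis \{1, x, y\}, for a two-atom measure in \mathbb{R}^2 and confirms the necessary positive semidefiniteness condition:

```python
import numpy as np

# Two-atom measure in R^2: mu = 0.4 delta_{(0,1)} + 0.6 delta_{(2,-1)}.
atoms = np.array([[0.0, 1.0], [2.0, -1.0]])
w = np.array([0.4, 0.6])

def moment(alpha):
    # m_alpha = int x^{alpha_1} y^{alpha_2} d mu
    return float(np.sum(w * atoms[:, 0] ** alpha[0] * atoms[:, 1] ** alpha[1]))

basis = [(0, 0), (1, 0), (0, 1)]   # multi-indices of the monomials 1, x, y
M1 = np.array([[moment((a[0] + b[0], a[1] + b[1])) for b in basis] for a in basis])
print(np.round(M1, 4))
print(np.linalg.eigvalsh(M1) >= -1e-10)   # all True: the necessary PSD condition holds
```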
Applications
In Probability and Statistics
In probability theory, the moment problem plays a crucial role in understanding the relationship between sequences of moments and the underlying probability distributions of random variables. A key result is the Fréchet-Shohat theorem: if the moment sequences of a sequence of probability measures converge pointwise, and the limiting moment sequence is determinate (uniquely specifying a measure), then the measures converge weakly to that limiting distribution. This theorem provides a foundational link between moment convergence and distributional convergence, essential for proving limit theorems where moments are easier to handle than full distributions.[40]

The method of moments, introduced by Karl Pearson, applies the moment problem to parameter estimation in statistical models by equating sample moments to theoretical population moments derived from the assumed distribution. For instance, to estimate parameters of a distribution such as the gamma or normal, one solves the system of equations formed by matching the first few sample moments (computed from data) to the corresponding expressions in terms of the parameters, yielding estimators that are often simple and robust, though potentially less efficient than maximum likelihood estimators for small samples. This approach is particularly useful for fitting distributions where explicit likelihoods are intractable, relying on the assumption that the moments uniquely determine the distribution when it is determinate.

The central limit theorem (CLT) further illustrates how moments govern asymptotic normality in probabilistic limits. Under finite second-moment conditions, the standardized sums of independent random variables converge in distribution to a standard normal, whose moments are fully determined by its mean (zero) and variance (one), with all higher moments fixed accordingly. Because the normal distribution is determinate, convergence of moments to the normal moments already implies convergence in distribution by the Fréchet-Shohat theorem, a fact exploited in moment-based proofs of the CLT; by contrast, for indeterminate laws such as the log-normal, convergence of moments alone cannot identify the limiting distribution.

In Bayesian statistics, indeterminate moment problems introduce challenges for prior specification, as multiple distributions compatible with observed moments can lead to non-unique posteriors, highlighting the brittleness of inference under moment constraints. When moments from data do not uniquely determine the likelihood, Bayesian updates can vary widely depending on the choice of prior within the indeterminate class, necessitating robust methods like moment-matching priors or sensitivity analysis to bound posterior uncertainty. This underscores the importance of verifying determinacy in applications involving partial moment information, such as empirical Bayes estimation.
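A standard instance of Pearson's method is fitting a gamma distribution by matching its first two moments; a minimal sketch (assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.gamma(shape=3.0, scale=2.0, size=100_000)

# Method of moments for Gamma(shape k, scale theta):
# E[X] = k*theta and Var[X] = k*theta^2, so theta = Var/E and k = E^2/Var.
mean, var = data.mean(), data.var()
theta_hat = var / mean
k_hat = mean ** 2 / var
print(round(k_hat, 3), round(theta_hat, 3))   # ~ 3.0, ~ 2.0
```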
In Approximation Theory and Quadrature
In approximation theory, the moment problem provides a foundational framework for constructing numerical methods to approximate integrals and functions, particularly through the determination of representing measures from moment sequences. Gaussian quadrature rules, which achieve high accuracy for polynomial integrands, rely on orthogonal polynomials derived from the moments of a weight function. Specifically, for a positive measure with moments m_k = \int_a^b x^k \, d\mu(x), the orthogonal polynomials p_n(x) satisfy a three-term recurrence relation obtained from the Hankel moment matrix, and their roots serve as the quadrature nodes x_i, with weights w_i = \int_a^b l_i(x) \, d\mu(x), where l_i(x) are the Lagrange basis polynomials at the nodes. This setup ensures that the n-point Gaussian quadrature exactly integrates polynomials of degree up to 2n-1, linking the solvability of the Hamburger or Stieltjes moment problem to the stability and convergence of the quadrature scheme.

The connection extends to the truncated moment problem, where finite moment sequences yield approximate Gaussian rules via positive semidefinite moment matrices, allowing computation of nodes and weights even for indeterminate cases by selecting canonical representing measures. For instance, in the Stieltjes case on [0, \infty), the quadrature nodes are the eigenvalues of a Jacobi matrix built from the recurrence coefficients, which are extracted from the continued fraction expansion of the generating function associated with the moments. This approach not only facilitates efficient computation but also ensures error minimization in the Christoffel-Darboux kernel sense.[41]

Padé approximants and continued fractions play a crucial role in approximating the Stieltjes transform S(z) = \int_0^\infty \frac{d\mu(t)}{z - t}, which encodes the moments via its expansion S(z) = \sum_{k=0}^\infty \frac{m_k}{z^{k+1}}. For the Stieltjes moment problem, the diagonal Padé approximants to S(z) correspond to convergents of the J-fraction

S(z) = \cfrac{1}{z - a_0 - \cfrac{b_1^2}{z - a_1 - \cfrac{b_2^2}{z - a_2 - \cdots}}},

where the coefficients a_k, b_k are determined from the moments through the recurrence of the orthogonal polynomials. The nth convergent is a rational function that reproduces the first 2n moments exactly and converges to S(z) outside the support of \mu, enabling precise approximation of the generating function and, by inversion, the measure itself. In indeterminate cases, different continued fraction truncations yield distinct representing measures, useful for selecting optimal approximations in spectral methods.[42]

For the Hausdorff moment problem on the bounded interval [0,1], Bernstein polynomials offer a constructive approximation scheme. Given moments m_k = \int_0^1 x^k \, d\mu(x), the Bernstein operator B_n(f; x) = \sum_{k=0}^n f\left( \frac{k}{n} \right) \binom{n}{k} x^k (1-x)^{n-k} approximates continuous functions f uniformly, and finite differences of the moments determine a discrete measure supported on \{0, 1/n, \dots, 1\} whose low-order moments approximate those of \mu. This yields a sequence of atomic measures converging weakly to \mu, providing a practical method to solve the moment problem via positive linear functionals on polynomials.
The saturation order of convergence is O(1/n), with explicit rates depending on the modulus of continuity of f, making it valuable for numerical inversion on compact supports.[43]

Error bounds in these approximations often arise from discrepancies in higher-order moments between the true measure and its approximant. For Gaussian quadrature with monic p_n, the remainder term satisfies |R_n(f)| \leq \frac{\|f^{(2n)}\|_\infty}{(2n)!} \int_a^b p_n(x)^2 w(x) \, dx, where deviations in tail moments quantify the impact of truncation in the moment sequence. Similarly, for Padé approximants, the error |S(z) - [n/n](z)| is bounded by the next moment discrepancy via the remainder in the continued fraction expansion, ensuring controlled approximation in regions away from the cut. In the Hausdorff case, moment discrepancies |m_k - \tilde{m}_k| for k > n bound the total variation distance between measures by O(\sqrt{k} |m_k - \tilde{m}_k|), providing rigorous guarantees for convergence in approximation algorithms. These bounds underscore the moment problem's utility in certifying the accuracy of numerical schemes without full knowledge of the underlying measure.[44]
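The Hausdorff-style discrete approximation can be computed directly from finite differences of the moments: the degree-n approximant places mass \binom{n}{k} (-1)^{n-k} \Delta^{n-k} m_k at the point k/n. A sketch (assuming numpy) for the uniform measure, whose approximant assigns equal mass 1/(n+1) to each grid point:

```python
import numpy as np
from math import comb

def hausdorff_approximation(m, n):
    """Discrete measure on {0, 1/n, ..., 1} with mass C(n,k)(-1)^{n-k} Delta^{n-k} m_k at k/n."""
    masses = [
        # sum_j (-1)^j C(n-k, j) m_{k+j} equals (-1)^{n-k} Delta^{n-k} m_k
        comb(n, k) * sum((-1) ** j * comb(n - k, j) * m[k + j] for j in range(n - k + 1))
        for k in range(n + 1)
    ]
    return np.arange(n + 1) / n, np.array(masses)

# Uniform measure on [0, 1]: m_k = 1/(k+1).
n = 6
m = [1.0 / (k + 1) for k in range(n + 1)]
points, masses = hausdorff_approximation(m, n)
print(np.round(masses, 6))      # every atom gets mass 1/(n+1)
# Low-order moments are reproduced approximately, improving as n grows:
print([round(float(np.sum(masses * points**k)), 4) for k in range(4)])  # -> 1, 1/2, 1/3, 1/4
```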
In Optimization and Control Theory
In optimization, the moment problem provides a foundational framework for solving nonconvex polynomial optimization problems through semidefinite programming (SDP) relaxations, particularly via the Lasserre hierarchy. This hierarchy addresses the minimization of a polynomial objective over a semi-algebraic set defined by polynomial inequalities by formulating a sequence of moment relaxations whose optimal values are lower bounds converging to the global minimum as the relaxation order increases. Each relaxation is a finite-dimensional SDP involving moment matrices, which represent candidate moment sequences consistent with the constraints; convergence is guaranteed when the description of the feasible set is Archimedean, with finite convergence holding in many cases of practical interest.

The dual perspective of these moment relaxations leverages sum-of-squares (SOS) representations to certify nonnegativity of polynomials on semi-algebraic sets, paired with positive moment functionals. A polynomial is nonnegative on such a set if it admits an SOS decomposition modulo the ideal and quadratic module generated by the constraints, which can be checked via SDP by ensuring the associated localizing matrices are positive semidefinite. This duality between the moment problem (primal) and SOS decompositions (dual) underpins the hierarchy's tightness, allowing extraction of global minimizers from flat moment matrices at sufficient orders.[45]

In robust control, moment-based approaches model uncertainty in system parameters or disturbances by optimizing over semi-algebraic sets that incorporate moment constraints, providing inner approximations to robust feasible regions. For instance, in problems involving universal quantifiers over uncertainty sets, the hierarchy constructs converging SDP relaxations for robust polynomial matrix inequalities, ensuring stability margins under polynomial uncertainty descriptions. This enables the design of controllers that guarantee performance despite distributional ambiguities captured by moments.[46]

Computational implementations of the moment-SOS hierarchy include GloptiPoly, a MATLAB toolbox that solves generalized moment problems by generating SDP relaxations with support for multiple measures and linear constraints, interfacing with solvers like SeDuMi. Similarly, YALMIP facilitates polynomial optimization through its moment module, enabling sparse and structured formulations for large-scale problems in control design. These post-2000 tools have made the approach broadly accessible for engineering applications.
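As a toy version of such a relaxation (a sketch using the cvxpy modeling library rather than GloptiPoly or YALMIP, and assuming cvxpy with an SDP-capable solver is installed), the order-2 moment relaxation of the unconstrained problem \min_x x^4 - x^2 is an SDP over candidate moments y_0, \dots, y_4 with the moment matrix constrained to be positive semidefinite; here the relaxation is exact because x^4 - x^2 + 1/4 is a square:

```python
import numpy as np
import cvxpy as cp

# Order-2 moment relaxation of min_x x^4 - x^2 (global minimum -1/4 at x = +-1/sqrt(2)).
# M[i, j] stands for the candidate moment y_{i+j} in the monomial basis (1, x, x^2).
M = cp.Variable((3, 3), symmetric=True)
constraints = [
    M >> 0,              # the moment matrix of any representing measure must be PSD
    M[0, 0] == 1,        # y_0 = 1: total mass one
    M[0, 2] == M[1, 1],  # Hankel consistency: both entries represent y_2
]
prob = cp.Problem(cp.Minimize(M[2, 2] - M[1, 1]), constraints)
prob.solve()
print(round(prob.value, 4))   # ~ -0.25: the SDP lower bound equals the global minimum
print(np.round(M.value, 4))   # e.g. moments of (delta_{-a} + delta_{a})/2 with a = 1/sqrt(2)
```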