In linear algebra, the Hilbert matrix is an n \times n square matrix H_n with entries h_{ij} = \frac{1}{i+j-1} for i,j = 1, \dots, n, where indices are 1-based.[1][2] Introduced by David Hilbert in 1894 in connection with the theory of Legendre polynomials, it serves as a fundamental example in numerical analysis due to its explicit construction and striking properties.[3][2]

The Hilbert matrix is symmetric (h_{ij} = h_{ji}), positive definite (x^T H_n x > 0 for all nonzero vectors x), and totally positive (every square submatrix has a positive determinant).[1] It is also a special case of a Hankel matrix, in which the anti-diagonals are constant.[1] Notably, its inverse admits an explicit formula: the (i,j)-th entry of H_n^{-1} is (-1)^{i+j} (i+j-1) \binom{n+i-1}{n-j} \binom{n+j-1}{n-i} \binom{i+j-2}{i-1}^2.[2]

A defining characteristic of the Hilbert matrix is its extreme ill-conditioning: the condition number \kappa(H_n) grows exponentially with n, roughly as \kappa(H_n) \approx \exp(3.5n) for large n.[4] This sensitivity to perturbations makes it an ideal test case for algorithms for matrix factorization, inversion, and the solution of linear systems, highlighting the challenges of floating-point arithmetic.[4][2] Arising as the Gram matrix of the monomial basis functions under the L^2[0,1] inner product, it models least-squares polynomial approximation problems over the unit interval.[4]

Beyond numerical testing, the infinite-dimensional Hilbert matrix (with i,j \geq 1) defines a bounded linear operator on the sequence space \ell^2, with applications in operator theory, integral equations, and modern fields such as image processing and cryptology.[2] Its Cholesky factorization and other decompositions can be computed exactly in rational arithmetic, avoiding numerical instability for moderate n.[4]
Definition and Construction
Matrix Entries
The n \times n Hilbert matrix H_n is defined by its entries H_{i,j} = \frac{1}{i + j - 1} for i, j = 1, 2, \dots, n.[5] This uses the standard 1-based indexing convention.[5] The Hilbert matrix arises as a special case of a Cauchy matrix.[6]

For illustration, the 2 \times 2 Hilbert matrix is

\begin{pmatrix}
1 & \frac{1}{2} \\
\frac{1}{2} & \frac{1}{3}
\end{pmatrix},

with trace 1 + \frac{1}{3} = \frac{4}{3}.[5] The 3 \times 3 Hilbert matrix is

\begin{pmatrix}
1 & \frac{1}{2} & \frac{1}{3} \\
\frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\
\frac{1}{3} & \frac{1}{4} & \frac{1}{5}
\end{pmatrix},

with trace 1 + \frac{1}{3} + \frac{1}{5} = \frac{23}{15}.[5] These examples highlight the decrease of the entries as i + j grows, consistent with the reciprocal form.[5] The matrix is symmetric by construction.[7]
Integral Representation
The entries of the finite-dimensional Hilbert matrix H_n admit an integral representation given by

H_{ij} = \int_0^1 x^{i-1} x^{j-1} \, dx = \int_0^1 x^{i+j-2} \, dx = \frac{1}{i+j-1},

for i,j = 1, \dots, n, where the equality follows from direct evaluation of the definite integral.[8] This formulation establishes the equivalence between the continuous integral expression and the discrete algebraic definition of the matrix entries.

This integral representation arises naturally as the Gram matrix associated with the standard L^2[0,1] inner product \langle f, g \rangle = \int_0^1 f(x) g(x) \, dx applied to the monomial basis functions \{ \phi_k(x) = x^{k-1} \}_{k=1}^n = \{1, x, x^2, \dots, x^{n-1}\}.[8] Specifically, the (i,j)-th entry is the inner product H_{ij} = \langle \phi_i, \phi_j \rangle, capturing the overlap between these basis elements in the space of square-integrable functions on [0,1].

The space L^2[0,1] equipped with this inner product forms a Hilbert space, and the resulting Gram matrix H_n inherits positive definiteness from the linear independence of the monomials in this space.[8] Intuitively, this positive definiteness reflects the fact that the monomials, while not orthogonal, span a finite-dimensional subspace without linear dependencies, ensuring that the quadratic form \mathbf{v}^T H_n \mathbf{v} = \left\| \sum_k v_k \phi_k \right\|_{L^2}^2 > 0 for any nonzero coefficient vector \mathbf{v}.

This construction generalizes to broader contexts by incorporating weights or altering the integration domain; for instance, weighted versions take the form \int_0^1 w(x) x^{i+j-2} \, dx for a positive weight function w, yielding matrices that retain similar Gram-matrix properties in weighted L^2 spaces.[9] Such generalizations appear in studies of operators on Bergman or Hardy spaces, where the weight influences boundedness and spectral behavior.[10]
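As an illustrative check of this Gram-matrix interpretation, the monomial inner products can be compared entrywise with the algebraic definition; the sketch below assumes SymPy is available and its names are chosen only for illustration:

```python
import sympy as sp

x = sp.symbols('x')
n = 4

# Gram matrix of the monomials 1, x, ..., x^(n-1) under the L^2[0,1] inner product
G = sp.Matrix(n, n, lambda i, j: sp.integrate(x**i * x**j, (x, 0, 1)))

# Algebraic definition 1/(i+j-1) with 1-based indices (the lambda receives 0-based i, j)
H = sp.Matrix(n, n, lambda i, j: sp.Rational(1, i + j + 1))

print(G == H)                   # True: the integral and reciprocal definitions agree
print(G.is_positive_definite)   # True: the monomials are linearly independent in L^2[0,1]
```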
Mathematical Properties
Algebraic Structure
The Hilbert matrix H_n of order n is symmetric, as its entries satisfy H_{ij} = H_{ji} for all i, j = 1, \dots, n, since the denominators i + j - 1 are invariant under index interchange. It belongs to the class of Hankel matrices, characterized by constant values along each anti-diagonal, because each entry H_{ij} = 1/(i + j - 1) depends solely on the sum i + j. Additionally, the Hilbert matrix is a special case of a Cauchy matrix, obtained by setting the parameters x_i = i and y_j = j - 1 for i,j = 1, \dots, n in the general form C_{ij} = 1/(x_i + y_j).

The Hilbert matrix is positive definite, meaning that for any nonzero real vector \mathbf{x} \in \mathbb{R}^n, the quadratic form \mathbf{x}^T H_n \mathbf{x} > 0. A proof of this property leverages the integral representation H_{ij} = \int_0^1 t^{i+j-2} \, dt, which follows from direct computation of the integral. Substituting into the quadratic form yields

\mathbf{x}^T H_n \mathbf{x} = \int_0^1 \left( \sum_{k=1}^n x_k t^{k-1} \right)^2 dt.

The integrand is nonnegative for all t \in [0,1] and strictly positive on a set of positive measure unless \mathbf{x} = \mathbf{0}, ensuring the integral is positive. As a symmetric positive definite matrix, the Hilbert matrix admits a unique Cholesky decomposition H_n = L L^T with a lower triangular factor L having positive diagonal entries.

The Hilbert matrix is totally positive, meaning that every square submatrix has a positive determinant. This property arises from its structure as a Cauchy matrix built from the totally positive kernel 1/(x + y) with increasing parameters, which ensures that all minors are positive; at the same time it is the Gram matrix of the monomial basis \{1, t, t^2, \dots, t^{n-1}\} with respect to the inner product \langle f, g \rangle = \int_0^1 f(t) g(t) \, dt on the interval [0,1], which accounts for its positive definiteness. Total positivity implies variation-diminishing properties, such as the fact that the vector H_n \mathbf{x} has no more sign changes than \mathbf{x} itself, which has applications in oscillation theory for linear systems.

The Hankel structure of the Hilbert matrix connects it to moment matrices in approximation theory, as its entries form the Hankel matrix associated with the moments \mu_k = \int_0^1 t^k \, dt = 1/(k+1) of the uniform distribution on [0,1], with H_{ij} = \mu_{i+j-2}. This representation underscores its role in problems involving orthogonal polynomials and quadrature, where positive definiteness guarantees unique representations.
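Positive definiteness can also be certified in exact arithmetic. Because the Cholesky factor of H_n contains irrational square roots, the minimal sketch below uses the closely related LDL^T factorization instead, which stays within rational arithmetic; all pivots in D coming out positive confirms that H_n is positive definite (the helper is illustrative, not a library routine):

```python
from fractions import Fraction

def ldlt(A):
    """LDL^T factorization with exact rational arithmetic (no square roots).

    For a symmetric positive definite matrix every pivot in D is strictly
    positive, which certifies positive definiteness exactly.
    """
    n = len(A)
    L = [[Fraction(0)] * n for _ in range(n)]
    D = [Fraction(0)] * n
    for j in range(n):
        D[j] = A[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        L[j][j] = Fraction(1)
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] * D[k] for k in range(j))) / D[j]
    return L, D

n = 6
H = [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]  # 0-based: 1/(i+j+1)
L, D = ldlt(H)
print(all(d > 0 for d in D))  # True: every pivot is positive, so H_6 is positive definite
```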
Spectral Characteristics
The Hilbert matrix H_n is symmetric positive definite, ensuring that its eigenvalues \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n are real and positive. The largest eigenvalue \lambda_n approaches \pi as n increases, reflecting the spectral norm of the infinite-dimensional Hilbert operator. The eigenvalues exhibit a distribution in which the larger ones cluster near \pi, while the smaller ones decay rapidly toward zero.

The smallest eigenvalue \lambda_1 decays exponentially with n, with asymptotic behavior \lambda_1 \sim c \, n^{1/2} (1 + \sqrt{2})^{-4n} for a constant c > 0 involving \pi, leading to exponential ill-conditioning. Approximations of the form \lambda_k \approx \pi / (n + k)^2 have also been proposed for intermediate indices k, giving bounds on the spectral decay. This rapid clustering of the smaller eigenvalues near zero underscores the matrix's tendency toward numerical singularity as n grows.

Since H_n is symmetric positive definite, its singular values \sigma_i coincide with its eigenvalues \lambda_i. Consequently, the singular values also feature one large value near \pi with the rest decaying to exponentially small magnitudes, amplifying sensitivity in low-rank approximations.

The 2-norm condition number is \kappa_2(H_n) = \lambda_n / \lambda_1 \approx (1 + \sqrt{2})^{4n} / \sqrt{\pi n}, whose logarithm grows like 4n \log(1 + \sqrt{2}) \approx 3.525\,n. This growth rate implies that even modest increases in n render H_n numerically intractable in standard floating-point arithmetic, as perturbations are amplified by factors that exceed the reciprocal of machine precision. For instance, \kappa_2(H_5) \approx 4.8 \times 10^5.
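A short numerical experiment (a sketch assuming NumPy and SciPy; the printed values are approximate) illustrates the clustering of the largest eigenvalue near \pi, the collapse of the smallest eigenvalue, and the growth of the condition number:

```python
import numpy as np
from scipy.linalg import hilbert

for n in (5, 8, 10, 12):
    H = hilbert(n)
    w = np.linalg.eigvalsh(H)  # eigenvalues of the symmetric matrix, ascending order
    print(f"n={n:2d}  lambda_max={w[-1]:.4f}  lambda_min={w[0]:.3e}  "
          f"kappa_2={w[-1] / w[0]:.3e}")
# lambda_max approaches pi while lambda_min decays roughly like (1+sqrt(2))^(-4n);
# each increment of n multiplies the condition number by about exp(3.5) ~ 33.
# Beyond n ~ 13 the smallest computed eigenvalue is dominated by rounding error.
```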
Explicit Formulas
The determinant of the n \times n Hilbert matrix H_n admits the closed-form expression

\det(H_n) = \frac{c_n^4}{c_{2n}},

where c_m = \prod_{k=1}^{m-1} k! denotes the product of the first m-1 factorials (with c_1 = 1 by convention as the empty product).[5] This formula arises as a special case of the Cauchy determinant for matrices of the form 1/(x_i + y_j), with x_i = i and y_j = j-1.[11] For small dimensions, explicit values include \det(H_1) = 1, \det(H_2) = 1/12, \det(H_3) = 1/2160, and \det(H_4) = 1/6048000 \approx 1.65 \times 10^{-7}, illustrating the rapid decay that underscores the matrix's ill-conditioning.[5]

This determinant carries a geometric interpretation as the square of the volume of the parallelepiped spanned by the monomials \{1, x, x^2, \dots, x^{n-1}\} in the Hilbert space L^2[0,1] equipped with the inner product \langle f, g \rangle = \int_0^1 f(x) g(x) \, dx, since H_n is the Gram matrix of these functions.[5] Combinatorially, the factorial products in c_n count permutations of multisets, reflecting the structure of the matrix entries derived from beta integral representations.

The inverse of H_n also possesses an explicit entrywise formula:

(H_n^{-1})_{i,j} = (-1)^{i+j} (i+j-1) \binom{n+i-1}{n-j} \binom{n+j-1}{n-i} \binom{i+j-2}{i-1}^2,

where the indices i, j range from 1 to n. This expression reveals an alternating sign pattern governed by (-1)^{i+j} and shows that all entries of the inverse are integers, a notable property stemming from the [binomial](/page/Binomial) coefficients. The formula can be derived by recognizing H_n as a [Cauchy matrix](/page/Cauchy_matrix) and applying the general explicit inversion formula for such matrices, or equivalently through connections to [Lagrange interpolation](/page/Interpolation) at the points \{0, 1, \dots, n-1\} in the theory of orthogonal polynomials associated with the monomial basis.[11]
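Both closed forms are easy to verify in exact arithmetic. The sketch below (standard library only; the helper names are illustrative) evaluates the determinant formula and the entrywise inverse, and checks H_n H_n^{-1} = I with rational arithmetic:

```python
from fractions import Fraction
from math import comb, factorial

def c(m):
    """c_m = 1! * 2! * ... * (m-1)!, with the empty product c_1 = 1."""
    out = 1
    for k in range(1, m):
        out *= factorial(k)
    return out

def hilbert_det(n):
    """Closed form det(H_n) = c_n^4 / c_{2n}."""
    return Fraction(c(n) ** 4, c(2 * n))

def inv_entry(n, i, j):
    """Integer (i,j) entry of H_n^{-1} from the explicit binomial formula (1-based i, j)."""
    return ((-1) ** (i + j) * (i + j - 1)
            * comb(n + i - 1, n - j) * comb(n + j - 1, n - i)
            * comb(i + j - 2, i - 1) ** 2)

n = 4
print(hilbert_det(n))  # 1/6048000
H = [[Fraction(1, i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)]
Hinv = [[inv_entry(n, i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
product = [[sum(H[i][k] * Hinv[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
print(all(product[i][j] == (1 if i == j else 0)
          for i in range(n) for j in range(n)))  # True: the explicit inverse is exact
```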
Historical Development
Origins with David Hilbert
David Hilbert first introduced the matrix now known as the Hilbert matrix in his 1894 paper "Ein Beitrag zur Theorie des Legendreschen Polynoms," published in Acta Mathematica. In this seminal work, Hilbert explored the expansion of arbitrary functions in series of Legendre polynomials, connecting the matrix to problems in integral equations and the least-squares approximation of continuous functions by polynomials on bounded intervals. The matrix emerged naturally as the coefficient matrix of the normal equations for minimizing the L² error in such approximations, with entries representing inner products of monomials under the uniform measure.[12]

Hilbert employed the matrix to prove the existence of non-zero polynomials with integer coefficients whose L² norm on the interval [0,1] is less than 1/4. This result stems from analyzing the quadratic form associated with the matrix, which quantifies the L² norm of a polynomial. By examining the structure of the matrix, Hilbert demonstrated that its positive definiteness ensures the approximation problem is well posed, with the non-zero determinant guaranteeing the invertibility of the system and thus the uniqueness of the least-squares solution.[12]

A key insight from Hilbert's analysis involves the moment problem, where the Hilbert matrix serves as the moment matrix for the moments of the uniform distribution on [0,1]. He showed that for any interval of length less than 4, there exist non-zero polynomials with integer coefficients whose L² norm on that interval becomes arbitrarily small as the degree grows, facilitating effective approximations of continuous functions. This threshold of 4 arises from the scaling properties of the Gram matrix determinant, which decreases sufficiently fast to imply the presence of lattice points (corresponding to integer coefficients) in small L² balls when the interval is short enough. The invertibility ensured by the non-zero determinant further supports the stability of these approximation procedures in the context of the moment problem.[12]
Subsequent Mathematical Advances
In the early 20th century, the Hilbert matrix was recognized as a special case of a Hankel matrix, in which entries depend solely on the sum of the indices, a property inherent to its construction that facilitated studies in operator theory.[13] It was also identified as a symmetric Cauchy matrix, with entries expressible in the form \frac{1}{x_i + y_j} where x_i = i - 1/2 and y_j = j - 1/2, enabling connections to broader classes of structured matrices.[13]

By the mid-20th century, significant progress emerged in understanding the Hilbert matrix's total positivity, a property where all minors are nonnegative (and, for the Hilbert matrix, strictly positive), as detailed in Gantmacher and Krein's foundational work on oscillation matrices. This led to oscillation theorems characterizing the sign changes in eigenvectors, with Karlin extending these results to totally positive kernels and matrices, confirming the Hilbert matrix's strict total positivity and its implications for eigenvalue interlacing.

In 1983, Choi provided an elegant exposition of the explicit inverse of the finite Hilbert matrix, whose entries are the integers (-1)^{i+j} (i+j-1) \binom{n+i-1}{n-j} \binom{n+j-1}{n-i} \binom{i+j-2}{i-1}^2, and refined bounds on its determinant, building on earlier formulas. These results highlighted the matrix's algebraic richness without delving into numerical challenges.

Early literature on the Hilbert matrix emphasized theoretical and algebraic properties, with limited awareness of its severe ill-conditioning (evident in exponentially growing condition numbers \kappa_2(H_n) \approx e^{3.5n}) until the rise of numerical computing in the 1950s and 1960s, when it became a benchmark for testing algorithms.[14]

Recent advances from 2021 to 2024 have focused on the spectrum of the Hilbert operator on analytic function spaces. In Hardy spaces H^p (1 < p < \infty), Silbermann (2021) described the spectrum as a curved contour in the complex plane, parametrized by V_p(\xi) = i \pi \sinh^{-1}(\pi (\xi + i/p)) for \xi \in \mathbb{R}. For weighted Bergman spaces A^p_\alpha, Dmitrovic and Karapetrović (2023) established the operator norm as \pi / \sin((\alpha+2)\pi / p) for specific ranges of p, later verified and extended by Dai (2024) for various \alpha. Montes-Rodríguez and Virtanen (2024) determined the continuous spectrum on H^2 to be [0, \pi], with spectral measure d\rho(x) = \frac{2}{\pi^2} \operatorname{arcosh}(\pi x) \, dx for x \in [0, \pi], leveraging the Mehler-Fock transform, an integral transform involving Legendre functions used in quantum mechanics for conical potentials, to derive these properties.[15]
Applications and Uses
In Approximation Theory
The Hilbert matrix plays a central role in the method of moments for polynomial least-squares approximation on the unit interval [0,1]. In this framework, one seeks a polynomial p(x) = \sum_{k=1}^n \alpha_k x^{k-1} that approximates a given continuous function f(x) by minimizing the L^2 error \int_0^1 |f(x) - p(x)|^2 \, dx. The normal equations for the coefficient vector \alpha take the form H_n \alpha = \mu, where \mu_k = \int_0^1 f(x) x^{k-1} \, dx is the k-th moment of f, and H_n is the n \times n Hilbert matrix with entries H_{ij} = \int_0^1 x^{i+j-2} \, dx = \frac{1}{i+j-1}. This setup arises because the monomials \{1, x, \dots, x^{n-1}\} form the basis, and the Hilbert matrix is their Gram matrix under the L^2[0,1] inner product.[16][17]

In the theory of orthogonal polynomials on [0,1] with respect to the Lebesgue measure, the Hilbert matrix facilitates the computation of expansion coefficients for function approximations. The matrix serves as the Gram matrix for the monomial basis, and its inversion yields the transformation coefficients that express the monic orthogonal polynomials (shifted Legendre polynomials) in terms of powers of x. These coefficients enable the projection of f onto the orthogonal basis, where the expansion is f(x) \approx \sum_{k=0}^{n-1} c_k \pi_k(x) with c_k = \langle f, \pi_k \rangle / \langle \pi_k, \pi_k \rangle, avoiding the direct solution of ill-conditioned systems for higher-degree approximations. The explicit inverse of the Hilbert matrix, as derived in foundational studies, supports this coefficient computation analytically for moderate n.[18][19]

Hilbert's original 1894 work leverages the Hilbert matrix in analyzing the L^2 norms of non-zero polynomials with integer coefficients on symmetric intervals [-a, a]. He demonstrated that if the interval length 2a < 4, the infimum of \|p\|_{L^2[-a,a]} over such non-zero polynomials is zero, implying arbitrarily good approximations to zero in the L^2 sense (and thus density of integer-coefficient polynomials in L^2[-a,a]). This result, established through determinants and asymptotic properties of finite Hilbert matrices, highlights the matrix's role in quantifying how integer constraints still permit dense approximation in L^2[-a,a].[20]

Due to the Hilbert matrix's extreme ill-conditioning (its condition number grows as approximately e^{3.5n} for dimension n), practical use in these approximation problems is limited for large n. Exact solution of H_n \alpha = \mu or related inversions demands symbolic computation or arbitrary-precision arithmetic, as floating-point methods introduce errors that dominate the approximation accuracy beyond roughly n \approx 10. This constraint underscores the need for alternative bases, such as orthogonal polynomials, to mitigate numerical instability while preserving the theoretical insights.[21][22]
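The following sketch (assuming NumPy and SciPy; the target function f(x) = e^x is purely illustrative) sets up and solves the normal equations H_n \alpha = \mu in double precision, showing how the ill-conditioning eventually limits the attainable accuracy:

```python
import numpy as np
from scipy.integrate import quad
from scipy.linalg import hilbert

f = np.exp  # illustrative target function on [0, 1]

def monomial_lsq_coeffs(n):
    """Solve the normal equations H_n alpha = mu for the monomial coefficients."""
    mu = np.array([quad(lambda x, k=k: f(x) * x**k, 0.0, 1.0)[0] for k in range(n)])
    return np.linalg.solve(hilbert(n), mu)

xs = np.linspace(0.0, 1.0, 201)
for n in (4, 8, 12, 16):
    a = monomial_lsq_coeffs(n)
    p = sum(a[k] * xs**k for k in range(n))
    print(f"n={n:2d}  max |f - p| = {np.max(np.abs(f(xs) - p)):.2e}")
# The error drops quickly for small n, then stagnates or grows once the
# condition number of H_n overwhelms double-precision arithmetic.
```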
In Numerical Analysis
The Hilbert matrix is widely used as a benchmark problem in numerical analysis to test the stability and accuracy of algorithms for solving linear systems, owing to its extreme ill-conditioning, which amplifies rounding errors in finite-precision computations. For example, Gaussian elimination applied to Hilbert systems of order greater than 10 typically fails to produce reliable solutions: although the computed residual remains small, the propagated floating-point errors yield relative errors in the solution that exceed machine epsilon by many orders of magnitude.[23][14]

This ill-conditioning is quantified by the matrix's 2-norm condition number, which grows exponentially as \kappa_2(H_n) \approx \exp(3.5n), reaching about 1.6 \times 10^{13} for n = 10; such a large value means that perturbations on the order of machine precision (around 10^{-16} in double precision) can leave only a few accurate digits in the computed solution.[23][24]

In practice, software libraries incorporate the Hilbert matrix for validation of numerical routines: MATLAB's hilb function generates it to assess direct methods such as LU decomposition, where the computed factors deviate substantially from the exact ones for moderate n, while SciPy's scipy.linalg.hilbert in Python serves similar purposes for iterative methods such as GMRES, highlighting convergence issues under ill-conditioning.[14][25]

To address these numerical challenges, techniques such as diagonal scaling to equilibrate the matrix, preconditioning via rank-revealing decompositions or approximate inverses (e.g., using the Hilbert matrix's exact integer inverse for n \leq 14), and high-precision arithmetic have been developed to attain accuracy largely independent of the condition number.[26] In recent machine learning contexts, the Hilbert matrix is analogous to the ill-conditioned kernel Gram matrices arising in reproducing kernel Hilbert spaces, aiding the study of regularization in support vector machines and Gaussian processes.[27]
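A typical form of this benchmark (a sketch assuming NumPy and SciPy) solves H_n x = b with a known solution and compares the backward error (residual) with the forward error, which grows with the condition number:

```python
import numpy as np
from scipy.linalg import hilbert, solve

for n in (6, 10, 14):
    H = hilbert(n)
    x_true = np.ones(n)            # known solution
    b = H @ x_true
    x = solve(H, b)                # LU-based direct solve
    residual = np.linalg.norm(H @ x - b) / np.linalg.norm(b)
    error = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    print(f"n={n:2d}  cond={np.linalg.cond(H):.2e}  "
          f"residual={residual:.1e}  error={error:.1e}")
# The relative residual stays near machine epsilon (the solve is backward stable),
# while the forward error grows roughly in proportion to the condition number.
```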