Stirling's approximation
Stirling's approximation is a mathematical formula that provides an asymptotic estimate for the factorial n! (or equivalently, the gamma function \Gamma(n+1)) when n is a large positive integer, expressed as n! \approx \sqrt{2 \pi n} \left( \frac{n}{e} \right)^n, where the relative error approaches zero as n increases.[1][2] This approximation, named after the Scottish mathematician James Stirling, was first published in his 1730 treatise Methodus differentialis, where he refined an earlier, less precise form derived from the Wallis product for \pi.[1][2] The formula originates from 18th-century efforts to approximate large factorials, stemming from correspondence between Stirling and Abraham de Moivre in the 1720s, with de Moivre publishing a related result in Miscellanea Analytica the same year as Stirling's work.[2] Stirling's version uniquely incorporates the constant \sqrt{2\pi}, obtained possibly through numerical interpolation or the Wallis product, enabling precise bounds such as \sqrt{2\pi} \, n^{n+1/2} e^{-n + 1/(12n+1)} < n! < \sqrt{2\pi} \, n^{n+1/2} e^{-n + 1/(12n)}.[1][2] An extended asymptotic series, known as Stirling's series, multiplies the leading term by the series 1 + \frac{1}{12n} + \frac{1}{288n^2} - \cdots for even greater accuracy in computations.[1] Stirling's approximation holds fundamental importance in asymptotic analysis, probability theory, and statistics, where it facilitates approximations for large-scale combinatorial problems, such as the normal distribution's fit to the binomial via the de Moivre-Laplace theorem.[2] It also appears in physics for entropy calculations in statistical mechanics and in numerical methods for evaluating integrals like the gamma function.[1] The logarithmic form, \ln n! \approx n \ln n - n + \frac{1}{2} \ln (2 \pi n), is particularly useful for avoiding overflow in computational settings.[2]
Introduction
Definition and Basic Formula
Stirling's approximation, also known as Stirling's formula, is an asymptotic expression for the factorial function n! of a positive integer n, given by n! \sim \sqrt{2 \pi n} \left( \frac{n}{e} \right)^n as n \to \infty, where e is the base of the natural logarithm and \sim indicates asymptotic equivalence.[1] This formula provides a useful estimate for large factorials, which grow rapidly and are impractical to compute directly via the product definition n! = 1 \times 2 \times \cdots \times n.[2] Asymptotic equivalence means that the ratio of the exact factorial to the approximating expression approaches 1 in the limit: \lim_{n \to \infty} \frac{n!}{\sqrt{2 \pi n} \left( \frac{n}{e} \right)^n} = 1. This convergence ensures the relative error diminishes as n increases, making the approximation particularly effective for large n.[2] To illustrate, consider small values of n. For n=5, 5! = 120, while the Stirling approximation yields \sqrt{10 \pi} (5/e)^5 \approx 118.02, a relative error of approximately 1.65\%. For n=10, 10! = 3{,}628{,}800, and the approximation gives \sqrt{20 \pi} (10/e)^{10} \approx 3{,}598{,}696, a relative error of about 0.83\%. These examples demonstrate the approximation's reasonable accuracy even for modest n, though it excels for much larger values.[1] The formula applies directly to positive integers n, but it extends naturally to positive real numbers x > 0 via the gamma function, where \Gamma(x+1) = x! for positive integers and the approximation holds asymptotically as x \to \infty.[1]
Significance and Applications
Stirling's approximation is essential for estimating large factorials n! when direct computation is impractical, as factorials grow rapidly and can exceed standard numerical limits or require prohibitive time in iterative products.[2] This utility arises in scenarios involving high-dimensional counting or optimization, where the logarithmic form \ln n! \approx n \ln n - n + \frac{1}{2} \ln (2\pi n) enables efficient evaluation without overflow.[2] In combinatorics, the approximation simplifies estimates of binomial coefficients \binom{n}{k}, particularly for large n and k \approx pn with 0 < p < 1, yielding \binom{n}{k} \approx \frac{2^{n H(p)}}{\sqrt{2\pi n p (1-p)}} where H(p) = -p \log_2 p - (1-p) \log_2 (1-p) is the binary entropy function; this form connects combinatorial growth to information-theoretic limits. Such approximations are vital for analyzing algorithm complexities and random walks in large systems.[3] In statistical mechanics, Stirling's approximation underpins entropy computations for systems with many particles, approximating the \ln N! term arising from the multiplicity in the Sackur-Tetrode equation for ideal gas entropy as S \approx N k_B (\ln(V/N) + \frac{3}{2} \ln T + \text{const}), where \ln N! simplifies to N \ln N - N. This enables deriving thermodynamic properties like heat capacities from microstate counts, with the approximation ensuring extensivity of entropy and resolving the Gibbs paradox for indistinguishable particles in ideal gas mixtures.[4] The approximation also holds significance in information theory, where \ln n! approximations quantify the entropy of uniform distributions over n outcomes, with Shannon entropy H = \log_2 n, foundational for source coding and channel capacity bounds.
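As a quick numerical sanity check of the entropy form above, the following Python sketch (the function name is illustrative, not from any cited source) compares the Stirling-based estimate of a binomial coefficient against the exact value:

```python
import math

def binom_entropy_estimate(n: int, k: int) -> float:
    """Stirling-based estimate of C(n, k) via the binary entropy function H(p)."""
    p = k / n
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)  # binary entropy in bits
    return 2.0 ** (n * h) / math.sqrt(2 * math.pi * n * p * (1 - p))

exact = math.comb(1000, 300)
approx = binom_entropy_estimate(1000, 300)
print(f"relative error at n=1000, k=300: {abs(approx / exact - 1):.2e}")  # small, O(1/n)
```

Working in bits (base-2 logarithms) keeps the formula in the same form as the entropy expression in the text; the residual relative error comes from the neglected 1/(12n)-type corrections.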
In machine learning, Stirling's approximation aids in normalizing log-likelihoods for probabilistic models involving factorials, such as variational inference in factorial hidden Markov models or Poisson regression, where it bounds \ln \Gamma(z) terms to stabilize training for high-dimensional data. For instance, in subsampling for generalized linear models, it refines log-factorial bounds to improve risk estimates without exact computations. Despite these applications, Stirling's approximation is reliable primarily for n \gg 1, as the relative error grows for small n: for example, it underestimates 5! by about 1.65\% and 10! by about 0.83\%, necessitating exact methods or refined series for modest values.[2]
History
Precursors to Stirling
In the mid-17th century, efforts to approximate transcendental quantities like π through infinite products and integrals laid essential groundwork for factorial estimates. John Wallis, in his 1656 treatise Arithmetica infinitorum, derived the first infinite product representation for π/2 while interpolating a sequence of integrals related to the area under the curve of \sqrt{1 - x^2}: \frac{\pi}{2} = \prod_{n=1}^{\infty} \frac{(2n)^2}{(2n-1)(2n+1)} = \prod_{n=1}^{\infty} \frac{4n^2}{4n^2 - 1}. This formula emerged from evaluating partial sums involving binomial expansions of (1 - x^2)^{m/2} for integer m, which inherently connect to products of integers akin to factorials. The partial products of Wallis's formula yield increasingly tight bounds on π and, by inversion, provide inequalities bounding n!, such as relating double factorials or even-odd products to factorial growth, marking the earliest systematic approach to asymptotic behavior of large products.[5][6] Concurrent developments by James Gregory and Isaac Newton advanced the mathematical machinery for such approximations. Gregory, working in the 1660s, pioneered infinite series expansions for trigonometric functions like \arctan x and \arcsin x, as well as areas under curves, in works such as Vera circuli et hyperbolae quadratura (1667). These series, predating the formal Taylor theorem, enabled asymptotic evaluations of integrals and functions for large arguments, indirectly supporting later factorial analyses through convergent representations of transcendental expressions. Newton, building directly on Wallis's interpolation techniques in the 1660s, generalized the binomial theorem to non-integer exponents, yielding infinite series for (1 + x)^\alpha = \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k, where the generalized binomial coefficients \binom{\alpha}{k} = \frac{\alpha (\alpha-1) \cdots (\alpha-k+1)}{k!} extend factorial-like products to continuous parameters. 
Newton's finite difference methods also allowed interpolation of discrete sequences, including factorials, facilitating asymptotic insights into their growth.[7][8] By the early 18th century, burgeoning applications in probability theory heightened the need for accurate logarithmic approximations to n!, as binomial coefficients \binom{2n}{n} required estimating large factorials for chance calculations. Abraham de Moivre addressed this in Miscellanea Analytica de Seriebus et Quadraturis (1730), presenting an asymptotic expansion for the natural logarithm of the factorial: \log n! \approx n \log n - n + \frac{1}{2} \log n + c, where c is a constant term later identified as \frac{1}{2} \log (2 \pi) by Stirling, derived from refining tables of factorial logarithms. De Moivre extended this with higher-order corrections in an alternating series involving Bernoulli numbers, providing a practical tool for large-n computations in combinatorial problems. This work reflected wider 18th-century pursuits in asymptotic analysis of transcendental functions, fueled by advances in calculus and integrals, where approximating the rapid growth of n! via logarithms became crucial for solving equations in analysis and probability.[9][6]
Formulation and Refinements
James Stirling published the first complete form of the approximation in his 1730 treatise Methodus differentialis: sive tractatus de summatione et interpolatione serierum infinitarum, deriving that for large positive integers n, n! \approx \sqrt{2\pi n} \left( \frac{n}{e} \right)^n, where e is the base of the natural logarithm.[2] This formula emerged from Stirling's interpolation of logarithmic tables for factorials and his analysis of infinite series, marking a significant advance in asymptotic analysis for large factorials.[10] Independently, Nicolas Bernoulli explored similar approximations for sums of logarithms related to factorials during the 1710s and 1720s, focusing on probabilistic contexts and series expansions, though his results remained unpublished until after his death in 1759.[11] These efforts paralleled Stirling's work but lacked the precise \sqrt{2\pi n} prefactor, contributing foundational ideas on logarithmic summation that influenced later refinements.[11] In 1809, Adrien-Marie Legendre introduced the first-order correction term to enhance the approximation's precision, modifying the logarithmic form to include +\frac{1}{12n} in the expansion of \log n!, which reduced the relative error for moderate n.[12] This adjustment built directly on Stirling's base formula, providing better bounds for computational tables of factorials and gamma values.[12] During the early 19th century, Carl Friedrich Gauss further refined the approximation by integrating it with the gamma function \Gamma(z), extending its applicability beyond integers to real and complex arguments via the relation n! = \Gamma(n+1).[13] Gauss's 1813 notation and product representation for \Gamma(z) formalized the asymptotic behavior, enabling broader use in analysis and number theory.[13] These developments formed a chronological progression: Bernoulli's early logarithmic explorations in the 1710s laid groundwork, Stirling's 1730 publication established the core formula, Legendre's 1809 correction improved accuracy for practical computations, and Gauss's 19th-century extensions generalized it through the gamma function, culminating in the modern asymptotic framework by the mid-1800s.[2][12][13]
Derivations of the Basic Approximation
Logarithmic Integral Approach
One common approach to deriving Stirling's approximation begins with the logarithmic form of the factorial, expressing \log n! = \sum_{k=1}^n \log k. This sum can be approximated by treating it as a Riemann sum for the integral of \log x, yielding \sum_{k=1}^n \log k \approx \int_1^n \log x \, dx. Evaluating the integral step by step: \int \log x \, dx = x \log x - x + C, so \int_1^n \log x \, dx = [n \log n - n] - [1 \cdot \log 1 - 1] = n \log n - n + 1, since \log 1 = 0. This provides the leading terms but lacks precision for the subleading behavior, particularly the dependence on \log n.[14] To refine this approximation, the Euler-Maclaurin formula is applied, which relates sums to integrals via \sum_{k=a}^b f(k) \approx \int_a^b f(x) \, dx + \frac{f(a) + f(b)}{2} + \sum_{m=1}^M \frac{B_{2m}}{(2m)!} (f^{(2m-1)}(b) - f^{(2m-1)}(a)) + R, where B_{2m} are Bernoulli numbers and R is a remainder term. At the basic level, retaining only the integral and endpoint correction suffices for the primary terms; higher Bernoulli contributions are deferred to asymptotic series expansions. For f(x) = \log x, a=1, and b=n, the formula gives \sum_{k=1}^n \log k \approx \int_1^n \log x \, dx + \frac{\log 1 + \log n}{2} + R_n = n \log n - n + 1 + \frac{1}{2} \log n + R_n, since \log 1 = 0. The boundary term \frac{f(1) + f(n)}{2} thus introduces the \frac{1}{2} \log n factor, capturing the square-root growth in the approximation for n!. This endpoint correction arises from the trapezoidal rule interpretation of the Euler-Maclaurin formula, adjusting for the function's values at the limits of summation.[15][14] The remainder R_n approaches a constant as n \to \infty. Specifically, analysis of the Euler-Maclaurin expansion shows R_n \to \frac{1}{2} \log(2\pi) - 1 + o(1), which cancels the +1 from the integral and yields the refined estimate \log n! \approx n \log n - n + \frac{1}{2} \log n + \frac{1}{2} \log(2\pi) + o(1). Exponentiating both sides produces Stirling's basic formula: n! \approx \sqrt{2\pi n} \left( \frac{n}{e} \right)^n, where the o(1) term ensures the relative error vanishes as n \to \infty. The constant \frac{1}{2} \log(2\pi) emerges from the limiting behavior of the remainder, often verified by comparing with independent methods like the Wallis product, though the integral approach alone establishes the form up to this additive constant.[15][14]
Wallis Product Method
The Wallis product method derives the basic Stirling's approximation by expressing the infinite product for π in terms of factorials and performing an asymptotic analysis via logarithms, highlighting the combinatorial structure underlying the approximation. The Wallis product, published by John Wallis in 1656, is given by \frac{\pi}{2} = \lim_{n \to \infty} \prod_{k=1}^n \frac{4k^2}{4k^2 - 1}. This product arose from early attempts to approximate π using continued fractions and ratios of integers, with a combinatorial origin in interpolating between integer values for areas under curves.[2] The partial product can be rewritten using factorials: grouping the even and odd factors into products involving n! and (2n)! and equating the partial product to the known limit leads to the relation \frac{(2n)!}{4^n (n!)^2} \sim \frac{1}{\sqrt{\pi n}} as n \to \infty, where 4^n = 2^{2n}; the left side is the central binomial coefficient \binom{2n}{n} = (2n)!/(n!)^2 divided by 2^{2n}.[16] Taking natural logarithms of both sides of the asymptotic relation yields \log((2n)!) - 2\log(n!) - 2n \log 2 \sim -\frac{1}{2} \log(\pi n). This equation connects the logarithms of the factorials to the logarithmic form of the approximation. To obtain the leading asymptotic for \log n!, substitute the assumed form from basic summation estimates, \log n! \approx n \log n - n + \frac{1}{2} \log n + c, where c is a constant to be determined. For (2n)!, this becomes \log (2n)! \approx 2n \log (2n) - 2n + \frac{1}{2} \log (2n) + c. Substituting these into the left side gives (2n \log 2 + 2n \log n - 2n + \frac{1}{2} \log 2 + \frac{1}{2} \log n + c) - 2(n \log n - n + \frac{1}{2} \log n + c) - 2n \log 2. Simplifying the expression results in \frac{1}{2} \log 2 - \frac{1}{2} \log n - c \sim -\frac{1}{2} \log (\pi n). Equating the constant and logarithmic terms yields c = \frac{1}{2} \log (2\pi), leading to the basic Stirling approximation \log n! \approx n \log n - n + \frac{1}{2} \log (2 \pi n). Exponentiating gives the standard form n! \approx \sqrt{2 \pi n} \, (n/e)^n. This derivation emphasizes the role of the product in fixing the constant term through combinatorial ratios of factorials.[10] The limit in the Wallis product is justified by its connection to the infinite product representation of the sine function, \sin x = x \prod_{k=1}^\infty (1 - x^2/(\pi k)^2), evaluated at x = \pi/2, as shown by Euler in the 1730s. This trigonometric identity provides the rigorous basis for the convergence without relying on integral approximations.[2]
Advanced Derivations
Complex Analytic Derivation
The complex analytic derivation of Stirling's approximation relies on the Hankel contour integral representation of the reciprocal gamma function, which extends naturally to complex arguments and facilitates asymptotic analysis via contour deformation. The reciprocal gamma function admits the representation \frac{1}{\Gamma(z)} = \frac{1}{2\pi i} \int_H e^t \, t^{-z} \, \mathrm{d}t, where the Hankel contour H originates at -\infty just below the negative real axis, encircles the origin counterclockwise in a small circle of radius \epsilon > 0, and returns to -\infty just above the negative real axis. The multivalued function t^{-z} = \exp(-z \Log t) employs the principal branch of the logarithm with branch cut along the negative real axis, such that -\pi < \arg t < \pi. This integral converges for all complex z, consistent with $1/\Gamma$ being entire. For the asymptotic behavior as |z| \to \infty, the method of steepest descent is applied by scaling the integration variable t = z u, yielding \mathrm{d}t = z \, \mathrm{d}u and transforming the integral to \frac{1}{\Gamma(z)} = \frac{z^{1-z}}{2\pi i} \int_{H_z} e^{z \phi(u)} \, \mathrm{d}u, where \phi(u) = u - \ln u (with the branch of \ln u consistent with the original contour) and H_z is the scaled contour. The phase function \phi(u) possesses a saddle point at u=1, where \phi'(u) = 1 - 1/u = 0. For large |z| with |\arg z| < \pi, the original Hankel contour can be deformed to pass through this saddle point along the path of steepest descent, where \operatorname{Re} \phi(u) decreases maximally away from its value at the saddle, without crossing the branch cut or singularities. The dominant contribution arises from a small neighborhood of u=1.[17] Expanding \phi(u) around u=1 gives \phi(1) = 1 and \phi''(1) = 1/u^2 \big|_{u=1} = 1, so near the saddle, \phi(u) \approx 1 + \frac{1}{2} (u-1)^2. Parameterizing along the steepest descent direction u - 1 = i s with real s, the quadratic term becomes -\frac{1}{2} s^2, and the local integral reduces to a Gaussian: \int e^{z \phi(u)} \, \mathrm{d}u \sim i e^z \int_{-\infty}^{\infty} e^{-z s^2 / 2} \, \mathrm{d}s = i e^z \sqrt{\frac{2\pi}{z}}, where the factor i reflects the contour's orientation along the imaginary direction. Substituting back produces the leading asymptotic \frac{1}{\Gamma(z)} \sim \frac{z^{1-z}}{2\pi i} \cdot i e^z \sqrt{\frac{2\pi}{z}} = \sqrt{\frac{z}{2\pi}} \, z^{-z} e^{z}. Inverting and taking the natural logarithm yields the leading terms of Stirling's approximation for the gamma function: \ln \Gamma(z) \sim \left(z - \frac{1}{2}\right) \ln z - z + \frac{1}{2} \ln (2\pi) as |z| \to \infty in the sector |\arg z| < \pi. Specializing to positive integers by setting z = n+1, where n is a large positive integer, gives \Gamma(n+1) = n!, so \ln n! \sim \left(n + \frac{1}{2}\right) \ln n - n + \frac{1}{2} \ln (2\pi). Exponentiating recovers the familiar form n! \sim \sqrt{2\pi n} \, (n/e)^n. This derivation holds in the complex plane within |\arg z| < \pi, excluding the branch cut along the negative real axis, as the deformation to the steepest descent path through u=1 remains valid in this sector, ensuring the integral's contribution is captured accurately without residual terms from other regions dominating.
Probabilistic Derivation via CLT and Poisson
A probabilistic interpretation of Stirling's approximation arises from viewing \log n! = \sum_{k=1}^n \log k as the sum of a sequence of terms that can be normalized and analyzed using tools from probability theory. Specifically, rewrite the sum as \log n! = n \log n + \sum_{k=1}^n \log(k/n). The normalized average (1/n) \sum_{k=1}^n \log(k/n) can be regarded as an empirical mean of the function \log x evaluated at equally spaced points k/n \in [1/n, 1], which approximate samples from a uniform distribution on [0,1]. By the law of large numbers, this average converges to the expected value \int_0^1 \log x \, dx = -1 as n \to \infty, yielding the leading approximation \log n! \approx n \log n - n, or n! \approx (n/e)^n.[2] To capture the next-order term \frac{1}{2} \log(2\pi n), the central limit theorem is applied to the fluctuations around this mean. The variance of \log U, where U is uniform on [0,1], is \int_0^1 (\log x)^2 \, dx - \left( \int_0^1 \log x \, dx \right)^2 = 2 - 1 = 1. Thus, the centered sum \sum_{k=1}^n \left[ \log(k/n) + 1 \right] behaves asymptotically like a normal random variable with mean 0 and standard deviation \sqrt{n}. The Gaussian nature of these fluctuations contributes a corrective factor of \sqrt{2\pi n} to the approximation for n!, arising from the normalizing constant of the normal density evaluated at its mode. This heuristic connects the subleading term directly to the central limit theorem's characterization of variability in large sums. An alternative and more direct probabilistic derivation leverages the Poisson distribution and its normal approximation via the central limit theorem. Let Y be a Poisson random variable with mean n, so P(Y = k) = e^{-n} n^k / k! for nonnegative integers k. The probability at the mode k = n is P(Y = n) = e^{-n} n^n / n!. For large n, the central limit theorem implies that Y is approximately normal with mean n and variance n. The local probability mass at the mean can thus be approximated by the normal density 1 / \sqrt{2\pi n}, or more carefully with a continuity correction as P(n - 1/2 < Y < n + 1/2) \approx 1 / \sqrt{2\pi n}.[18][19] Equating these expressions gives e^{-n} n^n / n! \approx 1 / \sqrt{2\pi n}, which rearranges to Stirling's approximation n! \approx \sqrt{2\pi n} \, (n/e)^n. This approach, which views Y as a sum of n independent Poisson random variables of mean 1 (so the central limit theorem applies directly), provides a clean probabilistic justification for the full formula, with the fluctuation term coming from the Gaussian behavior near the mean rather than from rare events. The connection to the sum-of-logs view emerges since the Poisson probabilities involve factorials in the denominator, linking back to the logarithmic sum approximation.[18]
Asymptotic Series Expansion
Higher-Order Terms
The higher-order terms in Stirling's approximation extend the basic formula by incorporating corrections derived from the Euler-Maclaurin summation formula applied to \log n! = \sum_{k=1}^n \log k. This approach approximates the sum by its integral plus boundary corrections and a series involving Bernoulli numbers, yielding an asymptotic expansion beyond the leading \sqrt{2\pi n} (n/e)^n term.[2][15] The Euler-Maclaurin formula gives \log n! = \int_1^n \log x \, dx + \frac{\log 1 + \log n}{2} + \sum_{k=1}^m \frac{B_{2k}}{2k(2k-1)} \left( n^{-(2k-1)} - 1 \right) + R_m, where B_{2k} are the Bernoulli numbers and R_m is the remainder term. The integral evaluates to n \log n - n + 1, and the boundary term simplifies to \frac{1}{2} \log n. The constant contributions from the -1 in the sum and the remainder analysis yield the \log \sqrt{2\pi} factor, resulting in the expanded form \log n! \sim n \log n - n + \frac{1}{2} \log (2\pi n) + \sum_{k=1}^\infty \frac{B_{2k}}{2k(2k-1) n^{2k-1}} as n \to \infty. Exponentiating provides the series for n! itself: n! \sim \sqrt{2\pi n} \left( \frac{n}{e} \right)^n \left( 1 + \sum_{k=1}^\infty c_k n^{-k} \right), where the c_k derive from the logarithmic expansion.[2][15][20] The first correction term arises from k=1, where B_2 = 1/6, yielding \frac{1/6}{2 \cdot 1 \cdot n} = \frac{1}{12n} in the logarithmic series. Thus, the approximation improves to n! \approx \sqrt{2\pi n} \left( \frac{n}{e} \right)^n \left( 1 + \frac{1}{12n} + \cdots \right). The second term, for k=2 with B_4 = -1/30, is \frac{-1/30}{4 \cdot 3 \cdot n^3} = -\frac{1}{360 n^3}. These terms provide successively better accuracy for large n, with the error in \log n! decreasing to O(1/n^3) once the first correction is included.[2][15] This series is asymptotic, meaning it diverges for any fixed n, but partial sums up to an optimal number of terms m \approx \pi n minimize the error, achieving accuracy far better than the leading approximation alone.
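A short numerical check in Python (an illustrative sketch, not from the cited sources) shows how the first two multiplicative corrections of the exponentiated series, 1 + \frac{1}{12n} + \frac{1}{288 n^2}, tighten the estimate of 10!:

```python
import math

n = 10
exact = math.factorial(n)
base = math.sqrt(2 * math.pi * n) * (n / math.e) ** n  # leading Stirling term

# Exponentiating the logarithmic series exp(1/(12n) - 1/(360 n^3) + ...)
# gives the multiplicative corrections 1 + 1/(12n) + 1/(288 n^2) + ...
first = base * (1 + 1 / (12 * n))
second = base * (1 + 1 / (12 * n) + 1 / (288 * n * n))

for label, value in [("basic", base), ("1 term", first), ("2 terms", second)]:
    print(f"{label:7s} relative error = {abs(value / exact - 1):.2e}")
```

Each added term reduces the relative error by roughly a factor of n, matching the asymptotic orders discussed above.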
Truncation after m terms bounds the remainder R_m by the next term's magnitude, ensuring the approximation's utility in asymptotic analysis.[2][15]
Full Stirling's Series
The full Stirling's series is the infinite asymptotic expansion for the natural logarithm of the factorial function, expressed as \ln n! \sim n \ln n - n + \frac{1}{2} \ln (2 \pi n) + \sum_{k=1}^\infty \frac{B_{2k}}{2k (2k-1) n^{2k-1}} as n \to \infty, where B_{2k} denotes the 2k-th Bernoulli number with B_2 = \frac{1}{6}, B_4 = -\frac{1}{30}, B_6 = \frac{1}{42}, B_8 = -\frac{1}{30}, and so on.[12] This expansion refines the basic Stirling approximation by incorporating higher-order corrections that become increasingly significant for precise computations at moderate to large n. The coefficients in the series arise directly from the even-indexed Bernoulli numbers, yielding explicit terms such as \frac{1}{12n}, -\frac{1}{360 n^3}, \frac{1}{1260 n^5}, -\frac{1}{1680 n^7}, \frac{1}{1188 n^9}, and subsequent contributions that alternate in sign and decrease initially in magnitude for fixed n.[2] These Bernoulli numbers grow factorially as |B_{2k}| \sim 4 \sqrt{\pi k} \left( \frac{k}{\pi e} \right)^{2k} for large k, which governs the behavior of the series terms. Although the series provides arbitrarily accurate approximations when truncated appropriately as n \to \infty, it is divergent for any fixed finite n, meaning the infinite sum does not converge and the partial sums will eventually grow without bound after an initial decrease.[12] The terms diminish in size until reaching a minimum around the k \approx \pi n-th term, after which they increase due to the rapid growth of the Bernoulli numbers overpowering the n^{-(2k-1)} decay; optimal truncation at or just before this point minimizes the error.[21] The remainder after such truncation has the same sign as the first neglected term for real positive n and is bounded in magnitude by approximately that term, yielding an exponentially small error of order e^{-2\pi n}.[12] To illustrate the behavior, consider partial sums for n = 10, where \ln 10! \approx 15.104412573.
The table below shows the partial sums obtained by adding successive correction terms of the series to the leading n \ln n - n + \frac{1}{2} \ln (2 \pi n) \approx 15.096082 term, demonstrating the rapid initial improvement; for n = 10 the terms keep shrinking until roughly the 31st, after which divergence sets in.
| Number of additional terms | Partial sum approximation | Absolute error |
|---|---|---|
| 0 (basic) | 15.096082 | 8.3 \times 10^{-3} |
| 1 | 15.104415 | 2.8 \times 10^{-6} |
| 2 | 15.1044126 | 8.0 \times 10^{-9} |
| 3 | 15.10441257 | \sim 1 \times 10^{-10} |
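The divergence pattern can be reproduced directly. The sketch below (a minimal illustration; the recurrence and variable names are not from the cited sources) generates Bernoulli numbers exactly, evaluates the series terms B_{2k}/(2k(2k-1) n^{2k-1}) for n = 10, and locates the smallest term, which falls near k \approx \pi n \approx 31:

```python
from fractions import Fraction
from math import comb, factorial, log, pi

def bernoulli_numbers(m: int):
    """B_0 .. B_m via the recurrence sum_{i=0}^{j} C(j+1, i) B_i = 0 (B_1 = -1/2)."""
    B = [Fraction(0)] * (m + 1)
    B[0] = Fraction(1)
    for j in range(1, m + 1):
        B[j] = -sum(Fraction(comb(j + 1, i)) * B[i] for i in range(j)) / (j + 1)
    return B

n = 10
B = bernoulli_numbers(80)
terms = [float(B[2 * k] / (2 * k * (2 * k - 1))) / n ** (2 * k - 1)
         for k in range(1, 41)]

k_smallest = 1 + min(range(40), key=lambda i: abs(terms[i]))  # index of minimal |term|
leading = n * log(n) - n + 0.5 * log(2 * pi * n)
best = leading + sum(terms[:k_smallest])  # optimally truncated partial sum

print("smallest term at k =", k_smallest)  # near pi * n
print("error at optimal truncation:", abs(best - log(factorial(n))))
```

Using exact rational arithmetic for the Bernoulli numbers avoids rounding issues in the recurrence; only the final terms are converted to floats.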
Generalizations
Extension to the Gamma Function
Stirling's approximation generalizes naturally to the Gamma function \Gamma(z) for complex arguments z with large modulus, providing an asymptotic estimate valid in sectors of the complex plane excluding the branch cut along the negative real axis. The leading-order formula is \Gamma(z) \sim \sqrt{2\pi} \, z^{z - 1/2} e^{-z} as |z| \to \infty in the sector |\arg z| \leq \pi - \delta for any fixed \delta > 0.[12] This equivalence holds uniformly in such sectors, ensuring the approximation remains reliable away from the poles of \Gamma(z) on the non-positive real axis.[12] The result follows primarily from the asymptotic expansion of \ln \Gamma(z), obtained by approximating the infinite sum in the Weierstrass product form 1/\Gamma(z) = z e^{\gamma z} \prod_{k=1}^\infty (1 + z/k) e^{-z/k} (where \gamma is the Euler-Mascheroni constant) via an Euler-Maclaurin summation, or equivalently by integrating the asymptotic form of the digamma function \psi(z) = \frac{d}{dz} \ln \Gamma(z) \sim \ln z - 1/(2z) - \sum_{k=1}^\infty B_{2k}/(2k z^{2k}) (with B_{2k} the Bernoulli numbers).[22] The leading logarithmic approximation is \ln \Gamma(z) \sim \left(z - \frac{1}{2}\right) \ln z - z + \frac{1}{2} \ln (2\pi) as |z| \to \infty in the specified sector, with higher-order terms involving powers of 1/z.[12] For positive integers n, setting z = n+1 recovers the classical Stirling approximation n! = \Gamma(n+1) \sim \sqrt{2\pi n} \, (n/e)^n.[12] A notable special case occurs for half-integer arguments, where \Gamma(z) admits exact expressions that can be asymptotically refined using the general Stirling series for large arguments. Specifically, for positive integers n, \Gamma\left(n + \frac{1}{2}\right) = \frac{(2n-1)!!}{2^n} \sqrt{\pi}, with (2n-1)!! = 1 \cdot 3 \cdot 5 \cdots (2n-1) the double factorial. For large n, substituting the Stirling approximation into the double factorial (via (2n-1)!! = (2n)!/(2^n n!)) or applying the general asymptotic directly to \Gamma(n + 1/2) yields a refined estimate, such as \Gamma(n + 1/2) \sim \sqrt{2\pi} \, (n + 1/2)^{n} e^{-(n + 1/2)}, capturing the leading exponential and pre-factorial growth.[22]
Convergent Series Version
The convergent version of Stirling's approximation provides a globally valid representation for the logarithm of the Gamma function through Binet's first integral formula, which is exact for all complex z with positive real part, in contrast to the divergent nature of the standard asymptotic Stirling series. This formula is given by \log \Gamma(z) = \left(z - \frac{1}{2}\right) \log z - z + \frac{1}{2} \log (2\pi) + \int_0^\infty \left( \frac{1}{2} - \frac{1}{t} + \frac{1}{e^t - 1} \right) \frac{e^{-z t}}{t} \, dt, where the integral converges absolutely for \operatorname{Re}(z) > 0.[23] The integral term, known as Binet's function \mu(z), is exact; expanding its integrand in powers of t and integrating term by term merely reproduces the divergent asymptotic series in odd negative powers of z, with coefficients tied to values of the Riemann zeta function at even positive integers through B_{2k} = (-1)^{k+1} \frac{2 (2k)!}{(2\pi)^{2k}} \zeta(2k). Genuinely convergent versions instead expand \mu(z) differently; a classical example is Gudermann's series \mu(z) = \sum_{k=0}^\infty \left[ \left(z + k + \frac{1}{2}\right) \log\left(1 + \frac{1}{z+k}\right) - 1 \right], which converges for all \operatorname{Re}(z) > 0 and offers a uniform approximation across the half-plane without the divergence issues of the classical Stirling expansion.[24] This line of representations was developed in the 19th century by mathematicians including Joseph Ludwig Raabe and Jacques Philippe Marie Binet, building on earlier asymptotic work to achieve global convergence.[23] For moderate values of |z| (e.g., |z| ≈ 10–50), this convergent form provides superior numerical accuracy and stability compared to truncating the divergent Stirling series, as the integral remainder can be evaluated efficiently using quadrature methods while avoiding the eventual growth of the asymptotic terms.[25]
Error Analysis
Convergence Rate
The relative error in the basic Stirling approximation, defined as \left| \frac{n!}{\sqrt{2\pi n} (n/e)^n} - 1 \right|, has a leading asymptotic term of \frac{1}{12n} for large n.[26] The ratio \frac{n!}{\sqrt{2\pi n} (n/e)^n} approaches 1 from above, so the relative error decays like 1/n.[2] The proof of this convergence rate follows from the Euler-Maclaurin summation formula applied to the sum \sum_{k=1}^n \log k = \log(n!), where the remainder term after the leading contributions yields an overall error of O(1/n).[15] Specifically, the formula provides \log(n!) = n \log n - n + \frac{1}{2} \log(2\pi n) + \frac{\theta}{12n} + O(1/n^2), with 0 < \theta < 1, confirming the O(1/n) bound.[15] In terms of logarithmic error, \left| \log\left( \frac{n!}{\sqrt{2\pi n} (n/e)^n} \right) \right| \sim \frac{1}{12n}, which aligns with the relative error since \log(1 + x) \sim x for small x.[26] This logarithmic form is particularly useful in applications where factorials appear in exponents or products, as the error remains bounded by O(1/n).[2] To illustrate the 1/n decay, the following table shows approximate relative errors for selected values of n, computed using the leading term \frac{1}{12n} (actual errors are slightly larger due to higher-order corrections but follow the same scaling):
| n | Approximate Relative Error |
|---|---|
| 10 | 0.0083 |
| 100 | 0.00083 |
| 1000 | 0.000083 |
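The 1/(12n) scaling is easy to confirm numerically, working with logarithms to avoid overflow for large n (a minimal Python sketch; the function name is illustrative):

```python
import math

def log_stirling(n: int) -> float:
    """Logarithm of the basic Stirling approximation sqrt(2*pi*n) * (n/e)**n."""
    return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)

for n in [10, 100, 1000]:
    # log(n!) computed via lgamma(n+1); the gap to the approximation is ~ 1/(12n)
    gap = math.lgamma(n + 1) - log_stirling(n)
    print(f"n={n:5d}  log-error={gap:.6f}  12*n*gap={12 * n * gap:.4f}")
```

The printed product 12*n*gap hovers just below 1, reflecting the next series term -1/(360 n^3).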