
Poisson limit theorem

The Poisson limit theorem, also known as the law of rare events, states that under certain conditions, the binomial distribution converges in distribution to the Poisson distribution. Specifically, consider a sequence of independent Bernoulli random variables X_{n,1}, \dots, X_{n,n}, each with success probability p_n = \lambda / n, where \lambda > 0 is fixed; let S_n = \sum_{i=1}^n X_{n,i}. As n \to \infty, the distribution of S_n converges to a Poisson distribution with parameter \lambda, meaning P(S_n = k) \to e^{-\lambda} \lambda^k / k! for each integer k \geq 0.

This theorem provides a foundational approximation in probability theory, particularly for modeling the number of occurrences of rare, independent events within a fixed interval, where the expected number of events \lambda remains constant even as the opportunities for events proliferate but each becomes increasingly unlikely. It complements the central limit theorem by addressing scenarios where the variance \lambda is finite and small, rather than growing with n, and it is especially useful when n is large and p is small, as in reliability analysis and the counting of infrequent incidents such as defects or arrivals.

The result was first derived by Siméon Denis Poisson in his 1837 work Recherches sur la probabilité des jugements en matière criminelle et en matière civile, where it emerged in the context of error analysis and rare judicial errors, though the modern interpretation as a limit theorem developed later. In 1898, Ladislaus von Bortkiewicz popularized its application to real-world rare events, such as horse-kick fatalities in the Prussian cavalry, dubbing it the "law of small numbers" to emphasize the stability of small counts under Poisson-like behavior.

Proofs typically rely on the method of generating functions, where the probability generating function of the binomial, (1 - p + p s)^n, approaches e^{\lambda (s-1)} as n \to \infty with np = \lambda, or alternatively on Stirling's approximation for factorials in the direct probability computation. Extensions of the theorem appear in more general settings, such as for dependent events or compound processes, but the core version remains central to introductory probability.

Preliminaries

Binomial distribution

The binomial distribution models the number of successes in a fixed number n of independent Bernoulli trials, where each trial has the same probability p of success and 1-p of failure. This discrete probability distribution is fundamental in scenarios involving repeated independent experiments with binary outcomes, providing a framework for calculating probabilities of specific success counts. The probability mass function of a binomial random variable K is given by P(K = k) = \binom{n}{k} p^k (1-p)^{n-k}, where k = 0, 1, \dots, n and \binom{n}{k} denotes the binomial coefficient, representing the number of ways to choose k successes out of n trials. The expected value (mean) is E[K] = np, and the variance is \operatorname{Var}(K) = np(1-p), both of which scale linearly with n for fixed p. Named after Jacob Bernoulli, the distribution was formally introduced in his posthumously published Ars Conjectandi in 1713, marking a foundational contribution to probability theory. Common examples include modeling the number of heads in n flips of a fair coin (where p = 0.5) or the count of defective items in a quality control sample of size n from a production process with defect probability p. In the Poisson limit theorem, the binomial distribution serves as the starting point, being approximated by the Poisson distribution under certain conditions on n and p.
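As a minimal illustration of these formulas, the following Python sketch (the function name is illustrative, not from any particular library) evaluates the binomial probability mass function from its definition and checks the mean and variance numerically.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Binomial(n, p), straight from the definition."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # e.g., ten flips of a fair coin
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

mean = sum(k * q for k, q in enumerate(pmf))               # equals n*p = 5
var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))  # equals n*p*(1-p) = 2.5
print(f"sum of pmf = {sum(pmf):.6f}, mean = {mean:.4f}, variance = {var:.4f}")
```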

Poisson distribution

The Poisson distribution is a discrete probability distribution that models the number of events occurring within a fixed interval of time or space, under the assumptions of a constant average rate \lambda > 0 and independence of occurrences. It arises naturally in scenarios where events are sporadic and the probability of more than one event in a very small subinterval is negligible. The probability mass function of a Poisson random variable X is given by P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \dots, where e is the base of the natural logarithm and k! denotes the factorial of k. The expected value (mean) is E[X] = \lambda, and the variance is \operatorname{Var}(X) = \lambda, making the distribution equidispersed, with mean equal to variance. This distribution has infinite support over the non-negative integers and is commonly applied to count data, such as the number of defects in a manufactured item or the number of arrivals at a service facility. For instance, it describes the number of radioactive decays in a sample over a fixed period or the number of customer arrivals in a store during a given hour, assuming the events occur independently at a constant rate. The Poisson distribution serves as a limiting case of the binomial distribution when the number of trials is large and the success probability is small, with \lambda = np.
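A matching Python sketch for the Poisson case, under the same caveats, verifies equidispersion (mean equal to variance) by truncating the infinite support at a point where the remaining tail mass is negligible.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 3.0
pmf = [poisson_pmf(k, lam) for k in range(60)]  # tail beyond k = 60 is negligible for lam = 3

mean = sum(k * q for k, q in enumerate(pmf))
var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
print(f"mean = {mean:.4f}, variance = {var:.4f}")  # both approximately lam = 3
```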

Theorem

Statement

The Poisson limit theorem, also known as the law of rare events, asserts that under suitable conditions, the binomial distribution converges to the Poisson distribution as the number of trials increases indefinitely while the expected number of successes remains fixed. Formally, let X_n follow a binomial distribution with parameters n and p_n, where n p_n = \lambda for a fixed \lambda > 0, so that p_n \to 0 as n \to \infty. Then X_n converges in distribution to a Poisson distribution with parameter \lambda, denoted X_n \xrightarrow{d} \mathrm{Poisson}(\lambda). This convergence is equivalent to pointwise convergence of the probability mass functions: for each fixed nonnegative integer k, \lim_{n \to \infty} P(X_n = k) = e^{-\lambda} \frac{\lambda^k}{k!}. The conditions p_n \to 0 and n p_n = \lambda (constant) ensure the approximation is valid in the regime of rare but numerous independent events.
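A quick numerical check of the statement, with illustrative values λ = 4 and k = 2, shows the binomial probabilities approaching the Poisson limit as n grows:

```python
from math import comb, exp, factorial

lam, k = 4.0, 2  # illustrative choices

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}: P(X_n = {k}) = {binom_pmf(k, n, lam / n):.6f}")
print(f"Poisson limit e^-lam lam^k / k!   = {exp(-lam) * lam**k / factorial(k):.6f}")
```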

Interpretation

The Poisson limit theorem, also known as the law of rare events, intuitively captures how the binomial distribution simplifies to the Poisson distribution under conditions of rarity and abundance. Consider a scenario with n independent Bernoulli trials, each having a small success probability p, such that the expected number of successes λ = np remains fixed as n grows large. In this regime, successes become rare events across the many trials, and the probability of exactly k successes approaches the Poisson form because higher-order terms in the binomial probabilities diminish and the distribution is dominated by isolated occurrences. The law of rare events formalizes this intuition: when events are sufficiently improbable yet numerous trials are conducted, the probability of exactly k occurrences approximates e^{-λ} λ^k / k!, assuming independence. This arises because the binomial PMF \binom{n}{k} p^k (1-p)^{n-k} converges to \frac{\lambda^k e^{-\lambda}}{k!} as n → ∞ with np = λ fixed. Originally derived by Poisson in his analysis of judicial error probabilities, this principle underscores the theorem's role in modeling phenomena where successes are sparse but collectively informative.

Proofs

Direct probabilistic proof

The direct probabilistic proof establishes the Poisson limit theorem by computing the pointwise limit of the probability mass function of a binomial random variable X_n \sim \operatorname{Bin}(n, p) as n \to \infty with p = \lambda / n for fixed \lambda > 0, evaluated at each fixed nonnegative integer k. This approach works directly with the binomial probabilities in the rare events regime, where the expected number of successes \lambda = np is held constant. The probability mass function of X_n is given by P(X_n = k) = \binom{n}{k} p^k (1-p)^{n-k} = \binom{n}{k} \left( \frac{\lambda}{n} \right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k}, for k = 0, 1, \dots, n. Substituting the binomial coefficient \binom{n}{k} = \frac{n(n-1) \cdots (n-k+1)}{k!} yields P(X_n = k) = \frac{n(n-1) \cdots (n-k+1)}{k!} \cdot \frac{\lambda^k}{n^k} \cdot \left(1 - \frac{\lambda}{n}\right)^{n-k}. This can be rewritten as P(X_n = k) = \frac{\lambda^k}{k!} \cdot \prod_{j=0}^{k-1} \left(1 - \frac{j}{n}\right) \cdot \left(1 - \frac{\lambda}{n}\right)^{n-k}. To evaluate the limit as n \to \infty for fixed k, observe that the product \prod_{j=0}^{k-1} (1 - j/n) \to 1, since each factor 1 - j/n \to 1 and there are finitely many factors. Next, factor the remaining exponential term as \left(1 - \frac{\lambda}{n}\right)^{n-k} = \left(1 - \frac{\lambda}{n}\right)^n \cdot \left(1 - \frac{\lambda}{n}\right)^{-k}. It is well known that \left(1 - \frac{\lambda}{n}\right)^n \to e^{-\lambda} and \left(1 - \frac{\lambda}{n}\right)^{-k} \to 1 as n \to \infty. Therefore, \lim_{n \to \infty} P(X_n = k) = \frac{\lambda^k}{k!} \cdot 1 \cdot e^{-\lambda} \cdot 1 = \frac{e^{-\lambda} \lambda^k}{k!}, which is the probability mass function of the \operatorname{Poisson}(\lambda) distribution. This convergence holds for every fixed k \geq 0, completing the proof.
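The three factors in the decomposition can be tracked numerically; a small Python sketch (values chosen arbitrarily) confirms that the rising-product term and the correction term tend to 1 while the main term tends to e^{-λ}:

```python
from math import exp, factorial

lam, k = 4.0, 3  # arbitrary illustrative values
print(f"target e^-lam = {exp(-lam):.6f}")
for n in (10, 100, 1000, 10000):
    rising = 1.0
    for j in range(k):
        rising *= 1 - j / n            # prod_{j<k} (1 - j/n) -> 1
    main = (1 - lam / n) ** n          # -> e^{-lam}
    corr = (1 - lam / n) ** (-k)       # -> 1
    pmf = lam**k / factorial(k) * rising * main * corr
    print(f"n = {n:>5}: rising = {rising:.4f}, main = {main:.6f}, corr = {corr:.4f}, P = {pmf:.6f}")
```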

Generating functions proof

The proof of the Poisson limit theorem using probability generating functions (PGFs) provides an elegant transform-based approach to establishing convergence in distribution from the binomial distribution to the Poisson distribution. The PGF of a non-negative integer-valued random variable X is defined as G(s) = \mathbb{E}[s^X] for |s| \leq 1, and it uniquely determines the distribution of X. Consider a binomial random variable X_n with parameters n and success probability p_n = \lambda / n, where \lambda > 0 is fixed. The PGF of X_n is G_n(s) = \left(1 - \frac{\lambda}{n} + \frac{\lambda}{n} s \right)^n, \quad |s| \leq 1. As n \to \infty, this expression converges pointwise to G(s) = e^{\lambda (s - 1)}, \quad |s| \leq 1, which is the PGF of a Poisson random variable with parameter \lambda. The convergence follows from the standard limit \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x with x = \lambda(s - 1), after rewriting the binomial PGF as \left(1 + \frac{\lambda(s-1)}{n}\right)^n. Since the PGF uniquely determines the distribution, the continuity theorem for PGFs implies that this pointwise convergence on the compact interval [-1, 1] forces X_n to converge in distribution to a Poisson random variable with parameter \lambda. This establishes the Poisson limit theorem, with the Poisson distribution approximating the binomial under the specified conditions. Although ordinary generating functions (without the probabilistic interpretation) bear a close resemblance, PGFs are the standard tool for discrete distributions in this context because their coefficients are exactly the probability mass function values.
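Pointwise convergence of the PGFs can likewise be observed directly; the Python sketch below (with an arbitrary λ) evaluates G_n(s) against e^{λ(s-1)} at a few points of [-1, 1]:

```python
from math import exp

lam = 2.5  # arbitrary illustrative rate
for s in (-1.0, 0.0, 0.5, 1.0):
    target = exp(lam * (s - 1))  # Poisson PGF e^{lam(s-1)}
    row = ", ".join(f"n={n}: {(1 - lam / n + lam * s / n) ** n:.6f}"
                    for n in (10, 1000, 100000))
    print(f"s = {s:+.1f}: {row}  ->  {target:.6f}")
```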

Characteristic functions proof

The characteristic function provides an analytic tool for establishing the Poisson limit theorem, leveraging Fourier transforms to demonstrate convergence in distribution. Consider a sequence of independent trials where X_n \sim \operatorname{Bin}(n, p_n) with n p_n \to \lambda > 0 as n \to \infty. The characteristic function of X_n is \phi_{X_n}(t) = \mathbb{E}[e^{i t X_n}] = (1 - p_n + p_n e^{i t})^n. Substituting p_n = \lambda / n, this simplifies to \phi_{X_n}(t) = \left(1 + \frac{\lambda (e^{i t} - 1)}{n}\right)^n. As n \to \infty, the expression converges pointwise to e^{\lambda (e^{i t} - 1)}, which is the characteristic function of the Poisson distribution with parameter \lambda. To verify the limit, consider the natural logarithm of the characteristic function: \log \phi_{X_n}(t) = n \log \left(1 + \frac{\lambda (e^{i t} - 1)}{n}\right). For large n, the argument of the logarithm is close to 1, so the Taylor expansion \log(1 + x) = x + O(x^2) as x \to 0 yields \log \phi_{X_n}(t) = n \left[ \frac{\lambda (e^{i t} - 1)}{n} + O\left(\frac{1}{n^2}\right) \right] = \lambda (e^{i t} - 1) + O\left(\frac{1}{n}\right), which approaches \lambda (e^{i t} - 1) as n \to \infty. Exponentiating gives the desired Poisson characteristic function. Lévy's continuity theorem then implies that pointwise convergence of the characteristic functions ensures convergence in distribution: X_n \Rightarrow \operatorname{Poisson}(\lambda). This theorem, which links characteristic function convergence to distributional limits under a mild continuity condition on the limiting function, underpins the result. The characteristic function approach offers broader applicability than direct probabilistic methods, readily extending to sums of independent but non-identical Bernoulli random variables (where \sum_m p_{n,m} \to \lambda and \max_m p_{n,m} \to 0) and to general infinitely divisible distributions, facilitating proofs in more abstract settings beyond simple discrete counts.
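Since characteristic functions are complex-valued, the analogous numerical check uses Python's cmath; the gap |φ_{X_n}(t) − e^{λ(e^{it}−1)}| shrinks roughly like 1/n, consistent with the O(1/n) error term above (λ and t are arbitrary illustrative values):

```python
import cmath

lam, t = 3.0, 1.7  # arbitrary illustrative values
target = cmath.exp(lam * (cmath.exp(1j * t) - 1))  # Poisson characteristic function
for n in (10, 100, 1000, 10000):
    phi_n = (1 + lam * (cmath.exp(1j * t) - 1) / n) ** n  # binomial cf with p_n = lam/n
    print(f"n = {n:>5}: |phi_n - target| = {abs(phi_n - target):.2e}")
```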

Applications and extensions

Statistical applications

The Poisson limit theorem finds significant application in statistical hypothesis testing, particularly for approximating distributions in scenarios involving rare events. When testing hypotheses about the success probability in a binomial model with large sample size n and small probability p, the Poisson approximation with \lambda = np simplifies the computation of test statistics and p-values, avoiding exact binomial calculations that become computationally intensive for large n. For instance, in chi-squared goodness-of-fit tests for binned data where expected frequencies are low (e.g., rare outcomes in contingency tables), the Poisson model provides a more accurate alternative to the standard chi-squared approximation, which assumes adequate cell counts and can otherwise lead to inflated Type I error rates.

In constructing confidence intervals for rare event probabilities, the theorem enables the use of Poisson-based methods to approximate intervals efficiently. Specifically, for estimating the proportion p in a binomial setting, confidence bounds for p can be derived from those of the Poisson mean \lambda = np by rescaling, such as \hat{p}_L = \hat{\lambda}_L / n and \hat{p}_U = \hat{\lambda}_U / n, where \hat{\lambda}_L and \hat{\lambda}_U are lower and upper bounds; this approach is particularly reliable when np \leq 10. Such approximations are valuable in fields where events are infrequent, ensuring intervals remain conservative without requiring extensive computation.

A practical example arises in quality control for estimating defect rates in manufacturing processes. When inspecting a large number n of items with a low defect probability p, the number of defects follows a binomial distribution that can be well approximated by a Poisson distribution with \lambda = np, facilitating quick assessments of process stability and the setting of control limits for defect counts per batch.

The Poisson approximation also enhances computational efficiency in software implementations for statistical analyses involving large-scale binomial data. Unlike the binomial probability mass function, which requires summing up to n+1 terms and can suffer from numerical underflow for large n, the Poisson formula P(X = k) = e^{-\lambda} \lambda^k / k! involves fewer operations and is easier to implement stably, especially in simulations or iterative algorithms for rare event modeling.

Historically, the theorem's implications were applied in early 20th-century genetics to model mutation rates as rare events. J. B. S. Haldane utilized the Poisson approximation to analyze the distribution of mutations in populations, treating them as infrequent occurrences in large genetic samples, which informed estimates of mutation rates and their role in evolution.
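As a sketch of the rescaling idea for confidence bounds, the Python code below uses the standard chi-square inversion of the Poisson distribution (a Garwood-type exact interval) and divides by n; the helper name poisson_ci and the sample numbers are hypothetical, and SciPy is assumed to be available.

```python
from scipy.stats import chi2

def poisson_ci(k: int, conf: float = 0.95) -> tuple[float, float]:
    """Garwood-type exact confidence interval for a Poisson mean, given an observed count k."""
    alpha = 1 - conf
    lower = 0.0 if k == 0 else 0.5 * chi2.ppf(alpha / 2, 2 * k)
    upper = 0.5 * chi2.ppf(1 - alpha / 2, 2 * (k + 1))
    return lower, upper

# Hypothetical data: 4 defects observed among 10,000 inspected items (np <= 10).
k, n = 4, 10_000
lam_lo, lam_hi = poisson_ci(k)
print(f"95% CI for p: ({lam_lo / n:.2e}, {lam_hi / n:.2e})")
```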

Generalizations to other limits

The Poisson limit theorem extends beyond the independent and identically distributed (i.i.d.) case to sums of independent but non-identical indicators, as captured by Le Cam's theorem. This result bounds the total variation distance between the distribution of the sum S = \sum_{i=1}^n X_i, where the X_i are independent Bernoulli random variables with possibly different success probabilities p_i, and a Poisson distribution with mean \lambda = \sum p_i. Specifically, the distance \sum_k |P(S = k) - e^{-\lambda} \lambda^k / k!| is at most 2 \sum p_i^2, allowing for approximation even when probabilities vary, provided they are small and their sum remains moderate.

In the multivariate setting, the theorem generalizes to the approximation of a multinomial distribution by a product of independent Poisson distributions. For a multinomial vector (N_1, \dots, N_k) with n trials and probabilities (p_1, \dots, p_k), where \sum_j p_j = 1, \max_j p_j \to 0, and n p_j \to \lambda_j < \infty as n \to \infty, the joint distribution of the counts is approximated by that of independent Poisson random variables with means \lambda_j; indeed, the multinomial is exactly the conditional distribution of independent Poisson variables with means n p_j given that their sum equals n. This extension applies to superpositions of point processes approximating Poisson processes, with error bounds in total variation distance derived from univariate binomial approximations.

Poissonization provides another generalization, particularly useful in random allocation problems such as balls and bins. Here, the fixed number of balls m is replaced by a Poisson random variable with mean m, transforming the joint distribution of bin occupancies from dependent binomial counts to independent Poisson counts with means m p_i. Conditioning on the total number of balls recovers the original model, facilitating analysis of limits like the maximum load, where the Poisson paradigm yields asymptotic results equivalent to the binomial case as m \to \infty. This technique simplifies proofs for rare event approximations in combinatorial settings.

For dependent rare events, compound Poisson limits arise, generalizing the theorem to sums where indicators may cluster due to dependence, such as in Markov chains. In multi-state Markov chains with rare transitions, the sum of indicators for state visits converges in distribution to a compound Poisson distribution, where the compounding reflects the cluster sizes induced by dependence. This extends earlier results for two-state chains and applies to higher-order dependencies, bounding errors via generating functions under mixing conditions.

Stein's method further extends Poisson approximation to dependent settings via the Chen-Stein approach, providing explicit error bounds for sums of locally dependent indicators. The total variation distance between the law of W = \sum X_i and a Poisson distribution with mean \lambda = E[W] is at most (1 - e^{-\lambda})/\lambda \cdot (b_1 + b_2 + b_3), where the b-terms quantify neighborhood dependencies: b_1 sums squared probabilities, b_2 sums expected neighbor influences, and b_3 sums conditional expectations outside neighborhoods. This method applies to complex dependencies in random graphs, sequences, and point processes, improving on Le Cam's bounds for non-independent cases.
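A numerical illustration of Le Cam's bound, assuming nothing beyond the statement above: the following Python sketch computes the exact distribution of a sum of independent, non-identical Bernoulli indicators (a Poisson binomial distribution) by dynamic programming, then compares its L¹ distance from the matching Poisson law against the bound 2 \sum p_i^2 (the probabilities are chosen arbitrarily).

```python
from math import exp, factorial

def poisson_binomial_pmf(ps):
    """Exact PMF of S = sum of independent Bernoulli(p_i), by dynamic programming."""
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)   # trial fails
            new[k + 1] += q * p     # trial succeeds
        pmf = new
    return pmf

ps = [0.01 * (1 + i % 5) for i in range(200)]  # arbitrary small, unequal probabilities
lam = sum(ps)

pb = poisson_binomial_pmf(ps)
po = [exp(-lam) * lam**k / factorial(k) for k in range(len(pb))]

l1 = sum(abs(a - b) for a, b in zip(pb, po))  # sum_k |P(S = k) - Poisson(k)|
print(f"L1 distance = {l1:.5f}  <=  Le Cam bound 2*sum(p_i^2) = {2 * sum(p * p for p in ps):.5f}")
```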
