Negative binomial distribution

The negative binomial distribution is a discrete probability distribution that models the number of independent Bernoulli trials required to achieve a fixed number r of successes, or equivalently, the number of failures preceding r successes, where each trial has a constant success probability p (0 < p ≤ 1) and r is a positive integer parameter. In one common parameterization, a negative binomial random variable X represents the number of failures before the r-th success, taking non-negative integer values (X = 0, 1, 2, ...). The probability mass function is given by
P(X = k) = \binom{k + r - 1}{k} p^r (1 - p)^k
for k = 0, 1, 2, ..., where \binom{k + r - 1}{k} is the binomial coefficient. An alternative parameterization defines Y = X + r as the total number of trials until the r-th success (Y = r, r+1, ...), with probability mass function
P(Y = n) = \binom{n - 1}{r - 1} p^r (1 - p)^{n - r}. Both forms share the same variance but differ in their means: E[X] = r(1 - p)/p and E[Y] = r/p. The variance for either is \operatorname{Var}(X) = \operatorname{Var}(Y) = r(1 - p)/p^2, which exceeds the mean, indicating overdispersion compared to the Poisson distribution.
Key properties include the moment-generating function M(t) = \left[ p / (1 - (1 - p)e^t) \right]^r for the failures parameterization (valid for t < -\ln(1 - p)), from which the mean and variance can be derived via differentiation. The distribution is unimodal, with the mode at \lfloor (r - 1)(1 - p)/p \rfloor for X when r > 1 (and at 0 when r ≤ 1). A negative binomial random variable with parameters r and p is the sum of r independent geometric random variables, each representing the number of failures before a single success (with the same p). When r = 1, it reduces exactly to the geometric distribution. As r → ∞ and p → 1 with the mean held constant, the negative binomial converges to a Poisson distribution, making it a natural generalization of the Poisson for modeling processes with extra variability. Sums of independent negative binomials with the same p but different r values also follow a negative binomial distribution. The negative binomial distribution finds broad applications in statistics for analyzing count data that display greater variability than expected under a Poisson model, such as species abundances in ecology, accident frequencies in insurance, or word counts in linguistics. In regression contexts, negative binomial models extend generalized linear models to handle overdispersed outcomes, common in fields like epidemiology and quality control. It also appears in classic probability problems, such as the Banach matchbox problem or series of games like the "problem of points." For large r, normal approximations facilitate further analysis and inference.

Definitions

Probability Mass Function

The negative binomial distribution describes the probability of observing exactly k failures before the r-th success in a sequence of independent Bernoulli trials, each with success probability p. The probability mass function is given by P(X = k) = \binom{k + r - 1}{k} p^r (1-p)^k for k = 0, 1, 2, \dots, where r > 0 is the number of successes (typically a positive integer) and 0 < p \leq 1 is the success probability. The support of the distribution consists of the non-negative integers k. This formula arises from the product of the binomial coefficient, which counts the number of ways to arrange k failures and r-1 successes in the first k + r - 1 trials, and the Bernoulli probabilities p^r (1-p)^k for the r successes and k failures overall, with the final trial being a success. The probabilities sum to 1 over all k \geq 0 because the expression corresponds to the negative binomial series expansion of [p / (1 - (1-p))]^r = 1^r = 1. When r = 1, this reduces to the geometric distribution.
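
As a quick numerical check, the closed-form PMF can be compared against SciPy's implementation; the sketch below is illustrative only, with arbitrary parameter values r = 3 and p = 0.4.

```python
from math import comb

from scipy.stats import nbinom

r, p = 3, 0.4  # arbitrary example parameters

# PMF from the closed form: C(k + r - 1, k) * p^r * (1 - p)^k
def nb_pmf(k: int, r: int, p: float) -> float:
    return comb(k + r - 1, k) * p**r * (1 - p) ** k

# Agrees with scipy.stats.nbinom, which uses the same failures parameterization.
for k in range(10):
    assert abs(nb_pmf(k, r, p) - nbinom.pmf(k, r, p)) < 1e-12

# The probabilities sum to 1 over the support (up to truncation error).
print(sum(nb_pmf(k, r, p) for k in range(500)))  # ~1.0
```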

Cumulative Distribution Function

The cumulative distribution function of the negative binomial random variable X, which counts the number of failures before r successes in independent Bernoulli trials each with success probability p, is defined as
F(x; r, p) = P(X \leq x) = \sum_{k=0}^{\lfloor x \rfloor} \binom{k + r - 1}{k} p^r (1-p)^k
for x \geq 0, r > 0, and 0 < p < 1.
This finite sum admits a closed-form expression in terms of the regularized incomplete beta function:
F(x; r, p) = I_p(r, \lfloor x \rfloor + 1),
where
I_z(a, b) = \frac{1}{B(a, b)} \int_0^z t^{a-1} (1-t)^{b-1} \, dt
is the regularized incomplete beta function and B(a, b) is the beta function.
Equivalently, the survival function or tail probability is
P(X > x) = 1 - F(x; r, p) = I_{1-p}(\lfloor x \rfloor + 1, r).
Direct evaluation of the defining sum for F(x; r, p) poses numerical challenges for large \lfloor x \rfloor or r, as it involves many terms with potentially large binomial coefficients that can cause overflow or require high-precision arithmetic. The incomplete beta representation mitigates these issues by leveraging efficient algorithms for incomplete beta integrals, such as continued fractions or series expansions, which are implemented in numerical libraries for stable evaluation even in asymptotic regimes where x is large and F(x; r, p) approaches 1. For very large x, the tail P(X > x) becomes small, and its asymptotic decay follows the geometric tail behavior modulated by the binomial coefficients, often approximated via saddlepoint methods. For large r, normal approximations from the central limit theorem apply to the central region of the distribution.
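
The identity can be checked numerically with SciPy, whose scipy.special.betainc computes the regularized incomplete beta function I_z(a, b); the parameter values below are arbitrary illustrations.

```python
from scipy.special import betainc
from scipy.stats import nbinom

r, p, x = 5, 0.3, 12  # arbitrary example values

# Direct finite sum of the PMF
direct = sum(nbinom.pmf(k, r, p) for k in range(x + 1))

# Closed form: F(x; r, p) = I_p(r, x + 1), with betainc(a, b, z) = I_z(a, b)
via_beta = betainc(r, x + 1, p)

print(direct, via_beta, nbinom.cdf(x, r, p))  # all three agree
```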

Parameterizations and Formulations

Standard r-p Parameterization

The standard r-p parameterization of the negative binomial distribution employs two key parameters: r > 0, which represents the number of successes, and p, the probability of success on each independent Bernoulli trial, satisfying 0 < p \leq 1. In its classical form, r is a positive integer denoting the fixed number of successes required to terminate the sequence of trials. This setup models the distribution of the number of failures preceding the r-th success in a series of independent trials. The parameterization extends naturally to real-valued r > 0, facilitating applications in generalized linear models for overdispersed count data, where r acts as a dispersion parameter controlling the variance-to-mean relationship. Under this parameterization, the mean number of failures is given by \mu = \frac{r(1 - p)}{p}, and the variance by \sigma^2 = \frac{r(1 - p)}{p^2}. These expressions highlight the overdispersion property, as the variance exceeds the mean when p < 1. This r-p form serves as the conventional baseline in theoretical probability and is widely implemented in statistical software libraries. For instance, Python's SciPy module uses scipy.stats.nbinom(n=r, p=p), supporting real-valued n (equivalent to r). Similarly, R's base package employs dnbinom(x, size=r, prob=p), where size can be non-integer to accommodate the generalized case.
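
A minimal SciPy illustration of this parameterization (arbitrary example values, including a non-integer r) that confirms the mean and variance formulas:

```python
from scipy.stats import nbinom

# Real-valued "number of successes" is allowed, as in generalized models.
r, p = 2.5, 0.35  # arbitrary illustration values

dist = nbinom(n=r, p=p)
mean, var = dist.stats(moments="mv")

# Match mu = r(1-p)/p and sigma^2 = r(1-p)/p^2
print(float(mean), r * (1 - p) / p)
print(float(var), r * (1 - p) / p**2)
```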

Alternative Parameterizations

The negative binomial distribution can be parameterized using the mean \mu and the success probability p, where the shape parameter is expressed as r = \frac{\mu p}{1 - p}. This form maintains compatibility with the standard probability mass function while facilitating direct specification of the expected value in applications such as count regression models. Another common alternative is the overdispersion parameterization, which uses the mean \mu and variance \sigma^2 > \mu, with p = \frac{\mu}{\sigma^2} and r = \frac{\mu^2}{\sigma^2 - \mu}. This approach is particularly useful in modeling count data where variance exceeds the mean, such as in ecological or econometric analyses, by explicitly accounting for the dispersion parameter \phi = \frac{1}{r} = \frac{\sigma^2 - \mu}{\mu^2} in the variance formula \sigma^2 = \mu + \phi \mu^2. In the limit as r \to 0, the negative binomial distribution relates to the logarithmic series distribution, which arises as a special case for modeling species abundance or rare events, with the probability mass function simplifying to P(X = k) = -\frac{\theta^k}{k \ln(1 - \theta)} for 0 < \theta < 1. This logarithmic parameterization highlights the distribution's connection to infinite series expansions and is often applied in biodiversity studies. Transformations between parameter sets, such as from (r, p) to (\mu, \sigma^2), involve solving the relations \mu = \frac{r(1-p)}{p} and \sigma^2 = \frac{r(1-p)}{p^2}, with the inverse given above.
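
The transformations can be packaged as small helper functions; the sketch below is illustrative (the function names and example values are not from any library) and round-trips between the two parameter sets:

```python
def rp_to_mu_var(r: float, p: float) -> tuple[float, float]:
    """(r, p) -> (mean, variance) in the failures parameterization."""
    mu = r * (1 - p) / p
    return mu, mu / p  # sigma^2 = r(1-p)/p^2 = mu / p

def mu_var_to_rp(mu: float, var: float) -> tuple[float, float]:
    """(mean, variance) -> (r, p); requires overdispersion var > mu."""
    if var <= mu:
        raise ValueError("negative binomial requires variance > mean")
    return mu**2 / (var - mu), mu / var

r, p = mu_var_to_rp(*rp_to_mu_var(4.0, 0.25))
print(r, p)  # recovers (4.0, 0.25) up to rounding
```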

Interpretations

Number of Failures Before Fixed Successes

The negative binomial distribution arises in the context of a sequence of independent Bernoulli trials, each with a fixed success probability p (where 0 < p < 1), where the random variable X represents the number of failures observed before achieving exactly r successes, with r being a positive integer parameter. This setup models scenarios involving repeated independent experiments until a predetermined number of positive outcomes occur, such as testing items for defects until a specified number of non-defective units are found. When r = 1, this interpretation simplifies to the geometric distribution, which counts the number of failures preceding the first success in Bernoulli trials. For the general case of r > 1, X is the sum of r independent and identically distributed geometric random variables, each capturing the failures occurring between consecutive successes: the first geometric variable for failures before the initial success, the second for failures between the first and second success, and so on, up to the r-th success. This additive structure intuitively builds the distribution by accumulating waiting times for each successive success milestone. The probability mass function emerges from a sequential probability argument in this trial process: for X = k (where k = 0, 1, 2, \dots), exactly r-1 successes must occur in the first k + r - 1 trials, interspersed with k failures, followed by a success on the (k + r)-th trial. This can be conceptualized via a probability tree, where branches denote success or failure at each step, and the total probability sums the paths leading to precisely k failures before the r-th success, accounting for the combinatorial arrangements of those outcomes. This failures-before-successes interpretation has historical ties to early probability developments in the 17th and 18th centuries, stemming from analyses of problems involving repeated plays until a fixed number of wins, as explored in foundational works on probability. It also saw early applications in reliability testing, such as modeling production defects until a quota of reliable components is met.
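
This interpretation translates directly into simulation. The sketch below (with arbitrary parameter values) runs raw Bernoulli trials and checks the sample mean against r(1-p)/p:

```python
import random

def failures_before_r_successes(r: int, p: float, rng: random.Random) -> int:
    """Run Bernoulli(p) trials until the r-th success; count the failures."""
    successes = failures = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures

rng = random.Random(0)
r, p, n = 3, 0.4, 100_000  # arbitrary example values
draws = [failures_before_r_successes(r, p, rng) for _ in range(n)]

# Sample mean should be near r(1-p)/p = 4.5 for these parameters.
print(sum(draws) / n)
```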

Number of Trials Until Fixed Successes

In the negative binomial distribution, an alternative interpretation arises by considering the total number of trials required to achieve a fixed number r of successes, building on the perspective of counting failures before those successes. Let X denote the number of failures before the r-th success in a sequence of independent Bernoulli trials with success probability p; then define Y = X + r as the total number of trials until the r-th success occurs. The random variable Y follows a negative binomial distribution, with support y = r, r+1, r+2, \dots, and probability mass function P(Y = y) = \binom{y-1}{r-1} p^r (1-p)^{y-r}. This formulation is commonly applied in quality control to model the total number of items inspected until r defects are identified, aiding in process monitoring and acceptance decisions. In clinical trials, it describes the total number of patients enrolled until r positive treatment responses are observed, supporting sequential design and stopping rules. The mean and variance of Y reflect the shift from the failures-only count: \mathbb{E}[Y] = r / p and \mathrm{Var}(Y) = r(1-p)/p^2. Historically, this version of the negative binomial distribution—emphasizing trials until r successes—is synonymous with the Pascal distribution, named after Blaise Pascal for its roots in early probability problems involving repeated trials.
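
A brief sketch (arbitrary example values) confirming that this trials form is simply the failures form shifted by r:

```python
from math import comb

from scipy.stats import nbinom

r, p = 3, 0.4  # arbitrary example values

# P(Y = y) = C(y-1, r-1) p^r (1-p)^(y-r) for y = r, r+1, ...
def trials_pmf(y: int, r: int, p: float) -> float:
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)

# Equivalent to the failures parameterization evaluated at x = y - r.
for y in range(r, r + 10):
    assert abs(trials_pmf(y, r, p) - nbinom.pmf(y - r, r, p)) < 1e-12

print(sum(trials_pmf(y, r, p) for y in range(r, 200)))  # ~1.0
```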

Properties

Moments

The negative binomial distribution in its standard parameterization models the number of failures X before observing r successes in a sequence of independent trials, each with success probability p (where 0 < p < 1 and r is a positive integer). The moments of this distribution can be derived by recognizing that X is the sum of r independent geometric random variables, each representing the number of failures before a single success, with mean \frac{1-p}{p} and variance \frac{1-p}{p^2}. The expected value is obtained by linearity of expectation: \mathbb{E}[X] = r \cdot \frac{1-p}{p} = \frac{r(1-p)}{p}. This follows directly from the additivity of the means of the component geometric distributions. Similarly, the variance is the sum of the individual variances: \mathrm{Var}(X) = r \cdot \frac{1-p}{p^2} = \frac{r(1-p)}{p^2}. Since \mathrm{Var}(X) = \mathbb{E}[X(X-1)] + \mathbb{E}[X] - (\mathbb{E}[X])^2, the second factorial moment \mathbb{E}[X(X-1)] = r(r+1) \frac{(1-p)^2}{p^2} can also be used to confirm this result via the probability generating function or direct summation. For p < 1, the variance exceeds the mean (\mathrm{Var}(X) > \mathbb{E}[X]), a property known as overdispersion, which makes the distribution suitable for modeling count data with greater variability than a Poisson distribution. Higher-order moments describe the shape of the distribution. The skewness, measuring asymmetry, is positive and decreases with r: \gamma_1 = \frac{2 - p}{\sqrt{r(1 - p)}}. This expression arises because the skewness of the sum of r i.i.d. geometric random variables (each with skewness \frac{2 - p}{\sqrt{1 - p}}) scales by a factor of 1/\sqrt{r}. The kurtosis, indicating tail heaviness, is \gamma_2 = 3 + \frac{6}{r} + \frac{p^2}{r(1 - p)}, which exceeds 3 for finite r and p < 1, reflecting leptokurtic tails relative to the normal distribution; as r \to \infty, it approaches 3. This can be derived from the fourth central moment using the representation as a sum of geometrics or via the probability generating function \left( \frac{p}{1 - (1 - p)t} \right)^r.
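
These moment formulas can be checked against SciPy; note that scipy.stats reports excess kurtosis (\gamma_2 - 3) rather than \gamma_2. The parameter values below are arbitrary.

```python
from scipy.stats import nbinom

r, p = 4, 0.3  # arbitrary example values

mean, var, skew, ex_kurt = nbinom.stats(r, p, moments="mvsk")

print(float(mean), r * (1 - p) / p)                  # E[X]
print(float(var), r * (1 - p) / p**2)                # Var(X)
print(float(skew), (2 - p) / (r * (1 - p)) ** 0.5)   # gamma_1
print(float(ex_kurt), 6 / r + p**2 / (r * (1 - p)))  # gamma_2 - 3
```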

Generating Functions

The generating functions of the negative binomial distribution provide powerful tools for deriving its moments and analyzing its properties, particularly in the context of sums of independent random variables. For a negative binomial random variable X with parameters r > 0 (number of successes) and 0 < p < 1 (success probability on each Bernoulli trial), where X counts the number of failures before the rth success, these functions are derived from the probability mass function P(X = k) = \binom{k + r - 1}{k} p^r (1 - p)^k for k = 0, 1, 2, \dots. The probability generating function (PGF) is defined as G(s) = \mathbb{E}[s^X] = \sum_{k=0}^\infty P(X = k) s^k. For the negative binomial distribution, it takes the form G(s) = \left( \frac{p}{1 - (1 - p)s} \right)^r, \quad |s| < \frac{1}{1 - p}. This closed-form expression arises from recognizing the PGF as the rth power of the geometric distribution's PGF (the r = 1 case) and summing the resulting negative binomial series. The moment-generating function (MGF) is M(t) = \mathbb{E}[e^{tX}], which for the negative binomial is M(t) = \left( \frac{p}{1 - (1 - p) e^t} \right)^r, \quad t < -\ln(1 - p). The domain ensures convergence, as (1 - p) e^t < 1. This MGF is obtained from the PGF by substituting s = e^t. The characteristic function, \phi(t) = \mathbb{E}[e^{itX}], extends the MGF to complex arguments and uniquely determines the distribution. For the negative binomial, it is \phi(t) = \left( \frac{p}{1 - (1 - p) e^{it}} \right)^r. This form follows directly from replacing t with it in the MGF. Moments of X can be systematically derived from these generating functions via differentiation. For the PGF, the mean is \mathbb{E}[X] = G'(1) = r(1 - p)/p, and higher moments follow from further derivatives evaluated at s = 1. Similarly, for the MGF, \mathbb{E}[X] = M'(0) = r(1 - p)/p and \mathrm{Var}(X) = M''(0) - [M'(0)]^2 = r(1 - p)/p^2, confirming the known expressions without direct summation of the probability mass function. These functions also facilitate proofs of distributional properties, such as the negative binomial arising as the sum of r independent geometric random variables (each counting failures before one success). The MGF of the sum is the product of the individual MGFs, yielding [p / (1 - (1 - p) e^t)]^r, which matches the negative binomial MGF and thus establishes the result under the assumption of independence.
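
A short symbolic sketch, assuming SymPy is available, that reproduces the mean and variance by differentiating the PGF and MGF exactly as described:

```python
import sympy as sp

s, t, r, p = sp.symbols("s t r p", positive=True)

# PGF of the failures parameterization
G = (p / (1 - (1 - p) * s)) ** r

# Mean: G'(1) = r(1 - p)/p
print(sp.simplify(sp.diff(G, s).subs(s, 1)))

# Variance via the MGF M(t) = G(e^t): M''(0) - M'(0)^2 = r(1 - p)/p^2
M = G.subs(s, sp.exp(t))
m1 = sp.diff(M, t).subs(t, 0)
m2 = sp.diff(M, t, 2).subs(t, 0)
print(sp.simplify(m2 - m1**2))
```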

Recurrence Relations

The probability mass function of the negative binomial distribution admits a useful recurrence relation that allows for iterative calculation of probabilities without directly evaluating large binomial coefficients. For a random variable X \sim \mathrm{NB}(r, p), where r > 0 is the number of successes and 0 < p < 1 is the success probability on each Bernoulli trial (with X denoting the number of failures before the rth success), the probabilities satisfy (k+1) P(X = k+1) = (r + k)(1 - p) P(X = k) for k = 0, 1, 2, \dots, with initial condition P(X = 0) = p^r. This relation arises from the ratio of successive binomial coefficients in the PMF expression \binom{k + r - 1}{k}. This recurrence is particularly valuable for efficient numerical evaluation of the PMF and CDF when r or k is large, as direct computation of \binom{k + r - 1}{k} can lead to numerical instability from overflow or underflow in factorials or gamma functions. Starting from P(X = 0) and applying the relation sequentially minimizes computational cost and maintains precision, making it suitable for software implementations in statistical computing. Recurrence relations also extend to the moments of the distribution, often derived via differential equations from the probability generating function G(s) = \left[ \frac{p}{1 - (1-p)s} \right]^r. For instance, the cumulants \kappa_n obey the recurrence \kappa_{n+1} = PQ \frac{d \kappa_n}{d Q}, where P = (1-p)/p and Q = 1/p, with \kappa_1 = rP as the starting point; this allows systematic computation of higher-order cumulants for theoretical analysis or approximation purposes. Similar recurrences apply to factorial moments, obtained by applying the operator s \frac{d}{ds} repeatedly to G(s), yielding relations like \mu_{(n+1)} = \frac{(1-p)(r + n)}{p} \mu_{(n)} for the falling factorial moments \mu_{(n)} = E[X(X-1)\cdots(X-n+1)]. In special cases, particularly for non-integer r, closed-form expressions for the CDF involve the regularized incomplete beta function. The cumulative probability P(X \leq k) = I_p(r, k+1), where I_x(a, b) is the regularized incomplete beta function, can be equivalently expressed using the Gauss hypergeometric function {}_2F_1 via the integral representation of the incomplete beta function, providing exact solutions without summation for certain parameter values.
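
The PMF recurrence is straightforward to implement; the sketch below (illustrative values, including a non-integer r) builds a table iteratively and checks it against SciPy:

```python
from scipy.stats import nbinom

def nb_pmf_table(kmax: int, r: float, p: float) -> list[float]:
    """P(X = 0..kmax) via the recurrence
    (k+1) P(k+1) = (r+k)(1-p) P(k), starting from P(0) = p^r."""
    probs = [p**r]
    for k in range(kmax):
        probs.append(probs[-1] * (r + k) * (1 - p) / (k + 1))
    return probs

r, p = 7.5, 0.2  # arbitrary values; non-integer r is handled naturally
table = nb_pmf_table(50, r, p)
assert all(abs(q - nbinom.pmf(k, r, p)) < 1e-12 for k, q in enumerate(table))
print(sum(table))  # cumulative mass up to k = 50
```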

Connection to Geometric Distribution

The negative binomial distribution generalizes the geometric distribution, which models the number of failures before the first success in a sequence of independent trials with success probability p. When the number of successes r = 1, the negative binomial distribution reduces exactly to the geometric distribution, as both describe the waiting time until the initial success. In its general form, a negative binomial random variable X \sim \text{NB}(r, p) can be expressed as the sum of r independent and identically distributed geometric random variables G_i \sim \text{Geometric}(p), where each G_i counts the number of failures before the i-th success: X = \sum_{i=1}^r G_i. This representation arises because the process of achieving r successes consists of r independent phases, each waiting for one success after the previous. To verify this equivalence, consider the probability generating function (PGF) approach. The PGF of a single geometric random variable G \sim \text{Geometric}(p) (number of failures before first success) is G(s) = \frac{p}{1 - (1-p)s} for |s| < \frac{1}{1-p}. For the sum of r i.i.d. geometrics, the PGF is the product [G(s)]^r = \left(\frac{p}{1 - (1-p)s}\right)^r, which matches the PGF of the negative binomial distribution \text{NB}(r, p). Alternatively, the connection can be established via convolution of the probability mass functions (PMFs). The PMF of a geometric G \sim \text{Geometric}(p) is P(G = k) = (1-p)^k p for k = 0, 1, 2, \dots. The PMF of the sum X = \sum_{i=1}^r G_i is obtained by the r-fold convolution, yielding P(X = x) = \sum_{k_1 + \cdots + k_r = x} \prod_{i=1}^r [(1-p)^{k_i} p] = p^r (1-p)^x \sum_{k_1 + \cdots + k_r = x} 1, where the sum counts the number of non-negative integer solutions, which is \binom{x + r - 1}{r - 1}. Thus, P(X = x) = \binom{x + r - 1}{r - 1} p^r (1-p)^x for x = 0, 1, 2, \dots, matching the negative binomial PMF (in the failures parameterization). This sum representation has practical implications for simulation: to generate a negative binomial random variable \text{NB}(r, p), independently simulate r geometric random variables \text{Geometric}(p) and sum their values, leveraging efficient algorithms for the geometric case.
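
A minimal simulation sketch of this sum representation (arbitrary values); note that NumPy's geometric generator counts trials rather than failures, hence the shift by 1:

```python
import numpy as np

rng = np.random.default_rng(0)
r, p, n = 5, 0.4, 200_000  # arbitrary example values

# rng.geometric(p) has support 1, 2, ... (trials until first success),
# so subtract 1 to obtain failures before the first success.
geoms = rng.geometric(p, size=(n, r)) - 1
x = geoms.sum(axis=1)  # sum of r iid geometrics ~ NB(r, p), failures form

print(x.mean(), r * (1 - p) / p)    # ~7.5
print(x.var(), r * (1 - p) / p**2)  # ~18.75
```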

Gamma-Poisson Mixture

The negative binomial distribution arises as a marginal distribution in a hierarchical model where the rate parameter of a Poisson distribution is itself random and follows a gamma distribution. Specifically, let \Lambda follow a gamma distribution with shape parameter \alpha > 0 and scale parameter \beta > 0, so that the probability density function of \Lambda is f(\lambda) = \frac{1}{\beta^\alpha \Gamma(\alpha)} \lambda^{\alpha-1} e^{-\lambda / \beta} for \lambda > 0. Then, conditional on \Lambda = \lambda, let X follow a Poisson distribution with mean \lambda, having P(X = x \mid \lambda) = \frac{\lambda^x e^{-\lambda}}{x!} for x = 0, 1, 2, \dots. The unconditional distribution of X is obtained by integrating out \lambda: P(X = x) = \int_0^\infty P(X = x \mid \lambda) f(\lambda) \, d\lambda = \int_0^\infty \frac{\lambda^x e^{-\lambda}}{x!} \cdot \frac{1}{\beta^\alpha \Gamma(\alpha)} \lambda^{\alpha-1} e^{-\lambda / \beta} \, d\lambda. This integral simplifies to the probability mass function of a negative binomial distribution with parameters r = \alpha and success probability p = \frac{1}{1 + \beta}: P(X = x) = \frac{\Gamma(\alpha + x)}{x! \Gamma(\alpha)} \left( \frac{1}{1 + \beta} \right)^\alpha \left( \frac{\beta}{1 + \beta} \right)^x, \quad x = 0, 1, 2, \dots. The derivation relies on the fact that the Poisson-gamma mixture yields a negative binomial form due to the conjugacy of the gamma distribution, which facilitates the integration. Under this parameterization, the mean of X is E[X] = \alpha \beta, matching the mean of the gamma-distributed rate \Lambda. The variance is Var(X) = \alpha \beta (1 + \beta), which exceeds the mean by \alpha \beta^2, the variance of \Lambda. This overdispersion—where variance surpasses the mean—contrasts with the Poisson distribution (where mean equals variance) and makes the negative binomial suitable for modeling count data with extra variability due to unobserved heterogeneity in rates. In a hierarchical Bayesian framework, the gamma serves as the conjugate prior for the Poisson rate, enabling posterior inference: given observed x, the posterior for \Lambda is gamma with updated shape \alpha + x and scale \beta / (1 + \beta). This setup naturally incorporates prior uncertainty about the rate into the marginal predictive distribution for counts, facilitating robust modeling in Bayesian analyses of overdispersed data.
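
The mixture can be verified by direct hierarchical simulation; the shape and scale values in this sketch are arbitrary illustrations:

```python
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(1)
alpha, beta, n = 3.0, 2.0, 200_000  # arbitrary shape/scale values

# Hierarchical draw: Lambda ~ Gamma(alpha, scale=beta), then X | Lambda ~ Poisson(Lambda)
lam = rng.gamma(shape=alpha, scale=beta, size=n)
x = rng.poisson(lam)

# Marginal should be negative binomial with r = alpha, p = 1/(1 + beta)
r, p = alpha, 1.0 / (1.0 + beta)
print(x.mean(), nbinom.mean(r, p))  # both ~ alpha * beta = 6
print(x.var(), nbinom.var(r, p))    # both ~ alpha * beta * (1 + beta) = 18
```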

Compound Poisson Representation

The negative binomial distribution admits a representation as a compound Poisson distribution, in which the total outcome is the sum of a Poisson-distributed number of jumps, each following a logarithmic series distribution. This formulation highlights the distribution's infinitely divisible nature and provides a framework for analyzing processes involving clustered events, such as overdispersion apparent in counting experiments or aggregated risks in insurance modeling. Specifically, a negative binomial random variable X \sim \mathrm{NB}(r, p) (in the "number of failures" parameterization, with mean r(1-p)/p) can be expressed as X = \sum_{i=1}^N Y_i, where N \sim \mathrm{Poisson}(\lambda) is the number of Poisson events, the Y_i are independent and identically distributed as \mathrm{Logarithmic}(\theta) (a discrete distribution supported on the positive integers with probability mass function P(Y_i = k) = -\theta^k / (k \ln(1-\theta)) for k = 1, 2, \dots), and N is independent of the Y_i. The parameters are related by \lambda = -r \ln(1-\theta) and \theta = 1-p, ensuring compatibility with the standard negative binomial form. This equivalence can be derived using probability generating functions (PGFs). The PGF of the logarithmic series distribution is G_Y(s) = \frac{\ln(1 - \theta s)}{\ln(1 - \theta)}, \quad |s| < 1/\theta. The PGF of the compound sum X is then G_X(s) = \exp\bigl( \lambda \bigl( G_Y(s) - 1 \bigr) \bigr). Substituting the expressions for \lambda and G_Y(s) simplifies to G_X(s) = \exp\left( r \ln \frac{1 - \theta}{1 - \theta s} \right) = \left( \frac{1 - \theta}{1 - \theta s} \right)^r = \left( \frac{p}{1 - (1-p)s} \right)^r, which matches the PGF of the negative binomial distribution \mathrm{NB}(r, p). The parameter choices also ensure that the first two moments coincide with those of the standard negative binomial: the mean is \mathbb{E}[X] = \lambda \mathbb{E}[Y_1] = r \theta / (1 - \theta) = r (1-p)/p, and the variance is \mathrm{Var}(X) = \lambda \mathbb{E}[Y_1^2] = \lambda \left( \mathrm{Var}(Y_1) + \mathbb{E}[Y_1]^2 \right) = r \theta / (1 - \theta)^2 = r (1-p)/p^2. This moment matching underscores the representation's consistency for applications in fields like insurance, where it models total claim counts as Poisson-distributed clusters with logarithmic-sized jumps.
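
A simulation sketch of the compound representation, using SciPy's logser (the logarithmic series distribution) with arbitrary target parameters; the sample moments should match the negative binomial ones:

```python
import numpy as np
from scipy.stats import logser

rng = np.random.default_rng(2)
r, p = 4.0, 0.3                # target NB(r, p), failures form (arbitrary)
theta = 1 - p                  # logarithmic jump parameter
lam = -r * np.log(1 - theta)   # Poisson rate for the number of jumps

def compound_draw() -> int:
    n = rng.poisson(lam)  # number of logarithmic jumps
    if n == 0:
        return 0
    return int(logser.rvs(theta, size=n, random_state=rng).sum())

draws = np.array([compound_draw() for _ in range(50_000)])
print(draws.mean(), r * (1 - p) / p)    # ~9.33
print(draws.var(), r * (1 - p) / p**2)  # ~31.1
```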

Parameter Estimation

Method of Moments

The method of moments estimation for the negative binomial distribution, which models the number of failures before the r-th success in independent Bernoulli trials with success probability p, relies on matching the first two population moments to their sample counterparts. The theoretical mean is given by \mathbb{E}[X] = \frac{r(1-p)}{p}, and the variance by \mathrm{Var}(X) = \frac{r(1-p)}{p^2}. To solve for the parameters, set the sample mean \bar{X} equal to \frac{r(1-p)}{p} and the sample variance s^2 equal to \frac{r(1-p)}{p^2}, yielding a system of two equations. Dividing the variance equation by the mean equation simplifies to \frac{s^2}{\bar{X}} = \frac{1}{p}, so \hat{p} = \frac{\bar{X}}{s^2}; substituting back gives \hat{r} = \frac{\bar{X}^2}{s^2 - \bar{X}}. These estimators assume the data are independent and identically distributed according to the negative binomial distribution, with the sample size n sufficient to compute reliable moments. For large n, the estimators are consistent, converging in probability to the true parameters as n approaches infinity, and asymptotically normal. However, they exhibit bias, particularly when r is small, leading to overestimation or underestimation in finite samples; simulations show moment estimators perform reasonably but with higher mean squared error compared to alternatives in small samples. A key limitation arises if the sample is underdispersed, where s^2 < \bar{X}, resulting in a negative \hat{r}, which is invalid since r must be positive. In such cases, the negative binomial assumption may not hold, and alternative distributions or estimation approaches should be considered to avoid nonsensical results.
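
The estimators are a few lines of code; the sketch below includes the guard against underdispersed samples noted above (the true parameters used to simulate data are arbitrary):

```python
import numpy as np

def nb_method_of_moments(data: np.ndarray) -> tuple[float, float]:
    """Moment estimators r = mean^2/(var - mean), p = mean/var.
    Fails, by design, on underdispersed samples where var <= mean."""
    xbar, s2 = data.mean(), data.var(ddof=1)
    if s2 <= xbar:
        raise ValueError("sample is not overdispersed; NB moment fit invalid")
    return xbar**2 / (s2 - xbar), xbar / s2

rng = np.random.default_rng(3)
sample = rng.negative_binomial(n=6, p=0.45, size=20_000)  # arbitrary truth
print(nb_method_of_moments(sample))  # approximately (6, 0.45)
```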

Maximum Likelihood Estimation

The maximum likelihood estimators (MLEs) for the parameters r > 0 and 0 < p < 1 of the negative binomial distribution are obtained by maximizing the log-likelihood function based on a random sample k_1, \dots, k_n of independent observations. The log-likelihood is given by l(r, p) = \sum_{i=1}^n \ln \binom{k_i + r - 1}{k_i} + n r \ln p + \left( \sum_{i=1}^n k_i \right) \ln (1 - p), where \ln \binom{k_i + r - 1}{k_i} = \ln \Gamma(k_i + r) - \ln \Gamma(r) - \ln \Gamma(k_i + 1) and \Gamma denotes the gamma function. Setting the partial derivative with respect to p to zero yields the conditional MLE \hat{p}(r) = \frac{n r}{n r + \sum_{i=1}^n k_i} = \frac{r}{r + \bar{k}}, where \bar{k} = n^{-1} \sum_{i=1}^n k_i is the sample mean. Substituting this into the partial derivative with respect to r and setting it to zero gives the equation for the MLE \hat{r}: n \ln \hat{p}(\hat{r}) - n \psi(\hat{r}) + \sum_{i=1}^n \psi(\hat{r} + k_i) = 0, where \psi(\cdot) is the digamma function, defined as the derivative of the log-gamma function \psi(z) = \frac{d}{dz} \ln \Gamma(z). This equation has no closed-form solution and must be solved numerically for \hat{r}, after which \hat{p} is obtained by substitution. Numerical solutions for the joint MLE (\hat{r}, \hat{p}) typically employ iterative optimization methods such as Newton-Raphson, which leverages the second derivatives (involving the trigamma function), or one-dimensional root-finding methods such as bisection applied to the equation for \hat{r}. When r is restricted to integers (as in some applications modeling fixed successes), the likelihood can be evaluated directly over a grid of integer values near an initial moment-based estimate. Alternatively, the expectation-maximization (EM) algorithm provides a robust approach by interpreting the negative binomial as a Poisson-gamma mixture: in the E-step, latent gamma-distributed means are inferred given current parameters; in the M-step, the parameters are updated by solving a gamma MLE problem, iterating until convergence. Under standard regularity conditions, the MLEs (\hat{r}, \hat{p}) are consistent (converging in probability to the true values as n \to \infty) and asymptotically normal, with asymptotic distribution \sqrt{n} \begin{pmatrix} \hat{r} - r \\ \hat{p} - p \end{pmatrix} \overset{d}{\to} \mathcal{N}\left( \mathbf{0}, \mathcal{I}(r, p)^{-1} \right), where \mathcal{I}(r, p) is the Fisher information matrix evaluated at the true parameters; the inverse provides the asymptotic covariance for inference, such as Wald-type confidence intervals. For finite samples, the MLE \hat{r} (or equivalently the dispersion parameter \alpha = 1/r) exhibits positive bias, particularly when overdispersion is strong or n is small; this can lead to underestimated variance in applications like count regression. Bias-corrected estimators, such as the first-order correction \tilde{\alpha} = \hat{\alpha} / (1 + b/n) where b is derived from expected higher-order terms of the log-likelihood, reduce this bias to second order and improve small-sample accuracy without sacrificing asymptotic properties.
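
A sketch of direct numerical maximization of this log-likelihood, using an unconstrained reparameterization and SciPy's generic optimizer in place of the Newton-Raphson or profile-equation approaches described above; all values are illustrative:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

rng = np.random.default_rng(4)
data = rng.negative_binomial(n=5, p=0.3, size=5_000)  # arbitrary truth

def neg_log_lik(params: np.ndarray) -> float:
    log_r, logit_p = params  # unconstrained: r = exp(.), p = sigmoid(.)
    r = np.exp(log_r)
    p = 1.0 / (1.0 + np.exp(-logit_p))
    return -nbinom.logpmf(data, r, p).sum()

# Moment estimates provide a reasonable starting point.
xbar, s2 = data.mean(), data.var()
r0, p0 = xbar**2 / (s2 - xbar), xbar / s2
res = minimize(neg_log_lik, [np.log(r0), np.log(p0 / (1 - p0))],
               method="Nelder-Mead")

print(np.exp(res.x[0]), 1.0 / (1.0 + np.exp(-res.x[1])))  # ~ (5, 0.3)
```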

Applications

Modeling Overdispersion in Count Data

Overdispersion in count data occurs when the variance exceeds the mean, violating the equidispersion assumption of the Poisson distribution. The negative binomial distribution addresses this by incorporating an additional dispersion parameter r, where finite r allows the variance to surpass the mean, modeling extra variation from unobserved heterogeneity or clustering. This arises naturally as a gamma-Poisson mixture, where the Poisson rate follows a gamma distribution, leading to unconditional overdispersion. Negative binomial regression extends this to covariate-dependent counts, formulated as a generalized linear model (GLM) with a logarithmic link function to ensure positive predicted means. The dispersion parameter, often denoted as \alpha = 1/r, scales the variance as \mu + \alpha \mu^2, where \mu is the mean, enabling flexible handling of overdispersion without altering the mean structure. To assess suitability against the Poisson model, researchers apply a likelihood ratio test, which evaluates the null hypothesis of no overdispersion (\alpha = 0) by comparing log-likelihoods. Information criteria such as Akaike's Information Criterion (AIC) and the Bayesian Information Criterion (BIC) further aid model selection, penalizing complexity while favoring better fit. Implementation is straightforward in common statistical software. In R, the glm.nb function from the MASS package fits negative binomial GLMs, automatically estimating the dispersion parameter. In Python, the statsmodels library offers the NegativeBinomial class within its discrete models module for similar estimation. Empirical applications include modeling road accident frequencies, where negative binomial regression captures overdispersion from factors like varying driver behavior or environmental conditions not fully observed in the data. Similarly, in ecology, it analyzes species abundance counts across sites, accounting for extra variation due to habitat heterogeneity.
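
A minimal illustration of negative binomial (NB2) regression with the statsmodels class mentioned above, assuming the package is installed; the simulated coefficients and dispersion are arbitrary:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 2_000
x = rng.normal(size=n)
mu = np.exp(0.5 + 0.8 * x)  # log link; true coefficients are arbitrary
r = 2.0                      # true dispersion alpha = 1/r = 0.5
y = rng.negative_binomial(n=r, p=r / (r + mu))  # counts with mean mu

X = sm.add_constant(x)
# NB2 model: Var = mu + alpha * mu^2; alpha is estimated with the betas.
fit = sm.NegativeBinomial(y, X).fit(disp=0)
print(fit.params)  # const, slope, alpha ~ (0.5, 0.8, 0.5)
```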

Waiting Times in Bernoulli Processes

The negative binomial distribution arises naturally in the context of a Bernoulli process, which consists of a sequence of independent and identically distributed trials, each with success probability p. In this setting, the distribution models the number of trials required to achieve r successes, or equivalently, the number of failures preceding the r-th success. This interpretation captures waiting times for a fixed number of rare events in discrete time, where each trial represents a potential occurrence of the event with fixed probability p. A practical example occurs in discrete-time queueing systems, where customer arrivals follow a Bernoulli process: in each time unit, an arrival happens with probability p, independently of other units. Here, the time until the r-th arrival follows a negative binomial distribution, providing a model for the waiting duration until r customers have joined the queue. This framework is useful for analyzing system performance in slotted-time environments, such as digital communication networks or scheduled service intervals. When the success probability p is small, the negative binomial waiting time approximates the waiting time in a continuous-time Poisson process. Specifically, for r = 1, the geometric distribution (a special case) approaches an exponential distribution, and for general r, it relates to an Erlang (gamma) distribution, bridging discrete and continuous models of rare event occurrences. In reliability engineering, the negative binomial distribution models the number of system failures before the r-th successful repair, treating each operational cycle as a Bernoulli trial where "success" is a repair and "failure" occurs with probability 1-p. This application aids in predicting maintenance schedules and system downtime for components subject to repeated testing or operation.

Physical and Biological Contexts

In high-energy physics, the negative binomial distribution has been employed to model multiplicity distributions of charged particles produced in proton-proton and proton-antiproton collisions. Experimental data from the UA1 collaboration at CERN, collected at center-of-mass energies up to 546 GeV, showed that these distributions fit the negative binomial form well, capturing the observed overdispersion in particle counts beyond Poisson expectations. Subsequent analyses in proton-lead collisions at the Large Hadron Collider confirmed this applicability, with the distribution's parameters varying systematically with energy to describe fluctuations in hadron production. In biological contexts, particularly in clonal dynamics, the negative binomial distribution models counts of mutations and sizes of clonal expansions, where the overdispersion relates to variability in branching processes across generations. For instance, in studies of cancer-initiating cells, observed clonal sizes in genetic models of mammary tumorigenesis were best fitted by a negative binomial distribution for clones below 50 cells, reflecting heterogeneous growth rates. Similarly, integrative methods for pan-cancer analysis use the negative binomial to fit somatic mutation counts across tumor samples, accounting for overdispersion due to biological heterogeneity and improving detection of driver mutations. In ecology, the negative binomial distribution describes species abundance patterns in sampled communities, especially for aggregated distributions where variance exceeds the mean. It serves as a foundational model for overdispersed count data, such as organism counts or species abundances in quadrats, and converges to Fisher's logarithmic series in the limit as the dispersion parameter r approaches zero, which approximates the dominance of rare species in diverse ecosystems. This connection has been validated in global analyses of species abundance distributions, where the negative binomial captures Poisson sampling from an underlying gamma-distributed abundance, explaining patterns observed in empirical data from forests and other environments. Recent applications in 21st-century genomics leverage the negative binomial for modeling read counts in high-throughput sequencing data, addressing overdispersion from technical and biological variability. In single-cell sequencing analyses post-2020, the distribution effectively approximates transcript counts, enabling robust differential expression testing despite noise in low-abundance genes. For variant detection, frameworks using the negative binomial to model read abundance differences have improved accuracy in identifying variants from bulk and single-cell data, as seen in tools developed for tumor-normal sequencing pairs.

History

Origins in Early Probability Theory

The origins of the negative binomial distribution trace back to early probability theory in the 17th century. Special forms of the distribution were discussed by Blaise Pascal in 1679, particularly in the context of the "problem of points," which involved dividing stakes in interrupted games of chance and used binomial coefficients to model sequences of Bernoulli trials. Although Pascal's 1654 correspondence with Pierre de Fermat focused primarily on the binomial distribution for equal probabilities, these problems laid foundational ideas that implicitly connected to waiting times for successes, later formalized in the negative binomial framework. The distribution was first explicitly studied in 1713 by Pierre Rémond de Montmort in his Essay d'analyse sur les jeux de hazard (second edition), where he examined it as the distribution of the number of trials required to obtain a fixed number of successes in games of chance. This work provided the earliest concrete probabilistic formulation. A key precursor, the geometric distribution (a special case with r = 1), emerged in Abraham de Moivre's 1718 treatise The Doctrine of Chances, where he calculated the probability of the number of trials until the first success in sequences of independent Bernoulli trials with success probability p. De Moivre derived this using infinite series expansions, anticipating the general case for multiple successes. This built on Nicolas Bernoulli's 1713 proof of the negative binomial expansion of (1 - x)^{-k}, establishing the mathematical foundation. An early application beyond games appeared in a medical context with John Arbuthnot's 1710 analysis of birth sex ratios in London from 1629 to 1710. Arbuthnot applied binomial probabilities, using the expansion of (1/2 + 1/2)^n, to argue for divine providence in the observed excess of male births. While using the binomial distribution, this highlighted the utility of modeling repeated independent trials with unequal outcomes. The extension to multiple successes was advanced in the early 19th century through works on probability and series expansions. Pierre-Simon Laplace's 1812 Théorie analytique des probabilités incorporated generating functions for calculations involving repeated independent trials, such as in judicial decisions and astronomical observations, contributing to approximations for extended sequences. Similarly, Siméon Denis Poisson's contributions to limit theorems for Bernoulli trials linked binomial expansions to related distributions for large n. The term "negative binomial" originated in 19th-century texts from the distribution's probability generating function, which corresponds to the binomial series expansion with a negative exponent, (1 - p)^{-r}, generalizing the positive binomial theorem. This naming, reflecting the negative upper index in generalized binomial coefficients, was standardized in works on infinite series and probability limits.

Developments in the 20th Century

In the early 20th century, particularly during the 1920s and 1930s, the negative binomial distribution emerged as a key tool in ecological and biological sampling. Ronald A. Fisher advanced its application in modeling overdispersed count data from natural populations, such as species abundances, building on his foundational work in statistical methods for research workers. Concurrently, Felix Eggenberger and George Pólya generalized the urn model in 1923, deriving the negative binomial from a contagion process that captures dependence in sequential draws, influencing subsequent probabilistic interpretations. By mid-century, the distribution's utility in addressing overdispersion was further established. In 1949, F. J. Anscombe applied the negative binomial to analyze count data, demonstrating its superiority over the Poisson distribution for handling variance exceeding the mean in biological assays. Around the same time, further characterizations of the negative binomial were developed, including proofs of its uniqueness under certain conditions and its quadratic variance function within natural exponential families. The 1960s and 1980s saw the solidification of mixture representations for the negative binomial, expanding on Greenwood and Yule's 1920 gamma-Poisson mixture for modeling multiple events like disease recurrences. William Feller's 1968 treatise provided rigorous computational methods, including recursive algorithms and generating function expansions, facilitating practical calculations for probabilities and moments in large-scale applications. In the late 20th century, the negative binomial was integrated into regression frameworks through generalized linear models (GLMs). Peter McCullagh and John A. Nelder's 1989 monograph formalized its use in GLMs for overdispersed count responses, enabling parameter estimation via maximum likelihood and linking it to broader statistical modeling paradigms. This period also marked early software integration, with implementations in procedures like SAS's PROC GENMOD by the early 1990s, allowing routine fitting of negative binomial regressions in statistical analysis.
