
Exponential distribution

The exponential distribution is a continuous probability distribution defined on the non-negative real numbers, with probability density function f(x; \lambda) = \lambda e^{-\lambda x} for x \geq 0 and rate parameter \lambda > 0, modeling the waiting time until the first event in a Poisson process, where events occur continuously and independently at a constant average rate \lambda. It is distinguished by its memoryless property, which states that the probability of the waiting time exceeding x + y, given that it has already exceeded x, equals the unconditional probability of exceeding y, for x, y > 0. This property implies a constant hazard rate of \lambda, making the exponential distribution the only continuous distribution with a hazard rate independent of time. Key statistical properties include a mean of 1/\lambda and a variance of 1/\lambda^2, with the cumulative distribution function given by F(x; \lambda) = 1 - e^{-\lambda x} for x \geq 0. The moment-generating function is M(t) = \lambda / (\lambda - t) for t < \lambda. These characteristics position the exponential distribution as a foundational model in stochastic processes, where it serves as the interarrival time distribution of the Poisson process. The exponential distribution finds extensive applications across disciplines, including queueing theory for modeling customer arrival intervals, reliability engineering for constant-failure-rate components such as electronic systems, and survival analysis for lifetimes or time-to-event data in biological and medical contexts. It also approximates physical processes such as radioactive decay and photon emission, where events follow a Poisson process.

Definitions

Probability Density Function

The probability density function (PDF) of the exponential distribution with rate parameter \lambda > 0 is given by f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0, and f(x; \lambda) = 0 for x < 0. This PDF exhibits exponential decay, beginning at a maximum value of \lambda when x = 0 and asymptotically approaching 0 as x increases to infinity, resulting in a right-skewed curve that is strictly positive over the non-negative real line. The support is confined to x \geq 0, reflecting the distribution's application to non-negative quantities such as durations or waiting times. The parameter \lambda represents the instantaneous rate of occurrence of an event, where higher values of \lambda correspond to a steeper initial decay and more frequent events on average. In the context of a Poisson process, the exponential distribution arises naturally as the distribution of inter-arrival times between successive events, with \lambda denoting the average rate of the process.
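As a quick numerical illustration, the following Python sketch (assuming NumPy and SciPy are available; the rate value is illustrative) evaluates the PDF and confirms that it integrates to 1 over the support:

```python
# Minimal sketch: evaluate the exponential PDF and check normalization.
import numpy as np
from scipy import integrate

lam = 2.0  # illustrative rate parameter

def exp_pdf(x, lam):
    """Exponential PDF: lambda * exp(-lambda * x) for x >= 0, else 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, lam * np.exp(-lam * x), 0.0)

print(exp_pdf(0.0, lam))          # maximum value: lambda at x = 0
total, _ = integrate.quad(exp_pdf, 0, np.inf, args=(lam,))
print(total)                      # ~1.0, confirming the density normalizes
```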

Cumulative Distribution Function

The cumulative distribution function (CDF) of an exponential random variable X with rate parameter \lambda > 0 is given by F(x; \lambda) = P(X \leq x). It is derived by integrating the probability density function (PDF) f(t) = \lambda e^{-\lambda t} for t \geq 0 from 0 to x: F(x; \lambda) = \int_0^x \lambda e^{-\lambda t} \, dt = \left[ -e^{-\lambda t} \right]_0^x = 1 - e^{-\lambda x}, \quad x \geq 0, with F(x; \lambda) = 0 for x < 0. This CDF exhibits key properties: F(0; \lambda) = 0, \lim_{x \to \infty} F(x; \lambda) = 1, and it is strictly increasing and continuous on [0, \infty). The PDF can be recovered as the derivative of the CDF, f(x; \lambda) = \frac{d}{dx} F(x; \lambda). In probability calculations, the CDF directly computes P(X \leq x) = F(x; \lambda). The complementary survival function, S(x; \lambda) = P(X > x) = 1 - F(x; \lambda) = e^{-\lambda x} for x \geq 0, quantifies the probability of exceeding x. Graphically, the CDF traces a smooth concave curve, originating at (0, 0) and asymptotically approaching 1 as x grows, reflecting the accumulation of probability over the positive real line.
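A small sketch (again assuming NumPy/SciPy, with illustrative values) checks that integrating the PDF reproduces the closed-form CDF and survival function:

```python
# Check: CDF by quadrature vs. closed form 1 - exp(-lam*x).
import numpy as np
from scipy import integrate

lam, x = 2.0, 1.3
numeric, _ = integrate.quad(lambda t: lam * np.exp(-lam * t), 0, x)
print(numeric, 1 - np.exp(-lam * x))   # CDF: both ~0.9257
print(1 - numeric, np.exp(-lam * x))   # survival function S(x) = e^{-lam x}
```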

Parameterizations

The exponential distribution is commonly parameterized using a rate parameter \lambda > 0, which represents the number of events per unit time, such as arrivals or failures. In this standard rate parameterization, the mean of the distribution is 1/\lambda. An equivalent scale parameterization uses \beta > 0, defined as the mean lifetime or scale parameter, where \beta = 1/\lambda. The density in this form is given by f(x; \beta) = \frac{1}{\beta} e^{-x/\beta}, \quad x \geq 0. The conversion between parameterizations is straightforward: \lambda = 1/\beta, ensuring equivalence of the moments and cumulative distribution function across both forms. The rate parameterization with \lambda is prevalent in modeling Poisson processes, where it directly corresponds to the intensity of event occurrences. In contrast, the scale parameterization with \beta is more common in reliability engineering, emphasizing durations like time to failure under constant hazard rates. The rate form is typically preferred for high-frequency events, such as queueing arrivals, while the scale form suits analyses of prolonged durations, like component lifetimes.
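The two parameterizations are easy to confuse in software. The sketch below (illustrative values) uses scipy.stats.expon, which takes the scale parameter \beta = 1/\lambda rather than the rate:

```python
# Rate vs. scale parameterizations; scipy.stats.expon uses scale = 1/lambda.
import numpy as np
from scipy import stats

lam = 0.5            # rate: events per unit time
beta = 1.0 / lam     # scale: mean waiting time

rv = stats.expon(scale=beta)
print(rv.mean())                          # 2.0 == beta == 1/lambda
x = 1.5
print(rv.pdf(x), lam * np.exp(-lam * x))  # identical density values
```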

Moments and Basic Properties

Mean, Variance, and Higher Moments

The mean of an exponential random variable X with rate parameter \lambda > 0 is E[X] = \frac{1}{\lambda}. This follows from the probability density function f(x) = \lambda e^{-\lambda x} for x \geq 0, where the mean is computed as the integral \int_0^\infty x \lambda e^{-\lambda x} \, dx. Using integration by parts, let u = x and dv = \lambda e^{-\lambda x} \, dx, so du = dx and v = -e^{-\lambda x}, yielding \left[ -x e^{-\lambda x} \right]_0^\infty + \int_0^\infty e^{-\lambda x} \, dx = 0 + \frac{1}{\lambda} = \frac{1}{\lambda}. In the scale parameterization, where the density is f(x) = \frac{1}{\beta} e^{-x/\beta} for scale parameter \beta > 0, the mean is E[X] = \beta, with \beta = 1/\lambda.

The variance is \text{Var}(X) = E[X^2] - (E[X])^2. First, E[X^2] = \int_0^\infty x^2 \lambda e^{-\lambda x} \, dx, which by repeated integration by parts (or by recognizing it as the second moment of a Gamma distribution) equals \frac{2}{\lambda^2}. Thus, \text{Var}(X) = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}. In scale form, \text{Var}(X) = \beta^2.

The higher-order moments are E[X^k] = \frac{k!}{\lambda^k} for positive integer k. This general formula arises from the integral \int_0^\infty x^k \lambda e^{-\lambda x} \, dx = \lambda \cdot \frac{\Gamma(k+1)}{\lambda^{k+1}} = \frac{k!}{\lambda^k}, since \Gamma(k+1) = k! for integer k, leveraging the representation of the exponential distribution as the Gamma(1, 1/\lambda) special case of the Gamma family. Alternatively, the moment-generating function M(t) = \frac{\lambda}{\lambda - t} for t < \lambda yields the k-th moment as the k-th derivative evaluated at t = 0, confirming the factorial form. In scale parameterization, E[X^k] = k! \beta^k.

The coefficient of variation, defined as \text{CV}(X) = \frac{\sqrt{\text{Var}(X)}}{E[X]}, equals 1 for the exponential distribution, since \sqrt{1/\lambda^2} / (1/\lambda) = 1. This unit value indicates that the standard deviation equals the mean, reflecting the high relative variability inherent in the distribution's heavy right tail. The skewness, measuring asymmetry, is \gamma_1 = \frac{E[(X - \mu)^3]}{\sigma^3} = 2, where \mu = E[X] and \sigma^2 = \text{Var}(X). This positive value of 2 underscores the exponential distribution's right-skewed nature; using the third raw moment E[X^3] = \frac{6}{\lambda^3}, the third central moment is E[(X - \mu)^3] = E[X^3] - 3\mu E[X^2] + 2\mu^3 = \frac{6}{\lambda^3} - 3 \cdot \frac{1}{\lambda} \cdot \frac{2}{\lambda^2} + 2 \left(\frac{1}{\lambda}\right)^3 = \frac{2}{\lambda^3}, so \gamma_1 = \frac{2/\lambda^3}{(1/\lambda^2)^{3/2}} = 2.
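These formulas are easy to verify numerically. A short sketch (assuming SciPy; the rate is illustrative) compares the closed-form mean, variance, skewness, and raw moments with library values:

```python
# Verify E[X] = 1/lam, Var(X) = 1/lam^2, skewness = 2, E[X^k] = k!/lam^k.
import math
from scipy import stats

lam = 1.5
rv = stats.expon(scale=1/lam)

print(rv.mean(), 1/lam)              # mean
print(rv.var(), 1/lam**2)            # variance
print(rv.stats(moments='s'))         # skewness -> 2.0
for k in (1, 2, 3, 4):
    print(rv.moment(k), math.factorial(k) / lam**k)  # raw moments
```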

Median and Quantiles

The median of an exponential random variable with rate parameter \lambda > 0 is the value m such that the cumulative distribution function satisfies F(m) = 0.5, given by m = \frac{\ln 2}{\lambda} \approx \frac{0.693}{\lambda}. The general quantile function, or inverse cumulative distribution function, for the exponential distribution is Q(p) = -\frac{\ln(1-p)}{\lambda} for p \in (0,1), which provides the value x such that F(x) = p. This follows from solving the equation 1 - e^{-\lambda x} = p for x, yielding e^{-\lambda x} = 1 - p, \lambda x = -\ln(1-p), and thus x = -\frac{\ln(1-p)}{\lambda}. The first quartile is Q(0.25) = -\frac{\ln(0.75)}{\lambda} \approx \frac{0.288}{\lambda} and the third quartile is Q(0.75) = -\frac{\ln(0.25)}{\lambda} = \frac{\ln 4}{\lambda} \approx \frac{1.386}{\lambda}. The interquartile range, the difference between the third and first quartiles, is therefore \frac{\ln 3}{\lambda} \approx \frac{1.099}{\lambda}. Due to the positive skewness of the exponential distribution, the median is less than the mean, with \frac{\ln 2}{\lambda} < \frac{1}{\lambda}.
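A minimal sketch (NumPy/SciPy assumed, illustrative rate) implements Q(p) and checks it against the library quantile function:

```python
# Quantile function Q(p) = -ln(1-p)/lam, compared with SciPy's ppf.
import numpy as np
from scipy import stats

lam = 2.0
Q = lambda p: -np.log1p(-p) / lam    # log1p(-p) = ln(1-p), numerically stable

print(Q(0.5), np.log(2) / lam)       # median = ln(2)/lam
for p in (0.25, 0.5, 0.75):
    print(Q(p), stats.expon.ppf(p, scale=1/lam))
print(Q(0.75) - Q(0.25), np.log(3) / lam)  # IQR = ln(3)/lam
```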

Key Characteristics

Memorylessness Property

The memoryless property of the exponential distribution states that the conditional probability of the random variable X exceeding a sum s + t, given that it already exceeds s, equals the unconditional probability of exceeding t, for all s, t > 0:
P(X > s + t \mid X > s) = P(X > t).
This property can be proven using the survival function, which for an exponential random variable with rate parameter \lambda > 0 is P(X > x) = e^{-\lambda x} for x > 0. Substituting into the definition of conditional probability yields
P(X > s + t \mid X > s) = \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t).
Among continuous distributions supported on the positive reals, the exponential distribution is the only one exhibiting this memoryless property. The proof involves showing that the memoryless condition implies the survival function satisfies the functional equation S(s + t) = S(s)S(t), whose only continuous solutions take the form S(x) = e^{-\lambda x}. The memoryless property implies that the distribution exhibits no aging: the expected remaining lifetime is independent of the time already elapsed, making it suitable for modeling phenomena where past duration does not influence future behavior. This independence of remaining lifetime from elapsed time underscores the distribution's lack of "memory" of prior events. The memoryless property forms the foundation for continuous-time Markov chains, where holding times in states follow exponential distributions to ensure the Markov property—that future states depend only on the current state, not the history.
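The property is easy to confirm by simulation. The following sketch (NumPy assumed; s, t, and the rate are illustrative) estimates the conditional probability empirically:

```python
# Monte Carlo check of memorylessness: P(X > s+t | X > s) ~ P(X > t).
import numpy as np

rng = np.random.default_rng(0)
lam, s, t = 1.0, 0.7, 1.2
x = rng.exponential(scale=1/lam, size=1_000_000)

cond = (x > s + t).sum() / (x > s).sum()   # empirical P(X > s+t | X > s)
print(cond, np.exp(-lam * t))              # both ~ e^{-lam*t} = P(X > t)
```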

Maximum Entropy Distribution

The differential entropy H(X) of a continuous random variable X with density f is defined as H(X) = -\int_{0}^{\infty} f(x) \ln f(x) \, dx, where the integral is taken over the support of the distribution, here the non-negative reals. For the exponential distribution with rate parameter \lambda > 0, which has density f(x) = \lambda e^{-\lambda x} for x \geq 0 and mean \mu = 1/\lambda, the entropy evaluates to 1 - \ln \lambda (in nats). This value is derived by substituting the density into the entropy integral and computing the expectation: H(X) = -\mathbb{E}[\ln f(X)] = -(\ln \lambda - \lambda \cdot \mu) = 1 - \ln \lambda, leveraging the known mean \mathbb{E}[X] = \mu. Among all probability distributions supported on [0, \infty) with a fixed mean \mu, the exponential distribution achieves the maximum possible entropy. This maximization can be proven using the method of Lagrange multipliers, where the functional to optimize is the entropy subject to the normalization constraint \int_{0}^{\infty} f(x) \, dx = 1 and the mean constraint \int_{0}^{\infty} x f(x) \, dx = \mu, with f(x) \geq 0. The resulting Euler-Lagrange equation yields the form f(x) = (1/\mu) e^{-x/\mu}, confirming it as the unique maximizer. Variational methods similarly establish that no other distribution with the same support and mean can exceed this bound, with equality holding if and only if the distribution is exponential. This property positions the exponential distribution as a cornerstone of information theory, embodying Jaynes' principle of maximum entropy, which advocates selecting the distribution that is maximally noncommittal given the available constraints, thereby providing the least biased probabilistic inference.
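A short sketch (SciPy assumed) verifies the closed-form entropy and illustrates the maximum-entropy claim by comparing against another distribution on [0, \infty) with the same mean (a Gamma with shape 2, chosen here purely as an example):

```python
# Entropy of Exp(lam) is 1 - ln(lam) nats; a same-mean Gamma has less entropy.
import numpy as np
from scipy import stats

lam = 0.8
mu = 1 / lam
print(stats.expon(scale=mu).entropy(), 1 - np.log(lam))  # equal, in nats

# Gamma(shape=2, scale=mu/2) also has mean mu but strictly lower entropy:
print(stats.gamma(a=2, scale=mu/2).entropy())
```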

Advanced Properties

Distribution of the Minimum of Independent Exponentials

Consider n independent and identically distributed (i.i.d.) exponential random variables X_1, X_2, \dots, X_n, each with rate parameter \lambda > 0, so that each X_i has cumulative distribution function (CDF) F(x) = 1 - e^{-\lambda x} for x \geq 0. Define Y = \min\{X_1, X_2, \dots, X_n\} as the minimum of these variables. The CDF of Y is derived as follows: P(Y \leq y) = 1 - P(Y > y) = 1 - P(X_1 > y, X_2 > y, \dots, X_n > y). By independence, P(Y > y) = [P(X_1 > y)]^n = [e^{-\lambda y}]^n = e^{-n\lambda y} for y \geq 0. Thus, P(Y \leq y) = 1 - e^{-n\lambda y}, which is the CDF of an exponential random variable with rate n\lambda. Therefore, Y \sim \operatorname{Exp}(n\lambda). This result indicates that the minimum of i.i.d. exponentials remains exponentially distributed, but with the rate scaled by the number of variables n. The scaling reflects an increased likelihood of the minimum occurring sooner as more variables are considered. In reliability theory, this distribution arises as the lifetime of a series system, where the system fails upon the failure of the first component, corresponding to the minimum lifetime among n i.i.d. exponential component lifetimes.
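A simulation sketch (NumPy assumed; n, the rate, and the replication count are illustrative) checks that the minimum behaves as Exp(n\lambda):

```python
# The minimum of n i.i.d. Exp(lam) variables should be Exp(n*lam).
import numpy as np

rng = np.random.default_rng(1)
lam, n, reps = 2.0, 5, 200_000
samples = rng.exponential(scale=1/lam, size=(reps, n))
mins = samples.min(axis=1)

print(mins.mean(), 1 / (n * lam))     # ~0.1 for lam=2, n=5
print(mins.var(), 1 / (n * lam)**2)   # variance also matches Exp(n*lam)
```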

Sum of Independent Exponential Random Variables

Consider the sum S = X_1 + \dots + X_n, where X_1, \dots, X_n are independent and identically distributed (i.i.d.) exponential random variables, each with rate parameter \lambda > 0. This sum S follows an Erlang distribution with shape parameter n and rate parameter \lambda, which is the special case of the gamma distribution in which the shape is an integer. The probability density function (PDF) of S is given by f_S(s) = \frac{\lambda^n s^{n-1} e^{-\lambda s}}{(n-1)!}, \quad s > 0, and f_S(s) = 0 otherwise. This result can be derived using the convolution of the densities of the individual exponentials. For two i.i.d. exponentials X_1 and X_2, the density of S_2 = X_1 + X_2 is the convolution f_{S_2}(s) = \int_0^s \lambda e^{-\lambda u} \lambda e^{-\lambda (s-u)} \, du = \lambda^2 s e^{-\lambda s}, \quad s > 0. By induction, repeated convolution yields the general PDF for n variables. Alternatively, the derivation uses moment-generating functions (MGFs). The MGF of each X_i is M_{X_i}(t) = \frac{\lambda}{\lambda - t} for t < \lambda. Since the X_i are independent, the MGF of S is M_S(t) = \left( \frac{\lambda}{\lambda - t} \right)^n, \quad t < \lambda, which matches the MGF of the Erlang distribution with parameters n and \lambda. The expected value and variance of S are \mathbb{E}[S] = \frac{n}{\lambda} and \mathrm{Var}(S) = \frac{n}{\lambda^2}, respectively, which follow from the linearity of expectation and the additivity of variance for independent random variables.
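The Erlang/gamma claim can be tested by simulation. The sketch below (NumPy/SciPy assumed, illustrative parameters) compares the sum's moments with n/\lambda and n/\lambda^2 and runs a Kolmogorov-Smirnov test against the matching gamma distribution:

```python
# Sum of n i.i.d. Exp(lam) variables vs. Gamma(shape=n, scale=1/lam).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
lam, n, reps = 1.5, 4, 100_000
sums = rng.exponential(scale=1/lam, size=(reps, n)).sum(axis=1)

print(sums.mean(), n / lam)        # E[S] = n/lam
print(sums.var(), n / lam**2)      # Var(S) = n/lam^2
# KS test against gamma(shape=n, loc=0, scale=1/lam): large p-value expected.
print(stats.kstest(sums, 'gamma', args=(n, 0, 1/lam)))
```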

Joint Moments of Order Statistics

Let X_1, X_2, \dots, X_n be independent and identically distributed (i.i.d.) random variables following an exponential distribution with rate parameter \lambda > 0, denoted \operatorname{Exp}(\lambda). The corresponding order statistics are defined as X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(n)}. The joint moments of these order statistics can be derived using the representation in terms of spacings. Define the spacings D_k = X_{(k)} - X_{(k-1)} for k = 1, \dots, n, where X_{(0)} = 0. These spacings are independent, with D_k \sim \operatorname{Exp}(\lambda (n - k + 1)). Consequently, X_{(i)} = \sum_{k=1}^i D_k for each i = 1, \dots, n. For 1 \leq i < j \leq n, the second-order joint moment is given by E[X_{(i)} X_{(j)}] = E[X_{(i)}] E[X_{(j)}] + \operatorname{Var}(X_{(i)}), since \operatorname{Cov}(X_{(i)}, X_{(j)}) = \operatorname{Var}(X_{(i)}) due to the independence of the spacings after X_{(i)}. The marginal expectations are E[X_{(i)}] = \frac{1}{\lambda} \sum_{k=1}^i \frac{1}{n - k + 1} = \frac{1}{\lambda} (H_n - H_{n-i}), where H_m = \sum_{\ell=1}^m \frac{1}{\ell} is the m-th harmonic number (with H_0 = 0). The variance is \operatorname{Var}(X_{(i)}) = \frac{1}{\lambda^2} \sum_{k=1}^i \frac{1}{(n - k + 1)^2} = \frac{1}{\lambda^2} \sum_{m = n - i + 1}^n \frac{1}{m^2}. Thus, E[X_{(i)} X_{(j)}] = \frac{1}{\lambda^2} \left[ (H_n - H_{n-i})(H_n - H_{n-j}) + \sum_{m = n - i + 1}^n \frac{1}{m^2} \right]. These expressions follow from the independent spacing representation. These joint moments are particularly useful in non-parametric inference for estimating the rate parameter \lambda from ordered exponential data, such as in spacing-based estimators that leverage the ordered sample structure for improved efficiency. Closed-form expressions for second-order joint moments are straightforward via the above sums, but higher-order joint moments (e.g., E[X_{(i)} X_{(j)} X_{(k)}]) become more intricate, often requiring recursive computations or expansions of multivariate sums over the independent spacings.
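The closed form can be checked against simulation. The sketch below (NumPy assumed; the indices i < j and other parameters are illustrative) computes E[X_{(i)} X_{(j)}] from the harmonic-number sums and compares it with a Monte Carlo estimate:

```python
# Closed-form E[X_(i) X_(j)] via spacings vs. Monte Carlo estimate.
import numpy as np

rng = np.random.default_rng(3)
lam, n, i, j, reps = 1.0, 6, 2, 5, 200_000

H = lambda m: sum(1/k for k in range(1, m + 1))          # harmonic number
mean_i = (H(n) - H(n - i)) / lam
mean_j = (H(n) - H(n - j)) / lam
var_i = sum(1/m**2 for m in range(n - i + 1, n + 1)) / lam**2
closed_form = mean_i * mean_j + var_i                    # E[X_(i)]E[X_(j)] + Var(X_(i))

x = np.sort(rng.exponential(scale=1/lam, size=(reps, n)), axis=1)
mc = (x[:, i - 1] * x[:, j - 1]).mean()                  # 1-based -> 0-based indices
print(closed_form, mc)
```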

Information and Divergence Measures

Kullback-Leibler Divergence

The Kullback-Leibler (KL) divergence is a measure of the difference between two probability distributions P and Q with corresponding probability density functions f_P and f_Q, defined for continuous distributions as D_{\text{KL}}(P \parallel Q) = \int_{-\infty}^{\infty} f_P(x) \ln \left( \frac{f_P(x)}{f_Q(x)} \right) \, dx. This quantity quantifies the expected additional information required to encode samples from P using a coding scheme optimized for Q, representing the information loss incurred when approximating P by Q. It is always non-negative and equals zero if and only if P = Q almost everywhere. For the exponential distribution, consider P = \text{Exp}(\lambda) with density f_P(x) = \lambda e^{-\lambda x} for x \geq 0 and Q = \text{Exp}(\mu) with density f_Q(x) = \mu e^{-\mu x} for x \geq 0, where \lambda > 0 and \mu > 0 are the rate parameters. The KL divergence between these distributions is D_{\text{KL}}(\text{Exp}(\lambda) \parallel \text{Exp}(\mu)) = \ln\left(\frac{\lambda}{\mu}\right) + \frac{\mu}{\lambda} - 1. This closed-form expression arises from substituting the densities into the general definition: D_{\text{KL}}(\text{Exp}(\lambda) \parallel \text{Exp}(\mu)) = \int_0^\infty \lambda e^{-\lambda x} \ln \left( \frac{\lambda e^{-\lambda x}}{\mu e^{-\mu x}} \right) \, dx = \int_0^\infty \lambda e^{-\lambda x} \left[ \ln\left(\frac{\lambda}{\mu}\right) + (\mu - \lambda) x \right] \, dx. The first term integrates to \ln(\lambda / \mu), while the second term integrates to (\mu - \lambda) \cdot (1 / \lambda) = \mu / \lambda - 1, yielding the final result. The divergence D_{\text{KL}}(\text{Exp}(\lambda) \parallel \text{Exp}(\mu)) vanishes precisely when \lambda = \mu, confirming the distributions are identical, and increases as the rates diverge, penalizing mismatches in the tails or means (1/\lambda vs. 1/\mu). In statistical applications, such as model selection within exponential families, this measure assesses the relative fit of candidate models by quantifying the inefficiency of using one rate parameter to approximate another, often as part of criteria like the Akaike information criterion that incorporate KL-based penalties.
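The derivation can be double-checked numerically. The sketch below (NumPy/SciPy assumed, illustrative rates) integrates the defining expression directly and compares it with the closed form:

```python
# KL divergence ln(lam/mu) + mu/lam - 1 vs. direct numerical integration.
import numpy as np
from scipy import integrate

lam, mu = 2.0, 0.5

def integrand(x):
    fp = lam * np.exp(-lam * x)
    fq = mu * np.exp(-mu * x)
    return fp * np.log(fp / fq)

numeric, _ = integrate.quad(integrand, 0, np.inf)
closed = np.log(lam / mu) + mu / lam - 1
print(numeric, closed)   # both ~0.636
```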

Fisher Information

The Fisher information measures the amount of information that an observable random variable carries about an unknown parameter in a statistical model. For a single observation from a parametric family with density f(x; \lambda), it is defined as I(\lambda) = \mathbb{E}\left[ \left( \frac{\partial}{\partial \lambda} \ln f(X; \lambda) \right)^2 \right] = -\mathbb{E}\left[ \frac{\partial^2}{\partial \lambda^2} \ln f(X; \lambda) \right], where the expectations are taken with respect to the distribution parameterized by \lambda. For the exponential distribution with rate parameter \lambda > 0, the density is f(x; \lambda) = \lambda e^{-\lambda x} for x \geq 0. The log-likelihood for a single observation is \ln f(x; \lambda) = \ln \lambda - \lambda x, so the score (the first derivative of the log-likelihood) is \frac{\partial}{\partial \lambda} \ln f(x; \lambda) = \frac{1}{\lambda} - x. The second derivative is \frac{\partial^2}{\partial \lambda^2} \ln f(x; \lambda) = -\frac{1}{\lambda^2}, which is non-random and negative, confirming the regularity conditions. Thus, the Fisher information is I(\lambda) = -\mathbb{E}\left[ -\frac{1}{\lambda^2} \right] = \frac{1}{\lambda^2}. This value decreases as \lambda increases, indicating that a single observation carries less information about the parameter for distributions with higher rates (shorter expected lifetimes). For a sample of n independent and identically distributed exponential random variables, the information adds up, yielding I_n(\lambda) = \frac{n}{\lambda^2}. The Fisher information plays a central role in asymptotic inference by determining the Cramér-Rao lower bound for the variance of any unbiased estimator \hat{\lambda} of \lambda: \mathrm{Var}(\hat{\lambda}) \geq \frac{1}{n I(\lambda)} = \frac{\lambda^2}{n}. This bound is achieved asymptotically by the maximum likelihood estimator, highlighting the efficiency of such estimators for large n. In the scale parameterization, where the exponential distribution has mean \beta = 1/\lambda > 0 and density f(x; \beta) = \frac{1}{\beta} e^{-x/\beta} for x \geq 0, the Fisher information is equivalently I(\beta) = \frac{1}{\beta^2}, reflecting the transformation of information under reparameterization via the Jacobian factor. For n i.i.d. observations, it becomes \frac{n}{\beta^2}, and the Cramér-Rao bound is \mathrm{Var}(\hat{\beta}) \geq \frac{\beta^2}{n}.
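A Monte Carlo sketch (NumPy assumed; sample size and rate are illustrative) shows the MLE's variance approaching the Cramér-Rao bound \lambda^2/n:

```python
# Empirical variance of the MLE lam_hat = 1/mean(X) vs. the CRLB lam^2/n.
import numpy as np

rng = np.random.default_rng(4)
lam, n, reps = 1.5, 200, 20_000
x = rng.exponential(scale=1/lam, size=(reps, n))
mle = 1 / x.mean(axis=1)

print(mle.var(), lam**2 / n)   # close for large n (bound holds asymptotically)
```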

Risk Measures

Conditional Value at Risk

The Conditional Value at Risk (CVaR), also known as expected shortfall, at confidence level \alpha \in (0,1) is defined as the conditional expectation of a loss X given that it exceeds its Value at Risk (VaR) at level \alpha, that is, \text{CVaR}_\alpha(X) = \mathbb{E}[X \mid X > \text{VaR}_\alpha(X)], where \text{VaR}_\alpha(X) denotes the \alpha-quantile of the distribution of X. For an exponential random variable X \sim \text{Exp}(\lambda) with rate parameter \lambda > 0, the VaR at level \alpha is \text{VaR}_\alpha(X) = \frac{-\ln(1 - \alpha)}{\lambda}. The corresponding CVaR is then \text{CVaR}_\alpha(X) = \text{VaR}_\alpha(X) + \frac{1}{\lambda} = \frac{-\ln(1 - \alpha)}{\lambda} + \frac{1}{\lambda}. This closed-form expression arises from the memorylessness property of the exponential distribution, which states that the excess life X - q given X > q follows the same \text{Exp}(\lambda) distribution for any q > 0, yielding \mathbb{E}[X \mid X > q] = q + 1/\lambda. The result can also be derived by integrating the tail expectation using the survival function S(x) = e^{-\lambda x}, confirming the additive mean offset. \text{CVaR}_\alpha quantifies the average severity of losses in the upper (1 - \alpha) tail of the distribution, beyond the VaR threshold, thus capturing the magnitude of extreme events. For instance, at \alpha = 0.95, it emphasizes the expected loss conditional on exceeding the 95th percentile, providing insight into tail risk for applications such as insurance and portfolio management. In contrast to VaR, which merely thresholds potential losses, CVaR incorporates their average depth, offering a more robust assessment of extreme risk. Furthermore, CVaR satisfies the axioms of coherent risk measures—subadditivity, positive homogeneity, monotonicity, and translation invariance—ensuring desirable properties for risk aggregation and optimization.
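The additive-offset formula is easy to validate by simulation. The sketch below (NumPy assumed; \alpha and the rate are illustrative) compares the closed-form CVaR with an empirical tail mean:

```python
# VaR_a = -ln(1-a)/lam and CVaR_a = VaR_a + 1/lam, checked empirically.
import numpy as np

rng = np.random.default_rng(5)
lam, alpha = 0.5, 0.95
var_a = -np.log(1 - alpha) / lam
cvar_a = var_a + 1 / lam

x = rng.exponential(scale=1/lam, size=2_000_000)
print(cvar_a, x[x > var_a].mean())   # closed form vs. empirical tail mean
```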

Buffered Probability of Exceedance

The buffered probability of exceedance (bPOE) extends the concept of probability of exceedance by accounting for a buffer beyond the threshold, offering a refined view of tail risks. For a random variable X and confidence level \alpha \in (0,1), it is defined here as \text{bPOE}_\alpha(\delta) = P(X > \text{VaR}_\alpha + \delta), where \text{VaR}_\alpha denotes the \alpha-quantile and \delta > 0 is the buffer amount. This formulation quantifies the likelihood of surpassing a heightened threshold, incorporating additional stress beyond the standard VaR level. For the exponential distribution with rate parameter \lambda > 0, the \alpha-quantile is given by \text{VaR}_\alpha = -\frac{\ln(1 - \alpha)}{\lambda}. This follows from solving P(X \leq \text{VaR}_\alpha) = \alpha using the cumulative distribution function F(x) = 1 - e^{-\lambda x}, yielding the formula above. The bPOE for this distribution simplifies due to the explicit survival function S(t) = P(X > t) = e^{-\lambda t}. Substituting the VaR expression gives \text{bPOE}_\alpha(\delta) = P(X > \text{VaR}_\alpha + \delta) = e^{-\lambda (\text{VaR}_\alpha + \delta)} = e^{-\lambda \text{VaR}_\alpha} \cdot e^{-\lambda \delta} = (1 - \alpha) e^{-\lambda \delta}, where the step e^{-\lambda \text{VaR}_\alpha} = 1 - \alpha holds by construction of the VaR. This arises directly from the memoryless property of the exponential distribution, which ensures that tail probabilities factor independently of the initial threshold. The bPOE serves to evaluate the probability of losses exceeding a deliberately stressed benchmark, making it valuable in financial applications such as stress testing portfolios under adverse scenarios. By adding the buffer \delta, it captures the potential for more severe deviations than those implied by VaR alone, enhancing robustness assessments in regulatory and operational contexts. Compared to the standard probability of exceedance P(X > \text{VaR}_\alpha) = 1 - \alpha, the bPOE is more conservative for \delta > 0, as the elevated threshold reduces the exceedance probability to (1 - \alpha) e^{-\lambda \delta} < 1 - \alpha, emphasizing rarer but more extreme events.
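Using the definition given above, the factored formula can be checked by simulation (NumPy assumed; \alpha, \delta, and the rate are illustrative):

```python
# Buffered exceedance as defined above: P(X > VaR_a + d) = (1-a)*exp(-lam*d).
import numpy as np

rng = np.random.default_rng(6)
lam, alpha, delta = 1.0, 0.95, 0.5
var_a = -np.log(1 - alpha) / lam

x = rng.exponential(scale=1/lam, size=2_000_000)
empirical = (x > var_a + delta).mean()
print(empirical, (1 - alpha) * np.exp(-lam * delta))
```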

Erlang and Gamma Connections

The Erlang distribution arises as the distribution of the sum of k independent and identically distributed exponential random variables, each with rate parameter \lambda > 0. Specifically, if X_1, X_2, \dots, X_k are i.i.d. \operatorname{Exp}(\lambda), then S = X_1 + \dots + X_k follows an \operatorname{Erlang}(k, \lambda) distribution for integer k \geq 1. The probability density function of the Erlang distribution is given by f_S(x) = \frac{\lambda^k x^{k-1} e^{-\lambda x}}{(k-1)!}, \quad x > 0. The exponential distribution is a special case of the more general gamma distribution, which provides a continuous extension allowing non-integer shape parameters. In the shape-scale parameterization, an \operatorname{Exp}(\lambda) random variable corresponds to a \operatorname{Gamma}(1, 1/\lambda) distribution, where the shape parameter \alpha = 1 and the scale parameter \theta = 1/\lambda. The Erlang distribution further fits within this framework as \operatorname{Gamma}(k, 1/\lambda) when the shape \alpha = k is a positive integer. This connection extends to moment-generating functions, where the gamma distribution's MGF is (1 - \theta t)^{-\alpha} for t < 1/\theta, reducing to the exponential MGF (1 - \theta t)^{-1} when \alpha = 1. As k \to \infty, the Erlang distribution \operatorname{Erlang}(k, \lambda) approaches a normal distribution by the central limit theorem, since it is the sum of k i.i.d. exponential variables with finite mean and variance. The Erlang distribution is named after the Danish mathematician and engineer Agner Krarup Erlang (1878–1929), who developed it in the context of queuing models for telephone traffic analysis. The exponential distribution is also the special case of the Weibull distribution with shape parameter equal to 1, reducing the more general Weibull model—which accommodates monotonically increasing, decreasing, or constant hazard rates—to the constant hazard rate characteristic of the exponential. In contrast, the Pareto distribution provides a heavy-tailed alternative to the exponential: its survival function decays as a power law rather than exponentially, so the exponential has lighter tails than the Pareto, whose slower decay allows for more extreme values. The Laplace distribution, also known as the double exponential, is symmetric around its location parameter and relates to the exponential through the absolute value transformation: if X follows a Laplace distribution with mean 0 and scale b > 0, then |X| follows an exponential distribution with rate 1/b. This connection highlights the exponential's role in modeling the magnitudes of symmetric two-sided deviations. Hyperexponential distributions extend the exponential by forming mixtures of multiple independent exponential distributions with distinct rates, often used in phase-type models for Markov chains to approximate more complex service time behaviors in queueing systems. These mixtures allow for greater flexibility in capturing variability beyond a single exponential phase. A scaled exponential random variable also links to the chi-squared distribution: if X follows an exponential distribution with rate \lambda, then 2\lambda X follows a chi-squared distribution with 2 degrees of freedom. This relationship underscores the exponential's position within the broader gamma family, as the chi-squared distribution with 2 degrees of freedom is equivalent to a gamma distribution with shape 1 and scale 2.
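The chi-squared link is simple to verify. The sketch below (NumPy/SciPy assumed, illustrative rate) applies a Kolmogorov-Smirnov test to the scaled variable:

```python
# If X ~ Exp(lam), then 2*lam*X should match chi-squared with 2 df.
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
lam = 0.7
x = rng.exponential(scale=1/lam, size=100_000)
print(stats.kstest(2 * lam * x, 'chi2', args=(2,)))  # large p-value expected
```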

Statistical Inference

Parameter Estimation

The parameter estimation for the exponential distribution focuses on estimating the rate parameter \lambda > 0 from a random sample X_1, \dots, X_n \stackrel{\text{iid}}{\sim} \text{Exp}(\lambda), where the density is f(x; \lambda) = \lambda e^{-\lambda x} for x \geq 0. Classical frequentist approaches include the method of moments and maximum likelihood, both yielding the same point estimator but differing in theoretical properties. The method of moments estimator equates the first population moment to its sample counterpart. Since E(X_i) = 1/\lambda, the sample mean \bar{X} = n^{-1} \sum_{i=1}^n X_i provides an unbiased estimate of 1/\lambda, leading to \hat{\lambda}_{\text{MM}} = 1/\bar{X}. However, this estimator is biased for \lambda itself, with expected value E(\hat{\lambda}_{\text{MM}}) = n \lambda / (n-1). The maximum likelihood estimator maximizes the likelihood function L(\lambda) = \lambda^n \exp(-\lambda \sum_{i=1}^n X_i), resulting in \hat{\lambda}_{\text{MLE}} = n / \sum_{i=1}^n X_i = 1/\bar{X}, identical to the method of moments estimator. This estimator is consistent and asymptotically efficient, achieving the Cramér-Rao lower bound with asymptotic variance 1/(n I(\lambda)), where I(\lambda) = 1/\lambda^2 is the Fisher information for a single observation. Additionally, \hat{\lambda}_{\text{MLE}} is a function of the sufficient statistic \sum X_i, which captures all information about \lambda in the sample, and it satisfies the invariance property: if \hat{\lambda}_{\text{MLE}} is the MLE of \lambda, then g(\hat{\lambda}_{\text{MLE}}) is the MLE of g(\lambda) for any one-to-one function g. In the presence of right-censored data, common in survival analysis, where some observations X_i are only known to exceed a censoring time C_i, the likelihood is adjusted to L(\lambda) = \prod_{i=1}^d \lambda e^{-\lambda x_i} \prod_{i=d+1}^n e^{-\lambda c_i}, where d is the number of uncensored failures. The resulting MLE is \hat{\lambda}_{\text{MLE}} = d / \sum_{i=1}^n t_i, with t_i = x_i if uncensored and t_i = c_i if censored, the denominator representing total exposure time. For small sample sizes, the bias of \hat{\lambda}_{\text{MLE}} can be notable, approximately \lambda / (n-1). A bias-corrected estimator is \hat{\lambda}_{\text{adj}} = (n-1) / \sum_{i=1}^n X_i, which is unbiased for \lambda and has variance \lambda^2 / (n-2) for n > 2. This adjustment is particularly useful in finite-sample settings to improve accuracy.
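A compact sketch (NumPy assumed; the true rate, sample size, and censoring time are illustrative) computes the MLE, the bias-corrected estimator, and the censored-data variant on simulated data:

```python
# MLE, bias-corrected estimator, and censored-data MLE for Exp(lam).
import numpy as np

rng = np.random.default_rng(7)
lam, n = 2.0, 30
x = rng.exponential(scale=1/lam, size=n)

mle = n / x.sum()               # = 1/xbar, biased upward by ~lam/(n-1)
unbiased = (n - 1) / x.sum()    # bias-corrected estimator
print(mle, unbiased)

# Right censoring at a fixed time c: observe t_i = min(x_i, c).
c = 0.4
t = np.minimum(x, c)
d = (x <= c).sum()              # number of uncensored failures
print(d / t.sum())              # censored-data MLE: failures / total exposure
```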

Confidence Intervals

Confidence intervals for the rate parameter \lambda of the exponential distribution can be constructed using exact methods based on the chi-squared distribution or approximate methods relying on asymptotic normality. These intervals quantify the uncertainty around estimates of \lambda from a sample of n independent and identically distributed exponential random variables X_1, \dots, X_n. The exact confidence interval leverages the pivotal quantity 2\lambda \sum_{i=1}^n X_i \sim \chi^2(2n), where \chi^2(2n) denotes the chi-squared distribution with 2n degrees of freedom. For a 100(1-\alpha)\% two-sided interval, this yields \left[ \frac{\chi^2_{\alpha/2}(2n)}{2 \sum_{i=1}^n X_i}, \frac{\chi^2_{1-\alpha/2}(2n)}{2 \sum_{i=1}^n X_i} \right], where \chi^2_{p}(2n) is the p-quantile of the chi-squared distribution with 2n degrees of freedom. This interval is derived by inverting the probability statement \Pr\left( \chi^2_{\alpha/2}(2n) < 2\lambda \sum_{i=1}^n X_i < \chi^2_{1-\alpha/2}(2n) \right) = 1 - \alpha. For the scale parameter \beta = 1/\lambda, which represents the mean lifetime, the corresponding exact interval is obtained by taking reciprocals of the bounds for \lambda, resulting in \left[ \frac{2 \sum_{i=1}^n X_i}{\chi^2_{1-\alpha/2}(2n)}, \frac{2 \sum_{i=1}^n X_i}{\chi^2_{\alpha/2}(2n)} \right]. This transformation preserves the coverage probability since the mapping is monotonic. An asymptotic approximation is available for large n, based on the maximum likelihood estimator \hat{\lambda} = n / \sum_{i=1}^n X_i and the Fisher information I_n(\lambda) = n / \lambda^2. The asymptotic distribution is \sqrt{n} (\hat{\lambda} - \lambda) \xrightarrow{d} \mathcal{N}(0, \lambda^2), leading to a 100(1-\alpha)\% Wald interval \hat{\lambda} \pm z_{1-\alpha/2} \frac{\hat{\lambda}}{\sqrt{n}}, where z_{1-\alpha/2} is the (1-\alpha/2)-quantile of the standard normal distribution. This interval uses the observed Fisher information evaluated at \hat{\lambda} to estimate the variance. The exact chi-squared-based interval is preferred for small sample sizes n, as it achieves the nominal coverage probability exactly, whereas the asymptotic normal interval can undercover due to skewness in the distribution of \hat{\lambda} when n is small. For large n, the asymptotic method performs well and is computationally simpler; simulation studies indicate that normal approximations tend to understate variability at modest sample sizes. In reliability engineering, one-sided intervals are often used to provide conservative lower bounds on the mean lifetime \beta = 1/\lambda. For a 100(1-\alpha)\% lower confidence bound on \beta, the exact form is \frac{2 \sum_{i=1}^n X_i}{\chi^2_{1-\alpha}(2n)}, ensuring that the true mean exceeds this value with probability 1-\alpha. This is particularly valuable for demonstrating minimum reliability requirements in lifetime testing.
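The sketch below (NumPy/SciPy assumed; sample size and level are illustrative) computes the exact chi-squared interval for \lambda alongside the Wald interval for comparison:

```python
# Exact chi-squared CI for lam via the pivot 2*lam*sum(X) ~ chi2(2n).
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
lam, n, alpha = 1.0, 25, 0.05
x = rng.exponential(scale=1/lam, size=n)
s = x.sum()

lo = stats.chi2.ppf(alpha/2, df=2*n) / (2*s)       # lower bound for lam
hi = stats.chi2.ppf(1 - alpha/2, df=2*n) / (2*s)   # upper bound for lam
print(lo, hi)

# Asymptotic Wald interval for comparison:
mle = n / s
z = stats.norm.ppf(1 - alpha/2)
print(mle - z*mle/np.sqrt(n), mle + z*mle/np.sqrt(n))
```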

Bayesian Inference

In Bayesian inference for the exponential distribution with rate parameter \lambda > 0, the goal is to update prior beliefs about \lambda using observed data to obtain a posterior distribution that incorporates both sources of information. Given n independent and identically distributed observations X_1, \dots, X_n from Exponential(\lambda), the likelihood function is L(\lambda \mid \mathbf{x}) = \lambda^n \exp\left(-\lambda \sum_{i=1}^n x_i\right). A conjugate prior for \lambda is the gamma distribution, Gamma(\alpha, \beta), with density \pi(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)} \lambda^{\alpha-1} e^{-\beta \lambda} for \alpha > 0, \beta > 0. This choice ensures the posterior distribution remains in the gamma family: \pi(\lambda \mid \mathbf{x}) = Gamma\left(\alpha + n, \beta + \sum_{i=1}^n x_i\right). The posterior mean, which serves as the Bayes estimator under squared error loss, is \frac{\alpha + n}{\beta + \sum_{i=1}^n x_i}. Credible intervals for \lambda can be constructed from the quantiles of this posterior gamma distribution, providing probabilistic bounds that reflect uncertainty given the prior and data. For non-informative analyses, the Jeffreys prior \pi(\lambda) \propto 1/\lambda is often used, derived as the square root of the Fisher information I(\lambda) = 1/\lambda^2. This improper prior corresponds to the limiting case of Gamma(\epsilon, \epsilon) as \epsilon \to 0^+, yielding a proper posterior Gamma\left(n, \sum_{i=1}^n x_i\right) after observing data. In the one-parameter exponential model, the reference prior—which aims for objectivity by maximizing expected posterior information and often matches frequentist coverage properties—coincides with the Jeffreys prior. Under decision-theoretic frameworks, such as squared error loss L(\hat{\lambda}, \lambda) = (\hat{\lambda} - \lambda)^2, the estimator minimizing expected posterior loss is the posterior mean, as noted above. Calibrating priors for objectivity, such as through reference priors, ensures posterior inferences approximate frequentist procedures in terms of coverage while fully incorporating Bayesian updating.
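Conjugate updating reduces to two additions. The sketch below (NumPy/SciPy assumed; the hyperparameters and true rate are illustrative) computes the posterior mean and a 95% credible interval:

```python
# Conjugate gamma update: prior Gamma(a0, b0) -> posterior Gamma(a0+n, b0+sum(x)).
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
lam_true, n = 1.5, 40
x = rng.exponential(scale=1/lam_true, size=n)

a0, b0 = 2.0, 1.0                        # illustrative prior shape and rate
a_post, b_post = a0 + n, b0 + x.sum()

print(a_post / b_post)                   # posterior mean (Bayes estimator)
# 95% equal-tailed credible interval (SciPy's gamma uses scale = 1/rate):
print(stats.gamma.ppf([0.025, 0.975], a=a_post, scale=1/b_post))
```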

Applications

Event Inter-Arrival Times

In a homogeneous process with rate parameter λ, the times between successive events, known as inter-arrival times, follow an exponential distribution with the same λ. This connection arises because the process models the occurrence of independent, rare events at a constant average , and the exponential distribution captures the probabilistic waiting time until the next event. The exponential distribution's constant hazard rate of λ implies that the probability of an event occurring in the next instant does not depend on how long one has already waited, embodying the memoryless property in a single sentence: this lack of memory makes it suitable for scenarios where past waiting time provides no about future waits. Representative examples include the inter-arrival times of radioactive particle decays, where emissions occur randomly without influence from prior decays, and customer arrivals at a during off-peak hours, assuming steady, inflows. Similarly, it models failure times in systems with constant risk, such as certain electronic components under stable conditions. Empirically, the exponential distribution fits well for processes involving rare, independent events, such as sporadic equipment malfunctions in low-stress environments or arrivals in experiments. However, it fails for clustered events, where arrivals bunch together due to dependencies, as seen in network traffic bursts that deviate from assumptions. It also inadequately models aging processes with increasing hazard rates, such as mechanical wear, where alternatives like the better capture time-dependent risks.
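The connection runs in both directions; one direction can be demonstrated by summing exponential gaps into event times and checking that counts per unit interval behave like Poisson(λ) counts (NumPy assumed, illustrative parameters):

```python
# Cumulative sums of Exp(lam) gaps form a Poisson process: counts per unit
# interval should have mean ~ variance ~ lam.
import numpy as np

rng = np.random.default_rng(10)
lam, n_events = 3.0, 300_000
arrivals = np.cumsum(rng.exponential(scale=1/lam, size=n_events))

T = int(arrivals[-1])                      # whole time units fully covered
counts, _ = np.histogram(arrivals, bins=np.arange(T + 1))
print(counts.mean(), counts.var(), lam)    # both ~ lam, as for Poisson counts
```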

Reliability and Survival Analysis

In reliability engineering and survival analysis, the exponential distribution models the time until failure or event occurrence under the assumption of a constant failure rate, making it particularly suitable for systems where the risk of failure does not depend on age or usage duration. The hazard function, which represents the instantaneous failure rate at time t given survival up to that point, is given by h(t) = \lambda, where \lambda > 0 is the constant rate parameter; this constancy implies the "memoryless" property, where the probability of failure in the next interval is independent of past survival time. The reliability function, or survival function R(t), denotes the probability that the system survives beyond time t and is expressed as R(t) = e^{-\lambda t}. A key metric derived from the exponential distribution is the mean time to failure (MTTF), which quantifies the expected lifetime of a non-repairable component and equals \frac{1}{\lambda}. This measure is widely used to assess dependability, as higher MTTF values indicate greater reliability under constant-hazard conditions. The exponential distribution finds applications in modeling electronic components, such as resistors or integrated circuits, that exhibit constant failure rates during their useful-life phase, allowing engineers to predict maintenance needs and uptime. In actuarial science, it is applied to life tables for risks with constant mortality rates, such as certain insurance products where survival probabilities decline exponentially. Survival data often involve censoring, where some observations are incomplete because the event (e.g., failure or death) has not occurred by the study's end; the exponential model's likelihood accommodates this by contributing only the survival term e^{-\lambda t_i} for censored cases at time t_i, while using the full density \lambda e^{-\lambda t_i} for observed events. This construction ensures valid estimation despite incomplete data. Within the bathtub curve framework, which describes failure rates over a product's lifecycle, the exponential distribution corresponds to the flat middle phase of constant hazard during normal operation, but it is inappropriate for the wear-out phase where rates increase due to degradation.
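The sketch below (NumPy assumed; the failure rate and observation times are illustrative) evaluates R(t) and the MTTF, and assembles the censored log-likelihood exactly as described above:

```python
# Reliability quantities under constant hazard, plus a censored log-likelihood.
import numpy as np

lam = 0.01            # illustrative: failures per hour
t = np.array([50.0, 120.0, 300.0])

R = np.exp(-lam * t)  # survival probabilities at each time
print(R, 1/lam)       # reliability values and MTTF = 100 hours

def log_lik(lam, times, censored):
    """Exponential log-likelihood: density term for failures, survival term
    for right-censored observations."""
    ll = np.where(censored, -lam * times, np.log(lam) - lam * times)
    return ll.sum()

print(log_lik(lam, t, np.array([False, True, False])))
```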

Queuing and Prediction Models

The exponential distribution plays a central role in the M/M/1 queuing model, where customer arrivals follow a Poisson process with rate \lambda, implying exponentially distributed inter-arrival times with mean 1/\lambda, and service times are exponentially distributed with rate \mu and mean 1/\mu. This model assumes a single server and infinite queue capacity, allowing the system state—defined by the number of customers present—to be modeled as a continuous-time birth-death process with birth rate \lambda and death rate \mu for all states. The steady-state probability that n customers are in the system is given by P_n = (1 - \rho) \rho^n for n = 0, 1, 2, \dots, where \rho = \lambda / \mu represents the traffic intensity. For system stability, the traffic intensity must satisfy \rho < 1; otherwise, the queue length grows without bound. Under this condition, the mean number of customers in the system is L = \rho / (1 - \rho), providing a key performance metric for assessing congestion. This formulation enables exact analysis of waiting times and queue dynamics, with the mean time in the system being W = 1/(\mu - \lambda). The memoryless property of the exponential distribution is particularly valuable in prediction models within queuing systems, as the remaining service time for a customer is always exponentially distributed with rate \mu, independent of the elapsed service time. This allows for straightforward forecasting of residual times, such as estimating the time until a server completes its current task, without needing historical data on prior service duration. In practice, this property simplifies real-time predictions in operational settings, enhancing decision-making for resource allocation. Applications of the M/M/1 model and its exponential underpinnings are widespread, including call centers where arrival patterns approximate Poisson processes and service times are roughly exponential, aiding in staffing optimization. In network traffic management, packet arrivals are often modeled as Poisson with exponential service, helping predict delays in routers and switches. Inventory models incorporating queuing treat replenishment lead times or demand arrivals as exponential, balancing stock levels against waiting costs in production systems. Extensions beyond pure exponential assumptions include the M/G/1 queue, which relaxes the exponential service time to a general distribution while retaining Poisson arrivals, analyzed via the Pollaczek-Khinchine formula for mean waiting times. For greater realism, phase-type distributions—absorbing Markov chains that generalize the exponential—enable modeling of multi-phase services in queues like M/PH/1 systems, preserving tractability through matrix-analytic methods.
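The steady-state formulas are simple enough to compute directly; the sketch below (plain Python, illustrative arrival and service rates) evaluates P_n, L, and W, which also satisfy Little's law L = \lambda W:

```python
# M/M/1 steady-state metrics: P_n = (1-rho)*rho^n, L = rho/(1-rho), W = 1/(mu-lam).
lam, mu = 4.0, 5.0
rho = lam / mu
assert rho < 1, "queue is unstable when rho >= 1"

P = [(1 - rho) * rho**n for n in range(5)]  # P(N = n) for n = 0..4
L = rho / (1 - rho)                         # mean number in system
W = 1 / (mu - lam)                          # mean time in system
print(P, L, W)                              # L = 4 customers, W = 1 time unit
print(abs(L - lam * W) < 1e-12)             # Little's law: L = lam * W
```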

Random Variate Generation

Inverse Transform Method

The inverse transform method, also known as inverse transform sampling, is a fundamental technique for generating random variates from the exponential distribution using uniform random numbers. The procedure relies on the inverse of the cumulative distribution function (CDF) to map uniform samples to the desired distribution. For the exponential distribution with rate parameter \lambda > 0, the CDF is F(x) = 1 - e^{-\lambda x} for x \geq 0. The quantile function, or inverse CDF, is then derived as F^{-1}(y) = -\frac{1}{\lambda} \ln(1 - y) for y \in (0,1). To implement the algorithm, first generate a uniform random variate U \sim \text{Uniform}(0,1). Then, compute the exponential variate X = F^{-1}(U) = -\frac{1}{\lambda} \ln(1 - U). Because 1 - U is also \text{Uniform}(0,1), the formula is often simplified to X = -\frac{1}{\lambda} \ln(U) for computational convenience. This process yields an exact sample from the exponential distribution \text{Exp}(\lambda). The method requires only a single uniform random number and a logarithmic computation per variate, making it straightforward for scalar generation. The rationale for this approach stems from the probability integral transform, which states that if X has continuous CDF F, then F(X) \sim \text{Uniform}(0,1). Inverting this transform ensures that the generated X satisfies P(X \leq x) = F(x), reproducing the target distribution exactly. The mapping therefore introduces no approximation error, provided the underlying uniform generator is of high quality, so that values do not cluster near zero, where the logarithm diverges. In practice, the parameter \lambda scales the output directly, allowing easy adjustment for different rates; for the standard exponential (\lambda = 1), X = -\ln(U). High-quality uniform generators, such as those based on linear congruential or Mersenne Twister algorithms, are recommended, particularly since \ln(U) becomes large in magnitude as U approaches 0. The method's efficiency is notable for its simplicity and low overhead, though it may be less suitable for high-throughput vectorized generation compared to specialized algorithms. Historically, the inverse transform method emerged as a core tool in the early development of Monte Carlo simulation, enabling reliable random variate generation for probabilistic modeling in physics and engineering.
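The algorithm fits in a few lines. The sketch below (NumPy assumed; the rate and sample size are illustrative) implements the transform and checks the sample mean against 1/\lambda:

```python
# Inverse transform sampling for Exp(lam).
import numpy as np

def exp_variates(lam, size, rng=None):
    """Generate Exp(lam) variates via X = -ln(1 - U)/lam, U ~ Uniform[0, 1)."""
    rng = rng or np.random.default_rng()
    u = rng.random(size)          # uniform on [0, 1)
    # log1p(-u) = ln(1-u), safe since 1-u is in (0, 1]; equivalently one may
    # use -np.log(u)/lam, because 1-U is also uniform.
    return -np.log1p(-u) / lam

x = exp_variates(2.0, 100_000, np.random.default_rng(11))
print(x.mean(), 0.5)   # sample mean ~ 1/lam
```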

Acceptance-Rejection Methods

The acceptance-rejection method offers a flexible approach to generating random variates from the exponential distribution with rate parameter λ > 0, whose density is given by f(x) = \lambda e^{-\lambda x}, \quad x \geq 0. The core setup involves selecting a proposal density g(x) that is easy to sample from and whose support covers that of f, along with a constant M ≥ \sup_x [f(x)/g(x)]. Independent samples Y are drawn from g until a uniform random variable U ∈ [0,1] satisfies U ≤ f(Y)/(M g(Y)), at which point Y is accepted as a variate from f; otherwise, the process repeats. This ensures the accepted samples follow the target distribution, with the expected number of proposals required being exactly M and the acceptance probability 1/M. Another approach decomposes the exponential into a mixture: with probability p = 1 - e^{-\lambda}, generate from the conditional distribution on [0,1), which is f(x | X < 1) = \lambda e^{-\lambda x} / (1 - e^{-\lambda}) for 0 ≤ x < 1; this can be sampled exactly using the inverse transform for the conditional CDF. With probability 1 - p, generate from the tail conditional on X > 1, which by memorylessness is 1 + Z where Z ~ Exp(λ). This method avoids rejection entirely and is efficient for moderate λ. The Ziggurat algorithm represents a high-speed variant of acceptance-rejection optimized for the exponential distribution, covering f(x) with a stack of 128 (or more) rectangular layers of decreasing widths fitted to the density curve. Proposals are drawn uniformly from a randomly selected layer (chosen via a table weighted by layer areas), and acceptance occurs if the candidate lies below f(x), with the rare tail case (beyond the bottom layer) handled by a separate exponential sampler. This achieves acceptance probabilities exceeding 0.99 in practice, enabling generation rates of about 15 million variates per second on 400 MHz processors, far surpassing basic implementations. Efficiency in these methods hinges on minimizing M, which is optimized by choosing g close to f; for the exponential distribution, suitable proposals yield M values around 1 to 2 for well-chosen parameters, balancing computational cost. The approach is particularly advantageous in scenarios where the inverse transform's logarithm is expensive to compute, such as early computing environments or embedded systems, and supports easy parallelization since iterations are independent across threads. Variants extend the method to truncated exponentials, where support is restricted to [0, b]: a uniform proposal g(x) = 1/b on [0, b] gives M = \lambda b / (1 - e^{-\lambda b}), and the acceptance test reduces to U ≤ e^{-\lambda y}; this is efficient for moderate \lambda b. Combinations with other techniques, such as table lookups for the body and acceptance-rejection for the tails, further enhance speed in software libraries.
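The truncated-exponential variant makes a compact worked example. The sketch below (NumPy assumed; λ and b are illustrative) uses a Uniform(0, b) proposal, for which the acceptance test reduces to U ≤ e^{-λy} as derived above:

```python
# Acceptance-rejection for Exp(lam) truncated to [0, b], uniform proposal.
# Here M = lam*b/(1 - exp(-lam*b)), and f(y)/(M*g(y)) simplifies to e^{-lam*y}.
import numpy as np

rng = np.random.default_rng(12)
lam, b = 1.0, 2.0

def truncated_exp(n):
    out = []
    while len(out) < n:
        y = rng.uniform(0, b)                   # proposal from g(x) = 1/b
        if rng.random() <= np.exp(-lam * y):    # accept with prob f(y)/(M*g(y))
            out.append(y)
    return np.array(out)

x = truncated_exp(100_000)
# Mean of Exp(lam) truncated to [0, b]: 1/lam - b*e^{-lam*b}/(1 - e^{-lam*b}).
print(x.mean(), 1/lam - b*np.exp(-lam*b)/(1 - np.exp(-lam*b)))
```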

References

  1. [1]
    15.1 - Exponential Distributions | STAT 414 - STAT ONLINE
    The continuous random variable follows an exponential distribution if its probability density function is: for and .
  2. [2]
    8.1.6.1. Exponential - Information Technology Laboratory
    The exponential distribution is the only distribution to have a constant failure rate. Also, another name for the exponential mean is the Mean Time To Fail or ...Missing: definition applications
  3. [3]
    15.2 - Exponential Properties | STAT 414 - STAT ONLINE
    Four key properties of an exponential random variable are presented and proven, including its probability density function, moment generating function, mean, ...
  4. [4]
    [PDF] Exponential distribution
    The exponential distribution is used in survival analysis to model the lifetime of an organism or the survival time after treatment.
  5. [5]
  6. [6]
    Exponential Distribution | Definition | Memoryless Random Variable
    The exponential distribution is one of the widely used continuous distributions. It is often used to model the time elapsed between events.Missing: authoritative source
  7. [7]
    [PDF] 5.2 Exponential Distribution
    The exponential distribution has a single scale parameter λ, as defined below. Definition 5.2 A continuous random variable X with probability density function.
  8. [8]
    [PDF] ECE 302: Lecture 4.3 Cumulative Distribution Function
    Retrieving PDF from CDF. Theorem. The probability density function (PDF) is the derivative of the cumulative distribution function (CDF):. fX (x) = dFX (x) dx.
  9. [9]
    Chapter 5 Continuous Random Variable | Probability I - Bookdown
    and a CDF that has the shape of an S. A typical S-shaped Cumulative Distribution Function. Figure 5.3: A typical S-shaped Cumulative Distribution Function. 5.3 ...
  10. [10]
    1.3.6.6.7. Exponential Distribution - Information Technology Laboratory
    where μ is the location parameter and β is the scale parameter (the scale parameter is often referred to as λ which equals 1/β). The case where μ = 0 and β = 1 ...
  11. [11]
    14.2: The Exponential Distribution - Statistics LibreTexts
    Apr 23, 2022 · ... exponential distribution with rate parameter r . The reciprocal 1 r is known as the scale parameter (as will be justified below). Note that ...
  12. [12]
    Exponential Distribution: Uses, Parameters & Examples
    Lambda is also the mean rate of occurrence during one unit of time in the Poisson distribution. To convert between the scale (β) and decay rate (λ) forms of the ...
  13. [13]
    [PDF] Chapter 2 - POISSON PROCESSES - MIT OpenCourseWare
    The parameter λ is called the rate of the process. We shall see later that ... For an exponential rv X of rate λ > 0, Pr{X>x} = e λx for x ≥ 0. This ...Missing: parameterization | Show results with:parameterization
  14. [14]
    The Exponential Distribution
    The exponential distribution is used to model the behavior of units that have a constant failure rate (or units that do not degrade with time or wear out).
  15. [15]
    Exponential distribution - Minitab - Support
    This distribution has a wide range of applications, including reliability ... For the 1-parameter exponential distribution, the scale parameter equals the mean.
  16. [16]
    [PDF] Stat 110 Strategic Practice 6, Fall 2011 1 Exponential Distribution ...
    Find the mean and variance of Y using the MGF of X, without doing any integrals. Then for µ = 0, σ = 1, find the nth moment E(Y n) (in terms of n).
  17. [17]
    [PDF] Exponential Distribution: Theory and Properties - Quest Journals
    Aug 22, 2022 · Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean and is given as;.
  18. [18]
    [PDF] 4.2 Exponential Distribution
    Property 4.1 (memoryless property) If T ∼ exponential(λ), then. P[T ≥ t] = P[T ≥ t +s|T ≥ s] t ≥ 0; s ≥ 0. As shown in Figure 4.4 for λ = 1 and s = 0.5, this ...Missing: curve | Show results with:curve
  19. [19]
  20. [20]
    Median of the exponential distribution | The Book of Statistical Proofs
    Feb 11, 2020 · Proof: Median of the exponential distribution​​ median(X)=ln2λ. (2) (2) m e d i a n ( X ) = ln ⁡ Proof: The median is the value at which the ...
  21. [21]
    Proof: Quantile function of the exponential distribution
    Feb 12, 2020 · QX(p)=F−1X(x). (5) This can be derived by rearranging equation (3) : p=1−exp[−λx]exp[−λx]=1−p−λx=ln(1−p)x=−ln(1−p)λ.
  22. [22]
    [PDF] Theorem The exponential distribution has the memoryless ...
    P(X>s)P(X>t) = e-s/αe-t/α = e-(s+t)/α = P(X>s + t). So the exponential distribution has the memoryless property. APPL verification: The APPL statements given ...
  23. [23]
    [PDF] The Exponential Distribution
    only continuous distribution with the memory less property. Page 7. Proof: The memory less property implies. Since. 9 is continuous, this implies. Page 8. Note ...
  24. [24]
    Exponential distribution | Properties, proofs, exercises - StatLect
    Memoryless property. One of the most important properties of the exponential distribution is the memoryless property: [eq54] for any $xgeq 0$ . Proof. This is ...How the distribution is used · Definition · The rate parameter and its... · More details<|control11|><|separator|>
  25. [25]
    What is the Memoryless Property? (Definition & Example) - Statology
    We can prove this by using the CDF of the exponential distribution: CDF: 1 – e-λx. What is this? Report Ad. where λ is calculated as 1 / average time between ...
  26. [26]
    [PDF] Conditional Probabilities and the Memoryless Property - cs.wisc.edu
    The exponential distribution is memoryless because the past has no bearing on its future behavior. Every instant is like the beginning of a new random period, ...
  27. [27]
    1. Memoryless Distributions - Continuous Time Markov Chains
    The exponential distribution is the only memoryless distribution supported ... Proof. To see that (1.3) holds when \(X\) is exponential with rate ...<|control11|><|separator|>
  28. [28]
    [PDF] Probability distributions and maximum entropy - Keith Conrad
    Exponential and other distribution with same mean. While we will be concerned with the principle of maximum entropy insofar as it explains a natural role for ...