Hyperexponential distribution
The hyperexponential distribution, also known as the H_k distribution, is a continuous probability distribution on the non-negative real line that models a random variable as a mixture of k independent exponential distributions, where each component is selected with probability p_i (summing to 1) and has rate parameter μ_i.[1] Its probability density function is given by
f(t) = \sum_{i=1}^k p_i \mu_i e^{-\mu_i t}, \quad t > 0,
and the cumulative distribution function by
F(t) = 1 - \sum_{i=1}^k p_i e^{-\mu_i t}, \quad t \geq 0.[2]
This structure allows it to approximate distributions with high variability: its squared coefficient of variation satisfies c^2 \geq 1, with equality only when all components share the same rate, in which case it reduces to the exponential distribution with c^2 = 1.[1] The mean is E[X] = \sum_{i=1}^k p_i / \mu_i and the second moment is E[X^2] = 2 \sum_{i=1}^k p_i / \mu_i^2, so the variance is \mathrm{Var}(X) = 2 \sum_{i=1}^k p_i / \mu_i^2 - \left( \sum_{i=1}^k p_i / \mu_i \right)^2. The distribution can be fitted to empirical data by matching only the first two moments, particularly in the two-phase case (H_2): because H_2 has three free parameters, a third condition such as balanced means, p_1 / \mu_1 = p_2 / \mu_2, is imposed to determine them.[2] The distribution is a special case of phase-type distributions, specifically an acyclic phase-type distribution with parallel phases, and it has a decreasing failure rate, that is, a monotonically decreasing hazard function.[1]
In applications, the hyperexponential distribution is widely used in queueing theory to model service times or interarrival times with high variability and long (though still exponentially decaying) tails, such as in M/H_k/1 queues, where explicit solutions for performance measures like waiting times are available.[3] It also appears in reliability engineering, software performance modeling, and call center analysis for customer patience times, providing robust fits to empirical traces when exponential assumptions fail due to overdispersion.[4] For instance, in storage systems and network simulations, H_2 fits have been reported to outperform other long-tail models such as the log-normal in capturing response time distributions.[5]
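The density, distribution function, moments, and two-moment H_2 fit described above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: all function names are hypothetical, the variance is computed from the second moment E[X^2] = 2 \sum_i p_i / \mu_i^2, and the fit assumes the balanced-means normalization p_1 / \mu_1 = p_2 / \mu_2, which is only one of several ways to fix the third free parameter of H_2.

```python
import math
import random


def hyperexp_pdf(t, p, mu):
    """Density f(t) = sum_i p_i * mu_i * exp(-mu_i * t); zero for t < 0."""
    if t < 0:
        return 0.0
    return sum(pi * mi * math.exp(-mi * t) for pi, mi in zip(p, mu))


def hyperexp_cdf(t, p, mu):
    """Distribution function F(t) = 1 - sum_i p_i * exp(-mu_i * t)."""
    if t < 0:
        return 0.0
    return 1.0 - sum(pi * math.exp(-mi * t) for pi, mi in zip(p, mu))


def hyperexp_mean(p, mu):
    """Mean E[X] = sum_i p_i / mu_i."""
    return sum(pi / mi for pi, mi in zip(p, mu))


def hyperexp_variance(p, mu):
    """Variance E[X^2] - E[X]^2, with E[X^2] = 2 * sum_i p_i / mu_i^2."""
    second_moment = 2.0 * sum(pi / mi ** 2 for pi, mi in zip(p, mu))
    return second_moment - hyperexp_mean(p, mu) ** 2


def hyperexp_sample(p, mu, rng):
    """Draw one variate: pick phase i with probability p_i, then Exp(mu_i)."""
    i = rng.choices(range(len(p)), weights=p)[0]
    return rng.expovariate(mu[i])


def fit_h2_balanced_means(mean, scv):
    """Fit H_2 to a mean and squared coefficient of variation (scv >= 1).

    Assumes balanced means, p1/mu1 == p2/mu2 == mean/2 (an illustrative
    choice, not the only possible normalization).
    """
    if mean <= 0 or scv < 1:
        raise ValueError("need mean > 0 and scv >= 1")
    p1 = 0.5 * (1.0 + math.sqrt((scv - 1.0) / (scv + 1.0)))
    p2 = 1.0 - p1
    return [p1, p2], [2.0 * p1 / mean, 2.0 * p2 / mean]


if __name__ == "__main__":
    # Fit to a target mean of 2 and squared coefficient of variation of 4.
    p, mu = fit_h2_balanced_means(mean=2.0, scv=4.0)
    print("p =", p, "mu =", mu)
    print("mean =", hyperexp_mean(p, mu), "variance =", hyperexp_variance(p, mu))
    rng = random.Random(0)
    print("one sample:", hyperexp_sample(p, mu, rng))
```

With scv = 1 the fit returns two identical phases, which is the exponential distribution; larger values of scv push more probability mass onto the fast phase while the slow phase carries the long tail.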