Log-logistic distribution
The log-logistic distribution is a continuous probability distribution on the positive real line, arising as the distribution of a random variable whose logarithm follows a logistic distribution; it is characterized by a scale parameter \alpha > 0 and a shape parameter \beta > 0.[1] Its cumulative distribution function is F(x) = \frac{(x/\alpha)^\beta}{1 + (x/\alpha)^\beta}, and its probability density function is f(x) = \frac{\beta}{\alpha} \left( \frac{x}{\alpha} \right)^{\beta - 1} \left[ 1 + \left( \frac{x}{\alpha} \right)^\beta \right]^{-2}, closed forms that facilitate computation in statistical modeling.[1][2] The underlying logistic function was introduced by Pierre-François Verhulst in 1838 to model population growth in demography; the distribution gained prominence in economics as the Fisk distribution following Peter R. Fisk's 1961 application to income and wealth distributions, and it has since become a staple of survival analysis thanks to its flexibility in capturing skewed, heavy-tailed data.[2] Key properties include a hazard rate function h(x) = \frac{\beta x^{\beta - 1} / \alpha^\beta}{1 + (x / \alpha)^\beta} that, for \beta > 1, rises to a single peak before decreasing (and is monotone decreasing for \beta \leq 1), making the distribution suitable for phenomena exhibiting initial reliability followed by wear-out, unlike monotonic alternatives such as the exponential distribution.[1][3] The mean exists only for \beta > 1 and equals \alpha \, \frac{\pi/\beta}{\sin(\pi/\beta)}, while the median is \alpha for every \beta, reflecting the fact that \log X belongs to a location-scale (logistic) family; higher moments and quantiles are also analytically tractable via beta functions.[1][2] In applications, the log-logistic distribution is widely employed in survival analysis for lifetime data, such as patient remission times in medical studies or component failure in reliability engineering, where its non-monotonic hazard outperforms models like the Weibull for certain 
datasets; it also appears in hydrology for flood frequency modeling and in economics for size distributions of firms or cities.[3][2] Parameter estimation typically relies on maximum likelihood methods, which remain efficient under the censored observations common in these fields, though the distribution's heavier tails relative to the log-normal can lead to distinct inferential behavior.[2]
Definition
Probability density function
The log-logistic distribution is obtained through a logarithmic transformation of the logistic distribution. Specifically, if the random variable Z follows a logistic distribution with location parameter \mu and scale parameter s = 1/\beta, then the random variable X = e^Z follows a log-logistic distribution with scale parameter \alpha = e^\mu and shape parameter \beta.[4] In its general form, the probability density function of a log-logistic random variable X with scale parameter \alpha > 0 and shape parameter \beta > 0 is f(x; \alpha, \beta) = \frac{\beta}{\alpha} \left( \frac{x}{\alpha} \right)^{\beta - 1} \left[ 1 + \left( \frac{x}{\alpha} \right)^\beta \right]^{-2}, \quad x > 0. [1] The scale parameter \alpha sets the overall scale of the distribution, with the median equal to \alpha regardless of \beta, providing a central tendency measure for positive-valued data.[4] The shape parameter \beta governs skewness and tail behavior: values of \beta > 1 yield lighter tails and a more nearly symmetric density, while \beta < 1 produces heavier tails and greater skewness, allowing flexibility in modeling varying degrees of extremity in data.[1] The log-logistic distribution is defined exclusively on the positive real line (x > 0), making it suitable for modeling strictly positive random variables such as lifetimes or durations.[4]
Cumulative distribution function
The cumulative distribution function (CDF) of the log-logistic distribution with scale parameter \alpha > 0 and shape parameter \beta > 0 is given by F(x; \alpha, \beta) = \frac{1}{1 + \left( \frac{\alpha}{x} \right)^\beta}, \quad x > 0. This expression is equivalent to F(x; \alpha, \beta) = \frac{\left( \frac{x}{\alpha} \right)^\beta}{1 + \left( \frac{x}{\alpha} \right)^\beta}, \quad x > 0. The CDF arises as the integral of the probability density function from 0 to x, yielding a closed-form expression that facilitates analytical computations in survival analysis and reliability engineering.[1] The survival function, defined as S(x) = 1 - F(x), takes the form S(x; \alpha, \beta) = \frac{1}{1 + \left( \frac{x}{\alpha} \right)^\beta}, \quad x > 0, and is particularly useful in reliability contexts for modeling the probability of survival beyond time x.[1] As x \to 0^+, F(x) \to 0, and as x \to \infty, F(x) \to 1. For every \beta, the survival function satisfies S(x) \sim \left( \frac{\alpha}{x} \right)^\beta as x \to \infty, a Pareto-type tail with tail exponent \beta; for \beta \leq 1 this tail is heavy enough that the mean does not exist.[5] The quantile function, obtained by inverting the CDF, is F^{-1}(p; \alpha, \beta) = \alpha \left( \frac{p}{1 - p} \right)^{1/\beta}, \quad 0 < p < 1. This form enables direct computation of percentiles.[1]
Parameterizations
Standard parameterization
The standard parameterization of the log-logistic distribution employs two positive parameters: a scale parameter \alpha > 0, which stretches the distribution along the positive real line, and a shape parameter \beta > 0, which governs the asymmetry and the rate of tail decay.[6] This formulation defines a continuous probability distribution supported on (0, \infty), making it suitable for modeling positive-valued random variables such as survival times or failure rates.[7] A key probabilistic interpretation arises from its connection to the logistic distribution: if X follows a log-logistic distribution with parameters \alpha and \beta, then \log(X / \alpha) follows a logistic distribution with location 0 and scale 1/\beta.[6] This transformation exhibits the log-logistic as a log-transformed variant of the logistic, preserving the latter's S-shaped cumulative distribution function on the logarithmic scale. The scale parameter \alpha sets the center of symmetry on the multiplicative scale for X, while \beta modulates the spread and kurtosis inherited from the logistic's scale.[7] The shape parameter \beta profoundly influences the distribution's form. Because \log X is logistic, and hence symmetric about \log \alpha, X is multiplicatively symmetric around \alpha for every \beta; the value \beta = 1 marks the boundary at and below which the density is monotonically decreasing and the mean does not exist. 
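This logarithmic relationship is easy to verify by simulation. The sketch below is illustrative (it assumes SciPy is available, and the parameter values are arbitrary examples): it draws logistic variates with location \ln \alpha and scale 1/\beta, exponentiates them, and compares the result with SciPy's log-logistic implementation, which is exposed under the name `scipy.stats.fisk` with shape `c` playing the role of \beta.

```python
import numpy as np
from scipy import stats

alpha, beta = 2.0, 3.0  # arbitrary example parameters
rng = np.random.default_rng(0)

# Z ~ logistic(location ln(alpha), scale 1/beta)  =>  X = exp(Z) ~ log-logistic(alpha, beta)
z = stats.logistic.rvs(loc=np.log(alpha), scale=1 / beta, size=50_000, random_state=rng)
x = np.exp(z)

# Kolmogorov-Smirnov comparison against scipy's log-logistic (named 'fisk', shape c = beta)
ks = stats.kstest(x, stats.fisk(c=beta, scale=alpha).cdf)
```

A small KS statistic, together with the empirical median of `x` falling near \alpha, is consistent with the two constructions describing the same distribution.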
For \beta > 1, the tails decay more rapidly, giving lighter tails than in the \beta = 1 case; conversely, \beta < 1 produces heavier tails, placing more probability mass in the extremes.[6] These properties allow the log-logistic to flexibly model both light- and heavy-tailed phenomena, such as accelerated failure times in reliability analysis.[4] Unlike distributions supported on the full real line, the log-logistic requires no location parameter: its inherent positivity, stemming from the exponential transformation of the logistic, eliminates the need for a shift to accommodate negative values.[7] This simplifies the parameterization while confining the distribution to positive outcomes, in line with applications such as survival analysis where non-positive values are infeasible.[6]
Scale-shape parameterization
The scale-shape parameterization of the log-logistic distribution uses a scale parameter \sigma > 0 and a shape parameter k > 0. In this form, the cumulative distribution function is F(x) = \frac{1}{1 + \left( \frac{\sigma}{x} \right)^k}, \quad x > 0, which coincides with the standard parameterization under \alpha = \sigma and \beta = k.[8] The scale parameter \sigma is the median of the distribution, since F(\sigma) = 1/2.[4] The corresponding probability density function is f(x) = \frac{k \sigma^k x^{-k-1}}{\left[ 1 + \left( \frac{\sigma}{x} \right)^k \right]^2}, \quad x > 0. [8] This parameterization aids applications that require direct estimation or visualization of central tendencies, such as quantile plots in reliability engineering.[9] Conversion to and from the standard \alpha-\beta parameterization is immediate (\sigma = \alpha, k = \beta), so all distributional properties carry over.[10] Historically, this scale-shape variant gained prominence in hydrology for analyzing flood magnitudes, where \sigma scales peak flows and k captures variability in extreme events, as introduced by Ahmad et al. in their 1988 study of flood frequency analysis in Scotland.[11] In such environmental models the shape parameter also determines the hazard profile, yielding decreasing or unimodal risk curves for flood occurrences.[12]
Properties
Moments
The raw moments of the log-logistic distribution with scale parameter \alpha > 0 and shape parameter \beta > 0 are given by \mu_k = \mathbb{E}[X^k] = \alpha^k \Gamma\left(1 + \frac{k}{\beta}\right) \Gamma\left(1 - \frac{k}{\beta}\right) for k satisfying |k| < \beta, where \Gamma denotes the gamma function. This expression follows from the integral representation of the moments using the beta function, B\left(1 + \frac{k}{\beta}, 1 - \frac{k}{\beta}\right) = \frac{\Gamma\left(1 + \frac{k}{\beta}\right) \Gamma\left(1 - \frac{k}{\beta}\right)}{\Gamma(2)}, and the reflection formula \Gamma(z) \Gamma(1 - z) = \frac{\pi}{\sin(\pi z)}, which equivalently yields \mu_k = \frac{\alpha^k \pi k / \beta}{\sin(\pi k / \beta)}. Moments of order k \geq \beta do not exist due to the heavy-tailed nature of the distribution.[4] The mean exists for \beta > 1 and is \mathbb{E}[X] = \alpha \frac{\pi / \beta}{\sin(\pi / \beta)}. The second raw moment exists for \beta > 2, and the variance is then \mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = \alpha^2 \left[ \frac{2\pi / \beta}{\sin(2\pi / \beta)} - \left( \frac{\pi / \beta}{\sin(\pi / \beta)} \right)^2 \right]. Higher-order moments follow similarly from the general raw moment formula when \beta > k. The skewness and kurtosis, which measure asymmetry and tail heaviness, exist only for \beta > 3 and \beta > 4, respectively, reflecting the distribution's potential for infinite third and fourth moments at lower shape values. 
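These closed-form moment expressions are straightforward to sanity-check numerically. The sketch below is illustrative, assuming SciPy is available; `scipy.stats.fisk` is SciPy's name for the log-logistic (shape `c` corresponding to \beta, `scale` to \alpha), and the parameter values are arbitrary examples.

```python
import numpy as np
from scipy import stats

alpha, beta = 2.0, 3.0  # arbitrary example scale and shape

def llogis_moment(k, alpha, beta):
    """Raw moment E[X^k] = alpha^k * (k*pi/beta) / sin(k*pi/beta), defined for k < beta."""
    if k >= beta:
        raise ValueError("E[X^k] exists only for k < beta")
    t = np.pi * k / beta
    return alpha**k * t / np.sin(t)

mean = llogis_moment(1, alpha, beta)            # requires beta > 1
var = llogis_moment(2, alpha, beta) - mean**2   # requires beta > 2

# SciPy's log-logistic for comparison (shape c = beta, scale = alpha)
dist = stats.fisk(c=beta, scale=alpha)
```

Here `dist.mean()` and `dist.var()` should agree with `mean` and `var` to floating-point accuracy.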
The skewness coefficient is \gamma_1 = \frac{2\pi^2 \csc^3(\pi / \beta) - 6\pi\beta \csc(2\pi / \beta) \csc(\pi / \beta) + 3\beta^2 \csc(3\pi / \beta)}{\left[ \pi \left(2\beta \csc(2\pi / \beta) - \pi \csc^2(\pi / \beta) \right) \right]^{3/2}}, and the kurtosis coefficient is \gamma_2 = \frac{6\pi^2 \beta \sec(\pi / \beta) \csc^3(\pi / \beta) - 3\pi^3 \csc^4(\pi / \beta) - 12\pi \beta^2 \csc(3\pi / \beta) \csc(\pi / \beta) + 4\beta^3 \csc(4\pi / \beta)}{\left[ \pi \left( \pi \csc^2(\pi / \beta) - 2\beta \csc(2\pi / \beta) \right) \right]^2}. These expressions are derived from the raw moments up to order four using standard relations for central moments. For \beta > 1, the distribution is positively skewed, with skewness decreasing toward zero as \beta increases, approaching symmetry in the limit.[13] The moment-generating function M(t) = \mathbb{E}[e^{tX}] does not exist for any t > 0, because the polynomial (Pareto-type) right tail makes \mathbb{E}[e^{tX}] infinite; series expansions in the raw moments are therefore only formal and must be truncated at orders below \beta.[3]
Quantiles
The quantile function of the log-logistic distribution, which inverts the cumulative distribution function to provide the value x_p such that F(x_p) = p for 0 < p < 1, is given by x_p = \alpha \left( \frac{p}{1-p} \right)^{1/\beta}, where \alpha > 0 is the scale parameter and \beta > 0 is the shape parameter.[4][7][2] This closed-form expression arises directly from solving F(x) = p for x, leveraging the logistic form of the underlying distribution.[14] For p = 0.5, the median simplifies to x_{0.5} = \alpha, independent of the shape parameter \beta, which highlights the scale's role in centering the distribution.[4][7][2] The first and third quartiles are x_{0.25} = \alpha \cdot 3^{-1/\beta} and x_{0.75} = \alpha \cdot 3^{1/\beta}, respectively, yielding an interquartile range of \alpha (3^{1/\beta} - 3^{-1/\beta}).[2] The ratio x_{0.75}/x_{0.25} = 3^{2/\beta} depends solely on \beta, illustrating how larger shape values produce narrower spreads and more symmetric behavior around the median.[7] In the upper tail, as p \to 1, the quantile exhibits heavy-tailed behavior approximated by x_p \sim \alpha (1-p)^{-1/\beta}, reflecting the distribution's polynomial decay and potential for extreme values.[4] This form is particularly useful for estimating high percentiles in applications like survival analysis, where tail risks are critical. The explicit closed-form quantile function offers numerical stability and efficiency in simulations and percentile computations, avoiding the need for iterative inversion methods.[7][14]
Mode and other statistics
The log-logistic distribution with shape parameter \beta > 1 is unimodal, with the mode occurring at x = \alpha \left( \frac{\beta - 1}{\beta + 1} \right)^{1/\beta}. For \beta \leq 1, the probability density function is monotonically decreasing on (0, \infty), so the mode lies at the boundary x = 0.[6] The excess kurtosis, defined for \beta > 4, can be expressed using trigonometric functions derived from the higher-order moments; specifically, it involves terms like \frac{4\pi/\beta}{\sin(4\pi/\beta)} normalized by powers of the variance, and the distribution is leptokurtic (excess kurtosis > 0), particularly for smaller values of \beta.[6] The log-logistic distribution exhibits power-law behavior in the right tail, with tail exponent \beta: for large x, the survival function satisfies P(X > x) \sim \left( \frac{\alpha}{x} \right)^\beta.[6] The characteristic function \phi(t) = \mathbb{E}[e^{i t X}] has no known closed form; the raw moments \mathbb{E}[X^n] = \alpha^n \frac{n\pi/\beta}{\sin(n\pi/\beta)}, defined only for n < \beta, yield just a truncated formal expansion \sum_{n} \frac{(i t)^n}{n!} \mathbb{E}[X^n], since moments of higher order diverge.
Parameter estimation
Method of moments
The method of moments for estimating the parameters of the log-logistic distribution equates the first two theoretical moments to their sample counterparts, requiring numerical solution because the resulting equations are nonlinear. The theoretical mean is E[X] = \alpha \frac{\pi}{\beta} \csc\left( \frac{\pi}{\beta} \right) for \beta > 1, and the second moment is E[X^2] = \alpha^2 \frac{2\pi}{\beta} \csc\left( \frac{2\pi}{\beta} \right) for \beta > 2. To apply the method, first compute the sample mean \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i and the sample second moment m_2 = \frac{1}{n} \sum_{i=1}^n x_i^2. Set \bar{x} = \hat{\alpha} \frac{\pi}{\hat{\beta}} \csc\left( \frac{\pi}{\hat{\beta}} \right) and m_2 = \hat{\alpha}^2 \frac{2\pi}{\hat{\beta}} \csc\left( \frac{2\pi}{\hat{\beta}} \right). Solving the first equation for the scale parameter gives \hat{\alpha} = \bar{x} \frac{\hat{\beta}}{\pi} \sin\left( \frac{\pi}{\hat{\beta}} \right). Substituting this into the second equation yields a transcendental equation in \hat{\beta} alone: m_2 = 2 \bar{x}^2 \frac{\hat{\beta}}{\pi} \frac{ \sin^2 \left( \frac{\pi}{\hat{\beta}} \right) }{ \sin \left( \frac{2\pi}{\hat{\beta}} \right) }, which, by the identity \sin(2\theta) = 2 \sin\theta \cos\theta, simplifies to \frac{\hat{\beta}}{\pi} \tan\left( \frac{\pi}{\hat{\beta}} \right) = \frac{m_2}{\bar{x}^2}. The left-hand side decreases monotonically from +\infty to 1 as \hat{\beta} increases from 2 to \infty, so a bracketing root-finder converges to the unique solution whenever m_2 > \bar{x}^2; once \hat{\beta} is found, \hat{\alpha} follows directly from the expression above. This approach leverages the theoretical moments discussed in the properties section for matching to data. 
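The moment-matching procedure can be sketched in a few lines of code (illustrative only; it assumes SciPy is available, and the simulated data with true parameters \alpha = 2, \beta = 6 are arbitrary). Eliminating \alpha between the two moment equations leaves a single monotone equation in \beta that a bracketing root-finder handles directly:

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

# Simulated example data; the generating parameters (alpha=2, beta=6) are arbitrary.
rng = np.random.default_rng(0)
data = stats.fisk.rvs(c=6.0, scale=2.0, size=200_000, random_state=rng)

xbar = data.mean()
m2 = np.mean(data**2)

# Eliminating alpha from the two moment equations leaves one equation in beta:
#   (beta/pi) * tan(pi/beta) = m2 / xbar^2,
# whose left side decreases monotonically from +inf to 1 as beta runs over (2, inf).
def g(beta):
    return (beta / np.pi) * np.tan(np.pi / beta) - m2 / xbar**2

beta_hat = brentq(g, 2.0 + 1e-9, 1e6)
alpha_hat = xbar * beta_hat / np.pi * np.sin(np.pi / beta_hat)
```

In practice these moment estimates are often used as starting values for maximum likelihood rather than as final estimates.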
The method is straightforward to implement but has limitations: it is inapplicable for small \beta (heavy-tailed cases), since the mean is undefined for \beta \leq 1 and the second moment for \beta \leq 2, and even where the moments exist the estimates exhibit finite-sample bias and higher mean squared error than alternatives. Overall, while simpler than maximum likelihood estimation, it is less efficient, particularly for heavy-tailed cases, owing to poorer asymptotic properties and its sensitivity to the existence of moments.
Maximum likelihood estimation
Maximum likelihood estimation (MLE) for the log-logistic distribution involves maximizing the log-likelihood function derived from the probability density function. For a random sample x_1, \dots, x_n from the distribution with scale parameter \alpha > 0 and shape parameter \beta > 0, the log-likelihood is \ell(\alpha, \beta \mid \mathbf{x}) = n \ln \beta - n \ln \alpha + (\beta - 1) \sum_{i=1}^n \ln \left( \frac{x_i}{\alpha} \right) - 2 \sum_{i=1}^n \ln \left[ 1 + \left( \frac{x_i}{\alpha} \right)^\beta \right]. [15] This expression accounts for the full probabilistic contribution of each observation. To find the MLEs \hat{\alpha} and \hat{\beta}, the score equations are obtained by setting the partial derivatives of \ell with respect to \alpha and \beta to zero. Writing z_i = (x_i / \alpha)^\beta, they form the system of nonlinear equations: \sum_{i=1}^n \frac{z_i}{1 + z_i} = \frac{n}{2}, \qquad \sum_{i=1}^n \frac{z_i \ln(x_i / \alpha)}{1 + z_i} = \frac{1}{2} \left[ \frac{n}{\beta} + \sum_{i=1}^n \ln\left( \frac{x_i}{\alpha} \right) \right]. [15] No closed-form solutions exist, so numerical methods such as the Newton-Raphson algorithm are required to solve this system.[16] Under standard regularity conditions (which hold for \beta > 0), the MLEs are consistent and asymptotically efficient, with asymptotic normality \sqrt{n} (\hat{\theta} - \theta) \to N(0, I(\theta)^{-1}), where \theta = (\alpha, \beta) and I(\theta) is the Fisher information matrix.[16] For the log-logistic distribution, the expected Fisher information matrix per observation is diagonal: I(\beta, \alpha) = \begin{pmatrix} \frac{1 + \pi^2 / 3}{3 \beta^2} & 0 \\ 0 & \frac{\beta^2}{3 \alpha^2} \end{pmatrix}, leading to asymptotic variances \mathrm{Var}(\hat{\beta}) \approx \frac{3 \beta^2}{n (1 + \pi^2 / 3)} and \mathrm{Var}(\hat{\alpha}) \approx \frac{3 \alpha^2}{n \beta^2}.[17] Software implementations facilitate MLE for the log-logistic distribution. 
In R, the flexsurv package fits the model by direct numerical maximization of the likelihood, with built-in support for right-censored data.[18] In Python, scipy.stats.fisk (the log-logistic under another name) provides a fit() method based on MLE, and censoring is handled by survival libraries such as lifelines.[19] The expectation-maximization (EM) algorithm can also be useful for handling censored observations in survival contexts.[20]
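For uncensored data, the SciPy route can be sketched as follows (illustrative code; the parameter values, seed, and sample size are arbitrary). Fixing `floc=0` restricts the fit to the two-parameter log-logistic, so the returned shape estimates \beta and the returned scale estimates \alpha:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha_true, beta_true = 1.5, 2.5  # arbitrary true parameters for the simulation
data = stats.fisk.rvs(c=beta_true, scale=alpha_true, size=5_000, random_state=rng)

# fisk.fit returns (shape c, loc, scale); floc=0 pins the location at zero,
# so c estimates beta and scale estimates alpha.
beta_hat, _, alpha_hat = stats.fisk.fit(data, floc=0)

# Rough standard errors from the asymptotic variances
# Var(beta_hat) ~ 3 beta^2 / [n (1 + pi^2/3)],  Var(alpha_hat) ~ 3 alpha^2 / (n beta^2)
n = len(data)
se_beta = np.sqrt(3 * beta_hat**2 / (n * (1 + np.pi**2 / 3)))
se_alpha = np.sqrt(3 * alpha_hat**2 / (n * beta_hat**2))
```

With a sample of this size the estimates typically land within a few standard errors of the generating values.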
Challenges in MLE include potential non-convergence for small sample sizes or when \beta is near 0, owing to the nonlinearity of the score equations and sensitivity to outliers.[15] In such cases, profile likelihood methods, which maximize over \alpha for fixed \beta to obtain a one-dimensional profile \ell_p(\beta), can aid in estimating \beta and constructing confidence intervals.[21]
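The profiling step can be sketched as follows (illustrative only; the simulated data, brackets, and bounds are arbitrary choices). For fixed \beta, the \alpha score equation \sum_i z_i/(1+z_i) = n/2 has a unique root because each term decreases monotonically in \alpha, so it can be solved by bracketing before the one-dimensional profile is maximized over \beta:

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq, minimize_scalar

rng = np.random.default_rng(1)
data = stats.fisk.rvs(c=3.0, scale=2.0, size=2_000, random_state=rng)  # arbitrary example data

def loglik(alpha, beta, x):
    # Log-likelihood of the two-parameter log-logistic
    n = x.size
    z = (x / alpha) ** beta
    return (n * np.log(beta) - n * np.log(alpha)
            + (beta - 1) * np.sum(np.log(x / alpha))
            - 2 * np.sum(np.log1p(z)))

def profile_loglik(beta, x):
    # Solve the alpha score equation sum_i z_i/(1+z_i) = n/2 for fixed beta.
    # Written as 1/(1 + (alpha/x_i)^beta), each term decreases monotonically in alpha,
    # so the root is unique and a sign-changing bracket is easy to construct.
    def score(alpha):
        return np.sum(1.0 / (1.0 + (alpha / x) ** beta)) - x.size / 2
    alpha_hat = brentq(score, x.min() * 1e-3, x.max() * 1e3)
    return loglik(alpha_hat, beta, x)

res = minimize_scalar(lambda b: -profile_loglik(b, data), bounds=(0.2, 50), method="bounded")
beta_hat = res.x
```

A profile-likelihood confidence interval for \beta then collects the values of b with \ell_p(b) within \chi^2_{1,1-\gamma}/2 of the maximum.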