
Log-normal distribution

In probability theory, the log-normal distribution is a continuous probability distribution defined for positive real numbers, where the natural logarithm of the random variable follows a normal distribution. It is parameterized by two values: μ (the mean of the underlying normal distribution) and σ (its standard deviation, with σ > 0), such that if Y ~ N(μ, σ²), then X = e^Y has a log-normal distribution. The probability density function is given by
f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left( -\frac{(\ln x - \mu)^2}{2\sigma^2} \right)
for x > 0, and zero otherwise.
Key statistical properties distinguish the log-normal distribution from the normal distribution: it is inherently right-skewed and cannot take negative values, making it suitable for modeling multiplicative processes or phenomena bounded below by zero. The mean is E[X] = e^{μ + σ²/2}, while the variance is Var(X) = (e^{σ²} - 1) e^{2μ + σ²}; both depend exponentially on the parameters, highlighting the distribution's sensitivity to σ for larger spreads. Unlike the normal distribution, its CDF lacks an elementary closed form but can be expressed via the standard normal CDF: F(x) = Φ((ln x - μ)/σ), where Φ is the cumulative distribution function of the standard normal. These properties arise from the exponential transformation, which stretches the positive tail and compresses values near zero.

The log-normal distribution was first formally described in 1879 by Francis Galton and Donald McAlister in connection with the geometric mean, building on earlier observations of skewed data patterns. It has since become a foundational model in various fields due to its ability to capture real-world data exhibiting multiplicative effects, such as growth rates or error accumulation. In finance, it underpins the modeling of stock prices and asset returns under assumptions such as geometric Brownian motion, where returns are normally distributed but prices are log-normally distributed. In reliability engineering, it describes failure times for systems subject to fatigue, corrosion, or degradation, such as cycles-to-failure in materials or repair durations. Biological applications include modeling organism sizes or species abundance, while in environmental science it fits distributions of particle sizes or concentrations (e.g., radon levels in homes). These uses leverage its flexibility for positive, skewed data, often validated by checking that log-transformed observations are approximately normal.

Definitions

Probability density function

The log-normal distribution is obtained by applying an exponential transformation to a normally distributed random variable. Specifically, if Y \sim \mathcal{N}(\mu, \sigma^2), then the random variable X = \exp(Y) follows a log-normal distribution, denoted X \sim \mathrm{LN}(\mu, \sigma^2). The probability density function (PDF) of X is derived using the change-of-variable technique from the PDF of Y. Let f_Y(y) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(y - \mu)^2}{2\sigma^2} \right) be the PDF of Y. Substituting y = \ln x and accounting for the Jacobian \left| \frac{dy}{dx} \right| = \frac{1}{x}, the PDF of X becomes f_X(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left( -\frac{(\ln x - \mu)^2}{2\sigma^2} \right) for x > 0, and f_X(x) = 0 otherwise. In this parameterization, \mu \in \mathbb{R} represents the location parameter, corresponding to the mean of the underlying normal distribution \ln X, while \sigma > 0 is the scale parameter, representing the standard deviation of \ln X. This form was formalized in the seminal treatment of the distribution. The PDF has support on the positive real line (0, \infty) and is positively skewed, with the skewness becoming more pronounced as \sigma increases, leading to a longer right tail. The mode, which maximizes the PDF, occurs at x = \exp(\mu - \sigma^2). Graphically, the shape of the PDF varies with the parameters. For a fixed \sigma, increasing \mu shifts the distribution rightward, moving the mode and peak higher along the x-axis without altering the spread. Conversely, for fixed \mu, larger values of \sigma result in a lower peak, greater dispersion, and increased asymmetry, with the mode shifting leftward relative to the mean.
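The change-of-variable formula above can be checked numerically. A minimal Python sketch (assuming NumPy and SciPy are available; the parameter values are hypothetical) evaluates the PDF directly and compares it against scipy.stats.lognorm, which parameterizes the distribution with shape s = σ and scale = e^μ:

```python
import numpy as np
from scipy.stats import lognorm, norm

# Hypothetical parameters for illustration.
mu, sigma = 0.5, 0.75

def lognormal_pdf(x, mu, sigma):
    """PDF from the change-of-variable formula: f(x) = phi((ln x - mu)/sigma) / (sigma * x)."""
    z = (np.log(x) - mu) / sigma
    return norm.pdf(z) / (sigma * x)

x = np.linspace(0.01, 10, 5)
manual = lognormal_pdf(x, mu, sigma)
# SciPy parameterizes the log-normal with shape s = sigma and scale = exp(mu).
library = lognorm.pdf(x, s=sigma, scale=np.exp(mu))
assert np.allclose(manual, library)

# Mode at exp(mu - sigma^2): the point where the PDF is maximized.
mode = np.exp(mu - sigma**2)
```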

Cumulative distribution function

The cumulative distribution function (CDF) of a log-normal random variable X with parameters \mu \in \mathbb{R} and \sigma > 0 is F(x) = \begin{cases} 0 & \text{if } x \leq 0, \\ \Phi\left( \frac{\ln x - \mu}{\sigma} \right) & \text{if } x > 0, \end{cases} where \Phi denotes the CDF of the standard normal distribution. This form arises because if X is log-normal, then Y = \ln X follows a normal distribution with mean \mu and standard deviation \sigma, so F(x) = P(X \leq x) = P(Y \leq \ln x) = \Phi\left( \frac{\ln x - \mu}{\sigma} \right) for x > 0. The log-normal CDF lacks a closed-form expression independent of special functions and is typically evaluated numerically using algorithms for the standard normal CDF. The standard normal CDF \Phi(z) relates to the error function \erf(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2} \, dt via \Phi(z) = \frac{1}{2} \left[ 1 + \erf\left( \frac{z}{\sqrt{2}} \right) \right], yielding F(x) = \frac{1}{2} \left[ 1 + \erf\left( \frac{\ln x - \mu}{\sigma \sqrt{2}} \right) \right] for x > 0. Numerical computation often employs series expansions, continued fractions, or asymptotic approximations for \erf or \Phi, especially for extreme values of the argument.

Due to the heavy right tail of the log-normal distribution, the CDF F(x) approaches 1 slowly as x \to \infty, reflecting the positive skewness and potential for large outliers. This tail behavior makes the distribution suitable for modeling phenomena like asset prices or particle sizes, where extreme values occur infrequently but impact cumulative probabilities significantly. For illustration, consider the standard log-normal case with \mu = 0 and \sigma = 1. Here, F(1) = \Phi(0) = 0.5, corresponding to the median at x = e^\mu = 1. At x = e \approx 2.718, F(e) = \Phi(1) \approx 0.8413. For a larger value, x = 10, \ln 10 \approx 2.3026, so F(10) = \Phi(2.3026) \approx 0.9893, showing the gradual approach to 1.
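The worked values above are easy to reproduce. The following sketch (Python with NumPy and SciPy assumed; it uses the standard log-normal μ = 0, σ = 1 from the example) evaluates the CDF through the error-function identity and cross-checks it against the standard normal CDF:

```python
import numpy as np
from math import erf
from scipy.stats import norm

mu, sigma = 0.0, 1.0  # standard log-normal from the worked example

def lognormal_cdf(x, mu, sigma):
    """CDF via the error function: F(x) = 0.5 * (1 + erf((ln x - mu)/(sigma*sqrt(2))))."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + erf((np.log(x) - mu) / (sigma * np.sqrt(2.0))))

# Reproduce the worked values from the text.
print(lognormal_cdf(1.0, mu, sigma))    # 0.5      (median at x = 1)
print(lognormal_cdf(np.e, mu, sigma))   # ~0.8413  (= Phi(1))
print(lognormal_cdf(10.0, mu, sigma))   # ~0.9893  (= Phi(ln 10))

# Cross-check against the standard normal CDF.
assert np.isclose(lognormal_cdf(10.0, mu, sigma),
                  norm.cdf((np.log(10.0) - mu) / sigma))
```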

Parameterization

The log-normal distribution is commonly parameterized by two parameters: \mu, the mean of the natural logarithm of the random variable, and \sigma > 0, the standard deviation of the natural logarithm. These parameters arise naturally because if X follows a log-normal distribution, then \ln X follows a normal distribution with mean \mu and standard deviation \sigma. An alternative parameterization expresses the distribution in terms of the geometric mean G = e^{\mu} and the geometric standard deviation S = e^{\sigma}. The geometric mean G equals the median of the distribution and serves as a measure of central tendency for multiplicative processes, while S quantifies the spread on a multiplicative scale, with values greater than 1 indicating variability (S = 1 only in the degenerate case \sigma = 0). Another common reparameterization uses the arithmetic mean m = e^{\mu + \sigma^2/2} and the variance v = (e^{\sigma^2} - 1) e^{2\mu + \sigma^2}. Here, m is the expected value of X, which exceeds the median due to the right skew, and v captures the overall dispersion on the original scale. These moments provide direct links to sample statistics for data fitting.

Conversions between these parameter sets are straightforward. For instance, starting from the arithmetic mean m and variance v, first compute \sigma^2 = \ln\left(1 + \frac{v}{m^2}\right), then \mu = \ln m - \frac{\sigma^2}{2}. Conversely, from \mu and \sigma, compute m = e^{\mu + \sigma^2/2} and v = m^2 (e^{\sigma^2} - 1). These relations derive directly from the moment expressions and facilitate switching between scales.

The standard \mu-\sigma parameterization offers mathematical convenience, as operations on \ln X reduce to normal distribution properties, simplifying derivations in theoretical work. In contrast, the geometric mean and standard deviation parameterization enhances interpretability in applications involving multiplicative growth, such as financial modeling of asset returns or biological sizes, where ratios and compounded effects are intuitive. The arithmetic mean and variance form, meanwhile, aligns with conventional summary statistics but can obscure the underlying log-transform nature, potentially complicating analysis of skewed data.
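These conversion formulas translate directly into code. A small sketch (Python with NumPy assumed; the values are illustrative) round-trips between the (μ, σ²) and (m, v) parameterizations:

```python
import numpy as np

def mu_sigma_from_mean_var(m, v):
    """Recover (mu, sigma^2) of ln X from the arithmetic mean m and variance v of X."""
    sigma2 = np.log(1.0 + v / m**2)
    mu = np.log(m) - sigma2 / 2.0
    return mu, sigma2

def mean_var_from_mu_sigma(mu, sigma2):
    """Arithmetic mean and variance of X from (mu, sigma^2)."""
    m = np.exp(mu + sigma2 / 2.0)
    v = m**2 * (np.exp(sigma2) - 1.0)
    return m, v

# Round trip with hypothetical values.
mu, sigma2 = 1.0, 0.25
m, v = mean_var_from_mu_sigma(mu, sigma2)
assert np.allclose(mu_sigma_from_mean_var(m, v), (mu, sigma2))

# Geometric mean / geometric SD parameterization: G = e^mu, S = e^sigma.
G, S = np.exp(mu), np.exp(np.sqrt(sigma2))
```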

Characterization

Moments and characteristic function

The moments of a log-normal random variable X, defined such that \ln X \sim \mathcal{N}(\mu, \sigma^2), are derived by leveraging the moment-generating properties of the underlying normal distribution. Let Y = \ln X, so Y \sim \mathcal{N}(\mu, \sigma^2). The k-th raw moment is then E[X^k] = E[e^{k Y}] = \exp\left(k \mu + \frac{k^2 \sigma^2}{2}\right), which follows directly from evaluating the moment-generating function of Y at t = k. This formula holds for any real k, though it is typically applied for positive integers in moment analysis. In particular, the mean is E[X] = \exp(\mu + \sigma^2 / 2). The variance, as a central moment, is obtained using the second raw moment: \text{Var}(X) = E[X^2] - (E[X])^2 = \exp(2\mu + 2\sigma^2) - \exp(2\mu + \sigma^2) = \exp(2\mu + \sigma^2) \left( \exp(\sigma^2) - 1 \right). Higher-order central moments can be computed similarly from the raw moments, though they grow rapidly due to the heavy-tailed nature of the distribution. Measures of asymmetry and tail heaviness are captured by the skewness and kurtosis, which depend only on \sigma and not on \mu. The skewness coefficient is \gamma_1 = \frac{E[(X - E[X])^3]}{( \text{Var}(X) )^{3/2}} = \left( e^{\sigma^2} + 2 \right) \sqrt{ e^{\sigma^2} - 1 }, indicating positive skewness for \sigma > 0, with the distribution becoming increasingly right-skewed as \sigma increases. The kurtosis is \gamma_2 = \frac{E[(X - E[X])^4]}{ ( \text{Var}(X) )^2 } = e^{4\sigma^2} + 2 e^{3\sigma^2} + 3 e^{2\sigma^2} - 3, which exceeds 3 for \sigma > 0, reflecting leptokurtosis and heavier tails compared to the normal distribution. These expressions are derived from the first four raw moments using standard formulas for standardized moments.

The moment-generating function of X, M_X(t) = E[e^{t X}], does not exist for any t > 0: the defining integral diverges because e^{tx} grows faster than the density decays in the right tail. However, the raw moments E[X^k] remain accessible via the moment-generating function of the underlying normal variable Y, as noted earlier. The characteristic function \phi_X(t) = E[e^{i t X}] also lacks a simple closed form. It can be represented as the integral \phi_X(t) = E\left[ e^{i t e^Y} \right] = \int_{-\infty}^{\infty} e^{i t e^y} \cdot \frac{1}{\sqrt{2\pi} \sigma} \exp\left( -\frac{(y - \mu)^2}{2\sigma^2} \right) \, dy, which requires numerical evaluation or approximation for computation. Series expansions, such as those using Hermite functions, provide rapidly convergent representations for practical use.

The log-normal distribution is positively skewed when the shape parameter \sigma > 0, leading to the characteristic inequality that the mean exceeds the median, which in turn exceeds the mode: \mathbb{E}[X] > \exp(\mu) > \exp(\mu - \sigma^2). This ordering highlights the distribution's asymmetry, with longer tails on the right, and is a direct consequence of the exponential transformation of the underlying normal variable. The mode, defined as the value that maximizes the probability density function, occurs at x = \exp(\mu - \sigma^2). To derive this, one takes the derivative of the density f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left( -\frac{(\ln x - \mu)^2}{2\sigma^2} \right) with respect to x and sets it to zero, yielding the maximizer after simplification. The median, by contrast, is \exp(\mu), which corresponds to the 50th percentile and remains unchanged under the logarithmic transformation because the median of \ln X is \mu. The general p-th quantile (or 100p-th percentile) of the log-normal distribution is x_p = \exp\left( \mu + \sigma \Phi^{-1}(p) \right), where \Phi^{-1} denotes the quantile function (inverse CDF) of the standard normal distribution.
This formula arises from solving F(x_p) = p, where F(x) = \Phi\left( \frac{\ln x - \mu}{\sigma} \right), confirming the direct correspondence to the quantiles of the underlying normal distribution. In applications such as risk analysis and actuarial science, partial expectations like the conditional expectation \mathbb{E}[X \mid X > q] quantify tail risks beyond a threshold q > 0. For the log-normal distribution, this is given by \mathbb{E}[X \mid X > q] = \frac{\exp(\mu + \sigma^2/2) \left[ 1 - \Phi\left( \frac{\ln q - \mu - \sigma^2}{\sigma} \right) \right]}{1 - \Phi\left( \frac{\ln q - \mu}{\sigma} \right)}, which combines the unconditional mean with the normal cumulative distribution function \Phi. This measure is particularly useful for modeling exceedances in financial losses or insurance claims, where the heavy right tail amplifies potential impacts.
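The moment, quantile, and partial-expectation formulas can be validated by simulation. The sketch below (Python with NumPy and SciPy assumed; μ, σ, and the threshold q are hypothetical choices) compares each closed form against Monte Carlo estimates:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu, sigma = 0.2, 0.6
x = np.exp(rng.normal(mu, sigma, size=1_000_000))  # log-normal sample

# Closed-form moments vs. Monte Carlo.
mean_cf = np.exp(mu + sigma**2 / 2)
var_cf = (np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2)
skew_cf = (np.exp(sigma**2) + 2) * np.sqrt(np.exp(sigma**2) - 1)
print(mean_cf, x.mean())   # should agree closely
print(var_cf, x.var())

# p-th quantile: x_p = exp(mu + sigma * Phi^{-1}(p)).
p = 0.95
x_p = np.exp(mu + sigma * norm.ppf(p))
print(x_p, np.quantile(x, p))

# Conditional tail expectation E[X | X > q] for a threshold q.
q = 2.0
num = mean_cf * (1 - norm.cdf((np.log(q) - mu - sigma**2) / sigma))
den = 1 - norm.cdf((np.log(q) - mu) / sigma)
print(num / den, x[x > q].mean())
```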

Properties

Domain probabilities and transformations

The log-normal distribution is defined on the positive real line, with support x > 0, and probabilities over intervals within this domain are computed using the cumulative distribution function (CDF). For a log-normal random variable X \sim \mathrm{LN}(\mu, \sigma^2), the probability that X falls between two positive values a > 0 and b > a is given by P(a < X < b) = F(b) - F(a), where the CDF is F(x) = \Phi\left( \frac{\ln x - \mu}{\sigma} \right) and \Phi denotes the standard normal CDF. This difference leverages the monotonicity of the CDF to quantify the likelihood of X lying within specified bounds, which is particularly useful for modeling bounded positive outcomes such as particle sizes or income levels. The survival function, which gives the probability of exceeding a threshold x > 0, is P(X > x) = 1 - F(x) = 1 - \Phi\left( \frac{\ln x - \mu}{\sigma} \right). Equivalently, this can be expressed using the standard normal survival function as \bar{\Phi}\left( \frac{\ln x - \mu}{\sigma} \right), where \bar{\Phi}(z) = 1 - \Phi(z). This tail probability is essential for assessing rare events in the right tail of the distribution, given its positive skewness.

A defining property of the log-normal distribution is its closure under certain transformations, stemming from the normality of \ln X. Specifically, \ln X \sim \mathcal{N}(\mu, \sigma^2), so the natural logarithm transforms the log-normal variable to a normal one, facilitating easier computation of moments or simulations. For powers, if r > 0, then X^r \sim \mathrm{LN}(r\mu, r^2 \sigma^2), preserving the log-normal family with scaled parameters; this follows because \ln(X^r) = r \ln X \sim \mathcal{N}(r\mu, r^2 \sigma^2). The reciprocal transformation yields 1/X \sim \mathrm{LN}(-\mu, \sigma^2), as \ln(1/X) = -\ln X \sim \mathcal{N}(-\mu, \sigma^2), which is useful for modeling inverse processes like failure rates. In reliability engineering, the log-normal distribution models failure times of components, where the survival function computes the probability of exceeding a design lifetime threshold. For instance, if failure times follow \mathrm{LN}(\mu, \sigma^2), then P(T > t_0) = 1 - \Phi\left( \frac{\ln t_0 - \mu}{\sigma} \right) estimates the reliability beyond t_0, aiding in setting safety margins for systems like mechanical parts under fatigue.
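As an illustration of these probability computations, the following sketch (Python with NumPy and SciPy assumed; the parameters and thresholds are hypothetical) evaluates an interval probability and the reliability beyond a design lifetime:

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 2.0, 0.5  # hypothetical log failure-time parameters

def interval_prob(a, b, mu, sigma):
    """P(a < X < b) = F(b) - F(a), with F computed via the standard normal CDF."""
    F = lambda x: norm.cdf((np.log(x) - mu) / sigma)
    return F(b) - F(a)

def reliability(t0, mu, sigma):
    """Survival function P(T > t0) = 1 - Phi((ln t0 - mu)/sigma)."""
    return norm.sf((np.log(t0) - mu) / sigma)  # sf = 1 - cdf

print(interval_prob(5.0, 15.0, mu, sigma))
print(reliability(10.0, mu, sigma))  # probability a component outlasts t0 = 10

# Closure under transformations: X^r ~ LN(r*mu, r^2*sigma^2), 1/X ~ LN(-mu, sigma^2).
```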

Arithmetic and geometric moments

The arithmetic mean of a log-normal random variable X with parameters \mu and \sigma^2, denoted \mathbb{E}[X], is \exp\left(\mu + \frac{\sigma^2}{2}\right), while the geometric mean is \exp(\mu), coinciding with the median. These differ due to the right-skewed nature of the distribution: the arithmetic mean exceeds the geometric mean by the factor \exp(\sigma^2/2), reflecting the influence of the heavy right tail on the average. For skewed data modeled by the log-normal distribution, the arithmetic mean introduces upward bias relative to the geometric mean, as captured by the inequality \mathbb{E}[X] \geq \exp(\mathbb{E}[\ln X]), with equality holding only when \sigma = 0. This follows from Jensen's inequality applied to the convex exponential function and underscores why the geometric mean provides a more stable measure for multiplicative processes underlying log-normality.

Geometric moments arise naturally in the context of products of independent log-normal variables X_i \sim \mathrm{LN}(\mu_i, \sigma_i^2), where \mathbb{E}\left[\prod_{i=1}^n X_i\right] = \exp\left(\sum_{i=1}^n \mu_i + \frac{1}{2} \sum_{i=1}^n \sigma_i^2\right), leveraging the additive property of logarithms to preserve the log-normal form for the product. This multiplicative structure highlights the suitability of geometric moments for aggregating variables in scenarios involving compounded growth or successive proportions. In applications, the geometric mean is preferred over the arithmetic mean for averaging rates of return in finance, where asset prices follow log-normal dynamics, ensuring accurate representation of compounded performance without the upward bias of arithmetic averaging. Similarly, in the natural sciences, the geometric mean characterizes particle sizes under log-normal distributions, providing a robust measure for skewed size spectra in processes such as aerosol formation.
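The gap between the two means is visible in simulation. A short sketch (Python with NumPy assumed; parameters are illustrative) estimates both from a log-normal sample and confirms Jensen's inequality:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.0, 1.0
x = np.exp(rng.normal(mu, sigma, size=1_000_000))

arith = x.mean()                   # estimates exp(mu + sigma^2/2) ~ 1.6487
geom = np.exp(np.mean(np.log(x)))  # estimates exp(mu) = 1, the median

print(arith, geom)
# Jensen's inequality in action: E[X] >= exp(E[ln X]), with gap factor exp(sigma^2/2).
assert arith > geom
```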

Heavy tails and limit theorems

The log-normal distribution is characterized by a heavy right tail, meaning its survival function decays more slowly than exponentially. For a log-normal random variable X with parameters \mu \in \mathbb{R} and \sigma > 0, the tail probability satisfies P(X > x) \sim \frac{\sigma}{\ln x - \mu} \, \phi\left( \frac{\ln x - \mu}{\sigma} \right) as x \to \infty, where \phi(z) = (2\pi)^{-1/2} \exp(-z^2/2) is the standard normal density function. This asymptotic form follows from the Mills-ratio approximation to the normal tail of \ln X and places the log-normal in the class of subexponential distributions. Unlike distributions with exponential tails (e.g., the gamma or exponential), this slower decay implies a higher probability of extreme values, which is relevant in modeling phenomena like stock returns or particle sizes where outliers are common.

A key reason for the prevalence of the log-normal distribution is its emergence in the multiplicative central limit theorem. Consider a sequence of independent and identically distributed positive random variables Z_i > 0 (i = 1, 2, \dots, n) such that E[\ln Z_i] = \nu and \mathrm{Var}(\ln Z_i) = \tau^2 < \infty. The logarithm of their product, \ln\left( \prod_{i=1}^n Z_i \right) = \sum_{i=1}^n \ln Z_i, is a sum of i.i.d. random variables with finite mean and variance. By the classical central limit theorem, after appropriate centering and scaling, \frac{1}{\sqrt{n}} \left( \sum_{i=1}^n \ln Z_i - n \nu \right) \xrightarrow{d} \mathcal{N}(0, \tau^2) as n \to \infty, implying that for large n the product \prod_{i=1}^n Z_i is approximately log-normal with parameters \mu = n \nu and \sigma^2 = n \tau^2. This explains why log-normal distributions often approximate outcomes of multiplicative processes, such as growth models in biology or economics, where many small independent factors accumulate multiplicatively.

In terms of tail heaviness, the log-normal occupies an intermediate position compared to other heavy-tailed families. Its tails are heavier than those of light-tailed distributions like the normal or exponential but lighter than power-law tails in the Pareto distribution, where P(X > x) \sim c x^{-\alpha} for some \alpha > 0, or in \alpha-stable distributions with index \alpha < 2. The Pareto and \alpha-stable families exhibit polynomial decay, leading to infinite moments beyond order \alpha, whereas the log-normal retains finite moments of all orders due to the Gaussian nature of \ln X. This distinction is crucial: the log-normal captures moderate extremes without diverging moments, whereas heavier-tailed generalizations with infinite higher moments amplify tail risks in applications such as risk assessment.
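The multiplicative central limit theorem can be demonstrated empirically. The sketch below (Python with NumPy assumed; the uniform(0.5, 1.5) factor distribution is an arbitrary illustrative choice) forms products of i.i.d. positive factors and checks that the log-products are approximately normal:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 100, 20_000
# Positive i.i.d. factors; uniform(0.5, 1.5) is an arbitrary illustrative choice.
Z = rng.uniform(0.5, 1.5, size=(trials, n))
log_prod = np.log(Z).sum(axis=1)  # ln of each product = sum of i.i.d. terms

# E[ln Z] for uniform(0.5, 1.5), estimated numerically.
nu = np.log(rng.uniform(0.5, 1.5, size=1_000_000)).mean()
print(log_prod.mean(), n * nu)    # mean of ln(product) ~ n * nu

# The standardized log-products should look standard normal (CLT).
z = (log_prod - log_prod.mean()) / log_prod.std()
print(np.quantile(z, [0.1587, 0.5, 0.8413]))  # near (-1, 0, 1) if normal
```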

Transformations and combinations

The log-normal distribution exhibits closure under certain multiplicative transformations, making it particularly suitable for modeling phenomena involving products or ratios of positive random variables. If X_1 \sim \mathrm{LN}(\mu_1, \sigma_1^2) and X_2 \sim \mathrm{LN}(\mu_2, \sigma_2^2) are independent, their product X_1 X_2 follows a log-normal distribution with parameters \mu_1 + \mu_2 and \sigma_1^2 + \sigma_2^2. This property extends to the product of any finite number of independent log-normal variables, where the resulting parameters are the sums of the individual means and variances of the underlying normals. Similarly, the quotient X_1 / X_2 is log-normally distributed with parameters \mu_1 - \mu_2 and \sigma_1^2 + \sigma_2^2, as the logarithm of the ratio corresponds to the difference of independent normals.

Raising a log-normal variable to a power also preserves the family. For X \sim \mathrm{LN}(\mu, \sigma^2) and constant r \neq 0, the transformed variable X^r follows \mathrm{LN}(r \mu, r^2 \sigma^2). This follows directly from the exponential form, since \ln(X^r) = r \ln X \sim \mathcal{N}(r \mu, r^2 \sigma^2). More generally, scaled powers of the form a X^r (with a > 0) are log-normal with parameters \ln a + r \mu and r^2 \sigma^2.

In contrast, the sum of independent log-normal variables does not admit a closed-form distribution in general. While the sum S = X_1 + X_2 + \cdots + X_n of independent log-normals lacks an exact expression, it is often approximated by another log-normal distribution via moment-matching methods, such as the Fenton-Wilkinson approximation, which equates the first two moments of the sum to those of a fitted log-normal. Mixture models or numerical methods can provide further approximations for the distribution of S. These transformations find application in error propagation for multiplicative models, common in engineering and physics, where uncertainties in measurements multiply rather than add. For instance, in propagating relative errors through products of readings, the resulting error distribution is log-normal, facilitating variance calculations via the summed variances of the logs.
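The Fenton-Wilkinson moment-matching step described above is straightforward to implement. A minimal sketch (Python with NumPy assumed; the parameter lists are hypothetical) fits a single log-normal to the sum of independent log-normals and compares it against simulation:

```python
import numpy as np

def fenton_wilkinson(mus, sigmas):
    """Moment-match the sum of independent log-normals to a single log-normal.

    Returns (mu_S, sigma_S) such that LN(mu_S, sigma_S^2) has the same mean and
    variance as S = sum_i X_i with X_i ~ LN(mus[i], sigmas[i]^2).
    """
    mus, sigmas = np.asarray(mus), np.asarray(sigmas)
    means = np.exp(mus + sigmas**2 / 2)
    variances = (np.exp(sigmas**2) - 1) * np.exp(2 * mus + sigmas**2)
    m, v = means.sum(), variances.sum()  # independence: means and variances add
    sigma_S2 = np.log(1 + v / m**2)
    mu_S = np.log(m) - sigma_S2 / 2
    return mu_S, np.sqrt(sigma_S2)

# Compare against simulation.
rng = np.random.default_rng(3)
mus, sigmas = [0.0, 0.3, -0.2], [0.5, 0.4, 0.6]
S = sum(np.exp(rng.normal(m, s, 500_000)) for m, s in zip(mus, sigmas))
mu_S, sigma_S = fenton_wilkinson(mus, sigmas)
print(S.mean(), np.exp(mu_S + sigma_S**2 / 2))  # matched by construction
print(np.median(S), np.exp(mu_S))               # approximation quality check
```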

Multivariate extensions

The multivariate log-normal distribution extends the univariate log-normal to a vector of positive random variables with log-normal marginals and potentially correlated components. Specifically, a p-dimensional random vector \mathbf{X} = (X_1, \dots, X_p)^\top follows a multivariate log-normal distribution, denoted \mathbf{X} \sim \mathrm{LN}_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}), if \mathbf{Y} = \log \mathbf{X} = (\log X_1, \dots, \log X_p)^\top follows a multivariate normal distribution \mathbf{Y} \sim \mathrm{MVN}_p(\boldsymbol{\mu}, \boldsymbol{\Sigma}), where \boldsymbol{\mu} \in \mathbb{R}^p is the mean vector and \boldsymbol{\Sigma} is the p \times p positive definite covariance matrix. The joint probability density function of \mathbf{X} is f(\mathbf{x}) = (2\pi)^{-p/2} |\boldsymbol{\Sigma}|^{-1/2} \left( \prod_{i=1}^p x_i^{-1} \right) \exp\left( -\frac{1}{2} (\log \mathbf{x} - \boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1} (\log \mathbf{x} - \boldsymbol{\mu}) \right), for x_i > 0 and \mathbf{x} = (x_1, \dots, x_p)^\top, with the understanding that this expression, while explicit, does not factor into a product of marginal densities because of the dependence induced by \boldsymbol{\Sigma}. Each marginal X_i follows a univariate \mathrm{LN}(\mu_i, \sigma_i^2), where \sigma_i^2 = \Sigma_{ii}. The conditional distribution of any subset of components given the others is also multivariate log-normal, as it inherits the conditional normality of the underlying \mathbf{Y}. The covariance structure reflects the exponential transformation: for i \neq j, \mathrm{Cov}(X_i, X_j) = \exp\left(\mu_i + \mu_j + \tfrac{1}{2}(\sigma_i^2 + \sigma_j^2)\right) \left( \exp(\Sigma_{ij}) - 1 \right), whose sign matches that of \Sigma_{ij}, with \mathrm{Cov}(X_i, X_i) = \mathrm{Var}(X_i) = \exp(2\mu_i + \sigma_i^2) (\exp(\sigma_i^2) - 1). This distribution is particularly useful in modeling dependent positive variables, such as financial returns or environmental measurements, where the dependence structure corresponds to a Gaussian copula inherited from the normal logs.
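Sampling from this distribution reduces to exponentiating a multivariate normal draw. The sketch below (Python with NumPy assumed; μ and Σ are hypothetical) generates correlated log-normal vectors and verifies the closed-form covariance:

```python
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([0.0, 0.5])
Sigma = np.array([[0.25, 0.10],
                  [0.10, 0.36]])  # covariance of the underlying normal logs

# Sample: exponentiate a multivariate normal componentwise.
Y = rng.multivariate_normal(mu, Sigma, size=1_000_000)
X = np.exp(Y)

# Closed-form covariance of (X_1, X_2) vs. the sample estimate.
s1, s2, s12 = Sigma[0, 0], Sigma[1, 1], Sigma[0, 1]
cov_cf = np.exp(mu[0] + mu[1] + 0.5 * (s1 + s2)) * (np.exp(s12) - 1)
print(cov_cf, np.cov(X.T)[0, 1])
```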

Statistical inference

Parameter estimation

The maximum likelihood estimator (MLE) for the parameters \mu and \sigma^2 of a log-normal distribution, based on a sample x_1, \dots, x_n > 0, is derived from the log-likelihood function \ln L(\mu, \sigma^2) = -\frac{n}{2} \ln (2\pi) - n \ln \sigma - \sum_{i=1}^n \ln x_i - \frac{1}{2\sigma^2} \sum_{i=1}^n (\ln x_i - \mu)^2, which yields the closed-form expressions \hat{\mu} = \frac{1}{n} \sum_{i=1}^n \ln x_i (the arithmetic mean of the logged observations) and \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n (\ln x_i - \hat{\mu})^2 (the sample variance of the logged observations). These estimators exploit the fact that if X \sim \mathrm{LN}(\mu, \sigma^2), then \ln X \sim \mathcal{N}(\mu, \sigma^2), reducing the problem to standard normal MLE.

The method of moments (MOM) estimator matches the population mean E[X] = e^{\mu + \sigma^2/2} and variance \mathrm{Var}(X) = e^{2\mu + \sigma^2}(e^{\sigma^2} - 1) to the sample mean \bar{x} and sample variance s^2, respectively. This leads to the explicit solutions \hat{\sigma}^2 = \ln(1 + s^2 / \bar{x}^2) and \hat{\mu} = \ln \bar{x} - \frac{1}{2} \ln(1 + s^2 / \bar{x}^2). MOM is computationally simpler than MLE but generally less statistically efficient, particularly for small samples or when \sigma^2 is large.

Other approaches include minimum chi-square estimation, which minimizes the Pearson chi-square statistic between observed and expected frequencies under binned data to estimate \mu and \sigma^2, offering robustness to outliers compared to MLE in some grouped-data scenarios. Bayesian estimation employs conjugate priors on the log scale, such as the normal-inverse-gamma distribution for (\mu, \sigma^2), yielding a posterior that is also normal-inverse-gamma and enabling credible intervals via marginal posteriors.

Comparisons of bias and efficiency reveal that the MLE \hat{\mu} is unbiased, while \hat{\sigma}^2 is biased downward (with bias approximately -\sigma^2 / n); bias-corrected versions improve finite-sample performance. MOM estimators exhibit higher bias and variance than MLE across various sample sizes and parameter values, though MOM remains preferable for quick approximations due to its explicit formulas. For censored or truncated data, such as type I right-censored observations common in reliability studies, MLE adapts by incorporating survival terms into the likelihood (e.g., integrating the density from the censoring point to infinity), often requiring numerical optimization since closed forms are unavailable. MOM can be adjusted using conditional moments but loses efficiency; Bayesian methods handle censoring naturally through the likelihood while incorporating prior information.
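Both estimators have explicit forms and are simple to compare on simulated data. The following sketch (Python with NumPy assumed; the true parameters are hypothetical) computes the MLE from the logged sample and the MOM estimates from the raw sample moments:

```python
import numpy as np

rng = np.random.default_rng(5)
mu_true, sigma_true = 1.0, 0.5
x = np.exp(rng.normal(mu_true, sigma_true, size=5_000))

# Maximum likelihood: normal MLE on the logged data.
logs = np.log(x)
mu_mle = logs.mean()
sigma2_mle = logs.var()  # ddof=0: the (slightly downward-biased) MLE

# Method of moments: invert the mean/variance expressions.
xbar, s2 = x.mean(), x.var(ddof=1)
sigma2_mom = np.log(1 + s2 / xbar**2)
mu_mom = np.log(xbar) - sigma2_mom / 2

print(mu_mle, sigma2_mle)  # close to (1.0, 0.25)
print(mu_mom, sigma2_mom)  # typically noisier than the MLE
```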

Interval estimation

Interval estimation for the parameters of the log-normal distribution relies on the fact that if X \sim \mathrm{LN}(\mu, \sigma^2), then \ln X \sim \mathcal{N}(\mu, \sigma^2), allowing transformation to normality for constructing confidence intervals. A confidence interval for \mu is obtained from the sample mean and standard deviation of the logged observations y_i = \ln x_i, i = 1, \dots, n: let \bar{y} = n^{-1} \sum y_i and s^2 = (n-1)^{-1} \sum (y_i - \bar{y})^2; then the (1-\alpha) confidence interval is \bar{y} \pm t_{n-1,1-\alpha/2} \, s / \sqrt{n}, where t_{n-1,1-\alpha/2} is the (1-\alpha/2)-quantile of the t-distribution with n-1 degrees of freedom. This interval achieves exact coverage under the normality assumption for \ln X. Confidence intervals for \sigma^2 are based on the fact that (n-1)s^2 / \sigma^2 \sim \chi^2_{n-1}, yielding the exact (1-\alpha) interval \frac{(n-1)s^2}{\chi^2_{n-1, 1-\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1, \alpha/2}}, where \chi^2_{n-1, p} is the p-quantile of the chi-squared distribution with n-1 degrees of freedom, and s^2 is the sample variance of the y_i. For \sigma, take square roots of the bounds.

Since the median of the log-normal distribution is \exp(\mu), the confidence interval for the median is the exponential of the interval for \mu: \exp(\bar{y} \pm t_{n-1,1-\alpha/2} \, s / \sqrt{n}). This transformation preserves coverage by monotonicity and provides an exact interval for the median. For the mean E[X] = \exp(\mu + \sigma^2/2), approximate confidence intervals can be constructed using the delta method on the estimates \hat{\mu} = \bar{y} and \hat{\sigma}^2 = s^2, yielding an asymptotic normal interval centered at \exp(\hat{\mu} + \hat{\sigma}^2/2) with standard error derived from the variance-covariance matrix of (\hat{\mu}, \hat{\sigma}^2). More precise intervals, especially in small samples, employ Fieller's theorem, which inverts a quadratic form to obtain exact coverage by solving for values of the mean where a pivotal statistic exceeds a critical value, often resulting in intervals that may be unbounded if the coefficient of variation is large. Generalized confidence intervals, based on Monte Carlo simulation of pivotal quantities involving normal and chi-squared random variables, provide good coverage (near 95%) even for n = 5.

Prediction intervals for a future observation X_{n+1} from a log-normal distribution are derived by first constructing a prediction interval for \ln X_{n+1} \sim \mathcal{N}(\mu, \sigma^2), namely \bar{y} \pm t_{n-1,1-\alpha/2} \, s \sqrt{1 + 1/n}, and then exponentiating the bounds to obtain the interval for X_{n+1}. This approach accounts for both estimation uncertainty and inherent variability, yielding asymmetric intervals reflective of the log-normal's skewness. When comparing two independent log-normal distributions, say X \sim \mathrm{LN}(\mu_1, \sigma_1^2) and Y \sim \mathrm{LN}(\mu_2, \sigma_2^2), a confidence interval for the difference in medians \exp(\mu_1) - \exp(\mu_2) can be obtained via parametric bootstrap: generate bootstrap samples from the fitted distributions, compute the difference in sample medians for each, and take the appropriate percentiles of the empirical distribution of these differences. This method performs well in small samples, offering coverage probabilities close to nominal levels and shorter average lengths than normal-approximation or fiducial generalized intervals when variances differ substantially.
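The exact intervals for μ, σ², the median, and a future observation follow the pivots described above. A compact sketch (Python with NumPy and SciPy assumed; simulated data stand in for real observations):

```python
import numpy as np
from scipy.stats import t, chi2

rng = np.random.default_rng(6)
x = np.exp(rng.normal(1.0, 0.5, size=30))
y = np.log(x)
n, alpha = len(y), 0.05
ybar, s = y.mean(), y.std(ddof=1)
tq = t.ppf(1 - alpha / 2, df=n - 1)

# Exact CI for mu, and for the median exp(mu) by exponentiating the endpoints.
ci_mu = (ybar - tq * s / np.sqrt(n), ybar + tq * s / np.sqrt(n))
ci_median = tuple(np.exp(ci_mu))

# Exact CI for sigma^2 from the chi-squared pivot (n-1)s^2 / sigma^2.
ci_sigma2 = ((n - 1) * s**2 / chi2.ppf(1 - alpha / 2, n - 1),
             (n - 1) * s**2 / chi2.ppf(alpha / 2, n - 1))

# Prediction interval for a future observation: widen by sqrt(1 + 1/n), then exponentiate.
half = tq * s * np.sqrt(1 + 1 / n)
pi_x = (np.exp(ybar - half), np.exp(ybar + half))
print(ci_mu, ci_median, ci_sigma2, pi_x)
```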

Applications

Natural and social sciences

In geology and sedimentology, the log-normal distribution frequently describes the sizes of particles such as mineral grains and suspended matter in natural environments, where multiplicative growth and fragmentation processes lead to skewed positive values. It also models cell volumes in various organisms, reflecting heterogeneous growth rates that result in a broad range of sizes within populations. In ecology, species abundances in communities are often characterized by Preston's log-normal model, which posits a "veil" effect where sampling reveals progressively more rare species, fitting empirical data from diverse habitats like forests and grasslands.

In medicine, the log-normal distribution applies to pharmacokinetics, where drug concentrations in plasma over time exhibit log-normal patterns due to proportional absorption and elimination processes, aiding in dosing predictions for antibiotics and other therapeutics. Tumor sizes in clinical studies similarly follow log-normal distributions, capturing the variable growth dynamics driven by multiplicative cellular divisions. In chemistry, reaction times for certain processes, such as enzymatic reactions, are modeled as log-normal owing to the compounding effects of multiple rate-limiting steps. Isotope ratios in natural samples often display log-normal variability stemming from processes that multiply small probabilistic differences. Aerosol particle size distributions are classically fitted to log-normal forms, representing the formation and growth mechanisms that produce a skewed spectrum from fine to coarse particles.

In the social sciences, income and consumption distributions conform to log-normal patterns under Gibrat's law, which assumes proportionate growth independent of size, explaining the observed skew in household earnings across economies. City sizes approximate a log-normal distribution, providing a basis for Zipf-like behavior in the upper tail, as growth through mergers and expansions follows multiplicative dynamics in urban systems. The heavy tails of this distribution contribute to inequality by amplifying disparities over time. In demographics, lifespan data from human populations are frequently log-normally distributed, accounting for accelerating mortality rates after an initial period of relative stability, as seen in actuarial studies of life expectancy.

Engineering and finance

In physical sciences, the log-normal distribution models rainfall amounts, which arise from multiplicative accumulation processes in atmospheric dynamics, often exhibiting positive skew and heavy tails that capture extreme precipitation events. In technology applications, reliability engineering employs the log-normal distribution to describe failure times of components, such as semiconductor devices, where degradation accumulates multiplicatively over time, producing a non-monotonic hazard rate that rises to a peak before declining in the wear-out phase. In signal processing and wireless communications, log-normal models represent shadowing effects, accounting for obstructions that multiply signal amplitudes and produce log-scale normality in received power levels.

Financial modeling relies heavily on the log-normal distribution for stock prices, which result from successive multiplicative shocks, ensuring non-negative values and the geometric growth patterns observed in market data. The Black-Scholes model for option pricing assumes underlying asset prices evolve via geometric Brownian motion, implying log-normal distributions at expiration to derive closed-form valuation formulas under risk-neutral measures. Volatility clustering, a stylized fact in financial time series where large changes follow large changes, is often captured by log-normal specifications for volatility processes, enabling multiscale analysis of return heteroskedasticity.

In cognitive studies, response times in psychological tasks, such as reaction-time experiments, are commonly fitted with log-normal distributions to handle right-skewed data reflecting variable processing speeds. Word frequencies in large text corpora follow patterns akin to Zipf's law, which can be related to underlying log-normal distributions of lexical usage, consistent with the power-law decay in rank-frequency plots across languages. Recent advances in machine learning incorporate log-normal priors in Bayesian models for neural networks, particularly in applications such as structured pruning, where they model positive-valued quantities such as weight magnitudes.
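As one concrete financial illustration, the Black-Scholes call price is an expectation over a log-normal terminal price, so the closed-form formula should agree with Monte Carlo over log-normal samples. The sketch below (Python with NumPy and SciPy assumed; the contract terms are hypothetical) checks this:

```python
import numpy as np
from scipy.stats import norm

# Under Black-Scholes, the terminal price S_T is log-normal:
# ln S_T ~ N(ln S0 + (r - 0.5*vol^2)*T, vol^2 * T) under the risk-neutral measure.
S0, K, r, vol, T = 100.0, 105.0, 0.02, 0.25, 1.0  # hypothetical contract terms

d1 = (np.log(S0 / K) + (r + 0.5 * vol**2) * T) / (vol * np.sqrt(T))
d2 = d1 - vol * np.sqrt(T)
call_cf = S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

# Monte Carlo over the log-normal terminal price should agree.
rng = np.random.default_rng(7)
ST = S0 * np.exp((r - 0.5 * vol**2) * T
                 + vol * np.sqrt(T) * rng.standard_normal(2_000_000))
call_mc = np.exp(-r * T) * np.maximum(ST - K, 0).mean()
print(call_cf, call_mc)  # the two estimates should match to a few cents
```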
