Median absolute deviation
The median absolute deviation (MAD) is a robust measure of statistical dispersion in a univariate dataset, defined as the median of the absolute differences between each data point and the median of the dataset.[1] It serves as a resistant alternative to the standard deviation, particularly effective in datasets contaminated by outliers or exhibiting heavy-tailed distributions, where the standard deviation can be unduly inflated by extreme values.[2] For example, in a sample from a standard Cauchy distribution, the MAD is approximately 1.16, while the sample standard deviation can exceed 998 due to tail sensitivity.[2] In robust statistics, the MAD is often scaled by dividing it by approximately 0.6745 (the 0.75 quantile of the standard normal distribution) to obtain an estimate comparable to the standard deviation under normality assumptions; this scaled version, denoted MADN, approximates the population standard deviation for normally distributed data.[1] The measure's robustness stems from its reliance on order statistics rather than squared deviations, making it less affected by the tails of the distribution compared to alternatives like the average absolute deviation or interquartile range.[2] Although the concept traces back to early ideas in deviation measures attributed to Carl Friedrich Gauss around 1816, it gained prominence in modern robust estimation through the work of Frank R. Hampel in 1974, who highlighted its role as a highly breakdown-resistant scale estimator.[3] Today, MAD is widely applied in fields such as signal processing, quality control, and outlier detection, where data integrity against anomalies is critical.[1]
Fundamentals
Definition
The median absolute deviation (MAD) is a robust measure of statistical dispersion for a univariate dataset, defined as the median of the absolute deviations of each data point from the dataset's median. This approach captures the typical spread around the central value without being unduly influenced by extreme observations, making it particularly valuable in datasets prone to outliers.[1] Formally, for a dataset \{x_1, x_2, \dots, x_n\}, let m = \median\{x_i\} denote the median of the data points. The MAD is then given by \MAD = \median\{\, |x_i - m| : i = 1, 2, \dots, n \,\}. This formulation ensures that MAD reflects the central tendency of deviations in an absolute sense, prioritizing the median's resistance to asymmetry and contamination.[1] The primary motivation for using MAD stems from its superior robustness compared to traditional variance-based measures like the standard deviation, which squares deviations and thus amplifies the impact of outliers on the overall estimate of variability. In contrast, MAD's reliance on medians and absolute values maintains stability even when a small fraction of the data is corrupted by extreme values.[4] MAD was introduced within the framework of robust statistics in the 20th century, notably by Frank R. Hampel in his seminal work on influence functions and robust estimation.[5]
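The definition translates directly into code. The following minimal Python sketch (the helper name mad is chosen here for illustration and is not a standard library function) computes the unscaled MAD exactly as defined above, using NumPy's median.

```python
import numpy as np

def mad(x):
    """Unscaled median absolute deviation: median of |x_i - median(x)|."""
    x = np.asarray(x, dtype=float)
    m = np.median(x)                  # median of the data
    return np.median(np.abs(x - m))   # median of the absolute deviations

# Example: for {1, 3, 4, 8, 10} the median is 4 and the MAD is 3.
print(mad([1, 3, 4, 8, 10]))  # 3.0
```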
Computation
The median absolute deviation (MAD) of a univariate dataset \{x_1, x_2, \dots, x_n\} is computed through a straightforward three-step process. First, determine the median m of the dataset by sorting the values and selecting the middle value if n is odd, or the average of the two central values if n is even.[6] Second, calculate the absolute deviations d_i = |x_i - m| for each data point x_i. Third, compute the median of the set \{d_1, d_2, \dots, d_n\}, which yields the MAD.[6] To align the MAD with the standard deviation for normally distributed data, it is commonly scaled by the factor 1.4826, resulting in the formula \MAD_{\text{scaled}} = 1.4826 \times \MAD. This constant, approximately equal to 1 / \Phi^{-1}(0.75) where \Phi is the standard normal cumulative distribution function, makes the scaled MAD a consistent estimate of the population standard deviation under normality.[6] In edge cases, the MAD evaluates to zero for a constant dataset, since every deviation from the median is zero. Similarly, for a single-point dataset (n=1), the deviation is zero, yielding \MAD = 0.[7]
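The three steps and the normal-consistency scaling can be combined into a single routine. The sketch below (the name scaled_mad is illustrative) derives the constant from the standard normal quantile function rather than hard-coding 1.4826.

```python
import numpy as np
from scipy.stats import norm

def scaled_mad(x):
    """MAD scaled to be a consistent estimate of sigma under normality."""
    x = np.asarray(x, dtype=float)
    m = np.median(x)                  # step 1: median of the data
    abs_dev = np.abs(x - m)           # step 2: absolute deviations from the median
    mad = np.median(abs_dev)          # step 3: median of those deviations
    c = 1.0 / norm.ppf(0.75)          # ~1.4826, since Phi^{-1}(0.75) ~ 0.6745
    return c * mad
```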
Illustrative Examples
Univariate Case
Consider a simple univariate dataset consisting of the values {1, 3, 4, 8, 10}. To compute the median absolute deviation (MAD), first determine the median of the dataset, which is 4 (the third value in the sorted list). Next, calculate the absolute deviations from this median: |1 - 4| = 3, |3 - 4| = 1, |4 - 4| = 0, |8 - 4| = 4, |10 - 4| = 6. The sorted absolute deviations are {0, 1, 3, 4, 6}, and their median is 3, so the MAD is 3. The following table summarizes the original data, the median, and the absolute deviations:

| Observation | Absolute Deviation from Median (4) |
|---|---|
| 1 | 3 |
| 3 | 1 |
| 4 | 0 |
| 8 | 4 |
| 10 | 6 |
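The worked example can be checked with SciPy's median_abs_deviation, which returns the unscaled MAD by default and applies the normal-consistency factor when scale='normal' is passed:

```python
from scipy.stats import median_abs_deviation

data = [1, 3, 4, 8, 10]
print(median_abs_deviation(data))                   # 3.0, the unscaled MAD from the table
print(median_abs_deviation(data, scale='normal'))   # ~4.45, i.e. 1.4826 * 3
```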
Interpretation
The median absolute deviation (MAD) quantifies the typical absolute deviation of data points from the dataset's median, serving as a robust indicator of dispersion. By definition, at least half of the observations in the dataset lie within one MAD of the median, providing an intuitive measure of central clustering that is less affected by extreme values than other dispersion metrics.[1][5] For normally distributed data, the population value of the unscaled MAD is approximately 0.6745 \times \sigma, where \sigma is the population standard deviation; this relationship allows MAD to be interpreted on a comparable scale to the standard deviation for symmetric, bell-shaped distributions.[1] A smaller MAD relative to the data's range signals lower overall variability and tighter concentration around the median, while larger values indicate greater spread. This interpretation is especially valuable for skewed distributions or datasets prone to outliers, where MAD resists distortion from asymmetric tails or contaminating points, offering a more reliable summary of core variability than variance-based measures.[8] Despite its robustness, MAD has limitations in certain analytical contexts. Unlike variance, which is additive for sums of independent random variables (the variance of the sum equals the sum of the variances), MAD lacks this property, complicating its use in models involving aggregated or independent components. Additionally, as a median-based statistic, MAD emphasizes central deviations and is less responsive to the full extent of tail behavior, potentially underrepresenting extreme outliers or heavy-tailed structures in the data.
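Both properties mentioned above (the half-sample coverage and the 0.6745 \times \sigma relation under normality) are easy to check numerically; the following sketch uses simulated normal data with \sigma = 2, so the specific seed and sample size are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=100_000)   # sigma = 2

m = np.median(x)
mad = np.median(np.abs(x - m))
coverage = np.mean(np.abs(x - m) <= mad)           # fraction within one MAD of the median

print(mad)        # ~1.35, close to 0.6745 * 2
print(coverage)   # ~0.5: about half the points lie within one MAD of the median
```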
Properties
Relation to Standard Deviation
The median absolute deviation (MAD) serves as a robust alternative to the standard deviation for measuring dispersion in a dataset. Under the assumption of a normal distribution, for large sample sizes, the MAD is asymptotically equivalent to approximately 0.6745 times the population standard deviation \sigma, such that \MAD \approx 0.6745 \sigma.[1] This relationship arises because the population MAD for a normal distribution equals \Phi^{-1}(0.75) \sigma, where \Phi^{-1}(0.75) \approx 0.6745, providing a scaling factor to align the two measures.[1] A key distinction between MAD and the standard deviation lies in their treatment of outliers. The standard deviation amplifies the influence of extreme values by squaring deviations from the mean, making it highly sensitive to contamination in the data.[2] In contrast, MAD mitigates this by using the median as the central point and taking absolute deviations, which inherently downweights outliers and preserves stability even when tails are heavier than in a normal distribution.[2] Both the sample standard deviation and the scaled MAD (typically MAD divided by 0.6745) are consistent estimators of \sigma under normality, converging in probability to the true scale parameter as sample size increases.[9] However, under models with contamination, such as a mixture of normal and outlier distributions, the MAD exhibits superior efficiency, maintaining lower variance and bias compared to the standard deviation, which can be severely distorted by even a small proportion of outliers.[9] For symmetric distributions, the MAD offers an intuitive interpretation analogous to the empirical rule for the standard deviation. Approximately 50% of the data points lie within one MAD of the median, reflecting the median's property as the central value of the absolute deviations.[10] This half-sample coverage provides a robust benchmark for central dispersion, particularly useful in non-normal or contaminated settings where the standard deviation's 68% rule under normality fails.[10]
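A small simulation illustrates the contrast: adding a few gross outliers to normal data inflates the sample standard deviation several-fold while the normal-scaled MAD barely moves. This is a sketch with synthetic data, not a formal efficiency comparison.

```python
import numpy as np
from scipy.stats import median_abs_deviation

rng = np.random.default_rng(1)
clean = rng.normal(0.0, 1.0, size=1000)
contaminated = clean.copy()
contaminated[:20] += 50.0   # 2% gross outliers

for name, x in [("clean", clean), ("contaminated", contaminated)]:
    sd = np.std(x, ddof=1)
    smad = median_abs_deviation(x, scale='normal')
    print(name, round(sd, 2), round(smad, 2))
# clean:        sd ~ 1.0, scaled MAD ~ 1.0
# contaminated: sd ~ 7,   scaled MAD ~ 1.0
```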
Derivation
The median absolute deviation (MAD) for a random variable X \sim N(\mu, \sigma^2) is defined as the median of the absolute deviations from the population median, which coincides with \mu for the normal distribution. To derive the relationship between the population MAD and \sigma, consider the standardized variable Z = (X - \mu)/\sigma \sim N(0, 1). The absolute deviations are then |X - \mu| = \sigma |Z|, so the population MAD is \sigma times the median of |Z|. The distribution of |Z| follows a folded normal distribution with cumulative distribution function F_{|Z|}(w) = 2\Phi(w) - 1 for w \geq 0, where \Phi is the standard normal CDF.[11] The median m of |Z| satisfies F_{|Z|}(m) = 0.5, yielding 2\Phi(m) - 1 = 0.5, or \Phi(m) = 0.75. Thus, m = \Phi^{-1}(0.75) \approx 0.6745, and the population MAD equals 0.6745 \sigma. For comparison, the expected absolute deviation E[|Z|] = \sqrt{2/\pi} \approx 0.7979, which is larger than the median due to the skewness of the folded normal. To obtain an estimator of \sigma from the MAD, the scaling constant is the reciprocal: approximately 1/0.6745 \approx 1.4826. Therefore, the scaled population MAD satisfies 1.4826 \times \text{MAD} = \sigma.[11][3]
For the sample MAD, defined as the median of |x_i - \hat{\mu}| where \hat{\mu} is the sample median and x_1, \dots, x_n are i.i.d. from N(\mu, \sigma^2), consistency follows from the asymptotic properties of order statistics. The sample median \hat{\mu} converges in probability to \mu as n \to \infty, and the absolute deviations |x_i - \hat{\mu}| converge in distribution to |X - \mu|. The sample MAD, being the (n+1)/2-th order statistic of these deviations (adjusted for even n), is a consistent estimator of the population MAD by the consistency of sample quantiles for continuous distributions. By the central limit theorem applied to L-estimators, \sqrt{n}\,(\text{sample MAD} - 0.6745\,\sigma) \xrightarrow{d} N(0, \sigma_D^2) for some \sigma_D^2 > 0, ensuring that the scaled sample MAD, 1.4826 \times \text{sample MAD}, is a consistent and asymptotically normal estimator of \sigma.[11][12]
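The constants in the derivation can be verified numerically; the sketch below computes \Phi^{-1}(0.75) with SciPy and confirms on simulated data (seed and sample size chosen only for illustration) that the scaled sample MAD recovers \sigma.

```python
import numpy as np
from scipy.stats import norm

c = norm.ppf(0.75)        # Phi^{-1}(0.75) ~ 0.6745
print(c, 1.0 / c)         # 0.6745..., 1.4826...

rng = np.random.default_rng(2)
sigma = 3.0
x = rng.normal(0.0, sigma, size=200_000)
sample_mad = np.median(np.abs(x - np.median(x)))
print(sample_mad / c)     # ~3.0: the scaled sample MAD is close to sigma
```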
Applications
Robust Statistics
The median absolute deviation (MAD) plays a central role in robust statistics by offering a dispersion measure that resists the distorting effects of outliers, enabling more reliable inference in contaminated datasets. Its breakdown point of 50% (the smallest proportion of arbitrarily bad observations that can make the estimate arbitrary) far exceeds that of the standard deviation, whose breakdown point is 0%: a single sufficiently extreme value can inflate it without bound. MAD therefore remains stable even when up to nearly half the data are erroneous. In robust estimation for location-scale models, the suitably scaled MAD serves as a consistent and highly robust scale estimator, often paired with the sample median for location to form a fully robust pair that approximates the classical mean and standard deviation under minimal contamination. For example, it replaces the standard deviation in modified t-tests, yielding robust test statistics that preserve type I error rates and power even with heavy-tailed errors or outliers. Similarly, in linear regression, MAD estimates the scale of residuals to downweight influential points, supporting outlier-resistant coefficient inference without assuming normality.[13][14] Within M-estimation frameworks, MAD provides an initial scale estimate used to set tuning constants for bounded influence functions, as in Huber's estimator, where it scales residuals to control outlier impact while retaining high efficiency at the normal model. This integration keeps the overall procedure asymptotically normal and consistent under mild contamination assumptions. Practical computation of MAD is supported by standard software libraries: R's mad() function in the stats package applies the normal-consistency constant (about 1.4826) by default, while Python's scipy.stats.median_abs_deviation in SciPy offers flexible scaling and axis options for vectorized arrays.[7]
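As a concrete illustration of MAD-based outlier detection, the sketch below computes robust z-scores from the median and the normal-scaled MAD on synthetic data; the data are invented for the example, and the cutoff of 3.5 is a commonly used convention rather than part of the MAD definition.

```python
import numpy as np
from scipy.stats import median_abs_deviation

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(10.0, 2.0, size=200), [45.0, -30.0]])  # two gross outliers

center = np.median(x)
scale = median_abs_deviation(x, scale='normal')   # ~2, consistent with sigma under normality

robust_z = (x - center) / scale                   # robust analogue of z-scores
outliers = x[np.abs(robust_z) > 3.5]              # flag points far from the median in MAD units
print(outliers)                                   # [ 45. -30.]
```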