
Median absolute deviation

The median absolute deviation (MAD) is a robust measure of statistical dispersion in a univariate dataset, defined as the median of the absolute differences between each data point and the median of the data. It serves as a resistant alternative to the standard deviation, particularly effective in datasets contaminated by outliers or exhibiting heavy-tailed distributions, where the standard deviation can be unduly inflated by extreme values. For example, in samples from a heavy-tailed distribution such as the standard Cauchy, the sample standard deviation can become arbitrarily large as extreme observations appear, while the MAD remains stable near its population value of 1. In practice, the MAD is often scaled by dividing it by approximately 0.6745 (the 0.75 quantile of the standard normal distribution) to obtain an estimate comparable to the standard deviation under normality assumptions; this scaled version, denoted MADN, approximates the population standard deviation for normally distributed data. The measure's robustness stems from its reliance on order statistics rather than squared deviations, making it less affected by the tails of the distribution than alternatives such as the standard deviation or the mean absolute deviation. Although the concept traces back to early ideas on deviation measures attributed to Carl Friedrich Gauss around 1816, it gained prominence in modern robust estimation through the work of Frank R. Hampel in 1974, who highlighted its role as a highly breakdown-resistant scale estimator. Today, MAD is widely applied in fields such as robust statistics, signal processing, and outlier detection, where resistance to anomalies is critical.

Fundamentals

Definition

The median absolute deviation (MAD) is a robust measure of variability for a univariate dataset, defined as the median of the absolute deviations of each data point from the dataset's median. This approach captures the typical spread around the central value without being unduly influenced by extreme observations, making it particularly valuable in datasets prone to outliers. Formally, for a dataset \{x_1, x_2, \dots, x_n\}, let m = \median\{x_i\} denote the median of the data points. The MAD is then given by \MAD = \median\{\, |x_i - m| : i = 1, 2, \dots, n \,\}. This formulation ensures that MAD reflects the central tendency of the deviations in an absolute sense, inheriting the median's resistance to asymmetry and contamination. The primary motivation for using MAD stems from its superior robustness compared to variance-based measures such as the standard deviation, which squares deviations and thus amplifies the impact of outliers on the overall estimate of variability. In contrast, MAD's reliance on medians and absolute values maintains stability even when a small fraction of the data is corrupted by extreme values. MAD gained prominence within the framework of robust statistics in the 20th century, notably through Frank R. Hampel's seminal work on influence functions and robust estimation.

Computation

The median absolute deviation (MAD) of a univariate dataset \{x_1, x_2, \dots, x_n\} is computed through a straightforward three-step process. First, determine the median m of the data by sorting the values and selecting the middle value if n is odd, or the average of the two central values if n is even. Second, calculate the absolute deviations d_i = |x_i - m| for each data point x_i. Third, compute the median of the set \{d_1, d_2, \dots, d_n\}, which yields the MAD. To align the MAD with the standard deviation for normally distributed data, it is commonly scaled by the factor 1.4826, giving \MAD_{\text{scaled}} = 1.4826 \times \MAD. This constant, equal to 1 / \Phi^{-1}(0.75) where \Phi is the standard normal cumulative distribution function, makes the scaled MAD a consistent estimate of the population standard deviation under normality. In edge cases, the MAD evaluates to zero for a constant dataset, since every deviation from the median is zero. Similarly, for a single-point dataset (n = 1), the lone deviation is zero, yielding \MAD = 0.
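The three steps translate directly into code. The sketch below uses NumPy; the function name mad and its optional scaling flag are illustrative choices rather than a standard API.

```python
# A minimal sketch of the three-step computation described above, assuming
# a one-dimensional numeric input; 1.4826 is the usual normal-consistency factor.
import numpy as np

def mad(x, scale_to_normal=False):
    x = np.asarray(x, dtype=float)
    m = np.median(x)                 # step 1: median of the data
    abs_dev = np.abs(x - m)          # step 2: absolute deviations from the median
    raw_mad = np.median(abs_dev)     # step 3: median of those deviations
    return 1.4826 * raw_mad if scale_to_normal else raw_mad

print(mad([1, 3, 4, 8, 10]))         # 3.0 (unscaled)
print(mad([1, 3, 4, 8, 10], True))   # about 4.45 (scaled toward the SD under normality)
print(mad([5, 5, 5, 5]))             # 0.0, the constant-data edge case
```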

Illustrative Examples

Univariate Case

Consider a simple univariate dataset consisting of the values {1, 3, 4, 8, 10}. To compute the median absolute deviation (MAD), first determine the median of the data, which is 4 (the third value in the sorted list). Next, calculate the absolute deviations from this median: |1 - 4| = 3, |3 - 4| = 1, |4 - 4| = 0, |8 - 4| = 4, |10 - 4| = 6. The sorted absolute deviations are {0, 1, 3, 4, 6}, and their median is 3, so the MAD is 3. The following table summarizes the observations and their absolute deviations from the median:
Observation    Absolute deviation from median (4)
1              3
3              1
4              0
8              4
10             6
Now, contrast this with the same dataset modified by replacing the value 10 with an outlier, becoming {1, 3, 4, 8, 100}. The median remains 4. The absolute deviations are 3, 1, 0, 4, and 96; sorted as {0, 1, 3, 4, 96}, their median is 3, so the MAD is still 3. However, the sample standard deviation increases drastically, from approximately 3.70 in the original dataset to about 43.0 in the outlier-affected one, because of the influence of the extreme value. This demonstrates the robustness of the MAD to outliers relative to the standard deviation.
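The worked example can be reproduced in a few lines of NumPy; the values printed match the numbers quoted above.

```python
# Recomputing the two datasets from the example above (illustrative check).
import numpy as np

for data in ([1, 3, 4, 8, 10], [1, 3, 4, 8, 100]):
    x = np.array(data, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    sd = np.std(x, ddof=1)           # sample standard deviation (n - 1 denominator)
    print(f"{data}: median={med}, MAD={mad}, sample SD={sd:.2f}")

# The MAD stays at 3.0 in both cases, while the sample SD jumps from ~3.70 to ~43.0.
```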

Interpretation

The median absolute deviation (MAD) quantifies the typical absolute deviation of data points from the median, serving as a robust indicator of spread. By definition, at least half of the observations in the dataset lie within one MAD of the median, providing an intuitive measure of central clustering that is less affected by extreme values than other metrics. In the context of normally distributed data, the unscaled MAD converges to approximately 0.6745 \times \sigma, where \sigma is the population standard deviation; this relationship allows MAD to be interpreted on a scale comparable to the standard deviation for symmetric, bell-shaped distributions. A smaller MAD relative to the data's range signals lower overall variability and tighter concentration around the median, while larger values indicate greater spread. This interpretation is especially valuable for skewed distributions or datasets prone to outliers, where MAD resists distortion from asymmetric tails or contaminating points, offering a more reliable summary of core variability than variance-based measures. Despite its robustness, MAD has limitations in certain analytical contexts. Unlike variance, which is additive for sums of independent random variables (the variance of the sum equals the sum of the variances), MAD lacks this property, complicating its use in models that aggregate variability across independent components. Additionally, as a median-based statistic, MAD emphasizes central deviations and is less responsive to the full extent of tail behavior, potentially underrepresenting extreme outliers or heavy-tailed structure in the data.
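The coverage claim above is easy to check empirically; the sketch below uses an arbitrary skewed sample, so the exact numbers are illustrative.

```python
# Empirical check: at least half of the observations lie within one (unscaled)
# MAD of the median, even for a skewed, heavy-tailed sample.
import numpy as np

rng = np.random.default_rng(0)
x = np.exp(rng.normal(size=1001))               # lognormal sample (right-skewed)
med = np.median(x)
mad = np.median(np.abs(x - med))
coverage = np.mean(np.abs(x - med) <= mad)
print(f"fraction within median +/- MAD: {coverage:.3f}")   # at least 0.5
```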

Properties

Relation to Standard Deviation

The median absolute deviation (MAD) serves as a robust alternative to the standard deviation for measuring dispersion in a dataset. Under the assumption of a normal distribution and for large sample sizes, the MAD is asymptotically equivalent to approximately 0.6745 times the population standard deviation \sigma, such that \MAD \approx 0.6745 \sigma. This relationship arises because the population MAD of a normal distribution equals \Phi^{-1}(0.75) \sigma, where \Phi^{-1}(0.75) \approx 0.6745, providing a scaling factor to align the two measures. A key distinction between MAD and the standard deviation lies in their treatment of outliers. The standard deviation amplifies the influence of extreme values by squaring deviations from the mean, making it highly sensitive to contamination in the data. In contrast, MAD mitigates this by using the median as the central point and taking absolute deviations, which inherently downweights outliers and preserves stability even when the tails are heavier than those of a normal distribution. Both the sample standard deviation and the scaled MAD (MAD divided by 0.6745, equivalently multiplied by 1.4826) are consistent estimators of \sigma under normality, converging in probability to the true scale parameter as the sample size increases. However, under contamination models—such as a mixture of a normal distribution with a small fraction of outliers—the MAD remains well behaved, maintaining low variance and bias, whereas the standard deviation can be severely distorted by even a small proportion of outliers. For symmetric distributions, the MAD offers an intuitive interpretation analogous to the empirical rule for the standard deviation: approximately 50% of the data points lie within one MAD of the median, reflecting the MAD's definition as the median of the absolute deviations. This half-sample coverage provides a robust summary of central clustering, particularly useful in non-normal or contaminated settings where the standard deviation's 68% rule under normality fails.
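A small Monte Carlo experiment illustrates the contamination argument. The sample size, contamination fraction, and outlier scale below are arbitrary illustrative settings, not values prescribed by the theory.

```python
# Comparing the sample SD and the scaled MAD as estimators of sigma = 1,
# with and without a small fraction of gross outliers.
import numpy as np

rng = np.random.default_rng(1)

def scaled_mad(x):
    med = np.median(x)
    return 1.4826 * np.median(np.abs(x - med))

n, reps, eps = 200, 2000, 0.05
clean_sd, clean_mad, cont_sd, cont_mad = [], [], [], []
for _ in range(reps):
    x = rng.normal(0.0, 1.0, n)
    clean_sd.append(np.std(x, ddof=1))
    clean_mad.append(scaled_mad(x))
    # replace 5% of the points with outliers drawn from N(0, 10^2)
    y = x.copy()
    idx = rng.choice(n, int(eps * n), replace=False)
    y[idx] = rng.normal(0.0, 10.0, idx.size)
    cont_sd.append(np.std(y, ddof=1))
    cont_mad.append(scaled_mad(y))

print("clean data:       SD ~ %.3f, scaled MAD ~ %.3f"
      % (np.mean(clean_sd), np.mean(clean_mad)))   # both near 1
print("5%% contaminated: SD ~ %.3f, scaled MAD ~ %.3f"
      % (np.mean(cont_sd), np.mean(cont_mad)))     # SD inflated to ~2.4, MAD stays near 1
```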

Derivation

The population MAD for a normally distributed random variable X \sim N(\mu, \sigma^2) is defined as the median of the absolute deviations from the population median, which coincides with \mu for the normal distribution. To derive the relationship between the population MAD and \sigma, consider the standardized variable Z = (X - \mu)/\sigma \sim N(0, 1). The absolute deviations satisfy |X - \mu| = \sigma |Z|, so the population MAD is \sigma times the median of |Z|. The variable |Z| follows a half-normal distribution with cumulative distribution function F_{|Z|}(w) = 2\Phi(w) - 1 for w \geq 0, where \Phi is the standard normal CDF. The median m of |Z| satisfies F_{|Z|}(m) = 0.5, yielding 2\Phi(m) - 1 = 0.5, or \Phi(m) = 0.75. Thus m = \Phi^{-1}(0.75) \approx 0.6745, and the population MAD equals 0.6745 \sigma. For comparison, the expected absolute deviation is E[|Z|] = \sqrt{2/\pi} \approx 0.7979, which exceeds the median because the half-normal distribution is right-skewed. To obtain an estimator of \sigma from the MAD, the scaling constant is the reciprocal, 1/0.6745 \approx 1.4826, so that 1.4826 \times \text{MAD} = \sigma at the population level. For the sample MAD, defined as the median of |x_i - \hat{\mu}| where \hat{\mu} is the sample median and x_1, \dots, x_n are i.i.d. from N(\mu, \sigma^2), consistency follows from the asymptotic properties of order statistics. The sample median \hat{\mu} converges in probability to \mu as n \to \infty, and the absolute deviations |x_i - \hat{\mu}| converge in distribution to |X - \mu|. The sample MAD, being essentially the middle order statistic of these deviations (the average of the two middle values when n is even), is a consistent estimator of the population MAD by the consistency of sample quantiles for continuous distributions. Moreover, by central limit theory for sample quantiles and L-estimators, \sqrt{n}\,(\text{sample MAD} - 0.6745\,\sigma) \xrightarrow{d} N(0, \sigma_D^2) for some \sigma_D^2 > 0, so the scaled sample MAD, 1.4826 \times \text{sample MAD}, is a consistent and asymptotically normal estimator of \sigma.
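The constants in this derivation can be verified numerically with SciPy's standard normal quantile function, and the consistency claim can be checked on a large simulated sample; the sample size and parameters below are illustrative.

```python
# Numerical check of the constants used in the derivation.
import numpy as np
from scipy.stats import norm

c = norm.ppf(0.75)           # Phi^{-1}(0.75) ~ 0.6745
print(c, 1.0 / c)            # 0.6745..., 1.4826...
print(np.sqrt(2.0 / np.pi))  # E|Z| ~ 0.7979, for comparison with the median 0.6745

# Consistency check: the scaled sample MAD approaches sigma for large n.
rng = np.random.default_rng(2)
x = rng.normal(loc=5.0, scale=2.0, size=100_000)
med = np.median(x)
print(np.median(np.abs(x - med)) / c)   # close to sigma = 2.0
```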

Applications

Robust Statistics

The median absolute deviation (MAD) plays a central role in robust statistics by offering a scale measure that resists the distorting effects of outliers, enabling more reliable inference in contaminated datasets. Its breakdown point of 50%—the proportion of arbitrarily bad observations needed to make the estimate arbitrary—is the highest possible for a scale estimator and far exceeds that of the standard deviation, which can be driven arbitrarily high by a single extreme value (a breakdown point of 0%); this makes MAD particularly valuable for maintaining statistical stability when up to nearly half the data may be erroneous. In robust estimation for location-scale models, the MAD serves as a consistent (after scaling) estimator of scale, often paired with the sample median for location to form a fully robust pair that plays the role of the classical mean and standard deviation under minimal distributional assumptions. For example, it can replace the standard deviation in modified t-type tests, yielding robust test statistics that better preserve type I error rates and power when errors are heavy-tailed or contaminated by outliers. Similarly, in robust regression, the MAD estimates the scale of the residuals so that influential points can be downweighted, supporting outlier-resistant coefficient inference without assuming normal errors. Within M-estimation frameworks, MAD provides an initial scale estimate used to set tuning constants for bounded influence functions, as in Huber's estimator, where it standardizes residuals to control their impact while retaining high efficiency at the normal model. This integration helps the overall procedure remain consistent and asymptotically well behaved under mild contamination assumptions. Practical computation of MAD is facilitated by standard software libraries: R's mad() function in the stats package applies the normal-consistency constant 1.4826 by default, while Python's scipy.stats.median_abs_deviation in SciPy offers flexible scaling and axis options for vectorized arrays.
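As a usage illustration of the SciPy routine mentioned above (the data values are taken from the earlier worked example):

```python
# scipy.stats.median_abs_deviation with and without the normal-consistency
# scaling; scale="normal" divides the raw MAD by Phi^{-1}(3/4) ~ 0.6745.
import numpy as np
from scipy.stats import median_abs_deviation

x = np.array([1.0, 3.0, 4.0, 8.0, 10.0])
print(median_abs_deviation(x))                   # 3.0, the raw MAD
print(median_abs_deviation(x, scale="normal"))   # ~4.45, comparable to an SD under normality
```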

Signal Processing and Outlier Detection

In signal processing, the median absolute deviation (MAD) plays a key role in wavelet-based denoising algorithms, where it serves as a robust estimator of the noise level in wavelet coefficients. Pioneering work by Donoho and Johnstone introduced wavelet-thresholding techniques that estimate the noise standard deviation \sigma from the MAD of the finest-scale detail coefficients, using \hat{\sigma} = \mathrm{MAD} / 0.6745, assuming the noise is approximately Gaussian. Thresholds are then applied as multiples of this estimate—for example, the universal threshold \hat{\sigma}\sqrt{2 \ln n}, or a fixed multiple such as 3\hat{\sigma} for hard thresholding—retaining only coefficients that exceed the threshold so that signal features are preserved while noise-dominated components are attenuated. This approach achieves near-minimax mean-squared-error performance for noisy signals and has become a standard tool in applications such as image and audio processing. For outlier detection in time-series analysis, MAD enables a simple yet effective rule: a data point is flagged as an outlier if its absolute deviation from the median exceeds 3 \times \mathrm{MAD}, giving bounds of the form \mathrm{median} \pm 3 \times \mathrm{MAD}. This method excels with non-normal data, where traditional z-scores based on the mean and standard deviation fail because both quantities are sensitive to the very extremes being sought. Its robustness stems from the median's resistance to outliers, making it suitable for identifying anomalies without assuming Gaussianity. Applications span diverse fields, including astronomy, where MAD-based thresholds on light-curve scatter are used to detect variable or anomalous sources amid noisy observations, as in the Hubble Catalog of Variables. In finance, it identifies anomalies in return series, such as sudden spikes in asset prices, supporting tasks like fraud detection and portfolio risk assessment. Compared to z-scores, MAD-based detection performs better for the heavy-tailed distributions common in real-world signals: the standard deviation is inflated by the outliers themselves, widening the detection interval and causing missed detections, whereas MAD maintains consistent scaling.
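A minimal version of the median ± 3 × MAD rule described above might look as follows; the threshold k = 3, the use of the normal-scaled MAD, and the sample signal are illustrative choices.

```python
# Flag points whose distance from the median exceeds k times the scaled MAD.
import numpy as np

def mad_outliers(x, k=3.0, scale=1.4826):
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = scale * np.median(np.abs(x - med))
    return np.abs(x - med) > k * mad      # boolean mask of flagged points

signal = np.array([10.1, 9.8, 10.0, 10.2, 9.9, 35.0, 10.1, 9.7])
print(mad_outliers(signal))               # only the 35.0 spike is flagged
```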

Generalizations

Multivariate Extension

The multivariate extension of the median absolute deviation (MAD) addresses the need for a robust measure of dispersion in higher-dimensional data. For a sample of n points \mathbf{x}_i \in \mathbb{R}^p, i = 1, \dots, n, the location estimate is the geometric median \mathbf{m}, defined as the point that minimizes the sum of Euclidean distances to the observations: \mathbf{m} = \arg\min_{\mathbf{z} \in \mathbb{R}^p} \sum_{i=1}^n \|\mathbf{x}_i - \mathbf{z}\|_2. This choice preserves the robustness of the univariate median, as the geometric median has a breakdown point of 0.5, resisting the influence of up to nearly half of the data points being outliers. The multivariate MAD is then the median of the Euclidean distances to this point: \MAD = \median_i \left\{ \|\mathbf{x}_i - \mathbf{m}\|_2 \right\}. This yields a scalar summary of spread, analogous to the univariate case but accounting for the geometry of the data cloud. A component-wise MAD, obtained by applying the univariate MAD separately to each coordinate and combining the results (e.g., via a norm), is an alternative but less geometrically coherent approach. The distance-based version is particularly useful in fields such as remote sensing, where it quantifies variability across spectral bands. A key computational challenge arises from the geometric median itself, which lacks a closed-form solution and is typically found by iterative algorithms such as Weiszfeld's procedure. Moreover, uniqueness holds only if the points are not collinear; if all data lie on a line, multiple minimizers may exist, though this occurs with probability zero for continuous distributions. For data from an isotropic multivariate normal distribution, the scaling factor relating this MAD to the per-component standard deviation depends on the dimension p and is given by 1 / \median(\chi_p), where \chi_p denotes the chi distribution with p degrees of freedom. This provides a robust, dimension-adjusted analog of classical scale measures. The extension finds use in applications such as clustering algorithms (e.g., k-medians variants) and outlier detection in multidimensional datasets, where it helps identify deviations from the typical spread.
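A sketch of this construction, under the stated assumptions, pairs a plain Weiszfeld iteration (with no special handling of data points that coincide with the current estimate) with a median of Euclidean distances.

```python
# Multivariate MAD: geometric median via a basic Weiszfeld iteration, then
# the median of the Euclidean distances to it. Illustrative, not optimized.
import numpy as np

def geometric_median(X, n_iter=200, tol=1e-8):
    z = X.mean(axis=0)                       # start from the coordinate-wise mean
    for _ in range(n_iter):
        d = np.linalg.norm(X - z, axis=1)
        d = np.where(d < tol, tol, d)        # guard against division by zero
        w = 1.0 / d
        z_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

def multivariate_mad(X):
    m = geometric_median(X)
    return np.median(np.linalg.norm(X - m, axis=1))

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 3))
print(multivariate_mad(X))    # median distance to the geometric median
```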

Population MAD

The population median absolute deviation (MAD) of a random variable X with population median \mu is defined as \MAD = \median\{|X - \mu|\}, where the median is taken with respect to the distribution of the absolute deviations |X - \mu|. This parameter quantifies the typical deviation from the median in the population, offering a robust measure of scale that is less sensitive to extreme values than the standard deviation. For the normal distribution X \sim \mathcal{N}(\mu, \sigma^2), the population MAD equals \Phi^{-1}(3/4)\, \sigma, where \Phi^{-1} denotes the quantile function of the standard normal distribution; this evaluates to approximately 0.6745 \sigma. The exact value arises because the distribution of |X - \mu| is a folded normal, and by symmetry its median corresponds to the 75th percentile of the normal distribution. For the Laplace distribution with location \mu and scale b (where the variance is 2b^2 and thus \sigma = b\sqrt{2}), the absolute deviations |X - \mu| follow an exponential distribution with rate 1/b, so the population MAD is b \ln 2 \approx 0.6931 b. In terms of the standard deviation, this is \MAD = (\ln 2 / \sqrt{2})\, \sigma \approx 0.4901 \sigma. This value is smaller than the corresponding population mean absolute deviation of b = \sigma / \sqrt{2} \approx 0.7071 \sigma, highlighting the distinction between median- and mean-based measures for heavy-tailed distributions. In general, for an arbitrary distribution the sample MAD is a consistent estimator of the population MAD, converging to it in probability as the sample size increases. Asymptotically, under mild regularity conditions (such as a positive, continuous density at the median and at the points \mu \pm \MAD), the sample MAD is \sqrt{n}-consistent and asymptotically normal, and it is asymptotically independent of the sample median. Relative to the population standard deviation \sigma, the efficiency of the MAD depends on the underlying distribution: the scaled MAD attains only about 37% asymptotic efficiency relative to the sample standard deviation at the normal distribution, but its relative performance improves substantially for heavier-tailed distributions such as the Laplace, making it preferable for heavy-tailed data.
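These closed-form values can be confirmed numerically: for a symmetric distribution centred at its median, the population MAD is simply the distribution's 75th percentile measured from the centre, available through SciPy's quantile (ppf) functions.

```python
# Population MADs for the normal and Laplace distributions (location 0), via
# the 0.75 quantile of each distribution; values match the constants above.
import numpy as np
from scipy.stats import norm, laplace

sigma, b = 1.0, 1.0
print(norm.ppf(0.75, scale=sigma))      # ~0.6745 * sigma
print(laplace.ppf(0.75, scale=b))       # b * ln(2) ~ 0.6931 * b
print(b * np.log(2))                    # analytic value, for comparison
```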
