Average absolute deviation
The average absolute deviation (AAD), also known as the mean absolute deviation (MAD), is a statistical measure of dispersion that quantifies the average distance between each data point in a dataset and the dataset's mean, using absolute values to avoid cancellation of positive and negative deviations.[1] It is formally defined by the formula
\mathrm{AAD} = \frac{1}{N} \sum_{i=1}^{N} |x_i - \mu|
where N is the number of observations, x_i are the individual data points, and \mu is the arithmetic mean of the dataset.[2] This measure provides a straightforward indication of variability in units consistent with the original data, making it intuitive for interpreting spread without requiring square roots or additional transformations.[1]
Unlike the variance or standard deviation, which square the deviations and thus amplify the influence of outliers, the AAD employs absolute differences, rendering it less sensitive to extreme values and more robust for datasets with heavy tails or non-normal distributions.[1] It also differs from the median absolute deviation (MAD from median), which uses the median as the central tendency and is even more outlier-resistant, though the AAD is preferred when the mean is the appropriate location measure.[1]
The AAD finds applications in various fields, including robust statistical analysis for exploring data patterns and identifying inconsistencies, as well as in finance for portfolio optimization models that minimize risk via absolute deviation criteria rather than squared errors.[3][4] In educational contexts, it serves as an accessible tool for teaching variability, bridging intuitive absolute distances with more advanced concepts like the L1 norm in data science.[5] Its computational simplicity and interpretability make it particularly valuable in preliminary data assessments where outlier sensitivity could otherwise distort results.[1]
Definition and Concepts
General Definition
The average absolute deviation (AAD) of a data set is the average of the absolute deviations from a central point, serving as a summary statistic of statistical dispersion or variability.[6] It quantifies how spread out the values are around this central point without emphasizing outliers as much as squared deviation measures.[7]
The general mathematical formulation for AAD is
\text{AAD} = \frac{1}{n} \sum_{i=1}^n |x_i - m|,
where X = \{x_1, x_2, \dots, x_n\} is the data set and m is the chosen central point.[6]
The absolute value in the formula plays a crucial role by treating deviations regardless of direction—positive or negative—thus avoiding cancellation that occurs with signed deviations and yielding a nonnegative measure of average spread.[7]
To illustrate, consider the small data set X = \{1, 3, 5\} with central point m = 3. The absolute deviations are |1 - 3| = 2, |3 - 3| = 0, and |5 - 3| = 2. The AAD is then \frac{2 + 0 + 2}{3} = \frac{4}{3} \approx 1.333.
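The worked example above can be reproduced with a short Python sketch (the function name is illustrative, not a standard library API):

```python
def avg_abs_deviation(data, center):
    """Average of the absolute deviations of data from a chosen central point."""
    return sum(abs(x - center) for x in data) / len(data)

# Worked example from the text: X = {1, 3, 5} with m = 3.
result = avg_abs_deviation([1, 3, 5], 3)
print(result)  # 4/3 ≈ 1.333
```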
Central Tendency Measures
Central tendency measures provide the reference point from which absolute deviations are calculated in a dataset, representing typical or central values around which data points cluster. These measures include the arithmetic mean, median, and mode, each offering distinct advantages depending on the data's characteristics.[8]
The arithmetic mean, often simply called the mean, is computed as the sum of all data values divided by the number of observations:
\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i
where n is the sample size and x_i are the data points. This measure is widely used due to its mathematical properties, such as being the basis for many statistical models, but it is sensitive to outliers, which can disproportionately influence the result.[9][10]
The median is the middle value in an ordered dataset; for an odd number of observations, it is the central value, while for an even number, it is the average of the two central values. This measure is robust to outliers, as extreme values do not affect it unless they alter the ordering significantly, making it suitable for skewed distributions.[11][12]
The mode is the value that occurs most frequently in the dataset, and it can be multimodal if multiple values share the highest frequency. Unlike the mean and median, the mode is particularly useful for categorical or multimodal data, where it identifies the most common category or peak, though it may not exist or be unique in uniform distributions.[13][14]
In measures of dispersion like absolute deviation, a central tendency measure serves as an anchor to quantify how data points deviate from a representative value, providing context for the spread relative to the dataset's core.[15] For example, in the dataset {1, 2, 2, 100}, the mean is approximately 26.25, while both the median and mode are 2, illustrating how outliers can shift the mean away from the clustered values.[15] The selection of these points can influence the robustness of absolute deviation metrics to outliers, with the median often yielding more stable results in contaminated data.[16]
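The three central tendency measures for the example dataset can be computed directly with Python's standard `statistics` module:

```python
from statistics import mean, median, mode

# Example dataset from the text: an outlier (100) pulls the mean away
# from the clustered values, while the median and mode stay at 2.
data = [1, 2, 2, 100]
print(mean(data), median(data), mode(data))  # 26.25 2.0 2
```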
Variants of Absolute Deviation
Mean Absolute Deviation from the Mean
The mean absolute deviation from the mean (MAD) is a measure of statistical dispersion that quantifies the average distance between each data point and the arithmetic mean of the dataset.[1][17] For a dataset with n observations x_1, x_2, \dots, x_n and sample mean \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i, the MAD is given by
\text{MAD} = \frac{1}{n} \sum_{i=1}^n |x_i - \bar{x}|.
This formula computes the arithmetic mean of the absolute deviations from the mean, preserving the original units of the data.[1][17]
The MAD provides an intuitive interpretation as the typical deviation of data points from the central tendency, making it more accessible than measures involving squared deviations, such as variance, since it avoids the need to interpret squared units.[1] For symmetric distributions, the sample MAD serves as an estimator of the population expected absolute deviation E[|X - \mu|], where \mu is the population mean.[17] However, because it relies on the arithmetic mean as the reference point, the MAD is sensitive to outliers, as extreme values shift the mean and thereby inflate the deviations.[18][19]
To illustrate, consider the dataset \{2, 4, 4, 4, 5, 5, 7, 9\}. The mean is \bar{x} = 5, and the absolute deviations are 3, 1, 1, 1, 0, 0, 2, 4, yielding a sum of 12 and thus \text{MAD} = 12 / 8 = 1.5.[1] This value indicates that, on average, the data points deviate from the mean by 1.5 units.
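This calculation is a direct translation of the formula; a minimal Python check:

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]
xbar = sum(data) / len(data)  # arithmetic mean = 5.0

# Mean absolute deviation from the mean.
mad = sum(abs(x - xbar) for x in data) / len(data)
print(mad)  # 1.5
```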
The MAD relates to the root mean square (RMS) deviation, defined as \text{RMS} = \sqrt{\frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})^2}, through the inequality \text{MAD} \leq \text{RMS}, with equality holding only for constant datasets where all deviations are zero.[20] This follows from the fact that the quadratic mean exceeds or equals the arithmetic mean of absolute values. In contrast to the more robust mean absolute deviation from the median, the mean-based MAD is less resistant to outliers.[20]
Mean Absolute Deviation from the Median
The mean absolute deviation from the median, often abbreviated as MADM, is a robust measure of statistical dispersion that quantifies the average distance of data points from the dataset's median value. Unlike measures centered on the mean, MADM leverages the median's resistance to extreme values, making it particularly suitable for datasets with asymmetry or outliers.
The formula for MADM is given by
\text{MADM} = \frac{1}{n} \sum_{i=1}^n |x_i - \tilde{x}|,
where n is the number of observations, x_i are the data points, and \tilde{x} denotes the median.[21][22]
A key property of MADM is its reduced sensitivity to outliers, as the median minimizes the sum of absolute deviations and remains stable in the presence of extreme values.[22][21] In a normal distribution, where the median coincides with the mean, the expected MADM equals \sqrt{2/\pi} \, \sigma \approx 0.798 \sigma, with \sigma being the standard deviation; this value reflects the measure's consistency under symmetry while highlighting its interpretability relative to other scale estimators.[1]
In skewed distributions, MADM offers advantages by providing a more representative estimate of typical spread, as the median better captures the central location without being pulled toward the tail.[21] This robustness ensures that MADM avoids overestimating dispersion due to skewness, unlike mean-centered alternatives.[22]
For illustration, consider the dataset {1, 2, 3, 100}. The median is 2.5, yielding absolute deviations of 1.5, 0.5, 0.5, and 97.5, so MADM = (1.5 + 0.5 + 0.5 + 97.5)/4 = 25. By comparison, the mean absolute deviation from the dataset mean (26.5) is 36.75, demonstrating how the outlier inflates the mean-based measure more severely.[22]
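The comparison in this example can be verified with a few lines of Python using the standard `statistics` module:

```python
from statistics import mean, median

data = [1, 2, 3, 100]
med, mu = median(data), mean(data)  # 2.5 and 26.5

# Average absolute deviation from the median vs. from the mean.
madm = sum(abs(x - med) for x in data) / len(data)
mad_mean = sum(abs(x - mu) for x in data) / len(data)
print(madm, mad_mean)  # 25.0 36.75
```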
Median Absolute Deviation
The median absolute deviation (MAD) is a robust estimator of scale in statistics, defined as the median of the absolute deviations of the data points from the sample median. For a univariate dataset \{x_1, x_2, \dots, x_n\} with sample median \tilde{x}, the absolute deviations are given by d_i = |x_i - \tilde{x}| for i = 1, 2, \dots, n, and the MAD is the median of the set \{d_1, d_2, \dots, d_n\}.[23] This measure quantifies the typical deviation from the central value in a way that is less sensitive to extreme values than the standard deviation.
To achieve consistency with the standard deviation under the assumption of a normal distribution, the MAD is often scaled by the factor 1.4826, the reciprocal of \Phi^{-1}(3/4) \approx 0.6745, which is the 75th percentile of the standard normal distribution and equivalently the median of the absolute value of a standard normal random variable.[24] The scaled MAD thus provides an estimate of dispersion comparable to the standard deviation for symmetric, unimodal distributions without outliers. One key property of the unscaled MAD is its high breakdown point of 0.5, meaning it remains bounded and reliable even if up to 50% of the observations are corrupted by arbitrary outliers, making it a preferred robust alternative to the standard deviation in contaminated datasets.[25]
For example, consider the dataset {1, 1, 2, 2, 100}. The median \tilde{x} is 2, yielding absolute deviations {1, 1, 0, 0, 98}. Sorting these gives {0, 0, 1, 1, 98}, so the median of the deviations (MAD) is 1.[23] This value effectively ignores the influence of the outlier 100, highlighting the robustness of the measure.
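The example can be checked in Python; `statistics.median` accepts any iterable, so the median of deviations nests naturally:

```python
from statistics import median

data = [1, 1, 2, 2, 100]
med = median(data)  # 2

# Median of the absolute deviations {1, 1, 0, 0, 98}.
mad = median(abs(x - med) for x in data)
print(mad)  # 1, unaffected by the outlier 100
```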
Maximum Absolute Deviation
The maximum absolute deviation, often denoted as \text{MAD}_{\max}, is defined as the largest absolute difference between any data point in a dataset and a chosen central point m, mathematically expressed as
\text{MAD}_{\max} = \max_i |x_i - m|,
where x_i are the data points and m is typically the arithmetic mean or the mid-range (the average of the minimum and maximum values). This measure quantifies the extreme extent of dispersion in the data by focusing solely on the farthest outlier from the center, providing a straightforward indicator of the dataset's overall spread without averaging deviations. Unlike averaged measures, \text{MAD}_{\max} is highly sensitive to extreme values, as even one outlier can dramatically inflate the value, rendering it less robust for datasets with potential anomalies.[26][27]
A key property of \text{MAD}_{\max} is its relation to the range of the dataset. When m is the mid-range, \text{MAD}_{\max} equals half the range, since the farthest points (minimum and maximum) are equidistant from this central point:
\text{MAD}_{\max} = \frac{1}{2} (\max x_i - \min x_i).
This connection highlights its role as a simple bound on dispersion, guaranteeing that all data points lie within \text{MAD}_{\max} of m. Additionally, the mid-range is the unique central point that minimizes \text{MAD}_{\max} for a given dataset, making it the optimal choice for this measure under the L^\infty norm. It is used in worst-case analysis to establish deterministic limits on variability, such as ensuring no observation exceeds a specified deviation threshold.[27]
For example, consider the dataset \{1, 3, 5, 10\}. The mean is m = 4.75, and the absolute deviations are |1 - 4.75| = 3.75, |3 - 4.75| = 1.75, |5 - 4.75| = 0.25, and |10 - 4.75| = 5.25. Thus, \text{MAD}_{\max} = 5.25, driven by the outlier at 10. If instead using the mid-range m = (1 + 10)/2 = 5.5, the deviations become 4.5, 2.5, 0.5, and 4.5, yielding \text{MAD}_{\max} = 4.5, which is half the range of 9.
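Both variants of the example can be reproduced with a minimal sketch (the function name is illustrative):

```python
def max_abs_deviation(data, m):
    """Largest absolute difference between any data point and the center m."""
    return max(abs(x - m) for x in data)

data = [1, 3, 5, 10]
dev_from_mean = max_abs_deviation(data, sum(data) / len(data))  # mean = 4.75
midrange = (min(data) + max(data)) / 2                          # 5.5
dev_from_midrange = max_abs_deviation(data, midrange)
print(dev_from_mean, dev_from_midrange)  # 5.25 4.5 (the latter is half the range)
```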
Comparison with Variance and Standard Deviation
The average absolute deviation (AAD), often referred to as the mean absolute deviation (MAD), measures dispersion by averaging the absolute differences from a central tendency, typically the mean or median, resulting in a linear treatment of deviations that moderates the influence of outliers. In contrast, variance quantifies dispersion as the average of squared deviations from the mean, which amplifies larger deviations due to the squaring operation, thereby giving greater weight to outliers. The standard deviation, as the square root of the variance, restores the original units of the data but retains this sensitivity to extreme values.[28][29]
For a normal distribution, the standard deviation \sigma relates to the mean absolute deviation from the mean (MAD_\text{mean}) by the approximation \sigma \approx 1.253 \times \text{MAD}_\text{mean}, derived from the expected absolute deviation E[|X - \mu|] = \sigma \sqrt{2/\pi} \approx 0.798 \sigma, where the factor \sqrt{\pi/2} \approx 1.253 arises from the properties of the folded normal distribution. This relationship highlights how MAD_\text{mean} understates dispersion relative to \sigma in symmetric, bell-shaped distributions, but the gap widens in skewed or outlier-heavy data.[30]
AAD offers advantages in interpretability, as it represents the average distance from the center in the same units as the data without the need for squaring, making it more intuitive for practical applications like forecasting or quality control. However, its reliance on the absolute value function renders it non-differentiable at zero, complicating optimization and theoretical analyses in calculus-based statistics, whereas variance and standard deviation benefit from smoother, differentiable properties that facilitate additions for independent variables and links to parametric models like the normal distribution.[28][31]
To illustrate these differences, consider the following example datasets: a symmetric one resembling a uniform distribution and a skewed one with an outlier.
| Dataset | Type | Mean | MAD (from mean) | Variance | Standard Deviation |
|---|---|---|---|---|---|
| {1, 2, 3, 4, 5} | Symmetric | 3 | 1.2 | 2 | ≈1.414 |
| {1, 2, 3, 4, 10} | Skewed | 4 | 2.4 | 10 | ≈3.162 |
In the symmetric case, the measures are closer in scale, but the skewed dataset shows variance and standard deviation inflating more dramatically due to the outlier at 10, while MAD remains relatively moderated.[28]
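The table's figures can be reproduced with Python's `statistics` module; note that the table uses population variance and standard deviation (dividing by n rather than n − 1):

```python
from statistics import mean, pstdev, pvariance

rows = []
for data in ([1, 2, 3, 4, 5], [1, 2, 3, 4, 10]):
    mu = mean(data)
    mad = sum(abs(x - mu) for x in data) / len(data)
    rows.append((mu, mad, pvariance(data), pstdev(data)))
    print(rows[-1])
# (3, 1.2, 2, ~1.414) for the symmetric set,
# (4, 2.4, 10, ~3.162) for the skewed one.
```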
Mathematical Properties
Minimization Characteristics
The average absolute deviation (AAD) from a central point m, defined as \frac{1}{n} \sum_{i=1}^n |x_i - m| for a sample \{x_1, \dots, x_n\}, is minimized when m is the sample median.[32] This result holds because the median serves as the minimizer under the L1 norm, balancing the number of observations on either side of m.[33]
A proof sketch for the sample case relies on the piecewise linearity of the absolute value function. For ordered x_1 \leq \dots \leq x_n, the sum S(m) = \sum_{i=1}^n |x_i - m| is piecewise linear in m, with slope equal to (number of x_i < m) - (number of x_i > m) between data points; the slope increases by 2 each time m crosses a data point. At the median, where roughly half the points lie on each side, the slope changes sign (or the subgradient contains zero), confirming a minimum.[34] For the population case, the expected value E[|X - m|] has derivative F(m) - (1 - F(m)) = 2F(m) - 1, which vanishes when m is the median, where F is the cumulative distribution function.[34]
In contrast, the variance \frac{1}{n} \sum_{i=1}^n (x_i - m)^2, minimized at the arithmetic mean, corresponds to the L2 norm and emphasizes larger deviations more heavily.[35] This difference highlights the mean's sensitivity to outliers versus the median's robustness.
The minimization property underscores the median's role as a robust location estimator, particularly in distributions contaminated by outliers, as it downweights extreme values unlike squared-error criteria.[33]
For illustration, consider the dataset \{1, 3, 5, 7, 100\}, with median 5 and mean 23.2. The AAD at the median is \frac{|1-5| + |3-5| + |5-5| + |7-5| + |100-5|}{5} = 20.6. At the mean, it is 30.72; at 3, it is 21. These values confirm the minimum occurs at the median.
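A quick Python scan over the candidate centers confirms these numbers and the location of the minimum:

```python
def aad(data, m):
    """Average absolute deviation of data from center m."""
    return sum(abs(x - m) for x in data) / len(data)

data = [1, 3, 5, 7, 100]
at_median, at_mean, at_three = aad(data, 5), aad(data, 23.2), aad(data, 3)
print(at_median, at_mean, at_three)  # 20.6 30.72 21.0 — smallest at the median
```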
Relation to Other Dispersion Measures
The average absolute deviation (AAD), also known as the mean absolute deviation (MAD), exhibits several important relations to other measures of dispersion. One key connection is with the interquartile range (IQR), a robust measure of spread based on quartiles. For normally distributed data, the IQR is approximately twice the median absolute deviation (MAD from the median), as the MAD from the median is roughly 0.674 times the standard deviation while the IQR is about 1.349 times the standard deviation.[1]
Another relation links AAD to the Gini mean difference, a dispersion measure originally proposed by Corrado Gini that calculates the average absolute difference between all pairs of observations. For certain distributions, particularly symmetric ones, the Gini mean difference equals 2 times the mean absolute deviation from the mean, reflecting how pairwise differences capture twice the expected deviation from the central tendency in such cases.[36]
In probability theory, the expected absolute deviation from the mean for a continuous random variable X with probability density function f(x) and cumulative distribution function F(x) can be expressed as
E[|X - \mu|] = 2 \int_{-\infty}^{\mu} F(x) \, dx,
where \mu is the mean; this integral form arises from integrating the tail probabilities and is particularly useful for deriving properties of the distribution.[37]
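For a standard normal distribution (\mu = 0), the identity can be checked numerically: the right-hand side should equal E[|X|] = \sqrt{2/\pi} \approx 0.798. A sketch using the standard library only, with the CDF built from the error function and a simple trapezoidal rule (the lower limit is truncated at -10, where F is negligible):

```python
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Trapezoidal approximation of 2 * integral of F(x) dx over (-10, 0].
a, b, n = -10.0, 0.0, 100_000
h = (b - a) / n
integral = h * (0.5 * Phi(a) + sum(Phi(a + i * h) for i in range(1, n)) + 0.5 * Phi(b))
print(2 * integral, math.sqrt(2 / math.pi))  # both ≈ 0.79788
```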
As noted in the comparison with variance and standard deviation, the standard deviation is always at least as large as the mean absolute deviation from the mean, consistent with these relations.
Computation and Estimation
Calculating Absolute Deviations
To calculate the mean absolute deviation from the mean (also known as the average absolute deviation), begin by computing the arithmetic mean of the dataset, denoted as \bar{x}, which is the sum of all data points divided by the number of observations n. Next, determine the absolute deviation for each data point x_i by subtracting the mean from x_i and taking the absolute value: |x_i - \bar{x}|. Finally, average these absolute deviations by summing them and dividing by n: \frac{1}{n} \sum_{i=1}^n |x_i - \bar{x}|.[5]
For the mean absolute deviation from the median or the median absolute deviation (MAD), first sort the dataset in ascending order to find the median m. If n is odd, the median is the middle value; if even, it is the average of the two middle values to handle ties appropriately. Then, compute the absolute deviations |x_i - m| for each data point. For the mean absolute deviation from the median, average these deviations as \frac{1}{n} \sum_{i=1}^n |x_i - m|; for MAD, take the median of these absolute deviations.[38][39]
In large datasets, computing the mean absolute deviation from the mean requires a single pass to calculate the mean (O(n) time complexity) followed by another pass for the deviations and average, making it linear time overall. However, variants involving the median necessitate sorting the data first (O(n log n) time using algorithms like quicksort or mergesort), after which deviations can be computed in O(n). For very large or streaming data, approximations such as histogram-based median estimation or online algorithms can reduce the effective complexity to near-linear time while maintaining reasonable accuracy.[29]
The absolute value operation in these calculations is computationally cheaper than squaring used in variance, as it avoids multiplication (e.g., floating-point absolute value via FABS is typically faster than multiplication for squaring via FMUL on modern CPUs) and preserves the original scale without amplifying outliers excessively.[40]
Software implementations facilitate efficient computation. In R, the mad() function in the base stats package computes the median absolute deviation, scaled by a consistency constant of 1.4826 for normal distributions (adjustable via the constant argument), with the center argument selecting the reference point. In Python, SciPy's scipy.stats.median_abs_deviation() computes the median of absolute deviations from the data's median (or a user-specified center), with a scale='normal' option for normal-consistency scaling and a nan_policy argument for handling missing values.[41][42]
Pseudocode for the mean absolute deviation from the mean is as follows:
    function mean_absolute_deviation(data):
        n = length(data)
        if n == 0:
            return NaN
        mean = sum(data) / n
        sum_dev = 0
        for i in 1 to n:
            sum_dev += abs(data[i] - mean)
        return sum_dev / n
For the median absolute deviation:
    function median_absolute_deviation(data):
        n = length(data)
        if n == 0:
            return NaN
        sorted_data = sort(data)
        if n mod 2 == 1:
            median = sorted_data[(n+1)/2]
        else:
            median = (sorted_data[n/2] + sorted_data[n/2 + 1]) / 2
        deviations = []
        for i in 1 to n:
            deviations.append(abs(data[i] - median))
        return median_of(deviations) // Apply median computation recursively or via sorting
These algorithms can be implemented in O(n log n) time due to sorting for the median step.[43]
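The pseudocode above translates directly into runnable Python, with `statistics.median` handling the odd/even-length cases:

```python
from statistics import median

def mean_absolute_deviation(data):
    """Average absolute deviation from the arithmetic mean; O(n)."""
    if not data:
        return float("nan")
    m = sum(data) / len(data)
    return sum(abs(x - m) for x in data) / len(data)

def median_absolute_deviation(data):
    """Median of the absolute deviations from the median; O(n log n) from sorting."""
    if not data:
        return float("nan")
    med = median(data)
    return median(abs(x - med) for x in data)

print(mean_absolute_deviation([2, 4, 4, 4, 5, 5, 7, 9]))  # 1.5
print(median_absolute_deviation([1, 1, 2, 2, 100]))       # 1
```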
Statistical Estimation Methods
Estimating the population average absolute deviation (AAD) from a sample involves addressing the bias inherent in sample-based estimators of dispersion. The sample AAD is generally a biased estimator of the population parameter, with the direction and magnitude of the bias depending on the choice of central point (mean or median) and the underlying distribution. For the mean absolute deviation from the mean (MAD_mean), the sample estimator—defined as the average of the absolute deviations from the sample mean—tends to underestimate the population value due to the dependence introduced by using the sample mean as the center. An approximate unbiased estimator for the population MAD_mean can be obtained by multiplying the sample MAD by \sqrt{\frac{n}{n-1}}, particularly in large samples where the bias is small.[44]
For the median absolute deviation (MAD_median), the sample estimator is the median of the absolute deviations from the sample median, which is also biased in finite samples but possesses desirable robustness properties. Under regularity conditions, the sample MAD_median is asymptotically normal, with its asymptotic variance depending on the underlying distribution's density at the median and the scale parameter. Specifically, for a distribution with median \theta and error density f, the asymptotic distribution is \sqrt{n} (\hat{m} - \theta) \to N(0, \frac{1}{4f(\theta)^2}) for the sample median \hat{m}, and the joint asymptotic normality of the sample median and MAD_median has covariance zero under symmetry, leading to independent limiting distributions.[45] The variance of the MAD_median estimator varies with the distribution; for example, in the normal case, it converges to a value that allows consistent estimation of the standard deviation when scaled by approximately 1.4826.[46]
Bootstrap methods provide a distribution-free approach to constructing confidence intervals for the AAD, particularly useful when asymptotic approximations are unreliable in small samples or non-normal distributions. The nonparametric bootstrap involves resampling with replacement from the original sample to generate an empirical distribution of the AAD statistic, from which percentile or bias-corrected accelerated intervals can be derived. This method is effective for both MAD_mean and MAD_median, capturing the sampling variability without assuming normality. Seminal work on bootstrap techniques for such estimators emphasizes their utility in robust settings, where they outperform parametric intervals under contamination.
In robust estimation, techniques like winsorizing or trimming are applied to mitigate the impact of outliers in samples with heavy tails or contamination, improving the reliability of AAD estimates. Winsorizing replaces extreme values with less extreme ones based on quantiles, while trimming removes them entirely; these methods reduce bias in the sample AAD while preserving consistency for the population parameter. For instance, the trimmed MAD_median has been shown to balance efficiency and robustness better than the unadjusted version in outlier-heavy scenarios.[47]
Simulations illustrate the bias in small samples: for a standard normal distribution, the sample MAD_median for n=5 underestimates the population scale by a factor requiring a correction of approximately 1.72 (compared to the asymptotic 1.4826), leading to about 16% relative bias, whereas for n=100, the bias is negligible (less than 1%). These results highlight the need for finite-sample corrections, such as those tabulated for various n, to achieve near-unbiased estimation.[46]
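A small Monte Carlo sketch (standard library only; sample sizes, replication count, and seed are arbitrary choices) illustrates this finite-sample bias: the 1.4826-scaled MAD averages well below the true \sigma = 1 for n = 5, but is close to 1 for n = 100.

```python
import random
from statistics import mean, median

random.seed(42)

def scaled_mad(sample):
    """MAD from the median, scaled by 1.4826 for normal consistency."""
    med = median(sample)
    return 1.4826 * median(abs(x - med) for x in sample)

# Average the scaled MAD over many samples drawn from a standard normal.
results = {}
for n in (5, 100):
    results[n] = mean(
        scaled_mad([random.gauss(0, 1) for _ in range(n)])
        for _ in range(10_000)
    )
    print(n, round(results[n], 3))  # noticeably below 1 for n = 5
```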
Applications and Uses
In Descriptive Statistics
The average absolute deviation (AAD), also known as the mean absolute deviation (MAD), serves as a fundamental measure of data variability in descriptive statistics by quantifying the average distance of data points from the central tendency, typically the mean. It complements traditional components of the five-number summary—such as the minimum, first quartile, median, third quartile, and maximum—by providing an additional perspective on spread that bridges the gap between the range (which captures extremes) and the interquartile range (IQR, which focuses on the middle 50% of data). Unlike the IQR, which ignores deviations beyond the quartiles, AAD incorporates all observations, offering a more holistic view of typical deviation in datasets where outliers may not dominate but overall scatter is of interest. This makes it particularly useful in summarizing variability alongside box plots, where the box represents the IQR and whiskers extend to the range, allowing analysts to report both positional spread and average displacement for a fuller descriptive profile.[48]
In educational contexts, AAD is valued as an accessible teaching tool for introducing concepts of dispersion to beginners, as its calculation—averaging absolute differences from the mean—avoids the squaring step in standard deviation, making it computationally simpler and more intuitively interpretable. For instance, an AAD of 2 units conveys that data points deviate from the mean by an average of 2 units, a direct statement in the original scale of the data that resonates with novices without requiring advanced mathematical intuition. Studies on prospective teachers highlight its role in middle school curricula, where understanding AAD as the "typical distance" from the mean fosters both procedural accuracy (correct computation) and conceptual grasp (comparing variability across datasets of unequal sizes), enhancing early statistical literacy.[5][49]
In forecasting applications, AAD manifests as the mean absolute error (MAE), which assesses prediction accuracy by computing the average absolute difference between forecasted and actual values, thereby summarizing the typical magnitude of errors in models like time-series projections. This metric retains the units of the original data, facilitating straightforward interpretation—for example, an MAE of 14 visits per day indicates predictions err by that amount on average—making it a practical tool for evaluating forecast reliability in fields such as supply chain management or economic projections.[50][51]
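The MAE computation reduces to a one-liner; the daily-visit figures below are hypothetical, chosen only to illustrate the calculation:

```python
# Hypothetical actual vs. forecast daily visits.
actual   = [120, 135, 128, 142, 150]
forecast = [115, 140, 130, 138, 160]

# Mean absolute error: average magnitude of the forecast errors.
mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
print(mae)  # (5 + 5 + 2 + 4 + 10) / 5 = 5.2 visits per day
```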
In finance, AAD is used in portfolio optimization to minimize risk by employing absolute deviation criteria instead of squared errors, providing a robust measure of return variability that is less sensitive to outliers than variance. This approach supports mean-absolute deviation models for asset allocation.[3][4]
Economists employ AAD to summarize income inequality by calculating the average absolute difference between individual incomes and the mean income, providing a simple absolute measure of dispersion that highlights the scale of disparities in monetary terms. For instance, in regional analyses, it has been used to gauge income variation across populations, though its sensitivity to population size changes limits its standalone application compared to scale-invariant alternatives. This approach, rooted in early inequality studies, aids in descriptive reports of economic disparity, such as average deviations in household income distributions to contextualize inequality trends.[52]
In Robust Statistics and Outlier Detection
In robust statistics, the median absolute deviation (MAD)—a median-based variant of the average absolute deviation (AAD)—serves as a key scale estimator resistant to outliers and contamination in data. Unlike the AAD, which uses the mean and can be heavily influenced by extreme values, the MAD computes the median of the absolute deviations from the sample median, yielding a breakdown point of 50%, meaning it remains consistent even if up to half the observations are corrupted by outliers.[53] This property makes MAD a consistent estimator under contamination models, such as those involving 25% outliers, where classical measures like the standard deviation fail due to asymmetry or heavy tails.[54]
For outlier detection, MAD enables robust thresholding rules, such as flagging data points exceeding 3 times the MAD from the median, which identifies anomalies while preserving the core structure of the dataset under non-normal conditions.[55] This approach outperforms mean-based methods in contaminated environments, as demonstrated in simulations where it maintains detection accuracy with up to 25% gross errors.[55]
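The 3×MAD rule can be sketched in a few lines of Python (the function name and sample data are illustrative; the 1.4826 factor makes the threshold comparable to 3 standard deviations under normality):

```python
from statistics import median

def mad_outliers(data, k=3.0):
    """Flag points farther than k times the scaled MAD from the median."""
    med = median(data)
    mad = 1.4826 * median(abs(x - med) for x in data)
    return [x for x in data if abs(x - med) > k * mad]

print(mad_outliers([9.8, 10.1, 10.0, 9.9, 10.2, 25.0]))  # [25.0]
```

Because both the median and the MAD resist contamination, the threshold itself is barely moved by the outlier it is meant to detect, unlike a mean-plus-3-standard-deviations rule.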
In regression analysis, the absolute deviation loss function underlying L1 (least absolute deviations) regression minimizes the sum of absolute residuals, effectively estimating the conditional median and providing a robust alternative to least squares that aligns with MAD minimization for scale assessment.[56] This L1 framework enhances robustness in linear models by downweighting outlier leverage, ensuring parameter estimates remain stable under contamination levels below 50%.[56]
MAD finds practical applications in signal processing for denoising, where it estimates noise variance in wavelet-based methods to threshold coefficients without distorting underlying signals, as in Donoho and Johnstone's shrinkage techniques.[57] In quality control, MAD-based control charts monitor process dispersion from specifications, detecting shifts robustly in skewed or outlier-prone manufacturing data.[58]
A notable case study involves filtering noise in astronomical images from the James Webb Space Telescope (JWST), where a MAD-based iterative filter (SOCKS) with a 3×MAD threshold removes cosmic rays and sensor artifacts across multiple iterations, revealing stellar structures and enhancing data clarity in contaminated pixel arrays.[59]
Historical Development
The roots of absolute deviation measures, including the average absolute deviation (AAD), lie in 18th-century efforts to quantify observational errors in astronomy and geodesy. Mathematicians such as Johann Heinrich Lambert and Pierre-Simon Laplace explored early concepts of dispersion through absolute differences in their works on probability and error analysis, laying foundational ideas for assessing data variability.[60]
A pivotal advancement occurred in 1816 when Carl Friedrich Gauss, in his paper "Bestimmung der Genauigkeit der Beobachtungen" (Determination of the Accuracy of Observations), proposed the mean absolute deviation from the mean as a practical measure of precision for numerical astronomical data. Gauss highlighted its utility alongside the median absolute deviation, noting its robustness for real-world observations affected by errors. This marked one of the earliest formal uses of AAD in statistical theory.[60][61]
Throughout the 19th century, AAD saw widespread application in fields like surveying and physics due to its computational simplicity and interpretability in original units. However, by the late 1800s, the mean squared deviation (leading to variance and standard deviation) gained favor for its mathematical tractability, particularly in optimization and normal distribution theory. Karl Pearson's 1893 introduction of the term "standard deviation" further popularized squared measures, somewhat eclipsing absolute ones.[60]
In the 20th century, renewed interest in AAD emerged within robust statistics, where its lower sensitivity to outliers proved advantageous for non-normal distributions and outlier detection. This revival, influenced by works from statisticians like Frank Anscombe and Peter Hampel, positioned AAD as a complementary tool to standard deviation in modern data analysis.[62]