
Winsorized mean

The Winsorized mean is a robust statistical measure of central tendency that mitigates the impact of outliers by replacing the most extreme values in a dataset—typically those in the lowest and highest percentiles—with less extreme values from adjacent percentiles, before computing the arithmetic mean of the modified data. This approach yields a more stable estimate of the data's location than the conventional arithmetic mean, especially in distributions affected by anomalous observations. Named after biostatistician Charles P. Winsor, the method was introduced in 1946 as a technique for handling outliers in counting statistics.

The computation of the Winsorized mean involves specifying a trimming proportion α (often 0.05 or 0.10 for 90% or 80% Winsorization, respectively), sorting the data, and capping the αn lowest values at the (αn + 1)th ordered value while capping the αn highest values at the [n(1 - α)]th ordered value, where n is the sample size. For instance, consider the dataset {1, 2, 3, 4, 100} with n = 5 and α = 0.20 (20% Winsorization): the lowest value (1) is replaced by the second lowest (2), and the highest (100) by the second highest (4), resulting in the adjusted set {2, 2, 3, 4, 4} and a Winsorized mean of 3. This process can be applied symmetrically for balanced tails or asymmetrically if needed, and it is available in statistical software such as R and Python.

In contrast to the trimmed mean, which excludes extreme values entirely and reduces the effective sample size, the Winsorized mean preserves all data points, avoiding loss of information and maintaining statistical power for subsequent inference. It is particularly valuable in applied fields such as finance, economics, and biomedical research, where outliers arising from measurement errors or anomalous events can distort results, and for symmetric distributions it provides an unbiased estimator of the population mean. The technique enhances the reliability of analyses such as t-tests and regression models by producing variance estimates that are less sensitive to extreme observations.

Definition and Background

Definition

The arithmetic mean, a common measure of central tendency, is highly sensitive to outliers—extreme values in a dataset that deviate markedly from the majority of observations—because they pull the average toward themselves, producing distorted estimates of the population parameter. To mitigate this without discarding data, the Winsorized mean serves as a robust alternative that adjusts extreme values rather than removing them. Winsorizing involves replacing the extreme observations in a sorted dataset with less extreme values drawn from the data itself, typically at specified lower and upper percentiles such as \alpha and 1 - \alpha, where 0 < \alpha < 0.5. For a sample of n observations sorted in non-decreasing order as x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}, and a trimming level k (often k = \lfloor \alpha n \rfloor), the k smallest values x_{(1)} to x_{(k)} are replaced by x_{(k+1)}, and the k largest values x_{(n-k+1)} to x_{(n)} are replaced by x_{(n-k)}. This capping process preserves the original sample size n, distinguishing it from trimming methods that delete extremes and reduce the effective sample. The Winsorized mean is then computed as the arithmetic mean of the adjusted dataset: \bar{x}_w = \frac{1}{n} \sum_{i=1}^n w_i, where the w_i are the Winsorized values (w_i = x_{(k+1)} for the lowest k originals, w_i = x_{(n-k)} for the highest k, and w_i = x_{(i)} otherwise). This approach limits the influence of outliers while retaining all data points, making it suitable for datasets suspected of containing anomalies.
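The definition translates directly into code; the following is a minimal sketch, assuming k = \lfloor \alpha n \rfloor and the capping form of the definition (variable names are illustrative, and the data echo the example in the lead):
python
import math

x = sorted([3, 1, 100, 4, 2])     # x_(1) <= ... <= x_(n)
n, alpha = len(x), 0.20
k = math.floor(alpha * n)         # k = 1 here
# Cap each value at x_(k+1) below and x_(n-k) above
w = [min(max(v, x[k]), x[n - k - 1]) for v in x]
print(sum(w) / n)                 # Winsorized mean: 3.0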

Historical Development

The Winsorized mean was introduced by the biostatistician Charles P. Winsor, who advocated for replacing extreme observations in datasets with values from adjacent non-extreme points to reduce outlier influence. This approach emerged in 1941, as recounted by John Tukey, who first encountered Winsor's ideas that year and credited him with a practical philosophy for treating "wild shots" in real-world data without discarding them outright. Winsor's innovation arose in the context of early 20th-century statistics, a period marked by increasing recognition of the limitations of classical methods like the arithmetic mean when applied to non-normal distributions prevalent in biological and public health studies. At the time, researchers in fields such as biology and public health sought robust alternatives to better accommodate skewed or contaminated data from observational sources, reflecting broader debates on estimation stability that traced back to late-19th-century concerns over the treatment of outlying observations. Following World War II, the Winsorized mean gained traction in econometrics and finance starting in the post-1950s era, as these disciplines grappled with volatile economic indicators and heavy-tailed distributions in time-series data. Tukey formalized the terminology "Winsorized" in his seminal 1962 paper, popularizing the method and linking it to emerging robust statistics frameworks, though Winsor's foundational contribution predated and inspired Tukey's later exploratory techniques. By the 2000s, computational advancements made Winsorizing routine, with implementations in open-source software like R's DescTools package and Python's SciPy library enabling easy application across disciplines. Recent studies continue to affirm its utility, such as in 2024 analyses of financial forecasting where winsorization improved model stability against extreme market events.

Computation

Procedure

The procedure for computing the Winsorized mean involves sorting the data and systematically replacing extreme values to mitigate the influence of outliers. Begin by sorting the dataset in ascending order to obtain the order statistics x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}, where n is the sample size. Next, select the trimming level \alpha, a value between 0 and 0.5 that specifies the proportion of observations to adjust in each tail (for example, \alpha = 0.05 for 5% per tail). Compute k = \lfloor \alpha n \rfloor using the floor function to obtain an integer number of values to replace. When \alpha n is not an integer, the floor function ensures k values are replaced per tail, which may yield a proportion slightly less than \alpha for small or odd n; alternatively, some implementations use ceiling or rounding for k, but flooring is the conventional choice for consistency and to avoid over-adjustment. Then, replace the k smallest values (x_{(1)} through x_{(k)}) with x_{(k+1)}, and replace the k largest values (x_{(n-k+1)} through x_{(n)}) with x_{(n-k)}. Finally, calculate the arithmetic mean of the resulting modified dataset. The following Python function implements the process in a directly runnable form:
python
import math

def winsorized_mean(data, alpha):
    """Return the alpha-Winsorized mean of a sequence of numbers."""
    n = len(data)
    if n == 0:
        raise ValueError("data must be non-empty")
    values = sorted(data)          # values[0] <= ... <= values[n-1]
    k = math.floor(alpha * n)      # number of observations to cap in each tail
    if k > 0 and 2 * k < n:        # ensure a nonempty middle range remains
        lower = values[k]          # the (k+1)th order statistic
        upper = values[n - k - 1]  # the (n-k)th order statistic
        values[:k] = [lower] * k           # cap the k smallest values
        values[n - k:] = [upper] * k       # cap the k largest values
    return sum(values) / n
This algorithm preserves the sample size while capping extremes, with the condition k < n/2 preventing the entire dataset from being replaced.
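As a quick check, applying the function above to the dataset from the lead reproduces the worked result (a usage sketch, assuming the winsorized_mean function defined above):
python
print(winsorized_mean([1, 2, 3, 4, 100], 0.20))  # 3.0, matching the example in the lead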

Mathematical Formulation

The α-Winsorized mean of a sample X = (X_1, \dots, X_n) consisting of independent and identically distributed random variables is formally defined as \bar{X}^{(\alpha)} = \frac{1}{n} \sum_{i=1}^n W_i, where W_i = \min\left( \max\left( X_i, Q_{\alpha} \right), Q_{1-\alpha} \right) for each i, and Q_p denotes the p-quantile of the underlying distribution (or the sample p-quantile in the empirical setting). This transformation caps observations below the α-quantile at Q_{\alpha} and those above the (1-α)-quantile at Q_{1-\alpha}, thereby bounding the influence of extreme values while retaining all data points in the averaging process. In terms of order statistics, let X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(n)} be the ordered sample, and set k = \lfloor \alpha n \rfloor. The α-Winsorized mean can equivalently be expressed as \bar{X}^{(\alpha)} = \frac{1}{n} \left[ k X_{(k+1)} + \sum_{i=k+1}^{n-k} X_{(i)} + k X_{(n-k)} \right]. This formulation replaces the k smallest observations with X_{(k+1)} and the k largest with X_{(n-k)}, then computes the sample mean of the adjusted values. The sample quantiles Q_p are estimated using the empirical cumulative distribution function (CDF) F_n(x) = \frac{1}{n} \sum_{i=1}^n I(X_i \leq x), where I(\cdot) is the indicator function, via Q_p = \inf \{ x : F_n(x) \geq p \}. This ties the Winsorized mean directly to the empirical distribution function, ensuring it is a nonparametric functional of the sample. Asymptotically, as n \to \infty, the α-Winsorized mean is consistent for the population α-Winsorized mean E[\min(\max(X, Q_\alpha), Q_{1-\alpha})], which coincides with the population mean \mu for symmetric distributions, assuming a finite first absolute moment for the underlying distribution.
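The equivalence of the quantile-capping and order-statistic formulations can be checked numerically; the following is a small NumPy sketch under the conventions above (variable names are illustrative):
python
import numpy as np

x = np.sort(np.array([1.0, 2.0, 3.0, 4.0, 100.0]))  # ordered sample X_(1) <= ... <= X_(n)
n, alpha = len(x), 0.20
k = int(np.floor(alpha * n))

# Order-statistic form: k copies of X_(k+1), the middle values, k copies of X_(n-k)
order_form = (k * x[k] + x[k:n - k].sum() + k * x[n - k - 1]) / n

# Capping form: W_i = min(max(X_i, X_(k+1)), X_(n-k))
cap_form = np.clip(x, x[k], x[n - k - 1]).mean()

print(order_form, cap_form)  # both equal 3.0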

Properties

Robustness Characteristics

The Winsorized mean exhibits a finite-sample breakdown point of \min(\alpha, 1 - \beta), where \alpha is the proportion replaced at the lower tail and 1 - \beta the proportion replaced at the upper tail (with \beta the upper cutoff level); for the symmetric case with \beta = 1 - \alpha, this simplifies to \alpha, indicating that the estimator can resist contamination by up to a proportion \alpha of arbitrary outliers before its value becomes unbounded or meaningless. This property ensures qualitative robustness, as the estimate remains stable unless the fraction of outliers exceeds this threshold, outperforming non-robust alternatives like the arithmetic mean, whose breakdown point approaches zero. The influence function of the Winsorized mean is bounded, which qualitatively measures its resistance to perturbations in the underlying distribution by limiting the contribution of any single observation. Specifically, the bounded influence caps the effect of an outlier at a value related to the distance between the \alpha- and (1-\alpha)-quantiles, preventing any individual extreme value from disproportionately skewing the estimate. In contrast, the arithmetic mean's unbounded influence function allows a single outlier to produce arbitrarily large distortions. Under gross-error models, such as the \epsilon-contaminated normal model in which a proportion \epsilon of observations arise from a contaminating distribution, the Winsorized mean demonstrates superior performance to the arithmetic mean by reducing bias and mean squared error, particularly for low contamination levels (\epsilon \leq 0.10) and sample sizes n \geq 100. This robustness extends to distributions with heavy tails, where the Winsorized mean maintains lower sensitivity to extreme deviations than the sample mean, as evidenced by simulations across varying trimming proportions and contamination scenarios.
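The gross-error behavior is easy to probe by simulation; the following is a minimal Monte Carlo sketch, assuming SciPy's winsorize and an illustrative ε-contaminated normal model (the specific parameters are assumptions, not from the source):
python
import numpy as np
from scipy.stats.mstats import winsorize

rng = np.random.default_rng(0)
n, reps, eps = 100, 2000, 0.05
est_mean, est_wins = [], []
for _ in range(reps):
    x = rng.normal(0.0, 1.0, n)
    bad = rng.random(n) < eps                  # ~eps of points come from a wide contaminant
    x[bad] = rng.normal(0.0, 10.0, bad.sum())
    est_mean.append(x.mean())
    est_wins.append(float(winsorize(x, limits=[0.1, 0.1]).mean()))
print(np.var(est_mean), np.var(est_wins))      # the Winsorized mean typically varies less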

Bias and Efficiency

The Winsorized mean is unbiased for the population mean under symmetric distributions, such as the normal distribution, because the symmetric replacement of extreme values with their corresponding quantiles preserves the center of symmetry of the distribution. In skewed distributions, however, the procedure introduces a small bias, as the clipping of extremes does not fully account for the asymmetry, though this bias can be minimized by selecting an optimal cutoff value that reduces the mean squared error. For the normal distribution specifically, the bias is zero, as confirmed by the symmetry property. The variance of the Winsorized mean is lower than that of the arithmetic mean in distributions susceptible to outliers, since replacing extreme values dampens the contribution of heavy tails to overall variability. Under the normal distribution, the asymptotic variance takes the form \frac{\sigma^2}{n} \left(1 - 2\alpha + \frac{2\alpha^2}{1 - 2\alpha}\right), reflecting the reduced spread after Winsorization. This formula highlights how modest clipping levels \alpha yield variances close to the arithmetic mean's \sigma^2 / n, with the data-dependent estimation of the cutoff quantiles contributing additional variability. Relative efficiency, defined as the inverse ratio of the asymptotic variances compared to the arithmetic mean, stands at approximately 95% for \alpha = 0.05 under a clean normal distribution, indicating a minor efficiency loss in uncontaminated settings. In contrast, for contaminated normal distributions—such as those with 10% scale contamination—the relative efficiency exceeds 1, often reaching 1.2 or higher depending on the contamination level, demonstrating superior performance when outliers are present. Similar gains occur with heavier-tailed distributions such as the t-distribution; for instance, with \alpha = 0.05 and 3 degrees of freedom, efficiency surpasses 1.1 relative to the arithmetic mean, increasing to over 1.3 for \alpha = 0.1, as clipping mitigates the influence of extreme values more effectively in non-normal scenarios. These efficiencies underscore the bias-variance trade-off favoring the Winsorized mean in robust settings, though optimal \alpha selection balances the efficiency loss in clean data against the gains in robustness.
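These relative efficiencies can be reproduced approximately by simulation (a sketch assuming SciPy's winsorize; exact numbers vary with the seed and sample size):
python
import numpy as np
from scipy.stats.mstats import winsorize

rng = np.random.default_rng(1)
n, reps, alpha = 200, 3000, 0.05
for label, draw in [("normal", lambda: rng.standard_normal((reps, n))),
                    ("t(3)", lambda: rng.standard_t(3, (reps, n)))]:
    samples = draw()
    v_mean = samples.mean(axis=1).var()
    v_wins = np.array([winsorize(row, limits=[alpha, alpha]).mean()
                       for row in samples]).var()
    # Ratio below 1 under the clean normal, above 1 under the heavy-tailed t(3)
    print(label, v_mean / v_wins)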

Advantages and Limitations

Advantages

The Winsorized mean offers simplicity in computation, as it requires only sorting the data and replacing the extremes with specified thresholds before taking the arithmetic mean, making it accessible without advanced programming or optimization techniques. This contrasts with more intricate robust estimators, such as M-estimators or regression-based methods, which often involve iterative solving of nonlinear equations or hyperparameter tuning. By replacing values rather than discarding them, the Winsorized mean preserves the full sample size and retains all observations, thereby avoiding the information loss inherent in trimming or deletion procedures that reduce the effective dataset size. This preservation maintains greater data variability than trimmed means, where extremes are entirely excluded from the calculation. The method is versatile across data types, serving as a robust location estimator for univariate distributions with heavy tails or skewness, and extending to multivariate contexts through adaptations such as robust mean vector estimation in control charts or composite indicators. In non-ideal conditions, such as datasets contaminated by outliers or deviating from normality, it enhances the reliability of statistical inference by mitigating extreme influences while stabilizing variance estimates. Modern software integration further bolsters its practicality: implementations are available in libraries such as SciPy's masked statistics module (as of version 1.16.2), enabling seamless use in Python-based data analysis and machine learning workflows, although there is an ongoing proposal to deprecate the mstats module. Overall, the method reduces the impact of outliers on estimates without requiring distributional assumptions.

Limitations

One key limitation of the Winsorized mean is the arbitrary selection of the parameter α, which determines the proportion of observations winsorized in each tail and lacks universally accepted criteria, thereby introducing potential subjectivity into the analysis. Researchers often rely on conventional values such as α = 0.05 or 0.1, but these choices can vary based on field conventions or exploratory analysis, and sensitivity analyses are recommended to assess the impact of different α levels on results. This subjectivity can undermine reproducibility, particularly when the optimal α is context-dependent. The method also exhibits residual sensitivity to outliers if their number exceeds the proportion specified by α or if they cluster near the chosen quantiles, as the replacement values may themselves be contaminated by extreme observations. In such cases, the Winsorized mean fails to fully mitigate outlier influence, potentially producing biased estimates that resemble those of the ordinary mean. For instance, if more than αn outliers exist in one tail, the (αn + 1)th order statistic used for replacement could itself be an outlier, preserving undue influence. Additionally, the Winsorized mean relies on asymptotic assumptions that may not hold in small samples (typically n < 30) or highly skewed distributions, where it can perform poorly without further adjustments. In small-sample scenarios, such as n around 10–20, the Winsorized mean has been shown to yield inaccurate point estimates compared to more robust alternatives. For skewed data, the procedure can introduce bias by asymmetrically altering the distribution, especially if the tails are unequal. Traditional discussions of α selection have been limited, but post-2010 developments include data-driven methods such as using the median absolute deviation (MAD) to set thresholds. These approaches, often integrated into modern robust estimation frameworks, help mitigate subjectivity by adaptively choosing truncation levels based on empirical evidence. For example, MAD-based selection identifies capping boundaries relative to the median.
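A MAD-based rule can be sketched as follows (an illustrative implementation; the helper name mad_winsorize and the cutoff multiplier c = 3 are assumptions, not a standard API):
python
import numpy as np

def mad_winsorize(x, c=3.0):
    """Cap values more than c robust standard deviations from the median."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med)) * 1.4826  # consistency factor for normal data
    return np.clip(x, med - c * mad, med + c * mad)

data = np.array([1, 2, 3, 4, 100], dtype=float)
print(mad_winsorize(data).mean())  # the outlier 100 is capped before averaging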

Comparison with Trimmed Mean

The primary distinction between the Winsorized mean and the trimmed mean lies in their treatment of extreme values: the Winsorized mean replaces outliers beyond specified percentiles with the nearest non-extreme values, thereby capping their influence while retaining the full sample size, whereas the trimmed mean discards those extremes entirely, resulting in a reduced effective sample size. In positively skewed distributions, where high-value outliers are more prevalent, the Winsorized mean tends to produce slightly higher estimates than the trimmed mean because it incorporates capped versions of the extremes into the average over the original sample size, avoiding the downward pull from outright removal of those values. Regarding performance, both estimators exhibit similar breakdown points, typically up to the proportion of data trimmed or Winsorized (e.g., 50% for maximal robustness), but the Winsorized mean demonstrates higher efficiency under symmetric scenarios, such as distributions with balanced outliers on both tails. The choice between them depends on context: the trimmed mean is preferable when strict outlier removal is desired to eliminate potential contamination entirely, while the Winsorized mean is better suited for small datasets where preserving the full sample size is crucial to maintain statistical power.
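The difference in a positively skewed sample can be seen directly (a sketch using SciPy's trim_mean and winsorize; the data are illustrative):
python
import numpy as np
from scipy.stats import trim_mean
from scipy.stats.mstats import winsorize

x = np.array([1, 5, 6, 7, 8, 9, 10, 11, 20, 100], dtype=float)  # positively skewed
print(trim_mean(x, 0.1))                       # 9.5: drops 1 and 100 entirely
print(winsorize(x, limits=[0.1, 0.1]).mean())  # 10.1: caps them at 5 and 20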

Comparison with Other Robust Estimators

The Winsorized mean provides greater statistical efficiency than the median under distributions close to normal with only mild contamination, as it retains more information from the data while capping extremes. For instance, under the standard normal distribution, a 5% Winsorized mean achieves approximately 95% relative efficiency compared to the sample mean, whereas the median attains only about 64% (2/π). The median, however, offers simplicity in computation and asymptotic distribution-free properties, making it preferable when contamination is heavy or computational resources are limited. In comparison to the Huber M-estimator, the Winsorized mean is non-iterative and thus easier to implement, requiring only sorting and replacement of extremes rather than solving nonlinear estimating equations. The Huber estimator, introduced as a foundational M-estimator, allows fine-tuning of its robustness parameter to balance efficiency and resistance, achieving up to 95% relative efficiency under normality while bounding outlier influence; certain extensions or combinations with high-breakdown initial estimates can yield breakdown points up to 25%. In applied contexts, the Winsorized mean serves as a straightforward alternative to Huber methods, particularly when iteration is undesirable, though it trades some flexibility in tuning for computational simplicity. The following table summarizes key properties for a 5% Winsorized mean (α = 0.05), the median, and the Huber M-estimator tuned for 95% efficiency under normality:
Estimator | Breakdown point | Relative efficiency (normal) | Computational complexity
Winsorized mean (α = 0.05) | 0.05 | ~0.95 | O(n log n) (sorting)
Median | 0.50 | 0.64 | O(n)
Huber M-estimator | 0 (asymptotic) | 0.95 | O(n × iterations)
Breakdown points reflect the maximum proportion of contaminated observations the estimator can tolerate before arbitrary bias; efficiencies are asymptotic relative to the sample mean.

Examples and Applications

Numerical Example

Consider a hypothetical dataset of 10 exam scores: 50, 55, 60, 65, 70, 75, 80, 85, 90, 200. This set includes a clear outlier at 200, which inflates the arithmetic mean to 83 (the sum 830 divided by 10). To compute the 10% Winsorized mean (α = 0.1 per tail, so k = 1 value is replaced at each end), first sort the scores in ascending order: 50, 55, 60, 65, 70, 75, 80, 85, 90, 200. Replace the lowest value (50) with the next lowest (55) and the highest value (200) with the next highest (90). The resulting Winsorized dataset is 55, 55, 60, 65, 70, 75, 80, 85, 90, 90, with a sum of 725 and a mean of 72.5 (725 divided by 10). This adjustment caps the outlier's influence, lowering the estimate by 10.5 points relative to the original mean. For comparison, the corresponding 10% trimmed mean removes the extreme values (50 and 200), leaving the central 8 scores: 55, 60, 65, 70, 75, 80, 85, 90. Their sum is 580, yielding a mean of 72.5 (580 divided by 8). In this case, both robust estimators produce the same result, highlighting how Winsorization retains all observations while achieving similar outlier mitigation. The following table displays the datasets side by side for clarity:
Position | Original data | Winsorized data (10%) | Trimmed data (10%)
1 | 50 | 55 | 55
2 | 55 | 55 | 60
3 | 60 | 60 | 65
4 | 65 | 65 | 70
5 | 70 | 70 | 75
6 | 75 | 75 | 80
7 | 80 | 80 | 85
8 | 85 | 85 | 90
9 | 90 | 90 | (removed)
10 | 200 | 90 | (removed)
Means: Original = 83; Winsorized = 72.5; Trimmed = 72.5

This example demonstrates the Winsorized mean's ability to reduce outlier impact without discarding data points, making the estimate more representative of typical performance in the scores. Visualizing the data via boxplots further illustrates the effect: the original dataset's boxplot shows a median around 72.5 with an upper whisker extended toward 200 by the outlier, creating a skewed representation. After Winsorization, the boxplot's upper whisker shortens to 90, yielding a more symmetric and compact display that better captures the central tendency of the scores.
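The worked figures can be verified with SciPy (a sketch using trim_mean and winsorize):
python
import numpy as np
from scipy.stats import trim_mean
from scipy.stats.mstats import winsorize

scores = np.array([50, 55, 60, 65, 70, 75, 80, 85, 90, 200], dtype=float)
print(scores.mean())                                # 83.0
print(winsorize(scores, limits=[0.1, 0.1]).mean())  # 72.5
print(trim_mean(scores, 0.1))                       # 72.5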

Practical Applications

The Winsorized mean finds significant application in finance, particularly in analyzing fund returns where extreme market events can skew traditional means. In studies of the 2008 financial crisis, researchers applied winsorization to return data to eliminate spurious outliers and enhance statistical efficiency, enabling more reliable performance evaluations before, during, and after the crisis period. Similarly, in studies of volatility across financial crises, winsorized monthly real returns are used to compute standard deviations, mitigating the distorting effects of outliers in international stock data spanning 60 countries. This approach provides a robust estimate of volatility, crucial for risk assessment and momentum profitability analysis in volatile markets.

In biostatistics and laboratory medicine, the Winsorized mean supports robust averaging of clinical measurements prone to errors or outliers, such as laboratory assay values and trial outcomes. For patient-based real-time quality control in medical laboratories, winsorization of extreme values outperforms simple outlier removal, yielding more stable medians and percentiles for monitoring analytical performance. In critical results notification systems, it is applied to turn-around times for critical test results, where winsorizing reveals a reduction in mean processing time from 34 minutes to 20 minutes after system improvements, highlighting operational efficiencies otherwise masked by extremes. These uses ensure reliable inference in datasets with measurement variability, as seen in physician screening performance models that employ winsorization for outlier adjustment.

Recent applications extend to machine learning preprocessing, where the Winsorized mean handles outliers in feature data to improve model stability and accuracy. In cognitive age prediction models, winsorization limits extreme values in features such as psychophysiological test data, including reaction time and accuracy metrics, reducing distortions and enhancing performance alongside variance inflation factor selection. This technique preserves dataset size while curbing outlier influence, making it suitable for high-dimensional inputs in neural networks and time-series forecasting.

Software implementations facilitate practical adoption of the Winsorized mean across disciplines. In R, the robustHD package offers the winsorize function to shrink outlying observations to the border of the main part of the data, which is useful for cleaning financial or clinical datasets; post-2020 updates in related libraries such as DescTools enhance its efficiency for large-scale analysis.
r
library(robustHD)
data <- c(1, 2, 3, 100, -50, 4, 5)  # Example data with outliers
winsorized_data <- winsorize(data)  # shrink outliers to the border of the data (median/MAD-based defaults)
mean(winsorized_data)  # Compute Winsorized mean
In Python, SciPy's mstats.winsorize from the scipy.stats module caps extremes at specified percentiles and is commonly used in machine learning pipelines for preprocessing numerical features. Note that the number of values capped per tail is the integer part of limits × n, so very small samples may require larger limits for the call to have any effect.
python
from scipy.stats.mstats import winsorize
import numpy as np
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100, -50])  # example data with outliers
winsorized = winsorize(data, limits=[0.1, 0.1])  # n = 10, so 10% caps exactly one value per tail
np.mean(winsorized)  # compute the Winsorized mean: 4.5
