
Tolerance interval

A tolerance interval is a statistical interval derived from sample data that contains at least a specified proportion p of the population's values with a stated confidence level 1 - \alpha, distinguishing it from confidence intervals, which cover population parameters, and prediction intervals, which cover future observations. For a normal distribution, the interval is typically two-sided, with lower limit \bar{Y} - k s and upper limit \bar{Y} + k s, where \bar{Y} is the sample mean, s is the sample standard deviation, and k is a factor depending on p, \alpha, and the sample size. One-sided intervals are also used, providing either a lower or upper bound. The concept originated in the early 1940s with Samuel S. Wilks' work on determining sample sizes for setting tolerance limits, building on nonparametric methods to ensure coverage of population proportions. Subsequent developments, such as W.G. Howe's 1969 method for normal distributions, standardized calculations using chi-squared and noncentral t-distributions for precise factor computation. Tolerance intervals apply across fields like manufacturing quality control, where they verify whether specification limits encompass a high proportion (e.g., 99%) of product measurements with 95% confidence, and environmental monitoring, where they bound pollutant concentration ranges. They are particularly valuable in small-sample settings, such as pharmaceutical assays, for inferring population behavior from limited samples without assuming full knowledge of the distribution. Key distinctions include their focus on individual values rather than means or single predictions, requiring larger samples than confidence or prediction intervals for similar precision due to the dual requirement of proportion coverage and confidence. Modern implementations in statistical software such as R and Python cover both parametric (normal) and nonparametric cases, with Bayesian extensions emerging for complex distributions.

Fundamentals

Definition

A tolerance interval is a type of statistical interval designed to contain at least a specified proportion p of the values in a population, with a stated confidence level \gamma = 1 - \alpha that the interval achieves this coverage. This makes it particularly useful for describing the range within which a large share of the population is expected to fall, accounting for both the variability in the sample and the uncertainty in estimating that variability. For instance, a tolerance interval might be constructed to cover 95% of the population (p = 0.95) with 95% confidence (\gamma = 0.95), ensuring that the probability the interval misses more than 5% of the population is only 5%. The key parameters defining a tolerance interval include the coverage proportion p, which specifies the minimum fraction of the population to be enclosed; the confidence level \gamma, which quantifies the reliability of the coverage claim; and the sample size n, which influences the interval's width and reliability. Larger samples generally yield narrower intervals for the same p and \gamma, but the construction balances these to reflect population spread rather than point estimates. Tolerance intervals differ fundamentally from other statistical intervals: they do not bound population parameters like confidence intervals, nor do they predict single future observations like prediction intervals; instead, they provide bounds for a substantial portion of the entire population. The concept of tolerance intervals originated in the early 1940s amid growing applications of statistics to quality control in manufacturing, with foundational theoretical developments by Samuel S. Wilks in his work on statistical prediction and tolerance limits. Subsequent contributions, including work by Abraham Wald and Jacob Wolfowitz on setting tolerance limits for normal distributions, further established their role in industrial settings.

Types

Tolerance intervals are classified primarily by their directionality, which determines whether they provide bounds on one tail or both tails of the distribution. One-sided tolerance intervals establish either a lower bound, ensuring that at least a proportion p of the population exceeds the limit with confidence \gamma, or an upper bound, ensuring that no more than a proportion 1 - p exceeds the limit with confidence \gamma. These are commonly applied in scenarios such as quality control where only a minimum strength or maximum defect rate is of interest, for instance, setting a lower tolerance limit for the tensile strength of materials to guarantee that at least 95% of items meet the requirement with 99% confidence. Two-sided tolerance intervals provide both lower and upper bounds, capturing at least proportion p of the population within the interval with confidence \gamma, and can be symmetric or asymmetric depending on the application. Within two-sided intervals, equal-tailed variants allocate equal proportions of the uncovered mass to each tail, such as (1-p)/2 on each side, resulting in a central interval that symmetrically bounds the distribution. In contrast, unequal-tailed two-sided intervals allow asymmetric tail probabilities, which may be preferred when the distribution or the requirements are skewed, enabling more coverage on one side while still achieving the overall guarantee. Tolerance intervals are further categorized by distributional assumptions into parametric and non-parametric types. Parametric tolerance intervals assume a specific underlying distribution, such as the normal, to derive bounds that are typically narrower and more precise for smaller samples when the assumption holds. Non-parametric tolerance intervals, also known as distribution-free intervals, make no such assumptions and rely on the order statistics of the sample, offering broader applicability but requiring larger sample sizes to achieve comparable coverage guarantees. Special cases extend tolerance intervals beyond univariate settings. Tolerance bands for regression provide interval bounds that vary with predictor variables in linear or nonlinear models, ensuring coverage of future responses across the range of covariates, often constructed simultaneously to control overall confidence. Multivariate tolerance intervals or regions bound a proportion p of a multidimensional distribution with confidence \gamma, using methods like data depth or spacings to define ellipsoidal or other shaped regions suitable for vector-valued quality characteristics.

Methods

Parametric Approaches

Parametric approaches to tolerance intervals rely on the assumption that the underlying population follows a specified parametric distribution, with the normal distribution being the most prevalent case due to its widespread applicability in quality control and engineering contexts. Under this framework, the data are modeled as independent and identically distributed samples from a normal distribution N(\mu, \sigma^2), where the population mean \mu and variance \sigma^2 are typically unknown and estimated from the sample. This parametric assumption enables the derivation of exact or approximate intervals that guarantee, with a specified confidence level \gamma, coverage of at least a proportion p of the population. For one-sided tolerance intervals, the lower bound is constructed as L = \bar{y} - k s, where \bar{y} is the sample mean, s is the sample standard deviation, and k is a tolerance factor chosen such that the interval covers at least proportion p of the population with confidence \gamma. The factor k is obtained as the \gamma-quantile of a noncentral t-distribution with n - 1 degrees of freedom and noncentrality parameter \delta = z_p \sqrt{n}, divided by \sqrt{n}, where z_p is the p-quantile of the standard normal distribution and n is the sample size. This approach, originally proposed by Paulson, ensures the probabilistic coverage by linking the interval to the distribution of future observations relative to the sample estimates. An upper one-sided bound follows symmetrically as U = \bar{y} + k s. Two-sided tolerance intervals take the form [\bar{y} - k s, \bar{y} + k s], where the tolerance factor k is determined to achieve the desired coverage p with confidence \gamma, but its computation is more involved than for the one-sided case due to the joint uncertainty in mean and variance estimates.
The factor k incorporates elements from the chi-squared distribution to account for the variability in s^2 / \sigma^2, which follows a \chi^2_{n-1} distribution scaled by 1/(n-1); specifically, approximate methods solve for k such that the expected coverage meets the criteria, often using Howe's approximation k \approx z_{(1+p)/2} \sqrt{\frac{(n-1)(1 + 1/n)}{\chi^2_{1-\gamma,\, n-1}}}, where \chi^2_{1-\gamma, n-1} is the lower (1-\gamma) quantile of the chi-squared distribution with n - 1 degrees of freedom (the value exceeded with probability \gamma); exact solutions require numerical integration over the joint distribution of \bar{y} and s. This formulation balances the symmetric bounds while ensuring the interval's reliability under normality. The derivation of these intervals centers on pivotal quantities that transform the problem into a distribution-free coverage probability statement. For the normal distribution, the studentized sample mean (\bar{y} - \mu)/(s / \sqrt{n}) follows a t_{n-1} distribution, and the coverage of a future observation Y relative to the interval involves a noncentral t pivotal quantity for one-sided bounds, leading to the selection of k that satisfies \Pr(\Pr(Y > L \mid \bar{y}, s) \geq p) = \gamma. For two-sided intervals, order statistics of the sample play a role in approximating the minimum coverage, but the primary reliance is on the joint pivotal distribution of the mean and variance, often requiring inversion of the coverage probability function. When parameters \mu and \sigma^2 are unknown, as is standard, the tolerance factors k explicitly incorporate the degrees-of-freedom adjustment in the t and chi-squared distributions to reflect estimation uncertainty, widening the interval compared to known-parameter cases.
For finite populations of size N, a correction factor \sqrt{(N - n)/(N - 1)} is applied to the standard deviation component in the interval formula, reducing the effective variability and narrowing the bounds to account for the exhaustive sampling fraction n/N. This adjustment ensures the interval's validity when the population is not infinite, preserving the coverage guarantees.
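Assuming normality, these factor computations can be sketched briefly. The snippet below uses SciPy's noncentral t and chi-squared quantile functions; the helper names are illustrative, and Howe's formula is an approximation rather than the exact two-sided solution.

```python
import numpy as np
from scipy import stats

def k_one_sided(n, p, gamma):
    """One-sided normal tolerance factor: gamma-quantile of the
    noncentral t with df = n-1 and nc = z_p * sqrt(n), over sqrt(n)."""
    delta = stats.norm.ppf(p) * np.sqrt(n)
    return stats.nct.ppf(gamma, df=n - 1, nc=delta) / np.sqrt(n)

def k_two_sided_howe(n, p, gamma):
    """Two-sided factor via Howe's approximation."""
    z = stats.norm.ppf((1 + p) / 2)
    chi2_lo = stats.chi2.ppf(1 - gamma, n - 1)  # lower (1-gamma) quantile
    return z * np.sqrt((n - 1) * (1 + 1 / n) / chi2_lo)

print(round(k_one_sided(20, 0.95, 0.95), 3))      # ≈ 2.396
print(round(k_two_sided_howe(20, 0.95, 0.95), 3)) # ≈ 2.752
```

Both factors shrink toward z_p as n grows, reflecting the vanishing estimation uncertainty in \bar{y} and s.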

Non-Parametric Approaches

Non-parametric approaches to constructing tolerance intervals rely on distribution-free methods that do not assume any specific form for the underlying population distribution, making them robust alternatives when parametric assumptions, such as normality, cannot be justified. These methods primarily utilize order statistics from a random sample of size n, where the observations are ranked as X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(n)}. A two-sided tolerance interval is typically formed as [X_{(r+1)}, X_{(n-s)}], with r and s being non-negative integers giving the number of observations excluded from the lower and upper tails. The values of r and s are determined from exact binomial probabilities to achieve a content of at least proportion p with confidence level \gamma. Specifically, let k = r + s; then r and s are chosen as the largest integers such that \sum_{i=0}^{k+1} \binom{n}{i} (1-p)^i p^{n-i} \leq 1 - \gamma, which is equivalent to requiring probability at least \gamma that the interval's coverage reaches p, ensuring the coverage requirement with the specified confidence. For symmetric intervals, r = s, though asymmetric choices may be used for skewed data. The actual coverage proportion follows a beta distribution, Beta(n - r - s - 1, r + s + 2), which provides a mechanism to assess the expected coverage, with mean (n - r - s - 1)/(n + 1). This approach guarantees the tolerance interval contains at least 100p\% of the population with confidence \gamma, independent of the underlying distribution. These methods offer key advantages, including applicability to skewed, multimodal, or otherwise unknown distributions without requiring goodness-of-fit tests, thereby enhancing robustness in real-world scenarios where data may deviate from parametric ideals.
However, they come with limitations: the resulting intervals are generally wider than those from parametric methods due to reduced statistical efficiency, often necessitating larger sample sizes (e.g., hundreds of observations for tight p and high \gamma); additionally, while computational demands for solving the binomial sums were intensive historically, they are manageable today but can still pose challenges for extremely large n. The historical development of non-parametric tolerance intervals traces back to foundational work by Wilks in the early 1940s, who introduced order-statistic-based limits for small samples, with significant expansions in subsequent decades through contributions like Walsh's tables and approximations, which facilitated practical application across various sample sizes and coverage levels.
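The binomial condition can be evaluated directly. As a rough sketch (Python with SciPy; function names are illustrative), the following computes the confidence attained by a trimmed interval and recovers the classic result that 93 observations are needed before the sample range [X_{(1)}, X_{(n)}] itself serves as a 95%/95% distribution-free tolerance interval:

```python
from scipy import stats

def two_sided_conf(n, p, r, s):
    """Confidence that [X_(r+1), X_(n-s)] covers at least proportion p.

    Coverage is Beta(n - r - s - 1, r + s + 2) distributed, so
    P(coverage >= p) = P(Bin(n, p) <= n - r - s - 2).
    """
    return stats.binom.cdf(n - r - s - 2, n, p)

def min_n_full_range(p, gamma):
    """Smallest n for which the full range (r = s = 0) attains gamma."""
    n = 2
    while two_sided_conf(n, p, 0, 0) < gamma:
        n += 1
    return n

print(min_n_full_range(0.95, 0.95))  # 93
```

This illustrates the sample-size penalty of the distribution-free approach: a parametric normal interval needs far fewer observations for the same p and \gamma.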

Comparisons

With Confidence Intervals

A confidence interval (CI) is a statistical range that bounds an unknown population parameter, such as the mean or variance, with a specified confidence level of 1 - \alpha, indicating the probability that the interval contains the true value based on the sample. In contrast, a tolerance interval (TI) aims to enclose a specified proportion p of the population with confidence 1 - \alpha, focusing on the spread of individual values rather than a single parameter, which makes TIs generally wider than CIs because they incorporate both sampling variability and population spread. A key mathematical relation between the two is that a TI can be interpreted as a confidence interval applied to a quantile of the underlying distribution, where the TI bounds ensure coverage of the proportion p around that quantile with the desired confidence. For instance, a 95% CI for the mean of a product's strength might span 100 to 120 units, estimating the central parameter, whereas a 95% TI covering 95% of the values at 95% confidence could extend more broadly, such as 80 to 140 units, to account for the full range of variability in individual measurements. When the population parameters, such as the mean and standard deviation, are fully known, a TI simplifies to the exact fixed percentiles of the distribution (e.g., the (1-p)/2 and (1+p)/2 quantiles for a two-sided interval), as there is no sampling uncertainty to incorporate.
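The width difference can be seen numerically. The sketch below (Python with SciPy, simulated data, and Howe's approximate two-sided factor; the seed and distribution parameters are arbitrary choices for illustration) contrasts a 95% CI for the mean with a 95%/95% TI built from the same sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
y = rng.normal(110, 10, size=25)        # simulated strength measurements
n, ybar, s = len(y), y.mean(), y.std(ddof=1)

# 95% confidence interval for the mean: bounds a parameter
tcrit = stats.t.ppf(0.975, n - 1)
ci = (ybar - tcrit * s / np.sqrt(n), ybar + tcrit * s / np.sqrt(n))

# 95%/95% tolerance interval (Howe's factor): bounds 95% of individual values
z = stats.norm.ppf(0.975)
k = z * np.sqrt((n - 1) * (1 + 1 / n) / stats.chi2.ppf(0.05, n - 1))
ti = (ybar - k * s, ybar + k * s)

print(ci, ti)  # the TI is several times wider than the CI
```

The CI half-width shrinks like s/\sqrt{n}, while the TI half-width ks approaches the fixed value z_{(1+p)/2}\,\sigma, so the gap between them grows with n.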

With Prediction Intervals

A prediction interval (PI) provides an interval that, with confidence level 1 - \alpha, is expected to contain the value of a single future observation drawn from the same population, incorporating both the uncertainty in estimating the population parameters from the sample and the random variability of the individual observation itself. Unlike a tolerance interval (TI), which aims to encompass a specified proportion p of the entire population with confidence 1 - \alpha, a PI is tailored to bound just one additional data point, making it narrower than a comparable TI because the latter must account for the spread across the population rather than a solitary instance. The two intervals coincide under limiting conditions: when population parameters are known with certainty, a TI designed to cover proportion p = 1 - \alpha with full confidence coincides exactly with a PI at the same confidence level, as both reduce to the deterministic bounds \mu \pm z_{1 - \alpha/2} \sigma, where z is the standard normal quantile, \mu the mean, and \sigma the standard deviation. In terms of construction for normally distributed data, the formulas highlight their distinct statistical foundations. A PI is typically computed as \bar{y} \pm t_{\alpha/2, n-1} s \sqrt{1 + 1/n}, where \bar{y} is the sample mean, s the sample standard deviation, t_{\alpha/2, n-1} the critical value from the t-distribution with n-1 degrees of freedom, and the term under the square root captures both parameter estimation error and observational variance. By contrast, a TI employs \bar{y} \pm k s, where the tolerance factor k is derived from the noncentral t-distribution or chi-squared distribution to ensure the required population coverage and confidence, resulting in a more complex adjustment for the joint uncertainty in mean and variance estimates across the population proportion.
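The two formulas can be computed side by side. This sketch (Python with SciPy; the summary statistics are illustrative values, and Howe's approximation stands in for the exact two-sided factor) evaluates both half-widths for the same sample:

```python
import numpy as np
from scipy import stats

n, ybar, s = 20, 9.97, 0.095  # illustrative summary statistics

# 95% prediction interval half-width for one future observation
tcrit = stats.t.ppf(0.975, n - 1)
pi_half = tcrit * s * np.sqrt(1 + 1 / n)

# 95%/95% tolerance interval half-width (Howe's two-sided factor)
z = stats.norm.ppf(0.975)
k = z * np.sqrt((n - 1) * (1 + 1 / n) / stats.chi2.ppf(0.05, n - 1))
ti_half = k * s

print(round(pi_half, 3), round(ti_half, 3))  # ≈ 0.204 vs ≈ 0.261
```

The TI half-width exceeds the PI half-width here because covering 95% of the population with 95% confidence is a stronger demand than covering a single future point.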
Prediction intervals find application in scenarios requiring forecasts for individual outcomes, such as estimating the performance of a single future unit in regression-based predictions. Tolerance intervals, however, are better suited to batch quality control, where the goal is to verify that a large proportion of produced items (such as 99% of a run) falls within acceptable limits with high confidence, ensuring overall process reliability.

Applications

Example 1: Parametric Two-Sided Tolerance Interval for Machine Part Diameters

Consider a sample of 20 machine part diameters measured in millimeters, assumed to follow a normal distribution: 9.72, 9.88, 10.05, 9.95, 10.12, 9.81, 10.03, 9.94, 10.08, 9.89, 10.01, 9.96, 10.07, 9.92, 10.04, 9.98, 10.02, 9.93, 10.00, 9.97. The sample mean \bar{y} = 9.97 and sample standard deviation s = 0.095. For a two-sided tolerance interval covering at least 95% of the population (p = 0.95) with 95% confidence (\gamma = 0.95), the interval is given by \bar{y} \pm k s, where k is the tolerance factor obtained from standard tables for the normal distribution. For n = 20, k = 2.752. Thus, the lower limit L = 9.97 - 2.752 \times 0.095 \approx 9.71 and upper limit U = 9.97 + 2.752 \times 0.095 \approx 10.23, yielding the interval [9.71, 10.23]. This means there is 95% confidence that at least 95% of future diameters will fall within this range.
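The calculation can be reproduced programmatically. This sketch (Python with SciPy) recomputes the tolerance factor with Howe's approximation, which agrees with the tabled k = 2.752 to three decimals:

```python
import numpy as np
from scipy import stats

d = np.array([9.72, 9.88, 10.05, 9.95, 10.12, 9.81, 10.03, 9.94, 10.08, 9.89,
              10.01, 9.96, 10.07, 9.92, 10.04, 9.98, 10.02, 9.93, 10.00, 9.97])
n, ybar, s = len(d), d.mean(), d.std(ddof=1)

# Two-sided 95%/95% factor via Howe's approximation
z = stats.norm.ppf((1 + 0.95) / 2)
k = z * np.sqrt((n - 1) * (1 + 1 / n) / stats.chi2.ppf(1 - 0.95, n - 1))

lo, hi = ybar - k * s, ybar + k * s
print(round(lo, 2), round(hi, 2))  # 9.71 10.23
```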

Example 2: One-Sided Non-Parametric Upper Tolerance Interval for Environmental Contaminant Levels

Consider a sample of 30 contaminant levels measured in parts per million (ppm) from environmental samples, without assuming a specific distribution: 1.12, 1.25, 1.18, 1.34, 1.21, 1.45, 1.29, 1.52, 1.37, 1.61, 1.43, 1.68, 1.49, 1.73, 1.55, 1.79, 1.62, 1.84, 1.67, 1.91, 1.72, 1.96, 1.78, 2.02, 1.83, 2.08, 1.89, 2.14, 1.94, 2.20 (listed unsorted; the maximum is 2.20). For a one-sided upper tolerance interval covering at least 90% of the population (p = 0.90) with 95% confidence (\gamma = 0.95), the interval is (-\infty, X_{(r)}], where X_{(r)} is the r-th order statistic and r is the smallest integer such that P(\text{Bin}(n, p) \leq r - 1) \geq \gamma, since the population fraction below X_{(r)} follows a Beta(r, n - r + 1) distribution. For n = 30 and p = 0.90, only r = 30 satisfies the condition: the sample maximum attains confidence 1 - 0.90^{30} \approx 0.958 \geq 0.95, whereas r = 29 yields only about 0.816. The interval is therefore (-\infty, 2.20], using the largest observation X_{(30)} = 2.20, guaranteeing with 95% confidence that at least 90% of future contaminant levels will fall below 2.20 ppm. Note that no sample of 30 can support \gamma = 0.99 at p = 0.90; even the sample maximum would require n \geq 44, since 1 - 0.90^{44} \approx 0.990.
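As a check on the order-statistic criterion, the following sketch (Python with SciPy) computes the exact confidence attained by each candidate upper bound X_{(r)} for n = 30 and p = 0.90, along with the sample size a 99% confidence requirement would demand:

```python
from scipy import stats

n, p = 30, 0.90

# Confidence that (-inf, X_(r)] covers at least p:
# the population fraction below X_(r) is Beta(r, n - r + 1) distributed,
# so P(coverage >= p) = P(Bin(n, p) <= r - 1).
conf = {r: stats.binom.cdf(r - 1, n, p) for r in range(1, n + 1)}

print(round(conf[30], 3))  # 0.958 -- the sample maximum gives ~95.8%
print(round(conf[29], 3))  # 0.816 -- one order statistic lower misses 0.95

# Smallest n at which even the maximum reaches 99% confidence: 1 - p**m >= 0.99
print(min(m for m in range(1, 200) if 1 - p**m >= 0.99))  # 44
```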

Interpretation of Results

Post-calculation verification of coverage and confidence can be performed via simulation. For the parametric example, repeatedly generate samples of size 20 from a normal distribution with known parameters, compute the tolerance interval from each sample, and evaluate each interval's true coverage under the known distribution; the fraction of intervals covering at least 95% of the population should be close to the nominal 95% confidence. For the non-parametric example, the same check applies to the order-statistic bound: simulate samples from any convenient continuous distribution, record whether the chosen order statistic exceeds the true p-quantile, and confirm this happens in at least the nominal fraction of replications. These procedures empirically validate the interval's properties beyond the theoretical construction.
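A minimal version of the parametric check might look like this (Python with SciPy; the known parameters, seed, and replication count are arbitrary choices for illustration, and Howe's factor is an approximation):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, p, gamma = 10.0, 0.1, 20, 0.95, 0.95

z = stats.norm.ppf((1 + p) / 2)
k = z * np.sqrt((n - 1) * (1 + 1 / n) / stats.chi2.ppf(1 - gamma, n - 1))

reps, hits = 2000, 0
for _ in range(reps):
    y = rng.normal(mu, sigma, n)
    lo, hi = y.mean() - k * y.std(ddof=1), y.mean() + k * y.std(ddof=1)
    # true coverage of this interval under the known N(mu, sigma^2)
    cov = stats.norm.cdf(hi, mu, sigma) - stats.norm.cdf(lo, mu, sigma)
    hits += cov >= p

print(hits / reps)  # should be close to the nominal gamma = 0.95
```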

Sensitivity Analysis: Effect of Sample Size on Interval Width

Increasing the sample size n reduces the tolerance factor k in parametric methods, narrowing the interval width 2ks for fixed s. The following table illustrates this for two-sided normal tolerance intervals with p = 0.95 and \gamma = 0.95:
Sample Size n    Tolerance Factor k    Width Factor 2k (Relative to s)
10               3.379                 6.758
20               2.752                 5.504
30               2.549                 5.098
50               2.379                 4.758
For non-parametric methods, larger n allows the chosen order statistics to sit closer to the target quantiles, providing a more precise estimate of the p-quantile and effectively tightening the bound relative to the variability seen in smaller samples.
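The tabled factors can be regenerated with Howe's approximation (a short sketch in Python with SciPy; agreement with exact tabled values is within a few thousandths):

```python
import numpy as np
from scipy import stats

def k_howe(n, p=0.95, gamma=0.95):
    """Two-sided normal tolerance factor, Howe's approximation."""
    z = stats.norm.ppf((1 + p) / 2)
    return z * np.sqrt((n - 1) * (1 + 1 / n) / stats.chi2.ppf(1 - gamma, n - 1))

for n in (10, 20, 30, 50):
    print(n, round(k_howe(n), 3))
```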

Practical Uses

In manufacturing and engineering, tolerance intervals are employed to specify product tolerances, ensuring that a high proportion of manufactured items meet specifications. For instance, in automotive manufacturing, they are used to monitor valve lash measurements, verifying that 95% of components fall within acceptable limits with 95% confidence to maintain performance and reduce defects. In aerospace applications, tolerance intervals estimate material allowables and limiting capabilities, bounding a large proportion of the strength distribution to support reliable component design and testing. These applications help in process capability analysis, where intervals guide decisions on acceptability and rework, minimizing production costs while upholding safety standards. In environmental regulation and monitoring, tolerance intervals establish reference thresholds for assessing compliance with standards. They are applied to set upper bounds for benthic community indices or water quality measures, capturing at least 95% of background data with specified confidence to detect excursions beyond natural variability. Such intervals aid in decision-making for site remediation or regulatory enforcement, providing a statistical basis for distinguishing natural fluctuations from genuine impacts. In the pharmaceutical industry, tolerance intervals serve as batch release criteria for drug potency and other critical quality attributes. They define specification limits that contain at least a specified proportion of future batches with high confidence, ensuring product consistency and patient safety during manufacturing validation. For example, two-sided intervals are calculated from historical lot data to support stability assessments and regulatory submissions. Modern extensions of tolerance intervals include multivariate versions used in machine learning for anomaly detection in high-dimensional datasets, such as identifying outliers in sensor networks or process streams.
Software integration has facilitated broader adoption; the R package 'tolerance' provides functions for univariate and regression tolerance intervals across various distributions, enabling practitioners to compute limits for quality control and environmental applications. In Python, libraries such as statsmodels and toleranceinterval support similar computations, allowing seamless incorporation into data analysis pipelines. Advancements in non-parametric methods in recent decades have addressed gaps in handling non-normal data, improving interpolated and extrapolated order statistics for more robust intervals in the skewed distributions common in real-world data. These developments enhance applicability in fields like environmental regulation, where distributional assumptions often fail.
