Ancillary statistic

In statistics, an ancillary statistic is a function of the sample data whose distribution does not depend on the unknown parameters of the model, meaning its distribution remains the same across all possible parameter values. The concept, introduced by Ronald A. Fisher in his 1925 paper "Theory of Statistical Estimation," identifies aspects of the data that provide no direct information about the parameters but can still influence inference when combined with other statistics. Ancillary statistics play a crucial role in data reduction and conditional inference, enabling the separation of parameter-free variability from parameter-dependent information.

One of the most significant theoretical results involving ancillary statistics is Basu's theorem, proved by Debabrata Basu in 1955, which states that any boundedly complete sufficient statistic is independent of every ancillary statistic. This independence result is pivotal for establishing distributional properties without computing full joint distributions. For example, in a normal model with known variance, the sample mean is a complete sufficient statistic for the mean parameter and is independent of the ancillary sample variance; and in a sample from a uniform distribution on (\theta, \theta + 1), the range (maximum minus minimum) is an ancillary statistic, as its distribution is free of \theta. Ancillary statistics also arise in location-scale families, where studentized residuals, which are normalized to remove the location and scale parameters, have parameter-free distributions, facilitating robust inference procedures. Their utility extends to recovering information lost when the data are reduced to a statistic that is not sufficient, a principle Fisher emphasized for improving inference in complex models. Despite carrying no standalone information about the parameters, ancillary statistics ensure that conditional distributions remain relevant for the observed data, underpinning many modern frequentist methods.

Definition and Properties

Definition

In statistics, the concept of an ancillary statistic was introduced by Ronald A. Fisher in the 1920s within his framework for conditional inference and estimation. Fisher used the term to describe quantities derived from the sample that support inference without depending on the parameters. The idea was later developed by Debabrata Basu in 1964, who emphasized its role in recovering information lost in data reduction. An ancillary statistic is a function of the observed data whose sampling distribution remains invariant to the unknown parameter θ of the underlying model. This parameter-free behavior allows ancillary statistics to serve as a foundational tool in conditional approaches to inference, where the focus shifts to aspects of the data unrelated to θ.

Formally, consider a random sample X drawn from a distribution parameterized by θ. A statistic A(X) is ancillary if its distribution function satisfies P(A(X) \leq a \mid \theta) = P(A(X) \leq a) for all values of θ in the parameter space and all a in the support of A(X). This condition ensures that the sampling variability of A(X) is identical regardless of the true parameter value. In contrast to sufficient statistics, which encapsulate all information about θ present in the sample, ancillary statistics convey none about the parameter.
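
As an informal check of this definition, the following sketch (Python with NumPy; the sample size, parameter values, and the function name empirical_range_quantiles are arbitrary choices for illustration, not from any cited source) simulates the sample range X_{(n)} - X_{(1)} in the normal location model N(\theta, 1) for two values of \theta. Because the range is unchanged when all observations are shifted by the same amount, its simulated distribution should agree across parameter values up to Monte Carlo error.

    import numpy as np

    rng = np.random.default_rng(0)

    def empirical_range_quantiles(theta, n=10, reps=100_000):
        """Simulate the range max(X) - min(X) for X_1, ..., X_n iid N(theta, 1)
        and return a few empirical quantiles of its sampling distribution."""
        samples = rng.normal(loc=theta, scale=1.0, size=(reps, n))
        ranges = samples.max(axis=1) - samples.min(axis=1)
        return np.quantile(ranges, [0.1, 0.5, 0.9])

    # The quantiles are essentially unchanged across theta, illustrating that
    # the range is ancillary in this location model.
    print(empirical_range_quantiles(theta=0.0))
    print(empirical_range_quantiles(theta=5.0))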

Properties

An ancillary statistic A is defined by the property that its distribution does not depend on the unknown parameter \theta: the probability measure P_\theta(A \in \cdot) is the same for all \theta \in \Theta. This parameter independence implies that the marginal distribution of A remains unchanged across the parameter space, so A alone conveys no direct evidence about \theta. The property extends to invariance under reparameterization: if A is ancillary for \theta, then A is also ancillary for any one-to-one transformation g(\theta), since the distribution of A still does not depend on the transformed parameter. Ancillary statistics are non-informative regarding \theta, contributing zero Fisher information, because their likelihood does not vary with \theta and the expected value of the score function based on A is zero. Ancillary statistics are not unique; any measurable function h(A) of an ancillary statistic A is also ancillary, since the distribution of h(A) is determined entirely by the parameter-free distribution of A. Expressed in measure-theoretic terms, every event in the \sigma-algebra generated by an ancillary statistic has the same probability under every parameter value.
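
The closure of ancillarity under transformations can be illustrated in the same spirit. The sketch below (Python with NumPy; the choice of h(A) = A^2, the sample size, and the parameter values are hypothetical illustrations) compares the simulated distribution of a function of the range of a uniform(\theta, \theta + 1) sample across two values of \theta.

    import numpy as np

    rng = np.random.default_rng(1)

    def squared_range_quantiles(theta, n=5, reps=100_000):
        """Empirical quantiles of h(A) = A**2, where A = max - min of a
        uniform(theta, theta + 1) sample; A is ancillary for theta."""
        x = rng.uniform(theta, theta + 1.0, size=(reps, n))
        a = x.max(axis=1) - x.min(axis=1)
        return np.quantile(a ** 2, [0.25, 0.5, 0.75])

    # The quantiles agree up to simulation noise for widely separated theta
    # values, consistent with functions of ancillaries being ancillary.
    print(squared_range_quantiles(theta=0.0))
    print(squared_range_quantiles(theta=10.0))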

Examples

Normal Distribution

For independent and identically distributed (i.i.d.) samples from a normal distribution N(\mu, 1) with known variance \sigma^2 = 1, the sample variance is a classic example of an ancillary statistic for the mean \mu. Specifically, the sample variance S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2, where \bar{X} is the sample mean, has a distribution that does not depend on \mu. This arises because the residuals X_i - \bar{X} are translation-invariant: shifting every observation by \mu cancels out in the differences. The normalized form (n-1) S^2 / \sigma^2 follows a chi-squared distribution with n-1 degrees of freedom, \chi^2_{n-1}, which is free of \mu. This result stems from the joint normality of the residuals, whose sum of squared standardized deviations follows the chi-squared law regardless of the mean. In contrast, the sample mean \bar{X} is not ancillary, as its N(\mu, 1/n) distribution depends explicitly on \mu, providing information about the location parameter rather than being parameter-free.

For the full normal model N(\mu, \sigma^2) with both location and scale parameters unknown, a higher-dimensional ancillary statistic emerges in the form of the configuration statistic, defined as the vector of normalized residuals \mathbf{U} = \left( \frac{X_1 - \bar{X}}{S}, \dots, \frac{X_n - \bar{X}}{S} \right). This vector captures the shape or configuration of the sample, and its joint distribution is independent of both \mu and \sigma^2, since standardization removes scale and centering eliminates location effects. The configuration statistic thus provides a parameter-free summary of the data's relative positions, useful for illustrating ancillarity in location-scale settings.
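
A quick numerical check of the known-variance example (a sketch in Python with NumPy and SciPy; the sample size, mean values, and function name are illustrative assumptions) compares the simulated distribution of (n-1) S^2 for two values of \mu against the \chi^2_{n-1} distribution.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n, reps = 8, 200_000

    def scaled_sample_variance(mu):
        """Return (n - 1) * S^2 for many samples of size n from N(mu, 1)."""
        x = rng.normal(loc=mu, scale=1.0, size=(reps, n))
        return (n - 1) * x.var(axis=1, ddof=1)

    for mu in (0.0, 7.0):
        v = scaled_sample_variance(mu)
        # Empirical mean and variance should be near the chi-squared(n-1)
        # values (mean 7, variance 14) regardless of mu, and the KS statistic
        # against chi2(n-1) should be small for both mu values.
        print(mu, v.mean(), v.var(), stats.kstest(v, "chi2", args=(n - 1,)).statistic)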

Location-Scale Families

In location families, where the density has the form f(x \mid \theta) = \psi(x - \theta) for \theta \in \mathbb{R}, ancillary statistics arise from location-invariant functions of the data. For instance, differences between observations, such as X_i - X_j or the sample range X_{(n)} - X_{(1)}, have distributions that do not depend on \theta, because shifting all observations by a constant leaves these differences unchanged. A concrete example is the uniform distribution on [\theta - 1/2, \theta + 1/2], a location family: the range X_{(n)} - X_{(1)} is ancillary for \theta, as its distribution does not depend on the location parameter.

In scale families, with densities f(x \mid \theta) = \frac{1}{\theta} \psi\left( \frac{x}{\theta} \right) for \theta > 0, ancillary statistics are scale-invariant, such as the ratios \frac{X_i}{X_j} or \frac{|X_i|}{\sum_j |X_j|}, whose distributions are free of \theta because multiplying all observations by a positive constant preserves the ratios.

For location-scale families, densities take the form f(x \mid \mu, \sigma) = \frac{1}{\sigma} \psi\left( \frac{x - \mu}{\sigma} \right) with \mu \in \mathbb{R} and \sigma > 0, and ancillary statistics are those invariant under affine transformations x \mapsto a x + b with a > 0. Examples include the studentized residuals \frac{X_i - \bar{X}}{S}, where \bar{X} is the sample mean and S is the sample standard deviation; the joint distribution of these residuals does not depend on \mu or \sigma. More generally, the vector of standardized observations T = \frac{X - \mu}{\sigma} has a parameter-free distribution (though it is not a statistic, since it involves the unknown parameters), and data-based analogues such as \frac{X - \bar{X}}{S} serve as ancillary statistics capturing the same invariance. Such statistics standardize away the unknown parameters, facilitating conditional inference in location-scale settings.
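
The parameter-free behavior of studentized residuals can also be checked by simulation. The following sketch (Python with NumPy; the normal model, sample size, and parameter pairs are arbitrary illustrative choices) compares quantiles of (X_1 - \bar{X}) / S across very different location and scale values.

    import numpy as np

    rng = np.random.default_rng(3)

    def studentized_first_residual(mu, sigma, n=6, reps=100_000):
        """Empirical quantiles of (X_1 - Xbar) / S for N(mu, sigma^2) samples;
        this studentized residual is ancillary in the location-scale family."""
        x = rng.normal(mu, sigma, size=(reps, n))
        xbar = x.mean(axis=1)
        s = x.std(axis=1, ddof=1)
        return np.quantile((x[:, 0] - xbar) / s, [0.1, 0.5, 0.9])

    # The quantiles agree (up to Monte Carlo error) despite very different
    # location and scale parameters, illustrating the parameter-free distribution.
    print(studentized_first_residual(mu=0.0, sigma=1.0))
    print(studentized_first_residual(mu=100.0, sigma=25.0))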

Applications

Basu's Theorem

Basu's theorem states that if T is a boundedly complete sufficient statistic for a parameter \theta and A is an ancillary statistic, then T and A are stochastically independent for every value of \theta. This result, proved by Debabrata Basu in 1955, builds on Ronald Fisher's earlier introduction of the concepts of sufficiency and ancillarity in the 1920s. The proof relies on the bounded completeness of T: for any bounded function f of A, the conditional expectation E[f(A) \mid T] does not depend on \theta (by sufficiency), and since E_\theta\left[ E[f(A) \mid T] - E[f(A)] \right] = 0 for all \theta, bounded completeness forces E[f(A) \mid T] = E[f(A)] almost surely. This equality implies that the joint distribution factors as P(T \leq t, A \leq a) = P(T \leq t) P(A \leq a) for all t and a, establishing independence. A key implication of the theorem is that it facilitates exact conditional inference by separating the parameter-dependent component captured by the sufficient statistic from the ancillary component, which provides no information about \theta but can be used to refine inference without introducing bias. This separation aligns with the conditionality principle, allowing inference to be based on the conditional distribution of T given A, thereby achieving uniformity across ancillary values.
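
The independence asserted by the theorem can be illustrated numerically in the known-variance normal model, where \bar{X} is boundedly complete sufficient for \mu and S^2 is ancillary. The sketch below (Python with NumPy; the parameter value, sample size, and the particular transformations are illustrative assumptions) estimates correlations that should be near zero under independence.

    import numpy as np

    rng = np.random.default_rng(4)

    n, reps, mu = 10, 200_000, 2.5
    x = rng.normal(mu, 1.0, size=(reps, n))
    xbar = x.mean(axis=1)          # complete sufficient statistic for mu (sigma known)
    s2 = x.var(axis=1, ddof=1)     # ancillary statistic in this model

    # Basu's theorem implies full independence, so correlations between the
    # statistics, and between nonlinear transformations of them, are near zero.
    print(np.corrcoef(xbar, s2)[0, 1])
    print(np.corrcoef(np.exp(xbar), np.log(s2))[0, 1])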

Recovery of Information

One key application of ancillary statistics arises in recovering information lost when using a non-sufficient statistic, by pairing it with an appropriate ancillary complement so that the conditional distribution given the ancillary carries the full information. This approach exploits the fact that the pair formed by the non-sufficient statistic and its ancillary complement can be minimal sufficient, so that conditioning on the ancillary restores the information the reduction had discarded. A classic example illustrates this recovery: consider two independent and identically distributed observations X_1, X_2 \sim N(\theta, 1), where \theta is the unknown mean. The single observation X_1 is not sufficient for \theta and carries Fisher information 1, but the difference D = X_1 - X_2 is ancillary with distribution N(0, 2), independent of \theta. Conditioning on D = d recovers the full information from both observations, increasing the Fisher information to 2. The conditional distribution is X_1 \mid (X_1 - X_2 = d) \sim N\left(\theta + \frac{d}{2}, \frac{1}{2}\right), whose variance of 1/2 matches the precision of the full two-observation sample.

In conditional inference, ancillaries are used to condition out parameter-free variation, enhancing the exactness of tests and intervals by focusing on the relevant data distribution. This method can improve performance over unconditional approaches, particularly in small samples, by eliminating ancillary-induced variability. Basu's theorem facilitates this by establishing independence between sufficient and ancillary statistics in certain models, enabling such conditioning without information loss (detailed in the Basu's Theorem section). An advanced application appears in constructing prediction intervals, where ancillaries standardize future observations into parameter-free pivots. For instance, in normal models, conditioning on residuals or differences creates pivotal quantities whose distributions do not depend on unknown parameters, yielding exact predictive distributions that account for both estimation and prediction uncertainty.
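
The conditional-distribution claim can be verified by simulation: draw many pairs (X_1, X_2), retain those whose difference falls close to a chosen value d, and compare the conditional mean and variance of X_1 with \theta + d/2 and 1/2. The sketch below (Python with NumPy; \theta, d, and the tolerance are arbitrary illustrative values) does exactly this.

    import numpy as np

    rng = np.random.default_rng(5)

    theta, d, tol, reps = 1.0, 0.8, 0.02, 2_000_000
    x1 = rng.normal(theta, 1.0, size=reps)
    x2 = rng.normal(theta, 1.0, size=reps)

    # Condition (approximately) on the ancillary difference D = X1 - X2 = d.
    keep = np.abs((x1 - x2) - d) < tol
    cond = x1[keep]

    # The conditional mean should be near theta + d/2 = 1.4 and the
    # conditional variance near 1/2, matching N(theta + d/2, 1/2).
    print(cond.mean(), cond.var())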

Ancillary Complement

In statistical inference, an ancillary complement to a statistic T is an ancillary statistic U, whose distribution does not depend on the unknown parameter \theta, such that the joint statistic (T, U) is sufficient for \theta even when T alone is not. The concept traces back to Fisher's work on recovering ancillary information and allows information lost in a non-sufficient reduction of the data to be recovered by incorporating the ancillary component. Ancillary complements are not always unique; multiple such U may exist for a given T, and their selection depends on the structure of the statistical model. For instance, in models derived from exponential families, the existence and form of ancillary complements often relate to the dimensionality of the parameter space and the availability of maximal ancillaries, which can be used to construct optimal conditional inferences.

A classic example arises in the context of binomial trials, analogous to estimating a baseball player's batting success probability p, where X denotes the number of hits observed in N at-bats. Here N is ancillary because its distribution does not depend on p (for example, it is fixed by the game schedule or the design of the study). The proportion X/N is not sufficient for p, as it loses information about the number of trials, but the joint statistic (X, N) is minimal sufficient. Conditional on N = n, X \sim \operatorname{Bin}(n, p), and the pair (X, N) fully captures the information about p in the data. This mechanism underpins the recovery of ancillary information in more general settings, where conditioning on the complement sharpens conditional inference.
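
The role of the ancillary complement N can be seen in a small simulation (Python with NumPy; the distribution placed on the number of at-bats is a made-up schedule that does not depend on p, and all numerical values are illustrative). Unconditionally, the spread of X/N mixes over the possible values of N, whereas conditioning on the observed N gives the exact \operatorname{Bin}(n, p) model on which inference about p should be based.

    import numpy as np

    rng = np.random.default_rng(6)

    p, reps = 0.3, 200_000
    # Hypothetical schedule: N is random, but its distribution is free of p,
    # so N is ancillary for p.
    n_values = np.array([10, 50, 250])
    N = rng.choice(n_values, size=reps)
    X = rng.binomial(N, p)
    phat = X / N

    # Unconditional spread of X/N mixes over N ...
    print("overall sd of X/N:", phat.std())
    # ... while conditioning on the observed N gives sd close to
    # sqrt(p * (1 - p) / n), the quantity relevant for the data at hand.
    for n in n_values:
        print(n, phat[N == n].std(), np.sqrt(p * (1 - p) / n))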

Relation to Sufficiency

Ancillary statistics were introduced by Ronald A. Fisher in 1925 as part of his foundational work on statistical estimation, where he motivated their role in achieving exact inference by conditioning on relevant subsets of the sample space, tying them directly to the concept of sufficiency so as to avoid irrelevant variation in tests and estimates. Fisher argued that ancillaries, whose distributions do not depend on the parameters, complement sufficient statistics in defining these subsets, enabling precise likelihood-based inference without approximation. In terms of decomposition, the data can often be factored into a component that captures all the information about the parameter and an ancillary component that carries none, extending the factorization theorem; specifically, for an estimator T that is not by itself sufficient (such as the maximum likelihood estimator), there often exists an ancillary statistic U such that the pair (T, U) is minimal sufficient, allowing the full data to be recovered up to the ancillary variation. This decomposition highlights how sufficiency reduces the data dimension while preserving inferential content, whereas ancillarity retains the structural variation in the data for conditional inference, ensuring that inferences are tailored to the observed configuration without introducing parameter-dependent bias.

Ancillaries are particularly useful when paired with complete sufficient statistics, as this combination facilitates the construction of unbiased estimators through independence properties, such as those established by Basu's theorem, where the ancillary is independent of the complete sufficient statistic. In this framework, conditioning on the ancillary refines unbiased estimation by eliminating extraneous variability, leading to uniformly minimum variance unbiased estimators in many cases. In Bayesian inference, ancillary statistics contribute to the development of reference priors by ensuring conditional posterior independence from the ancillary given the sufficient statistic, which helps in deriving objective priors that maximize expected information while respecting the model's ancillary structure. This application underscores the modern utility of ancillaries in non-informative Bayesian inference, bridging frequentist conditioning with posterior computation.
