
Sample mean and covariance

In statistics, the sample mean and sample covariance are key estimators derived from a sample of observations to approximate the corresponding population parameters. The sample mean, denoted \bar{x}, for a univariate sample of size n consisting of observations x_1, x_2, \dots, x_n, is calculated as the arithmetic average \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i, providing an unbiased estimate of the population mean \mu. For multivariate data with p variables observed across n samples, this extends to the sample mean vector \bar{\mathbf{x}} = \frac{1}{n} \sum_{i=1}^n \mathbf{x}_i, where each component is the sample mean of that variable's observations, again serving as an unbiased estimator of the population mean vector. The sample covariance, which measures the linear relationship and joint variability between variables in the sample, is defined for two variables X_j and X_k as s_{jk} = \frac{1}{n-1} \sum_{i=1}^n (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k), where the division by n-1 ensures it is an unbiased estimator of the population covariance \sigma_{jk}. In the multivariate case, the sample covariance matrix \mathbf{S} is given by \mathbf{S} = \frac{1}{n-1} \sum_{i=1}^n (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^\top, a symmetric matrix whose diagonal elements are the sample variances and off-diagonal elements are the sample covariances; it unbiasedly estimates the population covariance matrix and is positive definite when n > p, provided the centered observations span all p dimensions. These measures are central to descriptive statistics, statistical inference, and multivariate analysis, such as in principal component analysis and hypothesis testing, as they capture location and dependence structure from data.

Definitions

Sample mean

The sample mean serves as a fundamental point estimator for the population mean in inferential statistics, providing a measure of central tendency based on a sample of observations drawn from a larger population. For a univariate random sample X_1, X_2, \dots, X_n of size n from a population, the sample mean, denoted \bar{X}, is defined as the arithmetic average of these observations: \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i. This formula represents the total sum of the sample values divided by the number of observations. To compute the sample mean, first sum all the individual data points in the sample, then divide this total by the sample size n. In descriptive statistics, the sample mean is equivalent to the arithmetic average, serving as the most straightforward summary of a dataset's central location.
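
As a minimal numerical sketch (the data values below are arbitrary and NumPy is assumed to be available), the definition can be evaluated directly and checked against the library routine:

```python
import numpy as np

# Arbitrary illustrative sample
x = np.array([2.0, 4.0, 4.0, 5.0, 7.0])

n = x.size
xbar_manual = x.sum() / n   # (1/n) * sum of the observations
xbar_library = x.mean()     # NumPy's built-in sample mean

print(xbar_manual, xbar_library)  # both print 4.4
```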

Sample covariance

The sample covariance serves as an estimator for the population covariance, quantifying the linear dependence between two variables from a sample of paired observations. For a dataset consisting of n pairs (X_i, Y_i) where i = 1, \dots, n, the sample covariance s_{XY} is defined as s_{XY} = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y}), with \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i and \bar{Y} = \frac{1}{n} \sum_{i=1}^n Y_i denoting the sample means. The choice of n-1 in the denominator ensures that s_{XY} is an unbiased estimator of the corresponding population parameter. This statistic measures the directional co-variation between the variables: a positive s_{XY} indicates that deviations from the means tend to occur in the same direction (e.g., both above or both below their means), suggesting a positive linear relationship, whereas a negative value points to opposite directional deviations. The magnitude reflects the extent of this co-movement, though it is scale-dependent on the units of measurement. For illustration, consider paired data on heights (in inches) and weights (in pounds) from a sample of five individuals: heights = 63, 67, 69, 72, 73; weights = 170, 175, 177, 190, 184. The sample means are \bar{X} = 68.8 and \bar{Y} = 179.2. The sum of the products of centered values is 115.2, so the sample covariance is 115.2 / (5-1) = 28.8, a positive value consistent with the expected positive association between height and weight.
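
The height-weight calculation above can be reproduced with a short sketch (NumPy assumed; np.cov uses the same n-1 denominator by default):

```python
import numpy as np

heights = np.array([63.0, 67.0, 69.0, 72.0, 73.0])       # inches
weights = np.array([170.0, 175.0, 177.0, 190.0, 184.0])  # pounds

n = heights.size
xbar, ybar = heights.mean(), weights.mean()               # 68.8 and 179.2

# Sum of products of centered values, divided by n - 1
s_xy = ((heights - xbar) * (weights - ybar)).sum() / (n - 1)
print(s_xy)                                # 28.8

# Cross-check against NumPy's covariance matrix (denominator n - 1 by default)
print(np.cov(heights, weights)[0, 1])      # 28.8
```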

Properties

Unbiasedness

The sample mean \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i, where X_1, \dots, X_n are independent and identically distributed random variables with finite expected value \mu, is an unbiased estimator of the population mean \mu. This holds for any distribution with a finite mean, as the expected value satisfies E[\bar{X}] = \mu. The proof relies on the linearity of expectation, which does not require independence or identical distribution beyond the finite mean assumption: E[\bar{X}] = E\left[ \frac{1}{n} \sum_{i=1}^n X_i \right] = \frac{1}{n} \sum_{i=1}^n E[X_i] = \frac{1}{n} \cdot n \mu = \mu. In contrast, the covariance estimator defined using the population mean, \frac{1}{n} \sum_{i=1}^n (X_i - \mu)(X_i - \mu)^T, is unbiased for the population covariance matrix \Sigma, assuming finite second moments. However, the estimator \mathbf{S}_n = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X})^T, which substitutes the sample mean \bar{X} for \mu, introduces bias because \bar{X} is a random variable dependent on the data. Specifically, under the assumption of independent and identically distributed observations with finite second moments, the expected value is E[\mathbf{S}_n] = \frac{n-1}{n} \Sigma. To obtain an unbiased estimator, the formula is adjusted to use the denominator n-1: \mathbf{S} = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X})^T, yielding E[\mathbf{S}] = \Sigma. The proof expands each term and uses the fact that E[(X_i - \bar{X})(X_i - \bar{X})^T] = \frac{n-1}{n} \Sigma for every i, so summing over the n terms and dividing by n produces the overall factor of \frac{n-1}{n} before correction. This denominator adjustment, known as Bessel's correction, ensures unbiasedness for both the sample variance (the special case of a variable's covariance with itself) and the sample covariance; using n instead results in a downward bias by the factor \frac{n-1}{n}. The correction originated with Gauss, who applied it in astronomical data analysis as early as 1823, though it is named after Friedrich Bessel for his later independent use.
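
The factor \frac{n-1}{n} can be verified empirically with a small Monte Carlo sketch (an illustration only, assuming i.i.d. normal draws and using the variance as the scalar special case of the covariance):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, reps = 5, 4.0, 200_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
d = x - x.mean(axis=1, keepdims=True)        # deviations from each sample's mean
ss = (d ** 2).sum(axis=1)                    # sum of squared deviations per sample

print(np.mean(ss / n))        # divisor n:     close to (n-1)/n * sigma2 = 3.2
print(np.mean(ss / (n - 1)))  # divisor n - 1: close to sigma2 = 4.0
```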

Consistency

The sample mean \bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i is a consistent estimator of the population mean \mu, converging in probability to \mu as the sample size n \to \infty, provided the random variables X_i are independent and identically distributed (i.i.d.) with finite variance \sigma^2 < \infty. This fundamental result, known as the weak law of large numbers (WLLN), can be established using Chebyshev's inequality, which bounds the probability that |\bar{X}_n - \mu| exceeds any \epsilon > 0 by \sigma^2 / (n \epsilon^2), a quantity approaching zero as n grows. In the multivariate setting, the sample covariance matrix \mathbf{S}_n = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X}_n)(X_i - \bar{X}_n)^T (or the unbiased version with denominator n-1) is consistent for the population covariance matrix \Sigma, meaning \mathbf{S}_n \xrightarrow{p} \Sigma as n \to \infty, assuming the i.i.d. vectors X_i \in \mathbb{R}^p have finite fourth moments. This convergence follows from the consistency of the sample mean \bar{X}_n \xrightarrow{p} \mu combined with the law of large numbers applied to the averages of outer products that define the covariance matrix, ensuring that deviations in the centered terms vanish in probability. The finite fourth-moment condition arises because the variances of the elements of \mathbf{S}_n involve expectations of products up to fourth order, as required for a Chebyshev-type argument; weaker moment conditions suffice for consistency alone, but convergence can be slow for heavy-tailed distributions. Alternatively, the result can be derived via Slutsky's theorem applied to the decomposition \mathbf{S}_n = \frac{1}{n} \sum_{i=1}^n (X_i - \mu)(X_i - \mu)^T + o_p(1), where the first term converges by a multivariate WLLN and the remainder term accounts for centering. While consistency describes convergence in probability without specifying its speed, the central limit theorem refines this by showing that the estimation error shrinks at rate \sqrt{n}, with \sqrt{n} (\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2) under finite variance, so the errors are of order O_p(1/\sqrt{n}); analogous asymptotic normality holds for \mathbf{S}_n (with \sqrt{n} scaling for \mathrm{vec}(\mathbf{S}_n - \Sigma)), with full details in the sampling distributions section. Unbiasedness provides exact finite-sample expectation matching but does not guarantee consistency, though the sample mean and covariance satisfy both properties under the stated conditions. To illustrate empirical convergence, consider simulating n i.i.d. draws from a N(0,1) distribution: for small n=10, sample means vary widely around 0 with standard deviation approximately 0.32; as n increases to 100, the standard deviation drops to about 0.1; and by n=1000 it falls to about 0.03, so roughly 95% of simulated means lie within 0.06 of 0, visually tightening toward the population mean in histograms across 1000 replications. A similar pattern holds for the sample variance, which approaches 1 more reliably with larger n.
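
The simulation just described can be sketched as follows (illustrative only; standard normal draws, 1000 replications per sample size, and results that vary slightly with the random seed):

```python
import numpy as np

rng = np.random.default_rng(1)
reps = 1000

for n in (10, 100, 1000):
    # reps independent experiments, each a sample of size n from N(0, 1)
    means = rng.normal(0.0, 1.0, size=(reps, n)).mean(axis=1)
    se = 1.0 / np.sqrt(n)                        # theoretical standard error sigma / sqrt(n)
    within = np.mean(np.abs(means) < 2.0 * se)   # fraction of means within two standard errors of 0
    print(n, round(means.std(ddof=1), 3), round(within, 3))
```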

Sampling Distributions

Distribution of the sample mean

The distribution of the sample mean \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i, where X_1, \dots, X_n are independent and identically distributed (IID) random variables drawn from a population, depends on the underlying population distribution. Under the assumption that the population is normally distributed, specifically X_i \sim N(\mu, \sigma^2) for all i, the sample mean follows an exact normal distribution: \bar{X} \sim N(\mu, \sigma^2 / n). This result arises because the sum of normal random variables is itself normal, with mean n\mu and variance n\sigma^2, and dividing by n scales the mean to \mu and the variance to \sigma^2 / n. A key property of this distribution is its variance, \mathrm{Var}(\bar{X}) = \sigma^2 / n, which holds more generally for any IID population with finite variance \sigma^2 > 0, regardless of the population's shape. This variance decreases as the sample size n increases, reflecting the averaging effect that reduces sampling variability. For the normal population case, the standard deviation of \bar{X} is thus \sigma / \sqrt{n}, known as the standard error of the mean. When the population is not normal but the random variables are IID with finite mean \mu and finite variance \sigma^2 > 0, the exact distribution of \bar{X} is generally unknown. However, the central limit theorem provides an asymptotic approximation: as n \to \infty, the standardized sample mean \sqrt{n} (\bar{X} - \mu) / \sigma converges in distribution to a standard normal random variable, N(0, 1). Equivalently, \sqrt{n} (\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2). This theorem justifies using the normal distribution to approximate the sampling distribution of \bar{X} for large n, typically n \geq 30, even for non-normal populations. These distributional properties enable the construction of confidence intervals for the population mean \mu. If \sigma^2 is known, a (1 - \alpha) \times 100\% confidence interval is given by \bar{x} \pm z_{\alpha/2} \cdot (\sigma / \sqrt{n}), where z_{\alpha/2} is the upper \alpha/2 quantile of the standard normal distribution, exploiting the exact or asymptotic normality of \bar{X}. When \sigma^2 is unknown, the interval uses the sample standard deviation s and replaces z_{\alpha/2} with the t-distribution quantile t_{n-1, \alpha/2} with n-1 degrees of freedom: \bar{x} \pm t_{n-1, \alpha/2} \cdot (s / \sqrt{n}); this t-based interval is exact under population normality and approximate otherwise via the central limit theorem.
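
A t-based interval can be computed as in the following sketch (the data vector is hypothetical and SciPy is assumed to be available for the t quantile):

```python
import numpy as np
from scipy import stats

x = np.array([9.8, 10.2, 10.4, 9.9, 10.1, 10.6, 9.7, 10.3])  # hypothetical sample
n = x.size
xbar, s = x.mean(), x.std(ddof=1)

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # t_{n-1, alpha/2}
half_width = t_crit * s / np.sqrt(n)

print(xbar - half_width, xbar + half_width)     # 95% confidence interval for mu
```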

Distribution of the sample covariance

The distribution of the sample covariance matrix \mathbf{S} is a central topic in multivariate statistical inference, particularly when the observations \mathbf{X}_i, i=1,\dots,n, are independent and identically distributed from a p-variate normal distribution N_p(\boldsymbol{\mu}, \Sigma). In this case, the scaled sample covariance matrix (n-1)\mathbf{S} follows a Wishart distribution with n-1 degrees of freedom and scale matrix \Sigma, denoted (n-1)\mathbf{S} \sim W_p(\Sigma, n-1). This result, derived from the properties of quadratic forms in normal variables, provides the exact sampling distribution for inference on \Sigma. In the univariate setting, where p=1 and the data X_i \sim N(\mu, \sigma^2), the sample variance s^2 follows a scaled chi-squared distribution: s^2 \sim \frac{\sigma^2 \chi^2_{n-1}}{n-1}. This is a special case of the Wishart distribution, reducing to the gamma family for scalar variances, and underpins exact tests such as the F-test for equality of variances. For large sample sizes, regardless of the underlying distribution (assuming finite fourth moments), the sample covariance matrix \mathbf{S} admits an asymptotic normal distribution after appropriate centering and scaling. Specifically, \sqrt{n} \, \mathrm{vech}(\mathbf{S} - \Sigma) \xrightarrow{d} N(0, \mathbf{V}), where \mathrm{vech}(\cdot) stacks the lower triangular elements of the symmetric matrix into a vector, and \mathbf{V} is an asymptotic covariance matrix that depends on the fourth-order moments of the distribution. Under multivariate normality, the corresponding asymptotic covariance of \sqrt{n} \, \mathrm{vec}(\mathbf{S} - \Sigma) simplifies to (\mathbf{I}_{p^2} + \mathbf{K}_{pp})(\Sigma \otimes \Sigma), where \mathbf{K}_{pp} is the commutation matrix, enabling delta-method approximations for functions of \Sigma. A key property under the multivariate normal assumption is the independence between the sample mean \bar{\mathbf{X}} and the sample covariance matrix \mathbf{S}. This independence facilitates joint inference, such as in Hotelling's T-squared statistic, by separating location and dispersion parameters.
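
One first-moment implication of the Wishart result, E[(n-1)\mathbf{S}] = (n-1)\Sigma under multivariate normality, can be checked by Monte Carlo, as in this sketch (illustrative parameter values; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 20, 20_000
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

acc = np.zeros_like(Sigma)
for _ in range(reps):
    X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=n)
    S = np.cov(X, rowvar=False)     # unbiased sample covariance (divisor n - 1)
    acc += (n - 1) * S

print(acc / reps)        # Monte Carlo average of (n-1) S
print((n - 1) * Sigma)   # Wishart mean: (n-1) * Sigma
```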

Extensions

Weighted samples

In scenarios where observations in a dataset have unequal importance or reliability, such as in survey sampling or data with varying measurement precision, the sample mean and covariance can be adapted using weights w_i > 0 assigned to each observation i. Weights can be frequency weights (e.g., multiplicities or inverse inclusion probabilities in sampling designs) or reliability weights (e.g., inverse variances of measurement errors). The weighted sample mean is defined as \bar{X}_w = \frac{\sum_{i=1}^n w_i X_i}{\sum_{i=1}^n w_i}, which reduces to the standard unweighted sample mean when all weights are equal. This formulation accounts for factors like inverse variances of measurement errors or sampling probabilities, ensuring that more reliable or representative observations contribute proportionally more to the estimate. The corresponding weighted sample covariance between two variables X and Y depends on the weight type. For frequency weights, where w_i are positive integers representing observation multiplicities, it is s_{XY,w} = \frac{\sum_{i=1}^n w_i (X_i - \bar{X}_w)(Y_i - \bar{Y}_w)}{\sum_{i=1}^n w_i - 1}, with the denominator providing an unbiased estimate of the covariance under the interpretation that the effective sample size is \sum w_i. For reliability weights, such as inverse variances, the unbiased estimator requires a different adjustment; a common form normalizes the weights so that \sum w_i = 1 and uses s_{XY,w} = \frac{\sum_{i=1}^n w_i (X_i - \bar{X}_w)(Y_i - \bar{Y}_w)}{1 - \sum_{i=1}^n w_i^2} to account for the reduced effective sample size. These formulations emphasize pairs of observations with higher precision when the weights reflect reliability. Under design-based frameworks, such as probability sampling designs, the weighted sample mean is an unbiased estimator of the population mean, with the weights typically set as the inverse of inclusion probabilities to correct for unequal selection chances. For example, in stratified sampling, the population is divided into homogeneous strata, and the overall mean is estimated as a weighted average of stratum-specific sample means, with weights proportional to each stratum's size N_h; this yields \bar{y}_{st} = \sum_{h=1}^H (N_h / N) \bar{y}_h, which is unbiased for the finite population mean. The weighted covariance inherits similar unbiasedness properties when applied within this framework, provided the weights align with the sampling design and the appropriate denominator for the weight type is used. These weighted adaptations find key applications in handling missing data through inverse probability weighting (IPW), where the weights are the reciprocals of estimated probabilities of non-missingness, enabling unbiased inference by effectively upweighting observed cases to represent the full population. In cluster sampling, weights adjust for the clustered structure by incorporating design effects, such as varying cluster sizes, to produce representative estimates of means and covariances without assuming equal observation importance. Such methods are particularly valuable in complex surveys, where ignoring the weights could lead to biased summaries of relationships.
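
For frequency weights, the weighted mean and covariance formulas can be sketched as follows (the data and integer weights are illustrative):

```python
import numpy as np

# Hypothetical paired observations; w_i counts how many times each pair was observed
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 4.0, 5.0])
w = np.array([1.0, 3.0, 2.0, 4.0])

W = w.sum()
xbar_w = (w * x).sum() / W    # weighted sample mean of x
ybar_w = (w * y).sum() / W    # weighted sample mean of y

# Frequency-weight covariance: denominator is (sum of weights) - 1
s_xy_w = (w * (x - xbar_w) * (y - ybar_w)).sum() / (W - 1)
print(xbar_w, ybar_w, s_xy_w)
```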

Multivariate case

In the multivariate case, observations consist of p-dimensional random vectors \mathbf{X}_i = (X_{i1}, \dots, X_{ip})^T for i = 1, \dots, n, drawn from a population with mean vector \boldsymbol{\mu} and covariance matrix \boldsymbol{\Sigma}. The sample mean vector is defined as \bar{\mathbf{X}} = \frac{1}{n} \sum_{i=1}^n \mathbf{X}_i, a p \times 1 vector whose j-th component is the univariate sample mean of the j-th coordinate across all observations. This provides a point estimate for \boldsymbol{\mu} and generalizes the scalar sample mean to higher dimensions, with the univariate case arising as the special case p=1. The sample covariance matrix \mathbf{S} extends the scalar sample variance to capture both individual and joint variability among the p coordinates. It is a p \times p symmetric matrix given by \mathbf{S} = \frac{1}{n-1} \sum_{i=1}^n (\mathbf{X}_i - \bar{\mathbf{X}})(\mathbf{X}_i - \bar{\mathbf{X}})^T, with diagonal elements s_{jj} representing the sample variances of each coordinate and off-diagonal elements s_{jk} (for j \neq k) denoting the sample covariances between coordinates j and k. As a Gram matrix of centered data, \mathbf{S} is positive semi-definite, meaning all its eigenvalues are non-negative (\lambda_j \geq 0), which ensures it can serve as a valid covariance structure; it is positive definite if n > p and the centered data span the full p-dimensional space. The elements of \mathbf{S} offer key interpretations: the trace \operatorname{tr}(\mathbf{S}) = \sum_{j=1}^p s_{jj} quantifies total sample variability, while the off-diagonals reveal linear dependencies, with positive (negative) values indicating coordinates that tend to increase (decrease) together. The eigenvalues of \mathbf{S} further decompose this variability, with the largest eigenvalues corresponding to directions of maximum variance, previewing their role in principal component analysis, which reduces dimensionality by projecting data onto these principal axes. For illustration, consider a bivariate sample of soil calcium measurements, adapted from a trivariate dataset and treated as draws from a bivariate population, with data points (y_1, y_2): (35, 3.5), (35, 4.9), (40, 30.0), (10, 2.8), (6, 2.7), (20, 2.8), (35, 4.6), (35, 10.9), (35, 8.0), (30, 1.6). The resulting sample mean vector is \bar{\mathbf{y}} = \begin{pmatrix} 28.1 \\ 7.18 \end{pmatrix}, and the sample covariance matrix is \mathbf{S} = \begin{pmatrix} 140.54 & 49.68 \\ 49.68 & 72.25 \end{pmatrix}, where the positive off-diagonal element indicates a positive association between the two variables.
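
The bivariate soil example can be reproduced directly, as in this sketch (NumPy assumed; np.cov with rowvar=False applies the same n-1 divisor):

```python
import numpy as np

# Data points (y1, y2) from the example above
Y = np.array([[35, 3.5], [35, 4.9], [40, 30.0], [10, 2.8], [6, 2.7],
              [20, 2.8], [35, 4.6], [35, 10.9], [35, 8.0], [30, 1.6]])

n = Y.shape[0]
ybar = Y.mean(axis=0)                   # sample mean vector: [28.1, 7.18]
centered = Y - ybar
S = centered.T @ centered / (n - 1)     # sample covariance matrix

print(ybar)
print(S)                                # approx [[140.54, 49.68], [49.68, 72.25]]
print(np.allclose(S, np.cov(Y, rowvar=False)))   # True
```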

Limitations

Degrees of freedom adjustment

The degrees of freedom adjustment in sample covariance estimation involves dividing the sum of cross-products of deviations by n-1 rather than n, where n is the sample size, to account for the one degree of freedom lost when estimating the sample mean from the data itself. This reduction in the number of effectively independent deviations arises because the sample mean imposes a linear constraint on them (the deviations sum to zero), leaving n-1 degrees of freedom for variance or covariance estimation. In more general settings where additional parameters beyond the mean are estimated—such as regression coefficients—the degrees of freedom extend to n - 1 - k, with k representing the number of extra parameters. Using n in the denominator produces the maximum likelihood estimator of the covariance under normality, but this estimator is biased, systematically underestimating the covariance because deviations from the sample mean are on average smaller than deviations from the true mean. The n-1 adjustment yields an unbiased estimator, ensuring that its expected value matches the population covariance exactly. In the multivariate setting, the sample covariance matrix \mathbf{S} incorporates the same adjustment, such that for multivariate normal data (n-1) \mathbf{S} follows a Wishart distribution with n-1 degrees of freedom and scale matrix \boldsymbol{\Sigma}, the population covariance matrix. In the special case \boldsymbol{\Sigma} = \sigma^2 \mathbf{I}, the scaled trace (n-1) \operatorname{tr}(\mathbf{S}) / \sigma^2—the trace being the sum of the diagonal elements (sample variances)—follows a chi-squared distribution with p(n-1) degrees of freedom, where p is the dimensionality of the data. For a concrete illustration with small n=2, consider univariate observations x_1 and x_2 from a distribution with variance \sigma^2. The sample mean is \bar{x} = (x_1 + x_2)/2, and the sum of squared deviations is (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 = (x_1 - x_2)^2 / 2. Dividing by n-1 = 1 gives the unbiased sample variance s^2 = (x_1 - x_2)^2 / 2, with expected value \sigma^2. Dividing by n=2 instead yields (x_1 - x_2)^2 / 4, with expected value \sigma^2 / 2, confirming the downward bias without the adjustment.
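
The n = 2 bias calculation can be confirmed numerically, as in this sketch (standard normal pairs, so \sigma^2 = 1; the averages approach the stated expectations as the number of simulated pairs grows):

```python
import numpy as np

rng = np.random.default_rng(3)
reps = 200_000

pairs = rng.normal(0.0, 1.0, size=(reps, 2))   # sigma^2 = 1
d = pairs - pairs.mean(axis=1, keepdims=True)
ss = (d ** 2).sum(axis=1)                      # equals (x1 - x2)^2 / 2 for each pair

print(np.mean(ss / 1))   # divisor n - 1 = 1: average near sigma^2 = 1.0
print(np.mean(ss / 2))   # divisor n = 2:     average near sigma^2 / 2 = 0.5
```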

Criticisms

The sample mean and sample covariance matrix are highly sensitive to outliers, as even a single extreme observation can substantially distort their values due to the unbounded influence functions of these estimators. This sensitivity arises because the sample mean is pulled toward outliers, and the sample covariance amplifies the effect by incorporating squared deviations, leading to inflated variances and covariances. In contrast, robust alternatives such as the median and the minimum covariance determinant (MCD) estimator maintain stability, with breakdown points of up to 50%, whereas the sample mean and covariance have an asymptotic breakdown point of zero. The sampling distributions of the sample mean and covariance rely heavily on the assumption of normality for exact inference, but this assumption often fails in real-world data, resulting in unreliable confidence intervals and tests. Without multivariate normality, the sample covariance matrix does not follow a Wishart distribution, which undermines the validity of variance estimates and leads to fragile inference, particularly for small samples. Degrees of freedom adjustments provide partial mitigation by correcting bias in variance estimates, but they do not address the distributional fragility under non-normality. In high-dimensional settings where the number of variables p approaches or exceeds the sample size n, the sample covariance matrix \mathbf{S} suffers from the curse of dimensionality, becoming unstable and ill-conditioned unless n \gg p. This instability manifests as eigenvalue spreading, where the eigenvalues of \mathbf{S} diverge from those of the population covariance matrix, leading to poor performance in downstream tasks such as principal component analysis and discriminant analysis. Historical debates in the early twentieth century, particularly following Fisher's work on estimation efficiency, highlighted tensions between unbiasedness and mean squared error (MSE) minimization for these estimators in small samples. For instance, in the multivariate case, later developments such as the James-Stein estimator showed that biased alternatives can achieve lower MSE than the unbiased sample mean in finite samples. To address these issues, robust estimators such as the minimum covariance determinant (MCD) have been developed as alternatives, offering high breakdown points while estimating location and scatter.
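
The outlier sensitivity of the mean is easy to demonstrate numerically (an illustrative sketch; the median is shown as a simple robust alternative for location):

```python
import numpy as np

x = np.array([9.9, 10.1, 10.0, 10.2, 9.8])
x_contaminated = np.append(x, 1000.0)            # add a single extreme observation

print(np.mean(x), np.mean(x_contaminated))       # 10.0 vs 175.0: the mean is dragged far away
print(np.median(x), np.median(x_contaminated))   # 10.0 vs 10.05: the median barely moves
```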
