Pearson's chi-squared test
Pearson's chi-squared test, introduced by British statistician Karl Pearson in 1900, is a nonparametric statistical procedure used to assess whether observed frequencies in categorical data significantly differ from expected frequencies under a specified null hypothesis, such as random distribution or independence between variables.[1] The test computes a test statistic, denoted as \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}, where O_i represents the observed frequency in each category and E_i the corresponding expected frequency; under the null hypothesis and with sufficiently large sample sizes, this statistic approximately follows a chi-squared distribution with degrees of freedom depending on the application: typically k-1 for k categories in goodness-of-fit tests or (r-1)(c-1) for an r \times c contingency table in tests of independence or homogeneity.[2][3]

Pearson's chi-squared test encompasses three primary variants, each addressing distinct hypotheses about categorical data. The goodness-of-fit test evaluates whether sample data conform to a theoretical distribution, such as a uniform, normal, or Poisson distribution, by comparing observed counts to those predicted by the model; it is particularly useful in quality control and genetics to validate distributional assumptions.[2][4] The test of independence examines whether two categorical variables are associated in a single population, using a contingency table to test the null hypothesis that the variables are independent; for instance, it can assess whether gender is associated with voting preference in survey data.[5][3] The test of homogeneity applies to multiple populations, testing whether the distribution of a categorical variable is the same across groups, such as comparing disease prevalence across different regions; it shares the same computational framework as the independence test but frames the hypothesis in terms of population equality rather than variable association.[4][3]

Widely applied in fields including social sciences, medicine, economics, and biology, the test requires key assumptions for validity: all expected frequencies should be at least 1, with no more than 20% below 5 (ideally all at least 5 for accuracy), random sampling, and categorical data without excessive sparsity; violations may necessitate alternatives like Fisher's exact test or simulations.[3][4][6]

Applications
Goodness-of-fit testing
The goodness-of-fit test based on Pearson's chi-squared statistic evaluates whether the observed frequencies in categorical data align with the frequencies anticipated under a specified probability distribution, providing a measure of discrepancy between empirical observations and theoretical expectations.[2] Introduced by Karl Pearson in 1900, this test is particularly suited for discrete data where categories have predefined probabilities.[7]

To conduct the test, one first formulates the null hypothesis that the data are drawn from the hypothesized distribution, which may require estimating unspecified parameters (such as the mean for a Poisson distribution) from the sample itself.[8] Next, the expected frequencies E_i for each category i are computed using the formula E_i = n \cdot p_i, where n is the total sample size and p_i is the probability assigned to category i under the null hypothesis.[2] The test statistic is then applied to quantify deviations, with larger values indicating poorer fit to the hypothesized model.[9]

This approach finds applications in testing discrete probability distributions, including the Poisson distribution for modeling count data like event occurrences, the binomial distribution for binary outcomes in fixed trials, and the multinomial distribution for multiple categories with fixed probabilities.[10] For example, researchers might use it to assess whether defect rates in manufacturing follow a Poisson process or if genetic trait inheritance adheres to binomial expectations.[11]

A representative example involves testing uniformity in die rolls, hypothesizing that a fair six-sided die produces each face with equal probability p_i = 1/6. Suppose 120 rolls yield the following observed frequencies:

| Face | Observed (O_i) | Expected (E_i) |
|---|---|---|
| 1 | 15 | 20 |
| 2 | 25 | 20 |
| 3 | 20 | 20 |
| 4 | 18 | 20 |
| 5 | 22 | 20 |
| 6 | 20 | 20 |
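Summing the cell contributions gives \chi^2 = 1.25 + 1.25 + 0 + 0.2 + 0.2 + 0 = 2.9 on k - 1 = 5 degrees of freedom. A minimal R sketch, assuming the counts in the table above, reproduces the calculation (numeric results quoted in the comments are approximate):

```r
# Observed counts for the 120 die rolls from the table above
observed <- c(15, 25, 20, 18, 22, 20)

# Expected counts under the null hypothesis of a fair die
expected <- sum(observed) * rep(1/6, 6)   # 20 for every face

# Pearson's chi-squared statistic computed directly from the formula
sum((observed - expected)^2 / expected)   # 2.9

# Built-in equivalent: df = 6 - 1 = 5, p-value is about 0.72,
# so these data give no evidence against fairness
chisq.test(observed, p = rep(1/6, 6))
```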
Independence testing
Pearson's chi-squared test for independence, introduced by Karl Pearson in 1900, provides a method to assess whether two categorical variables exhibit a statistically significant association in a dataset organized as a contingency table. The formulation applies to a general r \times c contingency table, not only the 2 \times 2 case, where r and c are the numbers of categories of the two variables. In this setup, the observed data consist of frequencies O_{ij} in the cell at row i and column j, derived from cross-classifying N independent observations into the r rows and c columns.

The null hypothesis states that the two variables are independent, implying no association between them; under this hypothesis, the expected frequency for each cell is calculated as E_{ij} = \frac{R_i C_j}{N}, where R_i is the total for row i, C_j is the total for column j, and N is the grand total of all observations. The test statistic is then \chi^2 = \sum_{i=1}^r \sum_{j=1}^c \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, which measures the overall deviation between observed and expected frequencies and follows an approximate chi-squared distribution with (r-1)(c-1) degrees of freedom under the null for large samples.

If the computed \chi^2 value yields a p-value below a chosen significance level (e.g., 0.05), the null hypothesis is rejected, indicating that the variables are likely dependent and that the observed deviations are unlikely to be due to random chance alone. For instance, in survey data analyzing responses across demographic groups, a significant result might suggest an association between variables such as age category and opinion on a policy.[6] This interpretation holds provided basic assumptions are met, such as a sufficient sample size with expected frequencies of at least 5 in most cells.

Homogeneity testing
The chi-squared test of homogeneity assesses whether the distribution of a categorical variable is the same across multiple populations or groups, using a contingency table where rows represent groups and columns represent categories of the variable. Unlike the independence test, which examines association within a single population, homogeneity focuses on comparing proportions across populations under the null hypothesis of equal distributions. The computation mirrors that of the independence test, with expected frequencies E_{ij} = \frac{R_i C_j}{N} and degrees of freedom (r-1)(c-1), where r is the number of groups and c the number of categories. A significant result rejects the null, indicating differing distributions across groups.

For example, to test if the proportion of smokers is the same in two regions, observed counts of smokers and non-smokers in each region form a 2 \times 2 table; the test evaluates if regional differences are statistically significant.[12]
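A minimal R sketch of such a comparison, using purely hypothetical counts for the two regions (the numbers below are illustrative assumptions, not data from the cited source):

```r
# Hypothetical counts of smokers and non-smokers in two regions (illustrative only)
region_counts <- matrix(c(40, 160,    # Region A
                          60, 140),   # Region B
                        nrow = 2, byrow = TRUE,
                        dimnames = list(Region = c("A", "B"),
                                        Smoking = c("Smoker", "Non-smoker")))

# Computationally identical to the test of independence; df = (2 - 1)(2 - 1) = 1
chisq.test(region_counts)
```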
Computation

Test statistic formula
The test statistic for Pearson's chi-squared test, introduced by Karl Pearson in 1900, measures the discrepancy between observed and expected frequencies under the null hypothesis.[13] It is computed using the formula \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}, where the summation is taken over all categories or cells i, O_i denotes the observed count in category i, and E_i represents the expected count under the null hypothesis.[2] This statistic quantifies deviations by standardizing the differences (O_i - E_i) relative to the expected values, emphasizing larger relative discrepancies in cells with smaller expectations.

The same core formula applies to both goodness-of-fit testing and independence testing, though the computation of expected frequencies E_i differs by context.[14] In goodness-of-fit tests, E_i = n p_i, where n is the total sample size and p_i is the hypothesized probability for category i. For tests of independence in contingency tables, E_{ij} = \frac{R_i C_j}{N}, with R_i as the row total for row i, C_j as the column total for column j, and N as the grand total.[14]

Computationally, the statistic involves summing the squared standardized residuals across all relevant categories or table cells, ensuring no expected values are zero to avoid division issues. Modern statistical software packages, such as R or SAS, automate this calculation by inputting observed frequencies and specifying the null model to derive expectations, facilitating efficient implementation for large datasets.[2]

Under the null hypothesis and for sufficiently large sample sizes, the test statistic \chi^2 approximately follows a chi-squared distribution with k-1 degrees of freedom, where k is the number of categories in a simple goodness-of-fit scenario without estimated parameters.[2] This asymptotic approximation underpins the test's inferential properties.
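A minimal R sketch of these two ways of forming the expected frequencies and the statistic (the helper functions below are generic illustrations, not part of any cited implementation):

```r
# Goodness of fit: expected counts from the hypothesized probabilities
gof_statistic <- function(observed, p) {
  expected <- sum(observed) * p
  sum((observed - expected)^2 / expected)
}

# Independence: expected counts E_ij = (row total x column total) / grand total
independence_statistic <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}
```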
Degrees of freedom and p-values

The degrees of freedom (df) for Pearson's chi-squared test depend on the specific application. In the goodness-of-fit test, the degrees of freedom are calculated as df = k - 1 - m, where k is the number of categories and m is the number of parameters estimated from the data under the null hypothesis.[15] For the test of independence in an r \times c contingency table, the degrees of freedom are df = (r - 1)(c - 1).[16] These values determine the shape of the reference chi-squared distribution used for inference.

Once the test statistic \chi^2 is computed, its significance is assessed by comparing it to the chi-squared distribution with the appropriate degrees of freedom. The p-value is the probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true; it is obtained by evaluating the survival function (right-tail probability) of the chi-squared distribution at \chi^2 with the given df.[17] For example, at a significance level \alpha = 0.05, the null hypothesis is rejected if the p-value is less than 0.05. Alternatively, the critical value \chi^2_{\alpha, df} can be looked up from chi-squared distribution tables, and the null is rejected if the observed \chi^2 exceeds this threshold, defining the rejection region in the right tail.[16]

Estimating parameters from the sample data under the null hypothesis reduces the degrees of freedom by the number of such parameters, as this accounts for the variability introduced by the estimation process and adjusts for the loss of independence in the fitted model.[15] This adjustment ensures the test maintains its nominal significance level.

Pearson's chi-squared test relies on an asymptotic approximation, where the distribution of the test statistic converges to a chi-squared distribution as the sample size increases, provided the expected frequencies are sufficiently large.[18] This validity holds only for large samples, typically when all expected cell counts are at least 5, to ensure the approximation is reliable.[19]
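The two equivalent decision rules described above can be sketched in R; the statistic and degrees of freedom below are placeholder values, not results from the cited sources:

```r
chi_sq <- 5.0   # placeholder test statistic
dof    <- 5     # placeholder degrees of freedom
alpha  <- 0.05

# p-value: right-tail (survival) probability of the chi-squared distribution
p_value     <- pchisq(chi_sq, df = dof, lower.tail = FALSE)
reject_by_p <- p_value < alpha

# Equivalent rule via the critical value
critical           <- qchisq(1 - alpha, df = dof)
reject_by_critical <- chi_sq > critical
```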
Theoretical Basis

Derivation for goodness-of-fit
The derivation of Pearson's chi-squared statistic for goodness-of-fit testing assumes that the observed frequencies O = (O_1, \dots, O_k) arise from a multinomial distribution with total sample size n = \sum O_i and specified null probabilities p = (p_1, \dots, p_k) where \sum p_i = 1. Under the null hypothesis H_0, P(O = o) = \frac{n!}{\prod o_i!} \prod p_i^{o_i}, and the expected frequencies are E_i = n p_i. Pearson's original formulation in 1900 constructs the test statistic as X^2 = \sum_{i=1}^k \frac{(O_i - E_i)^2}{E_i}, which sums the squared differences between observed and expected frequencies, standardized by the expected frequencies to reflect the scale of variability in each category. This approach treats the deviations as analogous to standardized residuals in a normal model, motivated by the need for a general criterion to assess fit beyond Gaussian assumptions.[20]

The asymptotic \chi^2_{k-1} distribution of X^2 under H_0 follows from its equivalence to the likelihood ratio statistic for large n. The likelihood ratio statistic is G^2 = 2 \sum_{i=1}^k O_i \log(O_i / E_i). Writing \log(O_i / E_i) = \log\left(1 + \frac{O_i - E_i}{E_i}\right) and expanding to second order in \frac{O_i - E_i}{E_i} gives \log(O_i / E_i) \approx \frac{O_i - E_i}{E_i} - \frac{1}{2} \left( \frac{O_i - E_i}{E_i} \right)^2; substituting, using \sum_i (O_i - E_i) = 0, and discarding higher-order terms yields G^2 \approx \sum_{i=1}^k \frac{(O_i - E_i)^2}{E_i} = X^2. Since G^2 \xrightarrow{d} \chi^2_{k-1} by standard likelihood ratio theory for the multinomial, X^2 shares this limiting distribution for large n.[21][20]

An equivalent derivation uses the central limit theorem: \sqrt{n} (\hat{p} - p) \xrightarrow{d} N(0, \Sigma), where \hat{p} = O/n and \Sigma = \operatorname{diag}(p) - p p^T. Rewriting X^2 = n (\hat{p} - p)^T D (\hat{p} - p) with D = \operatorname{diag}(1/p_i), this is a quadratic form in the asymptotically normal vector. The matrix D^{1/2} \Sigma D^{1/2} is idempotent with rank k-1 (trace k-1), confirming X^2 \xrightarrow{d} \chi^2_{k-1}. A moment-generating function approach similarly shows the limiting form, as the MGF of the quadratic form matches that of a \chi^2_{k-1} random variable under the covariance structure.[22][23]

In the special case of two categories (k=2), the multinomial reduces to the binomial, and X^2 = \frac{n (\hat{p}_1 - p_1)^2}{p_1 (1 - p_1)} = z^2, where z = \frac{\sqrt{n} (\hat{p}_1 - p_1)}{\sqrt{p_1 (1 - p_1)}}; since z \xrightarrow{d} N(0,1), it follows that X^2 \xrightarrow{d} \chi^2_1.[20]
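The limiting distribution can also be illustrated numerically. The following R sketch, assuming an arbitrary null with six equiprobable categories, simulates multinomial samples under H_0 and compares the empirical quantiles of X^2 with those of the \chi^2_{k-1} reference:

```r
set.seed(1)
k <- 6; n <- 500; p <- rep(1/6, k)

# Simulate X^2 repeatedly under the null hypothesis
sims <- replicate(10000, {
  o <- as.vector(rmultinom(1, size = n, prob = p))
  e <- n * p
  sum((o - e)^2 / e)
})

# Empirical quantiles should be close to the chi-squared(k - 1) quantiles
quantile(sims, c(0.50, 0.90, 0.95))
qchisq(c(0.50, 0.90, 0.95), df = k - 1)
```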
Derivation for independence

Consider an r \times c contingency table where the observed cell counts O_{ij} for i = 1, \dots, r and j = 1, \dots, c arise from a multinomial distribution with total sample size N = \sum_{i,j} O_{ij} and unknown cell probabilities \pi_{ij}, satisfying \sum_{i=1}^r \sum_{j=1}^c \pi_{ij} = 1. The null hypothesis of independence posits that the row and column factors are independent, implying \pi_{ij} = \pi_{i \cdot} \pi_{\cdot j} for all i, j, where the marginal probabilities are \pi_{i \cdot} = \sum_{j=1}^c \pi_{ij} and \pi_{\cdot j} = \sum_{i=1}^r \pi_{ij}.

Under the saturated model, which fits the full set of rc probabilities without restrictions, the maximum likelihood estimates are simply \hat{\pi}_{ij} = O_{ij}/N. In contrast, the independence model imposes the product structure, reducing the number of free parameters to (r - 1) + (c - 1) = r + c - 2, and the maximum likelihood estimates of the marginals yield expected cell frequencies E_{ij} = N \hat{\pi}_{i \cdot} \hat{\pi}_{\cdot j} = R_i C_j / N, where R_i = \sum_{j=1}^c O_{ij} is the i-th row total and C_j = \sum_{i=1}^r O_{ij} is the j-th column total.[23]

Pearson's chi-squared statistic X^2 quantifies the lack of fit between the observed counts and those expected under independence, expressed as the sum of cell-wise squared standardized residuals: X^2 = \sum_{i=1}^r \sum_{j=1}^c \frac{(O_{ij} - E_{ij})^2}{E_{ij}}. This form arises as a measure of discrepancy analogous to the goodness-of-fit test, comparing the saturated model to the restricted independence model.[22] Under the null hypothesis and as N \to \infty, X^2 follows asymptotically a chi-squared distribution with (r-1)(c-1) degrees of freedom, reflecting the difference in the number of free parameters between the saturated and independence models.[23] This asymptotic chi-squared property holds due to the equivalence between Pearson's statistic and the likelihood ratio test in large samples; the likelihood ratio statistic G^2 = 2 \sum_{i=1}^r \sum_{j=1}^c O_{ij} \ln (O_{ij} / E_{ij}) converges to the same limiting distribution as X^2.[23]

The derivation extends naturally from the 2 \times 2 case, where a single degree of freedom captures the overall association, to the general r \times c table: the statistic accumulates the cell contributions (O_{ij} - E_{ij})^2 / E_{ij}, but the estimated row and column margins constrain the deviations, leaving only (r-1)(c-1) of them free under the null.[22] Karl Pearson introduced this refinement for contingency tables in his 1904 paper, adapting his earlier 1900 goodness-of-fit criterion to assess independence in cross-classified categorical data.[24]
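The degrees of freedom quoted above can be checked by counting free parameters in the two models:

\mathrm{df} = \underbrace{(rc - 1)}_{\text{saturated model}} - \underbrace{\left[(r - 1) + (c - 1)\right]}_{\text{independence model}} = rc - r - c + 1 = (r - 1)(c - 1).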
Assumptions and Validity

Sample size requirements
The validity of Pearson's chi-squared test relies on sufficient sample size to ensure the test statistic approximates the chi-squared distribution asymptotically. A widely adopted rule of thumb requires that all expected cell frequencies E_i are at least 5, or that no more than 20% of the cells have E_i < 5 with none below 1, to maintain reliable inference.[25][4] These guidelines, originating from simulation-based investigations, help prevent distortions in the test's performance when categories are sparse.[26]

The rationale for these thresholds is that small expected frequencies undermine the asymptotic chi-squared approximation, leading to inaccurate p-values and reduced test reliability.[27] Cochran's 1952 analysis, based on Monte Carlo simulations for goodness-of-fit applications, established that expected frequencies below 5 often result in substantial deviations from the nominal distribution, particularly affecting the accuracy of significance levels.[26] Similar simulation evidence extends to tests of independence in contingency tables, where low expectations compromise the central limit theorem underpinnings of the approximation.[28]

Violations of these sample size conditions can inflate Type I error rates, causing excessive false positives, or render the test overly conservative with diminished power to detect true associations.[29][28] In such cases, a practical remedy involves combining adjacent categories to increase expected frequencies and restore the approximation's validity, thereby preserving the test's utility without altering the underlying hypothesis.[28]
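When expected counts are small, common software also offers exact or simulation-based alternatives. A minimal R sketch, using a deliberately sparse hypothetical 2 \times 2 table (the counts are illustrative assumptions only):

```r
# Sparse hypothetical table: several expected counts fall below 5
sparse <- matrix(c(3, 1,
                   2, 9), nrow = 2, byrow = TRUE)

chisq.test(sparse)                            # warns that the asymptotic approximation may be unreliable
chisq.test(sparse, simulate.p.value = TRUE)   # Monte Carlo p-value instead of the asymptotic one
fisher.test(sparse)                           # Fisher's exact test as an alternative for small tables
```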
Independence and randomness

Pearson's chi-squared test relies on the fundamental assumption that the observations are independent and identically distributed (i.i.d.) according to a multinomial distribution under the null hypothesis. This means that each trial or observation contributes to one of the categorical outcomes independently of the others, with the probability of each outcome remaining constant across trials. The multinomial framework ensures that the joint distribution of the observed frequencies aligns with the expected frequencies derived from the hypothesized probabilities, allowing the test statistic to approximate a chi-squared distribution asymptotically.[30][31]

For the test to be valid, the data must arise from a simple random sample without inherent dependencies such as clustering or stratification, unless specific adjustments are made to account for the sampling design. In practice, this requires that the sample be drawn randomly from the population, ensuring no systematic correlations between observations that could inflate or deflate the test statistic. Violations of this independence assumption, such as in clustered data from repeated measures or hierarchical sampling, can lead to overdispersion, where the variance of the observed frequencies exceeds that expected under the multinomial model, thereby invalidating the chi-squared approximation and increasing the risk of erroneous conclusions.[32][33][34]

In contingency table analyses, the sampling design can further complicate independence if row or column margins are fixed by the experimental setup, such as when one variable is deliberately balanced. The test still uses (r-1)(c-1) degrees of freedom, but the appropriate model (e.g., test of independence or homogeneity) depends on the sampling scheme.

To assess adherence to these assumptions and evaluate overall model fit, residual analysis serves as a key diagnostic tool, examining the differences between observed and expected frequencies on a cell-by-cell basis to identify patterns of deviation that may signal underlying dependencies or other issues. Standardized or adjusted residuals, for instance, help pinpoint which cells contribute disproportionately to any lack of fit, providing insights beyond the omnibus test statistic.[35][36] These diagnostics complement considerations of sample size adequacy, ensuring the test's robustness across varied data structures.[37]
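In R, the object returned by chisq.test exposes these cell-level diagnostics directly; a minimal sketch, assuming an arbitrary 2 \times 2 table of counts:

```r
tab <- matrix(c(30, 20,
                25, 25), nrow = 2, byrow = TRUE)

fit <- chisq.test(tab, correct = FALSE)

fit$expected    # expected counts under independence
fit$residuals   # Pearson residuals: (O - E) / sqrt(E)
fit$stdres      # standardized (adjusted) residuals, roughly N(0, 1) under the null
```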
Examples

Testing die fairness
To illustrate the application of Pearson's chi-squared goodness-of-fit test, consider testing whether a six-sided die is fair by rolling it 60 times and recording the outcomes for each face. The null hypothesis states that the die is fair, meaning each face has an equal probability of 1/6, while the alternative hypothesis states that the probabilities are not equal. The observed frequencies from the rolls are 5, 8, 12, 10, 11, and 14 for faces 1 through 6, respectively. The expected frequency for each face under the null hypothesis is E_i = 60 / 6 = 10.

The test statistic is computed as \chi^2 = \sum_{i=1}^6 \frac{(O_i - E_i)^2}{E_i}, where O_i are the observed frequencies. The individual contributions are calculated as follows:

| Face | Observed (O_i) | Expected (E_i) | O_i - E_i | (O_i - E_i)^2 / E_i |
|---|---|---|---|---|
| 1 | 5 | 10 | -5 | 2.5 |
| 2 | 8 | 10 | -2 | 0.4 |
| 3 | 12 | 10 | 2 | 0.4 |
| 4 | 10 | 10 | 0 | 0 |
| 5 | 11 | 10 | 1 | 0.1 |
| 6 | 14 | 10 | 4 | 1.6 |
| Total | 60 | 60 | - | χ² = 5.0 |
The total gives a test statistic of \chi^2 = 5.0 with k - 1 = 5 degrees of freedom, corresponding to a p-value of approximately 0.42 (in R, pchisq(5, df=5, lower.tail=FALSE)).
Since the p-value (0.42) exceeds the common significance level of 0.05, there is insufficient evidence to reject the null hypothesis. The observed frequencies are consistent with the expectation of a fair die, indicating no significant deviation from uniformity.
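A minimal R sketch, assuming the counts above, reproduces the result; chisq.test defaults to equal category probabilities when none are supplied:

```r
observed <- c(5, 8, 12, 10, 11, 14)   # counts for faces 1 through 6

# Equal probabilities 1/6 are the default null hypothesis
chisq.test(observed)
# X-squared = 5, df = 5, p-value is about 0.42
```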
Analyzing contingency tables
To illustrate the application of Pearson's chi-squared test for independence, consider a hypothetical survey of 100 individuals examining whether there is an association between gender (male or female) and preference for a product (yes or no). The observed frequencies form a 2 \times 2 contingency table as follows:

| | Yes | No | Total |
|---|---|---|---|
| Male | 30 | 20 | 50 |
| Female | 25 | 25 | 50 |
| Total | 55 | 45 | 100 |
Under the null hypothesis of independence, the expected frequency for each cell is E_{ij} = R_i C_j / N; for example, the expected count of males answering yes is 50 \times 55 / 100 = 27.5. The full table of expected frequencies is:

| | Yes | No | Total |
|---|---|---|---|
| Male | 27.5 | 22.5 | 50 |
| Female | 27.5 | 22.5 | 50 |
| Total | 55 | 45 | 100 |
Each cell's contribution to the test statistic is (O_{ij} - E_{ij})^2 / E_{ij}, combining the observed and expected values as follows:

| | Yes | No | Total |
|---|---|---|---|
| Male | O=30, E=27.5 contrib. ≈0.227 | O=20, E=22.5 contrib. ≈0.278 | 50 |
| Female | O=25, E=27.5 contrib. ≈0.227 | O=25, E=22.5 contrib. ≈0.278 | 50 |
| Total | 55 | 45 | 100 |
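Summing the four contributions gives \chi^2 \approx 0.227 + 0.278 + 0.227 + 0.278 \approx 1.01 with (2-1)(2-1) = 1 degree of freedom, for a p-value of roughly 0.31; at the 0.05 level the null hypothesis of independence is not rejected. A minimal R sketch, assuming the counts above, reproduces the hand calculation; correct = FALSE disables Yates's continuity correction, which chisq.test otherwise applies to 2 \times 2 tables:

```r
tab <- matrix(c(30, 20,
                25, 25), nrow = 2, byrow = TRUE,
              dimnames = list(Gender = c("Male", "Female"),
                              Preference = c("Yes", "No")))

chisq.test(tab, correct = FALSE)
# X-squared is about 1.01, df = 1, p-value is about 0.31
```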