McNemar's test
McNemar's test is a non-parametric statistical procedure used to determine whether there are significant differences in the marginal proportions of a dichotomous outcome between two related samples, such as in paired or matched designs where the same subjects are measured under two conditions.[1] It evaluates marginal homogeneity in a 2×2 contingency table by focusing exclusively on the discordant pairs—those where the outcomes differ between the two measurements—while ignoring concordant pairs where outcomes match.[2]
Developed by psychologist Quinn McNemar, the test was first described in 1947 as a method to address the sampling error in the difference between correlated proportions, building on earlier work in psychometrics and providing a chi-squared-based approach for dependent categorical data.[1] Unlike the Pearson chi-squared test for independence, which assumes unrelated samples, McNemar's test accounts for the paired nature of the data to avoid inflated Type I error rates.[2]
The test statistic is computed as \chi^2 = \frac{(b - c)^2}{b + c}, where b and c represent the counts in the off-diagonal cells of the contingency table (discordant pairs), and it follows a chi-squared distribution with 1 degree of freedom under the null hypothesis of no difference in marginal proportions.[3] For small sample sizes (typically fewer than 20 discordant pairs), a continuity correction \chi^2 = \frac{(|b - c| - 1)^2}{b + c} or an exact binomial test is preferred to improve accuracy.[4]
Key assumptions include paired observations with exactly two dichotomous outcomes per pair and random sampling from the population, making it suitable for scenarios like pre- and post-treatment assessments in clinical trials or evaluating attitude changes in surveys.[3] Common misapplications involve using it for unpaired data or ignoring the dependency structure, which can lead to invalid inferences; extensions like the Stuart-Maxwell test handle multi-category outcomes.[2]
Overview
Definition
McNemar's test is a non-parametric statistical procedure designed to evaluate marginal homogeneity in paired binary data, determining whether there is a significant difference between the proportions observed in two related samples.[5] It specifically addresses scenarios where the same subjects or matched pairs are assessed twice on a dichotomous outcome, such as before and after an intervention, to detect changes in prevalence without assuming a normal distribution.[6] Developed to handle correlated proportions, the test focuses on the symmetry of responses across the two measurements rather than independent groups.[1]
The test relies on a 2×2 contingency table that captures paired observations, with rows and columns representing the two binary categories (e.g., yes/no or success/failure) for the initial and follow-up assessments. The table's cells denote: a for concordant pairs where both responses are positive (yes-yes), b for discordant pairs shifting from negative to positive (no-yes), c for discordant pairs shifting from positive to negative (yes-no), and d for concordant pairs where both are negative (no-no). Emphasis is placed on the off-diagonal discordant cells (b and c), as these reflect changes or inconsistencies between the paired measures, while concordant cells (a and d) do not contribute to the assessment of difference.[7][8]
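Using the cell definitions above, the 2×2 table can be tallied directly from paired responses; a minimal sketch with hypothetical yes/no data:

```python
from collections import Counter

# Hypothetical paired yes/no responses for the same subjects at two time points.
before = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
after = ["yes", "no", "yes", "no", "yes", "yes", "no", "no"]

counts = Counter(zip(before, after))
a = counts[("yes", "yes")]  # concordant: positive at both assessments
b = counts[("no", "yes")]   # discordant: negative -> positive
c = counts[("yes", "no")]   # discordant: positive -> negative
d = counts[("no", "no")]    # concordant: negative at both assessments
print(a, b, c, d)
```

Only b and c enter the test statistic; a and d are tallied for completeness.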
This method finds broad application across disciplines involving repeated measures on binary traits. In medicine, it evaluates pre- and post-treatment outcomes, such as symptom presence before and after therapy.[9] In psychology, it analyzes shifts in attitudes or behaviors, like opinion changes in survey responses over time.[1] In epidemiology, it supports matched case-control designs to compare exposure risks between paired subjects.[5]
Historical Background
McNemar's test was developed by Quinn McNemar, an American psychologist and statistician born in 1900, who served as a professor of psychology, statistics, and education at Stanford University.[10] McNemar made significant contributions to psychological measurement, including revising the Stanford-Binet intelligence scale in 1942 and authoring the influential textbook Psychological Statistics in 1949, which became a standard reference in the field.[11][12] He also held leadership roles, such as president of the Psychometric Society from 1950 to 1951, underscoring his impact on quantitative methods in psychology.[11]
The test was first described in McNemar's 1947 paper titled "Note on the sampling error of the difference between correlated proportions or percentages," published in Psychometrika, a journal dedicated to psychometric methods.[13] This work addressed the need in psychometrics and experimental psychology for a reliable approach to analyzing paired categorical data, particularly when assessing changes in dichotomous responses within the same subjects, such as before-and-after measurements in psychological experiments.[13] It built upon foundational statistical techniques for contingency tables, including Karl Pearson's chi-square test introduced in 1900, by adapting them specifically for correlated proportions to account for the dependency in paired observations.[12]
Following its introduction, McNemar's test saw adoption in biostatistics starting in the post-1950s era, where it proved valuable for evaluating paired binary outcomes in fields like clinical research and epidemiology.[5] Early refinements included the proposal of a continuity correction by A. L. Edwards in 1948 to improve the chi-square approximation for small samples, with further adjustments explored in the 1960s and 1970s, such as those by William G. Cochran and Joseph L. Fleiss.[14] No major theoretical updates have occurred since, but by the 1980s, the test was integrated into widely used statistical software packages, facilitating its standardization as a tool for paired homogeneity testing.[15]
McNemar's broader influence lies in standardizing statistical practices for paired designs in psychological measurement, where his test complemented his work on reliability and validity in IQ testing and opinion-attitude surveys, helping to bridge psychometrics with broader statistical applications.[16][12]
Methodology
Assumptions and Hypotheses
McNemar's test requires paired binary observations collected on the same subjects, forming matched pairs where each subject provides two dichotomous responses, such as before-and-after measurements or responses to two related questions.[17] The data must be at the nominal scale, with outcomes categorized into two mutually exclusive groups, like "yes/no" or "success/failure," without assuming any ordinal or interval properties.[5] Independence is assumed between different pairs, ensuring that observations from one subject do not influence those from another, while dependence within each pair is explicitly expected and accounted for by the test's design, which focuses on the correlation induced by matching.[17]
Unlike the standard chi-squared test for independent samples, McNemar's test does not require or assume equal marginal probabilities across the two measurements prior to analysis; instead, it directly evaluates whether such equality holds under the null hypothesis.[5] It is particularly suited for handling correlated data, making it appropriate for scenarios where the pairing introduces intra-subject dependency, such as in pre-post intervention studies. Additionally, a sufficient sample size is necessary for the validity of the asymptotic chi-squared approximation, typically requiring more than 20 discordant pairs (where the two responses differ) to ensure reliable inference, though exact tests can be used for smaller samples.[5]
The null hypothesis (H₀) posits marginal homogeneity, meaning the probability of a "positive" outcome (or "yes") in the first measurement equals that in the second (p₁ = p₂), implying no systematic change or difference in the marginal proportions between the paired conditions.[1] This can also be interpreted as the absence of any net shift in the population response distribution across the two time points or treatments. The alternative hypothesis (Hₐ) states marginal inhomogeneity (p₁ ≠ p₂), indicating a significant difference in the marginal probabilities; this is often tested in a two-sided manner but can be one-sided to detect directional changes, such as an increase (p₁ < p₂) or decrease (p₁ > p₂) in the proportion.[5]
Regarding sampling considerations, McNemar's test does not assume fixed marginal totals, allowing for variability in the overall number of "yes" or "no" responses across the sample; the inference is conditional on the observed discordant pairs, which capture the relevant variability for testing the hypotheses.[1] This conditional approach ensures the test's robustness to the concordant pairs (where responses match), focusing solely on the off-diagonal elements of the 2×2 contingency table that inform about changes.[17]
Test Statistic and Procedure
McNemar's test procedure involves constructing a 2×2 contingency table from paired dichotomous observations, with rows representing outcomes from the first measurement (e.g., "no" and "yes") and columns from the second measurement. The cell counts are denoted as follows: a for pairs where both are "no," b for changes from "no" to "yes," c for changes from "yes" to "no," and d for pairs where both are "yes." Only the discordant pairs (b and c) contribute to the test, as the concordant cells (a and d) do not inform on changes in marginal proportions.
The test statistic is computed as
\chi^2 = \frac{(b - c)^2}{b + c},
which follows a chi-squared distribution with 1 degree of freedom under the null hypothesis of marginal homogeneity for large samples (typically when b + c \geq 20). This single degree of freedom arises because the test estimates one parameter: the difference between the marginal proportions. The p-value is the upper-tail probability of the chi-squared distribution with 1 df evaluated at \chi^2; the null hypothesis is rejected if the p-value is less than the significance level \alpha (commonly 0.05).
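In code, the statistic and its upper-tail p-value can be computed directly; a sketch with illustrative discordant counts, assuming scipy is available:

```python
from scipy.stats import chi2

b, c = 25, 10                    # illustrative discordant counts
stat = (b - c) ** 2 / (b + c)    # McNemar chi-squared statistic
p_value = chi2.sf(stat, df=1)    # upper-tail probability with 1 df
print(stat, p_value)             # reject H0 at alpha = 0.05 if p_value < alpha
```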
For small samples where b + c < 20, the asymptotic approximation may be unreliable, so an exact alternative uses a binomial test on the discordant pairs.[14] Under the null, the number of "no-to-yes" changes (b) follows a binomial distribution with parameters n = b + c and p = 0.5; the two-sided p-value is 2\,\Pr(X \leq \min(b, c)) for X \sim \mathrm{Binomial}(n, 0.5), capped at 1.[14]
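The exact version can be sketched with scipy's binomial distribution; the counts here are illustrative:

```python
from scipy.stats import binom

b, c = 7, 2                                   # illustrative discordant counts
n, k = b + c, min(b, c)
# Two-sided exact p-value: double the smaller binomial tail, capped at 1.
p_value = min(1.0, 2 * binom.cdf(k, n, 0.5))
print(p_value)
```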
The step-by-step application proceeds as follows: first, collect paired dichotomous data from the same subjects across two conditions; second, tabulate the data into the 2×2 table and extract b and c; third, compute the test statistic (chi-squared for large samples or exact binomial for small); finally, determine the p-value and compare to \alpha to decide whether to reject the null hypothesis of no systematic change in proportions.[14]
Variations
One common modification to the standard McNemar's test is the application of Yates' continuity correction, which adjusts the chi-squared statistic to better approximate the discrete binomial distribution when the number of discordant pairs is small. The corrected statistic is given by
\chi^2 = \frac{(|b - c| - 1)^2}{b + c},
where b and c represent the off-diagonal counts in the 2×2 contingency table. This correction, originally proposed by Yates for chi-squared tests in the 1930s and adapted to McNemar's test in subsequent applications, helps reduce the inflation of type I error rates for small samples where b + c < 20.
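The corrected and uncorrected statistics can be compared side by side; the counts are illustrative:

```python
from scipy.stats import chi2

b, c = 8, 3                                   # illustrative discordant counts
uncorrected = (b - c) ** 2 / (b + c)          # 25/11
corrected = (abs(b - c) - 1) ** 2 / (b + c)   # 16/11, continuity-corrected
# The correction shrinks the statistic, yielding a larger (more conservative) p-value.
print(chi2.sf(uncorrected, df=1), chi2.sf(corrected, df=1))
```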
For scenarios with very small numbers of discordant pairs, particularly when b + c < 10, the chi-squared approximation becomes unreliable, leading to the use of the exact McNemar's test. This exact version treats the number of discordant pairs in one direction (e.g., b) as following a binomial distribution with parameters n = b + c and p = 0.5 under the null hypothesis of marginal homogeneity, computing the p-value directly from the binomial cumulative distribution function. Mid-p adjustments, which subtract half the point probability of the observed count from the exact p-value, can further improve performance by reducing conservativeness. This approach avoids asymptotic approximations entirely and is recommended for precise inference in small samples.[18]
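A sketch of the exact test with a mid-p adjustment, under the convention of subtracting half the point probability of the observed count (counts illustrative):

```python
from scipy.stats import binom

b, c = 8, 2                                   # illustrative discordant counts
n, k = b + c, min(b, c)
exact_p = min(1.0, 2 * binom.cdf(k, n, 0.5))
# Mid-p: count only half of the observed outcome's probability in the tail.
mid_p = exact_p - binom.pmf(k, n, 0.5)
print(exact_p, mid_p)
```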
The standard McNemar's test is limited to binary outcomes, but for multi-category paired data in k \times k tables (k > 2), the Stuart-Maxwell test provides a generalization to test marginal homogeneity. This extension computes a test statistic as a quadratic form involving the vector of marginal differences and the estimated covariance matrix of those differences, following a chi-squared distribution with k-1 degrees of freedom under the null. Originally developed by Stuart in 1955 and refined by Maxwell in 1970, it extends the focus on off-diagonal elements to multiple categories while maintaining the paired structure.
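A numpy sketch of the Stuart-Maxwell statistic under this formulation, using a hypothetical 3×3 paired table; the covariance entries follow the standard estimates S_{ii} = n_{i+} + n_{+i} - 2n_{ii} and S_{ij} = -(n_{ij} + n_{ji}):

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical 3x3 paired table: rows = first rating, columns = second rating.
N = np.array([[20,  5,  3],
              [ 6, 30,  4],
              [ 2,  8, 25]], dtype=float)
k = N.shape[0]
d = (N.sum(axis=1) - N.sum(axis=0))[:k - 1]   # marginal differences (first k-1)
S = np.zeros((k - 1, k - 1))
for i in range(k - 1):
    for j in range(k - 1):
        if i == j:
            S[i, j] = N[i].sum() + N[:, i].sum() - 2 * N[i, i]
        else:
            S[i, j] = -(N[i, j] + N[j, i])
stat = float(d @ np.linalg.solve(S, d))       # quadratic form d' S^{-1} d
p_value = chi2.sf(stat, df=k - 1)             # chi-squared with k-1 df
print(stat, p_value)
```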
One-sided versions of McNemar's test are employed when there is a directional hypothesis, such as testing for an increase in one category (e.g., b > c). In these cases, the p-value is derived from the one-tailed binomial cumulative probability, Pr(X \geq b | n = b + c, p = 0.5), providing greater power for detecting asymmetries in a specified direction compared to the two-sided test. This variant is particularly useful in applications like diagnostic test comparisons where superiority in one direction is anticipated.[18]
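The one-sided p-value is a single binomial tail; a sketch with illustrative counts where an excess of b over c is hypothesized:

```python
from scipy.stats import binom

b, c = 9, 3                        # illustrative discordant counts
n = b + c
# One-sided p-value: Pr(X >= b) under Binomial(n, 0.5).
p_one_sided = binom.sf(b - 1, n, 0.5)
print(p_one_sided)
```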
For ordinal paired data, weighted variants of McNemar's test incorporate weights based on the magnitude of discordance between categories, enhancing sensitivity to ordered differences. These weights, often linear functions of the distance between category scores, modify the test statistic to account for the ordinal nature, as in extensions for multinomial responses where symmetry is weighted by response category distances. Such approaches are applicable in contexts like item response theory models, including Rasch models, to analyze ordered ratings while preserving the paired design.[19]
Applications
Worked Examples
To illustrate the application of McNemar's test, consider a hypothetical medical study evaluating a smoking cessation program in 100 patients, where smoking status is assessed before and after treatment as a binary outcome (smoker or non-smoker).
The resulting 2×2 contingency table for paired observations is as follows:
|  | Post: Smoker | Post: Non-smoker |
|---|---|---|
| Pre: Smoker | a = 40 | b = 20 |
| Pre: Non-smoker | c = 10 | d = 30 |
Here, b = 20 represents patients who were smokers before but non-smokers after (positive changes), while c = 10 represents those who were non-smokers before but smokers after (negative changes). The test statistic is computed as \chi^2 = \frac{(b - c)^2}{b + c} = \frac{(20 - 10)^2}{30} \approx 3.33, with 1 degree of freedom, yielding a p-value of 0.068. This indicates no statistically significant change in smoking status at the 0.05 level, though the direction suggests a modest net reduction in smoking (more positive than negative changes).
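The calculation can be checked numerically, assuming scipy is available:

```python
from scipy.stats import chi2

b, c = 20, 10                    # discordant counts from the table above
stat = (b - c) ** 2 / (b + c)
p_value = chi2.sf(stat, df=1)
print(round(stat, 2), round(p_value, 3))  # 3.33 0.068
```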
In a psychological context, McNemar's test can assess attitude shifts on a dichotomous scale, such as agreement or disagreement with a policy statement, measured before and after an informational exposure in 50 subjects.
The contingency table is:
|  | Post: Agree | Post: Disagree |
|---|---|---|
| Pre: Agree | a = 15 | b = 12 |
| Pre: Disagree | c = 5 | d = 18 |
With b = 12 (shift from agree to disagree) and c = 5 (shift from disagree to agree), the test statistic is \chi^2 = \frac{(12 - 5)^2}{17} \approx 2.88, p ≈ 0.090. The non-significant result suggests no overall change in attitudes, despite a slight net shift toward disagreement. In such paired designs, b and c capture the direction and magnitude of discordant responses, enabling evaluation of whether one outcome predominates over the other.
A useful measure of effect size in these examples is the standardized difference in discordant proportions, \frac{b - c}{b + c}, which quantifies the net directional change relative to total discordance; for the medical case, this is \frac{10}{30} \approx 0.33, indicating a moderate effect. For the psychological case, it is \frac{7}{17} \approx 0.41.
For small samples where the chi-square approximation may be unreliable (e.g., b + c < 25), an exact binomial test is preferred, treating the discordant pairs as binomial trials under the null of equal discordance (p = 0.5). Consider a hypothetical dataset with b + c = 8 discordant pairs, say b = 6 and c = 2; the two-sided p-value is p = 2 \times \sum_{k=0}^{2} \binom{8}{k} (0.5)^8 \approx 0.29, indicating non-significance. This approach maintains validity for paired binary data under the test's assumptions of independence across pairs.
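The exact p-value above can be reproduced directly:

```python
from scipy.stats import binom

b, c = 6, 2                                   # discordant counts from the example
n, k = b + c, min(b, c)
p_value = min(1.0, 2 * binom.cdf(k, n, 0.5))  # two-sided exact p-value
print(round(p_value, 2))  # 0.29
```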
Computational Implementation
McNemar's test can be implemented in Python using the statsmodels library, which provides the mcnemar function for 2×2 contingency tables. Import the function and pass a 2×2 table as a list of lists, where the off-diagonal elements represent the discordant pairs; by default the function performs the exact binomial test, while exact=False requests the chi-squared approximation (with a continuity correction unless correction=False). The returned object exposes the test statistic and p-value.[20]

```python
from statsmodels.stats.contingency_tables import mcnemar

table = [[a, b], [c, d]]  # a and d are concordant; b and c are discordant
result = mcnemar(table, exact=False)  # chi-squared approximation
print(result.statistic, result.pvalue)
```

Here, a, b, c, and d are the cell counts from the contingency table.[20]
In R, the base stats package includes the mcnemar.test function, which tests for symmetry in a 2x2 contingency table and outputs the chi-squared statistic, degrees of freedom, and p-value, with an optional continuity correction applied by default.[21]
```r
mcnemar.test(matrix(c(a, b, c, d), 2, 2))
```
The output includes interpretation details such as the chi-squared value and associated p-value, indicating whether to reject the null hypothesis of marginal homogeneity.[21]
For other software, SAS uses PROC FREQ with the TABLES statement and AGREE option to compute McNemar's test on paired categorical data, producing the test statistic and p-value alongside agreement measures.[22] The syntax is:
```sas
PROC FREQ DATA=dataset;
  TABLES var1*var2 / AGREE;
RUN;
```
In SPSS, the NPAR TESTS procedure with the MCNEMAR subcommand handles paired nominal variables, generating the chi-squared statistic and exact p-value if specified.[23] The syntax is:
```spss
NPAR TESTS
  /MCNEMAR=var1 WITH var2 (PAIRED)
  /METHOD=EXACT.
```
Variations such as continuity correction are handled in Python's statsmodels by the correction parameter, which is True by default when exact=False and can be disabled with correction=False; the correction subtracts 1 from the absolute difference of the discordant counts, matching the corrected statistic \chi^2 = \frac{(|b - c| - 1)^2}{b + c} and improving the approximation for small samples.[20] For the exact test, use scipy.stats.binomtest on the discordant counts (e.g., testing whether the probability of one discordant type equals 0.5 under the null), which provides a binomial-based p-value without approximation.[24]
```python
from scipy.stats import binomtest

p_value = binomtest(b, n=b + c, p=0.5).pvalue  # b and c are discordant counts
```
This approach is suitable when cell counts are small (e.g., less than 10).
Data input typically involves a 2x2 contingency table for direct use in the test functions, but for raw paired data, preprocess using libraries like pandas to create the table via cross-tabulation before applying the test.[20] For example, in Python:
```python
import pandas as pd

ct = pd.crosstab(before, after)  # before and after are paired binary variables
```
This ensures the input matches the required format, handling concordant and discordant pairs appropriately.
Interpretation and Extensions
Power and Limitations
The statistical power of McNemar's test primarily depends on the effect size, defined as \left| b - c \right| / (b + c), which represents the proportion of discordant pairs favoring one marginal proportion over the other, the total number of discordant pairs b + c, and the chosen significance level \alpha.[25] Power increases with larger effect sizes, more discordant pairs, and lower \alpha, but remains contingent on the within-pair correlation structure.[7] The statistical power of the test is low when the number of discordant pairs is fewer than 25.[5] Simulation-based approaches, such as Monte Carlo enumeration under the multinomial model, provide accurate power estimates when analytical approximations are unreliable, particularly for small to moderate sample sizes.[7]
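Monte Carlo power estimation under the multinomial model can be sketched as follows; the change probabilities and pair count are hypothetical:

```python
import numpy as np
from scipy.stats import chi2

def mcnemar_power(p01, p10, n_pairs, alpha=0.05, n_sim=5000, seed=0):
    """Estimate power of the asymptotic McNemar test by simulation.

    p01 and p10 are the probabilities of the two discordant outcomes per pair;
    the remaining probability mass is concordant and never enters the test.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        concordant, b, c = rng.multinomial(n_pairs, [1 - p01 - p10, p01, p10])
        if b + c == 0:
            continue  # no discordant pairs: cannot reject
        stat = (b - c) ** 2 / (b + c)
        if chi2.sf(stat, df=1) < alpha:
            rejections += 1
    return rejections / n_sim

# Hypothetical design: 15% of pairs change in one direction, 5% in the other.
power = mcnemar_power(0.15, 0.05, n_pairs=100)
print(power)
```

Varying n_pairs or the discordance probabilities shows how power depends on the number and imbalance of discordant pairs.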
A key limitation of the test is its low power in small samples or when discordant pairs are nearly balanced (b \approx c), as the effective sample size is reduced to only the discordant counts, potentially leading to failure in detecting meaningful differences.[5] Additionally, by conditioning on discordant pairs, the test discards information from concordant pairs (a and d), which can limit efficiency compared to unconditional alternatives in certain settings.[8] In repeated measures designs, such as before-after studies, the test assumes no carryover effects from the first measurement to the second, which could bias results if violated, as in crossover trials without sufficient washout periods.[26] The test is also sensitive to multiple comparisons, where performing it repeatedly without adjustments (e.g., Bonferroni correction) inflates the family-wise Type I error rate.[27]
Regarding error rates, the asymptotic \chi^2 approximation tends to over-reject the null hypothesis (elevated Type I error) when the total sample size is small or discordant pairs are few (b + c < 25), making the exact binomial test preferable for such cases despite its computational intensity for very large n.[5] This can correspondingly increase Type II errors in low-power scenarios, underscoring the need for adequate planning.[28]
McNemar's test should be avoided for unpaired binary data, where the standard Pearson chi-squared test is more appropriate; for continuous paired outcomes, where the paired t-test or Wilcoxon signed-rank test applies; and in the presence of missing data unless missingness is completely at random (MCAR), as listwise deletion in paired analyses can introduce bias otherwise.[5][8]
Power can be enhanced through larger overall sample sizes to yield more discordant pairs, or by employing one-sided alternatives when a directional hypothesis is justified, which increases power for effects in the hypothesized direction by concentrating the rejection region in one tail.[7] Reporting effect sizes, such as \left| b - c \right| / (b + c), is advisable to aid interpretation of practical significance beyond p-values.[29]
For unpaired binary data from independent samples, the standard Pearson chi-squared test assesses association between two categorical variables, while Fisher's exact test provides an exact alternative suitable for small sample sizes; these differ from McNemar's test by not accounting for the paired structure that controls for individual variability.[6]
When dealing with more than two related binary measurements per subject, such as in repeated measures designs, Cochran's Q test extends the principles of McNemar's test to evaluate overall differences across multiple time points or conditions, serving as a non-parametric analog to repeated measures ANOVA for dichotomous outcomes.[30]
For paired ordinal data, the Wilcoxon signed-rank test is appropriate, as it incorporates the magnitude and direction of differences through ranking, providing greater power than McNemar's nominal-level approach when the ordering of categories conveys additional information.[5]
In the context of marginal homogeneity for multi-way contingency tables from paired categorical data, Bhapkar's test offers a Wald-type alternative to generalized versions of McNemar's test, utilizing the asymptotic normality of the marginal proportions and generally yielding greater power than the Stuart-Maxwell score-type approach for testing equality of marginal distributions under correlated observations.[31]
Contemporary extensions for paired binary outcomes incorporate covariates through logistic mixed-effects models, which account for random effects to model heterogeneity across subjects, or generalized estimating equations (GEE), which focus on population-averaged effects while handling within-subject correlation via working correlation matrices.[32] Bayesian analogs to McNemar's test employ beta-binomial priors to model the dependence in discordant pairs, enabling posterior inference on the difference in marginal probabilities and incorporating prior knowledge for small samples.[33]
McNemar's test is ideal for simple pre-post designs without covariates, emphasizing discordant pairs to test marginal homogeneity; however, for adjusted analyses involving predictors or complex correlations, regression-based methods like GEE or mixed models are preferred to enhance interpretability and control for confounding.[2]