Confounding

Confounding is a type of bias in observational studies, particularly in epidemiology and statistics, where a third variable—known as a confounder—distorts the apparent association between an exposure (or independent variable) and an outcome (or dependent variable) by being associated with both. This distortion can result in overestimation, underestimation, or even reversal of the true effect, leading to spurious conclusions about causality. For instance, in studies examining the relationship between alcohol consumption and lung cancer risk, smoking often acts as a confounder because it is associated with both higher alcohol intake and increased lung cancer incidence, independent of alcohol's direct effects.

A confounder is defined as a variable that influences both the exposure and the outcome, creating a mixing of effects that obscures the genuine relationship. To qualify as a potential confounder, a variable must meet three key criteria: (1) it is associated with the exposure in the source population; (2) it is associated with the outcome, independently of the exposure; and (3) it is not an intermediate step in the causal pathway between the exposure and outcome. These criteria ensure that the variable is not merely a consequence of the exposure or outcome but a genuine external influence, such as age in analyses of diet and disease, where older individuals may have different dietary habits and higher disease risk.

Confounding poses a significant challenge in non-randomized studies, as it can mask true associations or fabricate false ones, impacting public health decisions and scientific inference. For example, early observational data on statins suggested a protective effect against Parkinson's disease risk (effect estimate of 0.75), but adjustment for cholesterol levels—a confounder—revealed no significant benefit (effect estimate of 1.04). To mitigate confounding, researchers employ strategies such as randomization in experimental designs, which distributes confounders evenly across groups; restriction or matching to limit variability in the confounder; stratification to analyze subgroups; or statistical adjustment via regression models. These methods, when applied appropriately, help isolate the exposure-outcome relationship and enhance the validity of study findings.

The concept of confounding has evolved since the mid-20th century, with foundational discussions in epidemiological literature emphasizing its role in causal inference, though its recognition traces back to earlier statistical observations of extraneous variables. Despite advances in adjustment techniques, unmeasured or residual confounding remains a persistent limitation in many studies, underscoring the importance of careful design and analysis.

Fundamentals

Definition

In statistics and epidemiology, confounding refers to a bias arising in observational studies when a third variable, known as a confounder, distorts the observed association between an exposure (independent variable) and an outcome (dependent variable). A confounder is defined as a variable that is causally associated with both the exposure and the outcome, independently of any direct effect of the exposure on the outcome, and is not an intermediate step in the causal pathway from exposure to outcome. This creates a non-causal path that mixes the true effect with extraneous influences, leading to a spurious or misleading estimate of the causal relationship.

To establish the prerequisites for confounding, consider a simple causal diagram: the exposure may directly influence the outcome, but the confounder precedes and affects both, opening a "backdoor" path through which association flows without reflecting the exposure's true impact. This setup violates the assumption of exchangeability between exposed and unexposed groups, as the confounder is distributed unevenly across exposure levels, thereby altering the outcome independently of the exposure. Confounding thus exemplifies how mere association between exposure and outcome does not imply causation, as the observed link may stem from shared causes rather than a direct causal mechanism.

Mathematically, confounding bias can be formulated on the additive scale for measures like risk differences, where the apparent (crude) effect equals the true causal effect plus the bias term due to confounding:
\text{Apparent effect} = \text{True effect} + \text{Confounding bias}
Here, the confounding bias represents the distortion introduced by the confounder, which can be positive (exaggerating the apparent effect, e.g., making a null true effect appear positive) or negative (attenuating or reversing the apparent effect, e.g., masking a true positive effect). The direction and magnitude depend on the strength of the confounder's associations with exposure and outcome, as well as its distribution in the population. On the multiplicative scale, such as for relative risks, the apparent effect is instead the true effect multiplied by a bias factor greater than or less than 1, reflecting over- or underestimation.
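
To make the additive decomposition concrete, the following minimal simulation (all variable names and effect sizes are illustrative assumptions, not estimates from any real study) generates a binary confounder that raises both the probability of exposure and the baseline risk of the outcome, then compares the crude risk difference with a confounder-standardized one:

```python
# Minimal sketch of additive confounding bias (assumed toy effect sizes).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

z = rng.binomial(1, 0.5, n)                 # binary confounder
x = rng.binomial(1, 0.2 + 0.5 * z)          # exposure: more likely when z = 1
true_rd = 0.05                              # true causal risk difference of x
y = rng.binomial(1, 0.1 + true_rd * x + 0.2 * z)  # z also raises outcome risk

crude = y[x == 1].mean() - y[x == 0].mean()

# Average the stratum-specific risk differences over the distribution of z
# (standardization over the confounder) to recover the true effect.
adjusted = sum(
    (y[(x == 1) & (z == v)].mean() - y[(x == 0) & (z == v)].mean()) * (z == v).mean()
    for v in (0, 1)
)

print(f"crude risk difference:    {crude:.3f}")    # true effect + confounding bias
print(f"adjusted risk difference: {adjusted:.3f}") # close to 0.05
print(f"estimated bias:           {crude - adjusted:.3f}")
```

Here the positive bias arises because exposed individuals are disproportionately drawn from the high-risk z = 1 stratum, exactly the mechanism the equation above describes.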

Illustrative Example

A classic example of confounding arises from the observed positive association between ice cream sales and drowning rates in observational data from a coastal region. Without controlling for external factors, one might erroneously conclude that increased ice cream consumption causes more drownings, as both metrics rise together during certain periods. This spurious association is driven by summer temperature acting as a confounder, which independently influences both ice cream sales—through higher demand for cold treats—and drowning rates—through more people engaging in water activities like swimming. As previously defined, a confounder is a third variable associated with both the exposure and outcome, producing a distorted estimate of their relationship.

The causal chain proceeds as follows: there is no direct causal path from the exposure (ice cream sales) to the outcome (drowning rates); instead, the confounder (temperature) links them by causing increases in ice cream purchases and, separately, in water exposure that elevates drowning risk. This common cause creates the illusion of association between ice cream sales and drownings.

To quantify the bias, consider a regression analysis of monthly data where ice cream sales (in thousands of dollars) predict drownings. The crude (unadjusted) model shows a strong positive association, while adjustment for temperature eliminates it, demonstrating how confounding inflates the apparent effect.
Model | Coefficient (β) for ice cream sales | p-value
Crude (unadjusted) | 0.5269 | < 0.001
Adjusted for temperature | -0.036 | 0.387
This table illustrates the bias: the unadjusted estimate suggests a meaningful link (every $1,000 increase in sales tied to 0.53 more drownings), but adjustment reveals no such relationship.
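
A short simulation can reproduce the qualitative pattern in the table (the data-generating values below are assumptions chosen to mimic it, not the original dataset; the statsmodels package is assumed available):

```python
# Sketch: crude vs. temperature-adjusted regression of drownings on ice cream
# sales, with simulated monthly data (all coefficients are illustrative).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 120  # e.g., ten years of monthly observations

temperature = rng.uniform(0, 35, n)                       # confounder (deg C)
ice_cream = 2 + 0.8 * temperature + rng.normal(0, 2, n)   # sales ($1,000s)
drownings = 0.1 * temperature + rng.normal(0, 1, n)       # no direct sales effect

crude = sm.OLS(drownings, sm.add_constant(ice_cream)).fit()
adjusted = sm.OLS(
    drownings, sm.add_constant(np.column_stack([ice_cream, temperature]))
).fit()

print(f"crude beta for sales:    {crude.params[1]:.3f} (p = {crude.pvalues[1]:.4f})")
print(f"adjusted beta for sales: {adjusted.params[1]:.3f} (p = {adjusted.pvalues[1]:.4f})")
```

Because drownings depend only on temperature in this simulation, the crude sales coefficient is positive and statistically significant, while the adjusted coefficient collapses toward zero, mirroring the table.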

Historical Development

Early Concepts

The concept of confounding traces its philosophical roots to ancient inquiries into causation, particularly Aristotle's doctrine of the four causes, which posited that phenomena arise from multiple interacting factors: material (the substance), formal (the structure), efficient (the agent of change), and final (the purpose). This framework acknowledged the complexity of multiple causes contributing to an effect, laying groundwork for later recognition that attributing outcomes to a single factor could mislead if other influences were not accounted for.

In 19th-century epidemiology, these ideas manifested implicitly during investigations of disease outbreaks, as seen in John Snow's 1854 analysis of cholera in London's Soho district. Snow mapped cases and compared mortality rates between residents using the contaminated Broad Street pump and those supplied by other water sources, effectively isolating water quality as the key variable while assuming comparability in other social and environmental factors. This approach addressed potential confounders like population density or sanitation differences by leveraging geographic variation as a natural control, demonstrating an early intuitive grasp of non-comparability between groups.

By the mid-20th century, early medical studies began explicitly identifying demographic variables as confounders in causal inferences. In their 1950 case-control study on smoking and lung cancer, Richard Doll and Austin Bradford Hill matched 709 lung cancer patients with hospital controls of the same sex and within five-year age groups to mitigate biases from age and sex distributions, which could otherwise distort the observed association between tobacco use and disease risk. This methodological choice reflected a growing awareness that such variables might independently influence both exposure and outcome.

Underpinning these developments was a philosophical transition from deterministic views of causation—where effects followed inevitably from causes—to probabilistic models, where exposures merely elevate the likelihood of outcomes amid multiple influences. This shift, evident in 19th-century vital statistics and early 20th-century epidemiology, emphasized empirical comparison over absolute necessity, allowing for the nuanced handling of confounding in non-deterministic natural processes.

Key Milestones

The role of randomized controlled trials (RCTs) in addressing confounding gained prominence in the mid-20th century, particularly through the 1948 Medical Research Council (MRC) trial on streptomycin for pulmonary tuberculosis, which demonstrated how randomization could balance known and unknown confounders across treatment groups to isolate the drug's effect. This trial, involving 107 patients allocated via random numbers, highlighted the necessity of such methods to prevent selection bias and ensure comparable groups, marking a foundational shift toward experimental designs that inherently control for confounding in clinical research.

In 1959, Jerome Cornfield and colleagues advanced the assessment of confounding in observational studies by introducing inequalities to evaluate potential biases in the association between smoking and lung cancer. Their analysis showed that no plausible confounder could fully explain the observed risk unless it exhibited an implausibly strong association with both smoking and lung cancer, thereby strengthening causal inferences and establishing a quantitative framework for ruling out alternative explanations in epidemiological data.

The same year, Nathan Mantel and William Haenszel developed a statistical method for stratifying data by potential confounders to compute an adjusted odds ratio, providing a practical tool for observational epidemiology. Building on earlier stratification ideas, such as William Cochran's 1954 work on combining chi-square tests across strata, the Mantel-Haenszel procedure became widely adopted for controlling confounding in case-control and cohort studies by weighting stratum-specific estimates to yield an overall unbiased measure of association.

By the 1970s, Olli S. Miettinen formalized a precise definition of confounding in epidemiology, distinguishing it from effect modification as a bias arising when a third variable distorts the exposure-outcome relationship due to its associations with both. In his 1974 paper, Miettinen emphasized that confounding occurs if the crude effect measure differs from the effect adjusted for the potential confounder, offering a clear operational criterion that influenced subsequent methodological developments and teaching in the field.

From the 1980s onward, the recognition of confounding spurred broader institutional and methodological responses, including the routine integration of stratification techniques like the Mantel-Haenszel procedure into guidelines for epidemiological analysis and the emphasis on multivariable adjustments in large-scale studies to mitigate bias across diverse populations.

Types and Variants

Classical Confounding

Classical confounding represents the core mechanism by which a third variable distorts the observed association between an exposure and an outcome in observational studies, leading to biased estimates of causal effects. In this standard form, the confounder influences both the distribution of the exposure and the risk of the outcome independently of the exposure itself, thereby mixing extraneous effects into the apparent exposure-outcome relationship. This phenomenon is particularly prevalent in non-experimental settings where randomization is absent, resulting in groups that differ systematically on the confounding variable.

For a variable to qualify as a confounder under classical criteria, it must satisfy three key causal conditions: first, it must be associated with the exposure in the source population without being caused by the exposure; second, it must be independently associated with the outcome among individuals not exposed to the factor of interest; and third, it must not lie on the causal pathway between the exposure and the outcome, so that it acts as a confounder rather than a mediator. These criteria, rooted in epidemiologic principles, ensure the variable acts as an extraneous risk factor that imbalances comparison groups.

The direction of bias induced by classical confounding depends on the relative strengths and directions of the associations involved, potentially causing overestimation or underestimation of the true effect. For relative risk (RR), the confounded estimate can be approximated as \text{RR}_{\text{confounded}} = \text{RR}_{\text{true}} \times \text{RR}_{\text{confounder in unexposed}}, where \text{RR}_{\text{confounder in unexposed}} captures the association between the confounder and outcome among the unexposed group; if this multiplier exceeds 1, the bias tends toward overestimation (away from the null), while a value less than 1 leads to underestimation (toward the null). This structure is often illustrated conceptually as a confounding triangle, with the exposure linked to the outcome via a direct causal path, the confounder connected to both the exposure (through association) and the outcome (through causation), and no direct link from exposure to confounder, emphasizing the dual pathways that distort the marginal association.

Confounding must be distinguished from other third-variable effects in causal inference, particularly mediation and interaction, to avoid misinterpretation of causal relationships. In mediation, a variable lies on the causal pathway between the exposure and outcome, transmitting part or all of the effect, whereas a confounder is associated with both the exposure and outcome but does not lie on this pathway. Statistically, mediation and confounding can appear identical, but they are differentiated conceptually: confounding distorts the total effect by providing an alternative explanation, while mediation explains how the effect occurs. Unlike confounding, which acts independently to bias the exposure-outcome association, interaction (or effect modification) occurs when the effect of the exposure on the outcome varies across levels of another variable, representing heterogeneity rather than distortion. Confounding requires control to estimate unbiased effects, whereas interaction should be explored and reported to capture subgroup differences.
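
As a worked numeric reading of this approximation (values assumed purely for illustration): if the true relative risk is 1.0 (no causal effect) and the confounder is associated with a 1.5-fold outcome risk among the unexposed, then
\text{RR}_{\text{confounded}} \approx 1.0 \times 1.5 = 1.5
so an entirely null exposure can appear to raise risk by 50% before adjustment.
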
Collider bias represents another related phenomenon, arising when conditioning on a common effect of two variables (the collider) induces a spurious association between them by opening a non-causal path. In causal diagrams, a collider is a variable influenced by both the exposure and outcome (or their causes), and stratifying or adjusting for it can bias estimates, unlike confounding, which involves backdoor paths that are blocked by adjustment. This bias is particularly relevant in selection or conditioning scenarios, such as restricting analyses to survivors in longitudinal studies, where it creates associations not present in the source population.

Simpson's paradox exemplifies confounding at an aggregate level, where trends observed in combined data reverse or disappear when data are stratified by levels of the confounder. This occurs due to uneven distribution of the confounder across exposure groups, leading to misleading overall associations that align with the true effect only within strata. For instance, an intervention may appear ineffective overall but beneficial in each subgroup defined by the confounder, highlighting how aggregation masks underlying relationships distorted by the third variable.
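
Simpson's paradox can be seen numerically in the widely reproduced kidney-stone treatment comparison (Charig et al., 1986), a standard teaching example; the sketch below tabulates stratum-specific and pooled success rates, with stone size acting as the confounder:

```python
# Simpson's paradox: treatment A wins within each stratum of stone size,
# yet B looks better in the pooled data (counts from the classic example).
strata = {
    "small stones": {"A": (81, 87), "B": (234, 270)},   # (successes, patients)
    "large stones": {"A": (192, 263), "B": (55, 80)},
}

for name, arms in strata.items():
    rates = {t: s / n for t, (s, n) in arms.items()}
    print(f"{name}: A = {rates['A']:.0%}, B = {rates['B']:.0%}")  # A higher in both

totals = {
    t: (sum(strata[s][t][0] for s in strata), sum(strata[s][t][1] for s in strata))
    for t in ("A", "B")
}
for t, (s, n) in totals.items():
    print(f"overall {t}: {s / n:.0%}")  # B appears higher overall
```

The reversal occurs because treatment A was given disproportionately to the severe large-stone cases, so aggregation mixes unlike case mixes.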

Identification and Assessment

Criteria for Identification

Identifying potential confounders in epidemiological studies requires applying established criteria to evaluate whether a variable distorts the observed association between an exposure and an outcome. A variable qualifies as a confounder if it meets three key conditions: (1) it is associated with the exposure in the source population; (2) it is a risk factor for the outcome independent of the exposure; and (3) it is not an intermediate variable in the causal pathway between the exposure and the outcome. These criteria ensure the variable mixes effects without being affected by the exposure itself, distinguishing confounding from other biases like selection or information bias. Confounding arises when such a variable is unevenly distributed across exposure groups, leading to spurious associations.

Domain knowledge, encompassing prior biological plausibility and established associations, is fundamental for recognizing potential confounders, as it leverages expert understanding of mechanisms linking the variable to both exposure and outcome. For instance, age or socioeconomic status may be flagged as confounders in studies of environmental exposures and health outcomes due to well-documented biological and social pathways. This criterion emphasizes theoretical justification over empirical testing alone, ensuring selection aligns with plausible causal structures, such as those informed by prior literature on common causes. Integrating domain knowledge reduces reliance on data-driven methods that might overlook unmeasured variables, promoting robust study design from the outset.

The change-in-estimate rule provides a practical, semi-quantitative approach to identify confounding by comparing the effect measure (e.g., odds ratio or risk ratio) before and after adjusting for the potential confounder. A shift of 10-20% or more in the adjusted estimate relative to the crude (unadjusted) estimate suggests the variable is a confounder warranting control, with the 10% threshold commonly used as a conservative cutoff to detect meaningful distortion. This criterion, evaluated through stratified analysis or regression models, helps quantify bias but should be interpreted cautiously, as smaller shifts may still indicate confounding in precise studies with narrow confidence intervals. Its application is particularly useful in exploratory phases to prioritize variables for inclusion, though it assumes the adjustment method accurately captures the confounder-outcome relationship.
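
The change-in-estimate rule is simple enough to express directly; the helper below (names and the ratio-scale convention are illustrative choices, and some authors compute the change on the log scale instead) flags a variable when adjustment shifts the effect estimate beyond a chosen cutoff:

```python
# Sketch of the change-in-estimate criterion with the conventional 10% cutoff.
def change_in_estimate(crude_or: float, adjusted_or: float, cutoff: float = 0.10) -> bool:
    """Return True if adjusting for the candidate variable moves the odds
    ratio by more than `cutoff` relative to the crude estimate."""
    relative_change = abs(adjusted_or - crude_or) / crude_or
    return relative_change > cutoff

print(change_in_estimate(2.0, 1.6))   # 20% shift -> True: treat as a confounder
print(change_in_estimate(2.0, 1.95))  # 2.5% shift -> False: likely ignorable
```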

Detection Techniques

Detection techniques for confounding involve empirical statistical methods applied to observational data to assess whether a variable distorts the apparent association between an exposure and an outcome. These approaches help confirm the presence of confounding after data collection, distinguishing them from a priori identification criteria by focusing on quantitative diagnostics. A key modern tool for identification is the use of directed acyclic graphs (DAGs), which visually represent causal relationships to identify variables that are common causes of exposure and outcome, and thus potential confounders, without needing data. DAGs help minimize adjustment sets and avoid overadjustment. Common empirical methods include stratification-based tests, regression-based diagnostics, and sensitivity analyses for unobserved factors.

Stratification tests detect confounding by dividing the data into subgroups (strata) based on levels of a potential confounder and comparing the crude (unadjusted) association between exposure and outcome to the stratum-specific associations. If the overall association changes substantially after stratification, the stratifying variable is likely a confounder. A widely used implementation is the Mantel-Haenszel procedure, which computes a summary measure, such as an adjusted odds ratio, across strata to provide a pooled estimate that accounts for the confounder. For binary outcomes and exposures, the Mantel-Haenszel estimator approximates a weighted average of stratum-specific odds ratios, with weights related to the number of non-exposed individuals in each stratum; a substantial difference between this adjusted estimate and the crude odds ratio indicates confounding. This method assumes homogeneity of the exposure-outcome association across strata and is particularly effective in case-control and cohort studies for detecting confounding due to categorical variables like age or sex.

Regression diagnostics assess confounding by incorporating a suspected confounder into a multivariable model and evaluating changes in the estimated effect of the primary exposure. In linear or logistic regression, the process involves first fitting a crude model with only the exposure and outcome, then adding the potential confounder and comparing the exposure coefficient (or odds ratio) between models. A common rule of thumb is that a change of 10% or more in the exposure effect estimate suggests confounding, though this threshold can vary by context and study precision. For instance, if the confounder is correlated with both the exposure and outcome, its inclusion will shift the exposure coefficient toward the true causal effect, revealing the distortion in the crude model. This approach is computationally straightforward and applicable to continuous or categorical confounders, but it requires careful model specification to avoid issues like multicollinearity.

Sensitivity analysis, particularly Rosenbaum's methods, evaluates the robustness of findings to potential unobserved confounding by deriving bounds on how much hidden bias could alter conclusions without directly observing the confounder. In this framework, for matched observational studies with binary outcomes, the analysis parameterizes the degree of imbalance in an unobserved binary covariate between treatment groups using an odds ratio \Gamma, where \Gamma = 1 implies no unobserved confounding and larger \Gamma indicates increasing susceptibility to bias. The method computes upper and lower bounds on the treatment effect (e.g., odds ratio) under different values of \Gamma; if the bounds exclude the null hypothesis even at high \Gamma (e.g., \Gamma > 2), the result is deemed robust to moderate unobserved confounding. This technique is influential in non-experimental research, as it quantifies the strength of an unobserved confounder needed to overturn observed associations, aiding in assessing the credibility of causal claims.
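
A minimal implementation of the Mantel-Haenszel adjusted odds ratio illustrates the stratification diagnostic (the stratum counts are hypothetical, ordered as exposed cases, exposed non-cases, unexposed cases, unexposed non-cases):

```python
# Sketch: Mantel-Haenszel summary odds ratio over 2x2 strata (toy counts).
def mantel_haenszel_or(strata):
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

strata = [
    (40, 60, 20, 80),  # stratum 1 (e.g., younger participants)
    (30, 20, 45, 55),  # stratum 2 (e.g., older participants)
]

# Collapsing the strata into one table gives the crude odds ratio.
pooled = tuple(map(sum, zip(*strata)))
crude = mantel_haenszel_or([pooled])

print(f"crude OR:       {crude:.2f}")                       # about 1.82
print(f"MH adjusted OR: {mantel_haenszel_or(strata):.2f}")  # about 2.25
```

The gap between the crude estimate (about 1.82) and the Mantel-Haenszel estimate (about 2.25) signals confounding by the stratifying variable in these toy data.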

Control Methods

Design-Based Strategies

Design-based strategies aim to prevent confounding by incorporating specific choices into the study protocol prior to data collection, thereby ensuring balance or homogeneity with respect to potential confounders across exposure groups.

Randomization, a cornerstone of randomized controlled trials (RCTs), distributes potential confounders evenly across treatment and control groups on average through chance allocation, thereby minimizing both known and unknown sources of bias. This probabilistic balancing occurs because random assignment ensures that, in expectation, the distribution of any confounder is identical between groups, allowing causal inferences about the exposure effect without systematic distortion. The principle was formalized by Ronald A. Fisher in his seminal work on experimental design, which emphasized randomization as essential for valid inference in comparative studies.

Restriction involves narrowing the study population to individuals within a narrow range of the potential confounder, creating homogeneity that eliminates variation in that factor between exposed and unexposed groups. For instance, in a study examining the effect of a treatment on an outcome, researchers might restrict enrollment to participants aged 50-60 years to control for age-related confounding. This approach prevents confounding by the restricted variable but may limit generalizability to broader populations.

Matching pairs exposed and unexposed subjects based on key confounders to ensure similar distributions of those variables across groups, thereby reducing confounding at the design stage. In cohort studies, for example, each exposed individual might be matched to an unexposed counterpart of the same age and sex; in case-control studies, controls are selected to match cases on these factors. While effective for observed confounders, matching does not address unknown ones and requires careful selection to avoid over-matching, which can reduce study efficiency.
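
A small simulation shows why randomization works where observational assignment fails (the assignment mechanisms and covariate here are assumptions for illustration):

```python
# Sketch: randomization balances a confounder (age) across arms in
# expectation, while exposure that depends on age does not.
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
age = rng.normal(55, 10, n)  # potential confounder

# Observational assignment: older individuals are more likely to be exposed.
observational = rng.random(n) < 1 / (1 + np.exp(-(age - 55) / 5))
# Randomized assignment: a fair coin flip, independent of age.
randomized = rng.random(n) < 0.5

for label, arm in [("observational", observational), ("randomized", randomized)]:
    gap = age[arm].mean() - age[~arm].mean()
    print(f"{label}: mean age gap between groups = {gap:.2f} years")
```

The observational design leaves an age gap of several years between groups, while the randomized gap hovers near zero, which is the balance property that licenses causal comparison.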

Analysis-Based Adjustments

Analysis-based adjustments refer to statistical methods applied after data collection to control for the effects of measured confounders on the association between an exposure and an outcome. These techniques aim to estimate the causal effect by accounting for the distribution of confounders within the study population, thereby reducing bias in effect estimates. Common approaches include stratification, standardization, and multivariable regression, each providing a framework to isolate the exposure-outcome relationship while adjusting for confounding variables identified through prior detection methods.

Stratification involves dividing the data into subgroups, or strata, based on levels of the confounding variable, allowing separate estimation of the exposure-outcome association within each stratum where the confounder is held constant. This method assumes no residual confounding within strata and no interaction between the exposure and confounder unless explicitly modeled. A seminal application in epidemiology is the Mantel-Haenszel procedure, which pools stratum-specific estimates to obtain an overall adjusted measure of association, such as the odds ratio, using the formula:

\hat{OR}_{MH} = \frac{ \sum_k \frac{a_k d_k}{n_k} }{ \sum_k \frac{b_k c_k}{n_k} }

where a_k, b_k, c_k, and d_k are the cell counts in the 2×2 table for stratum k (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases, respectively), and n_k is the total in stratum k. This approach effectively controls for categorical confounders like age or sex, providing a valid summary estimate when the stratum-specific associations are homogeneous.

Standardization extends stratification by computing adjusted rates or risks through weighted averages, eliminating confounding due to differences in the distribution of the confounder across groups. Direct standardization applies stratum-specific rates from the study populations to a standard population's structure, yielding comparable adjusted rates; it is preferred when rates in all groups are reliably estimated. The age-adjusted rate for a group is given by:

\text{Adjusted rate} = \sum_i (r_i \times w_i)

where r_i is the rate in stratum i (e.g., age group), and w_i is the proportion of the standard population in that stratum, with the sum over all strata normalized by the standard population size if counts rather than proportions are used. Indirect standardization, conversely, applies the standard population's stratum-specific rates to the study population's structure to compute expected events, then derives a standardized mortality ratio (SMR) as observed over expected; it is useful for sparse data where directly estimated rates are unstable. Both methods assume the confounder is the primary source of distortion and provide interpretable adjusted metrics for public health comparisons, such as age-standardized incidence rates.

Multivariable regression models offer a flexible approach to adjust for multiple confounders simultaneously by including them as covariates in a regression framework, estimating the exposure effect while holding confounders constant. In logistic regression for binary outcomes, the adjusted odds ratio is derived from the model coefficient for the exposure, controlling for confounders via a model of the form

\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 \text{exposure} + \sum_j \beta_j \text{confounder}_j

where the exponentiated \beta_1 yields the adjusted OR, assuming linearity in the logit and no unmodeled interactions.
This method accommodates continuous or categorical confounders and allows assessment of effect modification through interaction terms, making it widely used in observational studies for its efficiency with large datasets. Proper model specification, including confounder selection based on causal criteria, is essential to avoid residual bias.
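
Direct standardization translates directly into a weighted average; the sketch below (rates and standard-population weights are hypothetical) compares two populations on a common age structure:

```python
# Sketch of direct age standardization: adjusted rate = sum of stratum
# rates weighted by standard-population proportions (toy numbers).
standard_weights = [0.50, 0.35, 0.15]  # proportions for strata <40, 40-64, 65+

rates_group_a = [0.001, 0.005, 0.030]  # events per person-year by age stratum
rates_group_b = [0.002, 0.006, 0.025]

def direct_standardize(rates, weights):
    return sum(r * w for r, w in zip(rates, weights))

print(f"group A adjusted rate: {direct_standardize(rates_group_a, standard_weights):.4f}")
print(f"group B adjusted rate: {direct_standardize(rates_group_b, standard_weights):.4f}")
```

Because both groups are weighted to the same standard age structure, the adjusted rates can be compared free of confounding by age.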

Advanced Applications

Causal Inference Integration

In modern causal inference, directed acyclic graphs (DAGs) provide a graphical framework for representing causal relationships and identifying confounding structures. A DAG consists of nodes representing variables and directed edges indicating causal influences, allowing researchers to visualize paths through which confounders may distort associations between treatment and outcome. The backdoor criterion, developed by Judea Pearl, specifies a set of variables that blocks all backdoor paths—non-causal paths from treatment to outcome that pass through common causes—thus enabling unbiased estimation of causal effects when conditioning on those variables. This criterion ensures that no descendant of the treatment is included in the set, preventing the introduction of bias from mediators, and has been foundational in Pearl's structural causal model since the 1990s.

Do-calculus extends this graphical approach by providing a symbolic calculus to distinguish interventional distributions, denoted P(Y \mid do(X)), from observational ones, P(Y \mid X), particularly when confounders are present. Introduced by Pearl, do-calculus comprises three rules that allow manipulation of expressions involving the do-operator, which simulates interventions by severing incoming edges to the intervened node in the DAG. For confounder adjustment, these rules identify when conditioning on observed variables suffices to equate the interventional distribution to a conditional one, such as P(Y \mid do(X)) = \sum_Z P(Y \mid X, Z) P(Z), where Z blocks backdoor paths. This facilitates rigorous adjustment for confounding without assuming specific functional forms, emphasizing conditions derived directly from the graph.

The potential outcomes framework, pioneered by Donald Rubin, formalizes confounding through counterfactual reasoning, where each unit has two potential outcomes: Y(1) under treatment and Y(0) under control. Confounding arises when treatment assignment is not independent of these potential outcomes, leading to systematic differences in their distributions across treated and untreated groups, such that the observed association E[Y \mid X=1] - E[Y \mid X=0] deviates from the causal effect E[Y(1) - Y(0)]. In Rubin's model, adjustment for confounding requires strong ignorability—independence of potential outcomes from treatment given covariates—which aligns with blocking confounding paths and restores comparability of groups. This definition underscores confounding as a failure of exchangeability in the observed data, addressable by matching or conditioning on confounders to estimate average effects.
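
The backdoor adjustment formula can be checked empirically on simulated data; the sketch below (binary variables and effect sizes are assumptions) estimates P(Y = 1 \mid do(X = x)) by standardizing over an observed confounder Z:

```python
# Sketch: backdoor adjustment, P(Y|do(X)) = sum_z P(Y|X,Z) P(Z), on toy data.
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
z = rng.binomial(1, 0.4, n)                   # observed confounder
x = rng.binomial(1, 0.3 + 0.4 * z)            # z opens a backdoor path to y
y = rng.binomial(1, 0.2 + 0.1 * x + 0.3 * z)  # true causal effect of x: 0.10

def p_do(x_val):
    # Standardize the conditional outcome risk over the marginal law of z.
    return sum(
        y[(x == x_val) & (z == v)].mean() * (z == v).mean() for v in (0, 1)
    )

naive = y[x == 1].mean() - y[x == 0].mean()  # confounded observational contrast
causal = p_do(1) - p_do(0)                   # backdoor-adjusted contrast
print(f"P(Y|X=1) - P(Y|X=0):         {naive:.3f}")
print(f"P(Y|do(X=1)) - P(Y|do(X=0)): {causal:.3f}  (true value: 0.100)")
```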

Contemporary Challenges

One persistent challenge in causal inference is unmeasured confounding, where unobserved variables influence both exposure and outcome, leading to biased estimates despite adjustments for measured covariates. Traditional methods like multivariable regression fail to mitigate this, as they cannot account for unknown factors, potentially exaggerating or masking true associations.

To address unmeasured confounding, instrumental variable (IV) estimation employs a variable that correlates with the exposure but affects the outcome only through the exposure, satisfying the relevance and exclusion restriction assumptions. A foundational IV approach is two-stage least squares (2SLS), which proceeds in two steps. First, regress the endogenous exposure X on the instrument Z and exogenous covariates W to obtain predicted values \hat{X}:

\hat{X} = \pi_0 + \pi_1 Z + \pi_2 W + \epsilon

Second, regress the outcome Y on \hat{X} and W:

Y = \beta_0 + \beta_1 \hat{X} + \beta_2 W + \nu

The coefficient \beta_1 estimates the local average treatment effect for compliers influenced by the instrument. Despite its utility, 2SLS is limited by weak instrument bias, where low instrument strength inflates standard errors and reduces precision, and by sensitivity to assumption violations like pleiotropy in genetic contexts.

Collider stratification bias presents another modern hurdle, particularly in genomics and epidemiology, where conditioning on a common effect (collider) of exposure and outcome induces spurious correlations by opening non-causal paths in directed acyclic graphs. For instance, stratifying by survival in studies of early-life exposures and later diseases—such as restricting analyses to survivors of a cohort—can create inverse associations between unrelated factors, as seen in the birthweight paradox, where low birthweight appears protective against infant mortality due to selection on survival. In genomic research, conditioning on phenotypes like disease status in genome-wide association studies can bias estimates by stratifying on colliders influenced by both genetic variants and environmental factors, exacerbating selection bias in diverse populations. This bias is especially problematic in big data settings, where automated conditioning on derived variables amplifies distortions without explicit recognition.

Emerging applications highlight additional pitfalls, such as in machine learning, where models risk overfitting to confounders, capturing spurious patterns that fail to generalize beyond training data. For example, predictive algorithms in healthcare may learn demographic confounders as proxies for outcomes, leading to biased predictions in underrepresented groups unless confounder-invariant features are enforced through causal representations. In observational studies from the 2020s, time-varying confounding—arising from evolving factors like policy changes, testing availability, and behaviors—has biased estimates of interventions' effects, as seen in analyses of COVID-19 vaccine safety, where temporal shifts in exposure patterns introduced unmeasured biases detectable via self-controlled designs. These challenges underscore the need for robust sensitivity analyses and hybrid methods integrating causal graphs to navigate dynamic, high-dimensional data environments.
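
The two stages above can be sketched with ordinary least squares on simulated data (a single instrument, no extra covariates W, and all effect sizes are illustrative assumptions):

```python
# Sketch of 2SLS with one instrument; u is an unmeasured confounder.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
u = rng.normal(size=n)                # unmeasured confounder
z = rng.normal(size=n)                # instrument: affects x, not y directly
x = 0.8 * z + u + rng.normal(size=n)  # exposure, confounded by u
y = 0.5 * x + u + rng.normal(size=n)  # true causal effect of x on y: 0.5

ones = np.ones(n)
Z = np.column_stack([ones, z])

# Stage 1: regress x on the instrument; fitted values are purged of u.
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

# Stage 2: regress y on the fitted exposure.
beta = np.linalg.lstsq(np.column_stack([ones, x_hat]), y, rcond=None)[0]

ols = np.linalg.lstsq(np.column_stack([ones, x]), y, rcond=None)[0]
print(f"naive OLS slope: {ols[1]:.3f}")  # biased away from 0.5 by u
print(f"2SLS slope:      {beta[1]:.3f}")  # close to 0.5
```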
