Rubin causal model
The Rubin causal model (RCM), also known as the Neyman–Rubin potential outcomes framework, is a foundational approach in statistics for defining and estimating causal effects in both experimental and observational studies.[1] It conceptualizes causation through counterfactual reasoning, where the causal effect of a treatment on a unit is the difference between the potential outcome under the treatment and the potential outcome under an alternative condition, such as no treatment.[2] Formally introduced in a series of works by Donald B. Rubin starting in the early 1970s, the model addresses the fundamental problem of causal inference: the impossibility of observing both potential outcomes for the same unit simultaneously.

The framework's core elements include units (e.g., individuals or entities affected by the treatment), treatments (causes or interventions, often binary like treated vs. control), and potential responses denoted as Y_t(u) for the outcome of unit u under treatment t and Y_c(u) under control c.[1] Unit-level causal effects are defined as Y_t(u) - Y_c(u), though these are unobservable; instead, the model focuses on estimable quantities like the average treatment effect (ATE), \tau = E[Y_t - Y_c], across the population.[3] This approach builds on Jerzy Neyman's 1923 work for randomized experiments but extends it to nonrandomized settings by incorporating assignment mechanisms and matching methods to approximate randomization.[2]

Key assumptions underpin the RCM's validity, including the stable unit treatment value assumption (SUTVA), which posits no interference between units and a consistent treatment version, ensuring potential outcomes are well-defined without spillover effects.[3] Another critical assumption is ignorability (or exchangeability), stating that treatment assignment is independent of potential outcomes conditional on observed covariates, allowing unbiased estimation via methods like propensity score matching or regression adjustment.[3]

The term "Rubin causal model" was coined by Paul W. Holland in 1986 to encapsulate Rubin's contributions, highlighting its role in unifying causal analysis across disciplines like economics, epidemiology, and social sciences.[1] Since its formalization, the RCM has become central to modern causal inference, influencing tools like instrumental variables and difference-in-differences, and enabling robust policy evaluations where randomized trials are infeasible. Its emphasis on the benefits of randomization, while acknowledging the value of carefully controlled observational data, has shaped empirical research, with over 13,000 citations to Rubin's seminal 1974 paper underscoring its enduring impact.

Overview
Definition
The Rubin causal model (RCM), also known as the Neyman-Rubin causal model, is a foundational framework in statistics for defining and analyzing causation using the potential outcomes approach.[4] It formalizes cause-effect relationships through counterfactual reasoning, positing that causation can be understood by comparing what would happen under different interventions on the same units. This model shifts the focus from mere associations to explicit contrasts between hypothetical outcomes, enabling rigorous quantification of causal impacts in both experimental and observational settings.

At its core, the RCM considers a population of units (e.g., individuals, firms, or regions) indexed by i, where each unit has two potential outcomes: Y_i(1), the outcome if unit i receives the treatment, and Y_i(0), the outcome if it receives the control or no treatment. The individual causal effect for unit i is then defined as the difference \tau_i = Y_i(1) - Y_i(0), representing the change attributable to the treatment. However, a fundamental challenge arises because only one potential outcome is observable per unit (the one corresponding to the assigned treatment), rendering the other a counterfactual that cannot be directly measured.[4]

To address this unobservability, the RCM relies on randomization of treatment assignment to ensure that observed outcomes provide unbiased estimates of the distributions of the potential outcomes, or on alternative assumptions when randomization is unavailable. The primary goal of the model is to identify and estimate population-level causal effects, such as averages of these individual differences, thereby providing a principled basis for inferring how treatments influence outcomes beyond correlational evidence.

Historical Development
The foundations of the Rubin causal model trace back to early 20th-century statistical work on experimental design in agriculture. In 1923, Jerzy Neyman introduced the potential outcomes framework in his analysis of randomized experiments aimed at comparing crop varieties, emphasizing randomization to ensure unbiased estimation of treatment effects under a superpopulation model.[5] Concurrently, during the 1920s and 1930s, Ronald A. Fisher developed key principles of experimental design at the Rothamsted Experimental Station, including randomization, replication, and blocking, to control for variability in field trials and enable valid inference about causal relationships in randomized settings.[6] These contributions laid the groundwork for rigorous causal inference by highlighting the role of randomization in isolating treatment effects from confounding factors.

Donald B. Rubin built upon and generalized this foundation starting in the 1970s, shifting the focus to a broader potential outcomes framework applicable beyond strictly randomized experiments. In his seminal 1974 paper, Rubin formalized the use of potential outcomes to define and estimate causal effects in both experimental and non-experimental (observational) data, introducing notation and assumptions that allowed for principled handling of missing counterfactuals. He further elaborated on these ideas in 1977, clarifying the estimation of causal effects under various assignment mechanisms and emphasizing the challenges of unobserved counterfactuals in observational settings. These works marked a pivotal expansion, enabling causal inference in real-world scenarios where randomization was infeasible.

The framework evolved into what is commonly termed the Neyman-Rubin model through subsequent developments in the 1980s and 1990s, particularly in addressing biases in observational studies. A key advancement came in 1983, when Paul R. Rosenbaum and Rubin introduced the propensity score, the conditional probability of treatment assignment given observed covariates, as a dimension-reduction tool to balance treatment groups and mimic randomization. This method facilitated applications in social sciences, where observational data predominated, and spurred further theoretical refinements, such as bounds on causal effects and sensitivity analyses.[6]

By the 2000s, the Neyman-Rubin model had achieved widespread adoption in economics and medicine, serving as a cornerstone for causal analysis of policy interventions and clinical treatments using observational data. Its integration into econometric toolkits and epidemiological studies underscored its versatility, with influential texts solidifying its role in multidisciplinary causal inference.[7]

Core Concepts
Potential Outcomes
In the Rubin causal model, potential outcomes form the foundational concept for defining causal effects. For each unit i in a population, the potential outcome under treatment is denoted Y_i(1), representing the value of the outcome variable Y if unit i receives the treatment, while Y_i(0) denotes the potential outcome under no treatment (control). The observed outcome for unit i is then Y_i = Y_i(1) if the treatment indicator Z_i = 1 (indicating treatment receipt) and Y_i = Y_i(0) if Z_i = 0. This framework, originating in the work of Neyman and formalized by Rubin, treats potential outcomes as fixed but unknown attributes of each unit prior to treatment assignment.

Potential outcomes are interpreted as counterfactuals, capturing what would have happened to unit i under the alternative treatment condition that did not occur. For instance, in a study evaluating a job training program, Y_i(1) might represent unit i's earnings if enrolled in the program, while Y_i(0) reflects earnings without enrollment, embodying the hypothetical scenario not realized for that unit. This counterfactual reasoning underpins the model's ability to conceptualize causation as a comparison across unobservable states, distinguishing it from mere associations in observed data.

At the unit level, the key challenge arises from the fact that only one potential outcome can be observed per unit, rendering the other a counterfactual that is inherently unobservable. This "fundamental problem of causal inference," as termed by Holland, precludes direct measurement of individual-level contrasts for any single unit, limiting empirical verification to aggregates across units. In contrast, at the population level, potential outcomes enable inferences about average effects when data from multiple units under different treatments are available, though such inferences rely on distributional assumptions about the unobservables.

Treatment Assignment
In the Rubin causal model, treatment assignment refers to the process by which units are allocated to receive a treatment or control condition, which determines which potential outcome is observed for each unit.[8] For unit i, the treatment indicator Z_i is typically binary, taking the value 1 if the unit receives the treatment and 0 if it receives the control; this framework can be generalized to multi-valued treatments where Z_i represents one of several possible levels.[9] The assignment mechanism is formally defined by the probability \pi_i = \Pr(Z_i = 1 \mid X_i), where X_i denotes the covariates for unit i, capturing how treatment probabilities may depend on observable characteristics.[9] A key distinction in assignment mechanisms is between sharp randomization, where \pi_i is fixed and identical for all units (independent of covariates), and mechanisms that allow \pi_i to vary conditionally on covariates, such as in targeted or adaptive designs.[9]

In experimental settings, the assignment mechanism plays a crucial role in ensuring balance between treatment and control groups, thereby facilitating valid causal inferences. Common methods include complete randomization, where each unit is independently assigned to treatment with fixed probability \pi; stratified randomization, which allocates treatments within subgroups defined by key covariates to improve balance on those factors; and cluster randomization, where entire groups (e.g., schools or communities) are assigned as units to avoid interference within clusters.[8][10]

Under certain conditions, the assignment mechanism supports the ignorability assumption, which posits that treatment assignment is independent of the potential outcomes given the covariates: Z_i \perp (Y_i(0), Y_i(1)) \mid X_i.[9] This assumption, often achieved through randomization, ensures that observed covariates suffice to control for selection biases in estimating causal effects.[8]

Stable Unit Treatment Value Assumption (SUTVA)
The Stable Unit Treatment Value Assumption (SUTVA) is a foundational assumption in the Rubin causal model that ensures potential outcomes for each unit are well-defined and invariant to the treatments received by other units or to variations in treatment implementation. It comprises two interrelated components: no interference, which posits that the potential outcome of a unit under a given treatment is unaffected by the treatments assigned to other units; and consistency, which requires that the observed outcome for a unit matches its potential outcome under the treatment actually received, implying no hidden versions of the treatment that could produce different results.[11] Formally, SUTVA can be stated as the condition that the potential outcome for unit i under treatment z, denoted Y_i(z), does not depend on the vector of treatments assigned to all other units, \mathbf{z}_{-i}. This is expressed as Y_i(z) = Y_i(z, \mathbf{z}_{-i}) for all possible z and \mathbf{z}_{-i}. The assumption thus restricts the potential outcomes framework—where each unit has a well-defined counterfactual outcome under each treatment—to settings without spillover or contextual dependencies.[12] The implications of SUTVA are critical for causal inference, as it prevents spillover effects where one unit's treatment influences another's outcome, thereby ensuring that treatment effects can be attributed solely to the unit's own assignment. 
It also assumes that treatments are uniformly defined and delivered, without variations such as differences in dosage or implementation that could alter outcomes across units.[12] Without SUTVA, potential outcomes become ill-defined, complicating the identification of causal effects and potentially leading to biased estimates.[11]

Violations of SUTVA occur in scenarios involving interference, such as contagion in social networks where one individual's treatment (e.g., information sharing or behavior adoption) affects peers' outcomes independently of their own treatment.[11] Similarly, the consistency component can be breached by hidden treatment versions, as in cases where the same nominal treatment yields different effects due to variations like terrain differences in an exercise intervention or batch inconsistencies in drug administration.[11]

Causal Effects
Individual Causal Effect
In the Rubin causal model, the individual causal effect for a specific unit i is defined as the difference between the potential outcomes under treatment and under no treatment, denoted as \tau_i = Y_i(1) - Y_i(0), where Y_i(1) is the outcome if unit i receives the treatment and Y_i(0) is the outcome if it does not.[7] This formulation captures the unit-specific impact of the treatment, serving as the building block for understanding causation at the most granular level. However, \tau_i is fundamentally unobservable for any given unit because only one potential outcome can be realized and observed; the other remains a counterfactual that cannot be directly accessed.[4] As a result, the individual causal effect cannot be directly estimated from observed data without imposing additional assumptions, such as those enabling extrapolation from similar units or experimental designs.[4]

The Rubin causal model inherently accommodates heterogeneity in individual causal effects, meaning \tau_i can vary substantially across units due to differences in underlying characteristics, contexts, or interactions with the treatment; for some units, the effect may be positive, for others negative, and for yet others zero or negligible.[7][4] This unit-level perspective on causation underpins approaches to personalized inference, where the goal is to predict or understand treatment impacts tailored to specific individuals rather than aggregated groups.[7]

Average Treatment Effect
In the Rubin causal model, the average treatment effect (ATE) represents a population-level measure of causal impact, defined as the expected difference between the potential outcomes under treatment and control across the entire population: \tau = E[Y(1) - Y(0)]. This quantity equals the expected value of the individual causal effects, \tau = E[\tau_i], where \tau_i = Y_i(1) - Y_i(0) for each unit i, providing an aggregate summary of how the treatment shifts outcomes on average. The ATE assumes the stable unit treatment value assumption (SUTVA) holds, ensuring that potential outcomes for one unit are unaffected by the treatment assignments of others.[13] Variants of the ATE address subgroup-specific effects within the population. The average treatment effect on the treated (ATT) is the expected causal effect conditional on units receiving treatment: E[Y(1) - Y(0) \mid Z=1], where Z indicates treatment assignment. Similarly, the average treatment effect on the controls (ATC) conditions on units not receiving treatment: E[Y(1) - Y(0) \mid Z=0]. These conditional measures are particularly relevant in observational studies where treatment assignment is not random, allowing researchers to focus on effects for specific groups of interest, such as policy beneficiaries.[13] Under complete randomization in experimental settings, the ATE can be unbiasedly estimated using the difference in sample means between treated and control groups: \hat{\tau} = \bar{Y}_1 - \bar{Y}_0, where \bar{Y}_t is the observed mean outcome for units assigned to treatment level t \in \{0,1\}.[14] This estimator is unbiased for the finite-population ATE, defined as the average difference in potential outcomes over the N units in the study sample: \tau_{fs} = \frac{1}{N} \sum_{i=1}^N (Y_i(1) - Y_i(0)). 
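The difference-in-means estimator under complete randomization can be illustrated with a small simulation on hypothetical potential outcomes (the distributions, true ATE of 2.0, and random seed below are arbitrary choices for illustration, not values from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential outcomes with a true ATE of 2.0.
y0 = rng.normal(loc=5.0, scale=1.0, size=n)
y1 = y0 + 2.0 + rng.normal(scale=0.5, size=n)

# Complete randomization: assign exactly half the units to treatment.
z = np.zeros(n, dtype=int)
z[rng.choice(n, size=n // 2, replace=False)] = 1

# Only one potential outcome is observed per unit.
y_obs = np.where(z == 1, y1, y0)

# Difference-in-means estimator of the ATE.
tau_hat = y_obs[z == 1].mean() - y_obs[z == 0].mean()
print(round(tau_hat, 2))  # close to the true ATE of 2.0
```

Because assignment is independent of (y0, y1), the treated and control sample means are unbiased for E[Y(1)] and E[Y(0)], so their difference recovers the ATE up to sampling noise.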
In contrast, the superpopulation perspective treats the sample as drawn from a larger infinite population, where the ATE is an expectation over both the finite sample effects and the sampling distribution: E[\tau_{fs}] = \tau. The finite-population approach, originating in Neyman's framework, emphasizes inference about the specific study units, while the superpopulation view supports generalization to broader contexts, with the choice depending on the research goals.[14]

Other Effect Measures
In the Rubin causal model, causal effects can extend beyond population-wide averages to account for heterogeneity driven by covariates, outcome distributions, or specific subpopulations, providing more nuanced insights into treatment impacts. These measures are defined within the potential outcomes framework, where individual effects Y_i(1) - Y_i(0) vary across units, and identification relies on assumptions like ignorability conditional on covariates or valid instruments.

The conditional average treatment effect (CATE) captures the expected causal effect for units sharing the same covariate profile, allowing researchers to assess how treatment benefits differ based on observable characteristics such as age, income, or health status. It is formally defined as

\tau(x) = \mathbb{E}[Y(1) - Y(0) \mid X = x],
where Y(1) and Y(0) are the potential outcomes under treatment and control, respectively, and X denotes the vector of covariates. This measure is central to personalized or targeted causal inference, as it enables estimation of treatment effects that are heterogeneous across the covariate space, facilitating policy recommendations tailored to specific groups. For instance, in medical trials, CATE might reveal stronger effects for patients with certain biomarkers, supporting stratified interventions.

Quantile treatment effects address distributional shifts in outcomes, focusing on how treatment alters specific points along the potential outcome distributions rather than just means, which is particularly relevant when effects are asymmetric or when interest lies in extreme values like poverty thresholds or high-risk events. The \alpha-quantile treatment effect is given by
\Delta(\alpha) = Q_{Y(1)}(\alpha) - Q_{Y(0)}(\alpha),
where Q_{Y(t)}(\alpha) is the \alpha-th quantile of the potential outcome distribution under treatment status t. This approach reveals, for example, whether a policy reduces outcomes more for those at the lower tail of the distribution, preserving the potential outcomes structure while accommodating non-normal or skewed data. Seminal work embeds this in instrumental variable settings to handle endogeneity, ensuring identification under monotonicity and relevance assumptions.[15]

The local average treatment effect (LATE) provides a targeted measure in scenarios involving instrumental variables, where compliance with treatment assignment is imperfect, estimating the effect only for the subgroup whose treatment status is altered by the instrument, known as compliers. Within the Rubin causal model, LATE is the average of individual treatment effects over this complier subpopulation, formally
\tau_{\text{LATE}} = \mathbb{E}[Y(1) - Y(0) \mid \text{complier}],
identified as the instrument's effect on the outcome divided by its effect on treatment receipt, under the exclusion restriction and monotonicity. This embeds naturally in the potential outcomes framework by partitioning units into principal strata (always-takers, never-takers, compliers, defiers), focusing inference on the relevant local group without assuming homogeneous effects across the full population.[16]

Subgroup effects, often operationalized through CATE for discrete covariate strata, quantify causal impacts within predefined categories defined by baseline characteristics, such as demographic groups or risk levels, to uncover variation in treatment responsiveness. These are computed as the average effect conditional on subgroup membership, \mathbb{E}[Y(1) - Y(0) \mid S = s], where S indexes the strata, and serve to test for effect modifiers while maintaining the model's unit-level potential outcomes. Methods like recursive partitioning can systematically identify such subgroups with distinct effects, enhancing interpretability in observational or experimental data where overall averages mask important disparities.
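The Wald form of the LATE (the instrument's effect on the outcome divided by its effect on treatment receipt) can be checked in a small simulation with hypothetical principal strata; the strata shares, effect sizes, and seed are arbitrary illustration choices, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical principal strata; monotonicity holds (no defiers).
strata = rng.choice(["complier", "always", "never"], size=n, p=[0.6, 0.2, 0.2])
z = rng.integers(0, 2, size=n)  # randomized binary instrument

# Treatment receipt: always-takers take it, never-takers never do,
# and compliers follow the instrument.
d = np.where(strata == "always", 1,
             np.where(strata == "never", 0, z))

# Treatment effect of 3.0 for compliers, 1.0 for everyone else;
# the instrument affects outcomes only through receipt (exclusion).
effect = np.where(strata == "complier", 3.0, 1.0)
y = rng.normal(size=n) + effect * d

# Wald estimator: ITT effect on Y divided by ITT effect on D.
itt_y = y[z == 1].mean() - y[z == 0].mean()
itt_d = d[z == 1].mean() - d[z == 0].mean()
late_hat = itt_y / itt_d
print(round(late_hat, 2))  # close to the complier effect of 3.0
```

Always-takers and never-takers contribute the same outcome in both instrument arms, so they cancel out of the ITT contrast, leaving only the compliers' effect scaled by the compliance rate.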
Identification and Estimation
The Fundamental Problem of Causal Inference
In the Rubin causal model, the fundamental problem of causal inference arises from the inherent unobservability of counterfactual outcomes for any given unit. For a specific unit u, only one potential outcome can be observed (either the outcome under treatment Y_t(u) or under control Y_c(u)), never both simultaneously, rendering the individual causal effect Y_t(u) - Y_c(u) directly unknowable without some form of replication or assumption.[17] This limitation stems from the structure of the potential outcomes framework, where each unit's response to different treatments is defined but not jointly observable in a single instance.

The implications of this problem are profound for causal inference, as it underscores that direct observation of causation is impossible, forcing reliance on assumptions to approximate counterfactuals by leveraging variation across multiple units or repeated interventions. Without such approximations, causal effects cannot be identified solely from observed data, distinguishing causal analysis from mere correlational studies that fail to address what would have happened under alternative conditions.[17] This unobservability highlights why the Rubin model emphasizes the need for rigorous assumptions, such as those enabling inference from populations or experiments, to bridge the gap between observed facts and hypothetical scenarios.

Paul W. Holland formalized this challenge in 1986, explicitly stating the fundamental problem as: "It is impossible to observe the value of Y_t(u) and Y_c(u) on the same unit and, therefore, it is impossible to observe the effect of t on u."[17] In this formulation, Holland ties the problem to the principle of "no causation without manipulation," asserting that causes must be manipulable interventions to warrant causal claims, as non-manipulable factors like attributes cannot produce observable contrasts in potential outcomes.[17]

Philosophically, the fundamental problem reinforces the classic distinction between correlation and causation by centering on unobservable counterfactuals, which correlations alone cannot resolve without additional structure from the Rubin model. This approach shifts focus from passive associations to active effects of causes, requiring manipulability to ensure that observed differences reflect genuine interventions rather than confounding influences.[17]

Randomization and Experimental Design
In the Rubin causal model, randomization serves as the primary mechanism for identifying causal effects in experimental settings by ensuring that treatment assignment is independent of the potential outcomes. This independence implies that the expected value of the outcome under treatment among those assigned to treatment equals the population expectation, i.e., E[Y(1) \mid Z=1] = E[Y(1)], and similarly E[Y(0) \mid Z=0] = E[Y(0)].[2] Consequently, the simple difference in sample means, \hat{\tau} = \bar{Y}_1 - \bar{Y}_0, where \bar{Y}_1 and \bar{Y}_0 are the means in the treated and control groups, respectively, provides an unbiased estimator of the average treatment effect (ATE), \tau = E[Y(1) - Y(0)].[2] This unbiasedness holds under the model's assumptions, including the stable unit treatment value assumption (SUTVA), and contrasts with observational studies where such independence typically does not exist.[7] Experimental designs in the Rubin framework vary to balance efficiency, precision, and generalizability. 
Complete randomization assigns each unit independently to treatment or control with fixed probabilities (e.g., 50% each), which ensures the aforementioned independence but can lead to imbalances in covariates by chance in finite samples.[7] Blocked or stratified randomization mitigates this by dividing units into homogeneous blocks based on key covariates and randomizing within each block, reducing variance in the ATE estimator and increasing power without altering unbiasedness.[7] Factorial designs extend this to multiple factors, randomizing units across all combinations of treatment levels to estimate main effects and interactions simultaneously; for instance, a 2^k design with k binary factors allows identification of each factor's causal effect under the no-interference assumption.[18] Power calculations for these designs typically rely on the variance of the ATE estimator to determine the minimum sample size needed to detect a hypothesized effect size at a desired significance level and power, often assuming normality of outcomes or using simulation-based methods.[7]

For finite-sample inference under complete randomization, Jerzy Neyman derived the exact sampling variance of \hat{\tau}, which accounts for the randomization distribution rather than superpopulation assumptions: \operatorname{Var}(\hat{\tau}) = \frac{\sigma^2_1}{n_1} + \frac{\sigma^2_0}{n_0} - \frac{\sigma^2_\tau}{N}, where \sigma^2_1 = \operatorname{Var}(Y(1)), \sigma^2_0 = \operatorname{Var}(Y(0)), \sigma^2_\tau = \operatorname{Var}(Y(1) - Y(0)), n_1 and n_0 are the treated and control sample sizes, and N = n_1 + n_0 is the total sample size.
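Neyman's variance can be illustrated in a short simulation; with a constant unit-level effect, \sigma^2_\tau = 0, so the usual plug-in variance is exact rather than merely conservative. The population parameters, effect of 1.5, and seed below are arbitrary illustration choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n1 = n0 = 500
N = n1 + n0

# Hypothetical finite population with a constant treatment effect of 1.5,
# so the effect-heterogeneity term sigma^2_tau is exactly zero here.
y0 = rng.normal(10.0, 2.0, size=N)
y1 = y0 + 1.5

# Completely randomized assignment of n1 units to treatment.
z = np.zeros(N, dtype=int)
z[rng.choice(N, size=n1, replace=False)] = 1
y_obs = np.where(z == 1, y1, y0)

# Difference-in-means estimate of the ATE.
tau_hat = y_obs[z == 1].mean() - y_obs[z == 0].mean()

# Plug-in version of Neyman's variance, dropping the unidentifiable
# sigma^2_tau / N term (hence conservative in general).
var_hat = y_obs[z == 1].var(ddof=1) / n1 + y_obs[z == 0].var(ddof=1) / n0
se = np.sqrt(var_hat)

# Normal-approximation 95% confidence interval for the ATE.
ci = (tau_hat - 1.96 * se, tau_hat + 1.96 * se)
print(round(tau_hat, 2), [round(c, 2) for c in ci])
```

Dropping the \sigma^2_\tau / N term can only increase the variance estimate, which is why the resulting intervals are described as conservative.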
A plug-in estimator replaces the unknown population variances with their sample analogs; because the \sigma^2_\tau term cannot be estimated from the data, it is dropped, yielding an upward-biased variance estimate and hence conservative confidence intervals via the normal approximation or exact randomization tests.[7] This variance formula highlights that the estimator's precision improves with larger samples and lower heterogeneity in individual treatment effects, as captured by \sigma^2_\tau.[7]

Real-world experiments often involve noncompliance, where units assigned to treatment do not receive it or control units access the treatment. In such cases, the intention-to-treat (ITT) analysis preserves randomization's validity by estimating the causal effect of treatment assignment on outcomes, computed as the difference in means across randomized groups regardless of actual receipt; this provides a policy-relevant lower bound on the true treatment effect under monotonicity (no defiers).[19] To recover the effect of actual treatment receipt, the complier average causal effect (CACE) targets the subgroup that complies with assignment, identified as the ITT effect divided by the first-stage compliance rate under assumptions like the exclusion restriction (assignment affects outcome only through receipt) and monotonicity; Bayesian methods can further incorporate prior information for inference.[19] These approaches maintain the Rubin model's focus on potential outcomes while addressing practical deviations from ideal compliance.[19]

Observational Data Methods
In observational studies, causal effects under the Rubin causal model are identified when treatment assignment is independent of potential outcomes conditional on observed covariates, known as the ignorability or conditional independence assumption. This assumption states that the potential outcomes are independent of the treatment indicator given the covariates: \{Y(1), Y(0)\} \perp Z \mid X.[20] It also requires positivity, ensuring 0 < \Pr(Z=1 \mid X) < 1 for all X in the support.[20] Under these conditions, methods can emulate randomization by balancing covariate distributions between treated and untreated groups.

Propensity score methods leverage the balancing score e(X) = \Pr(Z=1 \mid X), the probability of treatment given covariates, to reduce dimensionality and achieve covariate balance.[20] Within levels of the propensity score, treatment assignment is independent of covariates, enabling unbiased estimation of causal effects.[20] Common implementations include matching, where treated units are paired with untreated units having similar propensity scores to form a pseudo-randomized sample; stratification, which divides the sample into strata based on propensity score quantiles and estimates effects within each before averaging; and weighting, such as inverse probability weighting (IPW), where weights are 1/e(X) for treated and 1/(1-e(X)) for untreated units to create a pseudo-population with balanced covariates.[20] These approaches reduce bias from observed confounding but require accurate estimation of the propensity score, often via logistic regression.[20]

Regression adjustment estimates causal effects by modeling the outcome as a function of treatment and covariates, assuming a linear form such as E[Y \mid Z, X] = \beta_0 + \beta_1 Z + \gamma' X for continuous outcomes, with the average treatment effect identified as \beta_1 under ignorability.[21] This method controls for confounding by including covariates in the regression, but its performance depends on correct model specification; misspecification can lead to bias, particularly with nonlinear relationships or high-dimensional covariates.[21] Monte Carlo studies have shown that regression adjustment often reduces bias effectively when combined with matched sampling, though it may increase variance compared to propensity-based methods in unbalanced settings.[21]

Doubly robust estimators combine propensity score and outcome regression models, remaining consistent if at least one is correctly specified.[22] For the average treatment effect, a common form is the augmented inverse probability weighting (AIPW) estimator:

\hat{\tau}_{DR} = \frac{1}{n} \sum_{i=1}^n \left[ \frac{Z_i (Y_i - \hat{m}(1,X_i))}{\hat{e}(X_i)} - \frac{(1-Z_i) (Y_i - \hat{m}(0,X_i))}{1-\hat{e}(X_i)} + \hat{m}(1,X_i) - \hat{m}(0,X_i) \right],

where \hat{e}(X) is the estimated propensity score, \hat{m}(z,X) is an estimate of E[Y \mid Z=z, X] from the outcome model, and the inverse-probability-weighted residual terms correct for errors in the outcome model.[22] This approach provides efficiency gains over single-model methods and greater protection against model misspecification, making it widely adopted in observational data analysis.[22]

When unmeasured confounding is suspected, sensitivity analysis quantifies how violations of ignorability affect estimates. One prominent method uses the E-value, which calculates the minimum strength of association that an unmeasured confounder must have with both treatment and outcome to fully explain away an observed effect.[23] For a risk ratio RR, the E-value is RR + \sqrt{RR(RR-1)}; for example, an E-value of 3 indicates that an unmeasured confounder would need to be associated with both treatment and outcome by risk ratios of at least 3, above and beyond the measured covariates, to fully explain away the observed association.[23] This tool facilitates transparent reporting of potential biases without specifying the confounder, aiding interpretation in non-experimental settings.[23]

Examples
Illustrative Example
To illustrate the core concepts of the Rubin causal model, consider a hypothetical randomized experiment evaluating the effect of a job training program on employment outcomes for 100 unemployed individuals. The treatment indicator W_i is 1 if individual i is assigned to the program and 0 otherwise, while the outcome Y_i is a binary measure of employment six months later (1 if employed, 0 if not). Under the Rubin causal model, each individual has two potential outcomes: Y_i(0) under no training and Y_i(1) under training.[24]

The model assumes the stable unit treatment value assumption (SUTVA) holds, meaning the potential outcome for any individual depends only on their own treatment assignment and not on the assignments of others, with a consistent version of the treatment applied to all. This allows the individual causal effect to be defined as \tau_i = Y_i(1) - Y_i(0), though it cannot be observed for any unit due to the fundamental problem of causal inference: only one potential outcome is realized and observable for each individual, as Y_i = Y_i(1) \cdot W_i + Y_i(0) \cdot (1 - W_i).[24]

For concreteness, suppose the potential outcomes have been hypothetically assigned for a subset of four individuals, as shown in the table below. The true individual effects vary, highlighting heterogeneity, and the average treatment effect (ATE) across these units is \tau = \frac{1}{4} \sum_{i=1}^4 \tau_i = 0.

| Unit | Y_i(0) | Y_i(1) | \tau_i = Y_i(1) - Y_i(0) |
|---|---|---|---|
| 1 | 0 | 1 | 1 |
| 2 | 1 | 1 | 0 |
| 3 | 0 | 0 | 0 |
| 4 | 1 | 0 | -1 |
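The arithmetic in this table can be reproduced directly from the hypothetical potential outcomes:

```python
# Potential outcomes for the four hypothetical units in the table above.
y0 = [0, 1, 0, 1]
y1 = [1, 1, 0, 0]

# Unit-level causal effects tau_i = Y_i(1) - Y_i(0); these fill the last
# column of the table but would be unobservable in a real study.
tau = [t - c for t, c in zip(y1, y0)]
print(tau)                   # [1, 0, 0, -1]

# Average treatment effect over the four units: the effects cancel out.
print(sum(tau) / len(tau))   # 0.0
```

The heterogeneous effects (positive for unit 1, negative for unit 4) average to zero, showing how an ATE of zero can mask meaningful unit-level variation.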