
Regression discontinuity design

Regression discontinuity design (RDD) is a quasi-experimental method for estimating causal effects of treatments or interventions by exploiting a known discontinuity in the probability of treatment assignment at a specific value of a continuous running variable, such as a test score or age. In this design, individuals just above and below the cutoff are assumed to be similar in all respects except for treatment receipt, allowing researchers to compare outcomes across the threshold to isolate the treatment effect. RDD was first introduced by Donald L. Thistlethwaite and Donald T. Campbell in 1960 as an alternative to traditional ex post facto experiments for evaluating program impacts, such as the effects of scholarships on students' career aspirations based on National Merit Scholarship qualifying test scores. The approach gained prominence in economics during the late 1990s and early 2000s, with seminal applications addressing policy questions where randomization was infeasible.

There are two main variants: sharp RDD, where treatment assignment is strictly determined by the cutoff (e.g., eligibility for a program above a score threshold), and fuzzy RDD, where the cutoff induces a discontinuous jump in the treatment probability rather than certainty, often requiring instrumental variable techniques for identification. The validity of RDD relies on core assumptions, including the continuity of potential outcome distributions across the cutoff in the absence of treatment and the absence of precise manipulation of the running variable by individuals near the cutoff. These assumptions enable local randomization near the cutoff, mimicking a randomized experiment and providing high internal validity compared to other observational methods like difference-in-differences or matching, which often require stronger parallel trends or overlap conditions. However, estimates are local to the cutoff, limiting external validity, and implementation involves challenges such as optimal bandwidth selection and functional form specification to minimize bias.

RDD has been widely applied across disciplines including economics, education, and political science to evaluate policies with clear eligibility rules. Notable examples include estimating the effects of smaller class sizes on student test scores in Israel using enrollment caps at 40 students per class (Angrist and Lavy, 1999), the incumbency advantage in U.S. House elections by comparing outcomes in races decided by narrow vote margins (Lee, 2008), and the impact of Medicare eligibility at age 65 on health care utilization and mortality (Card, Dobkin, and Maestas, 2008). These studies demonstrate RDD's utility in credibly identifying causal effects in real-world settings with administrative data.

Introduction

Definition and Core Idea

Regression discontinuity design (RDD) is a quasi-experimental method for estimating causal effects when treatment assignment is determined by a known rule based on whether a continuous covariate, called the running variable, exceeds a specific threshold known as the cutoff. In this framework, the running variable—often denoted as X and also referred to as the forcing variable—serves as the basis for eligibility, such that units with X \geq c are assigned to treatment (with treatment indicator D_i = 1), while those with X < c are assigned to control (D_i = 0), where c is the cutoff value. This deterministic rule creates a sharp discontinuity in treatment status precisely at the cutoff, enabling causal identification under the assumption that potential outcomes and other covariates are continuous functions of X around c. The core idea of RDD exploits this discontinuity to isolate the treatment effect: absent the treatment, the expected outcome would vary smoothly across the cutoff, but the intervention induces a "jump" in the outcome at c, which represents the causal impact. This jump allows estimation of the local average treatment effect (LATE) for individuals near the threshold, akin to a randomized experiment in a local neighborhood around c, provided there is no precise manipulation of the running variable at the boundary. Conceptually, the setup can be visualized in a scatterplot of the outcome Y against the running variable X, with fitted regression lines on either side of the cutoff c; the lines are continuous and smooth within treatment and control regions but exhibit a vertical discontinuity at c, where the difference in heights captures the LATE. Common estimation approaches, such as local polynomial regression, model this conditional expectation to quantify the discontinuity while controlling for the smooth trend in X.
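The following minimal sketch illustrates these mechanics on simulated data with assumed parameter values and an arbitrary bandwidth: treatment is assigned sharply at a cutoff, and the jump in the outcome is recovered by fitting separate linear regressions just below and just above the threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a sharp RDD: running variable X, cutoff c, deterministic treatment D = 1{X >= c}.
n, c, tau = 2000, 0.0, 2.0                       # sample size, cutoff, true jump (assumed values)
X = rng.uniform(-1, 1, n)                        # running variable
D = (X >= c).astype(float)                       # sharp treatment assignment
Y = 1.0 + 1.5 * X + tau * D + rng.normal(0, 1, n)  # smooth trend plus a jump of size tau at c

# Local linear estimate of the discontinuity: fit separate lines within a bandwidth h
# on each side of the cutoff and take the difference of the fitted values at c.
h = 0.3                                          # illustrative bandwidth choice

def side_intercept(mask):
    x, y = X[mask] - c, Y[mask]                  # center the running variable at the cutoff
    coef = np.polyfit(x, y, 1)                   # returns [slope, intercept]
    return coef[1]                               # fitted value at X = c

below = (X >= c - h) & (X < c)
above = (X >= c) & (X <= c + h)
tau_hat = side_intercept(above) - side_intercept(below)
print(f"estimated jump at the cutoff: {tau_hat:.2f} (true value {tau})")
```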

Historical Background

The regression discontinuity design (RDD) originated in the field of psychology and education during the 1960s as a quasi-experimental approach to causal inference. It was first proposed by Donald L. Thistlethwaite and Donald T. Campbell in their 1960 paper, where they applied it to evaluate the impact of National Merit Scholarship certificates on students' career aspirations, college enrollment, and performance, leveraging a cutoff in test scores for award eligibility. This design treated the score threshold as a natural assignment rule, allowing estimation of treatment effects by comparing outcomes just above and below the cutoff, under the assumption of continuity in potential outcomes absent the intervention. Early extensions incorporated interrupted time series elements into quasi-experimental designs. Following its initial adoption in educational evaluations, the 1970s brought methodological advancements, including bias correction techniques proposed by researchers such as Arthur Goldberger (1972) and Donald Rubin (1974), as well as nonparametric approaches, but interest in RDD nonetheless waned in statistics by the early 1980s and in psychology by the early 1990s. The design experienced a significant revival in economics starting in the late 1990s, driven by econometricians seeking robust alternatives to randomized experiments in observational data. Seminal applications included Joshua Angrist and Victor Lavy's 1999 study on class size effects in Israel using enrollment cutoffs, and Wilbert van der Klaauw's 2002 analysis of financial aid impacts on college enrollment. David S. Lee's 2008 work further formalized RDD's credibility by demonstrating its local randomization properties in close U.S. House elections, equating it to randomized experiments near the cutoff. This resurgence in the 2000s integrated RDD into broader causal inference frameworks, with key theoretical contributions from Jinyong Hahn, Petra Todd, and Wilbert van der Klaauw (2001) on identification and estimation. The design evolved from its roots in education policy—where it assessed interventions like scholarships—to widespread use across social sciences, including labor economics (e.g., unemployment benefits) and political economy. Influential reviews, such as Guido W. Imbens and Thomas Lemieux's 2008 guide, synthesized practical implementation and spurred adoption in diverse applications. By the 2010s, RDD had become a standard tool in empirical research, with hundreds of studies published.

Examples

Canonical Sharp RDD Example

A canonical example of a sharp regression discontinuity design (RDD) considers the assignment of merit-based scholarships to high school students based on their scores in a standardized aptitude test, such as the Preliminary SAT (PSAT). In this setup, the running variable is the test score X_i, and treatment assignment D_i is deterministic: students with scores at or above a fixed cutoff c receive the scholarship (D_i = 1), while those below the cutoff do not (D_i = 0). The outcome of interest is the students' career aspirations or plans for advanced study, Y_i, which may be influenced by the financial and motivational benefits of the award. In data from such a scenario, a scatterplot of individual outcome values against test scores reveals a smooth, continuous distribution of the running variable across the cutoff, with no bunching or gaps in the density of scores at c, suggesting students cannot precisely manipulate their test performance to cross the threshold. However, the fitted regression line or binned averages of the outcome show a distinct upward jump at the cutoff, where outcomes for students just above c are markedly higher than for those just below, reflecting the treatment's impact. Non-parametric approaches can visualize this discontinuity by smoothing the data locally around the cutoff (see the Non-Parametric Methods section under Estimation Techniques). The size of this jump in the outcome at the cutoff estimates the causal effect of the scholarship for students with test scores near c, as the sharp change in treatment probability from 0 to 1 occurs while potential confounders vary continuously across the threshold. This local average treatment effect captures the impact for the subpopulation induced to receive treatment by the cutoff rule. Under the key assumption that individuals have only imprecise control over the running variable near the cutoff—preventing strategic sorting—the sharp RDD can be interpreted through a local randomization lens, where treatment assignment mimics a randomized experiment within a narrow bandwidth around c. This equivalence supports unbiased estimation by treating units just below and above the cutoff as comparable groups, with the discontinuity isolating the scholarship's true effect from selection biases.
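A rough sketch of the binned-averages computation described above, on simulated scholarship data (the score distribution, cutoff, and bin width are assumed purely for illustration): averaging the outcome within evenly spaced score bins on each side of the cutoff produces the points that would appear in an RD plot, with the jump visible in the bin means on either side of the threshold.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scholarship example: test score X, cutoff c, outcome Y (e.g., an aspiration index).
n, c = 5000, 70.0
X = rng.normal(65, 10, n)                        # test scores (assumed distribution)
D = (X >= c).astype(float)                       # scholarship awarded at or above the cutoff
Y = 0.05 * X + 1.0 * D + rng.normal(0, 1, n)     # smooth trend plus a jump at c

# Binned averages of the outcome, computed in evenly spaced score bins that do not
# straddle the cutoff; these are the points one would plot against bin midpoints.
edges = np.arange(40, 101, 2.5)                  # 70 is itself a bin edge
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (X >= lo) & (X < hi)
    if mask.any():
        print(f"scores [{lo:5.1f}, {hi:5.1f}): mean outcome = {Y[mask].mean():5.2f}")
```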

Real-World Policy Application

One prominent real-world application of regression discontinuity design (RDD) is the evaluation of Israel's class size policy, which mandates splitting classes exceeding 40 students into two sections, effectively reducing average class size at that threshold. This policy, rooted in a historical guideline from the 12th-century scholar Maimonides limiting classes to 40 pupils, creates a sharp discontinuity in treatment assignment based on enrollment numbers as the running variable. Researchers Joshua Angrist and Victor Lavy exploited this rule in their 1999 study using administrative data from Israeli public schools in 1995, focusing on fourth- and fifth-grade students where enrollment discontinuities were evident. The analysis revealed a clear discontinuity in pupil-teacher ratios at the 40-student cutoff, with classes just above the threshold experiencing approximately a 20-student reduction in average class size compared to those just below, consistent with the RDD framework in which predicted class size changes discontinuously at the threshold. On outcomes, the study measured student achievement via standardized test scores in Hebrew, mathematics, and English; it found significant positive jumps in math and reading scores for fourth and fifth graders at the cutoff, equivalent to about 0.2 to 0.3 standard deviations higher performance in smaller classes, though no such effects appeared for third graders. These findings highlight how smaller classes can boost achievement in later primary grades, influencing subsequent policy discussions on education funding. Applying RDD in this context presented challenges, particularly in selecting an optimal bandwidth around the cutoff to balance bias reduction and precision, as overly narrow windows risked insufficient observations near the threshold due to natural variation in enrollment. Data considerations included ensuring the running variable—total grade-level enrollment—was continuously distributed and not manipulated by schools, with the dataset covering over 2,000 schools but yielding limited cases exactly at the discontinuity points, necessitating robust local polynomial regressions to estimate effects reliably. Despite these hurdles, the design's quasi-experimental nature provided credible causal evidence on class size impacts without randomization.
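The enrollment rule itself can be written as a simple function: with a 40-pupil maximum, a cohort of e students is divided into the smallest number of classes in which no class exceeds 40, producing the sawtooth pattern of predicted class size that Angrist and Lavy exploited. The sketch below, with hypothetical enrollment values, shows the sharp drops just above 40, 80, and 120 students.

```python
# Predicted class size under the 40-pupil rule: a cohort of e students is split into
# floor((e - 1) / 40) + 1 classes, so predicted class size is e divided by that count.
def predicted_class_size(enrollment: int) -> float:
    n_classes = (enrollment - 1) // 40 + 1
    return enrollment / n_classes

# The discontinuity at the cutoff: 40 students form one class of 40,
# while 41 students are split into two classes averaging 20.5.
for e in (39, 40, 41, 80, 81, 120, 121):
    print(e, round(predicted_class_size(e), 1))
```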

Methodology

Treatment Assignment and Running Variable

In regression discontinuity design (RDD), the running variable, often denoted as X, is a continuous covariate that determines treatment assignment through a predefined cutoff value c. For instance, in educational settings, X might represent a standardized test score, while in policy evaluations, it could be an individual's age or income level. Treatment D is assigned such that D = 1 if X \geq c and D = 0 otherwise, creating a sharp discontinuity in treatment probability at the cutoff. This setup allows researchers to compare outcomes for units just above and below c, exploiting the discontinuity for causal inference. In the sharp RDD framework, treatment assignment exhibits perfect compliance, meaning the treatment status D is a deterministic step function of the running variable, jumping discontinuously from 0 to 1 precisely at the cutoff c. This contrasts with scenarios of imperfect compliance and ensures that all eligible units above c receive treatment without exception. The original conceptualization of this design traces back to evaluations of scholarship awards based on test scores, where the cutoff enforced strict assignment rules. Selecting an appropriate running variable is crucial for the validity of RDD; it should ideally be continuous and precisely measured, though discrete variables can be accommodated with suitable methods, and relevant to the treatment decision, while remaining exogenous to the outcome except through the cutoff-induced discontinuity. For example, age serves as a suitable running variable in studies of pension eligibility because it is continuously distributed and not manipulable by individuals. The variable should also exhibit positive density around c to ensure sufficient observations for comparison. Improper choice, such as a discrete or easily manipulated variable, can undermine the design's quasi-experimental properties. The bandwidth refers to the restricted range of the running variable centered around the cutoff c—typically symmetric and narrowing as the analysis focuses on units closer to c—which defines the local scope for identification in RDD. By limiting the sample to this bandwidth, researchers isolate the causal effect at the discontinuity, treating units within it as comparable except for treatment status. Bandwidth selection balances bias from model misspecification against variance from fewer observations, often guided by data-driven methods to optimize precision. This local focus ensures that the estimated effect approximates the average treatment effect for marginal units at c.

Sharp Regression Discontinuity

In the sharp regression discontinuity design (SRD), treatment assignment is strictly deterministic, with units receiving treatment if their running variable X exceeds a fixed cutoff c, and no treatment otherwise. This setup contrasts with scenarios of partial compliance and ensures that the treatment indicator T = 1\{X \geq c\} perfectly predicts treatment receipt. The statistical model for the observed outcome Y is typically expressed as Y = \beta_0 + \beta_1 T + f(X) + \varepsilon, where \beta_1 captures the treatment effect at the cutoff, f(X) is a smooth function of the running variable, and \varepsilon is an error term. This formulation assumes that any discontinuity in the conditional expectation of Y given X at c arises solely from the treatment assignment. Under the potential outcomes framework, each unit has untreated outcome Y(0) and treated outcome Y(1), with the observed outcome given by Y = (1 - T) Y(0) + T Y(1). The key assumption in SRD is the continuity of the conditional expectations E[Y(0) \mid X = x] and E[Y(1) \mid X = x] at x = c, implying that potential outcomes evolve smoothly around the cutoff absent treatment. This continuity holds if units cannot precisely manipulate X to cross c, ensuring a locally random assignment akin to an experiment near the threshold. The treatment effect is identified by the jump in the regression function at the cutoff: \lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x] = E[Y(1) - Y(0) \mid X = c]. This estimand represents the average treatment effect (ATE) for the subpopulation at X = c, as treatment assignment is deterministic and all units comply with the rule, effectively making it the local ATE without the complications of noncompliance seen in other designs.

Estimation Techniques

Non-Parametric Methods

Non-parametric methods in regression discontinuity design (RDD) emphasize flexible, data-driven estimation near the cutoff without assuming a specific global functional form for the conditional expectation of the outcome. These approaches leverage the local continuity assumption to isolate the treatment effect as a discontinuity in the outcome at the cutoff point, typically using kernel-weighted or polynomial-based smoothers restricted to a narrow bandwidth around the cutoff. A cornerstone of non-parametric estimation in RDD is local polynomial regression, which fits low-order polynomials separately to observations on each side of the cutoff to approximate the underlying regression functions. Common choices include linear (order p=1) or quadratic (order p=2) polynomials, with linear often preferred for its bias reduction properties at boundaries. In a sharp RDD, the treatment effect is recovered as the vertical jump in the fitted regression functions evaluated at the cutoff. The estimation proceeds by minimizing a weighted least squares criterion within a bandwidth h, yielding fitted values of the form \hat{Y}(x) = \sum_{k=0}^{p} \gamma_k (x - c)^k \quad \text{for} \quad x < c, and analogously for x \geq c with coefficients \beta_k, where c is the cutoff and weights decline with distance from c. The local treatment effect estimate is then \hat{\tau} = \beta_0 - \gamma_0, representing the difference in intercepts at c. Higher-order terms allow for curvature in the regression functions but require careful order selection to balance bias and variance. Bandwidth selection is critical to non-parametric RDD, as it determines the trade-off between bias (from including distant observations) and variance (from using few observations). Optimal h minimizes asymptotic mean squared error and can be computed via plug-in rules, such as the Imbens-Kalyanaraman method, which derives a data-dependent bandwidth tailored to local linear regression under standard smoothness assumptions. Cross-validation alternatives further refine h by minimizing out-of-sample prediction error within the bandwidth. Results from non-parametric estimation are commonly visualized through scatterplots of the outcome against the running variable, overlaid with fitted polynomial curves and pointwise confidence bands on either side of the cutoff. These plots facilitate intuitive assessment of the discontinuity's presence and magnitude while highlighting sensitivity to bandwidth or polynomial order choices.
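A compact sketch of the local linear estimator with a triangular kernel, applied to simulated data with an assumed bandwidth; the function name and data-generating process are illustrative rather than taken from any particular software package.

```python
import numpy as np

def local_linear_rd(y, x, c, h):
    """Sharp RD estimate via local linear regression with a triangular kernel.

    Fits weighted least squares separately below and above the cutoff c within
    bandwidth h and returns the difference in fitted intercepts at c.
    """
    def fit_intercept(mask):
        xc = x[mask] - c                              # center at the cutoff
        w = np.clip(1 - np.abs(xc) / h, 0, None)      # triangular kernel weights
        Z = np.column_stack([np.ones_like(xc), xc])
        beta = np.linalg.solve(Z.T @ (w[:, None] * Z), Z.T @ (w * y[mask]))
        return beta[0]                                # fitted value at X = c

    left = (x < c) & (x >= c - h)
    right = (x >= c) & (x <= c + h)
    return fit_intercept(right) - fit_intercept(left)

# Example on simulated data (assumed data-generating process and bandwidth).
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 4000)
y = 0.5 + x + 0.3 * x**2 + 1.5 * (x >= 0) + rng.normal(0, 1, 4000)
print(local_linear_rd(y, x, c=0.0, h=0.25))           # should be close to the true jump of 1.5
```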

Parametric Methods

Parametric methods in regression discontinuity design (RDD) involve fitting structured global models to the entire sample to estimate the treatment effect at the cutoff, leveraging functional form assumptions for efficiency in larger datasets. These approaches contrast with local methods by imposing a specific polynomial structure across the running variable, which can reduce variance at the cost of potential bias if the assumed form is misspecified. A common parametric specification is global polynomial regression, where the outcome Y is modeled as Y = \beta_0 + \tau D + \sum_{k=1}^p \beta_k X^k + \sum_{k=1}^p \delta_k D \cdot X^k + \varepsilon. Here, D is the treatment indicator (1 if X \geq c, 0 otherwise), X is the running variable centered at the cutoff c, and the interaction terms D \cdot X^k allow the polynomial to differ on either side of the cutoff. The treatment effect is captured by \tau, representing the discontinuity in the conditional expectation function at c. This flexible form accommodates nonlinearity while maintaining continuity in the absence of treatment. Piecewise polynomials, including splines, extend this by explicitly fitting separate polynomial functions to the left and right of the cutoff, ensuring continuity in the regression function except for the treatment-induced jump. For instance, cubic splines can be used to model smooth transitions, with knots at the cutoff to allow differing curvatures on each side without global high-order terms. These methods are particularly useful when the relationship between the outcome and running variable exhibits distinct patterns pre- and post-cutoff. Higher-order polynomial terms in these models reduce bias by better approximating the true underlying functions but increase variance, as coefficients for distant terms become imprecise and sensitive to outliers far from the cutoff. Researchers must balance this trade-off, often selecting orders (e.g., quadratic or cubic) based on fit criteria like AIC or by reporting results across multiple specifications to assess robustness. Excessive orders, such as fifth or higher, are generally discouraged due to overfitting risks. In implementation, parametric RDD estimates are typically obtained via ordinary least squares, with robust standard errors clustered at the running variable values (or discrete bins if applicable) to account for heteroskedasticity and potential serial correlation near the cutoff. This clustering adjusts for dependencies within groups defined by the running variable, enhancing inference validity. Bandwidth selection is less central than in local methods, though sensitivity checks across sample restrictions around the cutoff are recommended.
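As an illustration, the global polynomial specification above can be estimated by ordinary least squares on a design matrix that interacts the treatment indicator with powers of the centered running variable; the sketch below uses simulated data and a quadratic order chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data (assumed DGP): outcome with a jump of 2.0 at the cutoff c = 0.
n, c = 3000, 0.0
x = rng.uniform(-2, 2, n)
d = (x >= c).astype(float)
y = 1.0 + 0.8 * x - 0.2 * x**2 + 2.0 * d + rng.normal(0, 1, n)

# Global quadratic specification with slope and curvature allowed to differ by side:
# Y = b0 + tau*D + b1*X + b2*X^2 + g1*D*X + g2*D*X^2 + e, with X centered at c.
xc = x - c
Z = np.column_stack([np.ones(n), d, xc, xc**2, d * xc, d * xc**2])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
tau_hat = coef[1]                                # coefficient on D = discontinuity at the cutoff
print(f"global quadratic RD estimate: {tau_hat:.2f}")
```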

Assumptions

Continuity of Potential Outcomes

In regression discontinuity design (RDD), the continuity of potential outcomes represents the core identifying assumption that enables causal inference at the treatment cutoff. Formally, this assumption posits that the conditional expectations of the potential outcomes under no treatment, E[Y(0) | X = x], and under treatment, E[Y(1) | X = x], are continuous functions of the running variable X at the cutoff value c. Absent the treatment, no other factors induce a discontinuity in these expectations, ensuring that the only jump in observed outcomes at c stems from the treatment effect itself. This framework, where X serves as the assignment variable, underpins the design's validity by isolating the treatment's local impact. The continuity assumption facilitates counterfactual prediction across the cutoff, allowing researchers to estimate the average treatment effect at c as the difference in the limits of the conditional expectation of the observed outcome Y: \tau = \lim_{x \downarrow c} E[Y | X = x] - \lim_{x \uparrow c} E[Y | X = x]. Here, observations just below c (untreated) provide a counterfactual for those just above (treated), and vice versa, under the smoothness condition. This local identification mimics the structure of a randomized experiment near c, where assignment approximates randomness due to imprecise control over X, thereby balancing unobserved confounders on either side of the threshold. Violations of this assumption arise when unobserved determinants of outcomes exhibit discontinuities at c, such as through sorting behaviors where individuals strategically position themselves relative to the cutoff, altering the composition of groups on either side and creating jumps in untreated potential outcomes. For instance, in educational settings, students or schools may bunch at enrollment thresholds, leading to heterogeneous untreated outcomes. Similarly, anticipation effects—such as pre-cutoff behavioral changes in response to known future treatment eligibility—can induce discontinuities in E[Y(0) | X = x], as seen in age-based policy implementations where individuals adjust actions in advance. These violations undermine the counterfactual validity, potentially biasing treatment effect estimates.

No Manipulation of the Running Variable

A key assumption in the regression discontinuity design (RDD) is that agents cannot precisely manipulate the running variable X near the cutoff c, implying that the probability density function of X, denoted f(X), remains continuous at c. This continuity ensures no systematic bunching of observations immediately above or below the threshold, which would otherwise indicate strategic sorting by individuals seeking to access or avoid treatment. Without this assumption, the local randomization property of RDD—treating observations just above and below c as comparable—could be violated, as manipulators would self-select into the preferred side of the cutoff based on their anticipated outcomes. The rationale for this assumption stems from the expectation that precise control over X would distort its distribution, creating detectable discontinuities. For instance, if treatment confers benefits, agents might adjust X to cluster just above c, such as by retaking standardized tests multiple times to narrowly meet a scholarship eligibility score, resulting in excess density on the treatment side and a gap below. Conversely, if treatment is undesirable, bunching might occur just below c. Such behavior undermines the design's validity because it introduces selection bias, where treated and control units near c differ systematically in unobservables related to the outcome. This assumption is particularly crucial in sharp designs, where treatment assignment is deterministic based on X \geq c, but it also applies to fuzzy designs, where manipulation could still affect compliance probabilities. To assess potential violations intuitively, researchers examine the histogram or kernel density estimate of X around c; a visible jump in height at the threshold suggests manipulation. The McCrary test formalizes this by estimating the density separately on each side of c using local polynomial regression on binned data and testing for a significant difference in the limiting densities, providing evidence against continuity if rejected. In applications like close elections where vote shares determine funding, the absence of density jumps supports the assumption, whereas discontinuities in school enrollment histograms have revealed manipulation in class size cap studies. From a policy perspective, the no manipulation assumption encourages the selection of running variables and cutoffs that are inherently difficult to game, enhancing RDD credibility. Examples include birthdates for determining school entry eligibility, which parents cannot easily alter, in contrast to self-reported income thresholds or adjustable exam scores that invite strategic behavior. This design choice aligns with the broader continuity assumption for potential outcomes by ensuring the input process generates quasi-random variation near c, though it focuses specifically on the ex ante distribution of X rather than outcome smoothness.

Validity Testing

Density and Manipulation Tests

Density and manipulation tests in regression discontinuity designs (RDD) assess whether individuals or units can manipulate the running variable around the cutoff c, which would violate the no-manipulation assumption essential for valid causal inference. These tests focus on detecting discontinuities in the density of the running variable, as manipulation often leads to bunching or sorting behavior near the threshold. The McCrary density test, introduced in a seminal 2008 paper, provides a formal method to evaluate continuity in the running variable's density function at the cutoff. It involves estimating kernel densities separately to the left (f^-) and right (f^+) of c using local linear regression on a finely binned histogram of the running variable. Specifically, for points r near c, the density \hat{f}(r) is obtained by minimizing a weighted least squares objective with a triangular kernel, applied conditionally on whether r is above or below c: L(\phi_1, \phi_2, r) = \sum_{j=1}^J \left[ Y_j - \phi_1 - \phi_2 (X_j - r) \right]^2 K\left( \frac{X_j - r}{h} \right) \left[ 1(X_j > c) 1(r \geq c) + 1(X_j < c) 1(r < c) \right], where Y_j are bin counts, X_j are bin midpoints, K(t) = \max\{0, 1 - |t|\}, and h is the bandwidth; the density estimate is then \hat{f}(r) = \hat{\phi}_1. The test statistic is the log difference \hat{\theta} = \ln \hat{f}^+(c) - \ln \hat{f}^-(c), with a t-statistic based on the asymptotic standard error: \hat{\sigma}_\theta = \sqrt{\frac{1}{nh} \frac{24}{5} \left( \frac{1}{\hat{f}^+(c)} + \frac{1}{\hat{f}^-(c)} \right)}, under the null of no discontinuity (\theta = 0). A significant jump indicates potential manipulation, as seen in applications like close elections where candidates may inflate vote margins. Bunching analysis complements the density test by quantifying the extent of manipulation through the excess mass in the distribution just below or above the cutoff. This involves fitting a flexible counterfactual density (e.g., using polynomials or local polynomials) to the running variable away from c and measuring the area of the "bunch" relative to this baseline, often expressed as the proportion of excess observations B = \int_{c-\delta}^{c} (\hat{f}(x) - f_{\text{counterfactual}}(x)) dx for some window \delta. In RDD contexts, significant excess mass below c (and corresponding deficit above) signals strategic sorting, as agents shift to the preferred side of the threshold. This approach, adapted from elasticity estimation, has been formalized for density discontinuities in RDD to estimate manipulation magnitudes. These tests have limited power against imperfect or non-monotonic manipulation, potentially yielding false negatives if agents face frictions or cannot precisely control the running variable, such as random components in scoring systems. For instance, the McCrary test may fail to detect issues in settings with small sample sizes or smooth densities, underscoring the need for complementary validity checks. Interpretation should consider context, as a null result supports but does not prove the no-manipulation assumption.
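A simplified sketch of the histogram-plus-local-linear procedure described above, applied to simulated data with no manipulation; the bin width, bandwidth, and function name are illustrative choices, and the standard error uses the asymptotic formula given above.

```python
import numpy as np

def density_jump_test(x, c, bin_width, h):
    """Simplified McCrary-style test: estimate the density of the running variable
    just below and just above the cutoff c from a binned histogram, using local
    linear regression with a triangular kernel, and return the log difference in
    densities with its approximate standard error."""
    n = x.size
    # Step 1: histogram with bins chosen so that the cutoff is a bin edge.
    lo = c - bin_width * np.ceil((c - x.min()) / bin_width)
    hi = c + bin_width * np.ceil((x.max() - c) / bin_width)
    edges = np.arange(lo, hi + bin_width / 2, bin_width)
    counts, _ = np.histogram(x, bins=edges)
    mids = edges[:-1] + bin_width / 2
    dens = counts / (n * bin_width)                  # normalized bin heights

    # Step 2: local linear fit of bin heights at the cutoff, one side at a time.
    def fit_at_cutoff(mask):
        xm = mids[mask] - c
        w = np.clip(1 - np.abs(xm) / h, 0, None)     # triangular kernel weights
        Z = np.column_stack([np.ones_like(xm), xm])
        beta = np.linalg.solve(Z.T @ (w[:, None] * Z), Z.T @ (w * dens[mask]))
        return beta[0]

    f_minus = fit_at_cutoff(mids < c)
    f_plus = fit_at_cutoff(mids >= c)
    theta = np.log(f_plus) - np.log(f_minus)
    se = np.sqrt((1 / (n * h)) * (24 / 5) * (1 / f_plus + 1 / f_minus))
    return theta, se

# Example on data with no manipulation (assumed DGP): the test should not reject.
rng = np.random.default_rng(4)
x = rng.normal(0, 1, 20000)
theta, se = density_jump_test(x, c=0.5, bin_width=0.05, h=0.5)
print(f"log density difference {theta:.3f}, t-statistic {theta / se:.2f}")
```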

Covariate Continuity Tests

Covariate continuity tests serve as a key falsification strategy in regression discontinuity design (RDD) by examining whether observable covariates, which should theoretically remain continuous at the cutoff c, exhibit discontinuities in their conditional expectations. These tests involve applying the same RDD estimation procedures—such as local polynomial regression—to baseline covariates like age, income, education, or demographic characteristics, rather than the outcome variable of interest. Under the continuity assumption extended to observables (detailed in the Continuity of Potential Outcomes section), no significant jumps should occur at c, as treatment assignment should not affect predetermined variables. A seminal application appears in analyses of U.S. House elections, where tests on pre-determined district characteristics like past vote shares and candidate experience showed no discontinuities near the 50% vote share threshold, supporting the design's validity. When multiple covariates are tested, researchers must account for the multiple comparisons problem to avoid spurious rejections due to chance. A common adjustment is the Bonferroni correction, which divides the nominal significance level (e.g., 0.05) by the number of covariates tested, such as 20 in evaluations of labor market reforms, yielding a stricter threshold like 0.0025. Joint tests, such as F-tests or chi-squared statistics aggregating discontinuities across covariates, can also assess overall covariate balance. For instance, in RDD studies, these methods confirm balance when p-values remain insignificant post-correction, indicating no systematic differences near the cutoff. Significant discontinuities in covariates signal potential violations of RDD assumptions, suggesting manipulation, where individuals or units sort across the cutoff based on observables, implying possible sorting on unobservables or non-exogeneity of the running variable. Such failures undermine the local randomization interpretation, as they indicate factors influencing both treatment assignment and outcomes. To further validate, placebo cutoff tests apply RDD estimation to covariates at arbitrary points away from the true c, expecting no discontinuities; unexpected jumps at these fake thresholds cast additional doubt on the design.
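The sketch below illustrates a simplified balance check on simulated covariates: rather than the full local polynomial procedure, it compares covariate means within a narrow window on each side of the cutoff and applies a Bonferroni-adjusted significance level. The covariate names, window width, and data-generating process are assumed purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Simulated predetermined covariates, drawn independently of the running variable,
# so by construction none should exhibit a discontinuity at the cutoff.
n, c, h = 5000, 0.0, 0.2
x = rng.uniform(-1, 1, n)                        # running variable
covariates = {
    "age": 40 + rng.normal(0, 3, n),
    "female": rng.binomial(1, 0.5, n).astype(float),
    "prior_income": 30 + rng.normal(0, 8, n),
}

# Simplified balance check: compare covariate means just below and just above the
# cutoff within a narrow window, applying a Bonferroni-adjusted significance level.
below = (x >= c - h) & (x < c)
above = (x >= c) & (x <= c + h)
alpha = 0.05 / len(covariates)                   # Bonferroni correction
for name, z in covariates.items():
    t, p = stats.ttest_ind(z[above], z[below], equal_var=False)
    flag = "REJECT" if p < alpha else "ok"
    diff = z[above].mean() - z[below].mean()
    print(f"{name:>12}: diff = {diff:6.3f}, p = {p:.3f} [{flag}]")
```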

Advantages and Limitations

Key Advantages

One of the primary strengths of the regression discontinuity design (RDD) lies in its transparency, as the treatment assignment rule is explicitly defined by a known cutoff value in the running variable, making the identification strategy straightforward to understand and replicate. This clarity is often enhanced through graphical representations, such as scatterplots of the outcome against the running variable with fitted regression lines on either side of the cutoff, which visually demonstrate the discontinuity and build credibility without relying on complex assumptions. RDD excels at estimating local causal effects for individuals near the cutoff, where treatment status changes discontinuously, effectively mimicking a randomized experiment in that narrow subpopulation and minimizing confounding from observed or unobserved covariates that vary smoothly across the threshold. By focusing on this "marginal" group—those just above and below the cutoff—the design isolates the treatment's impact without needing to adjust for global selection biases that plague broader observational analyses. The design's robustness to unobservables stems from the continuity assumption, which posits that potential outcomes and other covariates are smooth functions of the running variable around the cutoff, allowing omitted variables to be handled as long as they do not exhibit jumps at the threshold. This local randomization-like property ensures that units on either side of the cutoff are comparable in expectation, providing high internal validity even in the presence of confounding factors that are not perfectly measured. RDD has gained empirical prevalence in policy evaluation due to the abundance of natural experiments featuring sharp cutoffs, such as age-based eligibility thresholds for scholarships or retirement benefits, enabling credible causal inference in fields like education, labor economics, and political economy. Since the late 1990s, its adoption has surged, with hundreds of applications in economics and related social sciences demonstrating its practicality for assessing real-world interventions.

Main Limitations

One primary limitation of the regression discontinuity design (RDD) is that it identifies causal effects that are local to the cutoff point, meaning estimates apply only to individuals or units very close to the threshold and may not generalize to the broader population or policy-affected group. This restriction on external validity arises because the design exploits the discontinuity at a specific point, treating observations near the cutoff as approximately randomized, but effects farther away remain unidentified without additional assumptions. For instance, in evaluating scholarship eligibility based on test scores, the estimated impact reflects outcomes for students scoring just above or below the cutoff, not necessarily for all eligible students. RDD estimates are also highly sensitive to the choice of bandwidth, the range of the running variable included around the cutoff; narrower bandwidths reduce bias but often result in small sample sizes and high variance due to limited observations, while wider bandwidths increase precision at the risk of incorporating heterogeneous effects that violate the design's assumptions. Optimal bandwidth selection methods, such as cross-validation or bias-corrected approaches, aim to balance these trade-offs, but results can still vary substantially across reasonable choices, necessitating robustness checks. If the no-manipulation assumption fails—such as when individuals strategically sort around the cutoff to gain treatment—the design can produce severe bias, as the composition of groups on either side of the threshold becomes systematically different. This risk is particularly acute in settings with high stakes, like elections or financial aid, where density tests can detect bunching but cannot fully rule out subtle manipulation. Finally, RDD requires dense and precise data around the cutoff to achieve reliable estimates; sparse or coarse data, common for rare events or administrative thresholds, can lead to imprecise results or infeasible implementation, often making the design costly or impractical in such contexts.
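A short sketch of the kind of bandwidth robustness check this implies, using simulated data with an assumed true jump: the point estimate and the number of usable observations are reported across several candidate bandwidths.

```python
import numpy as np

rng = np.random.default_rng(6)

# Simulated sharp RDD (assumed DGP with a true jump of 1.0 at c = 0).
n, c = 4000, 0.0
x = rng.uniform(-1, 1, n)
y = 0.5 + 1.2 * x + 1.0 * (x >= c) + rng.normal(0, 1, n)

def rd_estimate(h):
    """Local linear RD estimate within bandwidth h (uniform kernel for simplicity)."""
    intercepts = []
    for mask in ((x >= c - h) & (x < c), (x >= c) & (x <= c + h)):
        coef = np.polyfit(x[mask] - c, y[mask], 1)   # returns [slope, intercept]
        intercepts.append(coef[1])                   # fitted value at the cutoff
    return intercepts[1] - intercepts[0]

# Robustness check: report the estimate across a range of reasonable bandwidths.
for h in (0.05, 0.10, 0.20, 0.40, 0.80):
    n_obs = int(np.sum(np.abs(x - c) <= h))
    print(f"h = {h:4.2f}: tau_hat = {rd_estimate(h):5.2f}  (n within bandwidth = {n_obs})")
```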

Extensions

Fuzzy Regression Discontinuity

In fuzzy regression discontinuity design (RDD), the cutoff-based assignment rule does not guarantee perfect compliance at the cutoff c, but instead induces a discontinuity in the probability of receiving treatment, P(D=1|X), where D indicates treatment receipt and X is the running variable. This setup arises when eligibility at the cutoff increases the likelihood of treatment uptake without ensuring it for all units, such as due to self-selection or incomplete take-up. Unlike the sharp RDD, where the treatment probability jumps from 0 to 1 exactly at c, the fuzzy case models partial compliance, allowing for estimation of effects among those who respond to the assignment rule. The fuzzy RDD identifies the local average treatment effect (LATE) for compliers—units that receive treatment only if assigned eligibility—under the standard RDD assumptions of continuity in potential outcomes and no manipulation of the running variable. Specifically, the effect \tau is given by the ratio of the intent-to-treat (ITT) effect on the outcome Y to the first-stage effect on treatment probability: \tau = \frac{\lim_{x \to c^+} E[Y|X=x] - \lim_{x \to c^-} E[Y|X=x]}{\lim_{x \to c^+} P(D=1|X=x) - \lim_{x \to c^-} P(D=1|X=x)}. This Wald estimand isolates the causal impact for compliers local to the cutoff, assuming the instrument (cutoff-based assignment) is valid. The sharp RDD emerges as a special case when the first-stage jump equals 1, implying full compliance. Estimation in fuzzy RDD typically employs two-stage least squares (2SLS), using an indicator for being above the cutoff as the instrument for actual treatment receipt. In the first stage, the treatment indicator D is regressed on the cutoff indicator and running variable (often via local polynomials or bandwidth-restricted samples); the second stage then uses the predicted treatment to estimate the effect on Y. This approach accounts for endogeneity in treatment receipt while leveraging the discontinuity, with standard errors adjusted for the instrumental variables framework. A representative example is the evaluation of scholarship programs, where eligibility is determined by a test score cutoff, such as a score of 80%. Students just above the cutoff become eligible, increasing the probability of receiving the scholarship (e.g., from 20% to 60%), but not all eligible students take it up due to factors like alternative funding or lack of interest. The fuzzy RDD then estimates the scholarship's impact on outcomes like college enrollment or graduation rates for compliers—those induced to accept by eligibility—by dividing the discontinuity in the outcome by the first-stage jump in take-up.
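A minimal sketch of this calculation on simulated data, with take-up probabilities, bandwidth, and effect size assumed for illustration: the reduced-form jump in the outcome is divided by the first-stage jump in treatment receipt, yielding the Wald/LATE estimate described above.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated fuzzy RDD (assumed DGP): crossing the cutoff raises take-up from ~20% to ~60%,
# and treatment raises the outcome by 2.0.
n, c, h = 20000, 0.0, 0.3
x = rng.uniform(-1, 1, n)
z = (x >= c).astype(float)                       # eligibility indicator (the instrument)
p_take = 0.2 + 0.4 * z                           # discontinuous take-up probability
d = rng.binomial(1, p_take)                      # actual treatment receipt
y = 1.0 + 0.5 * x + 2.0 * d + rng.normal(0, 1, n)

def jump(v):
    """Difference in local linear fits of v at the cutoff (uniform kernel within h)."""
    fits = []
    for mask in ((x >= c - h) & (x < c), (x >= c) & (x <= c + h)):
        fits.append(np.polyfit(x[mask] - c, v[mask], 1)[1])   # intercept at the cutoff
    return fits[1] - fits[0]

itt = jump(y)                                    # reduced-form jump in the outcome
first_stage = jump(d.astype(float))              # jump in treatment probability
tau_fuzzy = itt / first_stage                    # Wald/LATE estimate for compliers
print(f"ITT = {itt:.2f}, first stage = {first_stage:.2f}, LATE = {tau_fuzzy:.2f}")
```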

Regression Kink Design

The regression kink design (RKD) extends the regression discontinuity framework to settings where the treatment intensity, denoted as T(X), is a continuous function of the running variable X but features a change in slope at a known kink point c. Unlike sharp discontinuities in treatment assignment, RKD exploits this kink, where the derivative of T(X) with respect to X jumps at c, inducing a corresponding change in the slope of the conditional expectation of the outcome Y given X. This approach is particularly suited for policy evaluations where treatment intensity varies smoothly but program rules create predictable slope shifts, such as in benefit schedules or tax systems. Under standard assumptions analogous to those in regression discontinuity designs—continuity of potential outcomes and no manipulation of the running variable around the kink—the treatment effect is identified by the ratio of the differences in the derivatives of the outcome and treatment functions at the kink from either side. Specifically, the local treatment effect \tau is given by \tau = \frac{ \left. \frac{d E[Y \mid X]}{dX} \right|_{c^+} - \left. \frac{d E[Y \mid X]}{dX} \right|_{c^-} }{ \left. \frac{d T(X)}{dX} \right|_{c^+} - \left. \frac{d T(X)}{dX} \right|_{c^-} }. This identifies the causal effect of treatment intensity for units near the kink, capturing how the policy-induced change in intensity affects the outcome's responsiveness to the running variable. The method requires the density of X to be continuously differentiable at c to ensure valid local comparisons. Estimation in RKD typically involves fitting local polynomials separately on each side of the kink to recover the left- and right-hand slopes of E[Y \mid X], with the change in slope forming the numerator of the estimator. Common implementations use local linear or quadratic polynomials within a bandwidth around c, selected via cross-validation or plug-in rules like Imbens-Kalyanaraman, and robust standard errors to account for estimation bias. Alternatively, parametric spline models can impose global functional forms while allowing slope discontinuities at c, offering efficiency gains when the functional form is plausible. These methods adapt non-parametric techniques from standard regression discontinuity but emphasize derivative estimation over level jumps. A prominent application of RKD arises in analyzing income tax schedules, where marginal tax rates increase at income thresholds, creating kinks that alter the effective slope of after-tax income with respect to pre-tax income. This setup allows estimation of the elasticity of taxable income or earnings with respect to the net-of-tax rate, informing optimal taxation policy. For instance, studies exploiting kinks in tax brackets have estimated elasticities around 0.2 to 0.5 for high-income earners, highlighting behavioral responses to marginal rate changes without the bunching distortions seen at notches. Such analyses underscore RKD's value in quantifying subtle incentives in continuous environments.
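The slope-ratio estimand above can be illustrated with a small simulation in which the treatment schedule T(X) kinks at c (all parameter values are assumed): local linear slopes are estimated on each side of the kink for both the outcome and the treatment schedule, and their changes are divided.

```python
import numpy as np

rng = np.random.default_rng(8)

# Simulated kink (assumed DGP): treatment intensity T(X) has slope 0.5 below the kink
# and slope 0.2 above it; the outcome responds to T with a coefficient of 1.5.
n, c, h = 20000, 0.0, 0.4
x = rng.uniform(-1, 1, n)
t = np.where(x < c, 0.5 * x, 0.2 * x)            # kinked treatment schedule
y = 2.0 + 0.3 * x + 1.5 * t + rng.normal(0, 0.5, n)

def slope_change(v):
    """Change in the local linear slope of v at the kink, right minus left."""
    slopes = []
    for mask in ((x >= c - h) & (x < c), (x >= c) & (x <= c + h)):
        slopes.append(np.polyfit(x[mask] - c, v[mask], 1)[0])   # slope coefficient
    return slopes[1] - slopes[0]

tau_rkd = slope_change(y) / slope_change(t)      # ratio of slope changes at the kink
print(f"RKD estimate: {tau_rkd:.2f}  (true response coefficient 1.5)")
```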

References

  1. [1]
  2. [2]
    [PDF] Regression Discontinuity Designs in Economics - Princeton University
    Regression Discontinuity (RD) designs were first introduced by Donald L. Thistlethwaite and Donald T. Campbell. (1960) as a way of estimating treatment.
  3. [3]
    Using Maimonides' Rule to Estimate the Effect of Class Size on ...
    Jan 1, 1997 · Our use of Maimonides' rule can be viewed as an application of Campbell's (1969) regression-discontinuity design to the class size question. The ...
  4. [4]
    [PDF] Randomized experiments from non-random selection in U.S. House ...
    May 21, 2007 · These ideas are illustrated in an analysis of U.S. House elections, where the inherent uncertainty in the final vote count is plausible, which ...
  5. [5]
    [PDF] The Impact of Nearly Universal Insurance Coverage on Health Care ...
    Formal identification of an RD model that relates an outcome y (e.g., insurance coverage) to a treatment (Medicare age-eligibility) that depends on age ...
  6. [6]
    [PDF] Regression Discontinuity Designs: A Guide to Practice
    In the FRD design, there are four regression functions that need to be estimated: the expected outcome given the forcing variable, both on the left and right of ...
  7. [7]
    [PDF] ON INTERPRETING THE REGRESSION DISCONTINUITY DESIGN ...
    Our discussion highlights key distinctions between “locally randomized” RD designs and real experiments, including that statistical independence and random ...
  8. [8]
    Using Maimonides' Rule to Estimate the Effect of Class Size on ...
    Maimonides' rule of 40 is used here to construct instrumental variables estimates of effects of class size on test scores.
  9. [9]
    Regression-Discontinuity Analysis: An Alternative to the Ex-Post ...
    This study presents a method of testing causal hypotheses, called regression-discontinuity analysis, in situations where the investigator is unable to ...
  10. [10]
  11. [11]
    Regression Discontinuity Designs in Economics
    Regression Discontinuity Designs in Economics by David S. Lee and Thomas Lemieux. Published in volume 48, issue 2, pages 281-355 of Journal of Economic ...
  12. [12]
    Identification and Estimation of Treatment Effects with a Regression ...
    THE REGRESSION DISCONTINUITY (RD) data design is a quasi-experimental design with the defining characteristic that the probability of receiving treatment ...
  13. [13]
    [PDF] Local Polynomial Order in Regression Discontinuity Designs1
    Oct 21, 2014 · The seminal work of Hahn et al. (2001) has established local linear nonparametric regression as a standard approach for estimating the treatment ...
  14. [14]
    Optimal Bandwidth Choice for the Regression Discontinuity Estimator
    We investigate the choice of the bandwidth for the regression discontinuity estimator. We focus on estimation by local linear regression, which was shown to ...
  15. [15]
    [PDF] Why High-Order Polynomials Should Not Be Used in Regression ...
    We argue that controlling for global high-order polynomials in regression discontinuity analysis is a flawed approach with three major problems: it leads to ...
  16. [16]
    [PDF] A Density Test - University of California, Berkeley
    This paper describes identification problems encountered in the regression discontinuity design pertaining to manipulation of the running variable and describes ...
  17. [17]
    Manipulation of the running variable in the regression discontinuity ...
    This paper develops a test of manipulation related to continuity of the running variable density function. The methodology is applied to popular elections to ...
  18. [18]
    [PDF] Including Covariates in the Regression Discontinuity Design
    Second, as pointed out in Imbens and Lemieux (2008), covariates may mitigate small sample biases in cases where the number of observations close to the ...
  19. [19]
    [PDF] Regression Discontinuity Designs: A Guide to Practice
    This paper was prepared as an introduction to a special issue of the Journal of Econometrics on regression discontinuity designs.
  20. [20]
    [PDF] Regression Discontinuity Designs - RD Packages
    Abstract. The regression discontinuity (RD) design is one of the most widely used nonexperimental methods for causal inference and program evaluation.
  21. [21]
    [PDF] Regression Kink Design: Theory and Practice
    In this paper, we discuss how the methods that are now widely used in RDD estimation can be extended to implement a regression kink design (RKD), a term first ...
  22. [22]
    [PDF] The Elasticity of Taxable Income with Respect to Marginal Tax Rates
    This paper critically surveys the large and growing literature estimating the elasticity of taxable income ... kink point of the Earned Income Tax Credit96.