Field experiment

A field experiment is a research method in which investigators manipulate one or more independent variables in a natural, real-world environment to assess causal effects on outcomes, typically through random assignment to treatment and control conditions, thereby bridging the gap between controlled laboratory conditions and observational data. Emerging prominently in economics, political science, and other social sciences since the early 2000s, field experiments encompass three primary variants: artefactual field experiments, which apply laboratory-style tasks to non-standard (real-world) subjects; framed field experiments, which incorporate field-specific contexts into tasks, commodities, or information sets; and natural field experiments, where participants engage in genuine behaviors unaware of their involvement in the study. These approaches enable rigorous causal inference via randomization while capturing behaviors in authentic settings, such as testing incentives in labor markets or interventions in developing economies. Field experiments excel in providing high ecological validity and external generalizability compared to lab-based studies, as they reflect participants' natural responses amid real stakes and distractions, though they often entail trade-offs like diminished control over extraneous variables, higher costs, and risks of ethical issues arising from real-world manipulations. Their defining impact includes transforming development economics, exemplified by the Nobel Memorial Prize in Economic Sciences awarded to Abhijit Banerjee, Esther Duflo, and Michael Kremer for pioneering randomized field experiments to evaluate poverty alleviation strategies, demonstrating tangible effects of interventions like deworming programs on health and educational outcomes. Despite such successes, ongoing debates highlight limitations in scalability—small-scale trials may not replicate at population levels due to general equilibrium effects—and potential underestimation of long-term dynamics or spillovers, underscoring the need for complementary methods to ensure robust policy insights.

Definition and Fundamentals

Core Definition

A field experiment is a research methodology that incorporates controlled manipulation of independent variables and random assignment to conditions, akin to laboratory experiments, but conducts these interventions within participants' natural environments rather than artificial settings. This approach enables the observation of behavioral responses under realistic conditions, where extraneous variables like social norms, incentives, and contextual factors influence outcomes in ways that laboratory isolation cannot replicate. By embedding experimental rigor into everyday contexts—such as workplaces, markets, or communities—field experiments prioritize ecological validity, allowing inferences about causal effects that generalize beyond contrived scenarios. Key characteristics include the deliberate assignment of treatments to randomly selected groups to minimize selection bias and confounding, while permitting natural participant behaviors and external influences to unfold. Unlike purely observational studies, field experiments isolate treatment effects through this randomization, providing stronger evidence for causality than correlational data; however, they sacrifice some precision due to incomplete control over extraneous variables. In disciplines like economics and the behavioral sciences, variations such as natural field experiments involve covert interventions where subjects remain unaware of their participation, enhancing behavioral authenticity by avoiding Hawthorne effects. The primary aim is to bridge the gap between abstract theory and practical application, testing hypotheses in settings where decisions carry real stakes, such as financial or reputational costs. This method has proven particularly valuable for evaluating policy interventions, as evidenced by randomized trials in development economics that demonstrate causal impacts on outcomes like school attendance or technology adoption. Despite logistical challenges, field experiments yield findings with higher external validity, informing evidence-based decisions in complex systems.

Types and Variations

Field experiments are classified into types based on the extent to which they incorporate elements of the naturally occurring field environment, as delineated by Harrison and List in their 2004 taxonomy published in the Journal of Economic Literature. This framework evaluates experiments along dimensions such as subject pool (university students versus non-standard field participants), informational environment (abstract versus context-specific), tasks (standardized lab procedures versus field-relevant activities), and stakes (hypothetical or symbolic versus consequential real-world outcomes). The classification emphasizes a spectrum from designs retaining laboratory-like controls to those fully embedded in natural settings, enabling experimental control while varying naturalism. Artefactual field experiments employ standard laboratory protocols but recruit participants from non-laboratory populations, such as professionals or consumers in their typical environments, to test behavioral responses under controlled conditions. For instance, researchers might administer trust games—abstract economic tasks typically run in university labs—to field subjects like market vendors, preserving internal validity through randomization while introducing real-world participant heterogeneity. This type mitigates selection biases from student samples but limits generalizability due to artificial tasks and low stakes. Framed field experiments extend artefactual designs by embedding laboratory tasks within field-relevant contexts, such as using actual commodities as incentives or providing domain-specific instructions to enhance realism without altering core procedures. An example includes offering real consumer goods as prizes in games conducted with shoppers, which introduces salient payoffs and contextual cues to better approximate natural motivations. These experiments balance experimental control with increased ecological validity, though they may still suffer from awareness effects if participants recognize the contrived elements. Natural field experiments represent the most field-oriented type, involving interventions in everyday environments with field participants undertaking routine tasks, often without subjects' knowledge of their involvement to minimize behavioral distortions like Hawthorne effects. Classic examples encompass altering donation solicitations during fundraising campaigns or varying product prices in retail settings to observe purchasing patterns, leveraging randomization for causal identification amid genuine stakes and unobtrusive measurement. This variation excels in external validity for policy-relevant behaviors but demands careful ethical oversight and faces challenges in scalability and replication due to contextual dependencies. Variations across disciplines adapt these types to specific domains, such as economics' focus on incentive structures in markets or psychology's emphasis on social behavior in workplaces. In political science, natural field experiments often test voter mobilization via randomized mailings or door-to-door canvassing, as in Gerber and Green's 2000 study randomizing get-out-the-vote appeals across 29,380 households, in which face-to-face canvassing increased turnout by roughly 8.7 percentage points. Public health applications frequently employ framed or natural designs for interventions like randomized condom distribution in clinics, prioritizing real-world compliance over lab abstraction. Ethical and logistical adaptations, including covert versus overt implementations, further diversify designs, with covert approaches favored for behavioral authenticity despite consent controversies.

Comparison to Laboratory and Quasi-Experiments

Field experiments incorporate random assignment to treatments in naturalistic environments, paralleling laboratory experiments in enabling causal identification by equalizing groups on observables and unobservables, but diverging in setting to prioritize real-world applicability over isolation of mechanisms. Laboratory experiments achieve superior internal validity through meticulous control of extraneous variables in sterile conditions, minimizing confounds and demand effects, yet their contrived stimuli and participant pools often yield low external validity, as behaviors elicited may not translate beyond the lab. Field experiments, by embedding interventions amid authentic incentives, distractions, and social contexts, enhance ecological validity and generalizability, though they incur risks of spillover effects, non-compliance, and measurement noise that can dilute precision.
| Aspect | Laboratory Experiments | Field Experiments |
|---|---|---|
| Internal validity | High: rigorous controls and randomization isolate effects. | Moderate to high: randomization counters bias, but field confounds persist. |
| External validity | Low: artificial contexts limit real-world applicability. | High: natural settings capture genuine responses and stakes. |
| Implementation | Feasible and cost-effective with small samples. | Logistically demanding, prone to attrition, non-compliance, and ethical hurdles. |
Compared to quasi-experiments, which exploit natural variation or policy shocks without random assignment, field experiments furnish stronger causal evidence by directly manipulating treatments to avert the selection bias and confounding inherent in non-randomized comparisons. Quasi-experimental approaches, such as difference-in-differences or instrumental variables, demand auxiliary assumptions—like parallel trends or exclusion restrictions—to approximate randomization, rendering them more susceptible to model misspecification and unobserved heterogeneity. While both field experiments and quasi-experiments leverage real-world data for external validity, the former's randomization obviates reliance on such assumptions, yielding more robust inference when feasible, as evidenced in domains where field trials have overturned correlational findings from quasi-experimental designs.
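This logic can be stated compactly in potential outcomes notation; the following is a standard textbook decomposition (included as an illustration, not a result from any study cited here) showing why randomization removes the term that quasi-experimental designs must assume away.

```latex
\underbrace{E[Y \mid D{=}1] - E[Y \mid D{=}0]}_{\text{observed difference}}
= \underbrace{E[Y_1 - Y_0 \mid D{=}1]}_{\text{treatment effect on the treated}}
+ \underbrace{E[Y_0 \mid D{=}1] - E[Y_0 \mid D{=}0]}_{\text{selection bias}}
```

Under random assignment, treatment status D is independent of the potential outcomes (Y_0, Y_1), so the selection term is zero and the simple difference in means identifies the average treatment effect; difference-in-differences must instead assume that untreated outcome trends would have evolved in parallel across groups.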

Historical Development

Origins in Natural and Social Sciences

Field experiments in the natural sciences trace their origins to early efforts in medicine and agriculture, where researchers sought to test interventions amid uncontrolled environmental variables. In 1747, Scottish physician James Lind conducted a comparative trial aboard HMS Salisbury while at sea, selecting 12 sailors afflicted with scurvy and assigning them to six pairs receiving distinct dietary supplements, including oranges and lemons for one pair; the citrus-treated sailors recovered rapidly, establishing a causal link between vitamin C sources and scurvy prevention in a real-world maritime setting. This prospective, controlled intervention, though lacking full randomization, exemplified field experimentation by leveraging natural conditions to isolate treatment effects, influencing later designs. Agricultural field experiments advanced systematically in the early 20th century at the Rothamsted Experimental Station in England. Statistician Ronald A. Fisher, employed there from 1919, developed randomized block designs to mitigate soil heterogeneity and other field variability in crop yield trials, publishing foundational principles in his 1926 paper "The Arrangement of Field Experiments," which emphasized replication, randomization, and local control for valid inference. These methods enabled precise estimation of fertilizer, variety, and treatment effects on yields, forming the basis for modern experimental agriculture and extending to other natural sciences such as ecology. In the social sciences, field experiments emerged later, borrowing randomization and control from these precedents to examine behavior in naturalistic environments, often prioritizing ecological validity over laboratory isolation. Charles S. Peirce, working with psychologist Joseph Jastrow, introduced randomization into experimental designs in the 1880s to counter bias in psychophysical studies, laying groundwork for causal claims in behavioral contexts. By the mid-20th century, social psychologists applied these techniques to intergroup relations; for instance, Muzafer Sherif's 1954 Robbers Cave study assigned boys to competing camp groups to induce and resolve intergroup conflict, revealing realistic conditions for hostility formation and its reduction through superordinate goals. Such work highlighted field methods' utility for capturing spontaneous social processes, though early adoption was sporadic due to ethical concerns and logistical challenges in human subjects research.

Expansion in Economics Post-1990s

The expansion of field experiments in economics after the 1990s marked a shift toward randomized controlled trials (RCTs) as a primary tool for causal inference, particularly in development economics, where researchers sought to test micro-level interventions in real-world settings to address poverty and policy effectiveness. This period saw academics, rather than governments or firms, drive the methodology's adoption, contrasting with earlier waves of experimentation. Pioneering work began with Michael Kremer's 1997 RCT in western Kenya, which randomized textbook provision across schools to evaluate impacts on student learning, revealing minimal short-term gains and prompting scrutiny of conventional aid assumptions. By the early 2000s, this approach proliferated, with RCTs comprising a growing share of empirical studies; for instance, a 2016 analysis found that RCTs represented about 60% of development papers in top general-interest journals by that decade, up from negligible levels pre-1990s. Institutions formalized this expansion, amplifying its scale and rigor. In 2003, economists Abhijit Banerjee, Esther Duflo, and Sendhil Mullainathan co-founded the Abdul Latif Jameel Poverty Action Lab (J-PAL), which centralized RCT design, implementation, and replication, training researchers and partnering with governments in over 80 countries by 2020 to evaluate interventions like deworming programs and cash transfers. J-PAL's efforts contributed to over 1,000 RCTs by the mid-2010s, focusing on scalable policies; a notable example is the evaluation of Mexico's PROGRESA program, which randomized cash incentives for school attendance and health checkups, demonstrating sustained enrollment increases of 20% among poor households. This institutional push extended beyond development to labor economics, where field experiments tested hiring discrimination—such as Bertrand and Mullainathan's 2004 study sending otherwise identical resumes with Black- or White-sounding names, finding that White-sounding names received roughly 50% more callbacks—and behavioral nudges in savings or tax compliance. The methodology's growth reflected its advantages for causal inference, though not without debate over generalizability from specific contexts like rural Kenya or India to broader economies. By the 2010s, field experiments diversified into artefactual designs (lab-like tasks in natural settings) and framed experiments (context-specific incentives), with annual publications rising steadily from fewer than 10 in 1995 to over 100 by 2015 across subfields. The 2019 Nobel Memorial Prize in Economic Sciences awarded to Banerjee, Duflo, and Kremer underscored this era's influence, recognizing RCTs' role in evidence-based policymaking, such as demonstrating deworming's long-term earnings boosts of up to 20% in Kenyan cohorts tracked over 10 years. Despite critiques of a narrow focus on marginal interventions over structural reforms, the post-1990s surge established field experiments as a cornerstone of empirical economics, with over 5,000 registered trials by 2020 emphasizing randomization to isolate causal effects amid real-world variables.

Key Milestones and Nobel Recognition

The foundational principles of field experimentation emerged in agricultural science during the 19th century, with systematic trials at institutions like the Rothamsted Experimental Station in England, established in 1843, testing the effects of fertilizers, manures, and crop rotations on yields under varying soil conditions. A critical advancement came in the 1920s through Ronald A. Fisher's development of randomization techniques at Rothamsted, detailed in his 1925 book Statistical Methods for Research Workers and his 1935 work The Design of Experiments, which introduced blocking and replication to minimize bias and enable valid inference from field data. In the social sciences, field experiments expanded mid-20th century to evaluate public policies, exemplified by the U.S. negative income tax experiments from 1968 to 1982 across sites such as New Jersey and Seattle-Denver, which randomized households to assess work incentives and labor supply under guaranteed income schemes. Economics saw limited use until the post-1990s surge, driven by the integration of lab methods with natural settings; key early contributions included John List's 1990s-2000s studies on charitable giving and market behavior in real auctions, demonstrating how field evidence reveals deviations from theoretical predictions, such as the overstated altruism observed in laboratory dictator games. Nobel recognition underscores field experiments' causal rigor: the 2019 Sveriges Riksbank Prize in Economic Sciences awarded to Abhijit Banerjee, Esther Duflo, and Michael Kremer acknowledged their pioneering randomized evaluations of interventions like textbook provision in Kenya (Kremer's 1990s work) and deworming programs, which generated empirical evidence on poverty alleviation by isolating treatment effects in developing economies. This prize highlighted how thousands of field trials since the early 2000s, often run via organizations like the Abdul Latif Jameel Poverty Action Lab (founded 2003), have shifted policy from intuition to data-driven interventions.

Methodological Framework

Design and Randomization Principles

Field experiments employ random assignment as the cornerstone of their design to facilitate causal inference by creating comparable treatment and control groups in naturalistic environments. Randomization ensures that, on average, observable and unobservable covariates are balanced across groups, minimizing the selection bias and confounding factors that plague observational studies. This principle, rooted in the potential outcomes framework, allows researchers to estimate the average treatment effect (ATE) as the difference in outcomes between randomized groups, assuming the stable unit treatment value assumption (SUTVA) holds, which posits no interference between units and consistent treatment delivery. Design principles emphasize pre-specifying hypotheses, treatments, and outcomes to guard against selective reporting and p-hacking, with power calculations determining sample sizes sufficient for detecting effects of substantive magnitude—typically aiming for 80% power at a 5% significance level. Replication across multiple units per arm is essential to reduce sampling error and enable generalizable estimates, while blocking or stratification groups similar units (e.g., by characteristics like village size in agricultural trials) to enhance precision by accounting for heterogeneity. In field settings, cluster randomization is often preferred over individual-level assignment to mitigate spillovers, such as peer effects in educational interventions, where entire clusters (e.g., classrooms) receive the treatment; this preserves SUTVA at the cluster level but requires adjustments for intra-cluster correlation in analysis, inflating standard errors by factors that can exceed 2-10 depending on clustering strength. Randomization methods include simple randomization for small-scale studies, stratified randomization to balance key covariates explicitly, and more advanced techniques like restricted randomization (e.g., minimizing maximum imbalance) when full randomization risks poor covariate balance in finite samples. Ethical and logistical constraints in field contexts—such as consent requirements or implementation feasibility—necessitate adaptive designs, like phased rollouts or encouragement designs that use randomized encouragement as an instrumental variable, but these must maintain ex ante comparability to uphold internal validity. Empirical evidence shows that deviations from pure randomization, such as non-random assignment within strata, can introduce imbalances unless corrected via re-randomization or covariate adjustments, underscoring the need for transparency in randomization protocols published in pre-analysis plans.
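As a concrete illustration of these design steps, the sketch below (hypothetical numbers and names; standard formulas rather than any particular study's protocol) computes a per-arm sample size for a difference in means, inflates it by the cluster design effect 1 + (m − 1) × ICC, and performs blocked (stratified) random assignment.

```python
import numpy as np
from scipy.stats import norm

def sample_size_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Per-arm n for detecting a difference in means with a two-sided test."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(power)
    return int(np.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2))

def design_effect(cluster_size, icc):
    """Variance inflation factor when randomizing clusters rather than individuals."""
    return 1 + (cluster_size - 1) * icc

n_individual = sample_size_per_arm(effect=0.05, sd=0.25)  # hypothetical effect size and outcome SD
n_clustered = int(np.ceil(n_individual * design_effect(cluster_size=30, icc=0.05)))

def blocked_assignment(units, strata, seed=2024):
    """Assign half of each stratum to treatment (stratified/blocked randomization)."""
    rng = np.random.default_rng(seed)
    assignment = {}
    for stratum in sorted(set(strata)):
        members = [u for u, s in zip(units, strata) if s == stratum]
        rng.shuffle(members)
        half = len(members) // 2
        assignment.update({u: "treatment" for u in members[:half]})
        assignment.update({u: "control" for u in members[half:]})
    return assignment

villages = [f"village_{i}" for i in range(20)]
strata = ["small"] * 10 + ["large"] * 10
print(n_individual, n_clustered)
print(blocked_assignment(villages, strata))
```

Fixing the random seed and archiving the assignment script alongside a pre-analysis plan is one way to make such a randomization protocol auditable.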

Data Collection and Analysis

In field experiments, data collection emphasizes capturing real-world behavioral responses through a combination of unobtrusive observation, administrative records, and targeted surveys to minimize interference with natural settings. Researchers often leverage existing sources, such as transaction logs from retailers or official registries, to record outcomes like purchase volumes or turnout metrics without relying solely on participant self-reports, which reduces reporting bias. For instance, in a 2001 study by Levitt on sumo wrestling integrity, video footage and match records provided objective outcome data, enabling detection of anomalous win rates under varying match incentives. Similarly, economic field experiments frequently integrate digital tracking, like mobile app usage logs in a 2018 trial by Athey et al. on ride-sharing pricing, where geolocation and transaction data yielded high-frequency estimates of demand elasticity. To address potential contamination between treatment and control groups in non-laboratory environments, data collection protocols incorporate spatial or temporal separation, such as cluster randomization by geographic units, ensuring independence of observations. Attrition and non-compliance are monitored via baseline covariates and follow-up mechanisms; for example, in Gerber and Green's 2000 voter mobilization experiments, turnout data from official election records mitigated dropout issues, achieving compliance rates over 90% for direct mail interventions. Quality control involves pre-testing instruments for validity, as seen in Karlan and Zinman's 2010 microcredit trial, where loan repayment data from financial institutions was cross-verified against borrower surveys to detect measurement error. Analysis in field experiments primarily employs intent-to-treat (ITT) estimators to preserve randomization's integrity, calculating average effects via difference-in-means tests or ordinary least squares regressions adjusted for covariates. For clustered designs, standard errors are clustered at the unit of randomization to account for intra-group correlation; a 2014 meta-analysis by Gertler et al. on development interventions found that such adjustments increased standard errors by 20-50% compared to naive models, highlighting the importance of robust variance estimation. Power analyses precede implementation, targeting sample sizes sufficient for detecting effects of practical magnitude—e.g., a 5% shift in behavior—with 80% power at α=0.05, as recommended in Gerber et al.'s 2010 methodological overview. Heterogeneity of treatment effects is explored through subgroup regressions or interaction terms, with pre-registration of analysis plans to guard against p-hacking; Banerjee et al.'s 2015 review of 77 field experiments in development economics noted that failing to adjust for multiple comparisons inflated false positives by up to 30%. Instrumental variable approaches handle partial compliance, as in Angrist et al.'s 2002 analysis of lottery-based school assignments, where the ITT effect divided by first-stage compliance yielded local average treatment effects on earnings. Sensitivity tests for threats like spillover effects use placebo outcomes or network models, ensuring causal claims rest on empirical robustness rather than assumption.
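A minimal analysis sketch along these lines is shown below, assuming pandas/statsmodels and hypothetical column names (y, assigned, takeup, cluster); it estimates the ITT effect with standard errors clustered at the randomization unit and then scales it by first-stage compliance to obtain a Wald-style LATE on simulated data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate a clustered field experiment: 'cluster' is the unit of randomization,
# 'assigned' the randomized offer, 'takeup' actual receipt, 'y' the outcome.
rng = np.random.default_rng(1)
n_clusters, per_cluster = 100, 25
cluster = np.repeat(np.arange(n_clusters), per_cluster)
assigned = np.repeat(rng.integers(0, 2, n_clusters), per_cluster)      # cluster-level assignment
takeup = (assigned * (rng.random(cluster.size) < 0.7)).astype(int)     # ~70% compliance among assigned
cluster_shock = rng.normal(0, 0.3, n_clusters)[cluster]                # induces intra-cluster correlation
y = 1.0 * takeup + cluster_shock + rng.normal(0, 1, cluster.size)
df = pd.DataFrame({"y": y, "assigned": assigned, "takeup": takeup, "cluster": cluster})

# Intent-to-treat estimate with standard errors clustered at the randomization unit.
itt = smf.ols("y ~ assigned", data=df).fit(cov_type="cluster", cov_kwds={"groups": df["cluster"]})
print("ITT:", itt.params["assigned"], "clustered SE:", itt.bse["assigned"])

# First-stage compliance and the Wald-style LATE (ITT scaled by induced take-up).
first_stage = smf.ols("takeup ~ assigned", data=df).fit(cov_type="cluster", cov_kwds={"groups": df["cluster"]})
print("LATE:", itt.params["assigned"] / first_stage.params["assigned"])
```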

Implementation in Real-World Settings

Implementation of field experiments in real-world settings requires collaboration with organizations such as firms, governments, or NGOs to access natural environments and participants while embedding randomized treatments without substantial disruption to ongoing operations. Researchers typically partner with these entities to leverage existing infrastructure for treatment delivery and data access; for instance, economist John List and collaborators worked with a travel business in 2008 to test price sensitivity by randomly assigning 5% and 10% price increases to subsets of customers, observing behavioral responses through proprietary sales records. Randomization occurs at appropriate levels—individual, household, or cluster—to balance confounders, as in the New Jersey Income Maintenance Experiment (1968–1971), where 1,300 low-income households were randomly assigned to negative income tax variants and monitored via quarterly surveys for labor supply effects. Data collection integrates administrative records, behavioral observations, or follow-up surveys, prioritizing minimal interference to preserve ecological validity, though this demands careful protocol design to ensure compliance and reduce attrition, which plagued earlier social experiments like the Job Training Partnership Act evaluations in the 1980s. Logistical demands include securing buy-in from partners wary of risks to reputation or operations, necessitating pilot testing and phased rollouts; for example, Kremer's 1990s–2000s experiments in Kenyan schools partnered with the government and NGOs to randomize treatments across villages, achieving high compliance through community sensitization and yielding a 25% reduction in absenteeism. Ethical protocols adapt to field constraints, often forgoing full informed consent in natural field experiments to avoid Hawthorne effects, but requiring institutional review board approval and safeguards against harm, as emphasized in guidelines from bodies like the Abdul Latif Jameel Poverty Action Lab (J-PAL). Implementation scales via iterative designs, starting small to refine treatments before larger deployments, though challenges persist in maintaining fidelity amid uncontrolled externalities like weather or policy changes. Critiques highlight scalability limitations, as field experiments remain opportunistic and resource-intensive compared to lab analogs, with costs amplified by coordination—evident in the British Electricity Pricing Experiment (1966–1972), which randomized four tariff schemes among 3,420 customers but faced metering and billing integration hurdles. Partnerships mitigate these burdens by sharing them, yet demand clear agreements on data ownership and the reporting of results to sustain collaboration, particularly with governments implementing findings, as in development RCTs where local capacity-building ensures post-experiment sustainability. Overall, successful execution hinges on balancing experimental rigor with contextual fidelity, enabling causal estimates transferable to policy.

Strengths for Causal Inference

Enhanced Ecological Validity

Field experiments enhance ecological validity by administering treatments within participants' natural, everyday environments, thereby capturing behaviors and responses that more faithfully replicate real-world dynamics than those elicited in controlled settings. This subtype of external validity assesses the generalizability of findings to authentic settings, where contextual cues, social interactions, and routine constraints influence outcomes in ways artificial lab conditions often fail to mimic. For instance, economic field experiments involving actual transactions or interventions capture participant decisions under genuine stakes and incentives, reducing distortions from hypothetical scenarios or observer awareness. A primary mechanism for this enhancement lies in the unobtrusive integration of experimental manipulations into ongoing real-life activities, which minimizes demand characteristics—participants' tendencies to alter behavior based on perceived expectations—and Hawthorne effects, where awareness of observation alone modifies conduct. In natural field experiments, subjects frequently remain unaware of their enrollment, allowing observed actions to emerge from unaltered motivations and environmental pressures, as evidenced in studies of resource conservation where behaviors align closely with baseline non-experimental patterns. This contrasts with laboratory paradigms, which prioritize internal validity through isolation but sacrifice ecological realism, often yielding effects that diminish or reverse upon translation to field contexts due to overlooked interactive complexities. Consequently, field experiments bolster causal inferences applicable to practical domains like public policy and behavioral interventions, where ecological fidelity ensures robustness against the "streetlight effect" of over-relying on convenient but unrepresentative data. Empirical reviews across the social sciences affirm that this validity edge facilitates scalable insights, such as in large-scale policy trials, though it demands careful design to isolate treatment effects amid ambient variability. Mainstream sources, while generally endorsing this advantage, occasionally underemphasize potential trade-offs with internal precision, reflecting a disciplinary preference for field methods in applied fields despite the historical dominance of laboratory paradigms.

Robustness to Hypothetical Bias

Field experiments demonstrate robustness to hypothetical bias, a form of discrepancy where individuals' stated preferences in surveys or hypothetical scenarios diverge from their actual behaviors, often leading to overestimation of willingness to pay or participation. This bias arises because hypothetical responses lack real costs or consequences, incentivizing socially desirable answers or inflated commitments without accountability. In contrast, field experiments embed interventions in natural environments, eliciting revealed preferences through observable actions, such as purchases or donations, thereby aligning responses with genuine incentives. Empirical evidence underscores this advantage. For instance, a 2009 study comparing hypothetical surveys to field experiments on charitable giving found that stated intentions overestimated actual donations by factors of 2 to 5, while field-based solicitations yielded behavioral measures more reflective of real constraints like budget limits. Similarly, in environmental economics, contingent valuation methods relying on hypotheticals have produced willingness-to-pay estimates inflated by 200-500% compared to field experiments measuring actual contributions to conservation efforts. These discrepancies highlight how field experiments' real-world stakes—encompassing monetary costs, social pressures, and immediate feedback—curb exaggeration, fostering causal inferences grounded in authentic decision processes. Critics note potential confounds in field settings, such as unobserved heterogeneity or Hawthorne effects, yet the mitigation of hypothetical bias remains a core strength, particularly when complemented by pre-registration and replication. Meta-analyses of randomized field trials across disciplines confirm that effect sizes from behavioral interventions are 20-40% smaller and more consistent than those from lab-based hypotheticals, attributing this to reduced response inflation. Thus, field experiments enhance reliability for policy-relevant inferences, prioritizing observable actions over self-reported hypotheticals prone to distortion.

Complementarity with Other Methods

Field experiments complement laboratory experiments by applying randomization in natural environments, which enhances external validity, while laboratory settings prioritize internal validity through controlled manipulations that isolate causal mechanisms. Laboratory studies often reveal behavioral patterns under stylized conditions, such as isolated decision-making tasks, but these may not generalize due to the absence of real stakes, social interactions, or contextual cues; field experiments mitigate this by testing similar hypotheses amid authentic incentives and distractions, as seen in economic studies of charitable giving where lab altruism diminishes in field solicitations. This synergy enables sequential research: laboratory findings inform field designs, and field outcomes refine theoretical understanding of applicability. Field experiments also augment quasi-experimental and econometric methods by introducing deliberate randomization to address endogeneity in observational data, providing a robustness check against confounding or selection biases inherent in non-randomized real-world variation. For example, instrumental variable approaches in econometrics depend on valid exclusion restrictions, which field experiments can validate or supplant through direct random assignment in comparable populations. In development economics, randomized field interventions have corroborated correlations from household surveys, such as the causal impact of conditional cash transfers on school attendance, where observational data suggested links but lacked clean identification. This complementarity extends to structural modeling, where field data calibrates parameters on preferences or frictions that lab or archival sources alone cannot precisely estimate. Across disciplines, field experiments integrate with surveys and archival analyses by embedding experimental variation within large-scale, naturally occurring datasets, allowing for analysis of heterogeneous effects that pure observational methods overlook. In political science, for instance, field tests of voter mobilization complement observational models of turnout by revealing decay in real turnout responses over time. Such multi-method triangulation—combining field realism with lab precision and econometric scale—strengthens causal inference, as no single approach fully resolves trade-offs between internal validity, external validity, and feasibility.

Limitations and Methodological Critiques

Challenges to Internal Validity

Field experiments, while leveraging randomization to enhance causal identification, remain susceptible to several threats to internal validity, which is the extent to which observed effects can be confidently attributed to the treatment rather than alternative explanations. One primary challenge is selective attrition, where participants drop out differentially between treatment and control groups, potentially biasing estimates if attrition correlates with outcomes or treatment effects; for instance, a 2019 review of field experiments found attrition rates averaging 20-30%, often linked to treatment-induced discouragement or mobility. Researchers mitigate this through intent-to-treat analyses, but such approaches assume random missingness, which rarely holds in naturalistic settings. Spillover effects, or interference between units, further undermine internal validity by contaminating control groups; in field settings with social networks or shared environments, treated individuals may influence untreated ones via information diffusion, emulation, or resource substitution, as documented in agricultural trials where control farmers adopted practices from neighbors, diluting estimated impacts by up to 50%. Classical randomization assumes the stable unit treatment value assumption (SUTVA), which posits no interference, but violations in clustered or networked populations require adjustments like cluster randomization or network-aware estimators, though these reduce statistical power. Non-compliance, or failure to deliver or receive the intended treatment, introduces endogeneity akin to observational data; in a synthesis of field experiments, up to 40% exhibited partial compliance due to implementation errors or participant evasion, shifting inferences toward local average treatment effects on compliers rather than the full study population. Confounding from unmeasured time-varying factors, such as maturation or external shocks, can also persist despite randomization if baseline imbalances or post-randomization events (e.g., policy changes) interact with treatment assignment; historical analyses of randomized field trials highlight how macroeconomic fluctuations confounded labor market interventions in earlier decades. These issues necessitate robust checks, including balance tests and sensitivity analyses, yet field constraints often limit their feasibility compared to lab controls.
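The shift from the full population to compliers can be made precise with the standard Wald identity (a textbook formula, not a result from any study cited above): with random assignment Z and actual treatment receipt D,

```latex
\text{LATE} \;=\; \frac{E[Y \mid Z{=}1] - E[Y \mid Z{=}0]}{E[D \mid Z{=}1] - E[D \mid Z{=}0]}
\;=\; E[Y_1 - Y_0 \mid \text{compliers}]
```

which equals the intent-to-treat effect scaled up by the share of units induced to take the treatment, and requires the exclusion restriction and monotonicity (no defiers) in addition to randomization of Z.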

Issues of Generalizability and Scalability

Field experiments, while enhancing ecological validity through real-world implementation, frequently encounter challenges in generalizing findings to broader populations or contexts due to site-specific selection and overlap conditions. Internal overlap requires that effects align across observed and unobserved covariates within the experimental sample, but violations—such as heterogeneous responses driven by unmeasured local factors—can undermine causal estimates' reliability. External overlap demands similarity between the experimental sample and target population distributions; empirical analyses of field experiments in labor markets and education reveal frequent mismatches, with selection into sites biasing results toward atypical participants, thus limiting applicability beyond the tested contexts. Site selection bias further complicates generalizability, as experimenters often choose accessible or cooperative venues, skewing samples toward non-representative groups; for instance, corporate field experiments in tech firms may overrepresent educated, urban demographics, reducing confidence in extrapolating to rural or low-income settings. Cultural and contextual variability exacerbates this, with psychological field studies showing that interventions effective in one cultural milieu fail in others due to differing norms or individual traits, as evidenced by cross-national replications where effect sizes halved when moving from Western to non-Western samples. Scalability poses distinct hurdles, as small-scale field experiments overlook systemic responses that emerge at larger volumes, such as general equilibrium effects where increased demand alters prices or depletes resources. In development trials, localized incentives like cash transfers succeed modestly but falter when scaled nationwide, as they induce market saturation or crowd out private initiatives; a review of randomized controlled trials identifies six key barriers, including non-constant treatment effects and loss of implementation fidelity due to diluted monitoring. "Voltage drops"—declines in efficacy as interventions expand—arise from behavioral spillovers, where participants anticipate widespread adoption and adjust strategies, reducing marginal impacts by up to 50% in some pilots. Logistical demands intensify at scale, with fixed costs per participant rising nonlinearly due to supply constraints for high-quality administrators or inputs; experiments demonstrate that while proofs-of-concept yield positive returns, replication at provincial levels often yields null or negative outcomes from these frictions. Addressing scalability requires preemptive designs incorporating equilibrium modeling or phased rollouts, yet many field experiments neglect these, prioritizing proof-of-concept over feasible expansion.

Resource and Logistical Demands

Field experiments typically require substantial financial investments, often exceeding those of laboratory counterparts due to the need for real-world implementation. Costs can include personnel salaries for field workers, travel expenses, participant incentives, and materials for interventions, with examples from development economics showing per-participant costs ranging from $5 to $50 in low-income settings, scaling to hundreds of thousands of dollars for large-scale trials involving thousands of subjects. Logistical complexities arise from coordinating interventions in uncontrolled environments, such as securing site access, managing field staff across dispersed locations, and ensuring treatment fidelity without constant oversight, which demands robust protocols and contingency planning. Human resource demands are equally intensive, necessitating interdisciplinary teams including researchers, local enumerators trained in survey administration, and sometimes partnerships with governments or NGOs for feasibility. In organizational field experiments, for instance, collaboration with firms or institutions is often required to embed treatments into ongoing operations, adding layers of negotiation and compliance monitoring that can extend timelines by months. Ethical and regulatory hurdles, such as obtaining ethics approvals for non-laboratory settings, further amplify resource needs, as do efforts to mitigate spillovers or contamination between treatment arms in natural settings. Scalability poses additional challenges, as expanding sample sizes to achieve statistical power—often requiring 1,000 or more participants to overcome field noise—increases both budgetary and operational burdens, limiting replication or rapid iteration compared to lab methods. Despite these demands, proponents argue that the causal insights gained justify the investment when lab results fail to translate, though critics note that high upfront costs can deter junior researchers or underfunded fields.

Applications Across Disciplines

Economics and Development Policy

Field experiments have become a cornerstone of development economics, enabling causal identification of interventions' effects on poverty, education, and health in real-world settings. Pioneered by researchers like Abhijit Banerjee, Esther Duflo, and Michael Kremer—who received the 2019 Nobel Memorial Prize in Economic Sciences for their experimental approach—these studies use randomization to test policies directly among affected populations, contrasting with prior reliance on observational data prone to confounding factors. This method has informed scalable programs, such as conditional cash transfers (CCTs), by quantifying returns on investments like schooling incentives or parasite control, often revealing high benefit-cost ratios that justify government adoption. A seminal example is the evaluation of school-based deworming in western Kenya, conducted by Edward Miguel and Michael Kremer starting in 1998 across 50 schools. Randomly assigning deworming treatments reduced school absenteeism by 25% through both direct health improvements and spillovers, with long-term follow-ups showing treated individuals earning 13% more hourly wages and experiencing 14% higher consumption expenditures two decades later. These findings, costing about 44 cents per child annually, have supported national deworming campaigns in over 40 countries, demonstrating returns exceeding 40:1 in some estimates. In Mexico, the PROGRESA program (later Oportunidades), launched in 1997, used a phased rollout as a natural experiment to assess CCTs linking payments—averaging 90 pesos monthly per child—to school attendance and clinic visits. Evaluations found enrollment rises of 20% for girls and improved health outcomes, prompting expansion to six million households by 2013 and influencing similar programs in over 60 nations, including Brazil's Bolsa Família. However, field experiments have also debunked overstated claims; a 2015 randomized evaluation of microcredit expansion in Hyderabad, India, by Banerjee, Duflo, and colleagues revealed only modest increases in business activity and no significant gains in average household consumption, challenging narratives of microfinance as a transformative anti-poverty tool. Organizations like the Abdul Latif Jameel Poverty Action Lab (J-PAL), founded in 2003, have scaled this approach, conducting over 1,100 evaluations that shaped policies in sectors like health and education, emphasizing mechanisms such as incentives over assumptions of perfect rationality. While academic sources on these topics exhibit left-leaning tendencies in policy advocacy, the rigor of randomization mitigates bias by directly measuring outcomes, though generalizability remains debated due to context-specific designs.

Psychology and Behavioral Studies

Field experiments in psychology examine behavioral phenomena in naturalistic environments, allowing researchers to manipulate variables while capturing responses untainted by artificial lab conditions. This approach yields higher ecological validity, as participants exhibit genuine reactions influenced by ambient social context, reducing artifacts like demand characteristics. In behavioral studies, they test theories of helping, obedience, and intergroup relations by embedding interventions in everyday contexts such as public transit, workplaces, or communities. The Piliavin et al. (1969) "Subway Samaritan" study exemplifies applications in prosocial behavior research. Conducted on 8.5-mile New York City subway routes over 103 trials, confederates staged collapses of victims depicted as ill (carrying a cane) or intoxicated (carrying a liquor bottle), with observers recording intervention rates, speed, and helper demographics. Help was provided to 62% of ill victims within 70 seconds on average, compared to 14% immediate help for drunk victims, with black victims aided less by white passengers but more by black ones; drunkenness and race amplified bystander hesitation via attributions of responsibility diffusion and stigma. These findings supported a cost-benefit arousal model over pure diffusion of responsibility, informing understanding of urban helping dynamics. Obedience and authority compliance have been probed through workplace field experiments, notably Hofling et al. (1966), where 22 nurses received phone orders from a fictitious doctor to administer 20 mg of Astroten, an unauthorized drug, at double the labeled maximum dosage. Despite hospital rules requiring written orders and dosage checks, 21 nurses prepared to comply before interception, while a prior survey of 21 nurses deemed such obedience unethical. This revealed entrenched hierarchical deference overriding protocols in high-stakes medical settings, contrasting with lab obedience rates and highlighting contextual amplifiers like perceived expertise. Intergroup relations and conflict resolution draw on classics like Sherif's Robbers Cave experiment (1954-1955), a field study with 22 fifth-grade boys at an Oklahoma summer camp. Initially isolated into rival groups with induced competitions (e.g., tug-of-war, baseball), hostility escalated via name-calling and raids; introducing superordinate tasks like fixing a water tank fostered cooperation and prejudice reduction. Quantitative measures, including ratings of in-group bias, confirmed realistic conflict theory: competition over resources drives antagonism, resolvable by mutual goals. This informed behavioral interventions for reducing bias in schools and communities. Contemporary behavioral studies extend field experiments to digital and organizational realms, such as testing social-norm feedback via manipulated public displays or online prompts, validating lab-derived mechanisms like conformity under peer observation. These applications underscore field experiments' role in generating evidence for policy, from anti-discrimination nudges to workplace equity training, though they demand ethical safeguards against unintended distress.

Other Fields Including Marketing and Public Health

Field experiments in marketing apply randomized interventions in authentic consumer settings, such as retail outlets, online platforms, or direct mail campaigns, to isolate causal effects on purchasing behavior, pricing sensitivity, and promotional responses. These experiments address limitations of lab studies by capturing real incentives and constraints, often revealing counterintuitive results that challenge traditional assumptions. For example, a field experiment by Anderson and Simester with a women's apparel catalog retailer tested price endings, randomizing 39,000 customers across treatments and finding that prices ending in 88 cents increased quantity sold by 7-8% compared to 89 cents, attributed to perceived discounts rather than mere salience. Similarly, John List and colleagues conducted field experiments in sports card markets, demonstrating that experienced traders exhibit less endowment-effect behavior than novices, informing models of market efficiency. In public health, field experiments deploy randomized interventions in community or clinical settings to evaluate behavioral and epidemiological outcomes, such as disease prevention or health technology adoption, where natural variability is high. A landmark example is the 1998-2002 Kenyan deworming field experiment by Miguel and Kremer, which randomized treatments across 50 schools serving 32,000 children, reducing worm infections by 25% and increasing school participation by 2.4 percentage points annually, with benefits extending to non-treated peers via externalities. More recent applications include nudge-based trials; a 2019 set of three randomized field experiments in supermarkets and canteens, involving over 2,000 participants, tested labeling and placement interventions, boosting healthier food selection by 5-15% through positioning without restricting choice. These studies underscore field experiments' role in scaling evidence for policy, though they require careful ethical oversight to mitigate risks like unequal access to treatments. Beyond these core areas, field experiments have informed energy policy, such as randomized incentives for conservation in households, yielding 10-20% usage reductions in trials across U.S. utilities. In operations contexts overlapping public health, a 2024 preregistered field experiment rewarded gym attendance with financial incentives, increasing participation by 15-20% among paired users compared to solo rewards, highlighting relational nudges for sustained behavior change. Such applications emphasize the method's versatility in testing causal mechanisms under real-world constraints, prioritizing designs that balance rigor with scalability.

Ethical and Philosophical Debates

Informed Consent and Autonomy

In field experiments, obtaining informed consent—defined as the voluntary agreement of participants after full disclosure of risks, benefits, and procedures—presents unique challenges compared to laboratory settings, as revealing the experimental nature could alter natural behaviors and invalidate causal inferences. Researchers frequently employ partial disclosure, deception, or institutional review board (IRB) waivers for minimal-risk studies, arguing that full consent would introduce demand effects or selection bias; for instance, in audit studies testing discrimination, participants are unaware of their role to preserve ecological validity. However, this practice inherently limits participant autonomy, the ethical principle emphasizing self-determination and the right to make uncoerced choices, as subjects may unknowingly contribute to data collection without opportunity for refusal. Ethical frameworks, such as those outlined in the Common Rule (45 CFR 46) administered by U.S. federal agencies, permit consent waivers in field contexts where obtaining it is impracticable and risks are low, as seen in many randomized controlled trials (RCTs) in development economics conducted by organizations like the Abdul Latif Jameel Poverty Action Lab (J-PAL). In such trials, often involving community-level interventions like the randomized provision of educational resources in villages, consent may be secured from local leaders or a subset of participants, but not universally from all affected individuals, particularly illiterate or vulnerable populations where verbal or community-level consent is used. Critics contend that these approaches erode autonomy by prioritizing aggregate knowledge gains over individual rights, potentially treating participants as means to societal ends rather than ends in themselves, a long-standing tension in research ethics amplified by real-world scalability demands. Empirical reviews of field experiments reveal that few studies systematically assess post-experiment comprehension or satisfaction with consent processes, with one analysis of deception-based designs finding no reported evaluations of autonomy impacts in the reviewed cases. Philosophical debates highlight that field experiments' reliance on unobtrusive methods can conflict with respect for persons, a core principle of the Belmont Report, as incomplete information undermines the voluntariness essential to informed consent. Proponents counter that in trials—such as randomized lotteries for oversubscribed program slots—de facto consent arises from participation in existing systems, and post hoc debriefing restores autonomy without prior harm; yet, evidence from behavioral studies indicates that even minimal deception can erode trust in institutions if discovered. To mitigate these issues, some protocols advocate for "broad consent" models, where participants agree to randomization within service delivery, but adoption remains inconsistent, with surveys of researchers showing varied interpretations of when autonomy is sufficiently preserved. Ongoing calls urge updated standards, including mandatory risk-benefit analyses tailored to field settings and participatory community consultations to better align experiments with participant agency.

Risks of Harm and Unequal Treatment

Field experiments, particularly randomized controlled trials (RCTs) in development economics and public health, carry risks of direct harm to participants when interventions involve withholding established treatments or testing unproven ones under real-world conditions. For instance, in health-related field trials, control groups may forgo interventions like deworming medications or insecticide-treated bed nets, potentially exacerbating conditions such as parasitic infections or malaria in resource-poor settings where these are known to be effective. Such designs assume equipoise—genuine uncertainty about efficacy—but critics argue this often fails in practice, especially when prior evidence suggests benefits, leading to preventable morbidity or mortality. Nobel laureate Angus Deaton has highlighted these ethical dangers, contending that randomizing access to potentially life-saving aids in impoverished populations prioritizes methodological purity over human welfare, effectively treating people as means to inferential ends. Unequal treatment emerges inherently from randomization, as treatment groups receive benefits—such as cash transfers, educational programs, or policy interventions—while control groups do not, fostering resentment, social friction, or perceived injustice within communities. In international development RCTs, this disparity can widen existing inequalities, particularly when experiments span villages or households aware of the allocation, prompting spillover effects like theft, migration, or breakdown in social norms as controls seek to access treatments informally. Political science field experiments amplify these issues through direct manipulations, such as deceptive mailings or canvassing that influence behaviors like voting or compliance, potentially undermining participant autonomy and causing psychological distress if outcomes lead to regretted decisions. Empirical reviews indicate that while harms are often mitigated via institutional review boards (IRBs), the scale of field settings—unlike contained labs—extends risks to non-consenting bystanders, including broader community destabilization from uneven resource distribution. Mitigation strategies, such as phased rollouts or post-trial access for controls, are recommended but not universally applied, leaving gaps in accountability. Deaton and others note that power imbalances in low-income contexts exacerbate these risks, as participants from vulnerable populations may consent under duress or with incomplete information, prioritizing short-term gains over long-term concerns. Guidelines from organizations like the Abdul Latif Jameel Poverty Action Lab emphasize pre-registration and ethical protocols to minimize harm, yet enforcement varies, with some experiments proceeding despite foreseeable inequities. Overall, these risks underscore the tension between inferential gains and the imperatives of non-maleficence and justice in experimental design.

Broader Critiques of Experimental Paternalism

Critics of experimental paternalism argue that field experiments designed to test behavioral interventions, such as nudges, inherently undermine individual autonomy by exploiting cognitive biases to steer choices toward outcomes deemed preferable by researchers or policymakers, even when alternatives remain available. This approach, often framed as "libertarian paternalism," is seen as manipulative because it relies on non-transparent defaults or framing effects that influence decisions without individuals' full awareness or consent, thereby diminishing personal agency and treating subjects as objects to be steered rather than agents capable of self-directed reasoning. A core philosophical objection is the presumption that experimenters possess superior knowledge of participants' welfare, ignoring the subjective nature of preferences and the possibility that individuals, even if systematically biased, may value their own errors or non-standard choices more than externally imposed corrections. Proponents of this critique, drawing from classical liberal principles, contend that such interventions disrespect the diversity of human values and fail to acknowledge that individuals often have unique insights into their circumstances that aggregated experimental data cannot capture. Furthermore, experimental paternalism risks a slippery slope toward coercive policies, as successful field trials of subtle nudges may embolden authorities to escalate to more restrictive measures under the guise of evidence-based improvement, eroding the nominal preservation of freedom of choice. Libertarian scholars highlight that defaults in experiments, while not outright bans, impose costs on opting out—such as time, effort, or social pressure—that effectively coerce compliance, contradicting claims of true voluntariness. This dynamic is particularly concerning in public policy applications, where governments wielding experimental results may prioritize aggregate utility over dispersed individual liberties, potentially fostering dependency and reducing societal resilience to errors.

Impact and Evolving Practices

Influence on Evidence-Based Policy

Field experiments have significantly advanced evidence-based policymaking by delivering causal evidence on policy interventions in naturalistic settings, enabling governments and organizations to identify effective programs and avoid scaling ineffective ones. Unlike observational studies, randomized field experiments minimize selection biases and confounding variables through random assignment, providing robust estimates of treatment effects that inform decisions on resource allocation. For instance, in development economics, randomized controlled trials (RCTs) conducted by researchers such as Abhijit Banerjee, Esther Duflo, and Michael Kremer demonstrated the impacts of interventions like deworming programs and remedial tutoring, leading to their adoption in policies across multiple countries and influencing billions in aid spending. This empirical approach earned the trio the 2019 Nobel Prize in Economics, underscoring its role in shifting policy from intuition to data-driven causal inference. In the United States, field experiments have shaped social welfare and labor policies, with organizations like MDRC conducting large-scale RCTs on programs such as welfare-to-work initiatives in the 1980s and 1990s, which revealed modest gains but limited long-term effects, informing the passage of the 1996 Personal Responsibility and Work Opportunity Reconciliation Act (PRWORA). Similarly, the congressionally mandated Head Start Impact Study, an RCT launched in 1998, found negligible cognitive benefits from the preschool program for most participants, prompting refinements in funding rather than expansion without evidence. These evaluations have encouraged federal agencies to incorporate randomized designs into program assessments, as seen in the Department of Health and Human Services' use of RCTs for prevention programs and job training, reducing reliance on anecdotal or correlational evidence. Internationally, field experiments have influenced health and economic policies, such as trials on iodized salt fortification that reduced iodine-deficiency rates, leading to nationwide rollouts in multiple nations. In public administration, a review of 42 field experiments highlights their application in testing bureaucratic reforms and service delivery, fostering "politically robust" designs that withstand partisan challenges and promote scalable interventions. However, adoption varies; while entities such as the World Bank and the UK Behavioural Insights Team routinely integrate field experiment findings, barriers such as political resistance and short-term horizons can limit translation to policy, emphasizing the need for designs that align with decision-makers' incentives. Overall, these experiments have cultivated a culture of experimentation in government, prioritizing verifiable impacts over ideological preferences.

Recent Innovations and Hybrid Approaches

Recent innovations in field experiments emphasize scalability through digital platforms and adaptive designs, enabling researchers to conduct interventions at larger scales while maintaining experimental control. For example, in development economics, experiments leveraging mobile applications and online interfaces have tested interventions like cash transfers or information nudges across thousands of participants in real-time natural settings, as seen in studies from 2020 onward that used such platforms for precise targeting. These approaches address limitations of traditional field experiments by reducing costs and allowing dynamic adjustments based on interim results, though they require careful controls to avoid selection biases introduced by disparities in digital access. Hybrid methods combining field experiments with observational data have advanced causal estimation, particularly of heterogeneous treatment effects. One technique pairs randomized plot-level data from field trials with satellite-derived observational metrics, such as vegetation indices, to forecast outcomes like crop yields; a 2022 analysis of maize rotations demonstrated that this hybrid approach reduced root mean squared prediction error by 13% relative to experimental data alone and by 26% relative to observational data alone (see the sketch after this paragraph). Similarly, double machine learning frameworks integrate experimental results with non-experimental datasets to validate assumptions like unconfoundedness, enabling robust testing of treatment-effect modifiers in large administrative records. In behavioral and development economics, lab-in-the-field protocols represent a key hybrid, deploying incentivized laboratory tasks (such as public goods games or risk elicitation) in everyday environments to capture context-specific behaviors among diverse groups. Reviews from 2024 highlight their utility in development settings, where they reveal cultural variation in cooperation or time preferences not evident in WEIRD (Western, Educated, Industrialized, Rich, Democratic) lab samples, with protocols standardized for replicability across sites. These methods bridge the internal validity of the laboratory with field realism, though critics note potential Hawthorne effects from task framing. Emerging integrations with machine learning further hybridize field experiments by automating outcome prediction and subgroup analysis. For instance, post-experiment models trained on experimental and auxiliary observational data can improve policy targeting, as in labor market studies estimating personalized job-referral effects from 2023 field trials. Such techniques, while promising for efficiency, demand transparency in model specification to mitigate overfitting risks in sparse field data. Ongoing conferences, such as the Advances with Field Experiments series, underscore these trends, fostering innovations in ethical scaling and data fusion for policy-relevant insights.
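
The sketch below illustrates one simple version of the hybrid idea: use a small randomized experiment to estimate the confounding bias of a large observational dataset, then apply that correction to observational subgroup estimates that the experiment alone is too small to pin down. It is an illustrative calibration under invented data, not the specific estimator of any study cited above; the names (`simulate`, `dim`, `region`) are hypothetical.

```python
# Illustrative hybrid of experimental and observational data (invented data).
import numpy as np

rng = np.random.default_rng(1)

def simulate(n, randomized):
    region = rng.integers(0, 2, size=n)                     # observed subgroup
    ability = rng.normal(size=n)                            # unobserved confounder
    if randomized:
        t = rng.integers(0, 2, size=n)                      # random assignment
    else:
        t = (ability + rng.normal(size=n) > 0).astype(int)  # self-selection
    # True effect: 0.5 in region 0, 1.5 in region 1; bias enters via ability.
    y = (0.5 + region) * t + 2.0 * ability + rng.normal(size=n)
    return region, t, y

def dim(t, y):
    return y[t == 1].mean() - y[t == 0].mean()

g_exp, t_exp, y_exp = simulate(800, randomized=True)        # small experiment
g_obs, t_obs, y_obs = simulate(80_000, randomized=False)    # large observational data

# Overall confounding bias, estimated against the experimental benchmark.
bias_hat = dim(t_obs, y_obs) - dim(t_exp, y_exp)

for g in (0, 1):
    raw = dim(t_obs[g_obs == g], y_obs[g_obs == g])
    print(f"region {g}: observational {raw:.2f}, bias-corrected {raw - bias_hat:.2f}")
```

The design choice here is deliberate: the experiment supplies unbiasedness, the observational data supply sample size for subgroup precision, and the correction assumes the bias is roughly constant across subgroups, an assumption that the double machine learning approaches mentioned above are designed to test more formally.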

Future Challenges in Replication and Transparency

Field experiments face unique hurdles in replication due to their reliance on real-world contexts, which often preclude exact duplication of conditions across sites or time periods. A study examining two iterations of a direct mail intervention in agricultural extension services found that while the initial experiment detected both direct effects and spillovers, the replication in a subsequent year failed to confirm the direct effect, reducing the detectability of spillovers and highlighting variability introduced by temporal factors such as weather or farmer responsiveness. Similarly, a 2016 replication survey indicated that approximately 40% of laboratory economics experiments failed to replicate, a failure rate lower than in psychology but still indicative of systemic issues like publication bias and selective reporting that also undermine iterative testing in field settings. These challenges persist because field experiments typically involve large-scale collaborations with implementing organizations, where logistical dependencies, such as access to proprietary data or partner cooperation, diminish over time, making independent reproductions resource-intensive and prone to confounds from evolving external conditions. Limited transparency exacerbates replication difficulties, as field experiments often withhold detailed protocols or raw data to protect participant privacy or commercial sensitivities, limiting external verification. In economics, while data-sharing practices have improved relative to psychology, pre-registration of analysis plans remains less widely adopted, with only about 20% of studies in top journals employing it as of 2021, compared with higher rates in laboratory-based fields. This gap arises from the improvisational nature of field interventions, where unforeseen adaptations during implementation complicate full disclosure without risking misinterpretation or ethical breaches under regulations like the GDPR. Moreover, incomplete reporting of exclusion criteria or subgroup analyses in field trials fosters "researcher degrees of freedom," whereby post-hoc adjustments inflate false positives, as evidenced by broader replication efforts showing diminished effect sizes upon retesting. Looking ahead, fostering replicability will demand structural reforms, including incentives for multi-site collaborations and standardized reporting templates tailored to field contexts, yet entrenched academic pressures favoring novel findings over confirmatory work pose ongoing barriers. The high costs of scaling field experiments, often exceeding laboratory analogs by orders of magnitude, discourage widespread replication, particularly in under-resourced regions where initial studies originate. Privacy laws and data-sharing constraints will likely intensify transparency tensions, requiring innovations like synthetic data generation or privacy-preserving analysis to balance openness with compliance, though these technologies remain nascent and unproven at scale. Without addressing these issues, the credibility of field experiments in informing policy risks erosion, as selective non-replication perpetuates overstated causal claims.
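
A minimal simulation, not drawn from any cited study, of why undisclosed post-hoc subgroup analyses inflate false positives: under a true null effect, scanning many arbitrary subgroups and reporting any that clears p < 0.05 yields a "significant" finding far more often than the nominal 5%.

```python
# Pure simulation of researcher degrees of freedom via post-hoc subgroup fishing.
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(2)
n, n_subgroups, n_trials = 2_000, 10, 1_000

def p_value(y1, y0):
    # Two-sample z-test, large-sample approximation, two-sided.
    se = sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return erfc(abs((y1.mean() - y0.mean()) / se) / sqrt(2))

false_positives = 0
for _ in range(n_trials):
    treated = rng.integers(0, 2, size=n)
    outcome = rng.normal(size=n)                      # no true effect anywhere
    subgroup = rng.integers(0, n_subgroups, size=n)   # arbitrary post-hoc splits
    p_min = min(
        p_value(outcome[(treated == 1) & (subgroup == g)],
                outcome[(treated == 0) & (subgroup == g)])
        for g in range(n_subgroups)
    )
    false_positives += p_min < 0.05

print(f"share of null trials with a 'significant' subgroup: {false_positives / n_trials:.2f}")
# Roughly 0.40 with 10 subgroups, versus the nominal 0.05 for a single pre-registered test.
```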
