
Data Colada

Data Colada is a blog founded in 2013 by behavioral scientists Uri Simonsohn, Leif D. Nelson, and Joseph P. Simmons, dedicated to scrutinizing the evidentiary foundations of empirical research in psychology and related social sciences through statistical analysis, replication studies, and critiques of methodological practices. The platform emphasizes quantitative examinations of published findings to identify patterns suggestive of selective reporting, p-hacking, or data irregularities, with posts typically spanning 700 to 1,000 words and incorporating original data reanalyses or simulations. Its authors, affiliated respectively with ESADE Business School, the Haas School of Business at the University of California, Berkeley, and the Wharton School of the University of Pennsylvania, have leveraged the blog to advance tools like the p-curve, a statistical method for assessing whether significant results reflect genuine effects rather than biased reporting. Data Colada has gained prominence for documenting apparent data falsification in influential studies, including dishonesty experiments co-authored by Dan Ariely and papers co-authored by Harvard's Francesca Gino, prompting retractions and institutional investigations that underscore broader challenges in research reproducibility. These efforts have highlighted systemic vulnerabilities in academic data handling, such as fabricated datasets mimicking expected patterns, while advocating for preregistration, transparency in data sharing, and robust inference methods to mitigate researcher flexibility in analysis. By focusing on first-hand statistical forensics over commentary, the blog has influenced reforms in journal policies and heightened awareness of evidentiary standards, positioning it as a key resource in the movement to restore credibility to behavioral science.

Origins and Establishment

Founding and Initial Purpose

Data Colada was established in September 2013 by behavioral scientists Uri Simonsohn, Leif Nelson, and Joseph Simmons, all affiliated with academic institutions focused on judgment and decision-making research. The trio, known for their prior collaborative work on statistical methods to detect questionable research practices, launched the blog to enable swift publication of investigative analyses that traditional peer-reviewed journals could not accommodate due to lengthy review timelines. This initiative arose amid growing concerns in the early 2010s about reproducibility crises in social psychology, where the founders had already contributed tools like the p-curve analysis to quantify evidence for selective reporting or genuine effects. The blog's initial purpose centered on rigorous, data-driven scrutiny of published findings, emphasizing quantitative reanalyses, study replications, and explorations of statistical anomalies indicative of p-hacking, file-drawer problems, or data fabrication. Posts were designed to be self-contained and accessible, limited to 700–1,000 words, while incorporating visualizations, code snippets, and empirical tests to substantiate claims. Unlike formal academic outlets, Data Colada prioritized transparency and iterative feedback, with a policy of pre-posting contact with authors for potential revisions, though it committed to publishing regardless of responses to maintain independence. The inaugural post on September 17, 2013, exemplified this mandate by analyzing a study with implausible data patterns, prompting the original authors to retract it shortly thereafter and validating the blog's model of "just posting it" for accelerating scientific self-correction. Early content thus targeted vulnerabilities in experimental design and analysis within the behavioral sciences, aiming to foster a culture of evidentiary rigor by publicly dissecting high-profile claims without deference to institutional prestige. This foundation positioned Data Colada as a counterweight to systemic incentives for positive results, drawing on first-hand expertise from its creators, who had replicated dozens of studies revealing inflated effect sizes across the field.

Key Founders and Contributors

Data Colada was founded in September 2013 by three behavioral scientists: Uri Simonsohn, Leif Nelson, and Joseph P. Simmons, who serve as its primary authors and investigators. These individuals, all professors in business and decision sciences, established the blog to scrutinize questionable data practices in behavioral research, drawing on their expertise in statistical analysis and experimental design. Their collaborative work emphasizes forensic examination of datasets for anomalies indicative of fabrication or selective reporting, often without direct accusations but through presentation of statistical irregularities. Uri Simonsohn, a professor of behavioral science and decision sciences at ESADE Business School in Barcelona, Spain, has been instrumental in developing analytical tools featured on the blog, such as the p-curve method for detecting selective reporting. Prior to ESADE, Simonsohn held positions at the University of Pennsylvania's Wharton School, where he collaborated closely with Nelson and Simmons. His contributions often focus on graphical and distributional evidence of data manipulation, as seen in early posts analyzing implausible patterns in psychological datasets. Leif D. Nelson, a professor of marketing at the University of California, Berkeley's Haas School of Business, brings expertise in judgment and decision-making research. Nelson co-authors many investigations, particularly those probing inconsistencies in experimental outcomes from published studies. His involvement underscores the blog's emphasis on replicability and transparency in behavioral science. Joseph P. Simmons, the Dorothy Silberberg Professor of Applied Statistics in the Operations, Information, and Decisions department at the University of Pennsylvania's Wharton School, specializes in applied statistics and has contributed to methodological critiques on the blog, including discussions on powering studies and interpreting effect sizes. Simmons' background in statistical rigor informs the trio's joint exposés, such as those involving fabricated data in high-profile dishonesty experiments. While the core team remains these three, occasional guest analyses or acknowledgments appear in posts, but no other individuals are credited as foundational contributors. Their sustained collaboration, spanning over a decade, has positioned Data Colada as a pivotal resource for auditing published scientific claims independently of institutional sponsorship or funding biases.

Investigative Methodology

Core Techniques and Statistical Tools

Data Colada's analytical approach emphasizes detecting statistical anomalies that suggest selective reporting, p-hacking, or fabrication through rigorous examination of reported results, often without requiring access to raw data. A foundational tool is p-curve analysis, developed by founders Joseph Simmons, Leif Nelson, and Uri Simonsohn in 2013, which evaluates the evidential value of a collection of statistically significant findings (p < .05) by analyzing the distribution of those p-values. The method generates a p-curve by plotting significant p-values and comparing it to expected shapes under null hypotheses of no true effects combined with questionable practices; a right-skewed curve (with more very low p-values, such as those below .025, than values near .05) indicates genuine evidential value and rules out selective reporting as the sole explanation for significance, while a flat or left-skewed curve signals insufficient evidential value. Refinements, such as excluding borderline p-values and simulations for robustness to heterogeneity or ambitious p-hacking, have addressed critiques, with empirical applications demonstrating practical efficacy despite vulnerabilities in contrived low-power scenarios. For individual study scrutiny, particularly in suspected fraud cases, Data Colada applies consistency checks like the GRIM test, which assesses whether reported means and sample sizes align arithmetically with the granularity of integer-scale responses (e.g., flagging reported means that no whole-number responses could produce when averaged). This technique flags fabrication by revealing inconsistencies feasible only through post-hoc invention rather than genuine measurement, as seen in reviews of psychological reporting anomalies. Complementary distributional analyses include off-label uses of the Kolmogorov-Smirnov (KS) test to evaluate empirical cumulative distributions against theoretical expectations, such as verifying the proportion of participants exhibiting treatment effects in randomized designs where uniform distributions should prevail under no effect. Deviations, quantified by the maximum distance between distributions, highlight improbably patterned outcomes inconsistent with random assignment or natural variability. In forensic investigations, probabilistic modeling quantifies the implausibility of observed data under honest error versus intentional manipulation, incorporating data integrity tests, digit distribution scrutiny (e.g., against Benford's Law for fabricated numbers), and pattern detection for artifacts like clustered identical values or linear residuals suggestive of algorithmic generation or manual swaps. For instance, in field experiments, they compute the probability of randomization failures or fabricated sequences exceeding chance expectations, often combining multiple indicators—such as duplicated clusters or metadata anomalies—for cumulative evidence of fabrication. These methods prioritize empirical improbability over motive, with simulations validating thresholds (e.g., p < 10^{-6} for rejection). While not infallible, their integration has exposed fabrications in high-profile cases by leveraging arithmetic impossibilities and distributional irregularities verifiable from published summaries alone.
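
The core p-curve logic can be illustrated with a short check. The sketch below is a simplified illustration rather than the authors' full published procedure: among significant results driven by a true effect, p-values should pile up well below .05, whereas under a true null they are uniformly distributed, so only about half of significant p-values should fall below .025.

```python
# Minimal sketch of the p-curve intuition, not the full published method.
# Under a true null, p-values significant at .05 are uniform on (0, .05),
# so about half should fall below .025; a genuine effect shifts mass lower.
from scipy.stats import binom

def pcurve_right_skew_test(p_values):
    """One-sided binomial test that significant p-values are right-skewed."""
    sig = [p for p in p_values if p < 0.05]        # keep only significant results
    if not sig:
        raise ValueError("No significant p-values to analyze.")
    k = sum(p < 0.025 for p in sig)                # "low" ps expected under a true effect
    n = len(sig)
    p_right_skew = binom.sf(k - 1, n, 0.5)         # P(X >= k) if ps were uniform
    return {"n_significant": n, "n_below_025": k, "p_right_skew": p_right_skew}

# Example with a hypothetical set of reported p-values
print(pcurve_right_skew_test([0.003, 0.012, 0.021, 0.034, 0.049, 0.11, 0.20]))
```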

Evolution of Analytical Approaches

Data Colada's analytical approaches initially emphasized meta-analytic techniques to evaluate the evidential value of published findings, particularly through the development of the p-curve method. Introduced in a 2014 paper and refined in subsequent blog posts, p-curve analyzes the distribution of statistically significant p-values (typically p < 0.05) from a set of studies to distinguish genuine effects from those inflated by selective reporting practices such as p-hacking or file-drawering. By plotting these p-values and comparing them to expected distributions under null and alternative hypotheses, p-curve estimates average power and rules out selective reporting as the sole explanation for significance, with right-skewed curves indicating true effects. Early posts from 2014 to 2018 addressed critiques, demonstrating robustness to heterogeneity and ambitious p-hacking scenarios, while excluding non-significant p-values to focus on reported results. Over time, these methods expanded beyond aggregate assessments to incorporate granular, dataset-specific forensic tools for detecting outright fabrication, marking a shift evident from around 2019 onward. Investigations began integrating checks for statistical impossibilities, such as applying the GRIM (Granularity-Related Inconsistency of Means) test to verify whether reported means and standard deviations align with underlying integer data, often revealing reported values arithmetically inconsistent with genuine raw data collection. This evolution paralleled high-profile cases, including a 2021 investigation of a field experiment on child incentives, where randomization failures combined with anomalous patterns across datasets—such as improbably uniform outcomes—provided evidence of fabrication beyond mere error. By 2023, analytical rigor intensified in multi-part series, employing visual and distributional forensics like "clusterfake" detection, where response data exhibited duplicated or mirrored clusters improbable under genuine collection, as seen in examinations of papers co-authored by Francesca Gino on topics such as dishonesty and authenticity. Techniques now routinely cross-reference timestamps, survey metadata, and inter-study consistencies, identifying fabricated entries mimicking legitimate patterns, such as cases where suspicious data aligned too perfectly with external records. This progression reflects a move from probabilistic inference about literature-level biases to causal dissection of individual datasets, prioritizing empirical anomalies over theoretical modeling, though p-curve's utility has waned in favor of these targeted diagnostics amid rising scrutiny.
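
As an illustration of the GRIM logic described above, the following sketch (a simplified check assuming each participant contributes a single integer response; the published test covers more designs and rounding conventions) asks whether a reported mean is arithmetically achievable for a given sample size:

```python
# Minimal GRIM-style consistency check, assuming one integer response per
# participant; multi-item scales and rounding conventions need extra care.
def grim_consistent(reported_mean, n, decimals=2):
    """True if a mean reported to `decimals` places could arise from n integers."""
    target = round(reported_mean, decimals)
    base = int(round(target * n))
    # Achievable means are multiples of 1/n; only totals near target*n matter.
    for total in range(base - 2, base + 3):
        if abs(round(total / n, decimals) - target) < 1e-9:
            return True
    return False

# Example: a mean of 5.19 from 28 integer responses is arithmetically impossible,
# while 5.21 is achievable (146/28 = 5.214... rounds to 5.21).
print(grim_consistent(5.19, 28))   # False -> a flag for follow-up, not proof of fraud
print(grim_consistent(5.21, 28))   # True
```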

Notable Investigations and Findings

Early Exposures in Social Psychology (2013–2016)

In September 2013, Data Colada's inaugural post detailed Uri Simonsohn's discovery of fabricated data in a paper published in the journal Judgment and Decision Making, encountered while sourcing examples of uncorrelated variables from its data repository for an unrelated project. The dataset, involving participants' estimations of coin flip outcomes, displayed unnatural consistency in purportedly random responses, such as improbably precise clustering around expected probabilities that deviated from genuine behavioral variability. Rather than privately contacting the authors, the team opted for immediate public disclosure to test the efficacy of transparency in prompting institutional response; the paper was subsequently retracted following verification of the irregularities. This incident underscored patterns in fabricated data, including overly uniform "random" elements, and set a precedent for Data Colada's policy of posting concerns openly when evidence warranted, bypassing prolonged private negotiations that might allow further delay. In 2014, a follow-up compared fabrication techniques across historical cases, such as Gregor Mendel's selectively reported pea plant ratios and Diederik Stapel's entirely invented datasets in social priming studies, to illustrate detectable statistical artifacts like improbable exactness or excessive regularity in manipulated results. These analyses highlighted how fraudsters often fail to simulate realistic noise, aiding forensic identification of manipulated datasets. A pivotal exposure occurred in May 2014, when Data Colada examined a paper flagged for improbable linearity in observed scores relative to true scores, a hallmark of fabricated rather than organically collected data. In genuine experiments, measurement and sampling error introduce expected nonlinearity and heteroscedasticity; the dataset's straight-line conformity across predicted ranges, persisting after ruling out p-hacking or selective reporting via simulations, pointed to post-hoc invention or alteration. This scrutiny contributed to broader inquiries into researcher Jens Förster, whose priming studies exhibited similar anomalies, resulting in multiple retractions after institutional probes confirmed inconsistencies beyond mere error. By June 2015, Data Colada's post on fraud mitigation critiqued overreliance on incentive reforms, arguing from case evidence that self-reported motives (e.g., pressure for novel findings) inadequately explain fabrication prevalence, and advocated enhanced post-publication auditing using tools such as Granger causality tests for temporal data fabrication or distribution checks for implausible uniformity. These interventions, amid social psychology's replication challenges, exposed vulnerabilities in fields reliant on small-sample behavioral experiments, prompting journals to tighten data-sharing mandates without yet universally mandating raw code or preregistration.
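
The "excessive linearity" argument can be made concrete with a small Monte Carlo sketch. Assuming a three-condition design whose true population means are perfectly linear and normally distributed sampling error (all numbers below are hypothetical choices for illustration, not the original analysis), honest sample means should still rarely land almost exactly on a straight line, so near-perfect linearity across many independent studies becomes cumulatively implausible:

```python
# Monte Carlo sketch of the excessive-linearity intuition; illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def linearity_deviation(n_per_cell=20, sd=1.0, true_means=(0.0, 0.5, 1.0)):
    """Deviation of the observed middle mean from the midpoint of the observed
    outer means in one simulated, honestly collected experiment."""
    low, mid, high = (rng.normal(m, sd, n_per_cell).mean() for m in true_means)
    return abs(mid - (low + high) / 2)

devs = np.array([linearity_deviation() for _ in range(10_000)])
p_near_perfect = np.mean(devs < 0.01)   # how often sampling error looks "too linear"
print(f"P(deviation < .01) ~ {p_near_perfect:.3f}")
print(f"Chance that 10 of 10 independent studies all look this linear ~ {p_near_perfect**10:.1e}")
```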

Mid-Period Cases in Behavioral Science (2017–2021)

During 2017–2021, Data Colada's investigations in behavioral science emphasized rigorous data audits and replication attempts, building on earlier methodological critiques to probe specific published studies for anomalies in data handling and reporting. The blog initiated the "Data Replicada" series around 2019, targeting papers in journals like the Journal of Consumer Research and the Journal of Marketing Research that provided posted data, with the aim of verifying whether reported results could be exactly reproduced using the shared datasets and described methods. These audits frequently uncovered discrepancies, such as an inability to match exact p-values or effect sizes without assuming undisclosed flexibility in analysis, suggesting potential questionable research practices like selective disclosure of covariates or outcome measures. In one Data Replicada case from February 11, 2020 (post #84), the team examined a 2019 study claiming that low self-concept clarity both increased retention of identity-relevant magazine subscriptions and decreased acquisition of new ones. Attempts to replicate the exact statistical results from the provided data required introducing unspecified variables or transformations not detailed in the paper, leading to conclusions that the findings likely relied on unreported analytical choices rather than robust evidence. The authors responded by acknowledging possible errors in reporting but maintained that the substantive conclusions held under alternative specifications. A similar audit on August 18, 2020 (post #90) scrutinized a Journal of Marketing Research paper asserting that displaying multiple copies of a product enhanced perceived efficacy compared to a single copy. Reproduction efforts failed to yield the reported significance levels without hypothesizing hidden moderation or data exclusions, prompting questions about the reliability of the effect in consumer behavior contexts. These cases underscored persistent challenges in ensuring computational reproducibility even when datasets were available, as behavioral science studies often involved complex preprocessing steps prone to omission. The period's most prominent investigation culminated on August 17, 2021, in post #98, which provided statistical evidence of fabrication in a widely cited 2012 Proceedings of the National Academy of Sciences field experiment co-authored by Dan Ariely and colleagues. The study had claimed that signing an honesty pledge at the top of a form (versus the bottom) reduced dishonest reporting of car mileage by insurance customers in the U.S., attributing this to heightened self-awareness. Data Colada's analysis, aided by anonymous collaborators, identified implausible patterns, such as a suspiciously uniform distribution of reported miles driven and apparent duplicate customer records, consistent with manual invention rather than genuine survey entries. Additional red flags included digit distributions deviating from Benford's Law expectations for financial data and impossibly precise clustering of responses. Ariely, who received the dataset from the partnering insurance firm, denied involvement in any fabrication and suggested external mishandling, but the paper was retracted by PNAS in 2021 after independent verification confirmed the anomalies. This exposure of a heavily cited finding amplified debates on fraud detection in high-impact behavioral economics research.
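
The digit-distribution screening mentioned above can be sketched as follows. This is a generic first-digit (Benford) check run on made-up numbers, offered only as an illustration of the technique; it is not the analysis actually run on the insurance dataset, and first-digit screens are only meaningful for large samples of naturally varying quantities:

```python
# Generic first-digit (Benford) screen; illustrative, with hypothetical values.
import math
from collections import Counter
from scipy.stats import chisquare

def benford_check(values):
    """Chi-square test of leading digits against Benford's Law."""
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v]
    counts = Counter(digits)
    observed = [counts.get(d, 0) for d in range(1, 10)]
    n = sum(observed)
    expected = [n * math.log10(1 + 1 / d) for d in range(1, 10)]
    return chisquare(observed, f_exp=expected)

# Hypothetical odometer-style readings; real screens need far more values.
readings = [10234, 25411, 18760, 9432, 31250, 12005, 47890, 28871, 15320, 60412]
print(benford_check(readings))
```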

Recent Analyses and Broader Applications (2022–Present)

In June 2025, Data Colada analyzed a LinkedIn-based audit study published in the Quarterly Journal of Economics, which examined racial differences in response rates to networking requests from Black versus White male profiles. The blog praised the study's methodological strengths, including its large scale and randomization, positioning it as one of the strongest published audit experiments on discrimination. However, it highlighted a key shortcoming: the profiles' bios inadvertently signaled socioeconomic status differences, potentially confounding race effects with class perceptions and inflating estimated discrimination. This critique emphasized the need for tighter controls in field experiments to isolate causal mechanisms. In September 2025, the blog addressed a published critique arguing that p-curve exhibits poor statistical properties under certain theoretical scenarios, such as extreme heterogeneity or selective reporting. Data Colada countered by simulating practical research conditions, including real-world p-hacking and file-drawer effects, and demonstrated that p-curve reliably distinguishes evidential value from selective reporting artifacts—even under "piano-dropping" levels of bias. The analysis reaffirmed p-curve's applied utility for meta-analytic assessments in psychology and beyond, where theoretical fragility does not undermine empirical performance. A September 2024 post explored an "off-label" extension of the two-sample Kolmogorov-Smirnov test—typically used for distribution comparisons—to between-subjects experiments, asking how well it can estimate the proportion of individuals showing a treatment effect. By simulating data under various effect sizes and noise levels, the authors showed that the test statistic correlates imperfectly with true effect prevalence, and they recommended cautious interpretation and complementary checks, such as permutation tests, to avoid overstating individual-level impacts. This work broadened the blog's statistical toolkit to practical experimental design challenges in behavioral science. These contributions reflect Data Colada's shift toward methodological refinement and cross-disciplinary applications, influencing how researchers scrutinize audit designs, evidential tools like p-curve, and distributional tests amid ongoing debates on replicability. By prioritizing simulation-based validation over abstract theory, the analyses promote causal clarity without assuming uniformity in effects or data practices.
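
The off-label KS idea can be sketched with simulated data. In the sketch below (simulated numbers only, not the blog's code), a fraction of treated subjects respond while the rest look like controls; under the assumption that non-responders' outcomes are unchanged, the two-sample KS statistic D serves as a rough, noisy lower bound on the responding share rather than a point estimate:

```python
# Sketch of the off-label two-sample KS use; simulated, illustrative data.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)

n = 200
true_affected_share = 0.30                       # 30% of treated subjects respond
control = rng.normal(0, 1, n)
responders = rng.random(n) < true_affected_share
treated = np.where(responders,
                   rng.normal(1.0, 1, n),        # responders shift by 1 SD
                   rng.normal(0, 1, n))          # non-responders look like controls

result = ks_2samp(control, treated)
print(f"KS D = {result.statistic:.3f} vs. true affected share = {true_affected_share}")
# D is a noisy proxy: it can exceed zero by chance in small samples and it
# understates the affected share when the shift is modest, so treat it as a
# screen or bound rather than a precise estimate.
```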

Impact on Scientific Integrity

Catalyzing Retractions and Corrections

Data Colada's analyses have prompted the retraction of multiple papers by uncovering patterns indicative of fabrication, such as improbable clustering, duplicated values, or inconsistencies with raw data files. Their inaugural post on September 17, 2013, examined a dataset archived by the journal Judgment and Decision Making and highlighted anomalies in participant responses that aligned with issues seen in prior fraud cases; this scrutiny contributed to the paper's subsequent retraction. A prominent case involved a 2012 Proceedings of the National Academy of Sciences paper on signing honesty pledges at the top of forms, co-authored by Dan Ariely and colleagues, which reported fabricated data showing reduced dishonesty. Data Colada's post 98, published August 17, 2021, demonstrated fabrication through mismatched survey responses, implausibly uniform patterns, and discrepancies between reported and actual data; the journal moved to retract the paper in August 2021 after the authors could not verify the dataset's integrity. In June 2023, posts 109 through 112 presented evidence of fraud in four papers co-authored by Harvard Business School professor Francesca Gino, including the lab study of the 2012 honesty-pledge paper ("clusterfake" patterns in which observations appeared to have been moved between conditions, breaking otherwise sorted sequences) and others with fabricated participant details, such as respondents listing implausible Harvard class years. These findings triggered a Harvard investigation, leading to retractions of two Gino papers by 2023 and institutional requests for further retractions, though Gino has contested the allegations and initiated (later dismissed) legal action against the bloggers. Beyond full retractions, Data Colada's work has spurred expressions of concern, corrections, and expanded audits of related publications, as seen in Gino's case where co-authors withdrew additional papers amid heightened scrutiny. Their approach, focusing on verifiable artifacts rather than intent, has influenced journals to adopt stricter data-checking protocols, contributing to over a dozen documented instances of post-publication amendments across behavioral sciences.

Contributions to the Replication Crisis Debate

Simmons, Nelson, and Simonsohn's 2011 paper demonstrated that common questionable research practices (QRPs), including p-hacking through flexible analyses and selective reporting of dependent variables, can produce false positives exceeding 60% under standard significance testing, even absent true effects, thereby highlighting systemic vulnerabilities in psychological research that contribute to non-replicability. This analysis shifted the debate from mere anecdotal failures toward quantifying how researcher degrees of freedom undermine evidential validity, prompting widespread adoption of preregistration to curb such practices. In 2014, the trio introduced p-curve analysis, a statistical method that plots the distribution of significant p-values (p < .05) from a body of studies to detect evidential value: right-skewed curves indicate genuine effects, while left-skewed or flat distributions signal p-hacking or selective reporting without true underlying effects. Applied to literatures like power posing, p-curve has revealed inflated effects in bodies of work previously deemed robust, fueling arguments that publication bias, rather than measurement error alone, drives many replication discrepancies. The tool's conservatism—falsely detecting low evidential value only rarely—has made it a benchmark for meta-assessing replicability without direct re-experiments, influencing guidelines from journals and funding bodies to prioritize transparent p-value reporting. Through its posts, the blog has critiqued simplistic interpretations of replication rates, arguing that low success rates (e.g., 40%) do not equate to absent effects if original studies were underpowered, as replication power must account for uncertainty to avoid overestimating null findings. Their "Data Replicada" series, launched in 2019, systematically attempts replications of post-crisis publications in journals like the Journal of Consumer Research, revealing persistent issues such as hidden confounds and non-replicable patterns despite reforms, thus sustaining debate on whether continued non-replicability reflects incomplete behavioral change or inherent field-wide incentives. These efforts recast the replication crisis as a "credibility revolution," emphasizing methodological reforms such as preregistration and transparent reporting over dismissing non-replications as definitive disproofs.
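
The false-positive inflation described above is easy to reproduce in simulation. The sketch below is an illustration in the spirit of the 2011 argument, not the paper's exact simulations: it gives a researcher just one degree of freedom, choosing among two correlated outcome measures or their average, and shows the error rate rising well above the nominal 5% even with no true effect.

```python
# Illustration of how one researcher degree of freedom inflates false positives.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)

def one_null_experiment(n=20, r=0.5):
    """Run a null two-group study with two correlated DVs; 'find' an effect if
    DV1, DV2, or their average reaches p < .05."""
    cov = [[1, r], [r, 1]]
    a = rng.multivariate_normal([0, 0], cov, n)   # "treatment" group, no true effect
    b = rng.multivariate_normal([0, 0], cov, n)   # "control" group
    pvals = [ttest_ind(a[:, 0], b[:, 0]).pvalue,
             ttest_ind(a[:, 1], b[:, 1]).pvalue,
             ttest_ind(a.mean(axis=1), b.mean(axis=1)).pvalue]
    return min(pvals) < 0.05                      # report whichever analysis "works"

rate = np.mean([one_null_experiment() for _ in range(5_000)])
print(f"False-positive rate with flexible DV choice ~ {rate:.3f}")  # roughly 0.09-0.10
```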

Reception and Critiques

Academic and Media Praise

Data Colada's investigative efforts have been commended by academics for enhancing transparency and rigor in behavioral research. Simine Vazire, editor-in-chief of Psychological Science, launched a 2023 crowdfunding campaign to fund their legal defense against a defamation lawsuit, raising over $370,000 from researchers worldwide, underscoring the perceived importance of their work in detecting data irregularities. Vazire emphasized the need to protect such scrutiny to prevent chilling effects on investigations. Other scholars have highlighted their contributions to statistical tools for fraud detection. In a 2025 review of methods for identifying fabricated data, their techniques were cited as foundational for empirical assessments of validity, aiding broader efforts to combat questionable practices. Their blog's exposés, such as those on p-hacking and failed randomizations, have been credited with prompting institutional reforms, including increased data-sharing mandates. Media coverage has portrayed Data Colada as pivotal whistleblowers in high-profile scandals. The Wall Street Journal profiled them in 2023 as a "band of debunkers" whose forensic analyses expose fraud in elite academia, from Dan Ariely's dishonesty studies to Francesca Gino's papers, fostering accountability. The New Yorker noted their meticulous approach, with Gino herself praising their "determination and skill" in data sleuthing prior to her allegations against them. City Journal lauded their role in unraveling the 2012 "dishonest honesty study," crediting the open science practices they advocate with enabling such discoveries. These accounts frame their model of acting on anonymous tips as a vital counter to systemic oversight failures in peer review.

Methodological and Ethical Criticisms

Critics have questioned the robustness of Data Colada's statistical methods for detecting selective reporting and questionable research practices, particularly their early advocacy for p-curve analysis. P-curve, intended to assess evidential value by analyzing distributions of significant p-values, has been shown in simulation studies to suffer from flaws such as unreliability under heterogeneity, vulnerability to p-hacking, and distorted inferences when excluding non-significant results. Post-2018 critiques, including those by Brunner and Schimmack demonstrating poor statistical properties and by Montoya highlighting irreproducible applied conclusions, received no substantive response from the blog, which largely ceased p-curve applications after 2019 in favor of fraud-focused investigations. In fraud detection, Data Colada relies on forensic tools like tests for impossible means, checks for duplicated observations, and scrutiny of residual patterns or distributions for implausibility. While effective for screening, these methods are probabilistic and prone to false positives from non-fraudulent sources such as data-entry errors, measurement artifacts, or unaccounted-for collection protocols. A comprehensive review of such statistical detectors emphasizes their utility in identifying anomalies but underscores limitations, including failure to distinguish intentional fabrication from incompetence or benign anomalies without auxiliary evidence like whistleblower testimony. Ethically, Data Colada's practice of publicly posting detailed allegations against named researchers prior to formal institutional probes has drawn accusations of vigilante justice and reputational harm. Psychologists such as Norbert Schwarz have criticized the approach, arguing it overlooks the interpretive nuances of data and imposes undue punitive pressure. One Harvard colleague characterized the bloggers as "shameless little bullies" engaging in tactics reminiscent of authoritarian regimes, while a former president of the Association for Psychological Science termed it "methodological terrorism." Such disclosures, critics contend, bypass due process, amplify media scrutiny, and risk irreversible career damage even when suspicions prove unfounded or contested, as evidenced by lawsuits alleging biased or incomplete analyses. Detractors within the field further argue that the social costs—disrupted collaborations, lost student opportunities, and diminished morale—often outweigh marginal gains in scientific correction, given the robustness of broader paradigms to isolated retractions.

Francesca Gino Fraud Allegations

In June 2023, the Data Colada blog published a four-part investigative series titled "Data Falsificada," alleging data fabrication by Harvard Business School professor Francesca Gino in four co-authored papers published between 2012 and 2020. The posts, authored by behavioral scientists Uri Simonsohn, Leif Nelson, and Joseph Simmons, presented statistical analyses of publicly available datasets and original files obtained from sources such as the Open Science Framework, highlighting patterns inconsistent with honest data collection. Data Colada had privately notified Harvard Business School of the concerns well before publication, prompting an internal review before the public disclosures. The first post examined a 2012 paper co-authored with, among others, Dan Ariely and Max Bazerman, where patterns in the posted data were estimated to have a probability of occurring by chance of less than 1 in a billion. Subsequent posts analyzed Excel files from two other papers: a 2020 study on signatures and cheating with Dan Ariely, and a 2014 paper on observing unethical behavior. Metadata from Excel's calcChain feature indicated that edits targeted exclusively cells altering statistical outcomes, such as changing response values to flip p-values from insignificant to significant, while leaving unrelated cells untouched. Timestamps and edit histories suggested manual intervention after data collection, with sequences of changes that systematically supported the papers' hypotheses. The fourth post scrutinized a dishonesty experiment dataset, revealing duplicated or reordered participant IDs and fabricated entries that mimicked expected behavioral patterns too precisely for random variation. Gino has denied all allegations of misconduct, asserting that she never falsified data and that observed anomalies stemmed from errors by research assistants, third-party data collectors, or routine file handling, such as Excel auto-formatting. She claimed the calcChain evidence reflected benign updates, like formula recalculations, rather than tampering, and argued that statistical improbabilities could arise from unmodeled complexities in behavioral data. In 2023, Gino filed a defamation lawsuit against Data Colada's authors, alleging their posts contained false statements made with malice; however, a Massachusetts federal judge dismissed the claims in September 2024, ruling that the bloggers' analyses constituted protected opinion based on disclosed evidence and methodologies. The allegations contributed to retractions of three implicated papers by June 2024, including the 2012 and 2020 studies, following journal investigations that cited the Data Colada evidence as compelling indicators of manipulation. Harvard's October 2023 investigative report, later unsealed, detailed forensic analysis supporting intentional alteration in at least one dataset, such as deletion of observations and substitution of values to fabricate results aligning with Gino's predictions. Gino maintains her innocence, framing the scrutiny as a miscarriage of process influenced by unverified assumptions.
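
The calcChain metadata discussed above is readable with standard tools, since .xlsx workbooks are zip archives containing xl/calcChain.xml, a list of formula cells in their last calculation order. The sketch below shows a generic way to pull that list for inspection; the filename is hypothetical, and this is not the forensic tooling Data Colada actually used:

```python
# Generic sketch for inspecting xl/calcChain.xml in an .xlsx workbook.
# Note: the file exists only if the workbook contains formulas.
import zipfile
import xml.etree.ElementTree as ET

def read_calc_chain(xlsx_path):
    """Return cell references from xl/calcChain.xml in stored order."""
    with zipfile.ZipFile(xlsx_path) as zf:
        with zf.open("xl/calcChain.xml") as f:
            root = ET.parse(f).getroot()
    # Entries look like <c r="B12" .../>; tags may carry an XML namespace.
    return [c.attrib.get("r") for c in root if c.tag.split("}")[-1] == "c"]

# Hypothetical usage: cells listed out of step with their neighbors can hint at
# later, targeted edits worth examining alongside timestamps and backups.
# print(read_calc_chain("study1_data.xlsx")[:20])
```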

Harvard University Investigation and Tenure Revocation

In June 2023, following a series of blog posts by Data Colada detailing apparent data alterations in four papers co-authored by Francesca Gino, Harvard Business School Dean Srikant M. Datar initiated an internal investigation and placed Gino on unpaid administrative leave, barring her from campus and revoking her named professorship. The allegations centered on evidence of fabricated or manipulated data in studies Gino co-authored with Dan Ariely and others, including altered survey responses and impossible data patterns, such as participants reporting implausible class years. Harvard's Faculty Conduct Committee, comprising three tenured professors, conducted a two-year probe, reviewing documents, data files, and witness testimonies. In its report, unsealed in 2024 amid Gino's legal challenges, the committee concluded that Gino had committed research misconduct by falsifying data in multiple studies, including one published in the Proceedings of the National Academy of Sciences in which survey responses appeared digitally altered to support the hypothesized results. The report emphasized that the alterations were not mere errors but intentional manipulations traceable to Gino's involvement, rejecting her claims of third-party interference or data entry mistakes. In May 2025, following the committee's recommendation, the Harvard Corporation, the university's highest governing board, revoked Gino's tenure in an unprecedented move, reportedly the first such revocation for research misconduct in Harvard's modern history. This decision terminated her employment at the university, despite Gino's ongoing denial of wrongdoing and her separate defamation lawsuit against Data Colada, filed in 2023, alleging the bloggers' posts lacked sufficient evidence of her direct involvement. Harvard maintained that the investigation's findings were independent and substantiated by forensic evidence, including timestamps and file metadata inconsistent with innocent explanations.

Ongoing Defamation Lawsuit Developments

In August 2023, Harvard Business School professor Francesca Gino filed a $25 million defamation lawsuit against Data Colada bloggers Uri Simonsohn, Leif Nelson, and Joseph Simmons, claiming their June 2023 blog posts alleging data fraud in her co-authored studies constituted false statements that damaged her professional reputation. Gino argued the posts implied she personally committed fraud without sufficient evidence, exceeding protected opinion. On September 11, 2024, U.S. District Judge Myong J. Joun dismissed Gino's defamation claims against the Data Colada defendants with prejudice, ruling that their statements, framed as analyses of suspicious data patterns, were non-actionable opinions protected by the First Amendment rather than verifiable assertions of fact. The judge emphasized that courts should not second-guess scientific debates through libel actions, noting the bloggers disclosed their evidence and reasoning transparently. This decision aligned with precedents shielding academic whistleblowers from defamation liability when critiques involve interpretive judgments. In May 2025, Data Colada moved for Rule 11 sanctions and attorney's fees, arguing Gino's suit was frivolous and intended to intimidate scientific scrutiny. In July 2025, Judge Joun denied the motion, finding that while Gino's claims lacked merit, they did not meet the high threshold for bad-faith litigation, as her counsel reasonably pursued arguments amid disputed evidence. As of 2025, no appeal of the dismissal has been publicly filed, leaving the ruling intact, though Gino maintains the bloggers' analyses were misleading.

References

  1. [1]
    About - Data Colada
    Dec 21, 2018 · The authors: Uri Simonsohn, Leif Nelson and Joe Simmons. The blog. Last update: December 21st, 2018. Launched on September of 2013 ...
  2. [2]
    [129] P-curve works in practice, but would it work if you dropped a ...
    Sep 23, 2025 · Data Colada. Menu. Home · Table of Contents · Feedback Policy · About. Menu ... Uri Simonsohn. P-curve is a statistical tool we developed about 15 ...
  3. [3]
    Faculty & Research - Simonsohn, Uri - Esade
He has been a professor at the Wharton School of the University of Pennsylvania for the last 15 years, and also held an appointment at the University of ...
  4. [4]
    Leif Nelson - Berkeley Haas
Leif Nelson. Ewald T. Grether Professor in Business Administration & Marketing | Barbara and Gerson Bakar Faculty Fellow | Distinguished Teaching Fellow
  5. [5]
    [98] Evidence of Fraud in an Influential Field Experiment About ...
Aug 17, 2021 · A single fraudulent dataset almost never provides enough evidence to answer all relevant questions about how that fraud was committed. And this ...
  6. [6]
    [109] Data Falsificada (Part 1): "Clusterfake" - Data Colada
Jun 17, 2023 · A four-part series of posts detailing evidence of fraud in four academic papers co-authored by Harvard Business School Professor Francesca Gino.
  7. [7]
    [117] The Impersonator: The Fake Data Were Coming From Inside ...
Jun 12, 2024 · In all of my experiences with data fraud, I had never encountered this scenario, where extremely suspicious data perfectly matched the ...
  8. [8]
    Data Colada - Thinking about evidence and vice versa
P-curve is a statistical tool we developed about 15 years ago to help rule out selective reporting, be it p-hacking or file-drawering, as the sole ...
  9. [9]
    [1] "Just Posting It" works, leads to new retraction in Psychology
Sep 17, 2013 · When discussing the work of others, our policy here at Data Colada is to contact them before posting. We ask for feedback to avoid ...
  10. [10]
    Table of Contents - Data Colada
Table of Contents: About Research, Design, About Research Tips, Comment on media coverage, Credibility, Lab Data Replicada, Discuss own paper, Discuss Paper by ...
  12. [12]
    The Data Sleuth Taking on Shoddy Science - Freakonomics
Aug 1, 2025 · Uri Simonsohn and two other academics, Joe Simmons and Leif Nelson, run a blog called Data Colada, where they debunk fraud, call out cheaters, ...
  13. [13]
    Leif Nelson, Author at Data Colada
... Uri Simonsohn, Leif Nelson, and Joseph Simmons. For permission to reprint individual blog posts on DataColada please contact us via email.
  14. [14]
    Joseph Simmons – Operations, Information and Decisions Department
    I am (somehow) the Dorothy Silberberg Professor of Applied Statistics. I am also a Professor of Operations, Information, and Decisions.
  15. [15]
    Joe Simmons, Author at Data Colada
    Author: Joe Simmons. [127] Meaningless Means #4 ... Uri Simonsohn, Leif Nelson, and Joseph Simmons. For permission to reprint individual blog posts on DataColada ...
  16. [16]
    P-Curve: A Key to the File Drawer
Apr 24, 2013 · Uri Simonsohn (Contact Author). ESADE Business School ( email ) ; Leif D. Nelson. University of California, Berkeley - Haas School of Business ( ...
  17. [17]
    p-curve Archives - Data Colada
    P-curve is a statistical tool that identifies if significant findings have evidential value, or are due to selective reporting of studies.
  18. [18]
    [61] Why p-curve excludes ps>.05 - Data Colada
    Jun 15, 2017 · P-curve is not perfect. But it makes minor and sensible assumptions, and is robust to realistic deviations from those assumptions.
  19. [19]
    [45] Ambitious P-Hacking and P-Curve 4.0 - Data Colada
    Jan 14, 2016 · P-curve is a tool that allows you to diagnose the evidential value of a set of statistically significant findings. It is simple: you plot the ...
  20. [20]
    [67] P-curve Handles Heterogeneity Just Fine - Data Colada
    Jan 8, 2018 · In this post, we demonstrate that p-curve performs quite well in the presence of effect size heterogeneity, and we explain why the methods researchers have ...
  21. [21]
    Tools of the data detective: A review of statistical methods to detect ...
    Feb 1, 2025 · The purpose of the present study was to review a collection of existing statistical tools to detect data fabrication, assess their strengths and limitations.
  22. [22]
    Methods to detect published mistakes without raw data?
    Oct 23, 2017 · For example the GRIM test [1]. ... There are a few blogs that much more systematically track inconsistencies in reported data, e.g., Data Colada ...
  23. [23]
    [120] Off-Label Smirnov: How Many Subjects Show an Effect in ...
    Sep 16, 2024 · There is a classic statistical test known as the Kolmogorov-Smirnov (KS) test (Wikipedia). This post is about an off-label use of the KS-test ...
  24. [24]
    [24] P-curve vs. Excessive Significance Test - Data Colada
Jun 27, 2014 · P-curve is a tool that assesses if, after accounting for p-hacking and file-drawering, a set of statistically significant findings have evidential value.
  25. [25]
    Datacolada Has Given Up on p-Curve - Replicability-Index
    Aug 9, 2025 · When p-curve debuted in 2014, it was billed as a powerful tool for detecting publication bias and estimating evidential value from the ...
  26. [26]
    [19] Fake Data: Mendel vs. Stapel - Data Colada
Apr 14, 2014 · See details in the Nonexplanations section of "Just Post It", SSRN. Related. [40] Reducing Fraud in Science June 29, 2015 In "About ...
  27. [27]
    [21] Fake-Data Colada: Excessive Linearity
May 8, 2014 · In this post we present new and more intuitive versions of the analyses that flagged the paper as possibly fraudulent. We then rule out p- ...
  28. [28]
    Anatomy of an inquiry: The report that led to the Jens Förster ...
Apr 30, 2014 · ... Excessive linearity is not something that anybody checks the data for. Let me emphasize: I read the papers. I taught some of them in my ...
  29. [29]
    [40] Reducing Fraud in Science - Data Colada
Jun 29, 2015 · When journals reject original submissions it is not their job to figure out why the authors run an uninteresting study or executed it poorly.
  30. [30]
    [84] Data Replicada #3: Does Self-Concept Uncertainty Influence ...
    Feb 11, 2020 · Low self-concept clarity both increases the tendency to retain and decreases the tendency to acquire an identity-relevant magazine subscription.
  31. [31]
    [90] Data Replicada #7: Does Displaying Multiple Copies of a ...
Aug 18, 2020 · The authors propose that presenting multiple product replicates as a group (vs. presenting a single item) increases product efficacy perceptions.
  32. [32]
    [128] LinkedOut: The Best Published Audit Study, And Its Interesting ...
    Jun 23, 2025 · There is a recent QJE paper reporting a LinkedIn audit study comparing responses to requests by Black vs White young males.
  33. [33]
    Daily briefing: Honesty study to be retracted over faked data - Nature
    Aug 23, 2021 · An influential 2012 paper about how to promote honesty when filling out forms will be retracted because it was based on fabricated data.
  34. [34]
    After honesty researcher's retractions, colleagues expand scrutiny of ...
    Jul 18, 2023 · In June, data sleuths published a series of posts on their blog, Data Colada, detailing what they say is evidence of fraud in four of Gino's ...
  35. [35]
    Honesty researcher's lawsuit against data sleuths dismissed - Science
    Sep 12, 2024 · ... data had already been retracted in 2021. Joun wrote in his decision that Data Colada's assertions of fraud in Gino's work are protected by ...
  36. [36]
    Weekend reads: Who should pay for sleuthing?; the Gino retraction ...
    Sep 23, 2023 · The sleuths behind Data Colada, who are being sued by Harvard professor Francesca Gino, publish retraction requests made by the university.
  37. [37]
    Meet the scientific sleuths: More than two dozen who've had an ...
    the scientists behind Data Colada — have found fatal flaws in high-profile studies of behavior ...
  38. [38]
    Psychology's Replication Crisis Has Made The Field Better
    Dec 6, 2018 · The replication crisis arose from a series of events that began around 2011, the year that social scientists Uri Simonsohn, Leif Nelson and Joseph Simmons ...
  39. [39]
    [PDF] P-Curve: A Key to the File-Drawer
This article was published Online First July 15, 2013. Uri Simonsohn, The Wharton School, University of Pennsylvania; Leif D. Nelson, Haas School of Business, ...
  40. [40]
    [PDF] P-curving Power Posing 1 Running Head
Simonsohn constructed the p-curve disclosure table, conducted the p-curve analysis, and wrote the manuscript. We thank two editors and four reviewers for ...
  41. [41]
    P-curve: A key to the file-drawer. - APA PsycNet
    P-curve is the distribution of statistically significant p values for a set of studies (ps < .05). Because only true effects are expected to generate right- ...
  42. [42]
    [47] Evaluating Replications: 40% Full ≠ 60% Empty - Data Colada
Mar 3, 2016 · For a replication to fail, the data must support the null. They must affirm the non-existence of a detectable effect. There are four main ...
  43. [43]
    [81] Data Replicada - Data Colada
Dec 9, 2019 · We will focus on trying to replicate recently published findings, so as to get a sense for whether non-obvious research published after the ...
  44. [44]
    They Studied Dishonesty. Was Their Work a Lie? | The New Yorker
    Sep 30, 2023 · According to an unpublished Data Colada analysis, six hundred and fifty of the odometer readings were manually swapped between conditions, which ...
  45. [45]
    How the reform-minded new editor of psychology's flagship journal ...
    Oct 13, 2023 · You organized a campaign to fund Data Colada's legal fees, which has raised more than $370,000. Why did you think this was so important? Should ...
  46. [46]
    Tools of the data detective: A review of statistical methods to ... - NIH
    Feb 1, 2025 · Francesca Gino was accused of fabricating data on four of her papers by a blog entitled Data Colada, which comprised Uri Simonsohn, Joe Simmons, ...
  48. [48]
    Data Colada Post 1 - Francesca v Harvard
    Data Colada alleged that the study relies on data manipulation. An HBS report concluded much the same thing. They are both wrong.
  49. [49]
    I'm so sorry for psychology's loss, whatever it is - Experimental History
    Aug 29, 2023 · The bloggers at Data Colada published a four-part series (1, 2, 3, 4) alleging fraud in papers co-authored by Harvard Business School professor Francesca Gino.
  50. [50]
    [110] Data Falsificada (Part 2): "My Class Year Is Harvard"
Jun 20, 2023 · This is the second in a four-part series of posts detailing evidence of fraud in four academic papers co-authored by Harvard Business School ...
  51. [51]
    [118] Harvard's Gino Report Reveals How A Dataset Was Altered
Jul 9, 2024 · Harvard professor Francesca Gino is suing us for defamation after (1) we alerted Harvard to evidence of fraud in four studies that she co-authored.
  53. [53]
    [112] Data Falsificada (Part 4): "Forgetting The Words" - Data Colada
Jun 30, 2023 · This is the last post in a four-part series detailing evidence of fraud in four academic papers co-authored by Harvard Business School Professor Francesca Gino.
  54. [54]
    Innocent of Data Colada Allegations - Francesca v Harvard
Innocence. There is one thing I know for sure: I did not commit academic fraud. I did not manipulate data to produce a particular result.
  55. [55]
    Francesca Gino's Post - LinkedIn
    Aug 2, 2023 · I have never, ever falsified data or engaged in research misconduct of any kind. Today I had no choice but to file a lawsuit against Harvard University.
  56. [56]
    [PDF] Harvard-Report-on-Gino.pdf - Data Colada
    Oct 10, 2023 · The report found Professor Gino committed research misconduct, including falsifying data, altering participant conditions, and misrepresenting ...
  57. [57]
    'I Am Innocent': Embattled HBS Prof. Francesca Gino Defends ...
    including a more thorough response to their Sept. 16 blog ...
  58. [58]
    Harvard Revokes Tenure From Francesca Gino, Business School ...
    May 27, 2025 · HBS Dean Srikant M. Datar placed Gino on unpaid administrative leave, barred her from campus, and revoked her named professorship in June 2023.
  59. [59]
    Honesty researcher committed research misconduct, according to ...
    Mar 15, 2024 · ... Data Colada, for damage to her reputation and lost income and career opportunities. The nearly 1300-page report from HBS includes the ...
  60. [60]
    In extremely rare move, Harvard revokes tenure and cuts ties with ...
    May 25, 2025 · The university's top governing board, the Harvard Corporation, decided this month to revoke Francesca Gino's tenure and end her employment at Harvard Business ...
  61. [61]
    Harvard professor Francesca Gino's tenure is revoked amid data ...
    May 28, 2025 · Gino sued Harvard and Data Colada for defamation, seeking $25 million in relief. The suit points to changes Harvard made to its internal ...
  62. [62]
    Harvard Professor Who Studied Honesty Loses Tenure Amid ...
    May 27, 2025 · A Harvard professor who has written extensively about honesty was stripped of her tenure this month, a university spokesman said on Tuesday.
  63. [63]
    Harvard Sues Ex-HBS Professor Gino for Defamation, Accusing Her ...
    Sep 12, 2025 · But Gino has battled Harvard's penalties in court since August 2023, accusing the University of defaming her, mishandling her tenure review ...
  64. [64]
    Star Harvard business professor stripped of tenure, fired for ...
    May 27, 2025 · A renowned Harvard University professor was stripped of her tenure and fired after an investigation found she fabricated data on multiple studies focused on ...
  65. [65]
    How a Scientific Dispute Spiralled Into a Defamation Lawsuit
Sep 12, 2024 · A few weeks later, Gino filed a twenty-five-million-dollar lawsuit—for defamation, among other things—against Data Colada and Harvard. Gino's ...
  66. [66]
    [116] Our (First?) Day In Court - Data Colada
    May 8, 2024 · Then Francesca Gino filed a $25M defamation lawsuit against us and Harvard. ... Gino has the opportunity to appeal the case to a higher court.
  67. [67]
    Prof. Francesca Gino's Libel Claims Against Harvard Business ...
    Sep 11, 2024 · Gino sued Harvard for breach of contract, defamation and related torts, and invasion of privacy, and also sued the Data Colada Defendants for defamation and ...<|separator|>
  68. [68]
    Judge Declines To Force Ex-HBS Prof. Gino To Pay Legal Fees for ...
    Jul 12, 2025 · Gino To Pay Legal Fees for Bloggers Who Accused Her of Data Fraud. A federal judge rejected the data investigation blog Data Colada's request ...
  69. [69]
    Francesca Gino Wins A Round In Her Continuing Courtroom Drama
    Jul 14, 2025 · JUDGE FOUND GINO'S DEFAMATION CHARGES WEAK · DATA COLADA LAWYERS CALLED GINO'S LAWSUIT FRIVOLOUS AND PUNITIVE · GINO WILL CONTINUE HER LEGAL ...