The Good Judgment Project

The Good Judgment Project (GJP) was a research initiative launched in 2011 by the U.S. Intelligence Advanced Research Projects Activity (IARPA) and led by psychologists Philip Tetlock and Barbara Mellers at the University of Pennsylvania, focused on developing and testing methods to substantially improve the accuracy of probabilistic forecasts for geopolitical, economic, and social events through crowd-sourced volunteer predictions and structured judgment processes. Participating as one team in IARPA's four-year Aggregative Contingent Estimation (ACE) forecasting tournament, GJP recruited thousands of online volunteers to generate over one million forecasts on approximately 500 questions, outperforming control groups of unaided forecasters by more than 60% and prediction markets by 25-30% in accuracy, as measured by Brier scores, which penalize both overconfidence and underconfidence. Central to its success were the identification of "superforecasters"—the top 2% of participants, who maintained high performance year over year with a 0.65 correlation in accuracy across seasons and exhibited traits such as numeracy, cognitive reflection, and active open-mindedness—and techniques such as aggregating multiple judgments, team collaboration, brief training in probabilistic reasoning, extended deliberation time, frequent updating of predictions, and applying an "outside view" via reference classes of similar past events. These empirical findings challenged conventional reliance on individual experts or classified intelligence, showing instead that domain-relevant past performance, precise probabilistic expression, and collaborative aggregation yield superior results even from non-specialists. Upon the tournament's conclusion in 2015, Tetlock and Mellers established Good Judgment Inc., a commercial entity that deploys networks of superforecasters and these validated practices to deliver tailored forecasting and training services to government, nonprofit, and corporate clients.

Origins and Development

Inception as IARPA's ACE Program Participant (2011)

The Intelligence Advanced Research Projects Activity (IARPA), a U.S. government organization focused on high-risk, high-reward research, initiated the Aggregative Contingent Estimation (ACE) program in 2011 to evaluate methods for enhancing the accuracy, precision, and timeliness of probabilistic forecasts on geopolitical events. The program's core objective was to test whether eliciting and statistically aggregating judgments from large, diverse groups of non-expert forecasters could outperform conventional intelligence analysis, which had been critiqued for overreliance on small teams of specialists prone to groupthink and overconfidence. This effort stemmed from lessons drawn from major intelligence shortcomings, including the failure to anticipate the September 11, 2001, attacks—attributed partly to siloed information and probabilistic underestimation—and the erroneous 2002 National Intelligence Estimate on Iraq's weapons of mass destruction, which highlighted systemic problems in aggregating uncertain evidence into predictive probabilities. IARPA structured ACE as a multi-year tournament among competing teams, benchmarking crowd-sourced predictions against unaided intelligence community analysts to determine scalable improvements via empirical competition rather than theoretical advocacy. IARPA competitively awarded contracts to several university-led teams, selecting psychologists Philip Tetlock and Barbara Mellers of the University of Pennsylvania to helm the Good Judgment Project (GJP) based on their complementary expertise in forecasting and judgment research. Tetlock's foundational 2005 study, Expert Political Judgment: How Good Is It? How Can We Know?, tracked 27,451 predictions from 284 experts in political science, economics, and related fields over roughly two decades, revealing accuracy little better than lay benchmarks or chance, with "hedgehogs"—those adhering rigidly to ideological paradigms—far underperforming integrative "foxes" who updated beliefs flexibly against disconfirming data. Mellers, a specialist in behavioral decision research, brought insights from laboratory experiments on debiasing cognitive errors like overconfidence, providing a rigorous basis for designing interventions to elicit calibrated probabilities from untrained participants. Their selection reflected IARPA's emphasis on teams capable of first-principles experimentation: Tetlock's track record validated skepticism toward elite intuition, while Mellers's methods offered testable paths to probabilistic rigor, untainted by access to classified sources that might confound assessments of crowd wisdom. Upon launch in 2011, the GJP established an online platform to recruit and engage thousands of volunteer forecasters from the general public, prioritizing diversity in backgrounds and viewpoints over specialized credentials to simulate broad aggregation unaffected by institutional echo chambers. Participants provided numerical probability estimates on roughly 500 predefined questions over the course of the tournament, centered on geopolitical developments—such as diplomatic outcomes, military actions, or economic shifts—framed for objective resolution by external sources within one to twelve months to enable rapid feedback loops and learning. This setup operationalized ACE's hypothesis that ordinary individuals, prompted to think probabilistically and aggregated via algorithms, could generate forecasts grounded in updated evidence rather than narrative-driven hunches, with recruitment amplified through media outreach to amass over 20,000 registrants by the tournament's early phases.
The platform supported real-time updating of predictions as new information emerged, fostering a controlled environment for analyzing judgment formation independent of institutional or policy pressures.

Tournament Execution and Victory (2011-2015)

The ACE tournament, sponsored by IARPA, ran annually from 2011 to 2015, challenging teams to submit probabilistic forecasts on roughly 100-150 geopolitical, economic, and security questions per year, with resolutions typically within 12 months based on verifiable outcomes. The Good Judgment Project (GJP), led by researchers Barbara Mellers, Philip Tetlock, and Lyle Ungar, entered as one of five initial competing teams alongside a control group of U.S. intelligence community analysts; additional teams joined in later years, totaling up to nine competitors by the end. Forecasts were evaluated using Brier scores, which measure the mean squared difference between predicted probabilities and binary outcomes (lower scores indicate higher accuracy), emphasizing both calibration (alignment of stated probabilities with observed frequencies) and resolution (distinguishing likely from unlikely events). In Year 1 (2011), GJP recruited over 2,000 volunteers through public solicitation, soliciting individual forecasts without targeted interventions to establish an empirical baseline; aggregated predictions via simple averaging yielded Brier scores roughly on par with other entrants and the intelligence control group, demonstrating the viability of crowd aggregation from diverse amateurs but leaving room for systematic improvement. From Year 2 onward (2012-2014), GJP iteratively tested interventions grounded in psychological research on judgment debiasing, including brief, self-paced online modules (about 30 minutes per session) that instructed forecasters in applying base rates (historical frequencies of similar events), reference-class forecasting (drawing analogies from comparable past cases to inform priors), and Fermi estimation (breaking complex questions into multiplicative components for rough quantitative bounds, e.g., estimating event likelihood via order-of-magnitude calculations). These modules, delivered in short doses rather than as intensive courses, aimed to counter overreliance on inside views and narrative fallacies, with empirical tests showing standalone training effects of around 10% relative improvement over untrained baselines. Complementing training, GJP implemented teaming by assigning top performers to small groups (3-5 members) for asynchronous online discussion, encouraging evidence-based updates and reducing individual errors through collective deliberation; this boosted accuracy by a further 13% relative to solo efforts. Aggregation protocols evolved to include extremizing—algorithmically adjusting medians away from 50% toward 0% or 100% when the balance of evidence strongly favored one outcome—and weighted averaging favoring recent, revised forecasts from high performers. Across Years 2-4, these combined interventions—training, collaboration, and refined aggregation—produced cumulative error reductions of 20-30% compared to Year 1 baselines, with progressive gains evident across sequential seasons as forecasters adapted through feedback on resolved questions. By the tournament's end in 2015, GJP's overall scores reflected approximately 60% greater accuracy relative to the Year 1 baseline and the intelligence community's unaided forecasts, outperforming all competing teams by 35-72% in aggregate accuracy; this superiority held across question categories and was validated by IARPA's independent scoring and cross-checked with logarithmic scoring rules that penalize overconfidence.
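
The Brier scoring used throughout the tournament can be illustrated with a short sketch. The snippet below is a minimal illustration, not GJP's actual scoring code, of how a mean Brier score is computed for binary-outcome forecasts under the definition given above; the example probabilities and outcomes are invented for demonstration.

```python
# Minimal sketch of Brier scoring for binary-outcome forecasts.
# Probabilities and outcomes below are invented for illustration only.

def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and
    realized binary outcomes (0 = did not occur, 1 = occurred).
    Lower is better; a constant 50% forecast scores 0.25."""
    if len(probabilities) != len(outcomes):
        raise ValueError("each forecast needs a resolved outcome")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(probabilities)

# A sharp, well-calibrated forecaster versus a hedging one.
confident = [0.9, 0.8, 0.15, 0.7, 0.1]
hedging   = [0.6, 0.55, 0.45, 0.6, 0.4]
resolved  = [1, 1, 0, 1, 0]  # what actually happened

print(brier_score(confident, resolved))  # ~0.034  (sharp and on the right side)
print(brier_score(hedging, resolved))    # ~0.177  (right side, but vague)
```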

Transition to Independent Research and Commercialization

Following the conclusion of the IARPA program in 2015, Good Judgment Project leaders Philip Tetlock and Barbara Mellers published empirical analyses of the tournament's methods and outcomes in peer-reviewed journals, including a 2015 article in Perspectives on Psychological Science that detailed strategies for identifying and training superforecasters to enhance probabilistic accuracy. These publications synthesized data from over a million forecasts across roughly 500 questions, emphasizing techniques like team aggregation and extremizing that contributed to the project's 30-60% superiority over baseline benchmarks. In late 2015, Tetlock and Mellers established Good Judgment Inc. as a private venture to extend the project's empirically validated approaches into applied forecasting beyond government sponsorship. This commercialization drew directly on ACE-derived insights, such as the identification of individuals exhibiting traits like active open-mindedness and numeracy, which early follow-up analyses affirmed as predictive of accuracy across diverse predictive tasks. Initial independent replications after 2015, including extensions through platforms like Good Judgment Open, tested recruitment and training in non-geopolitical arenas, yielding consistent outperformance relative to untrained crowds and affirming the portability of the core selection criteria. These efforts preserved methodological continuity with the ACE tournament, prioritizing causal factors like deliberate practice over domain-specific expertise.

Methodology and Practices

Probabilistic Forecasting Techniques

The Good Judgment Project employed probabilistic forecasting as its foundational method for generating predictions, requiring forecasters to express uncertainty numerically, such as assigning a 70% probability to a binary event occurring by a specified date. These forecasts were evaluated using the Brier score, a measure that assesses accuracy by comparing predicted probabilities to actual outcomes, with penalties for both incorrect directional predictions and overconfidence in high or low probabilities; scores range from 0 (perfect) to 1 (worst), incentivizing well-calibrated estimates over binary yes/no assertions. A core technique was reference class forecasting, in which forecasters identified analogous historical events or "comparison classes" to establish base rates, thereby grounding predictions in empirical precedent rather than unanchored intuition; for instance, analyzing how similar past geopolitical disputes were resolved to inform probabilities for a current conflict. This approach mitigated base-rate neglect, a common cognitive error, by prioritizing data-driven anchors over inside-view narratives. Forecasters also used Fermi estimation to decompose complex, quantitative questions into stepwise approximations, estimating intermediate variables—such as population sizes, rates, or proportions—to arrive at overall probabilities; this method, exemplified in breaking down queries like economic impacts into multiplicative factors, facilitated causal decomposition and revealed hidden uncertainties. Predictions were refined through iterative Bayesian updating, in which forecasters adjusted probabilities in response to new evidence, treating initial estimates as priors and incorporating posterior shifts via likelihood ratios to reflect the weight of the evidence; this process emphasized dynamic responsiveness over static opinion, contrasting with the often unchanging expert judgments prevalent in traditional analysis. Such updates occurred frequently, leveraging disconfirming data to challenge entrenched beliefs and promote causal realism in forecasting.
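
As a rough illustration of the updating step described above, the sketch below applies Bayes' rule in odds form: a prior probability (for example, a base rate drawn from a reference class) is converted to odds, multiplied by a likelihood ratio summarizing how much more probable the new evidence is if the event will occur than if it will not, and converted back to a probability. The numbers are hypothetical and not drawn from GJP questions.

```python
# Hypothetical sketch of Bayesian updating in odds form, as a forecaster
# might revise a probability when new evidence arrives.

def update(prior_prob, likelihood_ratio):
    """Return the posterior probability after weighing new evidence.

    likelihood_ratio = P(evidence | event occurs) / P(evidence | event does not occur)
    Values above 1 push the probability up; values below 1 push it down.
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Start from a reference-class base rate of 20%, then observe evidence judged
# three times more likely if the event is going to happen than if it is not.
p = update(0.20, 3.0)
print(round(p, 3))  # 0.429

# A later, mildly disconfirming signal (likelihood ratio 0.5) pulls it back down.
print(round(update(p, 0.5), 3))  # 0.273
```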

Participant Selection, Training, and Teaming

The Good Judgment Project recruited participants openly from the general public through channels such as professional societies, alumni associations, science blogs, and word-of-mouth referrals, emphasizing self-selection based on interest rather than specialized expertise. No prior forecasting experience or subject-matter expertise was required, though participants generally held at least a bachelor's degree; initial pools ranged from 2,200 to 3,900 forecasters annually across the 2011-2015 Aggregative Contingent Estimation (ACE) tournament years. Selection occurred via performance on initial tasks, prioritizing demonstrated accuracy and sustained participation to filter out chance performers, with top individuals advancing to enhanced conditions. This approach identified high-potential forecasters from diverse backgrounds, including amateurs without elite credentials, whose subsequent accuracy often surpassed that of credentialed analysts. Training consisted of brief cognitive-debiasing modules, typically under one hour in duration, delivered as randomized interventions within the tournament structure. Content focused on probabilistic reasoning, recognizing common pitfalls such as overconfidence and confirmation bias in evidence evaluation, alongside calibration exercises to align stated probabilities with observed outcomes. Randomized controlled trials embedded in the project demonstrated that trained forecasters achieved accuracy improvements of 6% to 11% relative to untrained controls, with effects persisting for at least one year across geopolitical questions. These gains stemmed from deliberate practice in debiasing techniques, enabling ordinary volunteers to refine judgments without relying on institutional expertise. Teaming involved assigning selected forecasters to small collaborative groups, often 7 to 20 members or elite subsets of around 12, where they engaged in structured online discussion to update forecasts. Protocols encouraged causal reasoning, aggregation of diverse perspectives, and sharing of external information like news updates, fostering error reduction through collective scrutiny rather than individual intuition. Empirical analysis showed that teams produced lower Brier scores than solo forecasters, with heightened engagement—such as fivefold increases in comments and tenfold increases in shared resources—correlating with improved calibration and accuracy. This process mitigated personal biases via interpersonal challenge, yielding accuracies that exceeded non-teamed baselines throughout the tournament.

Extremizing and Aggregation Methods

The Good Judgment Project utilized weighted aggregation algorithms to combine individual probabilistic forecasts, assigning greater influence to predictions from forecasters with superior historical accuracy and higher update frequency. This elitist weighting scheme departed from unweighted "wisdom of crowds" methods, which proved ineffective in control groups lacking targeted interventions, as those aggregates performed close to chance benchmarks. Within teams, taking medians of member forecasts further dampened the influence of outliers, drawing on evidence that collaborative aggregation improves accuracy by pooling diverse causal insights. Post-aggregation, the project applied extremizing transformations to counteract the observed tendency of crowd judgments to compress toward moderation, thereby restoring sharper estimates reflective of the underlying evidential strength. This involved a nonlinear adjustment of the aggregate probability p via the function t(p) = p^a / (p^a + (1 - p)^a) with a > 1, the exponent a calibrated empirically—typically around 3.08 for nonexpert aggregates—to push probabilities closer to 0 or 1 depending on forecaster diversity and historical error patterns. The rationale rested on two mechanisms: asymmetric random errors near the extremes bias means inward, and forecasters habitually regress toward 0.5 amid informational gaps, both of which dilute collective confidence unless corrected. Validation on tournament data confirmed that these refinements outperformed raw aggregates by enhancing resolution without introducing systematic overconfidence.
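
The transformation above can be written down directly. The sketch below aggregates a set of individual probabilities with a simple unweighted mean and then extremizes the result with the exponent-a transform described in the text; a = 3.08 follows the figure quoted above, while the example forecasts are invented and the track-record weighting used in production is omitted.

```python
# Sketch of mean aggregation followed by the extremizing transform
# t(p) = p**a / (p**a + (1 - p)**a).  Example forecasts are invented;
# GJP's aggregator also weighted forecasters by accuracy and recency,
# which is omitted here for brevity.

def extremize(p, a=3.08):
    """Push an aggregate probability away from 0.5 (a > 1 sharpens it)."""
    return p ** a / (p ** a + (1.0 - p) ** a)

def aggregate(forecasts, a=3.08):
    mean = sum(forecasts) / len(forecasts)
    return extremize(mean, a)

crowd = [0.65, 0.7, 0.6, 0.8, 0.75]   # individually cautious forecasts
mean = sum(crowd) / len(crowd)         # 0.70
print(round(mean, 2), round(aggregate(crowd), 3))  # 0.7 -> ~0.931
```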

Empirical Findings and Superforecasters

Performance Metrics in the ACE Tournament

The Good Judgment Project (GJP) demonstrated superior forecasting accuracy in the IARPA Aggregative Contingent Estimation (ACE) tournament from 2011 to 2015, evaluated primarily via the Brier score, a measure of probabilistic accuracy in which lower values indicate better performance. Across more than 500 geopolitical questions, spanning approximately 100-150 per year, GJP's aggregate Brier score averaged around 0.25, compared to approximately 0.35 for the no-training control group of ordinary forecasters, equating to a roughly 30% error reduction relative to the baseline. This outperformance exceeded IARPA's initial target of a 20% improvement in the first year, with GJP surpassing competing teams by 35-72% and U.S. intelligence analysts by over 30%. Domain-specific results highlighted strengths in politics and economics, where GJP's interventions yielded the most pronounced gains; for instance, forecasters accurately assessed low probabilities for Syrian regime use of chemical weapons against civilians in targeted scenarios, contributing to overall calibration in these areas. Performance remained superior but comparatively weaker for military events, such as predictions involving troop movements or conflict escalations, though still statistically better than controls due to aggregated probabilistic adjustments. These variations reflected the tournament's emphasis on diverse event types, with over 150,000 individual forecasts informing the metrics. Year-over-year improvements compounded through iterative interventions such as probability training, team collaboration, and performance tracking, with trained GJP teams outperforming untrained counterparts by statistically significant margins (p < 0.001) in both the calibration and resolution components of the Brier score. In Year 1 (85 closed questions), GJP reduced errors by over 60% relative to controls; by Year 2 (114 questions), this rose to 78%, as performance tracking enabled top performers to refine judgments without regression. Such gains persisted across the four years, validating the efficacy of these methods on the tournament's evolving question set.
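
As a quick arithmetic check of the roughly 30% figure, using the approximate Brier values quoted above (these are the text's rounded numbers, not exact tournament data):

$$\text{relative error reduction} = \frac{B_{\text{control}} - B_{\text{GJP}}}{B_{\text{control}}} \approx \frac{0.35 - 0.25}{0.35} \approx 0.29 \;\text{(about 30\%)}$$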

Traits and Profiles of Top Performers

Superforecasters, defined as the top 2% of participants in the Good Judgment Project's tournaments, achieved Brier scores 30% to 60% better than the average forecaster across hundreds of geopolitical questions spanning 2011 to 2014. These individuals maintained their edge over multiple years without regressing to the mean, achieving superior calibration (an average calibration score of 0.01, indicating near-perfect alignment between stated probabilities and outcomes) and resolution (0.40 versus 0.32 for others). Unlike media-highlighted specialists, superforecasters often hailed from diverse, non-expert backgrounds, including students, retirees, and professionals without domain-specific credentials; roughly 74% were U.S. citizens, the average age was about 40, and 64% held advanced degrees. Regression analyses of forecaster data revealed that accuracy correlated with fluid intelligence (e.g., higher Raven's Advanced Matrices scores, r ≈ -0.22 for Brier improvement) and crystallized intelligence, but these factors explained only part of the variance, plateauing beyond moderate levels where cognitive styles dominated. Actively open-minded thinking emerged as a key predictor (standardized β = -0.07, p < 0.03 in multiple regression models with R = 0.64), characterized by tolerance for ambiguity, a weaker need for closure, and willingness to weigh disconfirming evidence, enabling forecasters to integrate diverse information without premature closure. Cognitive reflection also factored positively, reflecting a disposition toward deliberate analysis over intuitive judgment. Humility and frequent belief updating causally enhanced performance, as superforecasters routinely conducted postmortems, adjusted probabilities in response to new information, and quantified uncertainty numerically to avoid all-or-nothing thinking. These practices, validated through experimental interventions, reduced overconfidence and improved discrimination (AUC of 96% versus 75% for average forecasters), countering reliance on the unexamined "gut feel" prevalent in expert and policy circles. Overall, dispositional profiles emphasizing open-mindedness and iterative updating—rather than raw intellect or expertise—distinguished top performers, with teaming amplifying these traits via aggregation.
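
Calibration of the kind credited to superforecasters here can be checked with a simple binning procedure: group forecasts by stated probability and compare each bin's average forecast with the observed frequency of the event. The sketch below is illustrative only; the bin width and example data are assumptions, not GJP's published methodology.

```python
# Illustrative calibration check: bin forecasts by stated probability and
# compare the mean forecast in each bin with the observed event frequency.
# Example data are invented; this is not GJP's scoring pipeline.
from collections import defaultdict

def calibration_table(probabilities, outcomes, bin_width=0.2):
    bins = defaultdict(list)
    for p, o in zip(probabilities, outcomes):
        key = min(int(p / bin_width), int(1 / bin_width) - 1)  # clamp p = 1.0
        bins[key].append((p, o))
    table = []
    for key in sorted(bins):
        pairs = bins[key]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        observed_rate = sum(o for _, o in pairs) / len(pairs)
        table.append((mean_forecast, observed_rate, len(pairs)))
    return table

probs = [0.1, 0.15, 0.3, 0.35, 0.5, 0.55, 0.7, 0.75, 0.9, 0.95]
actual = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
for mean_p, rate, n in calibration_table(probs, actual):
    print(f"forecast {mean_p:.2f}  observed {rate:.2f}  (n={n})")
```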

Comparative Accuracy Against Baselines and Experts

In the Aggregative Contingent Estimation (ACE) tournament sponsored by the Intelligence Advanced Research Projects Activity (IARPA) from 2011 to 2015, the Good Judgment Project (GJP) demonstrated superior accuracy relative to baselines such as random guessing and simple extrapolative models, exceeding these benchmarks by factors of 2 to 3 across hundreds of geopolitical and economic questions. GJP's aggregated forecasts also outperformed expert aggregates from competing teams, including those leveraging domain specialists, by 35% to 72% on accuracy metrics. This edge extended to comparisons with U.S. intelligence community analysts, where GJP forecasts proved over 30% more accurate on identical questions even when analysts had access to classified information—a finding that aligns with prior research indicating that professional experts often perform no better than basic probabilistic baselines in long-range forecasting due to overconfidence and hedgehog-style thinking. Relative to prediction markets and crowdsourced platforms, GJP superforecasters exhibited particular strength in domains involving non-tradable geopolitical risks, such as civil unrest or diplomatic shifts, where liquidity is limited and incentives for participation are sparse. In direct tests against an internal intelligence community prediction market, GJP methods yielded 34.7% higher accuracy over 139 questions. While efficient financial markets remain superior for liquid assets like equities or commodities—showing no instances of underperformance in applicable traded scenarios—GJP aggregates surpassed futures-market benchmarks by up to 66% on eligible event types within the tournament. Post-tournament replications from 2016 to 2018, including longitudinal tracking of cohorts on new question sets, confirmed sustained outperformance against expert and market comparators, with year-over-year performance correlations reaching 0.65, undermining attributions of the initial results to transient luck or selection artifacts. These validations involved independent question resolution by third-party analysts and maintained advantages of 20-30% over aggregated intelligence assessments in replicated geopolitical domains.

Criticisms and Debates

Challenges to Methodological Validity

Critics of the Good Judgment Project have highlighted potential selection bias in participant recruitment, noting that the initial pool consisted of self-selected volunteers who responded to public announcements and demonstrated sustained engagement, which may have enriched the sample with inherently motivated and capable individuals relative to unselected or more representative populations. This self-selection process, which winnowed more than 5,000 participants to approximately 260 superforecasters (the top 2% of performers), could inflate baseline accuracy relative to broader benchmarks, complicating causal attributions of the observed improvements to the interventions. Replications in smaller pools have raised questions about the causal impact of training protocols, with one study of 195 forecasters finding no statistically significant enhancement from an additional 20 minutes of training beyond initial exposure, despite modest shifts in median standardized accuracy scores (e.g., -0.437 for the trained group versus -0.252 for controls). Such findings suggest that gains attributed to the debiasing modules in the original project—such as instruction on base rates, probabilistic reasoning, and belief updating—may partly reflect regression to the mean among initially variable performers rather than robust training-induced learning, as extreme early scores tend to moderate over repeated trials even without intervention. The study's authors emphasized that "training does not imply learning... nor can guarantee that taught methods... have been put in practice," underscoring the difficulty of isolating training effects from statistical artifacts. Debates persist over the extremizing component of aggregation, in which forecasts are adjusted toward 0 or 1 based on forecaster characteristics and historical performance within the dataset; some analysts argue this technique may exploit random noise in individual predictions or risk overfitting to the specific question set, potentially yielding inflated accuracy without generalizable validity beyond the calibrated environment. In small-sample contexts, extremizing's reliance on parameters optimized from in-sample data has been critiqued for lacking external validation, as it assumes consistent noise structures across diverse domains.

Questions on Scalability and Reproducibility

The Good Judgment Project identified approximately 260 superforecasters from an initial pool of around 5,000 participants across four years of the IARPA ACE tournament, representing roughly the top 2% of performers. This constrained group, reliant on sustained high engagement from a motivated volunteer base, highlights practical limits on scalability; replicating such identification in organizational settings would demand comparably large-scale screening and retention efforts, which a 2020 study deemed resource-intensive under real-world constraints like limited time and incentives. Annual attrition among all forecasters ranged from 3% to 7%, while superforecaster retention hovered around 70% year over year, indicating that even among top performers, consistent outperformance erodes without ongoing training and selection pressure. Efforts to reproduce superforecaster identification beyond the IARPA context have yielded mixed outcomes, often constrained by smaller pools and expedited timelines. In a 2020 experiment with 314 experts over just 9 months, only 2 individuals (the top 2% of 195 fully engaged participants, after 36% attrition) met superforecaster criteria, providing supportive but qualified evidence amid engagement challenges absent tournament incentives such as prizes. The study references Philip Tetlock's prior research, including 2005 analyses showing experts' forecasts degrading over time and failing to outperform benchmarks consistently, suggesting that superior performance may not reliably persist without the structured, high-stakes environment of the original project. These findings underscore reproducibility hurdles in non-subsidized applications, where accuracy becomes diluted as participant quality varies without rigorous filtering. Superforecaster traits, including actively open-minded thinking, have been linked to slightly more moderate ideological profiles and lower dogmatism, potentially introducing representational biases by underweighting forecasts from the ideological extremes. While this profile correlates with empirical accuracy in tournament data, it raises causal questions about whether aggregated outputs systematically favor centrist priors, limiting applicability to polarized domains where outlier views might hold unresolved predictive value.

Alternative Explanations for Observed Improvements

Critics have argued that the Good Judgment Project's (GJP) observed forecasting improvements may partly stem from regression to the mean and early-period luck rather than solely from training or teaming interventions. In the ACE tournament, GJP teams achieved strong Year 1 results, but skeptics posit that random variance could explain the initial outperformance, with the subsequent selection of "superforecasters" based on those outcomes creating an illusion of sustained skill; pessimistic analyses suggest superforecasters might regress toward average team performance if chance dominated early success. Although GJP data showed year-over-year correlations in forecaster rankings around 0.65 and retention of top status for about 70% of superforecasters, this consistency has been questioned as potentially driven by persistent high engagement—such as making 7.8 predictions per question versus 1.4 for others and clicking news links 255 times versus 58—rather than unique cognitive traits, implying that self-selection of motivated participants inflated the apparent edge. Selection on observed performance has also been cited as a form of cherry-picking, in which the tournament design favored teams benefiting from fortuitous early resolutions, and superforecaster identification relied on ex-post filtering from a pool of approximately 2,800 participants, potentially obscuring continuous variation in ability rather than a discrete "super" category. High attrition rates, around 7% in Year 1, and differential effort (e.g., superforecasters updating forecasts five times more frequently) introduce bias, as disengaged forecasters may drop out or contribute minimally, skewing aggregates toward dedicated subsets without proving causal improvements from GJP methods. Comparisons to non-human baselines reveal limited evidence of probabilistic superiority attributable to GJP techniques. Aggregated judgments have not consistently outperformed simple Bayesian models or statistical algorithms in controlled tests; for instance, coherence-adjusted Bayesian forecasts sometimes yielded higher accuracy than human ensembles in GJP analyses. Recent evaluations, including those by GJP affiliates, indicate that capable AI models often match or exceed solo human forecasters on geopolitical questions, with hybrid human-AI systems performing best, suggesting that human elements like frequent updating may introduce noise or overfit to training data rather than add robust signal beyond mechanical aggregation. Ideological critiques highlight potential underweighting of tail risks in GJP's incrementalist approach, which emphasizes probabilistic updating and fox-like integration of evidence over bold, hedgehog-style warnings of systemic fragility. Nassim Nicholas Taleb has implicitly challenged such methods by arguing that normal-distribution assumptions in forecasting tournaments undervalue black-swan events—rare, high-impact shocks like financial crises—favoring preparation via robustness over prediction, a view echoed in analyses claiming superforecasting fosters complacency toward extremes by prioritizing high-probability scenarios and calibrated medians. Right-leaning commentators, drawing on Taleb's arguments, contend this tilt toward calibrated moderation mirrors institutional failures in anticipating disruptions, such as lapses on transformative geopolitical shifts, where overreliance on crowd wisdom dilutes vigilance for causal chains. GJP's focus on short-to-medium-term geopolitical questions (mostly under two years) exacerbates this, as predictability declines sharply beyond five years due to nonlinear dynamics, limiting tests against fat-tailed realities.

Commercial Extension and Ongoing Work

Establishment of Good Judgment Inc.

Good Judgment Inc. was incorporated in 2015 by Philip Tetlock and Barbara Mellers as a commercial extension of the Good Judgment Project, which concluded its IARPA-funded tournament that year after generating over one million forecasts across roughly 500 questions. The firm capitalized on the project's validated methodology, including methods for extremizing probabilistic predictions and aggregating crowd judgments, to pivot toward private-sector consulting that applies these techniques to business decision-making under uncertainty. This shift enabled early engagements with corporations seeking to forecast risks such as market volatility and supply disruptions, distinct from the geopolitical focus of the original tournaments. Central to the company's initial offerings were superforecasting training programs, adapted from the Good Judgment Project's protocols, which emphasized iterative feedback, base-rate awareness, and team deliberation to cultivate forecasters capable of outperforming conventional analysts. These programs were tested in client pilots, where they reportedly yielded measurable gains in predictive accuracy, building directly on the research's causal evidence that forecasting skill is acquired through structured practice rather than innate expertise. In parallel, Good Judgment Inc. introduced the Good Judgment Open platform in 2015, a public tool that preserved tournament-style scoring and resolution criteria to generate ongoing forecast data while serving as a talent pipeline for paid services. This launch allowed the firm to maintain empirical rigor in non-commercial settings and to refine its aggregation algorithms with crowd inputs before scaling to confidential client needs.

Forecasting Services and Crowdsourcing Platforms

Good Judgment Inc. delivers bespoke forecasting services to clients in government, non-governmental organizations, and the private sector, utilizing networks of superforecasters to generate probabilistic assessments on targeted topics such as geopolitical risks, election outcomes, and technological advancements. For instance, the firm has produced forecasts on U.S. foreign aid funding levels for philanthropic evaluators, aiding resource-allocation decisions through crowd-aggregated predictions from high-performing forecasters. In the domain of AI governance, superforecasters provide outlooks on milestones such as international cooperation agreements and regulatory developments, emphasizing evidence-based probabilities over speculative narratives. These services extend to election-related challenges, where Good Judgment has fielded questions on outcomes such as national vote shares and seat distributions during the 2022-2024 cycles, with resolutions tracked against official results to refine future models. Client engagements often involve customized tournaments or dashboards that integrate qualitative analysis alongside quantitative scores, enabling decision-makers to quantify uncertainties in scenarios like policy implementation timelines. Complementing bespoke offerings, Good Judgment operates GJ Open, a public platform that solicits probabilistic forecasts from thousands of participants worldwide on questions spanning geopolitics, economics, and technology, including projections for 2026 global elections. The platform aggregates crowd wisdom via mechanisms like weighted averaging of calibrated predictions, fostering broad participation while offering free training resources to enhance user accuracy. Paid tiers, such as FutureFirst subscriptions, grant access to professionally curated forecasts updated daily, team training modules, and performance benchmarking tools for organizational use. Internal empirical tracking underpins these platforms, with annual reviews—such as the 2023-2024 assessments—reporting consistent calibration among superforecasters, whose predicted probabilities align closely with observed frequencies across resolved questions. However, proprietary client data restricts independent verification of aggregate outcomes, limiting external audits to publicly disclosed subsets such as collaborations with outlets like The Economist on annual world-event predictions. This approach prioritizes operational reliability through iterative feedback loops, though full transparency remains constrained by commercial sensitivities.

Recent Applications and Developments (2016-2025)

Following the establishment of Good Judgment Inc., the organization's superforecasters applied their methods to a broadening array of domains beyond core geopolitical forecasting, including monetary policy shifts and technological risks, with empirical reviews demonstrating sustained outperformance against market benchmarks. In 2023, superforecasters achieved perfect accuracy (8/8) on predictions featured in The Economist's "The World Ahead" issue, encompassing volatile indicators such as economic growth trajectories and conflict continuations. By 2024, they scored 4.5 out of 8 on similar forecasts, correctly anticipating sub-5% GDP growth in China, the timing of Britain's general election, and the persistence of the Ukraine conflict, while adapting aggregation techniques to handle heightened election volatility. Monetary policy applications gained prominence in 2023-2024, with superforecasters outperforming futures markets by approximately 30% on average for central-bank rate decisions, including forecasts on Bank of England adjustments amid post-Brexit economic pressures. These efforts extended indirectly to U.S. Federal Reserve dynamics through integrated economic modeling, where probabilistic assessments of rate paths informed client decision-making in volatile environments. Concurrently, new challenges emerged on semiconductor export controls and regulatory hurdles for emerging technologies, reflecting adaptability to sector-specific disruptions like supply chain constraints under U.S.-China trade tensions. Data from these periods highlighted the methodology's robustness, though its reliance on a curated pool of approximately 100-200 active superforecasters imposed limits on scaling for real-time, high-volume predictions. By 2025, expansions into AI governance marked a pivotal development, with superforecasters issuing calibrated predictions on U.S.-China tech competition that emphasized converging incentives—such as mutual reliance on chip supply chains—over the zero-sum rivalry narratives prevalent in media coverage. Forecasts assessed the likelihood of multilateral agreements akin to "Chips for Peace" involving the U.S. and other leading chip-producing states, assigning low probabilities to sweeping restrictions that could hinder U.S. competitiveness in AI development. These views contrasted with alarmist portrayals of an imminent AI arms race, prioritizing empirical signals like shared interests in market stability; for instance, superforecasters estimated modest risks of power-seeking AI behaviors materializing before 2030, informed by iterative updates drawing on historical technology-adoption patterns. Partnerships proliferated for non-geopolitical risks, including collaborations on U.S. foreign aid projections and with energy firms on regulatory outlooks, yielding datasets that validated cross-domain transferability while underscoring persistent challenges of pool size for rapid-response applications.

Broader Impact and Legacy

Influence on Intelligence and Policy Forecasting

The Good Judgment Project's success in the IARPA-sponsored Aggregative Contingent Estimation (ACE) tournament from 2011 to 2015, where it outperformed intelligence community benchmarks by over 30% in accuracy, prompted recommendations for integrating its methods into U.S. intelligence analysis practices. Post-tournament analyses highlighted the need for analysts to shift from deterministic narratives to probabilistic assessments, reducing the tendencies toward overconfidence observed in traditional reporting. Tetlock and colleagues advocated this in their 2016 review, drawing on GJP data to propose training reforms that emphasize updating beliefs with evidence and expressing uncertainty as numerical probabilities, which trials indicated could enhance accuracy without access to classified data. In policy applications, Good Judgment Inc., the commercial successor to the project, has supplied calibrated forecasts on geopolitical risks, including pre-2022 assessments of Russia-Ukraine tensions that assigned moderate probabilities to escalation scenarios, differing from many expert analyses that favored binary outcomes. These outputs, aggregated from superforecaster teams, have informed client organizations' risk assessments, promoting nuanced views over the polarized predictions common in public discourse. Dissemination of superforecasting techniques through workshops has reached government analysts, with evidence from GJP-derived training programs showing measurable reductions in judgmental bias, such as a one-hour intervention improving probabilistic reasoning among professionals in randomized studies. Longitudinal evaluations of these methods, building on GJP's original findings, confirm modest but consistent gains in forecast accuracy for institutional users, though adoption remains constrained by organizational inertia.

Implications for Cognitive Biases and Expert Overconfidence

The Good Judgment Project's forecasting tournaments provided empirical evidence challenging the presumption of expert superiority in probabilistic judgment, as superforecasters—typically non-specialist generalists—outperformed domain experts, including analysts with access to classified information, by approximately 30% in accuracy. This outcome reinforced Philip Tetlock's earlier hedgehog-fox dichotomy, in which "foxes" who integrate diverse perspectives and update beliefs against the evidence achieved superior calibration and resolution compared to "hedgehogs" reliant on singular ideological frameworks, with foxes demonstrating meaningfully higher aggregate success rates in predictive tasks. In the tournaments, superforecasters' edge stemmed from causal updating—iteratively refining mental models against new data—rather than domain-specific intuition, highlighting how specialization often entrenches overconfidence without enhancing foresight. Project techniques explicitly targeted cognitive biases prevalent in expert forecasting, such as anchoring to initial estimates and narrative fallacies that prioritize coherent stories over probabilistic reasoning. Superforecasters mitigated anchoring by systematically combining "inside" (case-specific) and "outside" (base-rate) views, fostering more realistic probability assignments and reducing the undue influence of first-encountered data. These methods countered ideology-driven errors, in which forecasters—often in ideologically aligned institutions—construct overconfident scenarios that dismiss risks, as evidenced by superforecasters' closer alignment with empirical outcomes in politically charged domains. Overconfidence emerged as a systemic flaw in expert judgment, with data showing regular forecasters consistently overprecise in their predictions, while superforecasters maintained near-perfect calibration through practiced belief updating and explicit uncertainty modeling. This enabled superior causal reasoning, as superforecasters treated beliefs as testable hypotheses, regularly revising them in response to disconfirming evidence rather than defending entrenched priors. Such findings underscore that overconfidence is not an isolated error but a default mode amplified in institutional settings where accountability is low and ideological conviction substitutes for evidentiary rigor.

Long-Term Empirical Validation and Future Directions

Subsequent replications of the Good Judgment Project's core findings between 2016 and 2025 have largely affirmed the efficacy of identifying and cultivating superforecasters through talent-spotting, training, teaming, and aggregation, with approximately 70% of superforecasters retaining their status across consecutive years in follow-up analyses. A 2021 experimental study involving 314 experts in an organizational context replicated the identification of rare superforecasters who outperformed baselines, supporting the generalizability of these methods in constrained real-world settings without contradicting the original results. No large-scale disconfirmations have emerged, though some analyses indicate potential diminishing marginal gains in highly familiar or saturated forecasting domains where initial expertise advantages erode over repeated iterations. Looking forward, integrations of human judgment with AI tools show promise for enhancing scalability, as demonstrated in Good Judgment Inc.'s 2023-2025 projects forecasting AI governance risks and power-seeking AI behaviors, where human probabilistic reasoning complemented machine-generated scenarios to refine long-horizon predictions. These hybrids address human limitations in processing large information volumes while leveraging superforecasters' bias-correction skills, with empirical tests suggesting improved accuracy over either alone in volatile domains. Proposed expansions include dedicated tournaments on emerging technological risks to empirically test robustness against domain-specific uncertainties, building on the project's geopolitical successes. An unresolved empirical question concerns the role of ideological diversity among forecasters, as homogeneous groups risk centrist biases that underweight black-swan events shaped by polarized dynamics; tournaments incorporating viewpoint diversity have reduced partisan errors, underscoring the need for randomized trials to quantify its causal impact on tail-risk forecasting. Future validations should prioritize such designs to distinguish skill from selection effects in diverse cohorts.
