Superforecaster
A superforecaster is an individual who demonstrates exceptional and sustained accuracy in probabilistic forecasting of future events, particularly in domains such as geopolitics, economics, and science, outperforming both laypersons and domain experts through empirical validation in large-scale prediction tournaments.[1][2] The concept emerged from research led by political scientist Philip E. Tetlock, who coined the term after analyzing participants in the Good Judgment Project (GJP), a competition sponsored by the U.S. Intelligence Advanced Research Projects Activity (IARPA) from 2011 to 2015, where superforecasters—selected from thousands of volunteers—achieved accuracy levels 60% superior to control groups in the first year and 78% in the second, surpassing U.S. intelligence analysts and other research teams.[3][4]

Unlike intuitive or expert predictions often marred by overconfidence and ideological bias, superforecasters distinguish themselves through deliberate practices rooted in Bayesian updating and causal analysis, including breaking down questions into subcomponents, actively seeking disconfirming evidence, assigning numerical probabilities rather than binary outcomes, and iteratively refining estimates as new information emerges.[5][6] They tend to exhibit traits such as intellectual humility, numeracy, and a non-deterministic worldview, enabling them to navigate uncertainty without anchoring to initial beliefs or succumbing to groupthink.[7] This skill, Tetlock's studies show, is cultivable rather than innate, with superforecasters maintaining top performance across years—about 70% retaining elite status from one tournament season to the next—and applying it in real-world contexts via organizations like Good Judgment Inc., which deploys teams of around 180 such forecasters for client forecasting needs.[8][9]

The identification of superforecasters challenges conventional reliance on credentialed experts, whose forecasts Tetlock's earlier Expert Political Judgment project found no better than chance in many cases, highlighting instead the value of track-record-based evaluation and crowd-aggregation techniques refined by superforecasters, such as ensemble forecasting, in which aggregated individual predictions yield even higher accuracy.[10] While scalable applications include policy advisory and risk assessment, the approach underscores empirical limits: superforecasting excels in resolvable, trackable questions but falters in true black-swan events dominated by unknown unknowns, emphasizing probabilistic humility over false precision.[2]

History
Origins and Etymology
The term "superforecaster" combines the prefix super-, denoting exceptional quality or performance, with "forecaster," a person who predicts future events or trends.[11] It specifically denotes individuals who demonstrate markedly superior accuracy in probabilistic forecasting compared to average participants or experts in structured prediction tasks.[6] The concept originated in the Good Judgment Project (GJP), a research initiative launched in 2011 by psychologist Philip Tetlock and decision scientist Barbara Mellers under the auspices of the U.S. Intelligence Advanced Research Projects Activity (IARPA).[12] The GJP conducted large-scale forecasting tournaments involving thousands of volunteers assessing geopolitical and economic questions, where a small subset of participants—about 2%—consistently outperformed intelligence analysts and other benchmarks by updating predictions dynamically and achieving Brier scores up to 60% better than control groups.[13] Tetlock coined "superforecaster" to describe these top performers, whose skills were identified through empirical analysis rather than prior expertise, challenging assumptions that domain knowledge alone suffices for accurate prediction.[14] The term gained prominence with the 2015 publication of Tetlock's book Superforecasting: The Art and Science of Prediction, co-authored with Dan Gardner, which detailed GJP findings and argued that superforecasting stems from cultivable habits like Bayesian updating and active open-mindedness rather than innate genius or esoteric knowledge.[15] This work built on Tetlock's earlier research, including his 2005 book Expert Political Judgment, which had revealed widespread forecasting failures among pundits and specialists, setting the stage for identifying outliers like superforecasters.[12]Early Research and Tetlock's Contributions
Philip Tetlock initiated systematic research on forecasting accuracy in the 1980s through small-scale tournaments that tracked predictions from experts in politics, economics, and related fields. These efforts, spanning 1984 to 2003, involved 284 participants generating thousands of forecasts on geopolitical and economic outcomes, which Tetlock evaluated against objective resolutions to assess calibration and discrimination.[16] The analysis revealed that experts struggled with long-term predictions, often performing no better than chance or simple baselines like random selection.

This groundwork culminated in Tetlock's 2005 book Expert Political Judgment: How Good Is It? How Can We Know?, which synthesized data from approximately 27,000 predictions made by those 284 experts over two decades. The study distinguished between "hedgehogs"—ideologically driven specialists who exhibited overconfidence and poor adaptability—and "foxes," more eclectic thinkers who marginally outperformed baselines through probabilistic reasoning and belief updating. Overall, however, even the foxes' accuracy remained limited, underscoring systemic flaws in expert forecasting reliant on narrative coherence over empirical tracking. Tetlock argued that judgment quality hinges on both correspondence to reality (accurate calibration) and coherence (logical consistency), with most experts failing the former.[17]

Motivated by these shortcomings, Tetlock shifted toward experimental interventions to cultivate superior forecasters. In 2011, he co-led the Good Judgment Project (GJP) with Barbara Mellers as part of the U.S. Intelligence Advanced Research Projects Activity (IARPA)'s Aggregative Contingent Estimation (ACE) program, a four-year competition launched to test crowd-based forecasting methods against intelligence analysts. GJP recruited over 20,000 volunteers to provide probabilistic estimates on roughly 500 real-world questions, incorporating techniques like base-rate awareness, team deliberation, and frequent updates, which yielded accuracy gains of about 30% over control groups of forecasters.[1]

Tetlock's pivotal contribution emerged from GJP's identification of "superforecasters"—the top 2% of participants, whose aggregate performance surpassed experts, crowds, and even GJP's broader pool by wide margins, achieving Brier scores indicative of high calibration and resolution. These individuals exemplified traits like openness to evidence and iterative refinement, challenging prior pessimism about the limits of human prediction. Tetlock formalized this in his 2015 book Superforecasting: The Art and Science of Prediction, co-authored with Dan Gardner, which disseminated GJP methodologies and traits for broader application, establishing superforecasting as a trainable skill rooted in empirical validation rather than intuition.[1][18]

Key Projects and Milestones
The Good Judgment Project (GJP), co-led by Philip Tetlock and Barbara Mellers, was launched in 2011 as a participant in the U.S. Intelligence Advanced Research Projects Activity (IARPA) Aggregative Contingent Estimation (ACE) forecasting tournament, which ran from 2011 to 2015 and aimed to improve geopolitical prediction accuracy through crowd-sourced methods.[19][3] GJP recruited over 20,000 volunteer forecasters to answer approximately 100-150 annual questions on international events, employing strategies such as talent identification, probabilistic training, team collaboration, and forecast aggregation.[20] In the ACE program's first year (2011-2012), GJP's top-performing forecasters outperformed the control group of intelligence analysts by 60% in prediction accuracy, rising to 78% in the second year, and ultimately surpassed all competing teams by 35-72% across the tournament.[21] This success highlighted the efficacy of structured forecasting processes, with GJP's "superforecasters"—defined as the consistent top 2% of participants—demonstrating sustained outperformance through habits like frequent belief updating and careful calibration.[21]

The project's findings culminated in the publication of Superforecasting: The Art and Science of Prediction by Tetlock and Dan Gardner on September 29, 2015, which formalized the superforecaster concept based on empirical data from the GJP, arguing that superior forecasting stems from learnable skills rather than innate expertise.[2] Following the ACE program's conclusion in 2015, Tetlock and Mellers established Good Judgment Inc. to commercialize these methods, offering forecasting services to government, nonprofit, and private sector clients, including ongoing platforms such as Good Judgment Open for public participation.[22]

Scientific Foundations
Forecasting Tournaments and Methodology
Forecasting tournaments are competitive exercises in which participants assign numerical probabilities to the binary outcomes of predefined real-world events, such as geopolitical developments or policy changes, with accuracy assessed via proper scoring rules that reward well-calibrated, well-resolved predictions.[18] These tournaments test whether forecasting skill exists and can be cultivated beyond chance or expert opinion, using verifiable resolutions to score performance over time.[19]

The foundational tournament series was the U.S. Intelligence Advanced Research Projects Activity (IARPA)'s Aggregative Contingent Estimation (ACE) program, conducted from 2011 to 2015, which sought to boost the precision of intelligence forecasts by rigorously evaluating methods for eliciting, weighting, and aggregating probabilistic judgments.[19] Each annual season involved 100 to 150 questions drawn from national security and international affairs, with clear, objective resolution criteria tied to public or official sources to minimize ambiguity.[18] Participating teams, including university-led groups, competed to refine forecasting techniques, generating over one million individual predictions across approximately 500 questions in total.[1]

The Good Judgment Project (GJP), directed by Philip Tetlock and Barbara Mellers, achieved the highest accuracy in the ACE tournaments through a multi-stage methodology: broad recruitment of volunteer forecasters via online platforms, initial solo probabilistic forecasting on tournament questions, brief training in cognitive debiasing to counter common biases like overconfidence, and iterative updates to predictions as new evidence surfaced.[1][18] Top performers—roughly the top 2% by cumulative accuracy—were selected annually and grouped into elite teams of about 12 members each, fostering collaborative deliberation while preserving individual accountability to incentivize diverse inputs and error correction.[18]

Performance was quantified using the Brier score, the mean squared error between a forecaster's probability estimates (expressed as decimals from 0 to 1) and actual outcomes (coded as 0 or 1), penalizing both miscalibration and poor resolution; scores range from 0 for perfect accuracy to 2 for consistent error, with 0.5 typical for uninformative guesses on even-odds events.[18] GJP's superforecaster teams recorded Brier scores approximately 30% to 60% better than those of intelligence analysts, crowdsourced baselines, and rival research teams across seasons 2 through 4, demonstrating sustained skill rather than luck through metrics such as area under the curve (AUC) values near 0.96 and rapid learning from feedback.[1][18] This approach prioritized empirical validation over untested expertise, revealing that structured probabilistic methods and team dynamics outperform solitary intuition.[19][18]
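The scoring rule described above can be illustrated with a short calculation. The following minimal Python sketch computes a Brier score under the two-category convention used here (0 is perfect, 2 is maximally wrong, 0.5 for an even-odds guess); the forecasts and outcomes are invented for demonstration and are not GJP data.

```python
def brier_score(probabilities, outcomes):
    """Two-category Brier score for binary questions: for each question, sum the
    squared errors on the 'yes' and 'no' categories, then average over questions.
    Ranges from 0 (perfect) to 2 (always certain and always wrong); an
    uninformative 50% forecast scores 0.5."""
    total = 0.0
    for p, outcome in zip(probabilities, outcomes):
        total += (p - outcome) ** 2 + ((1 - p) - (1 - outcome)) ** 2
    return total / len(outcomes)

# Invented forecasts on four resolved yes/no questions.
forecasts = [0.85, 0.10, 0.60, 0.95]
outcomes = [1, 0, 1, 1]
print(f"Brier score: {brier_score(forecasts, outcomes):.3f}")
```

Lower is better: a forecaster who guessed 50% on every question would score 0.5 on this set, while confident and correct forecasts drive the score toward 0.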
Probabilistic Reasoning and Bayesian Principles
Superforecasters distinguish themselves through the consistent application of probabilistic reasoning, expressing predictions as numerical probabilities rather than categorical assertions such as "yes" or "no." This method acknowledges inherent uncertainty in complex events, allowing for nuanced assessments such as a 65% chance of occurrence, which can be rigorously tested against outcomes using proper scoring rules, including the Brier score, which penalizes both overconfidence and underconfidence.[23] In the Good Judgment Project, superforecasters achieved superior accuracy by routinely assigning such probabilities to geopolitical and economic questions, outperforming intelligence analysts who often defaulted to deterministic language.[23][24]

Central to their approach is adherence to Bayesian principles, which involve initializing forecasts with base rates—empirical priors drawn from analogous historical cases—and iteratively updating them as new evidence arrives, following Bayes' theorem for the posterior probability: P(H|E) = P(E|H) · P(H) / P(E), where the prior P(H) is adjusted by the likelihood P(E|H) of the observed evidence E.[25] Tetlock's analysis revealed that superforecasters, unlike novices, actively sought disconfirming evidence and revised estimates promptly, with updates occurring multiple times per forecast as information accrued, fostering a dynamic belief system resistant to anchoring bias.[26] This Bayesian updating was evident in tournament performance, where superforecasters' forecasts improved by an average of 10-15% in logarithmic scoring terms after incorporating fresh data, compared with the static predictions of control groups.[8]

To operationalize these principles, superforecasters decompose questions into subcomponents, estimate conditional probabilities for each, and aggregate them, often employing Fermi-style approximations for tractability.[25] They prioritize reference class forecasting, selecting relevant base rates while avoiding the "reference class problem" through sensitivity testing across alternatives, which enhances calibration—the alignment between stated probabilities and realized frequencies—with superforecasters achieving 70-80% hit rates on their high-confidence predictions in validated studies.[23] Such practices underscore a commitment to causal realism, weighing multiple hypotheses without premature commitment, thereby mitigating overreliance on narrative coherence at the expense of probabilistic rigor.[24]
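To make the updating rule above concrete, here is a minimal Python sketch of a single Bayesian revision from a base-rate prior. The scenario, prior, and likelihoods are hypothetical numbers chosen for illustration, not figures from any GJP question.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior P(H|E) = P(E|H) * P(H) / P(E), with P(E) expanded by the
    law of total probability over H and not-H."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    return likelihood_if_true * prior / evidence

# Hypothetical question: will the incumbent government fall within a year?
prior = 0.15                   # base rate from a (hypothetical) reference class of similar cases
p_protests_if_fall = 0.60      # chance of observing mass protests if it will fall
p_protests_if_survive = 0.20   # chance of observing the same protests if it survives

posterior = bayes_update(prior, p_protests_if_fall, p_protests_if_survive)
print(f"Updated probability after observing protests: {posterior:.2f}")  # about 0.35
```

Repeating this step as each new piece of evidence arrives, rather than anchoring on the initial estimate, is the updating discipline the section describes.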
Characteristics of Superforecasters
Psychological and Cognitive Traits
Superforecasters exhibit distinct psychological traits characterized by humility, reflectiveness, and a willingness to revise beliefs in light of new evidence. They demonstrate high levels of active open-mindedness, treating predictions as testable hypotheses rather than fixed convictions, which enables them to update forecasts dynamically without attachment to initial views.[8][27] This temperament contrasts with overconfidence common in average forecasters, as superforecasters maintain cautious self-assessments, acknowledging uncertainty and avoiding deterministic thinking.[28]

Cognitively, superforecasters possess above-average intelligence, particularly in reflective and analytical domains, allowing for systematic deliberation over rapid intuition.[6] They excel in numeracy and probabilistic reasoning, framing judgments in terms of likelihoods rather than absolutes, which facilitates Bayesian updating of probabilities based on incoming data.[27] Studies from forecasting tournaments indicate that these individuals perform better due to cognitive styles that mitigate biases, such as confirmation bias, through deliberate evidence-seeking and self-critique.[29][18]

Empirical analyses of top performers in the Good Judgment Project reveal correlations with fluid intelligence measures and task-specific skills like intuitive statistical reasoning, though raw IQ alone does not predict success—deliberative habits amplify cognitive strengths.[5] Unlike domain experts who may suffer from specialized overreach, superforecasters apply generalizable cognitive tools across diverse topics, emphasizing precision in aggregating weak signals into coherent probabilities.[8]

Habits and Practices
Superforecasters demonstrate distinctive habits and practices that enhance their predictive accuracy, derived from empirical analysis of top performers in the Good Judgment Project's forecasting tournaments, where participants addressed geopolitical and economic questions over periods exceeding 1,000 days. These individuals, comprising roughly the top 2% of forecasters, habitually apply probabilistic reasoning, regularly update predictions in response to new evidence, and maintain decision journals to track and refine their thought processes.[30][31] Such practices correlate with Brier scores—a metric combining calibration and resolution—30-60% better than those of average participants in controlled studies. Philip Tetlock, principal investigator of the Good Judgment Project, synthesized these behaviors into "Ten Commandments for Aspiring Superforecasters," emphasizing deliberate, evidence-based cognition over intuition or ideological commitment.[30]
- Triage: Superforecasters prioritize questions in a "Goldilocks zone" of difficulty—neither trivially easy nor impossibly vague—where informed effort yields measurable improvements, avoiding overinvestment in low-value or intractable forecasts.[30]
- Break seemingly intractable problems into tractable sub-problems: They decompose complex issues into components amenable to Fermi estimation or targeted research, isolating knowable elements from unknowns and testing underlying assumptions.[30]
- Strike the right balance between inside and outside views: Forecasters integrate case-specific details (inside view) with statistical base rates from analogous historical events (outside view), habitually querying frequencies of similar outcomes to anchor probabilities (see the sketch following this list).[30]
- Strike the right balance between under- and overreacting to evidence: They update beliefs Bayesian-style, distinguishing signal from noise while guarding against confirmation bias or undue weight on recent data, often adjusting forecasts incrementally as evidence accumulates.[30]
- Look for the clashing causal forces at work in each problem: Superforecasters map competing drivers and counterarguments, maintaining a mental checklist of indicators that could falsify their views and synthesizing hybrid perspectives from apparent contradictions.[30]
- Strive to distinguish as many degrees of doubt as the problem permits but not more: They quantify uncertainty numerically—e.g., expressing 60% confidence rather than binary yes/no—through repeated practice to refine granularity without spurious precision.[30]
- Strike the right balance between under- and overconfidence, between prudence and decisiveness: Balancing calibration (alignment of stated probabilities with outcomes) and resolution (discrimination between likely and unlikely events), they avoid paralysis by analysis yet qualify assertions proportionally to evidence strength.[30]
- Look for the errors behind your mistakes but beware of rearview-mirror hindsight biases: Post-mortems on resolved forecasts focus on cognitive pitfalls like overconfidence or base-rate neglect, using journals to differentiate skill from luck without retroactive rationalization.[30]
- Bring out the best in others and let others bring out the best in you: In team settings, they employ perspective-taking, precise questioning, and constructive feedback to aggregate diverse inputs, leveraging collective intelligence while mitigating groupthink.[30]
- Master the error-balancing bicycle: Through iterative practice with feedback loops—such as tournament scoring—superforecasters dynamically correct for tendencies toward over- or under-adjustment, akin to learning equilibrium via trial and error.[30]
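The inside/outside balance in the list above can be illustrated with a simple weighted blend. This is only a sketch under assumed numbers (the base rate, case-specific estimate, and weight are hypothetical); superforecasters shift the weighting judgmentally as case-specific evidence accumulates rather than applying a fixed formula.

```python
def blend_views(outside_base_rate, inside_estimate, weight_on_inside=0.4):
    """Linear blend of an outside-view base rate with an inside-view,
    case-specific estimate; the weight on the inside view is a judgment call."""
    return (1 - weight_on_inside) * outside_base_rate + weight_on_inside * inside_estimate

# Hypothetical question: will a large infrastructure project finish on schedule?
base_rate = 0.25     # share of comparable past projects finishing on time (assumed)
inside_view = 0.70   # estimate based on this project's apparently strong planning (assumed)
print(f"Blended forecast: {blend_views(base_rate, inside_view):.2f}")  # 0.43
```

Starting from the base rate and adjusting toward the inside view, rather than the reverse, keeps the case narrative from swamping the statistical anchor.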
Effectiveness and Validation
Performance Metrics from Studies
In studies from the Good Judgment Project (GJP), superforecasters—identified as the top 2% of individual performers in large-scale forecasting tournaments—were evaluated using the Brier score, which measures the mean squared difference between predicted probabilities and binary outcomes (with scores ranging from 0 for perfect accuracy to 2 for complete inaccuracy).[18] Their standardized Brier scores averaged -0.34 across years 2 and 3 of the tournaments, outperforming top-team individuals at -0.14 and all other participants at 0.04, with statistical significance (p < 0.01).[18] This translated to 50-60% greater accuracy relative to top-team forecasters on quick-response tasks requiring minimal deliberation time.[18]

Superforecasters also excelled in decomposed Brier score components: resolution (ability to distinguish likely from unlikely events) at 0.40 versus 0.35 for top-team individuals and 0.32 for others; calibration (alignment of predicted probabilities with observed frequencies) at 0.01 deviation versus 0.03 and 0.04; and area under the curve (AUC) for discrimination at 96% versus 84% and 75%.[18] Raw Brier scores for superforecasters typically ranged from 0.20 to 0.25 across diverse geopolitical and economic questions, reflecting sustained performance without regression to the mean, unlike average forecasters.[18]

Approximately 70% of designated superforecasters retained their elite status from one tournament year to the next, with correlations between yearly performances indicating skill stability rather than luck.[8] In aggregate, team-based predictions from superforecasters in the GJP's first IARPA-sponsored season were 35% to 72% more accurate (via Brier score reductions) than those from competing research teams, including methods reliant on expert aggregation or crowds.[3] These results held across hundreds of questions resolved between 2011 and 2015, spanning topics like international conflicts, economic indicators, and policy shifts, with superforecasters showing the fastest learning rates (-0.26 in Brier improvement) through iterative feedback.[18]
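The calibration and resolution components cited above derive from decompositions of the Brier score. The sketch below implements the standard Murphy decomposition for the one-category Brier score (the cited study's exact formulation may differ slightly; the two-category convention described earlier simply doubles each term), using invented forecasts and an arbitrary 10% binning choice.

```python
from collections import defaultdict

def murphy_decomposition(forecasts, outcomes):
    """Murphy decomposition of the one-category Brier score:
    Brier = reliability - resolution + uncertainty.
    Lower reliability (miscalibration) is better; higher resolution is better."""
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    bins = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        bins[round(f, 1)].append(o)  # group forecasts into 10% bins
    reliability = sum(len(obs) * (f - sum(obs) / len(obs)) ** 2
                      for f, obs in bins.items()) / n
    resolution = sum(len(obs) * (sum(obs) / len(obs) - base_rate) ** 2
                     for obs in bins.values()) / n
    uncertainty = base_rate * (1 - base_rate)
    return reliability, resolution, uncertainty

# Invented forecasts and resolutions for illustration (not GJP data).
forecasts = [0.9, 0.9, 0.9, 0.2, 0.2, 0.7, 0.7, 0.6]
outcomes = [1, 1, 0, 0, 0, 1, 0, 1]
rel, res, unc = murphy_decomposition(forecasts, outcomes)
print(f"reliability={rel:.3f} resolution={res:.3f} uncertainty={unc:.3f}")
```

A well-calibrated forecaster drives reliability toward zero, while a discriminating forecaster pushes resolution toward the uncertainty term, which depends only on the base rate of the questions.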
Comparisons to Experts, Crowds, and AI
Superforecasters have demonstrated superior accuracy compared to domain experts in controlled forecasting tournaments. In the Intelligence Advanced Research Projects Activity (IARPA) Aggregative Contingent Estimation (ACE) tournament, teams from the Good Judgment Project, which included superforecasters, achieved a Brier score approximately 30% lower than that of U.S. intelligence analysts with access to classified information, indicating markedly higher predictive accuracy. This outperformance persisted across geopolitical, economic, and scientific questions, challenging the conventional reliance on expert intuition, which Tetlock's earlier research found often approximates random chance for long-range forecasts.

Relative to crowds, superforecasters consistently outperform unaggregated or baseline crowd predictions, though hybrid or incentivized crowd mechanisms like prediction markets can narrow the gap. In the Good Judgment Project, superforecaster teams surpassed large crowd baselines by 15-30% in accuracy, even when crowds were drawn from diverse, informed participants.[32] For instance, in a 2023 Financial Times contest involving over 8,500 readers as a crowd proxy, Good Judgment superforecasters achieved higher accuracy on key global events.[33] However, studies indicate that while individual superforecasters excel, aggregated superforecaster teams can rival or exceed prediction markets, which themselves often beat naive crowds by incorporating probabilistic updates and incentives.[34]

Comparisons to artificial intelligence reveal superforecasters maintaining an edge as of 2024-2025, though AI capabilities are advancing rapidly in specific domains. On the ForecastBench benchmark, superforecasters recorded a Brier score of 0.093, outperforming the best large language models at 0.111, a 19% relative advantage across diverse forecasting tasks.[35] Independent evaluations confirm superforecasters' lower Brier scores (e.g., 0.0225) against leading AI models like o1 on real-world questions, attributing human superiority to adaptive reasoning and Bayesian updating habits.[36] Nonetheless, state-of-the-art AI has surpassed superforecasters and human AI experts in select 2025 benchmarks focused on technical progress, such as AI capability timelines, suggesting domain-specific vulnerabilities where superforecasters exhibit undue conservatism.[37] Emerging hybrid human-AI systems show promise for further gains, but pure AI forecasting remains below superforecaster levels in generalizability.[21]

Notable Figures and Organizations
Prominent Individuals
Jean-Pierre Beugoms, a military historian, joined the Good Judgment Project in 2011 and emerged as one of its first superforecasters, demonstrating sustained high accuracy in forecasting, especially on elections.[38] His performance earned recognition in Adam Grant's 2021 book Think Again, which highlighted superforecasters' ability to update beliefs amid uncertainty.[39]

Kjirste Morrell, a mechanical engineer holding a PhD from MIT, ranks consistently among top superforecasters, with expertise in geopolitical predictions.[39] She was also profiled in Think Again for exemplifying traits like probabilistic thinking and evidence-based revision.[39]

Robert de Neufville, a Harvard and Berkeley graduate who qualified as a superforecaster in 2014, focuses on existential risks and co-hosts a podcast on forecasting.[39] His work extends to writing that applies superforecasting principles to long-term uncertainties.[40]

Other profiled superforecasters include Elan Pavlov, a theoretical computer science PhD with postdoctoral experience in behavioral economics, who specializes in military and AI-related geopolitical forecasts;[39] and Dan Mayland, a geopolitical analyst and novelist whose predictions emphasize Middle East dynamics.[39] These individuals, drawn from diverse fields such as engineering, academia, and intelligence, collectively outperform benchmarks by leveraging habits such as frequent updating and aggregation.[41]

Good Judgment Project and Related Entities
The Good Judgment Project (GJP) was a forecasting research initiative led by psychologists Philip Tetlock and Barbara Mellers at the University of Pennsylvania, funded as part of the U.S. Intelligence Advanced Research Projects Activity (IARPA) Aggregative Contingent Estimation (ACE) program from 2011 to 2015.[19][1] The project recruited over 20,000 volunteer participants to generate probabilistic forecasts on approximately 500 geopolitical, economic, and scientific questions, resulting in more than one million individual predictions evaluated against real-world outcomes.[1][42] GJP's methodology emphasized team-based forecasting, where top performers—later termed superforecasters—collaborated to refine predictions through iterative updates and aggregation techniques, outperforming other ACE teams by 35% to 72% in accuracy metrics such as Brier scores.[21] These superforecasters, comprising the top 2% of participants, also demonstrated over 30% greater accuracy than unaided U.S. intelligence analysts on comparable questions.[21]

Following the ACE program's conclusion, Tetlock and Mellers co-founded Good Judgment Inc. in 2015 as a private entity to commercialize superforecasting techniques, offering customized forecasting services to clients in government, nonprofit, and corporate sectors.[22][1] The company maintains a vetted network of approximately 150 superforecasters drawn from GJP alumni, who provide calibrated probability estimates on strategic risks, policy outcomes, and market events, with historical performance showing sustained superiority over baseline benchmarks.[21] Good Judgment Inc. has collaborated with entities such as the U.S. government and private firms for applications including geopolitical risk assessment, achieving documented improvements in decision-making through probabilistic aggregation.

Related to these efforts, Good Judgment Open operates as a public platform launched by the company, enabling broader participation in crowd-sourced forecasting challenges with questions updated weekly on topics like international relations and technology trends.[43] It serves as a training ground for aspiring forecasters and a data source for refining superforecasting practices, with participant scores ranked against historical superforecaster benchmarks to encourage skill development via frequent, evidence-based updates.[43]

Applications
In Government and Policy
Superforecasters have been integrated into government forecasting efforts primarily through structured tournaments and consulting services aimed at enhancing intelligence analysis and policy anticipation. The Good Judgment Project, operational from 2011 to 2015 under the U.S. Intelligence Advanced Research Projects Activity (IARPA)'s Aggregative Contingent Estimation program, recruited crowds of forecasters, including those exhibiting superforecasting traits, to predict geopolitical events relevant to national security.[44] The initiative demonstrated that teams employing superforecaster practices—such as probabilistic reasoning, frequent updating, and aggregation of diverse judgments—outperformed intelligence analysts and other crowds by up to 30% in accuracy on questions like international conflicts and economic indicators.[22]

Post-tournament, Good Judgment Inc., founded by project leaders Philip Tetlock and Barbara Mellers, has extended these methods to government clients for real-time policy support. Policymakers and military leaders use superforecasting to anticipate global events, including foreign policy shifts and security threats, by commissioning scenario-based predictions from certified superforecasters.[22] For instance, in 2024, Good Judgment partnered with the UK Government's Futures Procurement Network to deliver foresight insights, aiding procurement and strategic planning amid uncertainties like supply chain disruptions.[45] Similarly, the firm produced forecasts on U.S. foreign aid funding in 2025, focusing on global health allocations, which informed evaluations by organizations influencing congressional policy.[46]

Applications extend to specialized policy domains, such as AI governance, where superforecasters generated probabilistic assessments in 2025 on regulatory trajectories and international agreements, drawing from public data to guide executive branch deliberations.[47] The Behavioural Insights Team, a UK government unit, has advocated for superforecasting-inspired practices in policy evaluation, emphasizing iterative prediction markets to test interventions before implementation, as seen in trials for economic and public health forecasting.[48] These efforts highlight superforecasting's role in reducing reliance on expert intuition, which studies show often underperforms calibrated crowds, though adoption remains limited by institutional inertia and classification constraints in sensitive policy areas.[22]

In Business and Strategic Decision-Making
Good Judgment Inc., co-founded by Philip Tetlock and Barbara Mellers following the Good Judgment Project, extends superforecasting to private sector clients for corporate forecasting and risk assessment.[22] The firm deploys a network of approximately 180 superforecasters—individuals selected for their top 1-2% accuracy in probabilistic predictions—to address business questions spanning economic, geopolitical, and market uncertainties.[49] These forecasts support strategic decisions, such as resource allocation in energy firms, where predictions inform development priorities amid volatile commodity prices and regulatory shifts.[22]

In practice, businesses leverage superforecasting through tools like FutureFirst, which provides ongoing access to aggregated superforecaster probabilities and qualitative analyses on client-specified scenarios, updated daily.[50] This approach has demonstrated utility in investment contexts, where superforecasters outperformed futures markets in anticipating Federal Reserve interest rate hikes, offering calibrated probabilities that reduced noise and bias in decision processes.[51] For instance, in 2023, superforecasters beat a crowd of 8,500 Financial Times readers in forecasting key economic events, highlighting advantages in team-based aggregation over individual or market judgments.[33]

Strategic applications emphasize four evidence-based elements: talent-spotting via track record evaluation, targeted training to foster probabilistic thinking and belief updating, teaming to mitigate individual errors through diverse perspectives, and aggregation to refine collective outputs.[6] Research underpinning these methods attributes roughly 30% greater accuracy to superforecaster teams compared to baseline analysts, even without classified data advantages.[21] Companies apply this to challenge overconfident internal predictions, as outlined in frameworks for upgrading organizational judgment, enabling better scenario planning for mergers, market entries, or supply chain disruptions.[52] In 2024, private clients submitted 151 forecasting questions to the platform, yielding 1,132 predictions that informed real-time strategic adjustments.[53]

Training and Development
Core Principles for Improvement
Superforecasters cultivate forecasting accuracy through deliberate habits and cognitive practices identified in research from the Good Judgment Project (GJP), a forecasting effort running from 2011 to 2015 in which teams competed to predict geopolitical events.[22] Participants who excelled, termed superforecasters, outperformed intelligence analysts by 30% in Brier scores, a metric measuring the calibration and resolution of probabilistic predictions.[54] These individuals were not experts in specific domains but shared traits like openness to evidence and iterative belief updating, honed via training modules on probabilistic reasoning and bias mitigation.

Key principles for improvement, distilled as "Ten Commandments for Aspiring Superforecasters" by Philip Tetlock, emphasize breaking down complex questions, balancing perspectives, and rigorous self-critique.[30] These derive from analyzing superforecasters' approaches in the GJP, where trained forecasters improved accuracy by 7-10% over untrained peers through practices like aggregating team inputs and extremizing the aggregated probabilities away from 50% while preserving calibration (a sketch of this extremizing step follows the list below).[8] The principles include:
- Triage: Prioritize questions where effort yields returns, avoiding overcommitment to low-stakes or unresolvable forecasts, as superforecasters focused on 20-30 high-value questions annually in GJP trials.[30]
- Decompose problems: Divide intractable issues into sub-problems, such as estimating base rates and causal drivers separately, which reduced error by enabling targeted analysis in GJP simulations.[30]
- Balance inside and outside views: Integrate specific case details (inside view) with statistical baselines (outside view), as overreliance on either led to 15-20% worse calibration in tournament data.[30]
- Balance short- and long-term perspectives: Weigh immediate trends against enduring patterns, preventing recency bias evident in non-superforecasters' 10-15% higher overconfidence.[30]
- Identify clashing forces: Map competing causal mechanisms, like economic incentives versus political inertia, to model uncertainty more granularly, a habit correlating with top GJP decile performance.[30]
- Quantify doubt precisely: Assign probabilities beyond binary outcomes (e.g., 65% vs. 70%), as vague estimates in GJP yielded poorer resolution scores by underdistinguishing shades of likelihood.[30]
- Avoid over- or underreaction: Update beliefs proportionally to evidence strength, with superforecasters adjusting forecasts by 5-10% on average per new datum versus larger swings by others.[30]
- Analyze errors without hindsight: Review mistakes for systematic flaws like confirmation bias, but reconstruct pre-outcome reasoning to avoid retrofitting explanations, improving future accuracy by 8% in iterative GJP training.[30]
- Collaborate effectively: Leverage group deliberation to refine individual views, as GJP teams using structured debate outperformed solo forecasters by 20-23% in aggregate scores.[30][55]
- Balance errors dynamically: Treat forecasting as a skill requiring constant adjustment, akin to riding a bicycle, where superforecasters logged thousands of predictions to refine error patterns over years.[30]
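As referenced above, one way GJP-style aggregation sharpens team forecasts is by extremizing the averaged probability, pushing it away from 50% to offset the blurring effect of averaging. The sketch below shows a common power-law transform on the implied odds; the exponent and the team forecasts are illustrative assumptions, not the values used by GJP.

```python
def extremize(mean_prob, a=2.5):
    """Push an aggregated probability away from 0.5 by raising the implied
    odds to the power a (a > 1 extremizes; a = 1 leaves the mean unchanged)."""
    return mean_prob ** a / (mean_prob ** a + (1 - mean_prob) ** a)

# Hypothetical team of forecasters answering the same binary question.
team_forecasts = [0.70, 0.65, 0.80, 0.72, 0.68]
mean_forecast = sum(team_forecasts) / len(team_forecasts)
print(f"simple mean: {mean_forecast:.2f}, extremized: {extremize(mean_forecast):.2f}")
# roughly 0.71 -> 0.90 with these assumed inputs
```

In the GJP research the degree of extremizing was tuned empirically so that the aggregated forecasts stayed calibrated, which is why the transform is paired with ongoing scoring and feedback rather than applied as a fixed rule.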