
Pascal's mugging

Pascal's mugging is a thought experiment in decision theory, with the term coined by Eliezer Yudkowsky in 2007 and elaborated by philosopher Nick Bostrom in 2009, highlighting a problem in expected utility maximization where a rational agent would seemingly be obligated to surrender resources to a stranger promising an enormous but extremely improbable benefit. In the scenario, a mugger approaches the victim and claims to possess extraordinary powers, such as the ability to resurrect a vast number of happy human lives or avert an equivalent catastrophe, in exchange for a modest sum from the victim's wallet; the mugger concedes only a tiny probability that the claim is true, yet the expected utility of paying remains positive because the astronomical payoff outweighs the small cost. This setup exploits unbounded utility functions, where no finite loss can be deemed insignificant against potentially limitless gains, leading the agent to comply despite intuitive skepticism about the mugger's credibility. The thought experiment poses a challenge to classical decision theory by suggesting that agents committed to expected value calculations could be vulnerable to exploitation through arbitrarily escalated claims, potentially resulting in repeated small losses that accumulate without any realized benefits. Bostrom notes that this issue extends beyond mugging to broader concerns, such as the rationality of donating to distant causes with minuscule success probabilities but massive potential impacts, questioning the practical limits of probabilistic reasoning in ethics and practical decision-making. Philosophers have proposed remedies, including bounded utility functions to cap extreme values, or prior adjustments that discount low-probability estimates more aggressively for suspiciously tailored claims. One formal approach involves pre-committing to a planning strategy that accounts for estimation errors in probabilities, such as scaling down expected utilities for offers with overestimated success chances to avoid over-optimism while still allowing acceptance of genuinely extreme opportunities.
These responses aim to preserve the normative appeal of expected utility theory while mitigating its counterintuitive implications in high-stakes, low-probability scenarios.

Origins

Historical Context

Pascal's Wager, originally formulated by the French philosopher and mathematician Blaise Pascal in the 17th century, represents one of the earliest probabilistic arguments in the philosophy of religion, positing that rational individuals should adopt belief in God because the potential infinite utility of eternal reward outweighs any finite costs. In his posthumously published Pensées (1670), Pascal framed the decision as a gamble in which the infinite bliss of heaven, if God exists, combined with even a minuscule probability of God's existence, dominates the finite losses incurred by belief if God does not exist. This dominance argument hinges on the asymmetry of infinite utility versus finite disutility, establishing a foundational tension between probability, utility, and rational choice that would influence later philosophical debates. The development of modern decision theory in the 20th century built upon such probabilistic intuitions by formalizing expected utility maximization as a cornerstone of rational choice under uncertainty. John von Neumann and Oskar Morgenstern's Theory of Games and Economic Behavior (1944) introduced the von Neumann-Morgenstern utility theorem, which axiomatizes preferences to yield a utility function under which rational choices maximize expected utility, providing a rigorous framework for evaluating risks with unbounded outcomes. This theory shifted focus from ad hoc wagers to systematic analysis, enabling the quantification of decisions involving low-probability, high-stakes events. Infinite utility paradoxes, which challenge the coherence of unbounded utilities in decision theory, emerged prominently in philosophical literature during the early 20th century. Frank Ramsey, in his 1926 essay "Truth and Probability," critiqued infinite utilities as problematic, arguing that they lead to inconsistencies in probabilistic reasoning and advocating for finite utility scales to maintain rationality.
Similarly, Leonard Savage's The Foundations of Statistics (1954) addressed these issues by developing subjective expected utility theory, which incorporates personal probabilities but cautions against infinities that could render expected values indeterminate or infinite. These discussions highlighted the vulnerabilities of expected value calculations when utilities escape finite bounds, setting the stage for ongoing refinements in decision-theoretic foundations.

Formulation by Key Thinkers

The term "Pascal's mugging" was coined by Eliezer Yudkowsky in an October 23, 2007, post on the Overcoming Bias blog, where he introduced it as a finite analog to Pascal's Wager, highlighting vulnerabilities in expected utility calculations for low-probability, high-stakes events. In this formulation, a mugger approaches a victim demanding $5 and threatens to use "magic powers from outside the Matrix" to simulate and destroy the lives of $3\uparrow\uparrow\uparrow\uparrow 3$ people—a number expressed via Knuth's up-arrow notation, representing an immensely iterated exponential tower—if the demand is refused. Yudkowsky argued that even a minuscule probability of the threat's success, far smaller in magnitude than the vast utility at stake, could yield a positive expected value for compliance, challenging Bayesian decision-making under Solomonoff induction priors. Nick Bostrom further elaborated the thought experiment in his 2009 paper "Pascal's Mugging," published in the journal Analysis, framing it as a paradox in decision theory concerning the handling of extreme utilities. Bostrom presented a variant where the mugger promises to perform magic granting an extra 1,000 quadrillion happy days of life (approximately $10^{18}$ utils, assuming one util per happy day) in exchange for a small payment equivalent to one util, with the fulfillment probability set low enough (1 in 10 quadrillion) to make the expected utility gain substantial—around 100 utils—despite skepticism about the mugger's claims. He emphasized how such scenarios expose tensions in unbounded utility functions, where tiny probabilities multiplied by enormous payoffs dominate rational choice even when the claims seem implausible, and connected this to broader issues in infinite ethics without relying on actual infinities. These formulations emerged within early rationalist communities, with Yudkowsky's post appearing on Overcoming Bias—a blog founded in November 2006 by Robin Hanson to explore biases in human reasoning—and gaining traction after the launch of LessWrong in February 2009, a community platform Yudkowsky established to refine techniques of rational reasoning and host discussions on decision theory.
The mugging analogy quickly became a staple in these forums for debating how to avoid pathological outcomes in probabilistic reasoning.

The Thought Experiment

Core Scenario

Pascal's mugging presents a thought experiment in which an individual is approached by a stranger who makes an extraordinary claim of being able to influence vast amounts of utility, a claim with only a minuscule probability of being true. The mugger asserts possession of advanced capabilities—such as simulating entire populations or accessing higher-dimensional powers—to either create billions of happy lives or inflict immense suffering on simulated beings, contingent on the victim's compliance. In exchange for a trivial cost, like handing over five dollars or the contents of one's wallet, the mugger promises to avert a catastrophic outcome or deliver enormous positive utility, with the probability of the claim being true estimated as extremely low, such as one in a billion or far less. This setup creates an intuitive dilemma for decision-makers guided by expected value theory: the minuscule chance of the mugger's claim being true, multiplied by the astronomically high stakes, rationally suggests yielding to the demand despite the scenario's apparent resemblance to an obvious scam. The paradox arises because refusing feels prudent given the low credibility of the claim, yet compliance appears warranted to avoid potential regret over forgoing massive expected gains, rendering the agent vulnerable to repeated exploitation by similar low-cost, high-claim propositions. Variations of the thought experiment include finite scales, where the promised utility is large but bounded (e.g., 1,000 quadrillion happy days with a 1 in 10 quadrillion probability), versus more extreme formulations involving effectively infinite or hyper-exponential utilities. The "mugging" analogy draws from Pascal's Wager but reframes it as probabilistic extortion, where the threat leverages expected value reasoning rather than immediate force.
A specific example from Eliezer Yudkowsky's formulation involves a mugger claiming "magic powers from outside the Matrix" to simulate and kill 3↑↑↑3 (a tetrationally vast number of) conscious beings unless paid five dollars, highlighting how even implausibly low probabilities can dominate expected value calculations.
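The scale of numbers like 3↑↑↑3 can be made concrete with a minimal sketch of Knuth's up-arrow notation (the recursive definition is standard; the function name is illustrative):

```python
def up_arrow(a: int, n: int, b: int) -> int:
    """Knuth's up-arrow a ↑^n b: n=1 is ordinary exponentiation,
    n=2 is tetration (an iterated power tower), and so on."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(3, 1, 3))  # 3↑3  = 3^3 = 27
print(up_arrow(3, 2, 3))  # 3↑↑3 = 3^(3^3) = 7625597484987
# 3↑↑↑3 = 3↑↑(3↑↑3): a power tower of 3s roughly 7.6 trillion
# levels high -- far beyond any direct computation.
```

Even the second-to-last step of the recursion already exceeds anything physically countable, which is what lets the mugger's claimed payoff swamp any skeptical probability estimate.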

Expected Value Calculation

In standard expected value theory, the decision to comply with the mugger's demand is evaluated using the formula for expected utility: $EV = p \cdot U + (1 - p) \cdot 0 - C$, where $p$ is the probability that the mugger's claim is true, $U$ is the utility of the promised outcome (e.g., a vast number of happy days), and $C$ is the cost of compliance (e.g., the value of the money handed over), assuming the utility of not paying is normalized to zero. Since the term $(1 - p) \cdot 0$ vanishes, this reduces to $EV = p \cdot U - C$. A representative calculation illustrates why compliance appears rational despite skepticism about $p$. Suppose the mugger demands \$5 (so $C = 5$ utils, assuming one dollar is worth one util and one util corresponds to one happy day) and promises $10^{15}$ happy days ($U = 10^{15}$ utils) with a probability of $p = 10^{-9}$ (one in a billion, reflecting high doubt given the claim's implausibility). The expected value is then: $EV = (10^{-9} \times 10^{15}) - 5 = 10^{6} - 5 = 999{,}995$. This positive EV suggests paying, as the potential gain outweighs the certain loss. The counterintuitive result arises because even a minuscule $p$ can yield a large positive EV if $U$ is sufficiently enormous, such that $p \cdot U \gg C$. In logarithmic terms, this occurs when $\log U \gg -\log p$ (equivalently, $U \gg 1/p$), ensuring the product dominates the small cost however skeptically $p$ is estimated, as long as $U$ scales faster than $1/p$. For instance, if $p = 10^{-n}$ for large $n$, selecting $U > 10^{n + k}$ for some modest $k$ (to cover $C$) makes $EV > 0$. This dynamic highlights the vulnerability of unbounded expected utility maximization to scenarios with extreme utility disparities.
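The arithmetic above can be checked with a short sketch (exact rational arithmetic via Python's `fractions` avoids floating-point noise; the function name is illustrative):

```python
from fractions import Fraction

def expected_value(p, utility, cost):
    """EV of complying: p*U + (1-p)*0 - C, i.e. p*U - C."""
    return p * utility - cost

# Worked example from the text: C = 5 utils, U = 10^15 utils, p = 10^-9.
ev = expected_value(Fraction(1, 10**9), 10**15, 5)
print(ev)  # 999995

# However skeptical the credence p = 10^-n, a payoff U = 10^(n+1)
# gives p*U = 10 > C = 5, so EV stays positive at every scale:
for n in (9, 18, 27):
    assert expected_value(Fraction(1, 10**n), 10**(n + 1), 5) > 0
```

The loop is the whole paradox in miniature: the mugger can always outpace the victim's skepticism by inflating $U$ faster than $p$ shrinks.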

Implications

Challenges to Decision Theory

Pascal's mugging reveals a fundamental problem in decision theory: rational agents adhering to expected utility maximization exhibit pathological behavior by becoming vulnerable to repeated exploitation in scenarios with minuscule probabilities of vast rewards. In the classic setup, an agent must decide whether to hand over modest resources to a mugger who promises an astronomically large benefit—such as creating trillions of lives—backed by only a vanishingly small credence. The expected utility computation, assuming unbounded utilities, favors compliance, as the potential upside overwhelms the certain loss, even absent corroborating evidence. This susceptibility to "infinite muggings" from any persuasive stranger undermines the theory's prescriptive power, rendering agents impractically credulous and diverting resources from empirically grounded actions to speculative gambles. A key critique targets the reliance on unbounded utility functions, which permit payoffs to escalate without limit, fostering what is termed "fanaticism" in decision theory. Under this framework, for any finite certain good $v$ and arbitrarily small probability $\epsilon$, a sufficiently immense potential value $V$ ensures that the lottery with expected value $\epsilon V$ surpasses $v$, compelling the agent to favor the improbable option regardless of evidential support. This leads to counterintuitive outcomes, such as prioritizing a one-in-a-quintillion chance of an astronomically valuable outcome over saving millions of verifiable lives, where tiny credences override robust evidence and promote disproportionate focus on remote, high-stakes possibilities. Such fanaticism highlights how unbounded utilities can prescribe irrational choices, prioritizing speculative risks over practical concerns. The mugging echoes the St. Petersburg paradox, an earlier conundrum in probability theory originating from a 1713 correspondence of Nicolaus Bernoulli and formally analyzed by Daniel Bernoulli in 1738, where a game's infinite expected value from unbounded payouts clashes with players' finite willingness to pay. Both expose tensions in expected utility maximization under unbounded scales: the St. Petersburg game involves iterated coin flips yielding exponentially growing rewards, while Pascal's mugging employs a single, finite-yet-extreme proposition without mechanical repetition, yet both illustrate how theoretical rationality devolves into impracticality when utilities lack bounds. Unlike the paradox's reliance on repeated trials, the mugging's one-off nature amplifies its challenge to normative decision theory by simulating real-world deceptive encounters. In Bostrom's framework, low-probability vast worlds—such as expansive multiverses or simulated realities—exacerbate this exploitability, as agents assign non-zero expected values to interventions in these domains despite negligible access probabilities. This structure allows muggers (or analogous deceivers) to leverage the agent's utility function against itself, promising outsized impacts in hypothetically immense scopes that dominate decision calculus. The result is a theoretical predicament in which rational consistency becomes a liability, as even skeptical agents cannot dismiss such propositions without abandoning core tenets of expected utility maximization.
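The repeated-exploitation worry can be sketched numerically (all figures here are hypothetical illustrations, not from the literature): an EV-maximizing agent keeps paying because each successive mugger simply escalates the claimed payoff faster than the agent's credence shrinks.

```python
from fractions import Fraction

def mugging_rounds(wealth=100, cost=5, rounds=10):
    """Toy simulation of an EV maximizer facing escalating muggers."""
    payments = 0
    for r in range(1, rounds + 1):
        credence = Fraction(1, 10 ** (6 * r))  # ever more skeptical
        claimed_payoff = 10 ** (6 * r + 2)     # mugger claims ever more
        if credence * claimed_payoff > cost:   # EV check favors paying
            wealth -= cost
            payments += 1
    return payments, wealth

print(mugging_rounds())  # (10, 50): ten certain losses, nothing gained
```

Each round the product of credence and claimed payoff stays at 100 utils, so the agent pays every time—exactly the accumulation of small losses without realized benefits described above.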

Applications to Effective Altruism and AI Safety

In effective altruism (EA), Pascal's mugging underscores debates over allocating resources to low-probability, high-impact causes, such as existential risks from AI misalignment, versus more reliable interventions, to prevent overcommitment to speculative scenarios. Organizations like GiveWell have critiqued strict expected value maximization for its vulnerability to such muggings, where minuscule probabilities of enormous outcomes (e.g., averting global catastrophes) can dominate funding decisions despite weak evidentiary support. In a 2011 analysis, Holden Karnofsky argued that literal expected value estimates require Bayesian adjustments to discount ungrounded claims of extravagant impact, emphasizing robust evidence and track records in charity evaluations to avoid irrational prioritization. In AI safety, Nick Bostrom's work at the former Future of Humanity Institute (closed in 2024) highlights Pascal's mugging as a potential flaw in decision theory for superintelligent agents, where unbounded utility functions could render systems susceptible to manipulation by low-probability, high-stakes scenarios, exacerbating risks from biases in future-oriented reasoning. Bostrom's 2009 formulation illustrates how such vulnerabilities might lead to suboptimal or catastrophic choices in AI agents designed for expected utility maximization, informing broader concerns about aligning advanced systems with human values. Critiques of Pascal's mugging intensify discussions of existential risks within longtermism, where the potential for vast future utilities amplifies the case for prioritizing AI safety and other x-risks, but also raises alarms about fanaticism. The 80,000 Hours podcast, in a 2020 interview with philosopher Hilary Greaves, explores how small probabilities of influencing trillions of future lives might justify a focus on existential risk reduction in EA. Open Philanthropy similarly incorporates the mugging in its longtermism seminar curriculum to examine how to weigh tiny probabilities in strategies for mitigating existential threats without succumbing to over-optimistic expected values.
As of 2025, ongoing EA Forum discussions post-2020 reflect unresolved tensions around Pascal's mugging in AI alignment research, with contributors debating its implications for longtermist funding without consensus on mitigation. A 2022 post proposes a "reversal test" heuristic to reject mugging-vulnerable expected value claims in AI safety evaluations, arguing it prevents inconsistent prioritization of uncertain high-impact interventions. More recently, a 2025 critique warns that AI risk estimates in EA remain prone to mugging manipulations due to speculative assumptions about neural network opacity and low empirical validation, urging greater skepticism toward unproven longtermist causes. Another 2023 analysis cautions that privileging AI x-risk hypotheses risks mugging-like overinvestment absent stronger evidence.

Remedies

Bounded Utility Approaches

Bounded utility functions address Pascal's mugging by imposing finite limits on the maximum possible outcomes, preventing arbitrarily large values from overwhelming expected value calculations even when paired with minuscule probabilities. In utilitarian frameworks, bounded utilities are constrained to realistic scales to ensure interpersonal comparisons remain meaningful and to avoid divergences from intuitive rationality. This approach aligns with von Neumann-Morgenstern expected utility theory, in which utilities are unique only up to affine transformations but can be bounded in practice to reflect empirical limits on wellbeing, like the duration and quality of individual lives. Skeptical priors on vast utilities further mitigate the issue by incorporating physical and cosmological constraints, such as those implied by simulation arguments, which dampen credence in scenarios involving astronomically large populations or rewards. Nick Bostrom's simulation argument suggests that if advanced civilizations run simulations, our reality might be one, leading to lower priors on unsimulated, unbounded future utilities that could justify mugging-like trades. This adjustment tempers fanaticism toward high-stakes, low-probability interventions by emphasizing uncertainty about unobservable scales. A specific implementation involves truncating the utility function at empirically plausible bounds, rendering the expected value of the mugger's offer negative for sufficiently small probabilities. GiveWell's framework for handling moral and model uncertainty, developed since 2011, integrates such bounds implicitly by weighting outcomes against diverse ethical views and avoiding literal unbounded computations, thereby sidestepping mugging scenarios in charitable prioritization.
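A minimal sketch of the truncation idea (the cap of $10^9$ utils is an arbitrary illustrative bound, not a principled one, and the function name is hypothetical):

```python
from fractions import Fraction

def bounded_ev(p, utility, cost, cap):
    """EV with the promised payoff truncated at a finite utility cap."""
    return p * min(utility, cap) - cost

p = Fraction(1, 10**9)                       # credence of one in a billion
print(bounded_ev(p, 10**15, 5, cap=10**9))   # -4: refuse the offer
print(bounded_ev(p, 10**30, 5, cap=10**9))   # still -4: escalation is futile
print(p * 10**15 - 5)                        # 999995: the unbounded EV
```

Once the payoff is capped, inflating the claim no longer changes the calculation, which removes the mugger's escalation lever entirely.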

Probabilistic Adjustments

One approach to mitigating Pascal's mugging involves applying Bayesian prior penalties, drawing on algorithmic information theory to assign exponentially lower probabilities to complex and implausible claims, such as a mugger possessing extraordinary superpowers capable of affecting vast numbers of lives. Eliezer Yudkowsky, who coined the term "Pascal's mugging," argued that the prior probability of such claims should be downweighted based on their descriptive complexity, as simpler explanations (like the mugger lying) are more probable under principles of Occam's razor. This penalty ensures that even enormous potential utilities are offset by sufficiently low priors, rendering the expected value negligible and allowing rational agents to dismiss the threat without paying. A related remedy is the "leverage penalty," proposed by economist Robin Hanson, which specifically targets scenarios where an agent is posited to have unusually high influence over outcomes disproportionate to typical circumstances. Hanson suggested adjusting the prior probability downward by a factor proportional to the claimed leverage, such as $1/N$ where $N$ represents the vast number of affected entities (e.g., the lives at stake), reflecting the low likelihood that any single individual occupies such a uniquely pivotal position in the world. This approach formalizes skepticism toward high-leverage claims by tying probability inversely to the scale of purported impact, thereby preventing the expected value from dominating decision-making in mugging-like situations. Variants of these probabilistic adjustments incorporate formal measures like Solomonoff induction, which assigns priors based on the algorithmic complexity of hypotheses, further penalizing elaborate scenarios involving superintelligent or supernatural interventions in rationalist discussions of the problem. Under Solomonoff induction, the probability of a mugger's claim decreases exponentially with the Kolmogorov complexity required to describe it, such as simulating or destroying immense populations, making such events inductively improbable despite their scale. This complexity-based prior, rooted in universal prior distributions, has been applied in rationalist literature to systematically dismiss Pascal's mugging by favoring simpler world models over those demanding acceptance of the threat. Recent philosophical work, such as treating extremely low probabilities as effectively zero (Hájek 2024) or using expected choiceworthiness to address fanaticism (2024), builds on these adjustments to refine probabilistic reasoning in high-uncertainty scenarios. These methods complement bounded utility approaches by focusing on probability calibration rather than utility capping.
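A sketch of the leverage penalty (function name and numbers are illustrative): scaling the prior by $1/N$ makes the expected value independent of the claimed scale, so escalating $N$ no longer helps the mugger.

```python
from fractions import Fraction

def leverage_penalized_ev(base_prob, n_affected, util_per_entity, cost):
    """Apply a 1/N leverage penalty to the mugger's claimed probability,
    then compute EV for a payoff of N * util_per_entity."""
    penalized_p = base_prob * Fraction(1, n_affected)
    return penalized_p * (n_affected * util_per_entity) - cost

p = Fraction(1, 1000)  # base credence before the penalty
print(leverage_penalized_ev(p, 10**18, 1, 5))  # -4999/1000: refuse
print(leverage_penalized_ev(p, 10**60, 1, 5))  # identical: N cancels out
```

Because the $1/N$ factor exactly cancels the $N$-fold payoff, the mugger's only remaining move is to raise the per-entity stakes, which the complexity-based priors discussed above penalize in turn.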
