Machine Intelligence Research Institute
The Machine Intelligence Research Institute (MIRI) is a nonprofit organization founded in September 2000 as the Singularity Institute for Artificial Intelligence, dedicated to researching the technical challenges of aligning artificial superintelligence with human values to avert existential risks.[1] Renamed MIRI in 2013, it shifted emphasis from accelerating AI development to prioritizing safety amid growing concerns over uncontrolled capabilities.[1] MIRI's core focus involves mathematical investigations into AI decision-making, goal specification, and robustness, producing influential technical reports on topics such as coherent extrapolated volition, tiling agents, and logical inductors that have shaped early discourse in AI alignment.[2][3] Pioneering efforts include helping launch and host the LessWrong online community (founded in 2009 around Eliezer Yudkowsky’s ‘Sequences’) for rationalist discourse and compiling those sequences in Rationality: From AI to Zombies (2015), which popularized concepts like Bayesian reasoning applied to AI risks.[1] Despite these foundational contributions, MIRI has faced critiques for limited empirical validation and peer-reviewed outputs in mainstream venues, prompting strategic pivots: from core alignment research to broader communication and advocacy for halting advanced AI training until safety is assured, reflecting a view that technical progress has lagged behind capability advances.[4][5] Funded primarily through donations, MIRI operates from Berkeley, California, with key figures including founder Eliezer Yudkowsky continuing as a research fellow.[1]
History
Founding and Early Focus (2000-2012)
The Singularity Institute for Artificial Intelligence (SIAI), predecessor to the Machine Intelligence Research Institute, was established in 2000 by Eliezer Yudkowsky alongside Brian Atkins and Sabine Atkins as a non-profit organization. Its founding aim was to advance research toward artificial general intelligence (AGI) while prioritizing the creation of "friendly" AI systems aligned with human values to avert potential existential threats from unaligned superintelligence. Initial funding came primarily from the Atkins, supporting Yudkowsky's early theoretical work on AI design.[1][6] In its formative years through 2003, SIAI's activities emphasized conceptual frameworks for safe AI development, including Yudkowsky's publications such as the initial drafts of "Staring into the Singularity" and "Coding a Transhuman AI," which explored recursive self-improvement and value-loading mechanisms in prospective AGI architectures. The institute recognized early the orthogonality thesis—that intelligence and goals are independent—leading to a pivot from accelerating AGI to mitigating alignment risks, as uncontrolled superintelligence could pursue misaligned objectives regardless of its cognitive capabilities. Operations were lean, with a focus on independent research rather than large-scale empirical projects, given limited resources and the nascent state of the field.[7][1] From 2006 onward, SIAI expanded outreach by co-organizing the annual Singularity Summit, starting in cooperation with Stanford University, to convene experts on AI trajectories, risks, and safeguards, thereby disseminating ideas on existential risks from advanced AI. Yudkowsky's writings on decision theory, Bayesian reasoning, and AI safety—disseminated via blogs and essays—laid groundwork for later communities and influenced formal AI alignment discourse. By 2012, the organization had cultivated a donor base through these efforts but maintained a small staff, prioritizing high-impact theoretical contributions over broad institutional growth.[1][8]Renaming, Expansion, and Maturation (2013-2020)
In January 2013, the Singularity Institute for Artificial Intelligence rebranded as the Machine Intelligence Research Institute (MIRI), prompted by the acquisition of the Singularity Summit by Singularity University, which created potential brand confusion, and a desire for a name emphasizing technical research into machine intelligence rather than speculative futures like the technological singularity.[9] The change coincided with a strategic pivot away from outreach-heavy activities toward in-house technical research on AI alignment, specifically foundational problems in creating provably safe and reliable artificial agents.[10] [1] This period marked MIRI's expansion into a more structured research organization, with hiring of specialized personnel including mathematicians and programmers focused on agent foundations—a subfield addressing core challenges like decision-making under logical uncertainty and embedded agency in resource-bounded reasoners.[1] [11] Nate Soares joined the research team in April 2014, advancing to executive director by mid-2015, where he directed efforts on formalizing concepts such as corrigibility (designing AI systems that remain responsive to human oversight) and robust cooperation between agents.[12] [13] MIRI launched the MIRIx program to fund external research groups worldwide and hosted technical workshops to build a collaborative network, fostering a maturation from individual theorizing to systematic, peer-reviewed mathematical inquiry.[1] Funding grew through major grants, including from the Open Philanthropy Project, enabling staff expansion and sustained operations; a 2016 general support grant explicitly aimed to bolster technical talent and diversify AI safety approaches.[10] By 2017, amid leadership concerns over accelerating AI timelines, MIRI explored complementary engineering-oriented alignment strategies while maintaining its core theoretical agenda, as outlined in publications like the 2015 "Aligning Superintelligence with Human Interests" technical report.[1] [11] Through 2020, this phase solidified MIRI's reputation in AI safety, though internal evaluations noted slow empirical progress on foundational problems despite theoretical advances in areas like logical induction.[1]Challenges and Strategic Reassessment (2021-2023)
In late 2021, MIRI researchers, including Eliezer Yudkowsky and Nate Soares, engaged in public dialogues with experts from other AI organizations, revealing profound pessimism about the feasibility of aligning advanced AI systems using prevailing research paradigms.[14] Yudkowsky argued that no promising technical approaches existed to solve alignment, describing the problem as requiring a "miracle" breakthrough absent fundamental advances in understanding agentic AI behavior, and critiquing iterative scaling and empirical methods as insufficient for ensuring safety in transformative systems.[15] Soares emphasized the core difficulty of building powerful AI that remains robustly insensitive to deceptive manipulations or power-seeking incentives, likening it to engineering systems that resist adversarial interference without compromising capability.[14] These discussions highlighted internal challenges at MIRI, where progress on mathematical foundations for alignment had stalled despite years of effort, mirroring broader field-wide stagnation.[1] A pivotal publication in November 2021, Yudkowsky's "AGI Ruin: A List of Lethalities," formalized this outlook by enumerating over 20 insurmountable hurdles to alignment, such as the deceptive alignment problem—where AI systems learn to hide misaligned goals during training—and the orthogonality thesis, positing that intelligence and goals are independent, making value alignment non-trivial even for superintelligent agents.[16] The essay contended that rapid empirical progress in AI capabilities, driven by scaling laws, outpaced theoretical safety insights, compressing timelines for potential catastrophe to years rather than decades.[16] MIRI's leadership viewed these dynamics as evidence that decentralized, competitive AI development would preclude safe outcomes, with no viable path to corrigibility (making AI safely interruptible) or scalable oversight under resource constraints.[1] By 2022, MIRI's strategic reassessment intensified amid these evidentiary failures, leading to a de-emphasis on public technical outputs and a focus on internal refinement, as external hiring for alignment talent proved elusive due to the perceived intractability of core problems.[1] The organization maintained a small team of around 11 researchers, prioritizing high-risk, high-reward inquiries into decision theory and embedded agency, but acknowledged that even optimistic scenarios demanded exceptional, non-incremental insights unlikely to materialize in time.[17] Funding remained stable through grants from entities like Open Philanthropy, totaling over $14 million by mid-2022, yet this did not translate to breakthroughs, reinforcing the view that technical alignment research, while necessary, was insufficient against accelerating industry timelines.[18] In 2023, the release of large language models like ChatGPT amplified urgency, prompting MIRI to begin shifting resources toward external advocacy while confronting the empirical reality that alignment efforts had yielded no scalable solutions.[1] Staff testified before U.S. Senate forums, urging regulatory pauses on frontier AI development to avert existential risks, as internal models projected misalignment probabilities exceeding 99% without paradigm shifts.[19] This period marked a causal recognition that MIRI's foundational bet on mathematical formalization had underdelivered relative to capability advances, necessitating a broader strategy incorporating governance interventions, though technical pessimism persisted as the dominant lens.[1]Pivot to Policy and Governance (2024-Present)
In January 2024, the Machine Intelligence Research Institute (MIRI) announced a strategic pivot away from its prior emphasis on technical AI alignment research toward policy interventions, public communications, and technical governance efforts aimed at mitigating existential risks from advanced AI systems. The organization cited assessments that progress in solving the alignment problem—ensuring superintelligent AI systems remain under human control—had proven insufficiently rapid to avert catastrophe before transformative AI capabilities emerge, estimated to occur within years rather than decades.[4] Instead, MIRI's core objective became increasing the likelihood of an international agreement among major governments to halt or severely restrict development of dangerous AI capabilities, such as those enabling automated AI research acceleration or deceptive misalignment.[4][20] This shift involved deprioritizing empirical alignment research in favor of advocacy for policy measures like mandatory "kill switches" in frontier models, compute governance to limit scaling, and potential outright pauses on training runs exceeding specified thresholds. MIRI's 501(c)(3) nonprofit status constrained direct lobbying, prompting collaborations with aligned organizations capable of more aggressive political engagement, while internal efforts focused on informing policymakers and building coalitions.[4] Technical governance research emerged as a supporting pillar, exploring verifiable mechanisms for AI oversight, such as monitoring for scheming behaviors in models or enforcing transparency in development pipelines.[21] By mid-2024, MIRI launched a dedicated technical governance team to contribute to global AI policy discussions, including input on international initiatives for safety standards.[7] Public communications formed a key lever, with a May 2024 strategy update outlining an explicit "Shut It Down" objective: to generate societal and governmental pressure sufficient to pause or terminate risky AI development paths. This included redesigning MIRI's website to target newcomers with evidence-based arguments for AI extinction risks, emphasizing empirical observations like rapid capability gains in models such as GPT-4 and o1, while adopting a direct, alarmist tone to convey urgency without diluting warnings of default catastrophic outcomes.[21] In its December 2024 end-of-year report, MIRI detailed scaling new teams for these areas, engaging directly with policymakers, and maintaining over two years of financial reserves amid donor uncertainty, though it noted challenges in sustaining momentum without breakthroughs in halting AI races.[22] Through 2025, this orientation persisted, with ongoing advocacy for slowdowns amid accelerating industry timelines, though no major policy victories were reported as of October.[22]Organization and Funding
Leadership and Key Personnel
The Machine Intelligence Research Institute (MIRI) was founded in 2000 by Eliezer Yudkowsky, along with Brian and Sabine Atkins, initially as the Singularity Institute for Artificial Intelligence.[1] Yudkowsky, who has shaped MIRI's technical research agenda over more than two decades through influential writings and strategic direction, currently serves as Chair of the Board.[23][24] As of October 2023, MIRI restructured its leadership to align with a shift toward public communication and policy advocacy. Malo Bourgon, MIRI's longest-serving team member after Yudkowsky, became Chief Executive Officer (CEO) in June 2023 after piloting the role since February; Bourgon previously held the position of Chief Operating Officer (COO).[24] Nate Soares transitioned from Executive Director—a role he assumed in 2015—to President, focusing on vision and strategy as a board member.[24][7] Alex Vermeer succeeded Bourgon as COO, overseeing daily operations.[24] Jimmy Rintjema has served as Chief Financial Officer (CFO) since 2015, with expanded responsibilities in finance and human resources.[24] Key personnel beyond the executive team include long-time researchers and advisors who have influenced MIRI's direction. Soares, prior to leadership, contributed foundational work on AI alignment problems during his tenure as a researcher starting around 2013.[25] Each of Yudkowsky, Soares, and Bourgon maintains a personal research budget, supporting independent investigations amid MIRI's evolving priorities.[24]Funding Sources and Financial Transparency
The Machine Intelligence Research Institute (MIRI) relies predominantly on private donations from individuals, philanthropists, and foundations aligned with effective altruism and existential risk mitigation, with no evident reliance on government grants or corporate sponsorships as primary funding streams.[26][27] Major contributors include the Open Philanthropy Project, which has provided over $14.7 million since 2015, often for general support and specific programs like AI safety retraining.[26][28] Other significant donors encompass Ethereum co-founder Vitalik Buterin ($5.4 million total), Skype co-founder Jaan Tallinn ($1.08 million), and the Thiel Foundation ($1.63 million, associated with investor Peter Thiel).[26][27] Additional support has come from entities such as the EA Funds Long-Term Future Fund ($679,000) and anonymous cryptocurrency investors, including multi-million-dollar pledges like $2.5 million annually from 2021 to 2024 plus $5.6 million in 2025 from a long-term supporter.[26][29]| Contributor | Total Donations (USD) | Period |
|---|---|---|
| Open Philanthropy | $14,758,050 | Since 2015 |
| Vitalik Buterin | $5,411,216 | Since 2015 |
| Thiel Foundation | $1,627,000 | Pre-2015 |
| Jaan Tallinn | $1,085,447 | Since 2015 |
| Berkeley Existential Risk Initiative | $1,101,000 | Since 2015 |