
Allen Institute for AI

The Allen Institute for AI (AI2) is a Seattle-based non-profit founded in 2014 by Paul G. Allen, the late co-founder of Microsoft, with the mission to contribute to humanity through high-impact AI research and engineering. Focusing on foundational AI advancements, AI2 emphasizes open-source methodologies to foster innovation in areas such as scientific discovery, climate modeling, and agentic systems for research acceleration. Key initiatives include the development of tools like Asta agents for literature search, synthesis, and data analysis, and Ai2 ScholarQA for efficient literature reviews, alongside efforts to create competitive open models that challenge proprietary systems from major tech firms. By prioritizing transparency and accessibility, AI2 addresses concerns over the closed practices dominating the AI landscape, promoting broader access to advanced technologies for societal benefit.

Founding and History

Establishment and Initial Vision

The Allen Institute for Artificial Intelligence (AI2) was founded in 2014 as a non-profit research organization by Paul G. Allen, the philanthropist and Microsoft co-founder who had previously established other scientific institutes focused on brain research and cell science. Allen, who recognized artificial intelligence's potential from an early age, aimed to create an entity capable of pursuing transformative AI advancements independent of commercial pressures. The institute was headquartered in Seattle, Washington, with Oren Etzioni, a prominent computer science researcher, appointed as its inaugural CEO to lead operations. AI2's initial vision centered on conducting high-impact AI research and engineering explicitly in service of the common good, prioritizing foundational breakthroughs over commercial product development. This mission sought to address global challenges through rigorous, reproducible AI innovations, including open-source models and tools that could accelerate scientific discovery and practical applications. Allen envisioned AI as a means to enhance human capabilities in organizing and analyzing vast amounts of information, while emphasizing the need for "common sense" reasoning in AI systems to mitigate risks and ensure safer deployment—principles that guided early projects like semantic parsing and question-answering systems. From inception, AI2 differentiated itself by committing to transparency and collaboration, releasing research outputs publicly to foster broader progress in the field, in contrast to the era's growing emphasis on closed, profit-driven AI labs. This approach reflected Allen's broader philanthropic ethos of advancing humanity through unbiased scientific inquiry, free from short-term market incentives.

Key Milestones and Expansion

The Allen Institute for Artificial Intelligence (AI2) was established in 2014 as a non-profit research organization in Seattle, Washington, by philanthropist and Microsoft co-founder Paul G. Allen, with the mission to conduct high-impact AI research and engineering for the common good. Founding CEO Oren Etzioni, a longtime computer science researcher, led the institute from its inception, overseeing early efforts in areas such as question answering and semantic parsing. Allen died in October 2018; AI2's affiliated AI2 Incubator, which fosters AI startups by providing pre-seed funding, compute resources, and mentorship, marked an expansion into applied commercialization and became independent in 2022. Notable spinouts include Kitt.ai, acquired by Baidu in 2017 for conversational AI development; Xnor.ai, acquired by Apple in 2020 for edge AI computing; and Lexion, acquired by Docusign for contract AI analysis, demonstrating the institute's role in generating over 30 startups that collectively raised more than $200 million in follow-on funding. In June 2022, Etzioni announced his departure as CEO effective September 30, transitioning to roles as board member, advisor, and technical director of the AI2 Incubator; Ali Farhadi later succeeded him as CEO in 2023, emphasizing open-source AI models and scientific applications. The incubator raised a $10 million fund in 2020 to support 12 new spinouts in domains such as deep learning, computer vision, and natural language processing, followed by a $30 million fund in 2023 and an $80 million third fund in October 2025 to back approximately 70 early-stage AI ventures focused on real-world applications. AI2 secured significant external funding in August 2025, including $75 million from the U.S. National Science Foundation and $77 million from NVIDIA, to lead a five-year initiative developing fully open AI models tailored for scientific discovery, underscoring institutional growth in collaborative infrastructure and compute resources. This builds on prior expansions, such as securing $200 million in AI compute credits in March 2024 for startups, enhancing AI2's capacity to support scalable research and development without proprietary constraints.

Organizational Structure

Leadership and Governance

The Allen Institute for AI (AI2), a non-profit organization, is led by CEO Ali Farhadi, whose appointment was announced on June 20, 2023. Farhadi, who previously led machine learning efforts at Apple, was selected for his expertise in AI research and entrepreneurship, with the board emphasizing his ability to advance AI2's mission of high-impact, open-source AI development. Prior leadership included Oren Etzioni as founding CEO from the institute's inception in 2014 until late 2022, followed by Peter Clark serving as interim CEO from 2022 to 2023. Governance at AI2 is overseen by a board of directors comprising individuals with ties to technology, venture capital, philanthropy, and academia. Key board members include a trustee of the Paul G. Allen Trust; Ana Mari Cauce, president of the University of Washington; Hope Cochran, managing director at Madrona Venture Group; Steve Hall, venture partner at Cercano Management; and Ed Lazowska, emeritus professor at the University of Washington. A corporate vice president from the technology industry also serves on the board. As a non-profit founded by the late Paul G. Allen, AI2's structure emphasizes independent research governance aligned with its core values of openness, scientific rigor, and community investment, without the profit-driven pressures typical of for-profit entities.

Teams and Research Labs

The Allen Institute for AI (AI2) structures its research efforts through specialized teams that tackle targeted challenges in artificial intelligence, often aligned with specific projects or domains such as perception, environmental applications, and model safety. These teams emphasize open-source outputs and practical impact, drawing on interdisciplinary expertise in machine learning, engineering, and design. PRIOR, or Perceptual Reasoning and Interaction Research, focuses on enabling AI systems to perceive, reason about, and interact with the physical world through advancements in computer vision, vision-language models, and embodied AI. Key contributions include the Molmo family of open multimodal models, which achieve state-of-the-art performance in visual understanding, and the Unified-IO 2 framework, a unified architecture supporting inputs like images, text, and audio for tasks ranging from image generation to navigation. Directed by Ranjay Krishna, PRIOR also develops applications such as Satlas, an AI platform analyzing satellite imagery for global monitoring of infrastructure and tree cover. The Wildlands team applies machine learning to wildland ecology and fire management, using computer vision to estimate surface fuel loads from photographs—quantifying 1-, 10-, and 100-hour fuels in kg/m²—where satellite data falls short. This work supports prescribed burns and wildfire mitigation, with tools like the Fuels Data app facilitating rapid data collection for field practitioners; the team collaborates with partners such as the Idaho Prescribed Fire Council to enhance ecosystem health and reduce catastrophic fire risks. AI2's environmental teams, including EarthRanger, Skylight, and Climate Modeling, leverage AI for planetary-scale challenges: EarthRanger provides real-time tracking for wildlife protection, Skylight detects illegal fishing using satellite and vessel data, and Climate Modeling builds efficient machine learning emulators of high-resolution climate simulations to enable faster, cheaper forecasting. These groups integrate large datasets with AI to inform conservation and policy decisions. A dedicated AI safety team examines fundamental causes of harm in large language models, developing techniques to unlearn biased or dangerous behaviors while prioritizing empirical evaluation over unverified assumptions in alignment research. Complementing these, teams behind initiatives like the OLMo open language model framework conduct foundational work in generative AI, emphasizing fully reproducible training pipelines and transparency in model architectures.
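The fuel-load estimation described above is, at its core, image regression. The sketch below is a hypothetical illustration of that general pattern, a vision backbone with a small regression head predicting the three fuel classes in kg/m²; it is not Ai2's Wildlands model, and the architecture, weights, and preprocessing are placeholders.

```python
# Hypothetical sketch only: a generic vision backbone with a regression head mapping a
# field photograph to three fuel-load estimates (1-, 10-, and 100-hour fuels, kg/m^2).
# This is not the Wildlands team's model; names and weights are placeholders.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=None)              # placeholder backbone; real work would start from pretrained weights
backbone.fc = nn.Linear(backbone.fc.in_features, 3)   # one output per fuel class

image = torch.randn(1, 3, 224, 224)                   # stand-in for a preprocessed photograph
with torch.no_grad():
    fuel_kg_per_m2 = backbone(image)                  # shape: (1, 3)
print(fuel_kg_per_m2)
```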

Core Research Areas

Language Models and Generative AI

The Allen Institute for AI (AI2) has prioritized open-source language models to facilitate scientific scrutiny and advancement in generative AI, emphasizing transparency in data, code, and model weights to address limitations in closed systems. This approach contrasts with closed models by enabling researchers to replicate, analyze, and iterate on foundational components, such as pre-training processes and evaluation metrics. AI2's efforts in this domain build on the recognition that generative models, capable of producing human-like text, require empirical validation of their capabilities and risks through accessible artifacts. A flagship project is OLMo, AI2's framework for developing fully open large language models (LLMs), first announced on May 11, 2023, with plans for a model comparable in scale to state-of-the-art systems at around 70 billion parameters. The initial release, OLMo 7B, occurred on February 1, 2024, alongside its pre-training dataset (Dolma), training code, and evaluation tools, positioning it as a state-of-the-art open LLM designed for scientific study rather than commercial deployment. Trained on curated data to minimize biases inherent in web-scraped corpora, OLMo supports generative tasks like text completion while providing transparency into model internals, such as tokenization and architecture choices. In November 2024, AI2 released OLMo 2, an upgraded family of 7B and 13B parameter models trained on up to 5 trillion tokens (later joined by a 32B model), which outperformed other fully open models of equivalent size on benchmarks for reasoning, knowledge, and generation quality. These models incorporate refinements in data curation and training efficiency, derived from analyses of prior iterations, to enhance reliability in generative outputs without proprietary optimizations. AI2's generative AI work also extends to applications in scientific discovery, where large generative models process vast datasets to hypothesize causal relationships, as explored in initiatives leveraging OLMo for domain-specific adaptations. To broaden accessibility, AI2 partnered with Google Cloud on April 8, 2025, integrating OLMo models into the Vertex AI Model Garden for scalable inference and deployment, while maintaining open licensing. This infrastructure supports empirical studies on generative AI's limitations, including hallucination rates and reasoning challenges, fostering community-driven improvements over iterative releases. Overall, AI2's contributions underscore a commitment to verifiable progress in language modeling, prioritizing reproducible evidence over opaque performance claims.
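Because OLMo weights are published openly on Hugging Face, they can be loaded with standard tooling. The snippet below is a minimal sketch, assuming the publicly listed allenai/OLMo-2-0425-1B checkpoint and a recent transformers release with OLMo 2 support; it is illustrative rather than an official Ai2 example.

```python
# Minimal sketch: generate text with a small open OLMo 2 checkpoint via Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-0425-1B"          # small OLMo 2 model; larger 7B/13B/32B variants exist
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Open language models enable", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```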

AI for Science and Applications

The Allen Institute for AI (AI2) develops AI systems to enhance scientific workflows, including literature search, data analysis, and hypothesis generation, with the aim of accelerating empirical discoveries across scientific domains. These efforts emphasize open-source tools and datasets to promote reproducibility and broad accessibility, drawing on large-scale corpora of scientific publications. Central to AI2's approach are resources for processing and querying scientific literature, such as the Semantic Scholar platform, which integrates AI for paper discovery and semantic understanding. Supporting datasets include S2ORC, comprising over 10 million English-language open-access academic papers with full text, and S2AG, encompassing metadata for more than 200 million papers including titles and abstracts. Specialized tools like olmOCR enable accurate extraction of text, tables, equations, and handwriting from PDFs while preserving document structure, facilitating downstream analysis. Ai2 ScholarQA provides question-answering capabilities that synthesize responses from multiple papers, incorporating table comparisons and citations for verifiable insights. In agentic AI applications, AI2 launched Asta on August 26, 2025, as an ecosystem featuring autonomous agents to assist researchers with tasks like literature summarization and data analysis. Asta includes a research assistant for select partners, an open-source benchmarking framework (AstaBench) with over 2,400 problems spanning literature comprehension, code execution, data analysis, and discovery workflows, and developer resources such as a 200-million-paper index via the Scientific Corpus Tool. Initial evaluations showed Asta v0 achieving 53.0% on AstaBench, using protocols like the Model Context Protocol for modular agent interactions. Broader infrastructure development is supported by the Open Multimodal AI Infrastructure to Accelerate Science (OMAI) project, funded with $152 million ($75 million from NSF and $77 million from NVIDIA) on August 14, 2025. Led by Noah A. Smith and partnering with institutions including the University of Washington, OMAI builds fully open models tailored for scientific tasks, such as data visualization and analysis, materials discovery, and protein function prediction. To rigorously assess AI performance in scientific contexts, AI2 introduced SciArena on July 1, 2025, a collaborative platform for evaluating foundation models on open-ended literature tasks through community-driven comparisons and voting. It features an Elo-rated leaderboard and a meta-evaluation benchmark (SciArena-Eval), where top evaluator models like OpenAI's o3 reached 65.1% accuracy; as of late June 2025, it hosted 23 models to foster iterative improvements in scientific AI reliability.
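Much of the literature infrastructure described above is reachable programmatically: Semantic Scholar exposes a public Graph API for paper search and metadata. The sketch below is a minimal, unofficial example of that API; the specific fields requested are illustrative, and production use would add an API key and rate-limit handling.

```python
# Minimal sketch: keyword search against the public Semantic Scholar Graph API.
import requests

resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={"query": "open language models", "fields": "title,year,citationCount", "limit": 5},
    timeout=30,
)
resp.raise_for_status()
for paper in resp.json().get("data", []):
    print(paper.get("year"), paper.get("citationCount"), paper.get("title"))
```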

Major Projects and Initiatives

OLMo: Open Language Models

OLMo is a family of open language models developed by the Allen Institute for AI (Ai2) to facilitate scientific research into large language models through complete transparency and reproducibility. Launched in February 2024, the project emphasizes releasing not only model weights and code but also the full pre-training data, training code, and detailed logs, enabling researchers to replicate and study the entire development process. The initial OLMo 7B model, with 7 billion parameters, was trained on the Dolma dataset, comprising over 3 trillion tokens sourced from diverse web crawls, academic publications, code, books, and encyclopedic materials, achieving competitive performance on benchmarks like MMLU while prioritizing openness over proprietary optimizations. Subsequent releases expanded the OLMo 2 family, starting with 7B and 13B models on November 26, 2024, trained on up to 5 trillion tokens and demonstrating superior results among fully open models, including a 9-point MMLU improvement over predecessors and outperforming equivalently sized open-weight competitors on tasks like reasoning and knowledge recall. The lineup grew to include a 32B model on March 13, 2025, which became the first fully open model to surpass GPT-3.5 on multiple evaluations, followed by a 1B model on May 1, 2025, all finetunable on single GPU nodes to lower barriers for experimentation. These models use a transformer architecture with modifications for efficiency, such as grouped-query attention, and are licensed under permissive terms allowing commercial and research use. A core innovation is OLMoTrace, introduced on April 9, 2025, which traces model outputs back to specific training data sources in real time, addressing opacity in black-box systems by linking generations to the multi-trillion-token training corpus and enabling audits for biases or factual errors. This tool supports causal analysis of model behaviors, contrasting with closed models from industry leaders that withhold training details. The project's Dolma data pipeline includes deduplication and filtering to enhance quality, with all components hosted on platforms like Hugging Face and GitHub for community verification. By design, OLMo prioritizes empirical scrutiny over scaled compute alone, fostering advancements in areas like data efficiency and interpretability.
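To make the pipeline claim concrete, the following is an illustrative sketch only: exact-hash deduplication plus a simple length filter, showing the kind of preprocessing a corpus pipeline like Dolma performs. The real open-source toolkit (github.com/allenai/dolma) implements far more, including fuzzy deduplication and quality filtering.

```python
# Illustrative preprocessing sketch: exact-duplicate removal via hashing plus a length filter.
import hashlib

def dedup_and_filter(docs, min_words=50):
    seen = set()
    for text in docs:
        digest = hashlib.sha1(text.strip().lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue                       # drop exact duplicates
        seen.add(digest)
        if len(text.split()) < min_words:
            continue                       # drop very short documents
        yield text

corpus = ["An example web document about AI.", "An example web document about AI.", "too short"]
print(list(dedup_and_filter(corpus, min_words=3)))
```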

Asta and Agentic Platforms

Asta is an open-source ecosystem developed by the Allen Institute for AI (AI2) to facilitate trustworthy agentic AI applications in scientific research, launched on August 26, 2025. It integrates an AI research assistant capable of search, synthesis, and analysis tasks, a benchmarking framework known as AstaBench, and developer resources for building and evaluating AI agents. The platform draws on a corpus exceeding 108 million scholarly abstracts and 12 million full-text papers to support evidence-based inquiry, emphasizing rigor in tracing ideas to sources and distinguishing established knowledge from hypotheses. Central to Asta's agentic capabilities is its suite of AI agents designed to assist in exploratory workflows, such as framing questions, generating insights from data, and maintaining traceability to primary sources. For instance, features like DataVoyager, introduced on October 1, 2025, enable drilling into structured datasets for exploration and analysis, addressing common challenges in handling complex scientific data. Asta agents prioritize transparency by citing underlying papers, with released data highlighting the most frequently relied-upon sources for specific queries, allowing users to verify outputs against original literature. AstaBench serves as the evaluation backbone, providing a benchmark suite and leaderboards for assessing AI agents on scientific tasks through a rigorous framework that incorporates multi-step reasoning, factual accuracy, and error detection—metrics tailored to scientific reliability rather than general-purpose benchmarks. This addresses limitations in prior evaluations by focusing on domain-specific challenges like evidence synthesis and validation. Developer resources include baseline agents, APIs, and integration tools, enabling customization and testing within the Asta ecosystem to foster iterative improvements in agentic systems. As an agentic platform, Asta represents AI2's push toward modular, open architectures where AI agents autonomously plan, execute, and refine scientific processes, contrasting with static models by incorporating feedback loops for real-world applicability. Its open licensing under permissive terms supports community contributions, aligning with AI2's broader commitment to accessible tools for accelerating discovery without proprietary barriers. Early evaluations demonstrate superior handling of scientific tasks compared to closed alternatives, though ongoing refinements are needed for edge cases in interdisciplinary science.
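AstaBench-style evaluation, running an agent over a fixed task set and scoring both accuracy and cost, can be illustrated with a small harness. The sketch below is hypothetical and does not use the AstaBench API; the item format, the toy agent, and the crude string-match scorer are all placeholders.

```python
# Hypothetical harness sketch (not the AstaBench API): score an agent callable on a few
# literature-QA items while tracking per-item cost, illustrating cost-aware agent evaluation.
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Item:
    question: str
    reference: str

def evaluate(agent: Callable[[str], Tuple[str, float]], items: List[Item]) -> dict:
    correct, total_cost = 0, 0.0
    for item in items:
        answer, cost = agent(item.question)          # agent returns (answer text, cost)
        total_cost += cost
        correct += int(item.reference.lower() in answer.lower())  # crude string-match scoring
    return {"accuracy": correct / len(items), "total_cost": total_cost}

# Toy agent with canned answers; a real agent would plan, search the corpus, and cite sources.
canned = {"Which dataset was OLMo pretrained on?": ("OLMo was pretrained on the Dolma corpus.", 0.01)}
items = [Item("Which dataset was OLMo pretrained on?", "Dolma")]
print(evaluate(lambda q: canned.get(q, ("unknown", 0.01)), items))
```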

Other Initiatives

The Allen Institute for AI has developed Semantic Scholar, a free AI-powered research tool for scientific literature that uses machine learning and natural language processing to analyze and recommend papers based on content relevance rather than citations alone. Launched publicly in November 2015, it processes millions of academic papers, providing summaries, key phrase extraction, and influence metrics to aid researchers in discovery and navigation of scholarly knowledge. By 2025, Semantic Scholar incorporates generative AI features for enhanced paper understanding and has expanded to include tools for librarians and API access for integration. Another flagship effort is Aristo, a project designed to build AI systems capable of reading scientific texts, acquiring knowledge, and answering complex questions through reasoning. Initially aimed at elementary-level comprehension, Aristo achieved a score exceeding 90% on an eighth-grade multiple-choice science exam in 2019, marking a milestone in machine understanding of scientific principles. The project has evolved to include solvers for various question types and benchmarks like SciArena for evaluating models on scientific literature tasks. The PRIOR team conducts research in perceptual reasoning and interaction, developing AI for computer vision tasks such as scene understanding, embodied agents, and human-AI collaboration. Key outputs include AI2-THOR, an interactive 3D simulation environment for training visual AI agents in household scenarios, released to support reproducible research in embodied intelligence. PRIOR's work emphasizes systems that learn from visual data to reason about actions and environments, with applications in robotics and simulation. In August 2025, AI2 launched the OMAI (Open Multimodal AI Infrastructure to Accelerate Science) initiative, a five-year, $152 million public-private partnership funded by the NSF and NVIDIA to create open multimodal models integrating text, images, and other scientific data for accelerating scientific discovery. Led in collaboration with the University of Washington's Allen School, OMAI focuses on reproducible infrastructure for fields such as climate science, providing researchers with specialized tools trained on open datasets. This effort builds on AI2's open-source ethos to enable rigorous, verifiable applications in hypothesis generation and data analysis.
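AI2-THOR ships as an open Python package, so embodied-agent experiments can be scripted directly. The snippet below is a minimal sketch based on the project's documented interface (pip install ai2thor); treat the specific scene name and metadata fields as illustrative rather than a complete tutorial.

```python
# Minimal sketch: step an agent through an AI2-THOR household scene and read its pose.
from ai2thor.controller import Controller

controller = Controller(scene="FloorPlan1")    # a kitchen scene in the iTHOR environment
event = controller.step(action="MoveAhead")    # move the agent forward; returns an Event object
print(event.metadata["agent"]["position"])     # agent position after the action
controller.stop()
```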

Open-Source Philosophy and Contributions

Model Releases and Datasets

The Allen Institute for AI (AI2) has released a series of open-source language models under the OLMo framework, beginning with initial 1B and 7B models in 2024, designed to enable full reproducibility from data preparation through training and evaluation. Subsequent OLMo 2 releases began in November 2024 with 7B and 13B models trained on up to 5 trillion tokens, later joined by a 32B model, achieving performance competitive with other fully open models of similar size while providing complete access to training data, code, and weights. Further iterations, such as OLMo-2-0425-1B in May 2025, continued this progression with updated naming conventions tied to release dates. In addition to text-based models, AI2 released the Molmo family of multimodal models on September 25, 2024, featuring vision-language capabilities that approach proprietary system performance, with weights, training code, and evaluation datasets made publicly available. Instruction-tuned variants like Tülu 3, comprising 8B and 70B parameter models, followed in November 2024, emphasizing benchmark advancements in following complex instructions; earlier Tulu models laid the groundwork for open instruction-following research. Specialized releases include OLMoASR, a suite of open speech recognition models launched August 28, 2025, available for interactive use via the Ai2 Playground. AI2 also introduced FlexOLMo in July 2025, enabling post-training data removal from models to address ownership concerns without retraining. AI2's datasets underpin these models and support broader research, with Dolma serving as the primary pretraining corpus for OLMo at 3 trillion tokens, comprising web content, academic papers, code, books, and encyclopedic sources, released openly to facilitate reproducibility. Other key datasets include WildChat for analyzing real-world user-chatbot interactions, S2ORC and S2AG for scientific papers, and ARC (the AI2 Reasoning Challenge) for reasoning tasks. The AI2 Safety Toolkit, released June 28, 2024, provides resources like WildTeaming for adversarial attack identification and WildGuard for content moderation, promoting responsible development through open safety research. These assets are hosted on platforms like Hugging Face under the allenai namespace, enabling community access and extension.
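Because these datasets are published under the allenai namespace on Hugging Face, they can be pulled with the standard datasets library. The sketch below is a minimal, unofficial example using the public allenai/ai2_arc dataset and its ARC-Challenge configuration; field names follow the dataset card.

```python
# Minimal sketch: load the ARC-Challenge validation split from the allenai namespace on Hugging Face.
from datasets import load_dataset

arc = load_dataset("allenai/ai2_arc", "ARC-Challenge", split="validation")
example = arc[0]
print(example["question"])
print(example["choices"]["text"], "->", example["answerKey"])
```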

Transparency and Licensing Approaches

The Allen Institute for AI (AI2) prioritizes transparency in its AI development by releasing comprehensive artifacts for its projects, including code, model weights, datasets, intermediate checkpoints, and frameworks, as exemplified by the OLMo framework launched in February 2024. This approach enables researchers to replicate, analyze, and build upon models, fostering empirical scrutiny of training processes and outputs. Tools like OLMoTrace, introduced in April 2025, further enhance transparency by linking model outputs in real time to specific tokens in the multi-trillion-token training data, addressing opacity in large language models. AI2's licensing strategy balances openness with risk mitigation through the ImpACT license framework, initiated in 2023, which classifies AI artifacts (e.g., datasets and models) by risk levels—low, medium, or high—and imposes tailored conditions such as usage disclosures, derivative sharing, and prohibitions on harmful applications. For core releases like the OLMo model family, AI2 employs permissive Open Source Initiative (OSI)-approved licenses, allowing broad commercial and research use of code and weights without restrictive clauses that limit derivatives. In contrast, the Dolma dataset—comprising 3 trillion tokens curated from web content, code, and publications—was initially licensed under the medium-risk ImpACT terms in August 2023, requiring attribution and share-alike conditions for derivatives, before transitioning to the ODC-BY (Open Data Commons Attribution) license in 2024, which simplifies reuse while still mandating attribution. This dual emphasis on full artifact disclosure and conditional permissiveness distinguishes AI2 from many commercial AI developers, aiming to accelerate scientific progress while enabling community-driven safety evaluations, though critics note that even permissive licenses may not fully resolve disputes over training data sourcing.

Impact and Reception

Scientific and Technological Achievements

The Allen Institute for AI (AI2) has advanced research infrastructure through innovations like Semantic Scholar, an AI-driven academic search engine launched in 2015 that indexes over 200 million scientific papers and employs natural language processing to extract key insights, generate summaries, and identify influential citations, thereby enhancing researcher efficiency across disciplines. This tool has facilitated broader access to literature, with datasets such as S2ORC—comprising over 10 million open-access papers—and S2AG, providing metadata for 200 million entries, enabling downstream research in natural language processing and knowledge discovery. In language modeling, AI2's OLMo series represents a milestone in open-source AI, with the OLMo 2 32B model, released in 2025, achieving superior performance over GPT-3.5-Turbo and GPT-4o mini on multi-skill academic benchmarks while releasing the complete training stack, including its multi-trillion-token pretraining data, to promote transparency and reproducibility. Similarly, the Molmo multimodal models have set benchmarks in vision-language tasks, outperforming proprietary counterparts in open evaluations. AI2 researchers have secured numerous best paper awards at conferences such as NeurIPS, outpacing similarly sized institutes in publication impact. AI2's AI-for-science efforts include SciArena, a 2025 benchmark evaluating large language models on scientific literature tasks, which revealed gaps in closed models' reasoning and spurred improvements in open alternatives. Tools like olmOCR for document parsing and Ai2 Paper Finder for iterative literature search further accelerate discovery by handling complex formats and mimicking human search processes. In 2025, AI2 received $152 million from the NSF and NVIDIA to develop open models tailored for scientific applications, underscoring its role in national AI infrastructure. These contributions emphasize verifiability in AI development, with full model weights, code, and data releases enabling independent verification and extension by the global research community.

Criticisms and Debates

The Delphi model, developed by the Allen Institute for AI in 2021 as a prototype for commonsense moral reasoning, drew substantial criticism for inconsistencies in its ethical judgments and embedded biases. Evaluators noted that Delphi often produced contradictory responses to similar scenarios, such as deeming it ethically worse for a man to wear women's clothing in public than for a woman to wear men's clothing, highlighting failures in consistent application of moral principles. Critics, including analyses from technology watchdogs, accused the model of exhibiting racial biases, such as harsher judgments toward actions associated with African American Vernacular English or stereotypes of minority groups, which amplified unintended discriminatory outputs. These flaws were attributed to training data drawn from crowdsourced judgments, which reflected human societal prejudices rather than objective ethics, leading researchers to later describe Delphi as a limited research tool rather than a reliable moral oracle. Subsequent evaluations underscored Delphi's limited cultural awareness and vulnerability to pervasive biases, with accuracy dropping on nuanced or culturally specific queries. For instance, the model struggled with contextual variations, such as distinguishing intent in ethical dilemmas, and was prone to overgeneralization from Reddit-sourced data, which skewed toward English-speaking online user perspectives. AI2 acknowledged these shortcomings, pivoting away from public deployment and toward internal research on ethical AI limitations, but the project's reception fueled broader skepticism about training machines on human moral data without robust debiasing mechanisms. Debates surrounding AI2's open-source philosophy, particularly with projects like OLMo, center on the trade-offs between transparency and potential misuse risks. Proponents within AI2, including CEO Ali Farhadi, argue that full openness—releasing weights, training data, and code—enables scientific scrutiny and mitigates long-term dangers by allowing collective identification and correction of flaws, contrasting with closed models' "black box" opacity. However, critics in AI safety and policy communities contend that widely available model weights lower barriers for adversarial actors to fine-tune models for harmful applications, such as generating disinformation or aiding autonomous weapons, potentially accelerating existential risks without adequate safeguards. This tension has manifested in OLMo's development, where an open-ended bug bounty in August 2024 uncovered hundreds of flaws through crowd-sourced evaluations, demonstrating openness's value in flaw detection but also exposing vulnerabilities like instability in end-task performance. While OLMo models exhibit biases and risks inherent to large models—such as perpetuating prejudices—AI2 maintains that open, empirical scrutiny outperforms closed controls, though empirical evidence on misuse rates remains contested due to underreporting in closed systems. AI2 researchers like Nathan Lambert have reflected that anticipated open-source perils, such as rapid weaponization, have not materialized at scale, attributing this to community norms rather than inherent safety. These debates underscore a core disagreement: whether openness fosters resilient, verifiable progress or invites uncontrolled proliferation of misuse.

Recent Developments

2024-2025 Advances and Partnerships

In November 2024, the Allen Institute for AI (Ai2), in collaboration with the University of Washington, released OpenScholar, a fully open retrieval-augmented language model designed for synthesizing scientific literature from over 45 million open-access papers. This system identifies relevant passages to answer complex scientific queries, achieving expert-level performance that surpasses GPT-4o in benchmarks for literature synthesis tasks, while providing transparent sourcing and reproducibility. In August 2025, Ai2 launched Asta, an open-source ecosystem featuring agentic tools for scientific research, including a research assistant, the AstaBench framework for evaluating agents in holistic scientific workflows, and developer resources to build trustworthy agents. Building on this, Ai2 introduced Asta DataVoyager on October 1, 2025, an agent enabling natural-language queries on uploaded structured datasets, generating step-by-step, reproducible analyses with full transparency into reasoning and data handling to support data-driven discoveries across disciplines. On April 8, 2025, Ai2 partnered with Google Cloud to integrate its open models—including the OLMo 32B, Tulu, and Molmo families—into the Vertex AI Model Garden, facilitating deployment via APIs and leveraging Google Cloud's infrastructure for high-performance, cost-effective access in regulated sectors and public applications. A major partnership was announced on August 14, 2025, with the U.S. National Science Foundation (NSF) and NVIDIA providing $152 million ($75 million from NSF, $77 million from NVIDIA) to Ai2 for the Open Multimodal AI Infrastructure (OMAI) project, aimed at creating a national open AI ecosystem to accelerate scientific discovery through fully transparent multimodal models. Collaborators include the University of Washington's Allen School of Computer Science & Engineering, University of Hawai’i at Hilo, University of New Hampshire, University of New Mexico, Cirrascale Cloud Services, and Supermicro, focusing on extending models like OLMo and Molmo for broader AI-for-science advancements. This initiative addresses resource gaps for academic researchers by prioritizing open infrastructure over proprietary systems.
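OpenScholar's retrieval-augmented design follows the standard pattern of embedding a query, retrieving the most similar passages, and conditioning generation on them. The skeleton below illustrates that general pattern only; it is not the OpenScholar implementation, and the embed and generate callables stand in for whatever open embedding model and open LLM a user plugs in.

```python
# Illustrative retrieval-augmented generation skeleton (not OpenScholar's code):
# rank passages by cosine similarity to the query, then answer from the top-k context.
import numpy as np

def retrieve(query_vec, passage_vecs, passages, k=3):
    sims = passage_vecs @ query_vec / (
        np.linalg.norm(passage_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    top = np.argsort(-sims)[:k]
    return [passages[i] for i in top]

def answer(question, passages, embed, generate, k=3):
    vecs = np.stack([embed(p) for p in passages])           # embed each candidate passage
    context = "\n".join(retrieve(embed(question), vecs, passages, k))
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}\nA:"
    return generate(prompt)                                  # any open LLM callable
```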
