Databricks
Databricks, Inc. is an American software company headquartered in San Francisco, California. It was founded in 2013 by seven UC Berkeley researchers (Ali Ghodsi, Ion Stoica, Matei Zaharia, Patrick Wendell, Reynold Xin, Andy Konwinski, and Arsalan Tavakoli-Shiraji), the original creators of the open-source Apache Spark project.[1][2] The company provides the Databricks Data Intelligence Platform, a unified, cloud-based analytics platform that integrates data engineering, machine learning, and AI capabilities on an open lakehouse architecture, combining the reliability of data warehouses with the flexibility of data lakes.[3][4] The platform builds on open-source technologies developed by its founders, including Delta Lake for reliable data lakes, MLflow for machine learning lifecycle management, and Unity Catalog for data governance.[3]

Databricks launched its cloud platform in 2014 and has since expanded to serve over 20,000 organizations worldwide, including more than 60% of Fortune 500 companies and customers such as Block, Comcast, and Shell.[5] The company's stated mission is to democratize data and AI, enabling organizations to simplify complex data workflows and accelerate AI-driven insights through features like natural language data discovery and automated AI model deployment.[3]

As of September 2025, Databricks had reached a $4 billion annual revenue run-rate, with AI-specific revenue exceeding $1 billion, reflecting over 50% year-over-year growth and a net retention rate above 140%.[5] In the same month, Databricks raised $1 billion in its Series K round at a valuation exceeding $100 billion, up from $62 billion the previous year, to fund AI products such as Agent Bricks for agentic AI applications and Lakebase for AI-optimized databases, as well as global expansion and acquisitions.[6][7] With over 5,000 global partners and a focus on open standards, Databricks serves enterprise customers across industries such as finance, healthcare, and manufacturing.[8]

History
Founding and Early Development (2013-2021)
Databricks was founded in 2013 in San Francisco by the original creators of Apache Spark from the University of California, Berkeley's AMPLab, including Ali Ghodsi, Ion Stoica, Matei Zaharia, Patrick Wendell, Reynold Xin, Andy Konwinski, and Arsalan Tavakoli-Shiraji.[1][3] The company emerged from efforts to commercialize Spark, an open-source unified analytics engine for large-scale data processing, with an initial emphasis on building a cloud-based platform to simplify data engineering, analytics, and machine learning workflows.[9][10] This unified analytics platform, centered on Apache Spark, gave data teams a collaborative environment for processing and analyzing massive datasets without managing the underlying infrastructure, while the company contributed enhancements back to Spark and related open-source projects.[9]

In its early years, Databricks introduced key open-source tools to address challenges in data reliability and machine learning operations. Delta Lake, launched in October 2017 as a proprietary storage layer and open-sourced in April 2019, provided ACID transactions, scalable metadata handling, and unified batch and streaming data processing to make data lakes more reliable and performant for analytics workloads.[11][12] Similarly, MLflow was introduced in June 2018 as an open-source platform to manage the end-to-end machine learning lifecycle, including experiment tracking, code packaging for reproducible runs, and model deployment, helping teams standardize workflows across diverse environments.[13]

Databricks also expanded its cloud integrations to broaden accessibility. It partnered with Microsoft in November 2017 to launch Azure Databricks, a fully managed service integrating Spark-based analytics directly into the Azure ecosystem for enterprise-scale data processing.[14] A partnership with Google Cloud followed in February 2021, enabling customers to run Databricks workloads on Google Kubernetes Engine and integrate with services like BigQuery for lakehouse architectures.[15]

By 2021, the platform served more than 5,000 organizations worldwide.[16] That same year, Databricks was ranked #59 on Fortune's Best Large Workplaces for Millennials list, based on employee feedback highlighting its inclusive culture and innovative environment.[17]

Expansion and Innovation (2022-Present)
From 2022, Databricks accelerated its growth by deepening its focus on AI integration and enterprise-scale data solutions, building on its foundational Apache Spark technology to address emerging demands in generative AI and unified analytics. The company's valuation reached $43 billion in September 2023 following a Series I funding round that raised over $500 million, led by T. Rowe Price with participation from NVIDIA and Capital One. This valuation reflected Databricks' expanding role in the AI ecosystem, as enterprises increasingly adopted its platform for data-driven AI applications. By December 2024, a $10 billion Series J funding round (primarily non-dilutive financing for employee liquidity and strategic investments) had raised the company's valuation to $62 billion.

In November 2023, Databricks launched the Data Intelligence Platform, which unified data management, AI capabilities, and governance into a single lakehouse-based architecture, enabling organizations to build and deploy AI agents securely over enterprise data. The platform incorporated generative AI features, such as semantic understanding of data assets, to streamline workflows from data ingestion to model serving. In March 2024, Databricks released DBRX, an open-source large language model developed using its Mosaic AI tools, which uses a mixture-of-experts architecture and outperformed models like Llama 2 in key evaluations. These advancements were complemented by strategic partnerships, including a March 2025 multi-year collaboration with Anthropic to integrate Claude models natively into the platform, allowing over 10,000 customers to develop AI agents with enhanced reasoning and safety features directly on their data.

Databricks' expansion extended to substantial investments in infrastructure and talent, exemplified by a $1 billion commitment in March 2025 to San Francisco, covering an expanded headquarters at One Sansome Street and multi-year hosting of its Data + AI Summit, projected to draw up to 50,000 attendees by 2030. Revenue grew in step: the company reported $1.6 billion in revenue for fiscal year 2024 (ended January 31, 2024) and reached an annual run-rate of $3 billion by December 2024, driven by over 50% year-over-year expansion in AI and analytics adoption.[18][19] By September 2025, the company had surpassed a $4 billion annual revenue run-rate, with more than $1 billion attributed to AI products, net revenue retention above 140%, and over 650 customers spending more than $1 million annually. In September 2025, a $1 billion Series K round lifted its valuation beyond $100 billion, funding AI research, acquisitions, and global scaling.[5]

Business Developments
Funding and Valuation
Databricks has secured substantial financing since its inception, amassing over $15 billion in total capital through equity rounds and debt facilities by late 2025.[20] This funding has supported the company's expansion in data analytics and AI technologies, with investments reflecting strong investor confidence in its lakehouse architecture and AI-driven growth.[21] The company's funding history includes several landmark equity rounds, detailed in the following table:

| Date | Round | Amount Raised | Post-Money Valuation | Key Investors |
|---|---|---|---|---|
| September 2013 | Series A | $14 million | Not disclosed | Andreessen Horowitz |
| October 2019 | Series F | $400 million | $6.2 billion | Andreessen Horowitz, Tiger Global |
| February 2021 | Series G | $1 billion | $28 billion | Franklin Templeton, Amazon Web Services |
| September 2023 | Series I | Over $500 million | $43 billion | T. Rowe Price, NVIDIA |
| December 2024 | Series J | $10 billion | $62 billion | Thrive Capital, Andreessen Horowitz, NVIDIA |
| September 2025 | Series K | $1 billion | Over $100 billion | Thrive Capital, GIC |