Databricks Acquires Tabular for $2B to Unify AI and Data Lakehouse

Michael Ouroumis

Databricks has acquired Tabular, the commercial company behind the Apache Iceberg open table format, for approximately $2 billion. The deal brings the most widely adopted open lakehouse format under Databricks' roof and signals the company's intent to own the full stack from data storage to AI model training.

Why Iceberg Matters

Apache Iceberg has won the data lakehouse format war. Over the past two years, it has been adopted by AWS, Google Cloud, Snowflake, and dozens of other platforms as the standard way to store and manage large-scale analytical data. Iceberg provides warehouse-like features on top of object storage such as Amazon S3: ACID transactions, schema evolution, time travel queries, and efficient partition pruning.
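
To make two of those features concrete, here is a minimal Python sketch of how snapshot-based time travel and partition pruning can work in principle. It is a toy model, not Iceberg's implementation or API: the names ToyTable, Snapshot, and DataFile are hypothetical, rows live in memory, and real Iceberg tracks files through metadata files and manifests on object storage.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFile:
    """One immutable data file, tagged with the partition value it holds."""
    path: str
    partition: str   # e.g. an event date such as "2024-06-01"
    rows: tuple      # rows are kept in memory only for this toy example


@dataclass(frozen=True)
class Snapshot:
    """An immutable list of the data files that make up one table version."""
    snapshot_id: int
    files: tuple


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table: an append-only log of snapshots."""
    snapshots: list = field(default_factory=lambda: [Snapshot(0, ())])

    def append(self, data_file: DataFile) -> int:
        # A commit never rewrites an old snapshot. It adds a new snapshot
        # that references the previous files plus the new one.
        current = self.snapshots[-1]
        new = Snapshot(current.snapshot_id + 1, current.files + (data_file,))
        self.snapshots.append(new)
        return new.snapshot_id

    def scan(self, partition=None, snapshot_id=None):
        # Time travel: read any earlier snapshot by id (default: latest).
        # In this toy, snapshot ids double as list indices.
        if snapshot_id is None:
            snap = self.snapshots[-1]
        else:
            snap = self.snapshots[snapshot_id]
        rows, files_read = [], 0
        for f in snap.files:
            # Partition pruning: skip files whose partition cannot match.
            if partition is not None and f.partition != partition:
                continue
            files_read += 1
            rows.extend(f.rows)
        return rows, files_read


table = ToyTable()
v1 = table.append(DataFile("f1.parquet", "2024-06-01", (("a", 1),)))
v2 = table.append(DataFile("f2.parquet", "2024-06-02", (("b", 2),)))

print(table.scan())                        # latest version, both files read
print(table.scan(snapshot_id=v1))          # table as of the first commit
print(table.scan(partition="2024-06-02"))  # only the matching file is read
```

Because old snapshots are never modified, a reader pinned to an earlier snapshot sees a consistent table even while new commits land, which is the intuition behind combining time travel with transactional writes.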

Tabular, founded by the original Iceberg creators Ryan Blue, Dan Weeks, and Jason Reid, built a managed Iceberg service that simplifies catalog management, access control, and cross-engine compatibility. The company had raised $62 million and had approximately 200 enterprise customers.

Strategic Logic

For Databricks, the acquisition fills a critical gap. The company's own Delta Lake format has competed with Iceberg for years, and while both formats have strong adoption, Iceberg has become the industry-preferred open standard. Rather than continue fighting a format war, Databricks is embracing Iceberg.

"The format debate is over," said Ali Ghodsi, Databricks CEO. "Iceberg won the open format standard. We're going all-in on making Databricks the best platform for Iceberg data."

Databricks announced that its platform will support Iceberg as a first-class citizen alongside Delta Lake. Existing Delta Lake users will have a migration path to Iceberg, and new customers will be able to choose either format. The company expects most new deployments to use Iceberg.

AI Integration

The deeper strategic play is about AI. Training large language models and building AI applications requires access to large, well-organized datasets. Iceberg's ability to manage petabyte-scale data with efficient versioning and access patterns makes it an ideal foundation for AI data pipelines.

Databricks plans to integrate Tabular's catalog technology with its Mosaic AI platform, allowing data scientists to point AI training jobs directly at Iceberg tables without complex ETL pipelines. The goal is a single-platform workflow in which data engineers prepare data in Iceberg and AI engineers consume it for model training.

Snowflake's Response

The acquisition puts pressure on Snowflake, which had partnered with Tabular and adopted Iceberg as a supported format. Snowflake now faces the prospect of relying on Iceberg infrastructure controlled by its primary competitor. Snowflake said in a statement that it remains committed to Iceberg and that the open-source nature of the project protects its independence.

Open Source Concerns

The data engineering community has raised concerns about whether Databricks will continue to invest in Iceberg as a truly open project. Databricks addressed this preemptively, committing to keep Iceberg an Apache Software Foundation project with open governance. The company pledged to increase its contributions to the project and not to create proprietary extensions that fragment the standard.

Whether this commitment holds will be tested over time. The history of corporate stewardship of open-source projects is mixed, and the community will be watching closely.
