Iceberg v3: Databricks backs open lakehouses
Databricks put Apache Iceberg v3 into public preview, signalling a push to make table formats the control plane for lakehouse portability and governance. (databricks.com) The change matters because open table formats shape multi‑engine interoperability and long‑term vendor optionality for regulated platforms. (databricks.com)
A lakehouse table format is the label on the boxes in a warehouse: the files hold the data, but the format tells every engine where the rows are, what changed, and who can read them. On April 9, 2026, Databricks put Apache Iceberg version 3 into public preview so its platform can use that shared label instead of a Databricks-only one. (databricks.com)
Apache Iceberg is an open specification run by the Apache Software Foundation, and it already lets different query engines read the same table without copying the data. The official specification says version 3 adds new metadata features on top of the earlier version 2 work for row-level changes. (iceberg.apache.org)
One of those new features is a deletion vector, which works like a strike-through list for rows inside a file. Instead of rewriting a whole data file to remove 100 bad rows, an engine can keep the file and attach a compact map of which row positions are deleted. (iceberg.apache.org)
Another new feature is row lineage, a built-in trail showing which rows came from which change. Databricks says row lineage is required for all Iceberg version 3 tables in its preview, because incremental pipelines need a reliable way to tell old rows from new ones. (learn.microsoft.com)
Iceberg version 3 also adds a VARIANT type for semi-structured data, which is the messy real-world stuff that looks more like nested shipping forms than neat spreadsheet columns. Databricks says that lets teams analyze JSON-style (JavaScript Object Notation) records without the brittle workarounds that used to sit outside the table format. (databricks.com)
The company’s bigger move is not just “we support Iceberg.” Databricks is wiring Iceberg version 3 into Unity Catalog, its governance layer, so managed Iceberg tables, foreign Iceberg tables, and some Delta Lake tables can all be governed from the same catalog. (docs.databricks.com)
That matters because Databricks built its business around Delta Lake, which is a different table format with its own transaction log. Its UniForm feature now lets Delta tables generate Iceberg metadata asynchronously, so Iceberg clients can read the same underlying Parquet files without a data rewrite. (docs.databricks.com)
In plain English, Databricks is trying to make the metadata layer the control plane and the storage files the common ground. If that works, a company can write data once, keep one copy of the Parquet files, and let different engines read it through Iceberg-compatible metadata. (docs.databricks.com)
The timing also shows where the market has moved. Databricks announced full Apache Iceberg support in public preview about 10 months ago, and this April 2026 release pushes further by adopting the newest Iceberg specification instead of treating Iceberg as a side door. (databricks.com, databricks.com)
The catch is that this is still preview software with version gates. Databricks' AWS documentation says Iceberg version 3 features need Unity Catalog and Databricks Runtime 18.0 or above, while its Azure documentation lists Databricks Runtime 17.3 or above, so teams still need to check cloud-specific support before betting production systems on it. (docs.databricks.com, learn.microsoft.com)
For banks, insurers, and other regulated shops, the appeal is simple: governance rules can stay in one catalog while compute engines change over time. For Databricks, the bet is just as simple: if open table formats become the operating system of the lakehouse, the company wants to be the place that manages them. (databricks.com, databricks.com)