Databricks previews Iceberg v3

Databricks put its support for Apache Iceberg v3 into public preview, pitching it as a way to make lakehouse storage more interoperable and to avoid costly rewrites of data pipelines. (databricks.com) The release is being touted as a hedge against vendor lock-in for teams that need cross-engine queries and long-term portability of governed data assets. (databricks.com)

Most company data lakes look open until you try to move them. The files sit in cloud storage, but the rules about updates, deletes, and table history often live in one vendor’s format, so switching engines can mean rewriting pipelines that took years to build. (iceberg.apache.org)

Apache Iceberg is the part that writes those rules down. It is an open table format, which means it keeps a shared map of what files belong to a table, what changed, and which version readers should see, instead of making every query engine guess from folder names. (iceberg.apache.org)

That map matters because cloud object storage is just a pile of files. Iceberg adds database-like behavior on top of that pile, including schema changes, time travel, and transactions, so engines like Apache Spark, Trino, and others can read the same table without inventing separate copies. (iceberg.apache.org)

Iceberg has had three spec versions so far, and version 3 is now marked complete and adopted by the community. Databricks said on April 9, 2026 that its support for Iceberg version 3 had entered public preview. (iceberg.apache.org) (databricks.com)

The new pieces in Iceberg version 3 are aimed at a very specific headache: changing huge tables without constantly rewriting them. Databricks highlighted three of them in preview support: deletion vectors, row lineage, and a Variant type for semi-structured data. (databricks.com) (docs.databricks.com)

Deletion vectors are like a strike-through list for rows. Instead of rewriting large data files every time a few records are deleted or updated, the table can keep a compact record of which rows to skip, which cuts write amplification and speeds up ingestion and extract, transform, and load jobs. (aws.amazon.com) (databricks.com)

Row lineage is a source tag for each row. It lets teams trace which records were added by which operation, which makes incremental processing and audit trails easier when multiple jobs are touching the same table. (aws.amazon.com) (databricks.com)

Variant is a built-in way to store semi-structured data like JavaScript Object Notation documents without flattening every field into columns first. Databricks says that brings Iceberg closer to the kind of mixed structured and nested data workloads that teams already run in modern analytics systems. (databricks.com 1) (databricks.com 2)

The part Databricks is pushing hardest is that these features are in the open specification, not just in one company’s private layer. Its April 2026 announcement says the same Iceberg version 3 capabilities can be used across managed Iceberg tables, foreign Iceberg tables, and managed Delta Lake tables with UniForm, which is Databricks’ compatibility layer for open formats. (databricks.com) (docs.databricks.com)

That is why this preview is really a fight over lock-in. If deletion tracking, row tracking, and semi-structured fields work through an open spec, a company can keep one governed copy of data in object storage and query it from more than one engine, instead of paying the tax of format conversion every time strategy changes. (databricks.com 1) (databricks.com 2)

The timing also says something about the market. Databricks documentation says Iceberg version 3 features require Databricks Runtime 18.0 or above in preview, while other vendors including Amazon Web Services, Google, Snowflake, and Dremio have all been publishing Iceberg version 3 support plans or documentation over the past year, which suggests the format war is shifting from “whose table format wins” to “how much of the same table can everyone read.” (docs.databricks.com) (aws.amazon.com) (opensource.googleblog.com) (snowflake.com) (markets.businessinsider.com)
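For readers who want to see the "strike-through list" idea behind deletion vectors in concrete terms, here is a toy sketch in Python. It is not Iceberg or Databricks code: the class names, the file name, and the sample rows are all made up for illustration, and a plain Python set stands in for the compressed bitmap a real table format would store. The point is only that a delete records a row position to skip, while the data file itself is left untouched.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for an immutable data file in object storage."""
    path: str
    rows: tuple  # rows are never edited in place


@dataclass
class DeletionVector:
    """Hypothetical strike-through list: row positions readers should skip."""
    deleted_positions: set = field(default_factory=set)

    def mark_deleted(self, position: int) -> None:
        # A delete only records the position; no data file is rewritten.
        self.deleted_positions.add(position)


def read_live_rows(data_file: DataFile, dv: DeletionVector) -> list:
    """Return the rows a reader should see after applying the skip list."""
    return [
        row
        for position, row in enumerate(data_file.rows)
        if position not in dv.deleted_positions
    ]


# Illustrative data only.
orders = DataFile(
    path="orders-0001.parquet",
    rows=(
        {"id": 1, "status": "shipped"},
        {"id": 2, "status": "cancelled"},
        {"id": 3, "status": "shipped"},
    ),
)

dv = DeletionVector()
dv.mark_deleted(1)  # "delete" the row at position 1 (id 2)

live_rows = read_live_rows(orders, dv)
print([row["id"] for row in live_rows])
print(len(orders.rows))  # the original file still holds every row
```

The trade-off the sketch hints at is that writes get cheaper because nothing is rewritten, while readers do a little extra work to filter out the struck-through positions.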
