Apache Iceberg vs Delta Lake — Complete 2026 Guide
Architecture differences, Delta UniForm, Iceberg v3 Public Preview on Databricks, the convergence roadmap toward Delta 5.0, OneLake Iceberg REST Catalog support, catalog ecosystem, and the 2026 decision framework — verified against official documentation.
Apache Iceberg and Delta Lake are both open lakehouse table formats that store data in Parquet on object storage with ACID transactions — the difference is metadata strategy and ecosystem. In 2026, the most significant development is convergence: Iceberg v3 (Public Preview on Databricks Runtime 18.0+, April 2026) adopts Delta Lake features — Deletion Vectors, Row Lineage, VARIANT type. Databricks is proposing that Delta 5.0 adopt Iceberg’s adaptive metadata tree. Delta UniForm lets Delta tables be read as Iceberg by Snowflake, BigQuery, Trino, and Dremio from a single copy of Parquet. Microsoft Fabric OneLake now exposes an Iceberg REST Catalog endpoint. The table format war is not over, but the two formats are closer together than at any prior point.
How Each Format Works — Architecture
Both Apache Iceberg and Delta Lake sit between cloud object storage and query engines in the lakehouse stack. Both store table data as Parquet files. The choice between them is a choice about how metadata is managed, how transactions are coordinated, and how many engines can access the same data without conflicts.
Apache Iceberg — Snapshot and Manifest Tree
Iceberg represents a table as a series of snapshots. Each snapshot points to a manifest list, which references one or more manifest files that record the actual data files with partition statistics. A table metadata file tracks the current snapshot, schema, and partition spec. A catalog (Hive, AWS Glue, Nessie, Polaris, Lakekeeper, or a REST catalog) tracks which metadata file is current for each table.
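The snapshot, manifest list, manifest, and data file hierarchy can be sketched as a small in-memory model. This is an illustrative toy, not the Iceberg library API: the class names, fields, and the `plan_files` helper are assumptions made for the example, and real manifests are Avro files carrying much richer statistics.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DataFile:
    path: str
    partition: str
    record_count: int  # stand-in for the per-file statistics a manifest records


@dataclass(frozen=True)
class Manifest:
    files: Tuple[DataFile, ...]


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    manifest_list: Tuple[Manifest, ...]  # the manifests that make up this snapshot


@dataclass
class TableMetadata:
    snapshots: List[Snapshot] = field(default_factory=list)
    current_snapshot_id: Optional[int] = None

    def snapshot_as_of(self, timestamp_ms: int) -> Optional[Snapshot]:
        """Return the latest snapshot committed at or before timestamp_ms."""
        candidates = [s for s in self.snapshots if s.timestamp_ms <= timestamp_ms]
        return max(candidates, key=lambda s: s.timestamp_ms) if candidates else None


def plan_files(snapshot: Snapshot, partition: Optional[str] = None) -> List[str]:
    """Walk snapshot -> manifest list -> manifests -> data files."""
    paths = []
    for manifest in snapshot.manifest_list:
        for data_file in manifest.files:
            if partition is None or data_file.partition == partition:
                paths.append(data_file.path)
    return paths


m1 = Manifest(files=(DataFile("data/a.parquet", "2026-01", 100),))
m2 = Manifest(files=(DataFile("data/b.parquet", "2026-02", 50),))
s1 = Snapshot(snapshot_id=1, timestamp_ms=1_000, manifest_list=(m1,))
s2 = Snapshot(snapshot_id=2, timestamp_ms=2_000, manifest_list=(m1, m2))
table = TableMetadata(snapshots=[s1, s2], current_snapshot_id=2)

current = next(s for s in table.snapshots if s.snapshot_id == table.current_snapshot_id)
print(plan_files(current))                      # files visible in the current snapshot
print(plan_files(table.snapshot_as_of(1_500)))  # files visible at an earlier point in time
```

Note that the second snapshot reuses the first manifest unchanged. Each commit only adds new metadata on top of what already exists, which is why older snapshots stay queryable until they are expired.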
The catalog is what allows multiple engines to coordinate writes. Because the catalog mediates which snapshot is “current”, Spark, Flink, Trino, and Dremio can all commit to the same Iceberg table with consistent optimistic concurrency control using the same catalog API — regardless of which engine committed last.
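A minimal sketch of that coordination, assuming the catalog offers an atomic "swap the current metadata pointer if it still equals what I read" operation. `ToyCatalog`, `CommitConflict`, and `commit_with_retry` are hypothetical names for illustration only; real catalogs expose this behaviour through their own APIs.

```python
import threading


class CommitConflict(Exception):
    """Raised when another writer moved the table pointer first."""


class ToyCatalog:
    """In-memory stand-in for a catalog that tracks one metadata pointer per table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pointers = {}

    def current(self, table):
        return self._pointers.get(table)

    def swap(self, table, expected, new):
        # Atomic compare-and-swap: succeed only if nobody committed in between.
        with self._lock:
            if self._pointers.get(table) != expected:
                raise CommitConflict(f"{table}: expected {expected!r}")
            self._pointers[table] = new


def commit_with_retry(catalog, table, make_metadata, max_attempts=3):
    """Optimistic concurrency: re-read the base and retry when the swap fails."""
    for _ in range(max_attempts):
        base = catalog.current(table)
        new = make_metadata(base)
        try:
            catalog.swap(table, base, new)
            return new
        except CommitConflict:
            continue
    raise CommitConflict(f"{table}: gave up after {max_attempts} attempts")


catalog = ToyCatalog()
catalog.swap("db.events", None, "v1.metadata.json")

stale_base = catalog.current("db.events")                          # writer A reads v1
catalog.swap("db.events", "v1.metadata.json", "v2.metadata.json")  # writer B commits first

try:
    catalog.swap("db.events", stale_base, "v2-from-a.metadata.json")
    conflict = False
except CommitConflict:
    conflict = True  # writer A must rebase on v2 and try again


def next_version(base):
    number = int(base[1:].split(".")[0]) + 1
    return f"v{number}.metadata.json"


final = commit_with_retry(catalog, "db.events", next_version)
print(conflict, final)
```

Because the check happens in the catalog rather than in any one engine, every writer sees the same rule regardless of which engine it runs in.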
Delta Lake — Append-Only Transaction Log
Delta Lake records every change in a _delta_log directory inside the table’s storage location. Each commit is a JSON file describing what changed. Periodic Parquet checkpoint files summarise the state so engines do not replay the entire log on every query open. No external catalog is required to read the current state — the log is self-describing.
This self-describing property is what makes Delta Lake easy to get started with: point any Delta-aware engine at a storage path and it can read the table without catalog registration. The trade-off is that multi-engine writes require understanding the log’s conflict detection rules — most fully implemented in Spark and Databricks runtimes.
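To make the log-replay idea concrete, here is a simplified sketch that writes a few commit files and reconstructs the set of active data files from them. It assumes only the `add` and `remove` actions with a `path` field. Real commits carry more actions and fields (protocol, metadata, file sizes, statistics), and real readers start from the latest Parquet checkpoint rather than from version 0. The helper names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_commit(log_dir: Path, version: int, actions: list) -> None:
    # Commit files are newline-delimited JSON, named by zero-padded version number.
    path = log_dir / f"{version:020d}.json"
    path.write_text("\n".join(json.dumps(action) for action in actions))


def replay(log_dir: Path, as_of_version=None) -> set:
    """Replay commits in order and return the data files that are still active."""
    active = set()
    for commit in sorted(log_dir.glob("*.json")):
        version = int(commit.stem)
        if as_of_version is not None and version > as_of_version:
            break
        for line in commit.read_text().splitlines():
            action = json.loads(line)
            if "add" in action:
                active.add(action["add"]["path"])
            elif "remove" in action:
                active.discard(action["remove"]["path"])
    return active


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "_delta_log"
    log_dir.mkdir()
    write_commit(log_dir, 0, [{"add": {"path": "part-0.parquet"}}])
    write_commit(log_dir, 1, [{"add": {"path": "part-1.parquet"}}])
    write_commit(log_dir, 2, [
        {"remove": {"path": "part-0.parquet"}},
        {"add": {"path": "part-2.parquet"}},
    ])
    latest = replay(log_dir)
    version_1 = replay(log_dir, as_of_version=1)

print(sorted(latest))
print(sorted(version_1))
```

Stopping the replay at an earlier version is also the basic mechanism behind version-based time travel, as long as the referenced Parquet files have not been vacuumed.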
A key source of confusion in the Iceberg vs Delta Lake debate: both formats store data in Parquet files. The difference is entirely in the metadata layer above the Parquet files. Delta UniForm exploits this — by generating Iceberg metadata alongside the existing Delta _delta_log, a single set of Parquet files can serve both Delta and Iceberg readers without any data copying or rewriting. Source: Databricks documentation — Iceberg reads on Delta Lake.
Feature Comparison — 2026
| Dimension | Apache Iceberg | Delta Lake |
|---|---|---|
| Metadata architecture | Snapshot → manifest list → manifest files → Parquet data. Catalog tracks current metadata file. | Append-only _delta_log with JSON commits + periodic Parquet checkpoints. Self-describing, no external catalog required to read. |
| ACID transactions | Optimistic concurrency via catalog. Multiple engines can commit with consistent semantics through the catalog API. | Optimistic concurrency via transaction log. Most mature in Spark and Databricks runtimes. Other engines can read; full write semantics best in Spark. |
| Schema evolution | Columns are tracked by unique field IDs, so rename and reorder are safe. Widest range of schema changes without breaking readers. | Schema evolution supported. More restrictive for certain changes (column reorder, type widening). Closely tied to Spark semantics. |
| Time travel | Query by snapshot ID or timestamp. Snapshots retained until explicitly expired. | Query by version number or timestamp mapped to transaction log position. VACUUM removes old Parquet files outside retention window. |
| Deletion Vectors | Iceberg v3 adds native Deletion Vectors (Public Preview on Databricks Runtime 18.0+, April 2026). Iceberg v2 handles row-level deletes with position and equality delete files. | GA in Databricks since 2023. Efficient row-level deletes without full file rewrites. Cannot be enabled simultaneously with UniForm IcebergCompatV2 (incompatibility being resolved by Iceberg v3). |
| Multi-engine write support | First-class. Spark, Flink, Trino, Dremio, Snowflake all write through the same catalog API with consistent concurrency semantics. | Read widely supported. Write most mature in Spark/Databricks. Writes to UniForm-enabled tables require Databricks Runtime 14.3 LTS or above. |
| Streaming | Supported in Spark and Flink. Delta Live Tables more mature for Databricks streaming pipelines. | Very strong in Spark and Databricks. Delta Live Tables provide declarative streaming ETL with managed infrastructure. |
| Catalog dependency | Required for multi-engine writes. REST catalog (Polaris, Lakekeeper, Nessie, AWS Glue, Unity Catalog) needed for coordination. | Not required for reads. Recommended (Unity Catalog, Hive Metastore) for full governance and cross-engine access. |
| Vendor governance | Apache Software Foundation. Vendor-neutral specification. Multiple independent implementations. | Linux Foundation (Delta Lake project). Deepest optimisations remain in Databricks managed runtimes. |
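The schema evolution row is easier to see with a toy model. The sketch below assumes data files store values keyed by a numeric field ID and that the table schema maps IDs to the current column names. The dictionaries and the `read` helper are illustrative assumptions, not any library's API.

```python
def read(rows, schema):
    """Project stored rows through the current schema, resolving columns by field ID."""
    return [{name: row.get(field_id) for field_id, name in schema.items()} for row in rows]


# A data file written under the original schema: values are keyed by field ID.
stored_rows = [{1: 42, 2: "2026-01-01"}]
schema_v1 = {1: "user_id", 2: "ts"}

# Rename: only the name attached to ID 2 changes; the data file is untouched.
schema_v2 = {1: "user_id", 2: "event_time"}

# Reorder: the ID-to-name mapping is the same, only the column order differs.
schema_v3 = {2: "event_time", 1: "user_id"}

# Drop "event_time", then add a new column called "ts" with a fresh ID.
# Old values stored under ID 2 are not resurrected under the reused name.
schema_v4 = {1: "user_id", 3: "ts"}

print(read(stored_rows, schema_v1))
print(read(stored_rows, schema_v2))
print(read(stored_rows, schema_v3))
print(read(stored_rows, schema_v4))
```

Name-based resolution would have returned the old `ts` values for the new `ts` column in the last case, which is the class of bug that ID-based tracking avoids.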
The 2026 Convergence — Iceberg v3, Delta 5.0 Proposals
The biggest development in the table format landscape in 2026 is not a new winner — it is convergence. The two formats are adopting each other’s best features, driven primarily by Databricks, which now supports both Delta Lake and managed Iceberg tables natively in Unity Catalog.
Iceberg v3 — Public Preview on Databricks (April 2026)
Iceberg v3 on Databricks is in Public Preview, available on Databricks Runtime 18.0+ with Unity Catalog enabled. Iceberg v3 adopts Deletion Vectors, Row Lineage, and VARIANT natively — customers no longer face a trade-off between Delta Lake performance features and Iceberg compatibility. The result is a single copy of data that serves every engine in your stack, with no replication pipelines to maintain or risk of drift.
- Deletion Vectors — efficient row-level deletes matching Delta Lake’s approach, closing the biggest feature gap between the two formats on Databricks
- Row Lineage — track row-level provenance through transformations, enabling auditable data lineage at the row level
- VARIANT type — native semi-structured JSON data type, matching the VARIANT support in Delta Lake for handling schemaless and evolving JSON payloads
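As a rough illustration of how deletion vectors avoid file rewrites, the toy model below keeps a set of deleted row positions per data file and applies it at read time. The `ToyTable` class is a hypothetical stand-in. Production implementations store the positions as compressed bitmaps rather than Python sets, and data files are Parquet rather than lists.

```python
from typing import Callable, Dict, List, Set


class ToyTable:
    def __init__(self, files: Dict[str, List[str]]):
        self.files = files                               # immutable "data files"
        self.deletion_vectors: Dict[str, Set[int]] = {}  # file path -> deleted row positions

    def delete_where(self, predicate: Callable[[str], bool]) -> None:
        # Record positions of matching rows instead of rewriting the file.
        for path, rows in self.files.items():
            hits = {pos for pos, row in enumerate(rows) if predicate(row)}
            if hits:
                self.deletion_vectors.setdefault(path, set()).update(hits)

    def scan(self) -> List[str]:
        # Readers skip any position marked in the file's deletion vector.
        result = []
        for path, rows in self.files.items():
            deleted = self.deletion_vectors.get(path, set())
            result.extend(row for pos, row in enumerate(rows) if pos not in deleted)
        return result


table_with_dv = ToyTable({
    "part-0.parquet": ["alice", "bob", "carol"],
    "part-1.parquet": ["dave"],
})
table_with_dv.delete_where(lambda row: row == "bob")

print(table_with_dv.scan())
print(table_with_dv.deletion_vectors)
```

The delete touches only a small side structure. The original file still physically contains the deleted row until a later compaction rewrites it.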
Delta 5.0 and Iceberg v4 — Convergence Roadmap
Databricks is proposing that the next version of Delta, Delta 5.0, adopt the adaptive metadata tree, the redesign of the core metadata structure proposed for Iceberg v4 to improve performance, scalability, and interoperability. The adaptive metadata tree simplifies metadata so that most operations require writing only a single file instead of several. If the proposal is adopted, Delta and Iceberg metadata would converge on the same foundation, and all managed tables in Unity Catalog would be automatically optimised, governed through open APIs, and available to any engine.
The practical implication of Iceberg v3 + Delta UniForm convergence is that the Iceberg vs Delta choice is becoming a catalog and ecosystem choice rather than a feature capability choice. If you are on Databricks, you write Delta and expose it as Iceberg — you get both. If you are on Snowflake, BigQuery, or a REST catalog, you write Iceberg and it is queryable by Delta-aware engines. The question in 2026 is less “which format is better” and more “which catalog and governance layer does your organisation commit to” — because that is where the meaningful lock-in now lives.
Delta UniForm — One Copy, Two Formats
Delta UniForm (the feature name used in marketing; formally called “Iceberg reads on Delta Lake” in Databricks documentation) allows Delta Lake tables in Unity Catalog to be read by external Iceberg clients without rewriting or copying the underlying Parquet data. When enabled, Databricks asynchronously generates Iceberg metadata alongside the existing Delta transaction log. A single set of Parquet files serves both Delta readers (Spark, Databricks) and Iceberg readers (Snowflake, BigQuery, Redshift, Athena, Trino, Dremio).
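The "one copy, two metadata views" idea can be sketched conceptually as follows. This is not how Databricks implements UniForm internally. The action shapes are simplified and the `iceberg_style_view` structure is a made-up stand-in for real Iceberg metadata. The only point is that both views reference the same Parquet paths.

```python
def active_paths(delta_actions):
    """Resolve the current file set from simplified Delta add/remove actions."""
    active = set()
    for action in delta_actions:
        if "add" in action:
            active.add(action["add"]["path"])
        elif "remove" in action:
            active.discard(action["remove"]["path"])
    return active


def iceberg_style_view(paths, snapshot_id):
    """Build a hypothetical snapshot-like structure pointing at the same files."""
    return {"snapshot_id": snapshot_id, "manifests": [{"data_files": sorted(paths)}]}


delta_actions = [
    {"add": {"path": "part-0.parquet"}},
    {"add": {"path": "part-1.parquet"}},
    {"remove": {"path": "part-0.parquet"}},
    {"add": {"path": "part-2.parquet"}},
]

delta_view = active_paths(delta_actions)
iceberg_view = iceberg_style_view(delta_view, snapshot_id=1)
iceberg_paths = {p for m in iceberg_view["manifests"] for p in m["data_files"]}

print(sorted(delta_view))
print(sorted(iceberg_paths))
```

No Parquet file is copied or rewritten in the translation. Only a second metadata description is produced, which is why the generation can happen asynchronously after the Delta commit.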
UniForm Requirements (as of June 2026)
| Requirement | Detail |
|---|---|
| Unity Catalog registration | Table must be registered in Unity Catalog — both managed and external tables supported |
| Column mapping | Column mapping must be enabled on the Delta table |
| Protocol version | minReaderVersion ≥ 2 and minWriterVersion ≥ 7 |
| Databricks Runtime | Writes must use Databricks Runtime 14.3 LTS or above |
| Deletion Vectors | Cannot enable Deletion Vectors on a UniForm-enabled table with IcebergCompatV2 (use REORG to disable and purge deletion vectors before enabling). Iceberg v3 resolves this limitation for tables using Iceberg v3 format. |
| Table type | Iceberg reads cannot be enabled on materialised views or streaming tables |
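The requirements in the table above can be expressed as a simple pre-flight check. This is a sketch only: `DeltaTableInfo` is a hypothetical summary object that you would populate yourself from table properties, not something returned by a Databricks API, and the messages are illustrative.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DeltaTableInfo:
    """Hypothetical summary of a table's state; not a Databricks API object."""
    in_unity_catalog: bool
    column_mapping_enabled: bool
    min_reader_version: int
    min_writer_version: int
    writer_runtime: Tuple[int, int]  # Databricks Runtime as (major, minor)
    deletion_vectors_enabled: bool
    table_type: str                  # "table", "materialized_view" or "streaming_table"


def uniform_blockers(t: DeltaTableInfo) -> List[str]:
    """Return the requirements from the table above that this table does not yet meet."""
    blockers = []
    if not t.in_unity_catalog:
        blockers.append("register the table in Unity Catalog")
    if not t.column_mapping_enabled:
        blockers.append("enable column mapping")
    if t.min_reader_version < 2 or t.min_writer_version < 7:
        blockers.append("upgrade protocol to minReaderVersion >= 2 and minWriterVersion >= 7")
    if t.writer_runtime < (14, 3):
        blockers.append("write with Databricks Runtime 14.3 LTS or above")
    if t.deletion_vectors_enabled:
        blockers.append("disable and purge deletion vectors (IcebergCompatV2)")
    if t.table_type in ("materialized_view", "streaming_table"):
        blockers.append("materialised views and streaming tables are not supported")
    return blockers


ready = DeltaTableInfo(True, True, 2, 7, (14, 3), False, "table")
legacy = DeltaTableInfo(True, False, 1, 5, (13, 3), True, "table")

print(uniform_blockers(ready))
for blocker in uniform_blockers(legacy):
    print("-", blocker)
```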
Tables with Iceberg reads enabled use Zstandard (Zstd) instead of Snappy as the compression codec for underlying Parquet data files. This change is applied automatically when enabling UniForm and cannot be overridden. If you are comparing file sizes between UniForm-enabled and non-UniForm tables, account for the codec difference — Zstd typically produces smaller files than Snappy. Source: Databricks documentation.
The Catalog Ecosystem — 2026 State
The catalog — not the table format — is increasingly where the meaningful architectural decision lives in 2026. The catalog coordinates multi-engine writes, enforces governance, and determines how easily you can share data across organisations and platforms. Here is the current state by cloud and vendor:
Unity Catalog (Databricks)
Supports both Delta Lake and managed Iceberg tables natively. Iceberg REST Catalog API for external engine access with credential vending. Cross-engine attribute-based access control (Beta). Federation connectors to AWS Glue, Snowflake Horizon, Hive Metastore, Salesforce Data Cloud, Google Cloud Lakehouse, and Palantir (Preview). Open source Unity Catalog under Linux Foundation governance. Iceberg v3 available on DBR 18.0+.
AWS Glue & S3 Tables
AWS Glue: managed Iceberg catalog supporting Spark, Trino, Athena, and EMR. Amazon S3 Tables: newer option — first-class AWS resources exposing the Iceberg REST Catalog API, up to 10× higher TPS than general-purpose buckets, Iceberg v3 support, table-level access control, built-in maintenance. Integrates with SageMaker Lakehouse.
Google BigLake Metastore
Serverless managed Iceberg REST catalog on GCP. Supports interoperability between Spark, Trino, and BigQuery on the same tables in Cloud Storage. BigQuery federation lets a table created in Spark be queryable in BigQuery without data copying.
Apache Polaris & Lakekeeper
Polaris: Snowflake-contributed open-source Iceberg REST catalog, now under Apache governance. Lakekeeper: open-source Rust-based Iceberg REST catalog with strong authorisation. Commercial Lakekeeper Plus from Vakamo adds enterprise maintenance. Red Hat certified for OpenShift. Both provide vendor-neutral options for organisations wanting full ownership of their catalog.
Project Nessie
Git-like version control for data lakes. Supports Iceberg tables with branch and tag semantics — create a branch, test schema changes, merge back. Strong for data engineering teams who want code-like change management for table schemas. Integrated in Dremio and available as open-source standalone server.
Hive Metastore (HMS)
Legacy catalog still widely deployed. Both Iceberg and Delta Lake support Hive Metastore for catalog registration in existing Spark clusters. Limited to batch operations in most configurations — lacks REST catalog features like credential vending and scan planning. Migration path to Glue, Unity Catalog, or REST catalog options is the recommended direction for new architectures.
Microsoft Fabric & OneLake — Iceberg Support
Microsoft Fabric’s native format is Delta Lake — every Lakehouse, Warehouse, and Eventhouse stores data in Delta Parquet in OneLake. As of 2026, Fabric has added three levels of Iceberg interoperability that allow Iceberg-native tools to work with OneLake data without copying it.
OneLake Iceberg REST Catalog API
OneLake exposes an Iceberg REST Catalog endpoint. External query engines — Snowflake, Dremio, Trino — can query Fabric tables using standard Iceberg catalog connection strings without accessing the underlying ADLS storage directly. Governance policies configured in Fabric apply through the catalog API.
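For orientation, the Iceberg REST Catalog specification defines a load-table route of the form `/v1/{prefix}/namespaces/{namespace}/tables/{table}`, with multi-level namespaces joined by the unit separator character. The sketch below only builds such a URL and makes no network call. The host, prefix, namespace, and table names are placeholders, since the actual OneLake endpoint and prefix values are not given here and should be taken from the Fabric documentation.

```python
from typing import Sequence
from urllib.parse import quote


def load_table_url(base_url: str, prefix: str, namespace: Sequence[str], table: str) -> str:
    """Build the load-table URL described by the Iceberg REST Catalog spec."""
    # Multi-level namespaces are joined with the unit separator (0x1F), then URL-encoded.
    encoded_namespace = quote("\x1f".join(namespace), safe="")
    return (
        f"{base_url.rstrip('/')}/v1/{quote(prefix, safe='')}"
        f"/namespaces/{encoded_namespace}/tables/{quote(table, safe='')}"
    )


# All values below are hypothetical placeholders, not a real endpoint.
url = load_table_url(
    base_url="https://catalog.example.com/iceberg/",
    prefix="my_warehouse",
    namespace=["sales", "daily"],
    table="orders",
)
print(url)
```

In practice an engine's Iceberg REST client assembles these requests itself. You typically supply only the catalog URI, a warehouse or prefix identifier, and credentials.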
Table Format Virtualization
OneLake uses a virtualization layer (powered by metadata translators including Apache XTable) to expose Delta Lake tables as Iceberg tables and vice versa without rewriting physical data files. This is the OneLake equivalent of Delta UniForm — one Parquet copy, multiple format views.
OneLake Shortcuts to Iceberg
Create shortcuts in a Fabric Lakehouse pointing to Iceberg tables stored in AWS S3 or ADLS Gen2. When configured, OneLake virtualizes the Iceberg metadata so Fabric’s Spark and SQL engines can query the shortcut folder as if it were a native Delta table. No data movement or ETL pipeline required.
Snowflake Interoperability
Fabric supports a native metadata integration with Snowflake. Snowflake-managed Iceberg tables written directly into OneLake are queryable by Fabric engines immediately. Fabric Delta tables can be read by Snowflake as virtual Iceberg tables via the OneLake Iceberg REST Catalog API.
For the full Fabric lakehouse architecture — Medallion patterns, Materialized Lake Views, SQL analytics endpoint, and Direct Lake — see the Fabric Lakehouse Tutorial. For the Fabric vs Databricks architectural comparison, see the Fabric vs Databricks guide.
When Each Format Fits Best
Apache Iceberg — Natural Fit
- Multi-engine platforms: Spark, Flink, Trino, Dremio, DuckDB, and Snowflake accessing the same tables with native read/write
- Multi-cloud or hybrid deployments requiring a vendor-neutral format portable across AWS, GCP, and Azure
- AWS-native architectures using S3 Tables with managed REST Catalog and built-in maintenance
- GCP architectures using BigLake Metastore with BigQuery federation
- Organisations with long-lived tables requiring advanced, type-safe schema evolution (column renames, reordering)
- Teams building on open REST catalogs (Polaris, Lakekeeper, Nessie) who want full ownership of their catalog infrastructure
Delta Lake — Natural Fit
- Databricks-primary platforms where Delta Live Tables, Predictive Optimization, Liquid Clustering, and managed runtimes provide turnkey performance
- Streaming and CDC pipelines in Spark where Delta’s tight streaming integration reduces operational overhead
- Microsoft Fabric lakehouses where Delta is the native format and Direct Lake provides import-speed Power BI performance
- Teams who want interoperability without format migration — Delta UniForm exposes Delta tables to Snowflake, BigQuery, Trino, and Dremio without rewriting data
- ML and data science workloads on Databricks using MLflow, Feature Store, and model training over Delta tables
Decision Framework — 2026
In 2026, the decision is less about picking a winner and more about understanding which catalog and ecosystem your organisation is committing to. The formats themselves are converging — the catalog is where the strategic lock-in now lives.
| Decision Factor | Points toward Iceberg | Points toward Delta Lake |
|---|---|---|
| Primary compute platform | Trino, Flink, Dremio, DuckDB, Snowflake, BigQuery — engines with native Iceberg read/write | Databricks, Microsoft Fabric — platforms with native Delta optimisations and managed runtimes |
| Cloud strategy | Multi-cloud or hybrid. Need to move data between AWS, GCP, and Azure without format lock-in. | Single cloud (Azure with Fabric, or AWS/Azure with Databricks). Platform-specific optimisations outweigh portability. |
| Governance and catalog | Prefer open REST catalog (Polaris, Lakekeeper) under your own control, or need cross-platform catalog federation via AWS Glue or BigLake Metastore | Unity Catalog provides all governance and optimisation in one managed layer with Databricks or open-source flavour under Linux Foundation |
| Streaming and CDC | Flink-based streaming with Iceberg sink connector. Growing ecosystem but Delta Live Tables more mature for declarative streaming. | Spark Structured Streaming with Delta Lake and Delta Live Tables. Most mature streaming lakehouse option as of 2026. |
| Interoperability without migration | Write native Iceberg — all Iceberg-aware engines can read/write without translation layer | Write Delta + enable UniForm — Delta tables readable as Iceberg by Snowflake, BigQuery, Trino without copying data |
The contributor summary from DEV Community in June 2026 — “The table format question is settled. Apache Iceberg won. Snowflake, Databricks, AWS, Google, and Microsoft all adding first-class Iceberg support” — captures the ecosystem momentum accurately. But it overstates the practical implication for most teams. Databricks-first teams will continue on Delta Lake with UniForm for interoperability. Microsoft Fabric teams will continue on Delta Lake. The Iceberg win is at the ecosystem and interoperability layer — the REST catalog spec has become the standard that everyone is implementing. For new greenfield lakehouses in 2026 where engine flexibility is a priority, start with Iceberg and a REST catalog. For existing Databricks or Fabric deployments, use UniForm for Iceberg interoperability rather than migrating format.
FAQs – Apache Iceberg vs Delta Lake
Delta Lake uses an append-only transaction log (_delta_log) with JSON commits and Parquet checkpoints: self-describing, no external catalog required to read, but with multi-engine write semantics most fully implemented in Spark and Databricks. As of 2026, Iceberg v3 on Databricks closes the feature gap by adopting Deletion Vectors, Row Lineage, and VARIANT, and Databricks is proposing that Delta 5.0 adopt Iceberg’s adaptive metadata tree. The formats are converging. Source: Apache Iceberg spec · Delta Lake.
Iceberg v3 Public Preview status on Databricks Runtime 18.0+ is sourced from the Databricks April 2026 blog. Delta UniForm requirements are from the Databricks documentation and Microsoft Learn Azure Databricks documentation (updated March 2026). Delta 5.0 adaptive metadata tree proposals are described as proposals in progress, not released features. OneLake Iceberg support is sourced from the Apache Iceberg Knowledge Base — Microsoft Fabric OneLake. UIG Data Lab is an independent publication, not affiliated with or endorsed by Apache Software Foundation, Databricks, or Microsoft Corporation.



