Apache Iceberg vs Delta Lake — Complete 2026 Guide

Architecture differences, Delta UniForm, Iceberg v3 Public Preview on Databricks, the convergence roadmap toward Delta 5.0, OneLake Iceberg REST Catalog support, catalog ecosystem, and the 2026 decision framework — verified against official documentation.

Quick Answer

Apache Iceberg and Delta Lake are both open lakehouse table formats that store data in Parquet on object storage with ACID transactions; the difference is metadata strategy and ecosystem. In 2026, the most significant development is convergence. Iceberg v3 (Public Preview on Databricks Runtime 18.0+, April 2026) adopts features that Delta Lake already had: Deletion Vectors, Row Lineage, and the VARIANT type. Databricks is proposing that Delta 5.0 adopt the adaptive metadata tree proposed for Iceberg v4. Delta UniForm lets Delta tables be read as Iceberg by Snowflake, BigQuery, Trino, and Dremio from a single copy of Parquet. Microsoft Fabric OneLake now exposes an Iceberg REST Catalog endpoint. The table format war is not over, but the two formats are closer together than at any prior point.

📅 Last verified: June 2026 ⏱ ~14 min read ✍️ A.J., Data Engineering Researcher & Technical Writer · UIG Data Lab 🔗 Apache Iceberg Spec · Delta Lake

How Each Format Works — Architecture

Both Apache Iceberg and Delta Lake sit between cloud object storage and query engines in the lakehouse stack. Both store table data as Parquet files. The choice between them is a choice about how metadata is managed, how transactions are coordinated, and how many engines can access the same data without conflicts.

Apache Iceberg — Snapshot and Manifest Tree

Iceberg represents a table as a series of snapshots. Each snapshot points to a manifest list, which references one or more manifest files that record the actual data files with partition statistics. A table metadata file tracks the current snapshot, schema, and partition spec. A catalog (Hive, AWS Glue, Nessie, Polaris, Lakekeeper, or a REST catalog) tracks which metadata file is current for each table.

The catalog is what allows multiple engines to coordinate writes. Because the catalog mediates which snapshot is “current”, Spark, Flink, Trino, and Dremio can all commit to the same Iceberg table with consistent optimistic concurrency control using the same catalog API — regardless of which engine committed last.
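The snapshot tree and the catalog's role in coordinating writers can be sketched in a few lines of Python. This is a toy in-memory model, not the Iceberg library API: the names `ToyCatalog`, `append_files`, and the `metadata/vN.json` file naming are hypothetical, and a real catalog performs the pointer swap atomically in a database or service.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Manifest:
    data_files: Tuple[str, ...]  # Parquet files tracked by this manifest


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    manifest_list: Tuple[Manifest, ...]  # manifests that make up this snapshot


@dataclass
class TableMetadata:
    location: str  # hypothetical metadata file name
    snapshots: List[Snapshot]
    current_snapshot_id: Optional[int]

    def live_files(self) -> List[str]:
        """Walk snapshot -> manifest list -> manifests -> data files."""
        for snap in self.snapshots:
            if snap.snapshot_id == self.current_snapshot_id:
                return [f for m in snap.manifest_list for f in m.data_files]
        return []


class ToyCatalog:
    """Maps a table name to its current metadata object."""

    def __init__(self) -> None:
        self._pointers: Dict[str, TableMetadata] = {}

    def load(self, name: str) -> Optional[TableMetadata]:
        return self._pointers.get(name)

    def commit(self, name: str, expected: Optional[TableMetadata],
               new: TableMetadata) -> bool:
        # Compare-and-swap: succeed only if nobody committed since
        # `expected` was loaded. This is the optimistic concurrency check.
        if self._pointers.get(name) is not expected:
            return False
        self._pointers[name] = new
        return True


def append_files(catalog: ToyCatalog, name: str,
                 base: Optional[TableMetadata], files: List[str]) -> bool:
    """Build a new snapshot on top of `base` and try to commit it."""
    prior = base.snapshots if base else []
    manifests = prior[-1].manifest_list if prior else ()
    new_id = len(prior) + 1
    snap = Snapshot(new_id, manifests + (Manifest(tuple(files)),))
    new_meta = TableMetadata(f"metadata/v{new_id}.json", prior + [snap], new_id)
    return catalog.commit(name, base, new_meta)
```

If two writers load the same base metadata, the first `append_files` call succeeds and the second returns `False`; a real engine would then reload the table and retry its commit.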

Delta Lake — Append-Only Transaction Log

Delta Lake records every change in a _delta_log directory inside the table’s storage location. Each commit is a JSON file describing what changed. Periodic Parquet checkpoint files summarise the state so engines do not replay the entire log on every query open. No external catalog is required to read the current state — the log is self-describing.

This self-describing property is what makes Delta Lake easy to get started with: point any Delta-aware engine at a storage path and it can read the table without catalog registration. The trade-off is that multi-engine writes require understanding the log’s conflict detection rules — most fully implemented in Spark and Databricks runtimes.
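As a rough illustration of why the log is self-describing, the sketch below replays commit files to reconstruct the set of live data files. It assumes each commit is newline-delimited JSON with `add` and `remove` actions carrying a `path`, and it ignores the other action types (metadata, protocol, commit info) and checkpoints that a real reader must handle. The function names are hypothetical.

```python
import json
from typing import Iterable, Set


def commit_file_name(version: int) -> str:
    """Commit files are named by version number, zero-padded to 20 digits."""
    return f"{version:020d}.json"


def replay(commits: Iterable[str]) -> Set[str]:
    """Apply add/remove actions in commit order and return the live data files.

    Each element of `commits` is the text of one commit file,
    with one JSON action per line.
    """
    live: Set[str] = set()
    for text in commits:
        for line in text.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live
```

A checkpoint plays the role of a precomputed `replay` result: a reader starts from the latest checkpoint and applies only the commits written after it.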

ℹ️
Both Formats Use Parquet — The Storage Is Identical

A key source of confusion in the Iceberg vs Delta Lake debate: both formats store data in Parquet files. The difference is entirely in the metadata layer above the Parquet files. Delta UniForm exploits this — by generating Iceberg metadata alongside the existing Delta _delta_log, a single set of Parquet files can serve both Delta and Iceberg readers without any data copying or rewriting. Source: Databricks documentation — Iceberg reads on Delta Lake.

Feature Comparison — 2026

Dimension | Apache Iceberg | Delta Lake
Metadata architecture | Snapshot → manifest list → manifest files → Parquet data. Catalog tracks current metadata file. | Append-only _delta_log with JSON commits + periodic Parquet checkpoints. Self-describing, no external catalog required to read.
ACID transactions | Optimistic concurrency via catalog. Multiple engines can commit with consistent semantics through the catalog API. | Optimistic concurrency via transaction log. Most mature in Spark and Databricks runtimes. Other engines can read; full write semantics best in Spark.
Schema evolution | Columns tracked by unique field IDs, so rename and reorder are safe. Widest range of schema changes without breaking readers. | Schema evolution supported. More restrictive for certain changes (column reorder, type widening). Closely tied to Spark semantics.
Time travel | Query by snapshot ID or timestamp. Snapshots retained until explicitly expired. | Query by version number or timestamp mapped to transaction log position. VACUUM removes old Parquet files outside retention window.
Deletion Vectors | Iceberg v3 adds native Deletion Vectors (Public Preview on Databricks Runtime 18.0+, April 2026). Previously, row-level deletes relied on position and equality delete files. | GA in Databricks since 2023. Efficient row-level deletes without full file rewrites. Cannot be enabled simultaneously with UniForm IcebergCompatV2 (incompatibility being resolved by Iceberg v3).
Multi-engine write support | First-class. Spark, Flink, Trino, Dremio, Snowflake all write through the same catalog API with consistent concurrency semantics. | Read widely supported. Write most mature in Spark/Databricks. Databricks Runtime 14.3 LTS required for UniForm-enabled tables.
Streaming | Supported in Spark and Flink. Delta Live Tables more mature for Databricks streaming pipelines. | Very strong in Spark and Databricks. Delta Live Tables provide declarative streaming ETL with managed infrastructure.
Catalog dependency | Required for multi-engine writes. REST catalog (Polaris, Lakekeeper, Nessie, AWS Glue, Unity Catalog) needed for coordination. | Not required for reads. Recommended (Unity Catalog, Hive Metastore) for full governance and cross-engine access.
Vendor governance | Apache Software Foundation. Vendor-neutral specification. Multiple independent implementations. | Linux Foundation (Delta Lake project). Deepest optimisations remain in Databricks managed runtimes.
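The schema evolution row is easier to see with a small example. The sketch below assumes data files store values keyed by a stable field ID while the table schema maps current column names to those IDs; this is a simplified picture of ID-based column tracking, and the `project` helper is hypothetical.

```python
from typing import Dict, List, Optional

# A row as stored in a data file: values keyed by field ID, not by name.
DataRow = Dict[int, object]


def project(row: DataRow, schema: Dict[str, int],
            columns: List[str]) -> Dict[str, Optional[object]]:
    """Resolve column names to field IDs via the current schema, then read by ID.

    A column added after the file was written has no value in the row
    and comes back as None.
    """
    return {name: row.get(schema[name]) for name in columns}


row = {1: 42, 2: "click"}                          # written under schema_v1
schema_v1 = {"id": 1, "event": 2}
schema_v2 = {"event_type": 2, "id": 1}             # rename + reorder, same IDs
schema_v3 = {"event_type": 2, "id": 1, "country": 3}  # new column, new ID
```

Because old files are resolved through IDs, reading `row` under `schema_v2` returns the renamed column without rewriting the file, and `schema_v3` yields `None` for the column that did not exist when the file was written.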

The 2026 Convergence — Iceberg v3, Delta 5.0 Proposals

The biggest development in the table format landscape in 2026 is not a new winner — it is convergence. The two formats are adopting each other’s best features, driven primarily by Databricks, which now supports both Delta Lake and managed Iceberg tables natively in Unity Catalog.

Iceberg v3 — Public Preview on Databricks (April 2026)

Iceberg v3 on Databricks is in Public Preview, available on Databricks Runtime 18.0+ with Unity Catalog enabled. Iceberg v3 adopts Deletion Vectors, Row Lineage, and VARIANT natively — customers no longer face a trade-off between Delta Lake performance features and Iceberg compatibility. The result is a single copy of data that serves every engine in your stack, with no replication pipelines to maintain or risk of drift.

Iceberg v3 Key Additions
  • Deletion Vectors — efficient row-level deletes matching Delta Lake’s approach, closing the biggest feature gap between the two formats on Databricks
  • Row Lineage — track row-level provenance through transformations, enabling auditable data lineage at the row level
  • VARIANT type — native semi-structured JSON data type, matching the VARIANT support in Delta Lake for handling schemaless and evolving JSON payloads
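To make the Deletion Vectors idea concrete, here is a minimal sketch in which a deletion vector is a set of deleted row positions per data file. Real implementations store a compressed bitmap rather than a Python set, and the function names here are hypothetical.

```python
from typing import Callable, Dict, List, Set

Files = Dict[str, List[dict]]          # data file path -> rows in file order
DeletionVectors = Dict[str, Set[int]]  # data file path -> deleted row positions


def delete_where(files: Files, vectors: DeletionVectors,
                 predicate: Callable[[dict], bool]) -> DeletionVectors:
    """Mark matching row positions as deleted without rewriting any data file."""
    updated = {path: set(positions) for path, positions in vectors.items()}
    for path, rows in files.items():
        hits = {pos for pos, row in enumerate(rows) if predicate(row)}
        if hits:
            updated.setdefault(path, set()).update(hits)
    return updated


def read_table(files: Files, vectors: DeletionVectors) -> List[dict]:
    """Return live rows, skipping positions flagged in each file's deletion vector."""
    live: List[dict] = []
    for path, rows in files.items():
        deleted = vectors.get(path, set())
        live.extend(row for pos, row in enumerate(rows) if pos not in deleted)
    return live
```

The delete touches only the small vector, and the reader filters rows at scan time; the Parquet files themselves stay unchanged until a later compaction.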

Delta 5.0 and Iceberg v4 — Convergence Roadmap

Databricks is proposing that the next version of Delta, Delta 5.0, adopt the adaptive metadata tree: the redesign of the core metadata structure proposed for Iceberg v4, aimed at better performance, scalability, and interoperability. The adaptive metadata tree simplifies metadata so that most operations write a single file instead of several. If the proposal is adopted, Delta and Iceberg metadata structures would converge on the same foundation, and all managed tables in Unity Catalog would be automatically optimised, governed through open APIs, and available to any engine.

Field note — A.J., Data Engineering Researcher

The practical implication of Iceberg v3 + Delta UniForm convergence is that the Iceberg vs Delta choice is becoming a catalog and ecosystem choice rather than a feature capability choice. If you are on Databricks, you write Delta and expose it as Iceberg — you get both. If you are on Snowflake, BigQuery, or a REST catalog, you write Iceberg and it is queryable by Delta-aware engines. The question in 2026 is less “which format is better” and more “which catalog and governance layer does your organisation commit to” — because that is where the meaningful lock-in now lives.

Delta UniForm — One Copy, Two Formats

Delta UniForm (the feature name used in marketing; formally called “Iceberg reads on Delta Lake” in Databricks documentation) allows Delta Lake tables in Unity Catalog to be read by external Iceberg clients without rewriting or copying the underlying Parquet data. When enabled, Databricks asynchronously generates Iceberg metadata alongside the existing Delta transaction log. A single set of Parquet files serves both Delta readers (Spark, Databricks) and Iceberg readers (Snowflake, BigQuery, Redshift, Athena, Trino, Dremio).
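Enabling Iceberg reads is a matter of setting table properties. The helper below builds the `ALTER TABLE` statement; the three property names are the ones I recall from the Databricks documentation and should be checked against the current docs before use. The helper itself and the table name `main.sales.orders` are hypothetical.

```python
def enable_uniform_sql(table_name: str) -> str:
    """Return an ALTER TABLE statement that turns on Iceberg reads for a Delta table."""
    # Property names as recalled from Databricks docs; verify before running.
    props = {
        "delta.columnMapping.mode": "name",
        "delta.enableIcebergCompatV2": "true",
        "delta.universalFormat.enabledFormats": "iceberg",
    }
    body = ",\n".join(f"  '{key}' = '{value}'" for key, value in props.items())
    return f"ALTER TABLE {table_name} SET TBLPROPERTIES (\n{body}\n)"


statement = enable_uniform_sql("main.sales.orders")
```

The resulting string can be passed to `spark.sql(...)` in a Databricks notebook. Because metadata generation is asynchronous, external Iceberg readers may briefly lag behind the latest Delta commit.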

UniForm Requirements (as of June 2026)

Requirement | Detail
Unity Catalog registration | Table must be registered in Unity Catalog; both managed and external tables are supported
Column mapping | Column mapping must be enabled on the Delta table
Protocol version | minReaderVersion ≥ 2 and minWriterVersion ≥ 7
Databricks Runtime | Writes must use Databricks Runtime 14.3 LTS or above
Deletion Vectors | Cannot enable Deletion Vectors on a UniForm-enabled table with IcebergCompatV2 (use REORG to disable and purge deletion vectors before enabling). Iceberg v3 resolves this limitation for tables using Iceberg v3 format.
Table type | Iceberg reads cannot be enabled on materialised views or streaming tables
Source: Databricks documentation — Read Delta Lake tables with Iceberg clients · Azure Databricks — Microsoft Learn
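The requirements above lend themselves to a pre-flight check. The sketch below assumes the table properties have already been fetched into a dictionary (for example from `SHOW TBLPROPERTIES`); the function name, its parameters, and the default values for missing properties are assumptions, and the runtime version requirement is left out because it applies to the writer rather than the table.

```python
from typing import Dict, List


def uniform_blockers(props: Dict[str, str], in_unity_catalog: bool,
                     is_streaming_table_or_mv: bool) -> List[str]:
    """Return the reasons Iceberg reads (IcebergCompatV2) cannot be enabled yet."""
    issues: List[str] = []
    if not in_unity_catalog:
        issues.append("table is not registered in Unity Catalog")
    if props.get("delta.columnMapping.mode", "none") == "none":
        issues.append("column mapping is not enabled")
    if int(props.get("delta.minReaderVersion", "1")) < 2:
        issues.append("minReaderVersion is below 2")
    if int(props.get("delta.minWriterVersion", "1")) < 7:
        issues.append("minWriterVersion is below 7")
    if props.get("delta.enableDeletionVectors", "false").lower() == "true":
        issues.append("deletion vectors are enabled; disable and purge them first")
    if is_streaming_table_or_mv:
        issues.append("materialised views and streaming tables are not supported")
    return issues
```

An empty list means none of the listed blockers applies. It does not guarantee that enabling will succeed, since the documentation may list further conditions.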
⚠️
Compression Codec Change with UniForm

Tables with Iceberg reads enabled use Zstandard (Zstd) instead of Snappy as the compression codec for underlying Parquet data files. This change is applied automatically when enabling UniForm and cannot be overridden. If you are comparing file sizes between UniForm-enabled and non-UniForm tables, account for the codec difference — Zstd typically produces smaller files than Snappy. Source: Databricks documentation.

The Catalog Ecosystem — 2026 State

The catalog — not the table format — is increasingly where the meaningful architectural decision lives in 2026. The catalog coordinates multi-engine writes, enforces governance, and determines how easily you can share data across organisations and platforms. Here is the current state by cloud and vendor:

🏢

Unity Catalog (Databricks)

Supports both Delta Lake and managed Iceberg tables natively. Iceberg REST Catalog API for external engine access with credential vending. Cross-engine attribute-based access control (Beta). Federation connectors to AWS Glue, Snowflake Horizon, Hive Metastore, Salesforce Data Cloud, Google Cloud Lakehouse, and Palantir (Preview). Open source Unity Catalog under Linux Foundation governance. Iceberg v3 available on DBR 18.0+.

☁️

AWS Glue & S3 Tables

AWS Glue: managed Iceberg catalog supporting Spark, Trino, Athena, and EMR. Amazon S3 Tables: newer option — first-class AWS resources exposing the Iceberg REST Catalog API, up to 10× higher TPS than general-purpose buckets, Iceberg v3 support, table-level access control, built-in maintenance. Integrates with SageMaker Lakehouse.

🌐

Google BigLake Metastore

Serverless managed Iceberg REST catalog on GCP. Supports interoperability between Spark, Trino, and BigQuery on the same tables in Cloud Storage. BigQuery federation lets a table created in Spark be queryable in BigQuery without data copying.

🧊

Apache Polaris & Lakekeeper

Polaris: Snowflake-contributed open-source Iceberg REST catalog, now under Apache governance. Lakekeeper: open-source Rust-based Iceberg REST catalog with strong authorisation. Commercial Lakekeeper Plus from Vakamo adds enterprise maintenance. Red Hat certified for OpenShift. Both provide vendor-neutral options for organisations wanting full ownership of their catalog.

🌿

Project Nessie

Git-like version control for data lakes. Supports Iceberg tables with branch and tag semantics — create a branch, test schema changes, merge back. Strong for data engineering teams who want code-like change management for table schemas. Integrated in Dremio and available as open-source standalone server.

⚠️

Hive Metastore (HMS)

Legacy catalog still widely deployed. Both Iceberg and Delta Lake support Hive Metastore for catalog registration in existing Spark clusters. Limited to batch operations in most configurations — lacks REST catalog features like credential vending and scan planning. Migration path to Glue, Unity Catalog, or REST catalog options is the recommended direction for new architectures.

Microsoft Fabric & OneLake — Iceberg Support

Microsoft Fabric’s native format is Delta Lake — every Lakehouse, Warehouse, and Eventhouse stores data in Delta Parquet in OneLake. As of 2026, Fabric has added three levels of Iceberg interoperability that allow Iceberg-native tools to work with OneLake data without copying it.

🔗

OneLake Iceberg REST Catalog API

OneLake exposes an Iceberg REST Catalog endpoint. External query engines — Snowflake, Dremio, Trino — can query Fabric tables using standard Iceberg catalog connection strings without accessing the underlying ADLS storage directly. Governance policies configured in Fabric apply through the catalog API.
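For orientation, the sketch below builds the request paths a client would call on any Iceberg REST catalog, following the route layout of the Iceberg REST catalog specification. The base URL, prefix, namespace, and table used here are placeholders, and OneLake's actual endpoint address is not given in this guide. Multi-level namespaces need a separator encoding that this helper does not handle.

```python
from typing import Dict
from urllib.parse import quote


def rest_paths(base_url: str, prefix: str, namespace: str, table: str) -> Dict[str, str]:
    """Build common Iceberg REST catalog URLs for a single-level namespace."""
    root = base_url.rstrip("/") + "/v1"
    # Some catalogs scope routes under a prefix returned by the config call.
    scoped = f"{root}/{prefix.strip('/')}" if prefix else root
    ns = quote(namespace, safe="")
    return {
        "config": f"{root}/config",
        "list_tables": f"{scoped}/namespaces/{ns}/tables",
        "load_table": f"{scoped}/namespaces/{ns}/tables/{quote(table, safe='')}",
    }


# Placeholder host and names, for illustration only.
paths = rest_paths("https://catalog.example.com/iceberg/", "ws", "sales", "orders")
```

In practice an engine such as Trino or Spark is configured with the catalog URI and credentials and issues these calls itself.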

↔️

Table Format Virtualization

OneLake uses a virtualization layer (powered by metadata translators including Apache XTable) to expose Delta Lake tables as Iceberg tables and vice versa without rewriting physical data files. This is the OneLake equivalent of Delta UniForm — one Parquet copy, multiple format views.

🔌

OneLake Shortcuts to Iceberg

Create shortcuts in a Fabric Lakehouse pointing to Iceberg tables stored in AWS S3 or ADLS Gen2. When configured, OneLake virtualizes the Iceberg metadata so Fabric’s Spark and SQL engines can query the shortcut folder as if it were a native Delta table. No data movement or ETL pipeline required.

❄️

Snowflake Interoperability

Fabric supports a native metadata integration with Snowflake. Snowflake-managed Iceberg tables written directly into OneLake are queryable by Fabric engines immediately. Fabric Delta tables can be read by Snowflake as virtual Iceberg tables via the OneLake Iceberg REST Catalog API.

For the full Fabric lakehouse architecture — Medallion patterns, Materialized Lake Views, SQL analytics endpoint, and Direct Lake — see the Fabric Lakehouse Tutorial. For the Fabric vs Databricks architectural comparison, see the Fabric vs Databricks guide.

When Each Format Fits Best

🧊

Apache Iceberg — Natural Fit

  • Multi-engine platforms: Spark, Flink, Trino, Dremio, DuckDB, and Snowflake accessing the same tables with native read/write
  • Multi-cloud or hybrid deployments requiring a vendor-neutral format portable across AWS, GCP, and Azure
  • AWS-native architectures using S3 Tables with managed REST Catalog and built-in maintenance
  • GCP architectures using BigLake Metastore with BigQuery federation
  • Organisations with long-lived tables requiring advanced, type-safe schema evolution (column renames, reordering)
  • Teams building on open REST catalogs (Polaris, Lakekeeper, Nessie) who want full ownership of their catalog infrastructure

Delta Lake — Natural Fit

  • Databricks-primary platforms where Delta Live Tables, Predictive Optimization, Liquid Clustering, and managed runtimes provide turnkey performance
  • Streaming and CDC pipelines in Spark where Delta’s tight streaming integration reduces operational overhead
  • Microsoft Fabric lakehouses where Delta is the native format and Direct Lake provides import-speed Power BI performance
  • Teams who want interoperability without format migration — Delta UniForm exposes Delta tables to Snowflake, BigQuery, Trino, and Dremio without rewriting data
  • ML and data science workloads on Databricks using MLflow, Feature Store, and model training over Delta tables

Decision Framework — 2026

In 2026, the decision is less about picking a winner and more about understanding which catalog and ecosystem your organisation is committing to. The formats themselves are converging — the catalog is where the strategic lock-in now lives.

Decision Factor | Points toward Iceberg | Points toward Delta Lake
Primary compute platform | Trino, Flink, Dremio, DuckDB, Snowflake, BigQuery: engines with native Iceberg read/write | Databricks, Microsoft Fabric: platforms with native Delta optimisations and managed runtimes
Cloud strategy | Multi-cloud or hybrid. Need to move data between AWS, GCP, and Azure without format lock-in. | Single cloud (Azure with Fabric, or AWS/Azure with Databricks). Platform-specific optimisations outweigh portability.
Governance and catalog | Prefer open REST catalog (Polaris, Lakekeeper) under your own control, or need cross-platform catalog federation via AWS Glue or BigLake Metastore | Unity Catalog provides all governance and optimisation in one managed layer with Databricks or open-source flavour under Linux Foundation
Streaming and CDC | Flink-based streaming with Iceberg sink connector. Growing ecosystem but Delta Live Tables more mature for declarative streaming. | Spark Structured Streaming with Delta Lake and Delta Live Tables. Most mature streaming lakehouse option as of 2026.
Interoperability without migration | Write native Iceberg: all Iceberg-aware engines can read/write without translation layer | Write Delta + enable UniForm: Delta tables readable as Iceberg by Snowflake, BigQuery, Trino without copying data
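The table can be read as a rough checklist. The sketch below encodes four of its rows as a tally. The equal weighting, the field names, and the `lean` function are all assumptions made for illustration, so treat the output as a conversation starter rather than a recommendation.

```python
from dataclasses import dataclass

ICEBERG_PLATFORMS = {"trino", "flink", "dremio", "duckdb", "snowflake", "bigquery"}
DELTA_PLATFORMS = {"databricks", "fabric"}


@dataclass
class Context:
    primary_platform: str         # e.g. "databricks", "trino"
    multi_cloud: bool             # multi-cloud or hybrid deployment
    self_managed_rest_catalog: bool  # wants Polaris/Lakekeeper-style ownership
    streaming_engine: str         # "flink", "spark", or "" for none


def lean(ctx: Context) -> str:
    """Tally the decision factors with equal weight and report the lean."""
    iceberg = delta = 0
    platform = ctx.primary_platform.lower()
    if platform in ICEBERG_PLATFORMS:
        iceberg += 1
    if platform in DELTA_PLATFORMS:
        delta += 1
    if ctx.multi_cloud:
        iceberg += 1
    else:
        delta += 1
    if ctx.self_managed_rest_catalog:
        iceberg += 1
    engine = ctx.streaming_engine.lower()
    if engine == "flink":
        iceberg += 1
    elif engine == "spark":
        delta += 1
    if iceberg == delta:
        return "tie"
    return "iceberg" if iceberg > delta else "delta"
```

A "tie" is a reasonable outcome here: it usually means the interoperability row decides, via UniForm on the Delta side or a REST catalog on the Iceberg side.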
Field note — A.J., Data Engineering Researcher

The contributor summary from DEV Community in June 2026 — “The table format question is settled. Apache Iceberg won. Snowflake, Databricks, AWS, Google, and Microsoft all adding first-class Iceberg support” — captures the ecosystem momentum accurately. But it overstates the practical implication for most teams. Databricks-first teams will continue on Delta Lake with UniForm for interoperability. Microsoft Fabric teams will continue on Delta Lake. The Iceberg win is at the ecosystem and interoperability layer — the REST catalog spec has become the standard that everyone is implementing. For new greenfield lakehouses in 2026 where engine flexibility is a priority, start with Iceberg and a REST catalog. For existing Databricks or Fabric deployments, use UniForm for Iceberg interoperability rather than migrating format.

FAQs – Apache Iceberg vs Delta Lake

What is the difference between Apache Iceberg and Delta Lake?

Both store data in Parquet files on cloud object storage with ACID transactions. The architectural difference is metadata management. Iceberg uses a snapshot-and-manifest tree tracked in a catalog, and multiple engines commit through the same catalog API with consistent concurrency. Delta Lake uses an append-only transaction log (_delta_log) with JSON commits and Parquet checkpoints; it is self-describing and needs no external catalog for reads, but multi-engine write semantics are most fully implemented in Spark and Databricks. As of 2026, Iceberg v3 on Databricks closes the feature gap by adopting Deletion Vectors, Row Lineage, and VARIANT, and Databricks is proposing that Delta 5.0 adopt the adaptive metadata tree proposed for Iceberg v4. The formats are converging. Source: Apache Iceberg spec · Delta Lake.

What is Delta UniForm?
Delta UniForm (formally “Iceberg reads on Delta Lake” in Databricks documentation) lets external Iceberg clients read Delta Lake tables written by Databricks without rewriting the underlying Parquet data. When enabled on a Unity Catalog Delta table, Databricks asynchronously generates Iceberg metadata alongside the Delta transaction log. One set of Parquet files serves both Delta readers (Spark, Databricks) and Iceberg readers (Snowflake, BigQuery, Redshift, Athena, Trino, Dremio). Requirements: Unity Catalog registration, column mapping enabled, minReaderVersion ≥ 2 and minWriterVersion ≥ 7, Databricks Runtime 14.3 LTS or above for writes. Tables with UniForm enabled use Zstandard (Zstd) instead of Snappy for Parquet compression. Source: Databricks — Read Delta tables with Iceberg clients.

What is new in Apache Iceberg v3?
Apache Iceberg v3 reached Public Preview on Databricks in April 2026, available on Databricks Runtime 18.0+ with Unity Catalog enabled. It has three headline additions. Deletion Vectors provide efficient row-level deletes matching Delta Lake’s approach; previously, Delta deletion vectors could not be enabled together with UniForm IcebergCompatV2, and Iceberg v3 resolves this. Row Lineage tracks row-level provenance through transformations. The VARIANT type stores semi-structured JSON data natively. The significance is that Iceberg v3 closes the feature gap between Delta Lake and Iceberg on Databricks, so teams no longer trade off Delta performance features against Iceberg compatibility. Databricks is also driving Iceberg v4 proposals, including an adaptive metadata tree under which most operations write a single metadata file instead of several. Source: Databricks blog, Iceberg v3 Public Preview.

Does Microsoft Fabric support Apache Iceberg?
Yes. Microsoft Fabric’s native format is Delta Lake, and OneLake adds three Iceberg interoperability mechanisms: Iceberg REST Catalog API — external engines (Snowflake, Dremio, Trino) query Fabric tables via standard Iceberg catalog connection strings with credential vending; Table Format Virtualization — OneLake exposes Delta tables as Iceberg and vice versa without rewriting data files (powered by metadata translators including Apache XTable); OneLake Shortcuts — create shortcuts in a Fabric Lakehouse pointing to Iceberg tables in AWS S3 or ADLS Gen2, queryable by Fabric Spark and SQL engines as if they were Delta tables. Additionally, Snowflake-managed Iceberg tables written into OneLake are immediately queryable by Fabric, and Fabric Delta tables are readable by Snowflake as virtual Iceberg tables.

Can I migrate between Delta Lake and Apache Iceberg?
Yes, migrations are possible — but Delta UniForm and OneLake virtualisation make full migration unnecessary for most use cases. If you are on Databricks with Delta Lake, enable UniForm to expose your tables as Iceberg to external engines without rewriting data. If you need to fully migrate: for Delta to Iceberg, use Apache XTable (formerly OneTable) which converts Delta transaction log entries to Iceberg metadata without data copying. For Iceberg to Delta, a read-and-rewrite approach in Spark is typically required. Before committing to a migration, evaluate whether UniForm (Delta → Iceberg reads) or OneLake shortcuts (Iceberg → Fabric reads) solve your interoperability need without a full format conversion — in 2026, they usually do.

Which format is better for streaming?
Delta Lake has the most mature streaming lakehouse experience in 2026, particularly when combined with Databricks Delta Live Tables — which provides declarative streaming ETL, automatic schema inference, change data capture, and managed pipeline infrastructure. Spark Structured Streaming with Delta is well-proven in production at very high throughput. Apache Iceberg has strong streaming support via Apache Flink’s Iceberg connector and Spark Structured Streaming with Iceberg sink. The ecosystem is mature and production-proven at scale for Flink-first architectures. Choose based on your primary streaming engine: Flink-first → Iceberg. Spark/Databricks-first → Delta Lake. Both support CDC patterns with MERGE INTO (Delta) and equality/position deletes plus Iceberg v3 Deletion Vectors (Iceberg).
⚠ Accuracy Disclaimer

Iceberg v3 Public Preview status on Databricks Runtime 18.0+ is sourced from the Databricks April 2026 blog. Delta UniForm requirements are from the Databricks documentation and Microsoft Learn Azure Databricks documentation (updated March 2026). Delta 5.0 adaptive metadata tree proposals are described as proposals in progress — not released features. OneLake Iceberg support is sourced from the Apache Iceberg Knowledge Base — Microsoft Fabric OneLake. UIG Data Lab is an independent publication, not affiliated with or endorsed by Apache Software Foundation, Databricks, or Microsoft Corporation.

A.J. Data Engineering Researcher & Technical Writer · UIG Data Lab
A.J. researches and writes about data engineering, analytics architecture, Microsoft Fabric, and modern cloud data platforms. Coverage spans Microsoft Fabric, Power BI, Azure Data Engineering, Databricks, Snowflake, Apache Spark, dbt, Apache Airflow, and modern cloud data infrastructure. The focus is practitioner-level content that helps data professionals understand platform capabilities, evaluate technology decisions, optimise costs, and implement practical solutions using official documentation, product updates, community insights, and industry best practices. His writing covers real decisions from real deployments — not documentation rewrites.