
Iceberg, Hudi, Delta — for Engineers Who Don't Run a Data Team

Ravinder · 9 min read
Tags: Data, Iceberg, Hudi, Delta Lake, Lakehouse

The table-format wars are mostly over. Apache Iceberg won the ecosystem battle — Snowflake, AWS, Google, Azure, and Databricks all support it. But Delta Lake is deeply embedded in Databricks shops and Hudi still has genuine strengths for streaming upsert workloads. If you're architecting a new data lake or porting an old one, you need to understand what these formats actually differ on — not the marketing.

This is not a benchmarks post. Benchmarks for table formats are vendor-published and nearly all misleading. This is a structural comparison for engineers who need to make a decision.


Why Table Formats Exist

Raw Parquet files in S3 or GCS are immutable, have no transaction guarantees, and support no schema evolution. If you write a partial batch and a consumer reads during the write, it sees an inconsistent state. If you add a column to your data, old and new files have different schemas. If you need to delete a record for GDPR compliance, you have to rewrite every Parquet file that contains it.

Table formats solve this by adding a metadata layer on top of the files. They track which files belong to the current version of the table, provide snapshot isolation, and enable schema and partition evolution without rewriting data.

flowchart TD
    Writer1[Writer 1] --> ML[Metadata Layer\nIceberg / Delta / Hudi]
    Writer2[Writer 2] --> ML
    ML --> PF1[Parquet File v1]
    ML --> PF2[Parquet File v2]
    ML --> PF3[Parquet File v3 - delete vectors]
    Reader1[Spark Reader] --> ML
    Reader2[Trino Reader] --> ML
    Reader3[DuckDB Reader] --> ML
    ML --> Snapshot[Snapshot / Version Tracking]
    Snapshot --> TT[Time Travel]

Apache Iceberg

Iceberg originated at Netflix and became an Apache project in 2018. Its design philosophy is built around correctness and ecosystem openness.

Metadata structure. Iceberg uses a three-layer metadata hierarchy: table metadata JSON → manifest list (snapshot) → manifest files (lists of data files with column stats). This structure enables partition pruning without scanning data files and supports hidden partitioning (partitioning that's transparent to query writers).

Schema evolution. Iceberg tracks columns by ID, not by name. This means you can rename a column without invalidating historical data — the old column name maps to the same column ID. You can add, drop, rename, and reorder columns. Type promotion (int → long, float → double) is supported. This is the strongest schema evolution story of the three formats.
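The rename-without-rewrite behavior is easier to see in a toy model. This sketch is illustrative, not Iceberg's actual metadata format: data files store values keyed by column ID, and the schema resolves names through a name-to-ID mapping, so a rename never touches the files.

```python
# Toy model of ID-based column tracking (illustrative, not Iceberg's
# real metadata format): files key values by column ID; the schema
# maps names to IDs. Renaming only updates the mapping.

class TableSchema:
    def __init__(self):
        self.name_to_id = {}
        self.next_id = 1

    def add_column(self, name):
        self.name_to_id[name] = self.next_id
        self.next_id += 1

    def rename_column(self, old, new):
        # No data rewrite: the column keeps its ID.
        self.name_to_id[new] = self.name_to_id.pop(old)

    def read(self, file_row, name):
        return file_row[self.name_to_id[name]]

schema = TableSchema()
schema.add_column("event_type")   # assigned column ID 1

# A data file written before the rename stores values by ID.
old_file_row = {1: "click"}

schema.rename_column("event_type", "category")
print(schema.read(old_file_row, "category"))  # click
```

A name-based format would have to rewrite or reinterpret `old_file_row` after the rename; keying by ID makes the historical file readable as-is.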

Hidden partitioning. In Hive-style partitioning (and early Delta), partition values are embedded in the directory path. Queries must filter on the partition column or they scan everything. Iceberg's hidden partitioning transforms a column into a partition value invisibly — you query WHERE event_date = '2026-03-01' and Iceberg translates that to a partition filter without the query writer needing to know the partition scheme. Partition schemes can be changed without rewriting data (partition evolution).
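The mechanics can be sketched in a few lines. This is a simulation of the idea, not Iceberg's implementation: the table applies a `day()` transform at write time, and the query planner translates a filter on the source column into a partition filter that prunes whole files.

```python
from datetime import date, datetime

# Toy hidden partitioning (illustrative): files are grouped by a
# derived partition value the query writer never sees.

def day_transform(ts: datetime) -> date:
    return ts.date()

files = {}  # partition value -> data files

def write_file(ts: datetime, path: str):
    # The day() transform is applied automatically at write time.
    files.setdefault(day_transform(ts), []).append(path)

def plan_scan(filter_date: date):
    # WHERE occurred_at falls on filter_date becomes a partition
    # filter; non-matching partitions are never opened.
    return files.get(filter_date, [])

write_file(datetime(2026, 3, 1, 9, 30), "file-a.parquet")
write_file(datetime(2026, 3, 2, 14, 0), "file-b.parquet")

print(plan_scan(date(2026, 3, 1)))  # ['file-a.parquet']
```

In Hive-style layouts the query writer must filter on the partition column explicitly; here the transform lives in table metadata, which is also what lets the partition scheme evolve without rewriting data.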

Ecosystem. As of 2026, Iceberg has the broadest native support: Snowflake (Iceberg tables), AWS Athena, Google BigQuery (via external tables and now native), Azure Synapse, Trino, Spark, Flink, DuckDB, and Databricks (which supports Iceberg in addition to Delta). This matters for multi-engine data architectures.

# Writing an Iceberg table with PyIceberg
from datetime import datetime

import pyarrow as pa
from pyiceberg.catalog import load_catalog
from pyiceberg.partitioning import PartitionField, PartitionSpec
from pyiceberg.schema import Schema
from pyiceberg.transforms import DayTransform
from pyiceberg.types import (
    NestedField, StringType, LongType, TimestampType
)
 
catalog = load_catalog("glue", **{
    "type": "glue",
    "warehouse": "s3://my-bucket/warehouse"
})
 
schema = Schema(
    NestedField(1, "event_id", StringType(), required=True),
    NestedField(2, "user_id", LongType(), required=True),
    NestedField(3, "event_type", StringType(), required=False),
    NestedField(4, "occurred_at", TimestampType(), required=True),
)
 
# Hidden partitioning: day(occurred_at), invisible to query writers
partition_spec = PartitionSpec(
    PartitionField(
        source_id=4, field_id=1000,
        transform=DayTransform(), name="occurred_at_day"
    )
)
 
catalog.create_table(
    identifier="prod.events",
    schema=schema,
    partition_spec=partition_spec
)
 
table = catalog.load_table("prod.events")
 
# Append an Arrow table whose schema matches (required fields
# must be non-nullable on the Arrow side)
arrow_table = pa.table(
    {
        "event_id": ["evt-001"],
        "user_id": [42],
        "event_type": ["click"],
        "occurred_at": [datetime(2026, 3, 1, 12, 0, 0)],
    },
    schema=pa.schema([
        pa.field("event_id", pa.string(), nullable=False),
        pa.field("user_id", pa.int64(), nullable=False),
        pa.field("event_type", pa.string(), nullable=True),
        pa.field("occurred_at", pa.timestamp("us"), nullable=False),
    ]),
)
table.append(arrow_table)

Delta Lake

Delta Lake was created by Databricks and open-sourced in 2019. It is the default format in Databricks environments and has deep integration with Spark.

Transaction log. Delta stores its metadata as a transaction log — a sequence of JSON files in _delta_log/. Each commit appends a new JSON file describing what changed (files added, files removed). Checkpoints compact the log periodically. This is simpler than Iceberg's metadata hierarchy but has different performance characteristics at very high transaction rates.
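The log-replay idea is simple enough to sketch. This is a minimal simulation, not the full Delta protocol: each commit is a list of add/remove actions, and the current table state is whatever files survive replaying all commits in order.

```python
# Minimal sketch of replaying a Delta-style transaction log
# (illustrative): current state = set of files still live after
# applying every commit's add/remove actions in sequence.

commits = [
    [{"add": {"path": "part-000.parquet"}}],
    [{"add": {"path": "part-001.parquet"}}],
    [{"remove": {"path": "part-000.parquet"}},
     {"add": {"path": "part-002.parquet"}}],
]

def replay(log):
    live = set()
    for commit in log:
        for action in commit:
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live

print(sorted(replay(commits)))
# ['part-001.parquet', 'part-002.parquet']
```

Checkpoints exist precisely because this replay gets expensive as the log grows: they persist the accumulated state so readers start from the latest checkpoint instead of commit zero.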

ACID guarantees. Delta's ACID implementation is mature and well-tested. It handles concurrent writers via optimistic concurrency — writers read the current log, write their data files, then attempt to commit by appending to the log. If two writers conflict, one retries.
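The commit protocol can be sketched as follows. This is a simplified model, not Delta's code: a commit succeeds only if the writer is first to create the next numbered log entry, and a loser re-reads the log and retries.

```python
# Sketch of optimistic concurrency control for log commits
# (simplified). Real Delta relies on an atomic put-if-absent
# primitive on object storage or a coordinating catalog.

log = {}  # version number -> commit payload, stands in for _delta_log/

def try_commit(version, payload):
    if version in log:
        return False  # another writer claimed this version first
    log[version] = payload
    return True

def commit_with_retry(payload, max_attempts=5):
    for _ in range(max_attempts):
        next_version = len(log)
        if try_commit(next_version, payload):
            return next_version
        # Conflict: re-read the log, check for overlapping changes,
        # rebase, and try the next version number.
    raise RuntimeError("too many concurrent commit conflicts")

v1 = commit_with_retry({"add": "a.parquet"})
v2 = commit_with_retry({"add": "b.parquet"})
print(v1, v2)  # 0 1
```

Note that data files are written before the commit attempt, so losing the race wastes no data writes; only the metadata commit is retried.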

Schema enforcement and evolution. Delta enforces schema on write by default — writing data with a new column fails unless you explicitly enable schema evolution with mergeSchema. This is more conservative than Iceberg's default behavior, which some teams prefer (fewer surprise schemas) and others find annoying (more friction for pipeline changes).

# Delta Lake with PySpark
# (assumes the delta-spark package is already on the classpath)
from delta.tables import DeltaTable
from pyspark.sql import SparkSession
 
spark = SparkSession.builder \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
    .getOrCreate()
 
# Upsert (MERGE) — Delta's strong suit
delta_table = DeltaTable.forPath(spark, "s3://my-bucket/delta/orders")
 
updates_df = spark.read.parquet("s3://my-bucket/staging/orders_updates/")
 
delta_table.alias("target").merge(
    updates_df.alias("source"),
    "target.order_id = source.order_id"
).whenMatchedUpdateAll() \
 .whenNotMatchedInsertAll() \
 .execute()

DML operations. Delta's MERGE, UPDATE, and DELETE are first-class operations with good performance. The deletion mechanism uses deletion vectors (added in Delta 2.4) — a compact bitmap marking deleted rows in existing files — avoiding full file rewrites for small deletes.
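A deletion vector is easy to model. This toy version (illustrative, not Delta's serialized bitmap format) marks deleted row positions in a set and has readers skip them, leaving the Parquet file untouched.

```python
# Toy deletion vector (illustrative): delete rows by marking their
# positions instead of rewriting the data file.

file_rows = ["order-1", "order-2", "order-3", "order-4"]
deletion_vector = {1, 3}  # deleted row positions; file stays as-is

def read_live_rows(rows, dv):
    # Readers filter out positions present in the deletion vector.
    return [row for pos, row in enumerate(rows) if pos not in dv]

print(read_live_rows(file_rows, deletion_vector))
# ['order-1', 'order-3']
```

The trade-off is deferred: reads pay a small filtering cost until a later compaction physically rewrites the file without the deleted rows.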

Ecosystem. Delta has strong Databricks integration and native Spark support. External engine support has improved: Trino has a Delta connector, and DuckDB can read Delta tables. But Iceberg has broader native support outside the Spark/Databricks ecosystem.


Apache Hudi

Hudi (Hadoop Upserts Deletes and Incrementals) was created at Uber and open-sourced in 2019. Its design centers on streaming upsert workloads — it was built to handle the problem of ingesting CDC data into a data lake efficiently.

Table types. Hudi has two table types that Iceberg and Delta don't expose as a user choice:

  • Copy-on-Write (CoW): On every write, affected Parquet files are rewritten with updates merged in. Reads are fast (clean Parquet files). Writes are expensive.
  • Merge-on-Read (MoR): Writes go to small delta files (Avro row format). Reads merge base files with delta files on the fly. Writes are fast. Reads pay a merge cost until compaction runs.

For CDC ingestion workloads — where you're continuously streaming updates to existing records — MoR is a significant win. You can ingest updates at high throughput without rewriting large Parquet files on every batch.
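The two strategies can be contrasted in a few lines. This sketch is illustrative, not Hudi's actual file layout: CoW rewrites the base file on every update batch, while MoR appends updates to cheap delta files and merges them at read time.

```python
# Sketch of Copy-on-Write vs Merge-on-Read (illustrative). Records
# are modeled as a dict of key -> value standing in for a base file.

base = {"u1": "v1", "u2": "v1"}

# Copy-on-Write: each update batch produces a fully rewritten base.
def cow_write(base_file, updates):
    return {**base_file, **updates}     # write cost: the whole file

# Merge-on-Read: updates land in small delta files...
delta_log = []

def mor_write(updates):
    delta_log.append(updates)           # write cost: just the updates

# ...and readers merge base + deltas until compaction folds them in.
def mor_read(base_file, deltas):
    merged = dict(base_file)
    for d in deltas:
        merged.update(d)
    return merged

mor_write({"u1": "v2"})
mor_write({"u2": "v2"})
print(cow_write(base, {"u1": "v2"}))  # {'u1': 'v2', 'u2': 'v1'}
print(mor_read(base, delta_log))      # {'u1': 'v2', 'u2': 'v2'}
```

Compaction is the missing third operation: it periodically runs the `mor_read` merge once and writes the result as a new base file, resetting the read-time cost.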

Incremental queries. Hudi has a concept of incremental queries built into the format: you can query "all records changed since checkpoint X." This is useful for building downstream consumers that process only new changes, without maintaining external state.
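The shape of an incremental query is worth seeing concretely. This is a toy model, not Hudi's API: each record carries the commit instant that last touched it, and a consumer asks only for records changed after its checkpoint.

```python
# Toy incremental query (illustrative): filter records by the commit
# instant that last wrote them, relative to a consumer's checkpoint.

records = [
    {"id": 1, "commit_time": "20260301T000000"},
    {"id": 2, "commit_time": "20260302T000000"},
    {"id": 3, "commit_time": "20260303T000000"},
]

def incremental_query(rows, since):
    return [r for r in rows if r["commit_time"] > since]

# A downstream job processes only what changed since its checkpoint,
# then advances the checkpoint itself. No external change log needed.
changed = incremental_query(records, since="20260301T000000")
print([r["id"] for r in changed])  # [2, 3]
```

Without this, downstream consumers typically diff snapshots or maintain their own change-tracking state; building it into the format removes that machinery.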

Ecosystem and complexity. Hudi has the steepest operational curve. The CoW/MoR choice is a real design decision with performance implications. Compaction services must run to prevent MoR read degradation. Timeline management (Hudi's metadata log) requires understanding. Multi-engine support is narrower than Iceberg. Hudi is worth it for CDC/streaming-upsert use cases. For batch-heavy analytical tables, Iceberg or Delta is simpler.


Side-by-Side Comparison

| Feature                | Iceberg              | Delta Lake               | Hudi        |
| ---------------------- | -------------------- | ------------------------ | ----------- |
| Schema evolution       | Strongest (ID-based) | Good (strict by default) | Good        |
| Hidden partitioning    | Yes                  | No                       | No          |
| Streaming upserts      | Good                 | Good                     | Best (MoR)  |
| Deletion vectors       | Yes (v2 spec)        | Yes (2.4+)               | Yes (MoR)   |
| Multi-engine support   | Broadest             | Good (improving)         | Narrower    |
| Incremental queries    | Limited              | Limited                  | First-class |
| Operational simplicity | High                 | High                     | Medium      |
| Databricks native      | Supported            | Native                   | Supported   |
| AWS native             | Glue + Athena native | Athena (via connector)   | EMR native  |

How to Pick

Pick Iceberg if:

  • You run a multi-engine stack (Spark + Trino + DuckDB + Snowflake).
  • You need strong schema evolution (frequent column renames and additions).
  • You're on AWS and want native Glue Catalog and Athena support.
  • You want the broadest long-term ecosystem bets.

Pick Delta Lake if:

  • Your primary compute is Databricks or Spark.
  • Your team is already invested in the Delta ecosystem.
  • You want the simplest operational experience within the Databricks platform.
  • Delta's schema enforcement behavior matches your data governance requirements.

Pick Hudi if:

  • You have a high-volume CDC ingestion pipeline where records are frequently updated.
  • You need MoR semantics — fast writes with compaction-controlled read cost.
  • You're already on AWS EMR and have Hudi expertise.
  • Incremental consumption by downstream jobs is a first-class requirement.

ACID Semantics in Practice

All three formats claim ACID. They deliver it at the table level, not the cross-table level — there is no distributed transaction across two tables in any of these formats. If your workload requires atomic writes across multiple tables, you need application-level coordination (write both tables in the same job, use a saga pattern for rollback).

Within a single table, the ACID guarantees are real: concurrent readers always see a consistent snapshot, concurrent writers are isolated (with conflict detection and retry), and commits are durable once the metadata is written.
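Snapshot isolation in particular has a simple shape. This sketch is illustrative, not any format's implementation: a reader pins the snapshot that was current when it started, so a concurrent commit never changes what it sees mid-query.

```python
# Sketch of snapshot isolation (illustrative): readers pin a version;
# writers publish new versions without disturbing pinned readers.

snapshots = {1: {"files": ["a.parquet"]}}
current_version = 1

def open_reader():
    return current_version  # pin the version at query start

def commit(files):
    global current_version
    current_version += 1
    snapshots[current_version] = {"files": files}

pinned = open_reader()
commit(["a.parquet", "b.parquet"])  # a writer lands mid-query

print(snapshots[pinned]["files"])           # stable view: ['a.parquet']
print(snapshots[current_version]["files"])  # new readers see the commit
```

Old snapshots staying addressable like this is also exactly what makes time travel possible; garbage collection of expired snapshots is what eventually bounds it.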

Time travel is one of the most practically useful features all three formats provide:

-- Iceberg time travel in Trino
SELECT * FROM prod.orders FOR TIMESTAMP AS OF TIMESTAMP '2026-03-01 00:00:00 UTC';
 
-- Delta time travel in Spark SQL
SELECT * FROM orders VERSION AS OF 42;
SELECT * FROM orders TIMESTAMP AS OF '2026-03-01';
 
# Hudi time travel in PySpark
df = (
    spark.read.format("hudi")
    .option("as.of.instant", "2026-03-01")
    .load("s3://my-bucket/hudi/orders")
)

The retention period for time travel is configurable; the default is typically 7 days. Longer retention means more metadata overhead and delayed garbage collection of old files. Set retention based on your actual rollback and audit requirements — not indefinitely.


Key Takeaways

  • Apache Iceberg has won the ecosystem battle in 2026 — it has the broadest native engine support and is the default choice for new multi-engine data lakehouses.
  • Delta Lake remains the best choice within Databricks-centric stacks, with mature ACID semantics and the simplest operational experience on that platform.
  • Hudi's Merge-on-Read table type is the strongest option for high-volume CDC/streaming upsert workloads where write throughput matters more than read-time overhead.
  • Schema evolution is where the formats genuinely differ: Iceberg's ID-based column tracking is the most robust, supporting column renames without data migration.
  • All three formats deliver single-table ACID guarantees and time travel; none provide cross-table transactions — coordinate multi-table atomicity at the application layer.
  • Choose based on your compute engine and workload pattern, not vendor prestige: Iceberg for multi-engine, Delta for Databricks, Hudi for streaming upserts.