Apache Iceberg is an open table format specification for large-scale analytical datasets stored in cloud object storage. It was originally developed at Netflix to solve reliability and performance problems with Hive-based tables at petabyte scale. It was open-sourced in 2018 and donated to the Apache Software Foundation, where it became a top-level project in 2020. By 2024-2025, Iceberg had become the dominant open table format standard, supported natively by Apache Spark, Apache Flink, Dremio, Trino, Snowflake, BigQuery, and dozens of other engines and platforms.

Iceberg is not a storage format. The actual data files are stored in Parquet (most commonly), ORC, or Avro. Iceberg is the metadata and transaction layer that sits above those data files and organizes them into a logical table with ACID semantics, versioned history, and rich partition management.

The Three-Layer Architecture

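The Iceberg specification defines three layers: the catalog, which points to each table's current metadata file; the metadata layer, made up of metadata files, manifest lists, and manifest files; and the data layer, which holds the data files themselves.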
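To make the layering concrete, here is a minimal in-memory sketch. It assumes the commonly described split into a catalog, a metadata layer, and a data layer. Every class and function name below is hypothetical and chosen for illustration; none of them is an Iceberg API, and real implementations store these structures as files in object storage rather than as Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    data_files: tuple  # paths of Parquet/ORC/Avro files (the data layer)


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    manifests: tuple  # stands in for the snapshot's manifest list


@dataclass(frozen=True)
class TableMetadata:
    snapshots: tuple  # full history, oldest first
    current_snapshot_id: int


class Catalog:
    """Hypothetical catalog: maps a table name to its current metadata."""

    def __init__(self):
        self._pointers = {}

    def current(self, name):
        return self._pointers.get(name)

    def commit(self, name, expected, new):
        # Compare-and-swap: the commit succeeds only if nobody else
        # replaced the pointer since `expected` was read.
        if self._pointers.get(name) is not expected:
            raise RuntimeError("concurrent commit detected; retry")
        self._pointers[name] = new


def append_files(catalog, name, paths):
    """Create a new snapshot that adds `paths`, then swap the pointer."""
    base = catalog.current(name)
    prior = base.snapshots if base else ()
    inherited = prior[-1].manifests if prior else ()
    snap = Snapshot(len(prior) + 1, inherited + (Manifest(tuple(paths)),))
    catalog.commit(name, base, TableMetadata(prior + (snap,), snap.snapshot_id))


def plan_files(meta, snapshot_id=None):
    """Walk metadata -> snapshot -> manifests -> data files."""
    sid = meta.current_snapshot_id if snapshot_id is None else snapshot_id
    snap = next(s for s in meta.snapshots if s.snapshot_id == sid)
    return [f for m in snap.manifests for f in m.data_files]
```

In this sketch a reader never lists directories. It follows pointers from the catalog down to the data files, and passing an older `snapshot_id` to `plan_files` reads the table as it was at that point in its history.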

Why Iceberg Won

Iceberg's engine-neutrality is its primary competitive advantage. Unlike Delta Lake, which originated in the Databricks/Spark ecosystem, Iceberg was designed from the start as a vendor-neutral specification. Any engine can implement the spec without licensing fees or proprietary dependencies. This made it the natural choice for organizations that want to use different engines for different workloads (Spark for batch transformation, Flink for streaming ingestion, Dremio for interactive BI queries) without being locked into one vendor's tools.

Hidden partitioning, Iceberg's mechanism for applying partition transforms automatically so that analysts do not have to include partition columns in their queries, was a significant usability improvement over Hive partitioning, where a query that omitted the partition columns silently triggered a full table scan.
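The idea can be sketched in a few lines of plain Python. This is a toy model, not Iceberg code: the `HiddenPartitionTable` class, the `event_ts` column, and the `scan` signature are all hypothetical, and `day_transform` only mirrors the general shape of a day-granularity partition transform (whole days since 1970-01-01 in UTC).

```python
from datetime import date, datetime, timezone

EPOCH = date(1970, 1, 1)


def day_transform(ts: datetime) -> int:
    """Map a timezone-aware timestamp to whole days since 1970-01-01 (UTC)."""
    return (ts.astimezone(timezone.utc).date() - EPOCH).days


class HiddenPartitionTable:
    """Hypothetical toy table partitioned by day(event_ts)."""

    def __init__(self):
        self.partitions = {}  # partition value -> list of rows

    def append(self, row: dict) -> None:
        # The writer derives the partition value from the source column;
        # the row itself carries no separate partition column.
        part = day_transform(row["event_ts"])
        self.partitions.setdefault(part, []).append(row)

    def scan(self, start: datetime, end: datetime):
        """Return rows with start <= event_ts < end, plus partitions read."""
        # The filter on event_ts is translated into a partition range,
        # so partitions outside that range are never opened.
        lo, hi = day_transform(start), day_transform(end)
        touched = sorted(p for p in self.partitions if lo <= p <= hi)
        rows = [
            r
            for p in touched
            for r in self.partitions[p]
            if start <= r["event_ts"] < end
        ]
        return rows, touched
```

The caller filters only on `event_ts`. The mapping from that predicate to partition values happens inside `scan`, which is why a query cannot accidentally bypass pruning by forgetting a partition column.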
