Amazon S3 (and similar services such as Azure Data Lake Storage or Google Cloud Storage) is the foundational storage layer of the modern data lakehouse. It scales to effectively unlimited capacity, is highly durable, and is inexpensive per gigabyte. However, it was built to store "objects" (such as images and videos), not to act as the highly structured file system that traditional data warehousing engines expect.

The Legacy Object Storage Problem

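Before Apache Iceberg, older technologies like Apache Hive attempted to map traditional database concepts onto S3 using a "directories-as-partitions" model, in which each partition is a directory and the table's contents are whatever files those directories currently hold. On object storage this caused serious performance and reliability issues, most notably that every query had to discover its input files through slow, paginated ListObjects calls.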
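
To make the cost of the directories-as-partitions model concrete, here is a minimal sketch in Python. It simulates an object store as a flat list of keys and counts how many paginated list requests a Hive-style scan needs just to find its files. The table layout, key names, and helper functions are hypothetical; the only real-world figure is the 1,000-key cap on a single S3 list response.

```python
from typing import List, Tuple

PAGE_SIZE = 1000  # an S3 list response returns at most 1,000 keys


def list_prefix(keys: List[str], prefix: str) -> Tuple[List[str], int]:
    """Return the keys under `prefix` and the number of list calls needed."""
    matches = sorted(k for k in keys if k.startswith(prefix))
    # Ceiling division; even an empty prefix costs one request.
    calls = max(1, -(-len(matches) // PAGE_SIZE))
    return matches, calls


def plan_hive_style_scan(
    keys: List[str], table_root: str, partitions: List[str]
) -> Tuple[List[str], int]:
    """Discover data files by listing each partition directory in turn."""
    files: List[str] = []
    total_calls = 0
    for partition in partitions:
        found, calls = list_prefix(keys, f"{table_root}/{partition}/")
        files.extend(found)
        total_calls += calls
    return files, total_calls


# Hypothetical table: 10 daily partitions with 2,500 files each.
partitions = [f"dt=2024-01-{day:02d}" for day in range(1, 11)]
keys = [
    f"warehouse/events/{p}/part-{i:05d}.parquet"
    for p in partitions
    for i in range(2500)
]

files, calls = plan_hive_style_scan(keys, "warehouse/events", partitions)
print(len(files), "files discovered with", calls, "list calls")
```

In this toy setup each partition needs three list pages, so the number of requests grows with both the partition count and the file count before a single byte of data is read.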

How Iceberg Fixes S3

Apache Iceberg was explicitly designed at Netflix to solve these exact object storage bottlenecks. It completely abandons the "directories-as-partitions" approach.

Instead of executing expensive ListObjects API calls across S3, an engine querying an Iceberg table reads the table's metadata: the current snapshot points to a manifest list, which in turn points to the manifest files. These manifests record the full path of every data file (typically Parquet) that belongs to the current snapshot. The query engine takes these paths and issues direct, highly parallelized GET requests to S3, bypassing the directory listing bottleneck entirely.
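
The metadata-driven lookup described above can be sketched with a toy in-memory model. This is an illustration under stated assumptions, not Iceberg's actual reader: the Manifest, Snapshot, and MetadataStore classes and all paths are hypothetical, and real manifests are Avro files that also carry per-file statistics, which this sketch omits.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Manifest:
    """Toy manifest: the full URIs of the data files it tracks."""
    data_files: List[str]


@dataclass
class Snapshot:
    """Toy snapshot: paths of its manifests (stands in for the manifest list)."""
    manifest_paths: List[str]


class MetadataStore:
    """In-memory stand-in for S3 that counts GETs and offers no list operation."""

    def __init__(self, objects: Dict[str, Manifest]) -> None:
        self._objects = objects
        self.get_calls = 0

    def get(self, path: str) -> Manifest:
        self.get_calls += 1
        return self._objects[path]


def plan_scan(snapshot: Snapshot, store: MetadataStore) -> List[str]:
    """Collect data file URIs by reading manifests only, never by listing."""
    uris: List[str] = []
    for path in snapshot.manifest_paths:
        uris.extend(store.get(path).data_files)  # one GET per manifest
    return uris


# Hypothetical table with two manifests tracking three data files.
store = MetadataStore({
    "s3://bucket/table/metadata/manifest-a.avro": Manifest([
        "s3://bucket/table/data/dt=2024-01-01/part-00000.parquet",
        "s3://bucket/table/data/dt=2024-01-01/part-00001.parquet",
    ]),
    "s3://bucket/table/metadata/manifest-b.avro": Manifest([
        "s3://bucket/table/data/dt=2024-01-02/part-00000.parquet",
    ]),
})
snapshot = Snapshot(manifest_paths=[
    "s3://bucket/table/metadata/manifest-a.avro",
    "s3://bucket/table/metadata/manifest-b.avro",
])

data_file_uris = plan_scan(snapshot, store)
print(len(data_file_uris), "data files found with", store.get_calls, "metadata GETs")
```

The point of the design is that the number of metadata reads depends on the number of manifests, not on how many directories or files the table holds, and the resulting URIs can be fetched in parallel.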

Object Storage Layout Optimization

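Another S3 limitation is API request throttling. S3 applies request-rate limits per prefix (the directory-like leading portion of an object key), documented by AWS as at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix. If an engine sends far more requests than that to the same prefix at once, S3 throttles them with 503 Slow Down responses, causing severe performance degradation.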

Iceberg addresses this through a feature called Object Store File Layout. When enabled, Iceberg writes data files with a hash component in the S3 object key (e.g., s3://bucket/table/data/1a2b3c.../file.parquet). This scatters the data across many different key prefixes, so load spreads over more of S3's underlying partitions and a single hot prefix is far less likely to hit the throttling limits.
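
As a rough illustration of the idea, the sketch below inserts a hash-derived segment into each object key and checks how widely the keys spread. It is a sketch under assumptions, not Iceberg's implementation: the hashed_data_key helper, the eight-character segment length, and the use of SHA-256 are all stand-ins chosen for clarity, and Iceberg uses its own hash function and key format.

```python
import hashlib
from collections import Counter


def hashed_data_key(table_root: str, file_name: str, hash_chars: int = 8) -> str:
    """Build an object key with a hash segment between data/ and the file name.

    SHA-256 is an assumption made for this sketch; it is deterministic, so the
    same file name always maps to the same key.
    """
    digest = hashlib.sha256(file_name.encode("utf-8")).hexdigest()
    return f"{table_root}/data/{digest[:hash_chars]}/{file_name}"


# Hypothetical batch of 1,000 sequentially named data files.
file_names = [f"part-{i:05d}.parquet" for i in range(1000)]
keys = [hashed_data_key("s3://bucket/table", name) for name in file_names]

# Group keys by the first hex character after "data/" to see the spread.
spread = Counter(key.split("/data/")[1][0] for key in keys)
print(len(spread), "distinct leading characters across", len(keys), "keys")
```

Sequential file names that would otherwise pile up under one prefix end up distributed across the hash space. In Iceberg itself this behavior is switched on through the write.object-storage.enabled table property rather than through user code.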
