Amazon S3 (Simple Storage Service) is one of the most widely deployed data lake storage backends. Its combination of virtually unlimited capacity, eleven nines (99.999999999%) of designed durability, and pricing of a few cents per GB per month made it the de facto home for enterprise data that doesn't fit in a traditional database. Building a well-structured S3 Data Lake is the first step toward a production Data Lakehouse.

Bucket and Prefix Design

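S3 organizes objects using buckets and key prefixes. A well-designed S3 Data Lake separates data by domain and processing tier. A common pattern uses a single bucket with prefixes that mirror the Medallion Architecture: bronze for raw ingested data, silver for cleaned data, and gold for curated, consumption-ready data, alongside a metadata prefix for catalog files.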
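The prefix layout described above can be sketched as a small key-building helper. This is a minimal illustration, not a prescribed convention: the bucket name `example-data-lake`, the `<tier>/<domain>/<table>/<file>` ordering, and the function names are all assumptions chosen for the example.

```python
# Hypothetical bucket name; replace with your own.
BUCKET = "example-data-lake"

# Medallion tiers used as top-level prefixes (assumed ordering: tier first).
TIERS = ("bronze", "silver", "gold")


def object_key(tier: str, domain: str, table: str, filename: str) -> str:
    """Build an object key of the form <tier>/<domain>/<table>/<filename>."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    return f"{tier}/{domain}/{table}/{filename}"


def s3_uri(key: str, bucket: str = BUCKET) -> str:
    """Return the full s3:// URI for a key in the lake bucket."""
    return f"s3://{bucket}/{key}"


if __name__ == "__main__":
    # Example: a raw orders file landing in the bronze tier.
    key = object_key("bronze", "sales", "orders", "part-0000.parquet")
    print(s3_uri(key))
```

Putting the tier first keeps each tier under a single prefix, which makes it straightforward to scope permissions per tier. Placing the domain first is an equally reasonable choice when teams own whole domains.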

IAM and Access Control

AWS Identity and Access Management (IAM) policies govern which principals (users, roles, AI agent task execution roles) can read or write specific S3 prefixes. In a lakehouse context, IAM policies should be organized by data tier: the AI agent's IAM role gets read access to gold-tier prefixes and no write permissions. ETL pipeline roles get write access to bronze and silver tiers. Data stewards get write access to the metadata prefix for catalog operations. These IAM boundaries complement (but do not replace) the row-level and column-level security enforced by the query engine and catalog.
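As a rough sketch of the tier-based boundary described above, the following builds a read-only IAM policy document for a single prefix, such as the gold tier an AI agent's role would read from. The bucket name, the `gold` prefix, and the function name are hypothetical. The policy grants only `s3:ListBucket` (restricted to the prefix) and `s3:GetObject`; because IAM denies by default, no write access results.

```python
import json


def read_only_policy(bucket: str, prefix: str) -> dict:
    """Return an IAM policy document allowing reads under one prefix only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing, but only for keys under the given prefix.
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Allow reading objects under the prefix; no write actions.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and tier names, for illustration only.
    print(json.dumps(read_only_policy("example-data-lake", "gold"), indent=2))
```

An ETL role would follow the same shape with `s3:PutObject` added and the resource pointed at the bronze and silver prefixes instead.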

The Upgrade Path: From S3 Data Lake to Iceberg Lakehouse

Many organizations already hold data in S3 in Parquet format. Migrating to an Apache Iceberg Lakehouse does not require rewriting those data files. Iceberg can register existing Parquet files into an Iceberg table through a metadata-only operation that creates the manifest and manifest list files pointing to the existing Parquet objects. After registration, the S3 files are queryable as a governed Iceberg table with schema evolution, access control, and time travel across the snapshots created from that point on, without a single byte of data being copied.
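One way to perform this kind of metadata-only registration is the `add_files` procedure that Iceberg exposes to Spark. The sketch below only builds the `CALL` statement as a string. The catalog name, table name, and S3 path are hypothetical, and the sketch assumes the target Iceberg table already exists with a schema compatible with the Parquet files.

```python
def add_files_sql(catalog: str, table: str, parquet_path: str) -> str:
    """Build a CALL statement for Iceberg's add_files Spark procedure."""
    # Reject quote characters rather than attempt escaping in this sketch.
    for value in (catalog, table, parquet_path):
        if "'" in value or "`" in value:
            raise ValueError(f"unsupported character in {value!r}")
    return (
        f"CALL {catalog}.system.add_files("
        f"table => '{table}', "
        f"source_table => '`parquet`.`{parquet_path}`')"
    )


if __name__ == "__main__":
    # Hypothetical catalog, table, and path, for illustration only.
    print(add_files_sql("lake", "gold.orders",
                        "s3://example-data-lake/gold/sales/orders"))
```

In a Spark session configured with the Iceberg runtime and the named catalog, the resulting string would be passed to `spark.sql(...)`. Running it against a copy or a test table first is prudent, since the registered files become part of the table's metadata.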
