Effective data caching is one of the most impactful levers for achieving sub-second analytical performance on a lakehouse built on cloud object storage. Because S3 and similar services have latency and throughput characteristics fundamentally different from those of local disk, sophisticated lakehouse engines implement several caching layers at different levels of the stack.
The Caching Hierarchy
Modern lakehouse query engines manage a caching hierarchy that mirrors the CPU memory hierarchy (L1/L2 cache, RAM, SSD, HDD):
- Result Cache (L1 equivalent): The fastest and smallest tier. Stores complete results of recently executed queries and is invalidated when the underlying Iceberg table receives a new snapshot. Serves identical repeated queries in milliseconds without re-executing them.
- Columnar NVMe Cache (L2 equivalent): Stores frequently accessed raw Parquet column chunks on local NVMe SSDs attached to executor nodes. Examples include Dremio's Columnar Cloud Cache (C3) and Alluxio (used with Trino). Reads from local NVMe can be 10-100x faster than reads from S3, depending on object size and access pattern, and the cached chunks serve any query that touches the same underlying data, not just repeats of one query.
- Distributed Memory Cache (RAM equivalent): Some engines maintain in-memory pools of hot data across the cluster. These provide the lowest latency for raw data reads but are volatile (lost on node restart) and expensive per GB at cloud memory pricing.
- Metadata Cache: Stores Iceberg manifest files, manifest lists, and schema information in memory, eliminating repeated S3 reads just to plan a query. Critical for tables with many small partitions where metadata reads are a significant fraction of query planning time.
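The data tiers above can be pictured as a read-through chain: check the fastest tier first, fall back to slower ones, and go to object storage only on a full miss. The following is a minimal sketch of that idea, not how any particular engine implements it. The names `LRUTier` and `TieredReader` and the `fetch_from_object_store` callback are hypothetical, and each tier is modeled as a small in-process LRU map rather than real RAM or NVMe storage.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Optional


class LRUTier:
    """Fixed-capacity LRU map standing in for one cache tier (hypothetical)."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.capacity = capacity
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: str, value: bytes) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


class TieredReader:
    """Read-through lookup across tiers ordered fastest to slowest."""

    def __init__(
        self,
        tiers: List[LRUTier],
        fetch_from_object_store: Callable[[str], bytes],
    ) -> None:
        self.tiers = tiers
        self.fetch = fetch_from_object_store  # assumed stand-in for an S3 GET
        self.hits: Dict[str, int] = {t.name: 0 for t in tiers}
        self.hits["object_store"] = 0

    def read(self, key: str) -> bytes:
        for i, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                self.hits[tier.name] += 1
                # Promote the entry into every faster tier that missed.
                for faster in self.tiers[:i]:
                    faster.put(key, value)
                return value
        # Full miss: pay the object-store round trip, then fill all tiers.
        value = self.fetch(key)
        self.hits["object_store"] += 1
        for tier in self.tiers:
            tier.put(key, value)
        return value
```

Constructing `TieredReader([LRUTier("memory", 1), LRUTier("nvme", 2)], fetch)` and reading a few keys shows the intended behavior: a key evicted from the small memory tier can still be served from the larger NVMe tier without another object-store fetch.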
Cache Invalidation and Iceberg
Iceberg's immutable snapshot model simplifies cache invalidation significantly. Because the metadata files of an existing snapshot never change (a commit only adds new files), any cached metadata for a specific snapshot ID remains valid for as long as that snapshot exists. When the table advances to a new snapshot, only caches tied to the table's current state, such as the result cache, need invalidation, and the engine can use the new snapshot's manifest files to identify precisely which data files are new and which are unchanged.
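One way to picture this is to key cached results by snapshot ID, so that a new snapshot produces a miss instead of a stale answer, and to compare the file sets of two snapshots to see which cached data files are still usable. The sketch below is a toy model under stated assumptions: `Snapshot`, `SnapshotKeyedResultCache`, and `diff_snapshots` are hypothetical names, and a snapshot is reduced to a set of data file paths. Real Iceberg manifests record a per-entry status (existing, added, deleted) rather than requiring a set comparison.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Simplified stand-in for an Iceberg snapshot (hypothetical)."""

    snapshot_id: int
    data_files: FrozenSet[str]


class SnapshotKeyedResultCache:
    """Result cache keyed by (snapshot_id, query text).

    An entry can never be served for a different snapshot, so no explicit
    invalidation is needed for correctness. Dropping old entries only
    reclaims space. A miss is reported as None, so this sketch cannot
    distinguish a miss from a cached None result.
    """

    def __init__(self) -> None:
        self._results: Dict[Tuple[int, str], object] = {}

    def get(self, snapshot_id: int, sql: str) -> Optional[object]:
        return self._results.get((snapshot_id, sql))

    def put(self, snapshot_id: int, sql: str, result: object) -> None:
        self._results[(snapshot_id, sql)] = result

    def drop_snapshot(self, snapshot_id: int) -> int:
        """Remove all entries for one snapshot; return how many were removed."""
        stale = [key for key in self._results if key[0] == snapshot_id]
        for key in stale:
            del self._results[key]
        return len(stale)


def diff_snapshots(
    old: Snapshot, new: Snapshot
) -> Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]:
    """Return (added, removed, unchanged) data file paths between snapshots."""
    added = new.data_files - old.data_files
    removed = old.data_files - new.data_files
    unchanged = old.data_files & new.data_files
    return added, removed, unchanged
```

In this model, cached column chunks for files in the unchanged set stay valid across the commit, files in the added set are simply cold, and files in the removed set become candidates for eviction once no query is still reading the old snapshot.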

