Snowflake Query Engine

Snowflake's query engine is built around a three-layer architecture that strictly decouples storage, compute, and cloud services, enabling independent scaling of each tier. This architectural choice was foundational to Snowflake's commercial success and set the template for cloud-native data warehouse design.

The Three Layers

Database Storage Layer: Snowflake stores all data in a proprietary, optimized columnar format on cloud object storage (Amazon S3, Azure Blob Storage, or Google Cloud Storage). Data is automatically divided into micro-partitions: immutable, compressed units of columnar data that each hold roughly 50 MB to 500 MB of uncompressed data. Snowflake automatically records metadata for every micro-partition, including the min/max range of values in each column, which lets the engine skip partitions that cannot match a query's filters without any manual indexing.
Compute Layer (Virtual Warehouses): Virtual warehouses are independent compute clusters. Different teams can run separate warehouses against the same underlying data simultaneously without competing for compute resources. A warehouse scales up (a larger warehouse size) for heavy individual queries and scales out (additional clusters in a multi-cluster warehouse) for high concurrency.
Cloud Services Layer: This is Snowflake's "brain," managing authentication, access control, transaction management, and query optimization. The cost-based query optimizer in this layer analyzes micro-partition statistics to build optimal execution plans.
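To make the pruning idea concrete, here is a minimal Python sketch of how per-partition min/max statistics can rule out partitions before any data is read. It is an illustration of the general technique only, not Snowflake's implementation: the `MicroPartition` class, the `prune` function, the partition names, and the date values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroPartition:
    """Hypothetical metadata entry for one micro-partition."""
    name: str
    min_value: int  # smallest value of the filtered column in this partition
    max_value: int  # largest value of the filtered column in this partition


def prune(partitions, low, high):
    """Keep only partitions whose [min, max] range overlaps [low, high]."""
    return [p for p in partitions if p.max_value >= low and p.min_value <= high]


# Illustrative metadata for an order_date column stored as YYYYMMDD integers.
partitions = [
    MicroPartition("mp_001", 20240101, 20240131),
    MicroPartition("mp_002", 20240201, 20240229),
    MicroPartition("mp_003", 20240301, 20240331),
    MicroPartition("mp_004", 20240215, 20240310),
]

# Filter: order_date BETWEEN 20240205 AND 20240220
to_scan = prune(partitions, 20240205, 20240220)
print([p.name for p in to_scan])  # ['mp_002', 'mp_004']
```

Only two of the four partitions overlap the filter range, so the other two are never scanned. The decision uses metadata alone, which is why this kind of pruning needs no manually maintained index.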

Gen2 Runtime and Snowflake Optima (2025)

Snowflake introduced its Gen2 Warehouse Runtime in 2025, delivering significant performance improvements including up to 5.5x faster DML operations (DELETE, UPDATE, MERGE) and up to 1.8x faster core analytical aggregations. Simultaneously, Snowflake Optima became generally available as an autonomous, continuous optimization engine. Optima acts as a virtual DBA, continuously monitoring workload patterns and proactively optimizing the data layout, micro-partition clustering, and metadata management without requiring manual administrator intervention.

Iceberg Integration

Snowflake's query engine increasingly treats Apache Iceberg tables as first-class citizens alongside its native tables. The same vectorized, push-based execution engine and automatic clustering capabilities that optimize native Snowflake tables are now extended to Iceberg tables, enabling organizations to benefit from Snowflake's performance and governance while maintaining open-format data ownership on their own cloud storage.
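For readers unfamiliar with the term, the following Python sketch shows what "vectorized, push-based" execution means in general: operators work on batches of values instead of single rows, and each operator pushes its output to the next one instead of being polled for results. The `Scan`-style function and the `Filter` and `Sum` classes are hypothetical and greatly simplified; this is not Snowflake's engine code.

```python
from typing import Callable, List

Batch = List[int]  # one column's values for a batch of rows


class Sum:
    """Terminal operator: accumulates a running total over pushed batches."""

    def __init__(self) -> None:
        self.total = 0

    def push(self, batch: Batch) -> None:
        self.total += sum(batch)


class Filter:
    """Applies a predicate to a whole batch, then pushes survivors downstream."""

    def __init__(self, predicate: Callable[[int], bool], downstream) -> None:
        self.predicate = predicate
        self.downstream = downstream

    def push(self, batch: Batch) -> None:
        kept = [v for v in batch if self.predicate(v)]
        if kept:  # skip the downstream call when nothing survives
            self.downstream.push(kept)


def scan(values: List[int], batch_size: int, downstream) -> None:
    """Source operator: drives the pipeline by pushing fixed-size batches."""
    for start in range(0, len(values), batch_size):
        downstream.push(values[start:start + batch_size])


# Pipeline: scan -> filter(even values) -> sum
total = Sum()
pipeline = Filter(lambda v: v % 2 == 0, total)
scan(list(range(1, 11)), batch_size=4, downstream=pipeline)
print(total.total)  # 30
```

The source drives the flow of data here, which is the defining trait of a push-based design; in a pull-based design the final operator would instead request rows from its inputs.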
