In Apache Iceberg, data skipping relies on the per-column min/max statistics (lower and upper bounds) that the manifest files record for each data file. If values are scattered randomly across thousands of Parquet files, almost every file's min/max range overlaps the query's filter, so the query engine has to open and scan nearly all of them. To fix this, data engineers use compaction to rewrite the data in sorted order, grouping similar values into the same physical files.
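The pruning decision can be sketched in a few lines of Python. This is a simplified illustration, not Iceberg's implementation: the FileStats class, the file names, and the value ranges are all hypothetical, and only a single integer column with an equality filter is modeled.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical stand-in for the per-file bounds kept in a manifest."""
    path: str
    min_value: int
    max_value: int


def files_to_scan(files, value):
    """Keep only files whose [min, max] range could contain `value`."""
    return [f for f in files if f.min_value <= value <= f.max_value]


# Unsorted layout: every file spans almost the whole value range.
scattered = [
    FileStats("a.parquet", 1, 98),
    FileStats("b.parquet", 3, 100),
    FileStats("c.parquet", 2, 99),
]

# Sorted layout: each file covers a narrow, non-overlapping range.
clustered = [
    FileStats("a.parquet", 1, 33),
    FileStats("b.parquet", 34, 66),
    FileStats("c.parquet", 67, 100),
]

scattered_hits = files_to_scan(scattered, 50)  # no file can be skipped
clustered_hits = files_to_scan(clustered, 50)  # only b.parquet survives
```

With the scattered layout the filter value 50 falls inside every file's range, so nothing is skipped. After sorting, two of the three files are eliminated without being opened.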

The Limitation of Linear Sorting

Traditional sorting is hierarchical and linear. If you sort a table by country, and then by city, the data is perfectly clustered for a query filtering by country. However, if a user queries the table filtering only by city = 'Paris', the query engine will still have to scan many files: city values restart within every country block, so files belonging to many different countries have a city min/max range that covers "Paris". Linear sorting strongly biases performance toward the first column in the sort key.
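A toy example makes the bias concrete. The rows, the three-rows-per-file split, and the helper names below are illustrative assumptions; the min/max check mimics the kind of range test an engine performs on file statistics.

```python
# Hypothetical sample rows of (country, city).
ROWS = [
    ("Spain", "Madrid"), ("France", "Paris"), ("Germany", "Munich"),
    ("France", "Lyon"), ("Germany", "Berlin"), ("Spain", "Valencia"),
    ("Germany", "Stuttgart"), ("Spain", "Seville"), ("France", "Marseille"),
]
ROWS_PER_FILE = 3  # assumed file size, chosen so each country fills one file


def split_into_files(rows):
    """Cut an ordered row list into consecutive fixed-size 'files'."""
    return [rows[i:i + ROWS_PER_FILE] for i in range(0, len(rows), ROWS_PER_FILE)]


def files_matching(files, column, value):
    """Count files whose min/max range on `column` could contain `value`."""
    hits = 0
    for f in files:
        values = [row[column] for row in f]
        if min(values) <= value <= max(values):
            hits += 1
    return hits


# Linear sort: by country first, then by city.
linear_files = split_into_files(sorted(ROWS))

by_country = files_matching(linear_files, 0, "France")  # 1 file
by_city = files_matching(linear_files, 1, "Paris")      # all 3 files
```

The country filter touches a single file, while the city filter cannot rule out any file because "Paris" falls between the smallest and largest city name in every country's file.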

What is Z-Ordering?

Z-Ordering (also called a Z-order or Morton curve) is a space-filling curve technique that can be applied during Iceberg table compaction. Instead of sorting hierarchically, Z-Ordering maps multi-dimensional data onto a single dimension by interleaving the bits of the binary representations of the values from multiple columns. Rows are then sorted by that single interleaved value.
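Bit interleaving for two columns can be sketched as follows. This is a minimal illustration that assumes both values are small non-negative integers; a real engine first has to convert types such as strings or signed numbers into an order-preserving binary form, which is not shown here. The function name and the 16-bit default are arbitrary choices.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Build a Z-value by alternating the bits of x and y.

    Bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


# x = 3 is binary 011 and y = 5 is binary 101.
# Interleaved (y2 x2 y1 x1 y0 x0) this gives binary 100111.
z_value = interleave_bits(3, 5)
```

Because the high-order bits of both columns sit next to each other at the top of the Z-value, sorting by it groups rows that agree on the leading bits of every column, not just the first one.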

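This creates a layout where rows that are close in several dimensions at once (e.g., both country and city) end up in the same or neighboring data files, so each file covers a narrow min/max range on every Z-Ordered column. Z-Ordering removes most of the bias of hierarchical sorting by giving roughly equal weight to all columns in the Z-Order expression. The trade-off is that no single column is clustered as tightly as the leading column of a linear sort.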
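The effect on file skipping can be checked on a small synthetic grid. Everything here is an assumption made for illustration: two integer columns with values 0 to 7, four rows per file, and equality filters evaluated against per-file min/max ranges.

```python
def interleave_bits(x, y, bits=3):
    """Alternate the bits of x and y to form a Z-value."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


ROWS = [(x, y) for x in range(8) for y in range(8)]  # 64 synthetic rows
ROWS_PER_FILE = 4                                    # 16 files per layout


def split_into_files(rows):
    """Cut an ordered row list into consecutive fixed-size 'files'."""
    return [rows[i:i + ROWS_PER_FILE] for i in range(0, len(rows), ROWS_PER_FILE)]


def files_matching(files, column, value):
    """Count files whose min/max range on `column` could contain `value`."""
    hits = 0
    for f in files:
        values = [row[column] for row in f]
        if min(values) <= value <= max(values):
            hits += 1
    return hits


linear_files = split_into_files(sorted(ROWS))
zorder_files = split_into_files(sorted(ROWS, key=lambda r: interleave_bits(*r)))

# (files scanned for x == 3, files scanned for y == 3)
linear_counts = (files_matching(linear_files, 0, 3), files_matching(linear_files, 1, 3))
zorder_counts = (files_matching(zorder_files, 0, 3), files_matching(zorder_files, 1, 3))
```

On this grid the linear sort scans 2 of 16 files for the first column but 8 for the second. The Z-Ordered layout scans 4 files for either column: slightly worse on the leading column, much better on the other one.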

When to Use Z-Ordering

Z-Ordering is computationally expensive to execute during a compaction job, so it is best applied selectively rather than to every table.

Applying Z-Ordering to the most frequently filtered columns can substantially improve Iceberg's data skipping efficiency, which means fewer files scanned, faster query execution, and lower compute costs.
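In Spark, Z-Ordered compaction is requested through Iceberg's rewrite_data_files procedure with a sort strategy and a zorder(...) sort order. The helper below only builds that SQL statement. The function name, the catalog name my_catalog, the table db.events, and the column names are hypothetical placeholders.

```python
def zorder_compaction_sql(catalog: str, table: str, columns: list[str]) -> str:
    """Build a CALL statement for Iceberg's rewrite_data_files procedure."""
    zorder_expr = "zorder(" + ",".join(columns) + ")"
    return (
        f"CALL {catalog}.system.rewrite_data_files("
        f"table => '{table}', "
        f"strategy => 'sort', "
        f"sort_order => '{zorder_expr}')"
    )


# Hypothetical catalog, table, and column names.
statement = zorder_compaction_sql("my_catalog", "db.events", ["country", "city"])

# Assuming a SparkSession named `spark` that is configured with the Iceberg
# runtime and SQL extensions, the job would be started with:
# spark.sql(statement)
```

Keeping the column list short and limited to the columns that actually appear in filters keeps each column's clustering useful, since every extra column dilutes the others.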
