DuckDB is an open-source, in-process analytical database. Unlike traditional databases, which run as a separate server process and require a client-server connection, DuckDB is embedded directly in your application process. In that respect it resembles SQLite, but it is built for the high-throughput columnar workloads of analytics rather than for transactional storage. As of 2026, DuckDB has become a first-class citizen of the Apache Iceberg ecosystem.

DuckDB and Apache Iceberg

DuckDB's Iceberg extension (loaded via INSTALL iceberg; LOAD iceberg;) has matured significantly through 2025 and 2026.
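
As a minimal sketch of what loading the extension looks like in practice, the Python snippet below runs the INSTALL and LOAD statements and then counts the rows of an Iceberg table with the extension's iceberg_scan function. It assumes the duckdb Python package is installed and that the machine can download extensions. The helper names (sql_literal, scan_count_sql, count_rows) and any table path passed in are hypothetical and chosen only for illustration.

```python
# Statements quoted in the text: install once, load per session.
SETUP_SQL = ["INSTALL iceberg;", "LOAD iceberg;"]


def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def scan_count_sql(table_path: str) -> str:
    """Build a row-count query over an Iceberg table directory or metadata file."""
    return f"SELECT count(*) FROM iceberg_scan({sql_literal(table_path)});"


def count_rows(table_path: str) -> int:
    """Run the setup statements and the count query in an in-memory DuckDB.

    Assumption: the `duckdb` package is installed. It is imported lazily so the
    SQL-building helpers above can be used without it.
    """
    import duckdb

    con = duckdb.connect()  # no argument: in-memory database inside this process
    try:
        for statement in SETUP_SQL:
            con.execute(statement)
        return con.execute(scan_count_sql(table_path)).fetchone()[0]
    finally:
        con.close()
```

Calling count_rows with the path of a real Iceberg table would return its current row count. There is no server to start and no connection string to configure, because the engine runs inside the calling Python process.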

The Featherweight Lakehouse

The 2026 "featherweight lakehouse" pattern uses DuckDB as the primary query engine for small-to-medium analytical workloads (typically up to 1 TB of data on a single machine). By combining DuckDB with an open Iceberg catalog on Amazon S3, small teams can get full ACID transactions, time travel, and schema evolution without the platform overhead of a traditional managed warehouse. When a workload outgrows single-machine processing, the same Iceberg tables are immediately accessible to larger distributed engines such as Trino or Dremio, with no data migration.
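
The read side of this pattern can be sketched as a short list of SQL statements that a DuckDB session would run: load the needed extensions, pick up S3 credentials, and scan an Iceberg table, optionally pinned to an older snapshot for time travel. This is an illustration under stated assumptions, not a reference setup. The secret name lakehouse_s3, the helper lakehouse_statements, and any bucket path are hypothetical. The credential_chain provider is assumed to be available through DuckDB's aws extension, and the sketch reads the table by path rather than through a catalog, so it does not cover writes.

```python
from typing import List, Optional


def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def lakehouse_statements(table_path: str, snapshot_id: Optional[int] = None) -> List[str]:
    """Return the SQL a DuckDB session could run to query an Iceberg table on S3.

    table_path  -- hypothetical S3 location of the Iceberg table
    snapshot_id -- if given, read that snapshot instead of the latest (time travel)
    """
    scan_args = sql_literal(table_path)
    if snapshot_id is not None:
        # iceberg_scan accepts a snapshot_from_id argument for time travel.
        scan_args += f", snapshot_from_id = {int(snapshot_id)}"

    return [
        # Extensions: S3 access, AWS credential lookup (assumed), Iceberg reader.
        "INSTALL httpfs;", "LOAD httpfs;",
        "INSTALL aws;", "LOAD aws;",
        "INSTALL iceberg;", "LOAD iceberg;",
        # Assumption: credentials come from the standard AWS credential chain.
        "CREATE OR REPLACE SECRET lakehouse_s3 (TYPE s3, PROVIDER credential_chain);",
        # The analytical query itself.
        f"SELECT count(*) FROM iceberg_scan({scan_args});",
    ]
```

Each statement can be passed in order to a DuckDB connection's execute method. Because the table stays in Iceberg format on S3, a distributed engine can later read the same location without any change to the data.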
