DuckDB is an open-source, in-process analytical database. Unlike traditional databases that run as a separate server process requiring a client-server connection, DuckDB is embedded directly within your application process, similar to SQLite but built specifically for the high-throughput columnar workloads of analytics rather than transactional storage. In 2026, DuckDB has become a first-class citizen of the Apache Iceberg ecosystem.
DuckDB and Apache Iceberg
DuckDB's Iceberg extension (loaded via INSTALL iceberg; LOAD iceberg;) has matured significantly through 2025 and 2026. Key capabilities include:
- Full Read/Write DML: DuckDB now supports INSERT, UPDATE, and DELETE operations on Iceberg tables, allowing it to serve as a lightweight Iceberg write engine for applications that don't need the overhead of a full Spark cluster.
- REST Catalog Integration: DuckDB connects directly to Iceberg REST Catalogs (like Apache Polaris or Amazon S3 Tables) via the ATTACH command, providing a zero-infrastructure, serverless analytics experience.
- Browser Support via WASM: The DuckDB-Wasm build added Iceberg support in late 2025/early 2026, enabling end-to-end Iceberg table exploration directly in a web browser without any server or local installation.
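To make the REST catalog workflow concrete, here is a minimal Python sketch that assembles the SQL a session might run: install and load the extension, register a bearer-token secret, then ATTACH the catalog. The warehouse name, endpoint URL, token, secret name, and the `lake` alias are all hypothetical placeholders, and the exact ATTACH and secret options can differ between extension versions, so treat this as an illustration under those assumptions and check it against the documentation for your installed release.

```python
from typing import List


def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_iceberg_session_sql(
    warehouse: str,
    endpoint: str,
    token: str,
    alias: str = "lake",  # hypothetical catalog alias
) -> List[str]:
    """Return the statements that attach an Iceberg REST catalog.

    All argument values are placeholders supplied by the caller; the
    secret name `iceberg_secret` is an arbitrary choice for this sketch.
    """
    if not alias.isidentifier():
        raise ValueError(f"alias must be a plain identifier, got {alias!r}")
    return [
        "INSTALL iceberg",
        "LOAD iceberg",
        # Assumption: the catalog accepts a bearer token for authentication.
        f"CREATE SECRET iceberg_secret (TYPE iceberg, TOKEN {sql_literal(token)})",
        f"ATTACH {sql_literal(warehouse)} AS {alias} "
        f"(TYPE iceberg, ENDPOINT {sql_literal(endpoint)}, SECRET iceberg_secret)",
    ]


def run_statements(statements: List[str]):
    """Execute the statements in a fresh in-process DuckDB connection."""
    import duckdb  # imported lazily so the builder works without duckdb installed

    con = duckdb.connect()
    for statement in statements:
        con.execute(statement)
    return con


if __name__ == "__main__":
    # Print the statements instead of running them; the endpoint is fictional.
    for stmt in build_iceberg_session_sql(
        warehouse="demo_warehouse",
        endpoint="https://catalog.example.com/api/catalog",
        token="REPLACE_ME",
    ):
        print(stmt + ";")
```

Once the catalog is attached, its tables can be addressed as `lake.<namespace>.<table>` in ordinary SELECT, INSERT, UPDATE, and DELETE statements, subject to the write support of the extension version in use. Keeping statement construction separate from execution makes the SQL easy to inspect or log before anything touches a real catalog.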
The Featherweight Lakehouse
The 2026 "featherweight lakehouse" pattern uses DuckDB as the primary query engine for small-to-medium analytical workloads (typically up to 1 TB of data on a single machine). By combining DuckDB with an open Iceberg catalog on Amazon S3, small teams get ACID transactions, time travel, and schema evolution without the platform overhead of traditional managed warehouses. When a workload outgrows a single machine, the same Iceberg tables are immediately accessible to larger, distributed engines like Trino or Dremio without any data migration.

