Dagster is a data orchestration platform built around software-defined assets (SDAs), a conceptual shift from task-based orchestrators like Airflow. Where Airflow models a pipeline as a series of tasks to execute, Dagster models it as a set of data assets to produce. Each dbt model, each Iceberg table, and each ML feature becomes an explicit asset that can carry declared dependencies, owners, freshness requirements, and quality checks.

Software-Defined Assets and Iceberg

In Dagster, an Iceberg table is an asset. A dbt model that reads from that table and produces a Gold-layer table is a downstream asset. Dagster builds the asset graph from the dependencies each asset declares, giving data engineers a visual lineage of all data assets and how they depend on one another. When an upstream Iceberg table is refreshed by a new CDC snapshot, Dagster can trigger the downstream dbt models automatically, propagating freshness through the graph. This asset-centric model aligns naturally with the Iceberg lakehouse's table-first organization.
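To make the propagation idea concrete, here is a minimal sketch in plain Python using only the standard library. It is not Dagster's API or its internal algorithm; it is a toy model of the same idea, and the asset names, `ASSET_DEPS`, `downstream_of`, and `refresh_plan` are all hypothetical. Each asset declares its upstreams, and a refresh of one Iceberg table yields an ordered list of downstream assets to rebuild.

```python
from graphlib import TopologicalSorter

# Hypothetical asset graph: each asset maps to the upstream assets it reads from.
ASSET_DEPS = {
    "iceberg.orders_raw": set(),
    "iceberg.customers_raw": set(),
    "dbt.orders_silver": {"iceberg.orders_raw"},
    "dbt.revenue_gold": {"dbt.orders_silver", "iceberg.customers_raw"},
}


def downstream_of(asset, deps):
    """Return every asset that depends on `asset`, directly or transitively."""
    affected = set()
    frontier = [asset]
    while frontier:
        current = frontier.pop()
        for candidate, upstreams in deps.items():
            if current in upstreams and candidate not in affected:
                affected.add(candidate)
                frontier.append(candidate)
    return affected


def refresh_plan(refreshed, deps):
    """Order the affected downstream assets so each runs after its upstreams."""
    affected = downstream_of(refreshed, deps)
    # Keep only edges between affected assets; other upstreams did not change.
    subgraph = {name: deps[name] & affected for name in affected}
    return list(TopologicalSorter(subgraph).static_order())


# A new CDC snapshot lands in the raw orders table.
plan = refresh_plan("iceberg.orders_raw", ASSET_DEPS)
print(plan)  # ['dbt.orders_silver', 'dbt.revenue_gold']
```

In a real Dagster project the dependencies would be declared on the assets themselves and the orchestrator would decide when to run them; the sketch only shows why declaring the graph up front makes downstream propagation a mechanical step.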
