The Modern Data Stack (MDS) is the category of cloud-native, composable data tools that emerged from roughly 2016 onward to replace the previous generation of monolithic, on-premises data warehouses and proprietary ETL platforms. The MDS philosophy favors specialized, best-of-breed SaaS tools connected by standard APIs over vertically integrated platforms that lock every capability into a single vendor.

The Classic MDS Components

The canonical Modern Data Stack of 2020-2022 typically included:

The Lakehouse Evolution

The Modern Data Stack is evolving. The proprietary cloud warehouse at the center of the classic MDS is increasingly being replaced by the Open Lakehouse (Iceberg tables in object storage, queried through Dremio), which addresses the MDS's two primary weaknesses: high storage cost and limited support for ML/AI workloads. In the evolved MDS, dbt writes transformation models to Iceberg tables rather than to Snowflake schemas, and the same Iceberg data feeds both BI dashboards and ML training pipelines without ETL duplication.
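The "one copy, two consumers" idea can be illustrated with a minimal Python sketch. It uses an in-memory list as a hypothetical stand-in for a single gold-tier Iceberg table; the `OrderRow` type, the `GOLD_ORDERS` data, and both function names are assumptions made for illustration, not part of any Iceberg, dbt, or Dremio API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderRow:
    customer_id: str
    region: str
    amount: float


# Hypothetical stand-in for one gold-tier table that a transformation
# job has already written. In a real lakehouse this would be an Iceberg
# table in object storage, not a Python list.
GOLD_ORDERS = [
    OrderRow("c1", "EU", 120.0),
    OrderRow("c2", "US", 80.0),
    OrderRow("c1", "EU", 30.0),
]


def revenue_by_region(rows):
    """BI-style consumer: total revenue per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row.region] += row.amount
    return dict(totals)


def customer_features(rows):
    """ML-style consumer: (order count, total spend) per customer."""
    features = {}
    for row in rows:
        count, spend = features.get(row.customer_id, (0, 0.0))
        features[row.customer_id] = (count + 1, spend + row.amount)
    return features


# Both consumers read the same rows; no second copy is produced.
dashboard = revenue_by_region(GOLD_ORDERS)
training_set = customer_features(GOLD_ORDERS)
```

The point of the sketch is only the shape of the data flow: the dashboard aggregate and the training features are derived from the same stored rows rather than from separately loaded copies.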

Where AI Enters

The third generation of the Modern Data Stack incorporates AI at multiple layers. AI-powered ingestion tools use LLMs to map source schemas to destination schemas automatically, reducing configuration time for new connectors. AI transformation assistants suggest dbt model code based on natural language descriptions of the desired output. AI agents query the final gold-tier Iceberg tables through the execution engine to answer business questions in real time. This AI-integrated MDS, built on an open lakehouse foundation, is the architecture that the Agentic Lakehouse represents.
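The schema-mapping step can be sketched as follows. This is not an LLM: it swaps in simple fuzzy string matching from Python's standard `difflib` module to show the input and output shape such a tool might have. The function names, column names, and cutoff value are hypothetical.

```python
import difflib


def normalize(name):
    """Lowercase a column name and drop common separators."""
    return name.lower().replace("_", "").replace("-", "").replace(" ", "")


def suggest_mapping(source_cols, dest_cols, cutoff=0.6):
    """Propose a destination column for each source column.

    Fuzzy string matching stands in for the LLM here. Columns with no
    candidate above the cutoff map to None so a human can review them.
    """
    normalized_dest = {normalize(d): d for d in dest_cols}
    mapping = {}
    for src in source_cols:
        matches = difflib.get_close_matches(
            normalize(src), list(normalized_dest), n=1, cutoff=cutoff
        )
        mapping[src] = normalized_dest[matches[0]] if matches else None
    return mapping


proposal = suggest_mapping(
    ["CustomerID", "order-total", "zzz"],
    ["customer_id", "order_total", "created_at"],
)
```

Here `CustomerID` and `order-total` are matched to `customer_id` and `order_total`, while `zzz` is left unmapped. An LLM-based tool would presumably also use column descriptions, types, and sample values, which plain string similarity cannot capture.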
