Modern data privacy regulations impose specific technical obligations on organizations that store and process personal data. GDPR (the EU General Data Protection Regulation, effective 2018) and CCPA (the California Consumer Privacy Act, effective 2020) are the most impactful regulations for US and European data lakehouse operators. In practice, both call for the ability to delete an individual's data, produce data access reports, honor data portability requests, and keep audit trails that demonstrate compliance.

How Iceberg Addresses Compliance

Apache Iceberg provides several capabilities that map directly onto these regulatory requirements. The GDPR right to erasure can be met with Iceberg's row-level deletes: a targeted DELETE FROM table WHERE user_id = 'subject_id' executes as an ACID transaction. The delete alone is logical, however. Older snapshots still reference the original data files, so the rows are physically gone only after compaction rewrites the affected Parquet files and the snapshots that reference the old files are expired, at which point the unreferenced files can be deleted. Time travel supports data access requests: querying a historical snapshot shows what data existed at a past point in time, for as long as that snapshot is retained. Iceberg V3's row lineage (the _row_id system column) gives each row a stable identifier across rewrites, which helps when tracing how a data subject's rows changed over time. Fine-grained access controls (column masking, row filters) in Dremio and Apache Polaris restrict personal data to authorized personnel, supporting the data minimization principle.
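
The erasure and access steps described above can be sketched as a small Python helper that builds the Spark SQL statements involved. This is a minimal illustration rather than a definitive implementation: the function names, the catalog name lake, the table db.events, and the user_id column are hypothetical, and it assumes a Spark session configured with the Iceberg SQL extensions, where the rewrite_data_files and expire_snapshots procedures and TIMESTAMP AS OF queries are available.

```python
from datetime import datetime, timezone
from typing import List


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def _utc_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC 'YYYY-MM-DD HH:MM:SS' string.

    Assumption: the Spark session time zone is UTC, so the literal is
    interpreted as the same instant.
    """
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def erasure_statements(
    catalog: str, table: str, subject_id: str, cutoff: datetime
) -> List[str]:
    """Build the statements for a hypothetical erasure workflow.

    1. Row-level DELETE (logical removal, one ACID commit).
    2. Compaction that rewrites data files carrying at least one delete.
    3. Snapshot expiration so files from before the delete become unreferenced.
    """
    tbl = sql_literal(table)
    return [
        f"DELETE FROM {catalog}.{table} WHERE user_id = {sql_literal(subject_id)}",
        f"CALL {catalog}.system.rewrite_data_files("
        f"table => {tbl}, options => map('delete-file-threshold', '1'))",
        f"CALL {catalog}.system.expire_snapshots("
        f"table => {tbl}, older_than => TIMESTAMP {sql_literal(_utc_timestamp(cutoff))})",
    ]


def access_query(catalog: str, table: str, subject_id: str, as_of: datetime) -> str:
    """Build a time travel query showing a subject's rows at a past instant."""
    return (
        f"SELECT * FROM {catalog}.{table} "
        f"TIMESTAMP AS OF {sql_literal(_utc_timestamp(as_of))} "
        f"WHERE user_id = {sql_literal(subject_id)}"
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for statement in erasure_statements("lake", "db.events", "subject_id", now):
        print(statement)
    print(access_query("lake", "db.events", "subject_id", now))
```

Each string would be passed to spark.sql(...) in order. Note the tension between the two workflows: once snapshots are expired to complete an erasure, time travel to those snapshots is no longer possible, so retention windows need to balance access requests against deletion deadlines.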
