Column-Level Security (CLS) is the governance control that restricts which users or roles can see specific columns within a table. Rather than creating separate, sanitized copies of tables for different user groups (an expensive and operationally complex approach), CLS enforces column visibility at query time: the query engine checks the requesting user's permissions against the requested columns and then returns the column data, masks it, or excludes the column from the result entirely.
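The three query-time outcomes described above (return, mask, exclude) can be illustrated with a minimal Python sketch. It is not how any particular engine implements CLS; the `POLICY` table, the role names, and the `apply_cls` helper are hypothetical, and the sketch assumes that a restricted column is excluded for any role without an explicit rule.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"      # return the real value
    MASK = "mask"        # return a placeholder instead of the value
    EXCLUDE = "exclude"  # drop the column from the result


# Hypothetical policy table: column -> {role -> action}.
# Columns that do not appear here are treated as non-sensitive.
POLICY = {
    "customer_ssn": {"hr": Action.ALLOW, "analyst": Action.MASK},
    "credit_card_token": {"hr": Action.ALLOW},
}


def resolve(column, role):
    """Decide what a role may see for one column."""
    rules = POLICY.get(column)
    if rules is None:
        return Action.ALLOW  # unrestricted column
    # Assumption: restricted columns are excluded unless a rule says otherwise.
    return rules.get(role, Action.EXCLUDE)


def apply_cls(rows, columns, role):
    """Project the requested columns, applying the per-column decision."""
    result = []
    for row in rows:
        visible = {}
        for column in columns:
            action = resolve(column, role)
            if action is Action.ALLOW:
                visible[column] = row[column]
            elif action is Action.MASK:
                visible[column] = "****"
            # Action.EXCLUDE: the column is left out of the result
        result.append(visible)
    return result


rows = [
    {"order_id": 1001, "customer_ssn": "000-12-3456", "credit_card_token": "tok_abc"},
]
requested = ["order_id", "customer_ssn", "credit_card_token"]
print(apply_cls(rows, requested, "analyst"))
print(apply_cls(rows, requested, "hr"))
```

In this sketch the underlying rows are never copied or rewritten; the same stored data yields a different projection per role, which is the property that distinguishes CLS from maintaining sanitized table copies.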
Why Column-Level Security Matters
Real-world enterprise tables often contain sensitive data alongside non-sensitive operational data in the same row. A customer orders table might include: order ID (non-sensitive), product SKU (non-sensitive), quantity (non-sensitive), customer SSN (highly sensitive PII), salary bracket (sensitive), and credit card token (sensitive). Most analysts need the first three columns for sales analysis but should never see the latter three. Without CLS, the usual alternative is to maintain separate de-identified views or table copies for each user group, which multiplies storage costs and maintenance burden.
CLS Implementation in Lakehouse Platforms
Column-level security in lakehouse environments is enforced at the query layer:
- Dremio: Implements column masking policies and grants that hide or mask specific columns for specific roles. A masking policy on the SSN column might return '***-**-****' for all non-HR users while returning the real value for authorized HR analysts, transparently and as part of query execution.
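- Apache Polaris: Role-based access control grants or revokes privileges on catalogs, namespaces, tables, and views in shared Iceberg catalogs. Polaris's documented privilege model operates on these objects rather than on individual columns, so column-level restrictions are typically applied by the query engine that reads from the catalog.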
- Databricks Unity Catalog: Supports column masks and row filters, defined as SQL user-defined functions and evaluated at query time, for Iceberg and Delta tables.
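A masking policy of the kind described for the SSN column is essentially a function of the column value and the caller's identity that the engine attaches to a column. The following Python sketch models that idea only; it does not reproduce the syntax or API of Dremio, Polaris, or Unity Catalog, and `mask_ssn`, `MASKING_POLICIES`, and `read_column` are hypothetical names.

```python
from typing import Callable, Dict, Set


def mask_ssn(value: str, roles: Set[str]) -> str:
    """Return the real SSN to HR members and a fixed mask to everyone else."""
    return value if "hr" in roles else "***-**-****"


# Hypothetical registry: attaches a masking function to a column name,
# standing in for the policy binding an engine would store in its catalog.
MASKING_POLICIES: Dict[str, Callable[[str, Set[str]], str]] = {
    "customer_ssn": mask_ssn,
}


def read_column(column: str, value: str, roles: Set[str]) -> str:
    """Apply the column's masking policy, if one is attached, at read time."""
    policy = MASKING_POLICIES.get(column)
    return policy(value, roles) if policy else value


print(read_column("customer_ssn", "000-12-3456", {"analyst"}))
print(read_column("customer_ssn", "000-12-3456", {"hr", "analyst"}))
print(read_column("product_sku", "SKU-42", {"analyst"}))
```

Because the policy is bound to the column rather than to a query, every query that touches the column goes through the same function, which is why users see masking as transparent.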

