Role-Based Access Control (RBAC) is an access control model in which permissions are assigned to roles and roles are granted to users or groups, so permissions are never assigned directly to individual users. RBAC is the foundation of data governance in most enterprise lakehouse deployments and is central to the security model of Apache Polaris, Databricks Unity Catalog, and Dremio.
How RBAC Works in the Lakehouse
In a lakehouse RBAC system:
- Permissions are the specific capabilities: SELECT on a table, CREATE TABLE in a namespace, DROP TABLE on a specific dataset.
- Roles are named groupings of permissions: "analyst_us" might have SELECT on all US sales tables, while "data_engineer" has full privileges on staging tables.
- Users or Groups are granted roles. A new analyst joining the US team is granted the "analyst_us" role and immediately has all appropriate permissions with a single administrative action.
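The three pieces above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the API of any particular catalog: the class names (Permission, Role, AccessControl), the table names, and the principal "new_analyst" are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Permission:
    action: str    # e.g. "SELECT", "CREATE TABLE", "DROP TABLE"
    resource: str  # hypothetical table or namespace name


@dataclass
class Role:
    name: str
    permissions: set = field(default_factory=set)


class AccessControl:
    """Toy RBAC store: permissions attach to roles, roles attach to principals."""

    def __init__(self):
        self.roles = {}   # role name -> Role
        self.grants = {}  # user or group name -> set of role names

    def define_role(self, name, permissions):
        self.roles[name] = Role(name, set(permissions))

    def grant_role(self, principal, role_name):
        if role_name not in self.roles:
            raise KeyError(f"unknown role: {role_name}")
        self.grants.setdefault(principal, set()).add(role_name)

    def is_allowed(self, principal, action, resource):
        # Deny by default: access exists only through a granted role.
        wanted = Permission(action, resource)
        return any(
            wanted in self.roles[role_name].permissions
            for role_name in self.grants.get(principal, ())
        )


acl = AccessControl()
acl.define_role("analyst_us", [Permission("SELECT", "sales.us_orders")])
acl.define_role("data_engineer", [
    Permission("SELECT", "staging.events"),
    Permission("CREATE TABLE", "staging"),
    Permission("DROP TABLE", "staging.events"),
])

# Onboarding a new analyst is a single grant, not one grant per table.
acl.grant_role("new_analyst", "analyst_us")
```

Because the permission check only ever goes through roles, changing what "analyst_us" can do updates every holder of that role at once.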
RBAC in Apache Polaris and Unity Catalog
Apache Polaris implements a hierarchical RBAC model specifically designed for multi-engine Iceberg environments. Principal roles are granted to principals (users and service accounts). Catalog roles, which are in turn granted to principal roles, hold the privileges on namespaces and tables within a catalog. Because Polaris enforces RBAC at the catalog layer, any engine connecting to Polaris (Spark, Trino, Dremio, DuckDB) is subject to the same access controls without requiring per-engine permission configuration.
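The two-level hierarchy can be illustrated with a small resolution function. This sketch models the idea in plain Python and does not call the Polaris API. It assumes the chain principal, then principal role, then catalog role, then privilege. The principal, role, catalog, and table names are hypothetical, and the privilege strings are used only as illustrative labels.

```python
# principal -> principal roles (hypothetical service identities)
principal_roles = {
    "spark_etl_service": {"etl_engineer"},
    "trino_bi_service": {"bi_reader"},
}

# principal role -> catalog roles, keyed as (catalog, catalog role)
catalog_role_grants = {
    "etl_engineer": {("sales_catalog", "staging_writer")},
    "bi_reader": {("sales_catalog", "sales_reader")},
}

# catalog role -> privileges on tables, as (privilege, table)
catalog_role_privileges = {
    ("sales_catalog", "staging_writer"): {
        ("TABLE_READ_DATA", "staging.events"),
        ("TABLE_WRITE_DATA", "staging.events"),
    },
    ("sales_catalog", "sales_reader"): {
        ("TABLE_READ_DATA", "sales.us_orders"),
    },
}


def authorize(principal, catalog, privilege, table):
    """Walk principal -> principal role -> catalog role -> privilege."""
    for p_role in principal_roles.get(principal, ()):
        for cat, c_role in catalog_role_grants.get(p_role, ()):
            if cat != catalog:
                continue
            granted = catalog_role_privileges.get((cat, c_role), ())
            if (privilege, table) in granted:
                return True
    return False  # deny by default
```

Note that `authorize` takes no engine argument. The decision depends only on the principal and the catalog's grants, which is why the same answer applies whichever engine issues the request.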
RBAC for AI Agents
As AI agents increasingly access lakehouse data, RBAC becomes critical for agent governance. An AI agent analyzing customer behavior should be granted a role with SELECT-only access to anonymized behavioral tables, not access to the underlying PII. Treating AI agents as first-class principals in the RBAC system, with carefully scoped roles, ensures that agentic workloads operate within the same governance boundaries as human users.
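As a rough sketch of what a scoped agent role might look like, the example below registers an agent as its own principal with a SELECT-only role and checks every table before the agent reads anything. All names here (the "agent:customer_insights" principal, the "behavior_agent_ro" role, the table names, and the helper functions) are hypothetical, and a real deployment would rely on the catalog's own enforcement rather than application code like this.

```python
class AccessDenied(Exception):
    """Raised when a principal lacks the required permission."""


# role -> set of (action, table); the agent role covers anonymized data only
ROLE_PERMISSIONS = {
    "behavior_agent_ro": {("SELECT", "analytics.behavior_anonymized")},
}

# The agent is a principal in its own right, not a borrowed human account.
PRINCIPAL_ROLES = {
    "agent:customer_insights": {"behavior_agent_ro"},
}


def check_access(principal, action, table):
    for role in PRINCIPAL_ROLES.get(principal, ()):
        if (action, table) in ROLE_PERMISSIONS.get(role, ()):
            return
    raise AccessDenied(f"{principal} may not {action} {table}")


def approve_agent_read(principal, tables):
    """Return the tables only if the principal may SELECT every one of them."""
    for table in tables:
        check_access(principal, "SELECT", table)
    return list(tables)


approved = approve_agent_read(
    "agent:customer_insights", ["analytics.behavior_anonymized"]
)
```

A request that touches a table outside the role, such as a hypothetical "crm.customers_pii", raises AccessDenied before any data is read, so the agent fails closed instead of silently widening its scope.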

