Governance Policy as Code applies the Infrastructure as Code philosophy to data governance. Instead of managing access control, masking rules, data quality expectations, and compliance policies by hand in a governance platform's GUI, teams define these policies as code (YAML, HCL, Python, SQL), store them in version control (Git), review them via pull requests, and deploy them automatically through CI/CD pipelines. This turns governance from an ad-hoc administrative activity into an auditable engineering discipline.
Why Policy as Code Matters
- Auditability: Every policy change is a Git commit with an author, a timestamp, and a commit message that can record the justification. Compliance auditors can see who changed which policy, when, and why, directly from the version history.
- Repeatability: Policies applied via code are consistent across environments. The same access control definitions applied to production are replicated exactly to staging.
- Drift Detection: CI pipelines can detect when the actual state of governance policies drifts from the desired state defined in code, automatically raising alerts or applying corrections.
- Peer Review: A new masking policy or access grant requires a pull request, enabling data governance teams to review and approve changes before they take effect.
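The drift detection described above can be sketched as a set comparison between the grants declared in version control and the grants the platform actually reports. This is a minimal illustration, not a specific tool's implementation: the `Grant` type, the `detect_drift` function, and the sample data are all hypothetical, and in practice the `actual` set would be read from the governance platform's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One access grant (hypothetical shape, for illustration only)."""
    principal: str
    resource: str
    privilege: str


def detect_drift(desired: set, actual: set) -> dict:
    """Compare the desired state (from Git) with the actual state (from the platform)."""
    return {
        # Declared in code but not present on the platform.
        "missing": desired - actual,
        # Present on the platform but never declared in code.
        "unexpected": actual - desired,
    }


# Desired state, as it might be parsed from policy files in the repository.
desired = {
    Grant("analyst", "sales.orders", "SELECT"),
    Grant("steward", "hr.salaries", "SELECT"),
}

# Actual state, hard-coded here; assumed to come from the platform in practice.
actual = {
    Grant("steward", "hr.salaries", "SELECT"),
    Grant("intern", "hr.salaries", "SELECT"),
}

drift = detect_drift(desired, actual)
for kind, grants in drift.items():
    for grant in sorted(grants, key=lambda g: (g.principal, g.resource)):
        print(f"{kind}: {grant.principal} {grant.privilege} on {grant.resource}")
```

A CI job could fail the build when either set is non-empty, or pass the differences to a separate step that applies corrections.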
Implementation Patterns
Apache Polaris exposes catalog roles and privileges through a management REST API, which can be wrapped in Terraform providers or custom CLI tools so that catalog permissions are managed as code. Data quality tools follow the same pattern: Great Expectations serializes expectation suites as JSON files and Soda Core defines checks in YAML, and both kinds of file can be committed to Git. Dremio's roles and permissions can be exported and managed via its REST API, enabling IaC workflows. Open Policy Agent (OPA) provides a general-purpose policy language (Rego) for expressing data access policies that catalog and query engine integrations can evaluate programmatically.
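To make the idea of programmatically evaluated access policies concrete, the sketch below models a small rule set as data and evaluates requests against it. It is written in Python rather than Rego and does not use OPA or any of the tools named above. The `Rule` type, the `is_allowed` function, and the role and table names are assumptions made for illustration, and the deny-by-default behavior is a design choice of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Allow `role` to perform `action` on resources under `resource_prefix`."""
    role: str
    action: str
    resource_prefix: str


def is_allowed(rules, roles, action, resource) -> bool:
    """Deny by default; allow only when at least one rule matches the request."""
    return any(
        rule.role in roles
        and rule.action == action
        and resource.startswith(rule.resource_prefix)
        for rule in rules
    )


# Rules as they might be declared in a version-controlled policy file.
RULES = [
    Rule(role="analyst", action="select", resource_prefix="sales."),
    Rule(role="steward", action="select", resource_prefix="hr."),
]

analyst_reads_sales = is_allowed(RULES, {"analyst"}, "select", "sales.orders")
analyst_reads_hr = is_allowed(RULES, {"analyst"}, "select", "hr.salaries")
print(analyst_reads_sales, analyst_reads_hr)
```

Because the rules are plain data, a change to them shows up as a reviewable diff, and the same evaluation function can be exercised in CI tests before a policy is deployed.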

