Azure Databricks — Databricks Administration using Terraform Integration
Problem Statement
Managing Databricks resources manually can be cumbersome and complex, especially in Development/UAT environments where frequent updates and changes are necessary. Automating this process with an Infrastructure as Code (IaC) tool like Terraform, driven by GitHub Actions pipelines, helps you maintain consistency and security.
In this article I will walk through an approach to automating various Databricks administration tasks using GitHub Actions.
Approach
The general idea is to create a GitHub Actions workflow, and one important goal is to avoid using Databricks personal access tokens (PATs). Instead, I want to authenticate with a service principal using a certificate or client secret.
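As a minimal sketch of this idea, the Databricks Terraform provider can authenticate with an Azure service principal and client secret, so no PAT is involved. The variable names here are placeholders:

```hcl
# Sketch: Databricks provider authenticating with an Azure service
# principal (client secret) instead of a Databricks PAT.
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

provider "databricks" {
  host                = var.databricks_host  # e.g. the ADB workspace URL
  azure_client_id     = var.client_id        # service principal application ID
  azure_client_secret = var.client_secret
  azure_tenant_id     = var.tenant_id
}
```

With this provider configuration, Terraform exchanges the service principal credentials for an Azure AD token at plan/apply time, so no long-lived Databricks token needs to be stored anywhere.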
Pre-requisites
Complete the following prerequisites before going further with the GitHub Actions approach:
- Make sure you add your client ID (service principal application ID), tenant ID, client secret, and Databricks host to your GitHub Actions secrets.
- Make sure the service principal is granted the right scope (permissions) on the target Databricks workspace.
- Make sure you define a storage account for Terraform state, with the correct backend storage key set.
- Have a valid Azure Databricks (ADB) instance URL.
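The Terraform state prerequisite above can be sketched as an `azurerm` backend block; the resource group, storage account, and container names here are placeholders you would replace with your own:

```hcl
# Sketch: remote Terraform state in an Azure Storage account.
# All names below are hypothetical placeholders.
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "tfstateaccount"
    container_name       = "tfstate"
    key                  = "databricks.terraform.tfstate"
  }
}
```

Keeping state in a shared backend like this is what lets the GitHub Actions runner (rather than a single developer's machine) own the plan/apply cycle.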
Workflow
Let's start with the actual GitHub Actions workflow. In my workflow I am defining a `workflow_call` trigger along with environment variables which are going to be…
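A reusable workflow of this shape might look like the following sketch. The secret names (`CLIENT_ID`, `CLIENT_SECRET`, `TENANT_ID`, `DATABRICKS_HOST`) are assumptions matching the prerequisites above; the `ARM_*` environment variables are the standard ones the Terraform Azure providers read:

```yaml
# Sketch: a reusable workflow (workflow_call) that exposes the service
# principal credentials as environment variables for Terraform.
name: databricks-terraform

on:
  workflow_call:

env:
  ARM_CLIENT_ID: ${{ secrets.CLIENT_ID }}
  ARM_CLIENT_SECRET: ${{ secrets.CLIENT_SECRET }}
  ARM_TENANT_ID: ${{ secrets.TENANT_ID }}
  DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - run: terraform plan
```

Because the workflow uses `workflow_call`, other pipelines in the repository can invoke it per environment (Dev, UAT) while passing environment-specific secrets.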