Azure LHO Deployment Readiness Checklist
Table of contents:
- 1. Decide which Databricks workspaces should be monitored by LHO
- 2. Determine which region the LHO infrastructure will be deployed into.
- 3. Understand the Azure resource requirements and ensure you use a subscription and region that allows the Azure resources listed below
- 4. Resource Provider Registrations
- 5. Understand the Required Permissions for Installation & Configuration
- 6. Identify all custom enforced policies in your Azure Subscription.
- 7. Resource Group Location
- 8. Decide on a DNS name
To prepare for an installation of Lakehouse Optimizer (LHO), your team needs to be aware of, or make decisions about, the following items before the deployment working session:
1. Decide which Databricks workspaces should be monitored by LHO
2. Determine which region the LHO infrastructure will be deployed into.
Ideally, use the same region as your Databricks workspaces to avoid cross-region data transfer fees.
Ensure that the region you choose has at least eight available Total Regional vCPUs.
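The vCPU requirement above can be checked ahead of the working session. The following is a minimal sketch, not part of the LHO tooling: it assumes you have saved compute usage data for the region as JSON (for example from `az vm list-usage --location <region> -o json`), and that each entry carries a `limit`, a `currentValue`, and a display name of "Total Regional vCPUs". The function names and the sample data are hypothetical.

```python
import json

# From the checklist: at least eight available Total Regional vCPUs.
REQUIRED_VCPUS = 8


def available_regional_vcpus(usage_json: str) -> int:
    """Return limit minus current usage for the 'Total Regional vCPUs' quota.

    Assumption: entries look like the JSON emitted by `az vm list-usage`,
    with the display name in name.localizedValue (or localName).
    """
    for item in json.loads(usage_json):
        name = item.get("name") or {}
        label = name.get("localizedValue") or item.get("localName")
        if label == "Total Regional vCPUs":
            # Values may arrive as strings, so convert before subtracting.
            return int(item["limit"]) - int(item["currentValue"])
    raise ValueError("No 'Total Regional vCPUs' entry found in usage data")


def region_has_capacity(usage_json: str, required: int = REQUIRED_VCPUS) -> bool:
    """True when the region has at least `required` unused regional vCPUs."""
    return available_regional_vcpus(usage_json) >= required


# Hypothetical sample data, for illustration only.
sample_usage = json.dumps([
    {"name": {"value": "availabilitySets", "localizedValue": "Availability Sets"},
     "currentValue": "0", "limit": "2500"},
    {"name": {"value": "cores", "localizedValue": "Total Regional vCPUs"},
     "currentValue": "2", "limit": "10"},
])

print(available_regional_vcpus(sample_usage))
print(region_has_capacity(sample_usage))
```

If the check fails, request a quota increase for the region or pick a different region before the working session.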
3. Understand the Azure resource requirements and ensure you use a subscription and region that allows the Azure resources listed below
These will be necessary for installation but will not be created until the working session for your deployment:
4. Resource Provider Registrations
The following resource providers must be registered in the subscription into which LHO will be deployed:
Microsoft.Network
Microsoft.Storage
Microsoft.KeyVault
Microsoft.Resources
Microsoft.Sql
Microsoft.Compute
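One way to verify the registrations listed above is to compare them against the subscription's provider list. The sketch below assumes you have exported that list as JSON (for example with `az provider list -o json`), where each entry has a `namespace` and a `registrationState` field. The function name and the sample data are hypothetical.

```python
import json

# Resource providers required by this checklist.
REQUIRED_PROVIDERS = [
    "Microsoft.Network",
    "Microsoft.Storage",
    "Microsoft.KeyVault",
    "Microsoft.Resources",
    "Microsoft.Sql",
    "Microsoft.Compute",
]


def unregistered_providers(provider_json: str) -> list[str]:
    """Return required providers that are missing or not in state 'Registered'.

    Assumption: entries carry 'namespace' and 'registrationState' fields,
    as in the JSON emitted by `az provider list`.
    """
    states = {
        entry["namespace"].lower(): entry.get("registrationState", "")
        for entry in json.loads(provider_json)
    }
    # Namespaces are compared case-insensitively.
    return [
        ns for ns in REQUIRED_PROVIDERS
        if states.get(ns.lower()) != "Registered"
    ]


# Hypothetical sample data, for illustration only.
sample_providers = json.dumps([
    {"namespace": "Microsoft.Network", "registrationState": "Registered"},
    {"namespace": "Microsoft.Storage", "registrationState": "Registered"},
    {"namespace": "Microsoft.KeyVault", "registrationState": "NotRegistered"},
    {"namespace": "Microsoft.Resources", "registrationState": "Registered"},
    {"namespace": "Microsoft.Compute", "registrationState": "Registered"},
])

print(unregistered_providers(sample_providers))
```

Any namespace the function returns can then be registered, for example with `az provider register --namespace <namespace>`, by a user who has permission to do so in the subscription.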
5. Understand the Required Permissions for Installation & Configuration
The Azure user who runs the LHO installation script (the deployer) must already have the following rights for the installation process to complete successfully:
Resource Group Owner
The user must be able to create a resource group, or be assigned as owner of the resource group in which the LHO resources will be installed.
Cloud Application Administrator
The user must be able to create an App Registration in Azure.
Application Developer
The user must be assigned the Application Developer role in order to create the LHO application’s service principal.
User Access Administrator Role
The signed-in user will grant the application the necessary permissions to load consumption data on a schedule and analyze telemetry data during https://blueprinttechnologies.atlassian.net/wiki/spaces/BLMPD/pages/3663888385 .
The signed-in user must have at least the User Access Administrator role in the subscription.
Databricks Metastore Admin
The user who configures LHO for the first time must be a Metastore Admin in the Databricks Unity Catalog.
We recommend creating a group, assigning it as the Metastore Admin, and adding admins as members of this group.
Databricks CREATE_VOLUME for main catalog
The user who configures LHO for the first time must have the CREATE_VOLUME permission on the main catalog.
The last two permissions are needed so that the Lakehouse Optimizer init script gets uploaded to the Databricks Unity Catalog and the catalog is configured to use it.
6. Identify all custom enforced policies in your Azure Subscription.
If your subscription has any custom resource policies (e.g., Purge Protection for Key Vaults), make sure to inform the Blueprint team beforehand.
7. Resource Group Location
Create the resource group in the same region as most, if not all, of your Databricks workspaces.
8. Decide on a DNS name
As part of the full deployment script (PowerShell), SSL certificates are created automatically (Let's Encrypt/Certbot) and an Azure DNS record is created for accessing LHO. The address looks like this: https://<dns-prefix>.<region>.cloudapp.azure.com, with the dns-prefix set as a parameter at deployment time.
If the desired DNS name is not part of the Azure tenant that LHO is deployed into, have someone create a DNS record in your DNS provider of choice.
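To preview the address that a given dns-prefix and region would produce, a small helper like the one below can be used. It is only an illustrative sketch: the function name is hypothetical, and the validation pattern is an assumption based on the rule Azure commonly reports for public IP DNS labels (lowercase letters, digits, and hyphens, 3 to 63 characters, starting with a letter). The deployment script may enforce different constraints.

```python
import re

# Assumed label rule for Azure public IP DNS names; not confirmed by the
# LHO deployment script.
_DNS_LABEL = re.compile(r"^[a-z][a-z0-9-]{1,61}[a-z0-9]$")


def lho_url(dns_prefix: str, region: str) -> str:
    """Build the LHO address https://<dns-prefix>.<region>.cloudapp.azure.com."""
    if not _DNS_LABEL.fullmatch(dns_prefix):
        raise ValueError(
            f"dns-prefix {dns_prefix!r} does not match the assumed label rule"
        )
    return f"https://{dns_prefix}.{region}.cloudapp.azure.com"


# Hypothetical prefix and region, for illustration only.
print(lho_url("lho-demo", "eastus"))
```

Checking the prefix before the working session avoids discovering a naming conflict or an invalid label in the middle of the deployment.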