Azure Lakehouse Optimizer (LHO) Deployment Readiness Checklist

To prepare for an installation of Lakehouse Optimizer, your team needs to review the following items and make the decisions they call for before the deployment working session:

1. Decide which Databricks workspaces should be monitored by LHO.

2. Determine which region the LHO infrastructure will be deployed into. Ideally, use the same region as your Databricks workspaces to avoid cross-region data transfer fees.

Ensure that the region you choose has at least eight available Total Regional vCPUs.
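
The vCPU check can be scripted. The following is a minimal sketch, assuming the quota data has been exported to JSON with `az vm list-usage --location <region> -o json`. The function name and the sample values are hypothetical, and the field names (`name.localizedValue`, `currentValue`, `limit`) are an assumption about how that command reports usage, so verify them against your own output.

```python
import json

REQUIRED_VCPUS = 8  # from the checklist: at least eight available Total Regional vCPUs


def available_regional_vcpus(usage_records):
    """Return limit minus current usage for the 'Total Regional vCPUs' quota.

    usage_records is assumed to be the parsed JSON list from
    `az vm list-usage`; values may arrive as strings, so they are cast to int.
    """
    for record in usage_records:
        if record["name"]["localizedValue"] == "Total Regional vCPUs":
            return int(record["limit"]) - int(record["currentValue"])
    raise LookupError("No 'Total Regional vCPUs' entry found in usage data")


# Hypothetical sample data standing in for real CLI output.
sample = json.loads("""
[
  {"name": {"value": "cores", "localizedValue": "Total Regional vCPUs"},
   "currentValue": "4", "limit": "10"},
  {"name": {"value": "availabilitySets", "localizedValue": "Availability Sets"},
   "currentValue": "0", "limit": "2500"}
]
""")

free = available_regional_vcpus(sample)
print(f"Available regional vCPUs: {free} (need {REQUIRED_VCPUS})")
if free >= REQUIRED_VCPUS:
    print("Quota is sufficient")
else:
    print("Request a quota increase or choose another region")
```

With the sample data, the region falls short of the requirement, which is the case where a quota increase should be requested before the working session.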

3. Understand the Azure resource requirements and ensure you use a subscription and region that allow the Azure resources listed below.

These will be necessary for installation but will not be created until the working session for your deployment:

Azure Resource Requirements

4. Resource Provider Registrations

The following resource providers must be registered in the subscription into which LHO will be deployed. For registration steps, see https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/resource-providers-and-types#register-resource-provider-1

  • Microsoft.Network

  • Microsoft.Storage

  • Microsoft.KeyVault

  • Microsoft.Resources

  • Microsoft.Sql

  • Microsoft.Compute
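
The provider list can be checked against a subscription before the working session. The following is a minimal sketch, assuming the provider data has been exported with `az provider list -o json` and that each record carries `namespace` and `registrationState` fields. The function name and the sample records are hypothetical.

```python
# Providers named in the checklist above.
REQUIRED_PROVIDERS = [
    "Microsoft.Network",
    "Microsoft.Storage",
    "Microsoft.KeyVault",
    "Microsoft.Resources",
    "Microsoft.Sql",
    "Microsoft.Compute",
]


def unregistered_providers(provider_records, required=REQUIRED_PROVIDERS):
    """Return the required namespaces that are not in the 'Registered' state.

    Namespaces are compared case-insensitively; a provider that is missing
    from the records entirely is reported as unregistered.
    """
    registered = {
        record["namespace"].lower()
        for record in provider_records
        if record.get("registrationState") == "Registered"
    }
    return [ns for ns in required if ns.lower() not in registered]


# Hypothetical sample standing in for the parsed CLI output.
sample_providers = [
    {"namespace": "Microsoft.Network", "registrationState": "Registered"},
    {"namespace": "Microsoft.Storage", "registrationState": "Registered"},
    {"namespace": "Microsoft.KeyVault", "registrationState": "NotRegistered"},
    {"namespace": "Microsoft.Resources", "registrationState": "Registered"},
    {"namespace": "Microsoft.Compute", "registrationState": "Registering"},
]

for namespace in unregistered_providers(sample_providers):
    print(f"Needs registration: {namespace}")
```

Any namespace the sketch reports can then be registered by following the Microsoft documentation linked above.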

5. Understand the Required Permissions for Installation & Configuration

The Azure account used to run the LHO installation script must have the following rights already granted in order for the installation process to complete successfully.

  • Resource Group Owner - the user must have the ability to create a resource group or be assigned as owner of the resource group in which LHO resources will be installed

  • Cloud Application Administrator - the user must be able to create an App Registration in Azure.

  • Application Developer - the user must be assigned the Application Developer role in order to create the LHO application’s service principal.

  • User Access Administrator Role - the signed-in user will grant the application the permissions it needs to load consumption data on a schedule and analyze telemetry data. The signed-in user must have at least the User Access Administrator role in the subscription.

  • Databricks Metastore Admin - the user configuring the Optimizer for the first time must be a Metastore Admin in the Databricks Unity Catalog. We recommend creating a group, assigning it as the Metastore Admin, and adding admins as members of this group. This is needed so that the Lakehouse Optimizer init script gets uploaded to the Databricks Unity Catalog and the catalog is configured to use it.

6. Identify all custom enforced policies in your Azure Subscription.

  • If your subscription has any custom resource policies (e.g., Purge Protection for Key Vaults), inform the Blueprint team beforehand.

7. Resource Group Creation

  • Create the resource group in the same region as your Databricks workspaces.

8. Decide on a DNS name

  • As part of deployment, certificates are created automatically and an Azure DNS record is created for accessing LHO. The URL has the form https://<dns-prefix>.<region>.cloudapp.azure.com, where dns-prefix is set as a parameter when deploying.

  • If the desired DNS name is not part of the Azure account that LHO is deployed into, have someone available who can create the DNS record with your DNS provider of choice.
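
To preview the address a given dns-prefix would produce, the URL pattern above can be sketched as a small helper. The function name is hypothetical, and the validation pattern is an assumption based on the label rule Azure applies to public IP domain name labels (lowercase letters, digits, and hyphens, 3 to 63 characters, starting with a letter and not ending with a hyphen). Confirm the rule against the current Azure documentation before relying on it.

```python
import re

# Assumed label rule for Azure-provided cloudapp.azure.com names.
DNS_LABEL_PATTERN = re.compile(r"^[a-z][a-z0-9-]{1,61}[a-z0-9]$")


def lho_url(dns_prefix, region):
    """Build the LHO URL in the form https://<dns-prefix>.<region>.cloudapp.azure.com."""
    if not DNS_LABEL_PATTERN.fullmatch(dns_prefix):
        raise ValueError(f"dns-prefix {dns_prefix!r} does not match the assumed label rule")
    return f"https://{dns_prefix}.{region}.cloudapp.azure.com"


# Hypothetical prefix and region used only for illustration.
print(lho_url("contoso-lho", "eastus"))
```

Uniqueness of the prefix within the region is not covered by this sketch and can only be confirmed against Azure itself.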