LHO Application Resource & Permissions Requirements (Azure)
This page provides an overview of the Azure resources required to deploy Lakehouse Optimizer.
If you use the PowerShell script provided in this guide https://blueprinttechnologies.atlassian.net/wiki/spaces/BLMPD/pages/2515566593, the required resources and permissions will be created automatically.
Resource Group in your Azure tenant
One Resource Group in your Azure tenant (in the same region as your Azure Databricks resources) containing:
Azure Ubuntu Linux VM
OS: Ubuntu Linux 24.04
Recommended Type: B8ms or similar, with a minimum of 8 cores, 32 GB RAM, and 50 GB of disk space
Docker Engine installed (version 23.0 or later)
Docker Compose installed (version v2.40.0 or later)
Following ports accessible:
80 - Internet access (used by LetsEncrypt servers to issue/renew SSL certificate)
443 - IP restricted HTTPS access (firewall) for locations LHO will be used from
22 - IP restricted SSH access (firewall) for locations LHO will be managed from
If any Databricks Workspaces have Secure Cluster Connectivity enabled, a private endpoint of type databricks_ui_api is required between the LHO VNET and each Secure Azure Databricks Workspace
An extra 50GB datadisk attached to LUN 0 (Linux VM disk Volume)
Outbound Internet access: required for system package updates and Blueprint Azure Container Registry access for retrieving LHO docker images (ACR login server: blueprint.azurecr.io)
DNS Name configured on the VM's public IP
SSL Keystore with the SSL Certificate and Private Key, if the LetsEncrypt Certificate is not an option. See https://blueprinttechnologies.atlassian.net/wiki/x/CYDXrg for details.
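The port requirements listed for the VM can be spot-checked from a machine in one of the allowed locations. The following is a minimal sketch using only the Python standard library; the function name `check_ports` and the timeout value are illustrative assumptions, not part of LHO. A successful TCP connection only shows that the port is reachable from where the script runs. It does not verify the IP restrictions themselves.

```python
import socket
from typing import Dict, Iterable

# Ports this page lists for the LHO VM: 80 (LetsEncrypt), 443 (HTTPS), 22 (SSH).
LHO_VM_PORTS = (80, 443, 22)


def check_ports(host: str,
                ports: Iterable[int] = LHO_VM_PORTS,
                timeout: float = 3.0) -> Dict[int, bool]:
    """Return {port: reachable} based on whether a TCP connection succeeds."""
    results: Dict[int, bool] = {}
    for port in ports:
        try:
            # create_connection resolves the host and opens a TCP socket.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[port] = False
    return results
```

Calling `check_ports("<VM-DNS>")` with the DNS name configured on the VM's public IP returns a dictionary keyed by port number, which makes it easy to see which firewall rule still needs attention.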
Azure SQL Server
Recommended Type: S3 with 400 DTU or higher (alternatively, Serverless with a minimum of 2 cores, up to 8)
SQL login (User & Pass)
Empty Azure/MS SQL Database
Collation on LHO DB: SQL_Latin1_General_CP1_CI_AS
SQL Login user granted the following permissions:
db_ddladmin
db_datareader
db_datawriter
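If the database is prepared by hand, the three role memberships above can be expressed as T-SQL statements. The sketch below only generates the statements as strings; it assumes a database user mapped to the SQL login already exists in the LHO database, and the user name `lho_app` is a hypothetical placeholder.

```python
from typing import Iterable, List

# Database roles this page requires for the LHO SQL login user.
REQUIRED_ROLES = ("db_ddladmin", "db_datareader", "db_datawriter")


def quote_identifier(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def role_membership_statements(db_user: str,
                               roles: Iterable[str] = REQUIRED_ROLES) -> List[str]:
    """Build one ALTER ROLE ... ADD MEMBER statement per required role."""
    user = quote_identifier(db_user)
    return [
        f"ALTER ROLE {quote_identifier(role)} ADD MEMBER {user};"
        for role in roles
    ]
```

For example, `role_membership_statements("lho_app")` yields `ALTER ROLE [db_ddladmin] ADD MEMBER [lho_app];` followed by the equivalent statements for `db_datareader` and `db_datawriter`, which can then be run against the LHO database by an administrator.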
Azure Key Vault
Recommended Type: Standard
Recommended permission model: Vault Access Policy
List, Get and Set permissions granted for Secrets to the VM’s System-Assigned Managed Identity
If RBAC is required instead of Vault Access Policy, grant the Key Vault Secrets Officer role to the VM’s System-Assigned Managed Identity
The following secrets created:
msft-provider-auth-secret - the Microsoft EntraID App Registration client secret (for Azure Active Directory Single Sign-On into the application, as well as for the LHO Application and Agent to access the Azure Storage account storing telemetry data)
storage-account-key - (only needed if accessing the storage account through the Service Principal is not an option) the storage account access key which will be used by the LHO Service Principal and the LHO Agent running in each Databricks workspace cluster to access (read/write) the Azure Storage account
mssql-password - the SQL Login password
application-encryption-secret - the secret used to encrypt data inside the LHO database (any random string)
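The secret list above lends itself to a small pre-flight check. The sketch below is illustrative only: the helper names are hypothetical, the list of existing secret names is assumed to come from whatever tooling you use to list the vault, and the 32-byte length for the random encryption secret is an assumption (the page only asks for "any random string").

```python
import secrets
from typing import Iterable, List

# Secrets this page always requires in the Key Vault.
REQUIRED_SECRETS = {
    "msft-provider-auth-secret",
    "mssql-password",
    "application-encryption-secret",
}
# Only needed when the storage account is accessed with an access key.
STORAGE_KEY_SECRET = "storage-account-key"


def missing_secrets(existing_names: Iterable[str],
                    use_storage_key: bool = False) -> List[str]:
    """Return the required secret names that are not present yet, sorted."""
    required = set(REQUIRED_SECRETS)
    if use_storage_key:
        required.add(STORAGE_KEY_SECRET)
    return sorted(required - set(existing_names))


def new_encryption_secret(num_bytes: int = 32) -> str:
    """Generate a URL-safe random string for application-encryption-secret."""
    return secrets.token_urlsafe(num_bytes)
```

Using the `secrets` module rather than `random` keeps the generated value suitable for cryptographic use.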
Azure Storage Account
Storage account key access enabled (if access key is used instead of Service Principal)
Hierarchical namespace enabled
Service Principal granted the following roles at the storage account level:
Storage Table Data Contributor
Storage Queue Data Contributor
Network access is required from each Databricks workspace (the VNet of the managed resource group) into the LHO storage account, since the LHO agent running in the data plane of each Databricks workspace writes telemetry data to Azure tables and sends messages to Azure queues
Microsoft EntraID App Registration/Service Principal
Permissions granted as per https://blueprinttechnologies.atlassian.net/wiki/x/TAArnw
Permission to list Databricks workspaces granted as per https://blueprinttechnologies.atlassian.net/wiki/spaces/BLMPD/pages/3663888385/Initial+Setup+and+Configuration+Azure#Assign-workspace-read-permissions-via-Azure-AD-custom-role
Role-based authorization in the LHO app can be configured through Azure Custom App Roles created as per https://blueprinttechnologies.atlassian.net/wiki/spaces/BLMPD/pages/2670493810
Billing Reader role granted on every Azure Subscription (or scoped to every Resource Group, data plane and control plane/managed resource group) that contains Databricks workspaces. We recommend granting the role on the entire subscription so that any new workspace is captured by LHO during consumption data loading. LHO will filter out non-Databricks data.
Additional information can be found in https://blueprinttechnologies.atlassian.net/wiki/spaces/BLMPD/pages/35744645131
Web Platform configuration with:
redirect URI: https://<VM-DNS>/login/oauth2/code/azure
ID Token enabled
Granted Admin role in all Databricks workspaces
System table permissions:
USE_CATALOG, USE_SCHEMA and SELECT granted on the following tables in the system catalog:
billing.usage
query.history
compute.warehouse_events
compute.warehouses
Enable All Workspaces option set on the main catalog
Configuration of support for Unity Catalog can be performed by a Metastore Admin user from the LHO Web Interface, provisioning area.
If this is not an option, or the metastore is not configured properly, follow this document for full manual configuration: https://blueprinttechnologies.atlassian.net/wiki/x/LoDNo
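For the manual route, the system table permissions listed above can be written as Databricks SQL GRANT statements. The sketch below only builds the statements as strings and is not taken from the linked configuration document; the principal name is a placeholder, and which identifier to use for the LHO Service Principal (for example, its application ID) is an assumption to confirm in your workspace.

```python
from typing import Iterable, List

# System tables this page lists, relative to the `system` catalog.
SYSTEM_TABLES = (
    "billing.usage",
    "query.history",
    "compute.warehouse_events",
    "compute.warehouses",
)


def system_table_grants(principal: str,
                        tables: Iterable[str] = SYSTEM_TABLES) -> List[str]:
    """Build GRANT statements for catalog, schema, and table access."""
    tables = list(tables)
    # Backtick-quote the principal, doubling any embedded backtick.
    who = "`" + principal.replace("`", "``") + "`"
    statements = [f"GRANT USE CATALOG ON CATALOG system TO {who};"]
    # One USE SCHEMA grant per distinct schema (billing, compute, query).
    schemas = sorted({table.split(".")[0] for table in tables})
    statements += [
        f"GRANT USE SCHEMA ON SCHEMA system.{schema} TO {who};"
        for schema in schemas
    ]
    statements += [
        f"GRANT SELECT ON TABLE system.{table} TO {who};"
        for table in tables
    ]
    return statements
```

With the default table list this produces eight statements: one for the catalog, three for the schemas, and four for the tables. They would need to be run by a user with sufficient privileges on the metastore, such as a Metastore Admin.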