Provisioning with Unity Catalog Enabled

Introduction

Compute with shared access mode is only available when Unity Catalog is enabled for the workspace. Global init scripts are not executed on shared access mode clusters, so LHO can enable monitoring for them only through cluster-scoped init scripts stored in Unity Catalog volumes, together with Spark configurations and tags set at the cluster level.

Moreover, loading init scripts from UC volumes is only supported in Databricks Runtime 13.3 and later.
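As a rough illustration of the two constraints above, the sketch below builds a cluster-scoped init script entry that points at a UC volume, and refuses to do so for runtimes older than 13.3. It assumes runtime version strings of the form 13.3.x-scala2.12 and the volumes/destination shape used for init scripts in Databricks cluster specifications. The function names are hypothetical and are not part of LHO.

```python
import re

# Minimum Databricks Runtime that can load init scripts from UC volumes.
MIN_RUNTIME = (13, 3)


def runtime_version(spark_version: str) -> tuple:
    """Extract (major, minor) from a string such as '13.3.x-scala2.12'."""
    match = re.match(r"(\d+)\.(\d+)", spark_version)
    if match is None:
        raise ValueError(f"unrecognized runtime version: {spark_version!r}")
    return int(match.group(1)), int(match.group(2))


def volume_init_script_entry(spark_version: str, script_path: str) -> dict:
    """Return one init_scripts entry for a script stored in a UC volume."""
    if runtime_version(spark_version) < MIN_RUNTIME:
        raise ValueError("init scripts in UC volumes need Databricks Runtime 13.3+")
    if not script_path.startswith("/Volumes/"):
        raise ValueError("expected a path under /Volumes/")
    return {"volumes": {"destination": script_path}}
```

The version is compared as a pair of integers rather than as text, so that 9.1 is correctly treated as older than 13.3.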

cluster-init-script is an LHO script that copies the LHO Telemetry Agent jar from DBFS to a storage area accessible by clusters: either /Volumes or /Workspace (workspace files), depending on the Unity Catalog setup.

Automatically enabling Unity Catalog

A metastore admin has the option to automatically enable LHO with Unity Catalog by following these steps:

  1. In LHO, under Settings → Provisioning and permissions, click Upload under Volumes Init Script.

  2. Once this step completes successfully, the init script is uploaded. You can verify it in your catalog under /Volumes/main/default/bp-lakehouse-monitor/script/init_script.sh

Using different catalogs

To use a catalog other than main, update an LHO environment variable and restart LHO before uploading the init script.

  1. SSH into the LHO VM and locate the .env file.

  2. In this file, locate or add the WORKSPACES_OVERRIDES environment variable. It tells LHO which path to upload the init script to, and it supports per-workspace settings, as in this example:

WORKSPACES_OVERRIDES={workspaceId1}:/Volumes/path/to/init-script.sh,{workspaceId2}:/Volumes/path/to/init-script.sh
  3. After saving the file, restart the Lakehouse Optimizer container for the new settings to take effect:

docker restart bplm
  4. Once the app restarts, the metastore admin can upload the init script as outlined in the section above by clicking Upload under Settings → Provisioning and permissions.
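A malformed WORKSPACES_OVERRIDES value is easy to produce by hand, so it can help to generate and check the line before pasting it into .env. The sketch below assumes the format is exactly what the example shows (comma-separated workspaceId:path pairs, with the braces in the example being placeholders) and that every path lives under /Volumes/. How LHO itself parses the variable is not documented here, and the helper names and workspace IDs are hypothetical.

```python
def build_workspaces_overrides(paths_by_workspace: dict) -> str:
    """Join {workspace_id: path} pairs into a WORKSPACES_OVERRIDES value."""
    entries = []
    for workspace_id, path in paths_by_workspace.items():
        if not path.startswith("/Volumes/"):
            raise ValueError(f"path for {workspace_id} is not under /Volumes/")
        if "," in path:
            raise ValueError("a comma in a path would break the list format")
        entries.append(f"{workspace_id}:{path}")
    return ",".join(entries)


def parse_workspaces_overrides(value: str) -> dict:
    """Split a WORKSPACES_OVERRIDES value back into {workspace_id: path}."""
    overrides = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first colon only: the ID comes first, the path second.
        workspace_id, separator, path = entry.partition(":")
        if not separator or not workspace_id or not path.startswith("/Volumes/"):
            raise ValueError(f"malformed entry: {entry!r}")
        overrides[workspace_id] = path
    return overrides


if __name__ == "__main__":
    # Hypothetical workspace IDs and catalog names, for illustration only.
    line = build_workspaces_overrides({
        "1111": "/Volumes/catalog_a/default/bp-lakehouse-monitor/script/init_script.sh",
        "2222": "/Volumes/catalog_b/default/bp-lakehouse-monitor/script/init_script.sh",
    })
    print(f"WORKSPACES_OVERRIDES={line}")
```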

Manually enabling Unity Catalog

If uploading the init script from the LHO app is not acceptable, the admin must upload the cluster-init-script to /Volumes manually by following these steps.

 

Step 1) Download the LHO cluster-init-script from the following link, replacing LHO_VERSION with your LHO version:

  • https://bplmdemoappstg.blob.core.windows.net/libraries/LHO_VERSION/init_script.sh

    • e.g. https://bplmdemoappstg.blob.core.windows.net/libraries/2.27.1/init_script.sh
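The substitution of LHO_VERSION can be sketched as follows. The base URL and file name are taken from the link above; the function names and the local destination file are hypothetical, and the download helper simply uses Python's standard urllib.

```python
import urllib.request

BASE_URL = "https://bplmdemoappstg.blob.core.windows.net/libraries"


def init_script_url(lho_version: str) -> str:
    """Build the download URL for a given LHO version, e.g. '2.27.1'."""
    version = lho_version.strip()
    if not version or "/" in version:
        raise ValueError(f"unexpected LHO version: {lho_version!r}")
    return f"{BASE_URL}/{version}/init_script.sh"


def download_init_script(lho_version: str, destination: str = "init_script.sh") -> str:
    """Download the script to a local file and return that file's path."""
    urllib.request.urlretrieve(init_script_url(lho_version), destination)
    return destination
```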

 

Step 2) Prepare storage path in Catalog Explorer

  • open Catalog Explorer

  • select schema: main/default. If the main catalog is not an option, choose or create a different one, but be sure to grant proper access to it.

  • create volume: bp-lakehouse-monitor

The catalog, schema, and volume name are configurable; the values above are the defaults.

Users who own shared clusters must be granted access to the configured catalog, schema, and volume so that the init script can be loaded from it.

Each workspace to be monitored must also be able to access the configured catalog.

 

Step 3) Upload the script to:

/Volumes/main/default/bp-lakehouse-monitor/script/

Note that if you are not using the main catalog, the path will be specific to your catalog.

The complete path should look like:

/Volumes/main/default/bp-lakehouse-monitor/script/init_script.sh
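To make the structure of this path explicit, the sketch below assembles it from the catalog, schema, and volume name, using the defaults described above. The function name is hypothetical, and the assumption is that only the first three segments after /Volumes change when a different catalog is used.

```python
def init_script_volume_path(
    catalog: str = "main",
    schema: str = "default",
    volume: str = "bp-lakehouse-monitor",
) -> str:
    """Return the full path of init_script.sh inside the configured volume."""
    for part in (catalog, schema, volume):
        if not part or "/" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    return f"/Volumes/{catalog}/{schema}/{volume}/script/init_script.sh"
```

With a hypothetical catalog named analytics, init_script_volume_path(catalog="analytics") gives /Volumes/analytics/default/bp-lakehouse-monitor/script/init_script.sh.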

 

Grant cluster owners read access to the volume

  • Click on the bp-lakehouse-monitor volume in ‘Catalog Explorer’

  • Select Permissions

  • Click on Grant

  • Select READ VOLUME and add the ‘Account Users’ principal. A narrower group of users who own shared clusters can be used instead; the same group must also have access to the catalog and schema where the volume is created.

  • Click on Grant
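For admins who prefer SQL over the UI, the grants above can be sketched as Unity Catalog GRANT statements. The helper below only generates the statement text; it assumes the default catalog, schema, and volume names and the account users principal, and the function name is hypothetical. Identifiers are wrapped in backticks because the volume name contains hyphens.

```python
def grant_statements(
    principal: str = "account users",
    catalog: str = "main",
    schema: str = "default",
    volume: str = "bp-lakehouse-monitor",
) -> list:
    """Return the GRANT statements needed to read the init script volume."""

    def quote(name: str) -> str:
        # Backtick-quote an identifier, escaping any embedded backticks.
        return "`" + name.replace("`", "``") + "`"

    who = quote(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {quote(catalog)} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {quote(catalog)}.{quote(schema)} TO {who}",
        f"GRANT READ VOLUME ON VOLUME {quote(catalog)}.{quote(schema)}.{quote(volume)} TO {who}",
    ]


if __name__ == "__main__":
    for statement in grant_statements():
        print(statement + ";")
```

The printed statements can then be run by a user with sufficient privileges, for example in the Databricks SQL editor.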

 

The storage location backing the configured catalog and schema should be accessible from every workspace to be monitored.

 

Allow init script execution

Step 1) In Databricks

  • open Catalog

  • open Metastore details

  • open Allowed JARs/Init Scripts

  • add a new entry for

/Volumes/main/default/bp-lakehouse-monitor/script/init_script.sh

Unity Catalog Disabled

When Unity Catalog is disabled, LHO performs this configuration step automatically whenever the LHO Telemetry Agent is updated.

LHO copies the cluster-init-script into the storage area accessible by the cluster, i.e. the Databricks workspace files:

/Workspace/bp-lakehouse-monitor/script/init_script.sh

 

Note: permissions to this file must be granted manually to All Users in the workspace!

 

 

