Enable External Databricks Analyzer Job

To accelerate telemetry analysis and processing, LHO can be configured to use an external Databricks job instead of in-app processing.

 

Databricks Analyzer Job (DAJ) Tasks

The Databricks Analyzer Job is used for the following tasks:

  1. Processing and analyzing telemetry data

  2. Processing cluster-events (including tags and estimated costs)

  3. Persisting data in the LHO database

 

Step 1: Download Files

  1. Download the bplm-analyser JAR file from the blob-storage location for LHO libraries.

  2. From the analysis-module in the LHO libraries, download the Scala files containing the notebooks that will be used to configure the Databricks jobs.
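These downloads can also be scripted. A minimal sketch, assuming the LHO libraries are reachable over HTTPS; the base URL and file names below are placeholders, not the real locations:

```python
import urllib.request
from urllib.parse import urljoin

# Placeholder for the LHO libraries blob-storage location; substitute the
# real base URL provided with your LHO installation.
LHO_LIBRARIES_BASE = "https://example.blob.core.windows.net/lho-libraries/"

def artifact_url(relative_path: str) -> str:
    """Resolve an artifact path against the LHO libraries location."""
    return urljoin(LHO_LIBRARIES_BASE, relative_path)

def download(relative_path: str, dest: str) -> None:
    """Fetch one artifact from the LHO libraries location into a local file."""
    urllib.request.urlretrieve(artifact_url(relative_path), dest)

# Step 1.1 (analyzer JAR) and Step 1.2 (a notebook source file);
# both file names are hypothetical:
# download("bplm-analyser.jar", "bplm-analyser.jar")
# download("analysis-module/analyzer_notebook.scala", "analyzer_notebook.scala")
```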

Step 2: Create a Databricks job for each DAJ task

  1. Create one job per DAJ task; each job contains a single task.

  2. Attach the analyzer JAR file to the task.

  3. Set the Databricks (DBX) task parameters.

Required: the job cluster must use Java 11.

 

Recommended cluster specs:

  - Analyses: depends on load (4-8 moderate nodes)

  - Cluster events: single-node cluster of a moderate instance type

  - Persisting (storage): single-node cluster of a moderate instance type
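For reference, the job and cluster settings above can also be captured when creating the jobs through the Databricks Jobs API 2.1 instead of the UI. A sketch of a create-job payload for the analysis task — the job name, main class, JAR path, and instance type are placeholders; setting the cluster environment variable `JNAME=zulu11-ca-amd64` is Databricks' documented way to run a job cluster on Java 11:

```json
{
  "name": "DAJ - Analyzer Processor",
  "tasks": [
    {
      "task_key": "analyzer_processor",
      "spark_jar_task": {
        "main_class_name": "<bplm-analyser-main-class>",
        "parameters": []
      },
      "libraries": [
        { "jar": "dbfs:/FileStore/jars/bplm-analyser.jar" }
      ],
      "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "<moderate-instance-type>",
        "num_workers": 4,
        "spark_env_vars": { "JNAME": "zulu11-ca-amd64" }
      }
    }
  ]
}
```

For the cluster-events and persisting jobs, use a single-node job cluster instead (`num_workers` set to 0 with Databricks' single-node Spark configuration).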

 

Step 3: Link DAJ Persistence Execution

 

(1) Configure the Analyzer Processor job

  1. Copy the ID of the persisting (storage) job.

  2. Go to the Analyzer Processor job.

  3. Set the storageJobId parameter to the copied ID.

 

(2) Configure the Analyzer Cluster Events job

  1. Copy the ID of the persisting (storage) job.

  2. Go to the Analyzer Cluster Events job.

  3. Set the storageJobId parameter to the copied ID.
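The linking in both sub-steps amounts to passing the persisting job's ID as a task parameter. An illustrative fragment of the task definition — the exact parameter syntax depends on how the analyzer JAR reads its arguments, so treat the key=value form as an assumption:

```json
{
  "spark_jar_task": {
    "main_class_name": "<bplm-analyser-main-class>",
    "parameters": ["storageJobId=<persisting-job-id>"]
  }
}
```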

Step 4: Configure LHO Environment Variables

Set the following LHO environment variables:

DAJ_WORKSPACE_HOST=<databricks_daj_workspace_host>
DAJ_JOB_ID=<daj-analyzer-processor-job-id>,<daj-analyzer-cluster-events-job-id>

For example:

METRIC_PROCESSOR_ENABLED=false
DAJ_WORKSPACE_HOST=dbc-f4095c7c-e857.cloud.databricks.com
DAJ_JOB_ID=967951742258704,931886104401413
DAJ_SCHEDULE_CRON_EXPRESSION="0 30 */3 * * ?"

Setting METRIC_PROCESSOR_ENABLED=false disables in-app processing in favor of DAJ. DAJ_SCHEDULE_CRON_EXPRESSION is a Quartz-style cron expression; the value above triggers the DAJ run at minute 30 of every third hour.
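As the placeholder in Step 4 shows, DAJ_JOB_ID carries both job IDs comma-separated, with the Analyzer Processor ID first. A small illustration (not LHO's actual code) of how such a value splits into the two IDs:

```python
# Illustration only (not LHO's actual code): how a comma-separated
# DAJ_JOB_ID value maps to the two configured jobs.
daj_job_id = "967951742258704,931886104401413"

# Order matters: the Analyzer Processor job ID comes first,
# the Analyzer Cluster Events job ID second.
processor_job_id, cluster_events_job_id = daj_job_id.split(",")

print(processor_job_id)       # -> 967951742258704
print(cluster_events_job_id)  # -> 931886104401413
```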

 

Step 5: Validate that DAJ is enabled

Navigate to the analysis management dashboard in LHO to validate that DAJ is enabled.