Enable External Databricks Analyzer Job
To accelerate telemetry analysis and processing, LHO can be configured to use an external Databricks Job instead of in-app processing.
Databricks Analyzer Job (DAJ) Tasks
The Databricks Analyzer Job is used for the following tasks:
- processing and analyzing telemetry data
- processing cluster events (including tags and estimated costs)
- persisting data in the LHO database
Step 1 Download Files
- Download the bplm-analyser JAR file from the blob storage location for LHO libraries.
- Download the Scala files from the analysis-module in the LHO libraries; these contain the notebooks used to configure the Databricks jobs.
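The exact download locations are specific to your LHO installation and are not reproduced here. As an illustration only, the following Python sketch downloads the two artifacts and imports a Scala file as a workspace notebook through the Databricks Workspace Import API; the URLs, workspace paths, notebook name, and DATABRICKS_TOKEN auth variable are all placeholders and assumptions, not the actual LHO values.

import base64
import os

import requests

# Placeholders -- substitute the actual LHO libraries blob storage locations
# provided with your LHO installation.
JAR_URL = "https://<lho-libraries-blobstorage>/bplm-analyser.jar"
NOTEBOOK_URLS = {
    # Hypothetical notebook name; use the files shipped in analysis-module.
    "AnalyzerProcessor": "https://<lho-libraries-blobstorage>/analysis-module/AnalyzerProcessor.scala",
}

DATABRICKS_HOST = os.environ["DAJ_WORKSPACE_HOST"]   # e.g. dbc-....cloud.databricks.com
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]    # assumed auth method (personal access token)


def download(url: str, dest: str) -> None:
    # Fetch a file from blob storage onto the local disk.
    resp = requests.get(url, timeout=60)
    resp.raise_for_status()
    with open(dest, "wb") as f:
        f.write(resp.content)


def import_scala_notebook(local_path: str, workspace_path: str) -> None:
    # Import a downloaded Scala source file as a workspace notebook using the
    # Workspace Import API (POST /api/2.0/workspace/import).
    with open(local_path, "rb") as f:
        content = base64.b64encode(f.read()).decode("ascii")
    resp = requests.post(
        f"https://{DATABRICKS_HOST}/api/2.0/workspace/import",
        headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
        json={
            "path": workspace_path,
            "format": "SOURCE",
            "language": "SCALA",
            "content": content,
            "overwrite": True,
        },
        timeout=60,
    )
    resp.raise_for_status()


download(JAR_URL, "bplm-analyser.jar")
for name, url in NOTEBOOK_URLS.items():
    download(url, f"{name}.scala")
    import_scala_notebook(f"{name}.scala", f"/Shared/lho/{name}")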
Step 2 Create a Databricks job for each DAJ task
Each job contains a single task:
- attach the analyzer JAR file to the task
- set the DBX task parameters
Required: Job cluster must use Java 11
Recommended cluster specs:
- analyses: 4 or 8 moderate nodes, depending on load
- cluster events: single-node cluster of a moderate instance type
- persisting (storage): single-node cluster of a moderate instance type
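Jobs can be created in the Databricks UI or through the Jobs API. The sketch below is a minimal Python example of creating one of the jobs (the cluster-events job on a single-node cluster) via the Jobs API 2.1; the job name, main class, JAR path, instance type, runtime version, and DATABRICKS_TOKEN auth variable are assumptions, so take the real values from the downloaded analysis-module notebooks. Setting the JNAME cluster environment variable to zulu11-ca-amd64 is one Databricks-documented way to get Java 11 on runtimes where it is not the default.

import os

import requests

DATABRICKS_HOST = os.environ["DAJ_WORKSPACE_HOST"]
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]    # assumed auth method

# Single-node job cluster of a moderate instance type, as recommended for the
# cluster-events and persisting tasks. Instance type and runtime are examples only.
single_node_cluster = {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "m5.xlarge",                     # pick a moderate type for your cloud
    "num_workers": 0,
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
    "spark_env_vars": {"JNAME": "zulu11-ca-amd64"},  # request Java 11 on the job cluster
}

job_spec = {
    "name": "LHO Analyzer Cluster Events",
    "tasks": [
        {
            "task_key": "analyzer_cluster_events",
            "new_cluster": single_node_cluster,
            # Attach the analyzer JAR; the path and main class are placeholders.
            "libraries": [{"jar": "dbfs:/FileStore/lho/bplm-analyser.jar"}],
            "spark_jar_task": {
                "main_class_name": "com.example.lho.ClusterEventsAnalyzer",
                "parameters": [],                    # DBX task parameters go here
            },
        }
    ],
}

resp = requests.post(
    f"https://{DATABRICKS_HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
    json=job_spec,
    timeout=60,
)
resp.raise_for_status()
print("Created job with ID:", resp.json()["job_id"])

Create the analyses job the same way, but with 4 or 8 workers instead of a single-node cluster.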
Step 3 Link DAJ Persistence Execution
(1) Configure the Analyzer Processor job
- copy the ID of the persisting (storage) job
- go to the Analyzer Processor job
- set the storageJobId parameter to that ID
(2) Configure the Analyzer Cluster Events job
- copy the ID of the persisting (storage) job
- go to the Analyzer Cluster Events job
- set the storageJobId parameter to that ID
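Instead of copying the job ID from the Databricks UI, you can look it up by name through the Jobs API. The helper below is a hypothetical Python sketch; the job name and DATABRICKS_TOKEN auth variable are assumptions and must match whatever you created in Step 2. The printed ID is the value to enter as the storageJobId parameter on both analyzer jobs.

import os

import requests

DATABRICKS_HOST = os.environ["DAJ_WORKSPACE_HOST"]
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]   # assumed auth method


def find_job_id(job_name: str) -> int:
    # Page through /api/2.1/jobs/list and return the ID of the job whose
    # settings.name matches exactly.
    params = {}
    while True:
        resp = requests.get(
            f"https://{DATABRICKS_HOST}/api/2.1/jobs/list",
            headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
            params=params,
            timeout=60,
        )
        resp.raise_for_status()
        body = resp.json()
        for job in body.get("jobs", []):
            if job["settings"]["name"] == job_name:
                return job["job_id"]
        if not body.get("has_more"):
            raise RuntimeError(f"job not found: {job_name}")
        params["page_token"] = body["next_page_token"]


# Placeholder name -- use whatever you named the persisting (storage) job.
storage_job_id = find_job_id("LHO Analyzer Persisting")
print("Set storageJobId =", storage_job_id,
      "on the Analyzer Processor and Analyzer Cluster Events jobs")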
Step 4 Configure LHO Environment Variables
Set the following LHO environment variables:
METRIC_PROCESSOR_ENABLED=false
DAJ_WORKSPACE_HOST=<databricks_daj_workspace_host>
DAJ_JOB_ID=<daj-analyzer-processor-job-id>,<daj-analyzer-cluster-events-job-id>
DAJ_SCHEDULE_CRON_EXPRESSION=<quartz_cron_expression>
e.g.
METRIC_PROCESSOR_ENABLED=false
DAJ_WORKSPACE_HOST=dbc-f4095c7c-e857.cloud.databricks.com
DAJ_JOB_ID=967951742258704,931886104401413
DAJ_SCHEDULE_CRON_EXPRESSION="0 30 */3 * * ?"
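The example cron expression is in Quartz format and fires at minute 30 of every third hour. As an optional sanity check (a hypothetical helper, not part of LHO), the Python sketch below parses DAJ_JOB_ID exactly as declared above, analyzer-processor job ID first and analyzer-cluster-events job ID second, and confirms that each ID resolves in the DAJ workspace via the Jobs API; DATABRICKS_TOKEN is an assumed auth variable.

import os

import requests

host = os.environ["DAJ_WORKSPACE_HOST"]
token = os.environ["DATABRICKS_TOKEN"]              # assumed auth method

# DAJ_JOB_ID holds two comma-separated IDs: processor first, cluster events second.
processor_id, cluster_events_id = os.environ["DAJ_JOB_ID"].split(",")

for label, job_id in [("analyzer-processor", processor_id),
                      ("analyzer-cluster-events", cluster_events_id)]:
    resp = requests.get(
        f"https://{host}/api/2.1/jobs/get",
        headers={"Authorization": f"Bearer {token}"},
        params={"job_id": job_id},
        timeout=60,
    )
    resp.raise_for_status()
    print(label, "->", resp.json()["settings"]["name"])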
Step 5 Validate that DAJ is enabled
Navigate to the analysis management dashboard in LHO to validate that DAJ is enabled.