Lakehouse Optimizer Incidents and Notifications Configuration

The Lakehouse Optimizer (LHO) lets users monitor and improve their Lakehouse infrastructure by configuring incidents that flag inefficiencies in cost, performance, and operational metrics. Incidents are displayed in the Incidents section of the app, where they provide actionable insights for optimizing resource utilization and spend.

Incident Configuration Steps

  1. Select Settings / Incident Policies.

    1. Incidents can be defined within the Subscriptions, Workspaces, Workflows, All Purpose Compute, Delta Live Tables, SQL Warehouses, Pools, and Job Compute areas.

    2. Each incident created has its own incident policy.

  2. Select the area of interest in which to create the incident (ex. Workflows).

  3. Select a category: Cost Control or Performance.

  4. Select a sub-category from the dropdown menu (ex. Over Provisioning).

  5. Select an option within the sub-category (ex. Cluster CPU Over Provisioning).

 

  6. Select the +ADD button under the Incident Rules section.

  7. Specify a threshold for the incident rule (ex. Cluster CPU under 60% for any run) and save your changes.

    a. Multiple rules can be set for each incident policy (ex. separate rules that create an incident whenever Cluster CPU for any run is under 60%, 65%, or 70%).

    b. When an incident rule is met, an Incident Ticket is created automatically and the corresponding incident appears in the Incidents view.

  8. Every incident rule in an Incident Policy has email notifications turned on by default. However, no email group is tied to a new incident rule, so no emails are sent until at least one email group is selected for that rule.

  9. To assign one or more email groups to an incident rule, open the Email Group dropdown and select the groups (ex. Leadership Team, lho-support).

    a. The selected groups (here, Leadership Team and lho-support) are now displayed under Email Groups. Save your changes to the incident rule.

  10. To configure additional incident policies, navigate to Incident Policies under Settings, or use the dropdown at the top of the screen to choose a different area of interest.
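
The rule behavior described in the steps above can be sketched in code. This is only an illustration, not LHO's implementation or API: the names `IncidentRule`, `IncidentPolicy`, and `evaluate_run` are hypothetical, the pairing of Over Provisioning with the Cost Control category is assumed, and the sketch assumes a rule is met when the observed value is strictly below its threshold.

```python
from dataclasses import dataclass, field


@dataclass
class IncidentRule:
    """One threshold under an incident policy (hypothetical model)."""
    threshold_pct: float
    # Notifications are on by default, but no email group is attached by default.
    notifications_enabled: bool = True
    email_groups: list[str] = field(default_factory=list)


@dataclass
class IncidentPolicy:
    """An incident policy for one area of interest (hypothetical model)."""
    area: str
    category: str
    sub_category: str
    option: str
    rules: list[IncidentRule] = field(default_factory=list)


def evaluate_run(policy: IncidentPolicy, cpu_pct: float) -> list[dict]:
    """Return one incident ticket per rule whose threshold the run falls under."""
    tickets = []
    for rule in policy.rules:
        if cpu_pct < rule.threshold_pct:  # assumed comparison: strictly below
            # Email is sent only if notifications are on AND a group is selected.
            notify = rule.email_groups if rule.notifications_enabled else []
            tickets.append({
                "policy": policy.option,
                "threshold_pct": rule.threshold_pct,
                "observed_pct": cpu_pct,
                "notify_groups": list(notify),
            })
    return tickets


policy = IncidentPolicy(
    area="Workflows",
    category="Cost Control",  # assumed; the source does not say which category lists this
    sub_category="Over Provisioning",
    option="Cluster CPU Over Provisioning",
    rules=[
        IncidentRule(60),
        IncidentRule(65),
        IncidentRule(70, email_groups=["Leadership Team", "lho-support"]),
    ],
)

tickets = evaluate_run(policy, cpu_pct=62.0)
```

With an observed Cluster CPU of 62%, the 65% and 70% rules are met and the 60% rule is not, so two tickets are produced. Only the 70% rule has email groups selected, so only its ticket would result in an email.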

     

The following incidents are configurable in LHO:

Entity | Incident
Subscriptions | Monthly Cost above Threshold
Workspaces | Monthly Cost above Threshold
Workflows | Monthly Cost above Threshold
Workflows | Over-Provisioning: Cluster CPU, Cluster Memory, Driver CPU, Driver Memory
Workflows | Under-Provisioning: Cluster CPU, Cluster Memory, Driver CPU, Driver Memory
Workflows | Imbalanced-Provisioning: CPU, Memory
Workflows | Bad Skew
Workflows | Disk Spillage
Workflows | Run Failure
Workflows | Job with All Purpose Clusters
Delta Live Tables | Over Provisioning: Cluster CPU, Cluster Memory
Delta Live Tables | Under Provisioning: Cluster CPU, Cluster Memory
Delta Live Tables | Update Failure
Delta Live Tables | Monthly Cost above Threshold
All Purpose Clusters | Monthly Cost above Threshold
All Purpose Clusters | Auto Shutdown Timeout: Shutdown Timeout above Threshold, Shutdown Timeout Missing
All Purpose Clusters | Total Idle Time above Threshold
All Purpose Clusters | Over-Provisioning: Cluster CPU, Cluster Memory, Driver CPU, Driver Memory
All Purpose Clusters | Under-Provisioning: Cluster CPU, Cluster Memory, Driver CPU, Driver Memory
Pools | Auto Shutdown Timeout: Shutdown Timeout above Threshold
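
To make two of the threshold-style incidents in this list concrete, here is a minimal sketch of how Monthly Cost above Threshold and the Auto Shutdown Timeout checks could be evaluated. It is not how LHO computes them: the function names, the input shapes, and the choice to treat a timeout of 0 as "missing" are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


def months_over_threshold(daily_costs: list[tuple[date, float]],
                          threshold: float) -> list[str]:
    """Return the months (YYYY-MM) whose summed cost exceeds the threshold."""
    totals: dict[str, float] = defaultdict(float)
    for day, cost in daily_costs:
        totals[day.strftime("%Y-%m")] += cost
    return sorted(month for month, total in totals.items() if total > threshold)


def shutdown_timeout_incidents(timeout_minutes: Optional[int],
                               max_minutes: int) -> list[str]:
    """Classify an auto-shutdown setting against a rule threshold.

    timeout_minutes: configured timeout, or None/0 when unset
                     (treating 0 as unset is an assumption of this sketch).
    max_minutes:     the threshold chosen in the incident rule.
    """
    if not timeout_minutes:
        return ["Shutdown Timeout Missing"]
    if timeout_minutes > max_minutes:
        return ["Shutdown Timeout above Threshold"]
    return []


# Hypothetical daily cost records for one entity.
costs = [
    (date(2025, 1, 10), 400.0),
    (date(2025, 1, 20), 700.0),
    (date(2025, 2, 5), 300.0),
]
over = months_over_threshold(costs, threshold=1000.0)
```

Here January totals 1100 and exceeds the 1000 threshold, while February does not. For the shutdown check, an unset timeout maps to Shutdown Timeout Missing and a timeout longer than the rule's limit maps to Shutdown Timeout above Threshold.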

 

Email Group Configuration Steps

  1. To create a new email group, select Settings / Email Notifications.

  2. Select +Add Group under Custom Groups.

  3. Specify a group name (ex. Blueprint Test) and select Save.

  4. Select +Add Email for the email group.

  5. Specify an email account and select Add.

    1. Repeat this step to add more email accounts to the group.

  6. See steps #8 and #9 under Incident Configuration Steps above for how to assign an email group to an incident rule.
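
The relationship between email groups and the recipients of an incident email can be sketched as follows. This is a hypothetical in-memory model, not an LHO API: the class `EmailGroups`, its method names, and the example addresses are invented for illustration, and removing duplicate addresses across groups is an assumption.

```python
class EmailGroups:
    """In-memory stand-in for the Custom Groups list (hypothetical)."""

    def __init__(self) -> None:
        self._groups: dict[str, list[str]] = {}

    def add_group(self, name: str) -> None:
        # Adding a group that already exists leaves its members untouched.
        self._groups.setdefault(name, [])

    def add_email(self, group: str, email: str) -> None:
        if group not in self._groups:
            raise KeyError(f"unknown email group: {group}")
        if email not in self._groups[group]:
            self._groups[group].append(email)

    def recipients(self, group_names: list[str]) -> list[str]:
        """Merge the members of the selected groups, dropping duplicates."""
        seen: dict[str, None] = {}  # dict keeps insertion order
        for name in group_names:
            for email in self._groups.get(name, []):
                seen.setdefault(email, None)
        return list(seen)


groups = EmailGroups()
groups.add_group("Blueprint Test")
groups.add_email("Blueprint Test", "alice@example.com")
groups.add_email("Blueprint Test", "bob@example.com")
groups.add_group("lho-support")
groups.add_email("lho-support", "bob@example.com")

to_notify = groups.recipients(["Blueprint Test", "lho-support"])
no_one = groups.recipients([])  # a rule with no group selected notifies nobody
```

A rule with both groups selected resolves to each address once, and a rule with no group selected resolves to an empty recipient list, which matches the behavior that no email is sent until a group is assigned.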

 

 

 

 
