2.18.2 Release Notes

 

✨ Release Highlights

  • support exposing costs for Delta Live Tables pipelines that run on Serverless

  • view configuration timeline on All Purpose Cluster in a selected time interval

  • configuration timeline for DLT pipelines

 

📂 Workloads // Workspaces

  • 10793 - add URL param for selection on the Workspaces page

[Image: sharing a Workspaces URL that points to a selected workspace]

 

🖥️ Workloads // All Purpose Clusters

  • 10718 - Add support for APC configuration timeline

    • view configuration changes made on an All Purpose Cluster in a selected time interval

[Image: configuration timeline per cluster definition]

 

🖥️ Workloads // Cluster Instances

  • 10746 - view cluster instances per Workflow Single Run

    • added a new view per Job Run to quickly view the cluster instances used by that particular job run

  • 10767 - Always display estimated costs for APCs on cluster instances page

 

💠 Workloads // Serverless

  • 10738 - Serverless DLT costs

    • support exposing costs for Delta Live Tables pipelines that run on Serverless

 

🖥️ Workloads // Delta Live Tables

  • 10707 - DLT Cluster changes timeline

 

✅ Incidents & Recommendations

  • 10753 - Expose Incidents & Recommendations for APC instances

  • 10636 - Add recommendation for allowed conflicting incident types
    Some conflicting incidents should never occur together (e.g. DRIVER_CPU_OVER_PROVISIONING vs DRIVER_CPU_UNDER_PROVISIONING), because both are evaluated from the same metric in the underlying analysis, and that metric can be either too high or too low, but not both. A ConflictTicketsRecommendationException is thrown in this case.

However, some conflicts (e.g. CLUSTER_CPU_UNDER_PROVISIONING vs CLUSTER_CPU_OVER_PROVISIONING) should be allowed, because we currently use different approaches to compute the cluster over-provisioning and under-provisioning incidents. For over-provisioning we use the median across the entire cluster. For under-provisioning we use the maximum median across the executors. Thus the overall median can remain low (leading to over-provisioning), while the median on a single executor that is constantly under stress can remain high (leading to under-provisioning).
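To illustrate how both signals can fire on the same cluster, here is a minimal Python sketch. It is not the product's implementation: the sample data, the thresholds and the function names are hypothetical, and "median across the entire cluster" is assumed here to mean the median over all samples from all executors.

```python
from statistics import median

def cluster_median(cpu_by_executor):
    """Median over every CPU sample from every executor (over-provisioning signal)."""
    all_samples = [s for samples in cpu_by_executor.values() for s in samples]
    return median(all_samples)

def max_executor_median(cpu_by_executor):
    """Highest per-executor median (under-provisioning signal)."""
    return max(median(samples) for samples in cpu_by_executor.values())

# Hypothetical CPU utilisation samples in percent; one executor is constantly busy.
cpu_by_executor = {
    "exec-1": [5, 8, 6, 7],
    "exec-2": [4, 6, 5, 9],
    "exec-3": [92, 95, 97, 94],
}

# Hypothetical thresholds, chosen only for illustration.
OVER_THRESHOLD = 20
UNDER_THRESHOLD = 85

over = cluster_median(cpu_by_executor) < OVER_THRESHOLD
under = max_executor_median(cpu_by_executor) > UNDER_THRESHOLD
print(over, under)
```

With this data the cluster-wide median stays low while the busiest executor's median stays high, so both conditions hold at once.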

Therefore, instead of throwing an exception for all conflicting recommendations, we'll provide the usual scale up/down and in/out recommendations for each incident, and add a new recommendation stating: "Incidents of both over and under provisioning types were found in this workload. This is usually an opportunity for re-orchestration. Analyze the Cluster instance's Autoscaling Timeline and Activity Histograms for the supporting data evidence."
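The handling described above could be sketched roughly as follows. This is an assumption-laden illustration, not the actual code: the exception class is a stand-in for the one named in this note, and `ALLOWED_CONFLICTS`, `FORBIDDEN_CONFLICTS`, `recommend` and the per-incident recommendation texts are hypothetical.

```python
class ConflictTicketsRecommendationException(Exception):
    """Stand-in for the exception named in the release note."""

# Hypothetical registries of incident pairs, taken from the examples in this note.
FORBIDDEN_CONFLICTS = {
    frozenset({"DRIVER_CPU_OVER_PROVISIONING", "DRIVER_CPU_UNDER_PROVISIONING"}),
}
ALLOWED_CONFLICTS = {
    frozenset({"CLUSTER_CPU_OVER_PROVISIONING", "CLUSTER_CPU_UNDER_PROVISIONING"}),
}

REORCHESTRATION_NOTE = (
    "Incidents of both over and under provisioning types were found in this "
    "workload. This is usually an opportunity for re-orchestration. Analyze the "
    "Cluster instance's Autoscaling Timeline and Activity Histograms for the "
    "supporting data evidence."
)

def recommend(incidents, per_incident):
    """Return recommendations for the incidents, handling conflicting pairs."""
    found = set(incidents)
    # Pairs derived from the same metric can never legitimately co-occur.
    for pair in FORBIDDEN_CONFLICTS:
        if pair <= found:
            raise ConflictTicketsRecommendationException(sorted(pair))
    # Usual per-incident recommendations (scale up/down, in/out).
    recs = [per_incident[name] for name in incidents]
    # Allowed conflicts get one extra recommendation instead of an exception.
    if any(pair <= found for pair in ALLOWED_CONFLICTS):
        recs.append(REORCHESTRATION_NOTE)
    return recs

recs = recommend(
    ["CLUSTER_CPU_OVER_PROVISIONING", "CLUSTER_CPU_UNDER_PROVISIONING"],
    {
        "CLUSTER_CPU_OVER_PROVISIONING": "scale in",   # hypothetical text
        "CLUSTER_CPU_UNDER_PROVISIONING": "scale out",  # hypothetical text
    },
)
print(len(recs))
```

Keeping the forbidden and allowed pairs in separate sets makes it explicit which conflicts indicate a bug in the analysis and which are an expected outcome of the two different computations.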

 

🔔 Incidents Notifications

  • 10772 - set default time to 60 min instead of 20 min for cluster instance total idle time policy

  • 10771 - set default value for APC autoshutdown policy to 30 min instead of 15 min

 

Performance Optimizations

  • 10770 - Improve query performance by evaluating job under-provisioning incidents in app

 

📊 Analysis Optimizations

  • 10776 - Improve idleness detection for APC with cluster names that match job cluster patterns

 

🪩 UX Optimizations

  • 10583 - migrate navigation panel to new design with new icons set

  • 10906 - add "Job Run ID" label when viewing one single Workflow Run

  • 10905 - add "Job Name" label for the workflow runs view

  • 10907 - remove BACK button from Workflows page

 

🐞 Fixed Bugs

  • 10749 - Wrong workspace used when accessing a shared link from the app

  • 8425 - duration not reported properly in consumption / run history