2.18.2 Release Notes

✨ Release Highlights

support exposing costs for Delta Live Tables pipelines that run on Serverless
view the configuration timeline of an All Purpose Cluster for a selected time interval
configuration timeline for DLT pipelines

📂 Workloads // Workspaces

10793 - add URL param for selection on the Workspaces page

- support sharing a Workspaces URL that opens on a selected workspace

🖥️ Workloads // All Purpose Clusters

10718 - Add support for APC configuration timeline
- view configuration changes made on an All Purpose Cluster in a selected time interval

- configuration timeline per cluster definition

🖥️ Workloads // Cluster Instances

10746 - view cluster instances per Workflow Single Run
- added a new view per Job Run to quickly view the cluster instances used by that particular job run

10767 - Always display estimated costs for APCs on cluster instances page

💠 Workloads // Serverless

10738 - Serverless DLT costs
- support exposing costs for Delta Live Tables pipelines that run on Serverless

🖥️ Workloads // Delta Live Tables

10707 - DLT Cluster changes timeline

- configuration timeline for DLT pipelines

✅ Incidents & Recommendations

10753 - Expose Incidents & Recommendations for APC instances
10636 - Add recommendation for allowed conflicting incident types
Some conflicting incidents should never occur (e.g. DRIVER_CPU_OVER_PROVISIONING vs DRIVER_CPU_UNDER_PROVISIONING), because the same metric from the underlying analysis is used to evaluate if these incidents should occur, and it can only be either too high or too low, not both. A ConflictTicketsRecommendationException is thrown in this case.

However, some conflicts (e.g. CLUSTER_CPU_UNDER_PROVISIONING vs CLUSTER_CPU_OVER_PROVISIONING) should be allowed, because we currently use different approaches to compute the cluster overprovisioning and underprovisioning incidents. For overprovisioning we use the median across the entire cluster. For underprovisioning we use the maximum median across the executors. Thus the overall median can remain low (leading to overprovisioning), while the median on a single executor that is constantly under stress can remain high (leading to underprovisioning).
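As a rough illustration of how both conditions can hold at once, the sketch below computes the two statistics described above on made-up CPU samples. The sample values, the threshold constants, and the function names are assumptions chosen for the example, not the values or code used by the product.

```python
from statistics import median

# Hypothetical CPU utilisation samples (percent) per executor.
executor_cpu = {
    "executor-1": [92, 95, 90, 97],  # constantly under stress
    "executor-2": [10, 12, 8, 11],   # mostly idle
    "executor-3": [9, 14, 10, 12],   # mostly idle
}

# Assumed thresholds, for illustration only.
OVER_PROVISIONING_MAX = 30   # cluster-wide median below this: overprovisioned
UNDER_PROVISIONING_MIN = 85  # highest executor median above this: underprovisioned


def cluster_median(samples_by_executor):
    """Median across all samples of the entire cluster."""
    all_samples = [s for samples in samples_by_executor.values() for s in samples]
    return median(all_samples)


def max_executor_median(samples_by_executor):
    """Maximum of the per-executor medians."""
    return max(median(samples) for samples in samples_by_executor.values())


overall = cluster_median(executor_cpu)
peak = max_executor_median(executor_cpu)

over_provisioned = overall < OVER_PROVISIONING_MAX
under_provisioned = peak > UNDER_PROVISIONING_MIN

print(overall, peak, over_provisioned, under_provisioned)
```

With these samples the cluster-wide median is low while one executor's median is high, so both incident types would be raised for the same cluster.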

Therefore, instead of throwing an exception for all conflicting recommendations, we now:
- provide the usual scale up/down and in/out recommendations for each incident
- add a new recommendation stating: "Incidents of both over and under provisioning types were found in this workload. This is usually an opportunity for re-orchestration. Analyze the Cluster instance's Autoscaling Timeline and Activity Histograms for the supporting data evidence."
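The behaviour described in this item can be sketched roughly as follows. Only the incident type names, the exception name, and the recommendation text come from these notes; the pair sets, the `recommend` and `basic_recommendation` helpers, and the wording of the per-incident recommendations are hypothetical and stand in for the real implementation.

```python
class ConflictTicketsRecommendationException(Exception):
    """Stand-in for the exception named in the notes."""


# Pairs derived from the same metric: they must never occur together.
FORBIDDEN_CONFLICTS = [
    frozenset({"DRIVER_CPU_OVER_PROVISIONING", "DRIVER_CPU_UNDER_PROVISIONING"}),
]

# Pairs computed with different approaches: they may legitimately co-occur.
ALLOWED_CONFLICTS = [
    frozenset({"CLUSTER_CPU_OVER_PROVISIONING", "CLUSTER_CPU_UNDER_PROVISIONING"}),
]

REORCHESTRATION_NOTE = (
    "Incidents of both over and under provisioning types were found in this "
    "workload. This is usually an opportunity for re-orchestration. Analyze the "
    "Cluster instance's Autoscaling Timeline and Activity Histograms for the "
    "supporting data evidence."
)


def basic_recommendation(incident_type):
    """Hypothetical per-incident recommendation text."""
    action = "scale down" if "OVER_PROVISIONING" in incident_type else "scale up"
    return f"{incident_type}: {action}"


def recommend(incident_types):
    """Return recommendations, raising only for forbidden conflicts."""
    types = set(incident_types)
    for pair in FORBIDDEN_CONFLICTS:
        if pair <= types:
            raise ConflictTicketsRecommendationException(sorted(pair))
    recommendations = [basic_recommendation(t) for t in sorted(types)]
    # Allowed conflicts keep their usual recommendations and gain one extra note.
    if any(pair <= types for pair in ALLOWED_CONFLICTS):
        recommendations.append(REORCHESTRATION_NOTE)
    return recommendations


for line in recommend(
    ["CLUSTER_CPU_UNDER_PROVISIONING", "CLUSTER_CPU_OVER_PROVISIONING"]
):
    print(line)
```

Keeping the forbidden and allowed pairs in separate collections makes it explicit which conflicts indicate an analysis error and which are expected.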

🔔 Incidents Notifications

10772 - set default time to 60 min instead of 20 min for cluster instance total idle time policy
10771 - set default value for APC autoshutdown policy to 30 min instead of 15 min

Performance Optimizations

10770 - Improve query performance by evaluating job under-provisioning incidents in the app

📊 Analysis Optimizations

10776 - Improve idleness detection for APC with cluster names that match job cluster patterns

🪩 UX Optimizations

10583 - migrate navigation panel to new design with new icon set
10906 - add "Job Run ID" label when viewing a single Workflow Run
10905 - add "job name" label for the Workflow Runs view
10907 - remove BACK button from Workflows page

🐞 Fixed Bugs

10749 - Wrong workspace used when accessing a shared link from the app
8425 - duration not reported properly in consumption / run history