2.18.2 Release Notes
✨ Release Highlights
support exposing costs for Delta Live Table Pipelines that run on Serverless
view configuration timeline on All Purpose Cluster in a selected time interval
configuration timeline for DLT pipelines
📂 Workloads // Workspaces
10793 - add URL param for selection on the Workspaces page
🖥️ Workloads // All Purpose Clusters
10718 - Add support for APC configuration timeline
view configuration changes made on an All Purpose Cluster in a selected time interval
🖥️ Workloads // Cluster Instances
10746 - view cluster instances per Workflow Single Run
added a new view per Job Run to quickly view the cluster instances used by that particular job run
10767 - Always display estimated costs for APCs on cluster instances page
💠 Workloads // Serverless
10738 - Serverless DLT costs
support exposing costs for Delta Live Table Pipelines that run on Serverless
🖥️ Workloads // Delta Live Tables
10707 - DLT Cluster changes timeline
✅ Incidents & Recommendations
10753 - Expose Incidents & Recommendations for APC instances
10636 - Add recommendation for allowed conflicting incident types
Some conflicting incidents should never occur together (e.g. DRIVER_CPU_OVER_PROVISIONING vs DRIVER_CPU_UNDER_PROVISIONING), because both are evaluated from the same metric in the underlying analysis, and that metric can only be either too high or too low, not both. A ConflictTicketsRecommendationException is thrown in this case.
However, some conflicts (e.g. CLUSTER_CPU_UNDER_PROVISIONING vs CLUSTER_CPU_OVER_PROVISIONING) should be allowed, because we currently use different approaches to compute cluster over- and underprovisioning incidents. For overprovisioning we use the median across the entire cluster. For underprovisioning we use the maximum median across the executors. Thus the overall median can remain low (leading to overprovisioning), while the median on a single executor that is constantly under stress can remain high (leading to underprovisioning).
Therefore, instead of throwing an exception for all conflicting recommendations, we'll:
provide the usual scale up/down and in/out recommendations for each incident, and add a new recommendation stating: "Incidents of both over and under provisioning types were found in this workload. This is usually an opportunity for re-orchestration. Analyze the Cluster instance's Autoscaling Timeline and Activity Histograms for the supporting data evidence."
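To illustrate why both cluster-level incidents can legitimately fire at once, here is a minimal Python sketch. It is not the product's implementation. The function names, the thresholds (30% and 80%), the sample data, and the choice to pool all executor samples for the cluster-wide median are assumptions made only for this example.

```python
from statistics import median

# Hypothetical thresholds, chosen only for illustration.
OVER_THRESHOLD = 30.0   # cluster-wide median CPU % below this -> over-provisioning
UNDER_THRESHOLD = 80.0  # any executor's median CPU % above this -> under-provisioning

REORCHESTRATION_NOTE = (
    "Incidents of both over and under provisioning types were found in this workload. "
    "This is usually an opportunity for re-orchestration."
)


def evaluate_cluster_cpu(cpu_by_executor):
    """Return the cluster CPU incidents for per-executor CPU samples (in %)."""
    # Over-provisioning: median across the entire cluster (all samples pooled).
    pooled = [s for samples in cpu_by_executor.values() for s in samples]
    cluster_median = median(pooled)
    # Under-provisioning: maximum of the per-executor medians.
    max_executor_median = max(median(s) for s in cpu_by_executor.values())

    incidents = []
    if cluster_median < OVER_THRESHOLD:
        incidents.append("CLUSTER_CPU_OVER_PROVISIONING")
    if max_executor_median > UNDER_THRESHOLD:
        incidents.append("CLUSTER_CPU_UNDER_PROVISIONING")
    return incidents


def recommendations(incidents):
    """One placeholder scaling recommendation per incident, plus the note on conflict."""
    recs = [f"scaling recommendation for {name}" for name in incidents]
    if len(incidents) == 2:
        recs.append(REORCHESTRATION_NOTE)
    return recs


# One executor is constantly under stress while the other three are mostly idle.
samples = {
    "executor-1": [95, 92, 97, 94],
    "executor-2": [10, 12, 8, 11],
    "executor-3": [9, 14, 10, 12],
    "executor-4": [11, 9, 13, 10],
}

incidents = evaluate_cluster_cpu(samples)
for rec in recommendations(incidents):
    print(rec)
```

With this sample data the pooled median is 11.5% and the busiest executor's median is 94.5%, so both incidents are reported and the additional re-orchestration note is appended instead of an exception being raised.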
🔔 Incidents Notifications
10772 - set default time to 60 min instead of 20 min for cluster instance total idle time policy
10771 - set default value for APC autoshutdown policy to 30 min instead of 15 min
Performance Optimizations
10770 - Improve query performance by evaluating job under-provisioning incidents in app
📊 Analysis Optimizations
10776 - Improve idleness detection for APC with cluster names that match job cluster patterns
🪩 UX Optimizations
10583 - migrate navigation panel to new design with new icons set
10906 - add "Job Run ID" label when viewing one single Workflow Run
10905 - add "Job Name" label for the Workflow runs view
10907 - remove BACK button from Workflows page
🐞 Fixed Bugs
10749 - Wrong workspace used when accessing a shared link from the app
8425 - duration not reported properly in consumption / run history