FAQs - Frequently Asked Questions
Does the Lakehouse Optimizer run in a Blueprint Environment?
No. The Lakehouse Optimizer is deployed and provisioned on a virtual machine within your organization’s environment.
Does the Lakehouse Optimizer send any data, including metadata, outside our on-premises or cloud environments?
Lakehouse Optimizer (LHO) is deployed within your environment, on a VM in your Azure or AWS subscription. All telemetry and metadata collected from Databricks workspaces are stored locally in your infrastructure. The only data LHO sends to Blueprint is cost and data usage information, for billing purposes.
Does your software require a database for metadata storage? Is there a repository? How much storage is recommended for ongoing processing?
Yes, LHO requires a SQL database for storing metadata and telemetry data.
Azure: Requires an Azure SQL database or MS SQL Server (recommended type: S3 with 400 DTUs or higher; serverless options with 2–8 cores are also supported).
Storage: A 50GB disk volume is recommended for the VM. An additional 50GB data disk is suggested for telemetry and analysis data; the exact requirement will be clearer once LHO is deployed.
How much compute power is needed for the analysis? Do you need dedicated CPU/RAM?
Yes, LHO requires dedicated compute resources. Minimum VM specs: 8 cores, 28–32GB RAM, and 50GB+ disk space.
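As an illustration only, the minimum specs above can be turned into a quick pre-deployment check. The sketch below is not part of LHO; the names VmSpec, LHO_MINIMUM and shortfalls are hypothetical, and it assumes the lower RAM bound (28GB) is the cutoff.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VmSpec:
    """Sizing of a candidate VM (hypothetical helper type, not an LHO API)."""
    cores: int
    ram_gb: int
    disk_gb: int


# Minimums quoted in the answer above; 28 is the low end of the 28-32GB range.
LHO_MINIMUM = VmSpec(cores=8, ram_gb=28, disk_gb=50)


def shortfalls(candidate: VmSpec, minimum: VmSpec = LHO_MINIMUM) -> list[str]:
    """Return one message per resource that falls below the minimum."""
    problems = []
    for field in fields(VmSpec):
        have = getattr(candidate, field.name)
        need = getattr(minimum, field.name)
        if have < need:
            problems.append(f"{field.name}: {have} < {need}")
    return problems


if __name__ == "__main__":
    # A VM with 4 cores and 16GB RAM fails on two counts; disk is sufficient.
    print(shortfalls(VmSpec(cores=4, ram_gb=16, disk_gb=100)))
```

An empty list means the candidate VM meets all three minimums.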
How often are these resources employed? In other words, how often is a collection of metadata done, how often does the analysis run, etc.
LHO allows custom scheduling for both metadata collection and analysis:
Telemetry Detection: Can be configured to run at regular intervals (e.g., hourly for long-running workflows).
Consumption Data Loading: Can be run on-demand or scheduled (e.g., daily at midnight). Admins can set batch sizes and intervals for loading historical or incremental data.
Analysis Jobs: Triggered automatically after detection steps or based on workflow completion. Partial analysis can occur hourly for ongoing jobs.
Read more: How often does LHO update data
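To make the example intervals above concrete ("hourly" and "daily at midnight"), the sketch below computes when each would next fire. It is an illustration only and does not reflect how LHO's scheduler is configured or implemented; the function names are hypothetical. In standard cron notation the same two schedules would be written "0 * * * *" and "0 0 * * *".

```python
from datetime import datetime, timedelta


def next_hourly(now: datetime) -> datetime:
    """Next top-of-hour strictly after `now` (e.g. an hourly detection run)."""
    return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


def next_midnight(now: datetime) -> datetime:
    """Next midnight strictly after `now` (e.g. a daily consumption load)."""
    start_of_day = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_of_day + timedelta(days=1)


if __name__ == "__main__":
    now = datetime(2024, 1, 15, 13, 40)
    print("next hourly run:", next_hourly(now))
    print("next daily run: ", next_midnight(now))
```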