AWS Deployment Diagram

Below is the deployment diagram for the Lakehouse Optimizer (LHO) security flow on AWS.

 

[Figure: AWS deployment diagram]

AWS components of an LHO deployment

  1. AWS components created at deployment time by the deployment script:

    1. AWS EC2 VM - hosting the LHO application, packaged as a Docker image and run as a container with docker-compose

    2. AWS RDS for SQL Server - storing LHO analysis data, costs, recommendations, etc.

    3. AWS DynamoDB - storing telemetry data from each monitored Databricks workspace

    4. AWS SQS - queuing Databricks (Dbx) event notifications used for real-time analysis of Dbx workloads

    5. AWS VPC - network configuration for the LHO EC2 VM and the RDS for SQL Server

    6. AWS Secrets Manager - storing all sensitive LHO keys and secrets (SQL Server password, IAM user access key/secret key, Dbx Service Principal OAuth secret, encryption private keys, license token, etc.)

    7. IAM Role - attached through an instance profile, granting the EC2 VM access to the services it uses (Secrets Manager, DynamoDB, SQS, AWS Cost Explorer, Dbx Billable Usage Logs)

    8. AWS S3 - storing Terraform backups of the deployment

  2. Optional components controlled by a script parameter (providedDnsUrl):

    1. Elastic IP - used for accessing the VM from the Internet

    2. Route 53 registered domain pointing to the above Elastic IP

  3. AWS components created by the customer:

    1. IAM User Access Key/Secret Key with permissions and trust policies for cross-account AWS access

    2. AWS Cost Explorer tag activation

    3. Databricks Service Principal with OAuth secret

    4. App Registration for Single Sign-On (Okta, Microsoft Entra ID (Azure AD), or Google Cloud Console)

    5. Route 53 hosted zone (not required if providedDnsUrl is set)
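
To illustrate the IAM Role item above (the instance profile for the EC2 VM), here is a minimal Python sketch of what such a role's policy documents might look like. It is not the policy created by the deployment script. The IAM action names are real, but which actions LHO actually needs, the Sid names, and the wildcard resource are all assumptions made for illustration. Access to the Dbx Billable Usage Logs is left out because the source does not say how those logs are reached.

```python
import json

# Hypothetical mapping: service named in the component list -> example IAM actions.
# The action names exist in IAM, but the set LHO really requires is an assumption.
SERVICE_ACTIONS = {
    "SecretsManager": ["secretsmanager:GetSecretValue"],
    "DynamoDB": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
    "SQS": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
    "CostExplorer": ["ce:GetCostAndUsage"],
}

# Trust policy that lets the EC2 service assume the role; this is the standard
# trust relationship for a role used through an instance profile.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def build_policy(service_actions, resource="*"):
    """Build a permissions policy with one Allow statement per service.

    The default resource "*" is a placeholder (assumption); a real policy
    should list the specific ARNs of the secrets, tables, and queues.
    """
    statements = [
        {
            "Sid": f"Lho{service}",  # hypothetical statement IDs
            "Effect": "Allow",
            "Action": actions,
            "Resource": resource,
        }
        for service, actions in service_actions.items()
    ]
    return {"Version": "2012-10-17", "Statement": statements}


policy = build_policy(SERVICE_ACTIONS)
print(json.dumps(policy, indent=2))
```

Keeping one statement per service makes it easy to compare the policy against the list of services in the component description and to narrow each statement's resources independently.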