Deployment and Quick Setup Guide: AWS
A guide to installing LHO on AWS via a deployment script that creates all the required AWS resources.
A quick-setup guide to enabling cost and telemetry monitoring on a Databricks account and a Databricks workspace.
- Installation Prerequisites
- I. Installation Guide
- II. First-Time Login Guide
- III. Configure Databricks Workspace
- IV. Load Consumption Data
- V. Explore Cost and Telemetry Insights
- VI. Automatically grant access consent for all Active Directory Users (optional)
- VII. Assign User Roles in Lakehouse Optimizer (optional)
- Related
Installation Prerequisites
Helpful Competencies During Deployment
To follow the steps below and understand the installation procedure, knowledge of AWS services such as EC2, VPC, RDS, Secrets Manager, and IAM roles/policies is helpful. Familiarity with Linux administration, bash scripting, and (optionally) Azure Entra ID app registrations is also suggested, but not strictly required.
AWS Account Prerequisites
The AWS account (the “deployer” AWS IAM user) used to run the LHO installation script must already have the following rights granted for the installation process to complete successfully. The required AWS IAM policy is shown below in JSON format.
We strongly recommend not using the root user for this deployment; instead, either create a separate user with the permissions below or assign them to an existing user via a role.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:CreateServiceLinkedRole",
        "iam:DeleteServiceLinkedRole",
        "iam:CreateInstanceProfile",
        "iam:AddRoleToInstanceProfile",
        "iam:DeleteInstanceProfile",
        "iam:GetUser",
        "iam:AttachRolePolicy",
        "iam:GetInstanceProfile",
        "iam:PassRole",
        "iam:CreatePolicy",
        "iam:ListEntitiesForPolicy",
        "iam:AttachUserPolicy",
        "iam:CreatePolicyVersion",
        "iam:ListAttachedUserPolicies",
        "iam:ListPolicies",
        "iam:DetachUserPolicy",
        "iam:ListUsers",
        "iam:ListGroups",
        "iam:CreateRole",
        "iam:GetPolicy",
        "iam:GetPolicyVersion",
        "iam:RemoveRoleFromInstanceProfile",
        "iam:DeleteRole",
        "iam:DeletePolicy",
        "iam:ListPolicyVersions",
        "iam:ListInstanceProfilesForRole",
        "iam:ListRolePolicies",
        "iam:ListAttachedRolePolicies",
        "iam:GetRole",
        "iam:ListRoles",
        "iam:DetachRolePolicy",
        "organizations:DescribeOrganization",
        "account:ListRegions"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:*",
        "dynamodb:*",
        "route53:*",
        "rds:*",
        "s3:*",
        "cloudshell:*",
        "resource-groups:*",
        "secretsmanager:TagResource",
        "secretsmanager:CreateSecret",
        "secretsmanager:DescribeSecret",
        "secretsmanager:GetResourcePolicy",
        "secretsmanager:GetSecretValue",
        "kms:CreateKey",
        "kms:DescribeKey",
        "kms:GetKeyPolicy",
        "kms:ScheduleKeyDeletion",
        "kms:GetKeyRotationStatus",
        "secretsmanager:DeleteSecret",
        "secretsmanager:PutSecretValue",
        "kms:ListResourceTags"
      ],
      "Resource": "*"
    }
  ]
}
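If you prefer to set up the deployer from the command line, the recommendation above (a dedicated non-root user with this policy) can be sketched as follows. The policy name, user name, and file name are illustrative assumptions, not values required by the installer, and the policy document is abbreviated here; paste the full policy from above in practice. The AWS CLI calls are guarded so the sketch fails gracefully where the CLI is not configured.

```shell
# Save the deployer policy to a file (abbreviated here -- use the full policy above).
cat > lho-deployer-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": ["iam:GetUser", "iam:ListRoles"], "Resource": "*" }
  ]
}
EOF

# Create a dedicated (non-root) deployer user and the policy.
# These calls require a configured AWS CLI; "||" keeps the sketch non-fatal elsewhere.
aws iam create-user --user-name lho-deployer || echo "skipped: AWS CLI not configured"
aws iam create-policy --policy-name lho-deployer-policy \
  --policy-document file://lho-deployer-policy.json || echo "skipped: AWS CLI not configured"

# Then attach it to the user (substitute your account ID):
#   aws iam attach-user-policy --user-name lho-deployer \
#     --policy-arn "arn:aws:iam::<account-id>:policy/lho-deployer-policy"
```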
Databricks metastore admin permission
Besides the AWS permissions listed above, the deploying user must be a metastore admin of the Databricks Unity Catalog. We recommend creating a group, configuring it as metastore admin, and adding the admins to that group.
Supported AWS Regions
The Lakehouse Optimizer supports any region where all the required AWS services can be deployed. You can check service availability per region on the AWS services-by-region page.
Databricks Service Principal
Ensure the Databricks service principal was created according to the prerequisites linked below.
AWS Resource Requirements | Databricks Service Principal
LHO Agent Access
Decide how the LHO agent will be granted authorization in your Databricks environment. We suggest creating an IAM user and configuring a role trust, as outlined in the documentation linked below.
AWS Resource Requirements | AWS Permission Policies
Single Sign-On Configuration
Lakehouse Optimizer supports several identity providers. Please follow the link to the configuration page for your SSO provider of choice. If you use Databricks accounts for authentication, you can skip this section.
Microsoft Entra ID / Azure Active Directory
I. Installation Guide
Step 1. Log in to AWS and open AWS CloudShell
Step 2. Prepare Lakehouse Optimizer deployment archive.
wget https://bplmdemoappstg.blob.core.windows.net/deployment/vm-aws/lho_aws_r9.zip
unzip lho_aws_r9.zip -d lho
cd lho
chmod +x deploy-lm.sh
sed -i.bak 's/\r$//' *.sh
Step 3. Build the argument list and run deployment script.
There are multiple options for an LHO for AWS deployment; please review the options below and create an argument list that best configures LHO for your environment’s needs.
The base arguments for all deployments are below:
bash ./deploy-lm.sh --email_certbot "{user email}" \
  --aws_region {aws region} \
  --databricks_account_id {databricks account ID} \
  --databricks_principal_guid {Databricks service principal client id} \
  --dns_record_name "{DNS record name}" \
  --name_prefix "{NAME}" \
  --acr_username "{Container_registry_username}"
email_certbot
- An admin email that Let’s Encrypt notifies about certificate expiry, as a backup to the automatically renewing certificates LHO configures.
aws_region
- The AWS region to deploy LHO into. It should be the same as, or as close as possible to, your Databricks workspaces.
databricks_account_id
- Your Databricks account ID. Manage your Databricks account
databricks_principal_guid
- The client ID of the Databricks service principal created as part of the prerequisites. AWS Resource Requirements | Databricks Service Principal
dns_record_name
- A friendly, descriptive DNS name for the application, e.g. lho-app-dev.yourdomain.com. An 'A' record is created in an AWS Hosted Zone in the account this deployment runs in, so ensure the hosted zone exists in that AWS account.
name_prefix
- Prefix for the names of the AWS resources. Note that this is also used to name the S3 bucket; bucket names must be globally unique across all of AWS, so we recommend a specific name such as lho-<your company name here> instead of a generic one like lho.
acr_username
- Container registry username used to authenticate and pull the LHO app container. Contact Blueprint support if you do not have this information.
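Putting the base arguments together, a filled-in invocation might look like the following sketch. Every value shown is a hypothetical placeholder for illustration; substitute your own. Building the arguments in an array keeps quoting predictable, and the script is only invoked if it is actually present (e.g. inside the unpacked lho/ folder).

```shell
# Hypothetical example values -- replace every one with your own.
ARGS=(
  --email_certbot "admin@example.com"
  --aws_region "us-east-1"
  --databricks_account_id "11111111-2222-3333-4444-555555555555"
  --databricks_principal_guid "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
  --dns_record_name "lho-app-dev.example.com"
  --name_prefix "lho-examplecorp"
  --acr_username "examplecorp-registry-user"
)

# Run the deployment script only if it exists in the current directory.
if [ -x ./deploy-lm.sh ]; then
  bash ./deploy-lm.sh "${ARGS[@]}"
fi
```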
Single Sign-On Options
Azure Entra ID (Active Directory)
Two additional arguments must be passed in: 'tenant_id' and 'service_principal'. You will also be prompted for the client secret during script execution.
service_principal
- Your Azure AD app registration client (App) ID. See the steps to create this above, under 'Azure Active Directory Single Sign-On Prerequisites'.
tenant_id
- Your Azure AD tenant ID. Get subscription and tenant IDs in the Azure portal - Azure portal
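For an Azure Entra ID deployment, the two extra arguments are simply appended to the base argument list. The GUIDs below are hypothetical placeholders; note that the client secret is not passed on the command line, since the script prompts for it during execution.

```shell
# Hypothetical Azure AD values -- replace with your tenant and app registration IDs.
SSO_ARGS=(
  --tenant_id "00000000-0000-0000-0000-000000000000"
  --service_principal "11111111-2222-3333-4444-555555555555"
)

# Appended to the base argument list, e.g.:
#   bash ./deploy-lm.sh "${ARGS[@]}" "${SSO_ARGS[@]}"
```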
Okta
Okta SSO requires that the Okta base URL and client ID be added as arguments. You will also be prompted for the client secret during script execution.
Google
Google SSO requires the Google app client ID to be passed in as an argument. You will also be prompted for the client secret during script execution.
LHO for smaller environments
If you are deploying a smaller-footprint version of LHO, you can use the --docker_hosted_sql switch. This deploys a SQL Server container on the same host VM as the LHO app container. The switch can be combined with any of the Azure AD arguments above, as well as the arguments below for configuring the LHO agent.
Options for LHO Agent authorization
For documentation on determining which authorization solution works best, see: AWS Resource Requirements | Lakehouse Optimizer Agent Permissions
Configuring option 1, leveraging an IAM user for LHO agent access
After creating an IAM user and assigning it the LHO agent policy (outlined in the documentation linked above), pass:
--iam_agent_username
- Username of the AWS IAM user created and assigned the LHO agent policy to access DynamoDB and SQS.
Configuring option 2, using role trusts
The code snippet creates a role trust file: a newline-separated list of all the instance profile role ARNs or AWS accounts in use in your Databricks environment. The file location is then passed as an argument to the deployment script.
Copy the snippet to a text editor and change the ARN values defined in the role_array variable declaration. After updating, run it in AWS CloudShell to create the role trust file.
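The original snippet is not reproduced here, but based on the description above, a minimal sketch would look like the following. The ARNs are assumptions to be replaced with the instance profile role ARNs actually used by your Databricks clusters; the output path matches the location referenced below.

```shell
# Replace these example ARNs with the instance profile role ARNs (or AWS account
# ARNs) in use in your Databricks environment.
role_array=(
  "arn:aws:iam::111111111111:role/databricks-cluster-role-dev"
  "arn:aws:iam::111111111111:role/databricks-cluster-role-prod"
)

# Write one ARN per line to the role trust file expected by --role_trusts_file.
mkdir -p ~/lho
printf '%s\n' "${role_array[@]}" > ~/lho/agent_trusted_roles.txt

cat ~/lho/agent_trusted_roles.txt
```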
Your argument list:
--role_trusts_file
- The location of the created role trust list file. If you followed the steps above to create the file, the location is ~/lho/agent_trusted_roles.txt
Argument recap
There are multiple argument options available to configure the LHO application to suit your needs. Beyond the required base arguments, you can mix and match the additional options discussed above. For example, you can run an Azure AD SSO version of LHO that enables both IAM user and role trust authentication. Run ./deploy-lm.sh --help to see all available options.
Step 4. Fill in the script input fields when prompted.
The result of the process is:
Step 5. Installation complete
Once the installation is complete, you will see the following output.
The URL for logging in to LHO is printed in the CloudShell output.
Please copy the App URL; you will use it to log in to LHO.
If the printed link uses the http protocol, change it to https.
Step 6. SSH login (optional)
Once the script is done, the SSH key to access the VM is available in the ~/.ssh folder of your AWS CloudShell session.
ssh -i ~/.ssh/<BPLM-APP-KEY> centos@<BPLM-APP-VM>
You can use this to log in to the host VM and gather logs from the container via docker logs:
docker logs -n 500 -f <container-name>
II. First-Time Login Guide
Step 1. Login to LHO App
Log in with the login URL provided when the installation completed.
Step 2. Grant permissions
If this is the first time you are logging in to LHO with your user, LHO’s App Service will ask you for permissions. Click Accept.
Approval Required Troubleshooting
Depending on how your Azure subscription is configured by your IT department, you might also come across the following “Approval required” screen. The approval must be granted by the IT department; it is not related to any configuration done by the LHO installation process.
Please follow this guide to avoid the dialog:
Active Directory Enable Access for All Users at Tenant Level
If you still have sign-in issues and see the screens below, please follow the next steps:
Step 3. Add app role
Open the created Azure App → App roles → Create app role with the value bplm-admin
Step 4. Assign user and group
Open the created Azure App overview page → Managed application in local directory → Assign users and groups → Add user/group → add the new role bplm-admin → select your user and save
Step 5. Add the redirect uri to the registered app
Open registered app Overview page on Azure Portal
Open Redirect URIs page
Click on Add platform
Choose Web and insert new redirect URI
Step 6. Configure License
Once logged in, you will be redirected to the License panel.
Note: a bplm-admin role assignment is required (see Step 4 above).
Copy the License Token, then contact Blueprint and provide the token in order to receive a trial or permanent license for your deployment.
Once you receive the license, add the License Key and Public Key in this panel.
Once this is done, LHO is ready to start monitoring your assets.
III. Configure Databricks Workspace
The following actions are required to enable the Lakehouse Optimizer to gather cost and telemetry data:
Go to → Settings → Provisioning & Permissions
Generate your Databricks token
Add token
This setup is the quickest option to get your Databricks environment monitored. There are also other configuration options for LHO, for example enabling monitoring on assets one-by-one. For more configuration options, please contact Blueprint or follow the more advanced topics in the documentation material.
IV. Load Consumption Data
Step 1. Navigate to the Consumption Data panel.
This page is available only to the Billing Admin role.
Step 2. Load Now consumption data
LHO supports loading consumption (cost/usage detail) data from both your Databricks account (for your DBU charges) and AWS Cost Explorer (for your AWS cloud charges), either on demand or on a scheduled basis.
At this step, for the purpose of this tutorial, select Run Now and load data for the past 30 days, or 2 months at most. Depending on the size of your environment this process might take long, so we recommend loading a smaller date interval; the goal is to see cost and telemetry data in LHO as soon as possible.
Loading consumption data for large environments for the past 12 months can take 12 hours or even more.
Step 3. Scheduled load consumption data
Most likely, Databricks resources are used daily in your infrastructure. We therefore recommend creating a scheduled daily consumption data load so that LHO reports updated costs every day.
Recommended schedule configuration:
load data: incrementally
frequency: daily
You can configure multiple schedules based on your particular needs.
V. Explore Cost and Telemetry Insights
Once all previous steps are completed, your LHO instance is ready to monitor your cloud infrastructure.
VI. Automatically grant access consent for all Active Directory Users (optional)
The following guide helps you configure the login process so that users with a valid AD account using single sign-on log in automatically, without having to click through “grant permissions” dialogs or contact IT for further approvals.
Active Directory Enable Access for All Users at Tenant Level
VII. Assign User Roles in Lakehouse Optimizer (optional)
If Azure Active Directory is used for authentication, each user can also be assigned one of the roles supported by Lakehouse Optimizer.
The following article provides further configuration details:
What roles are there in the LHO app?
How can I assign LHO roles to users?