Deployment and Quick Setup Guide: AWS

 

This guide walks through installing LHO on AWS via a deployment script that creates all the required AWS resources.

It also serves as a quick-setup guide for enabling cost and telemetry monitoring on a Databricks account and a Databricks workspace.

 


Installation Prerequisites

Helpful Competencies During Deployment

To follow the steps below and understand the installation procedure, familiarity with AWS services such as EC2, VPC, RDS, Secrets Manager, and IAM roles/policies is helpful. Experience with Linux administration, bash scripting, and (optionally) Azure Entra ID app registrations is also suggested, but not strictly required.

AWS Account Prerequisites

The AWS account (the “deployer” AWS IAM user) used to run the LHO installation script must already have the following rights granted for the installation to complete successfully. The required AWS IAM policy is shown below in JSON format.

We strongly recommend not using the root user for this deployment. Instead, either create a separate user with the permissions below or assign them to an existing user via a role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:CreateServiceLinkedRole",
        "iam:DeleteServiceLinkedRole",
        "iam:CreateInstanceProfile",
        "iam:AddRoleToInstanceProfile",
        "iam:DeleteInstanceProfile",
        "iam:GetUser",
        "iam:AttachRolePolicy",
        "iam:GetInstanceProfile",
        "iam:PassRole",
        "iam:CreatePolicy",
        "iam:ListEntitiesForPolicy",
        "iam:AttachUserPolicy",
        "iam:CreatePolicyVersion",
        "iam:ListAttachedUserPolicies",
        "iam:ListPolicies",
        "iam:DetachUserPolicy",
        "iam:ListUsers",
        "iam:ListGroups",
        "iam:CreateRole",
        "iam:GetPolicy",
        "iam:GetPolicyVersion",
        "iam:RemoveRoleFromInstanceProfile",
        "iam:DeleteRole",
        "iam:DeletePolicy",
        "iam:ListPolicyVersions",
        "iam:ListInstanceProfilesForRole",
        "iam:ListRolePolicies",
        "iam:ListAttachedRolePolicies",
        "iam:GetRole",
        "iam:ListRoles",
        "iam:DetachRolePolicy",
        "organizations:DescribeOrganization",
        "account:ListRegions"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:*",
        "dynamodb:*",
        "route53:*",
        "rds:*",
        "s3:*",
        "cloudshell:*",
        "resource-groups:*",
        "secretsmanager:TagResource",
        "secretsmanager:CreateSecret",
        "secretsmanager:DescribeSecret",
        "secretsmanager:GetResourcePolicy",
        "secretsmanager:GetSecretValue",
        "kms:CreateKey",
        "kms:DescribeKey",
        "kms:GetKeyPolicy",
        "kms:ScheduleKeyDeletion",
        "kms:GetKeyRotationStatus",
        "secretsmanager:DeleteSecret",
        "secretsmanager:PutSecretValue",
        "kms:ListResourceTags"
      ],
      "Resource": "*"
    }
  ]
}
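As a sketch of how the deployer user might be set up, the commands below print the AWS CLI calls that would create a dedicated user and attach the policy above. All names are illustrative assumptions, the policy JSON is assumed to be saved locally as lho-deployer-policy.json, and <account-id> is a placeholder for your AWS account number.

```shell
# Dry-run helper: prints each command instead of executing it.
# Swap 'echo' for direct execution once you've reviewed the commands.
run() { echo "+ $*"; }

# Create the dedicated deployer user (name is an assumption).
run aws iam create-user --user-name lho-deployer

# Create the policy from the JSON document shown above.
run aws iam create-policy --policy-name lho-deployer-policy \
    --policy-document file://lho-deployer-policy.json

# Attach the policy; the ARN uses your AWS account number.
run aws iam attach-user-policy --user-name lho-deployer \
    --policy-arn "arn:aws:iam::<account-id>:policy/lho-deployer-policy"
```

Alternatively, attach the policy to a role and assign that role to an existing user, as recommended above.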

Databricks metastore admin permission

Besides the AWS permissions listed above, the deploying user must be a Metastore Admin of the Databricks Unity Catalog. We recommend creating a group, configuring that group as Metastore Admin, and adding the admins to it.

Supported AWS Regions

The Lakehouse Optimizer supports any region where all the required AWS services can be deployed. You can check availability on the AWS page listing services available per region.


Databricks Service Principal

Ensure the Databricks service principal was created according to the prerequisites linked below.

https://blueprinttechnologies.atlassian.net/wiki/spaces/BLMPD/pages/2577662006/AWS+Resource+Requirements#Databricks-Service-Principal


LHO Agent Access

Decide how the LHO agent will be granted authorization in your Databricks environment. We suggest creating an IAM user and configuring a role trust, as outlined in the documentation linked below.

https://blueprinttechnologies.atlassian.net/wiki/spaces/BLMPD/pages/2577662006/AWS+Resource+Requirements#AWS-Permission-Policies


Azure Active Directory Single Sign-On Prerequisites

LHO currently supports Databricks Accounts and Azure Active Directory as identity providers for signing into the application. A signed-in user must import a Databricks API token for each monitored Databricks workspace, and all Databricks API calls made by LHO on behalf of that user will use this token.

If you choose Azure AD SSO for identity management, create an app registration in the target Azure tenant before completing this installation guide:

https://learn.microsoft.com/en-us/azure/active-directory/develop/quickstart-register-app

Make note of the Azure tenant ID and the application (client) ID.

Create an application client secret and save the generated value for later use in the deployment process. Make note of the secret’s expiration date. Once deployed, this value is stored in AWS Secrets Manager as 'msft-provider-auth-secret'. Rotate that secret value by generating a new client secret as necessary.

Under Authentication, click “Add a platform” to add a Web platform authentication configuration.

Also under Authentication, enable the authorization endpoint to issue ‘ID tokens’.

If you’ve already decided on a DNS name for the app, you can also update the Web redirect URI at this time to include:

https://{dns record name}/login/oauth2/code/azure

You can also update this value after deployment. It must be configured correctly to enable Azure AD SSO.
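As a concrete illustration, the redirect URI can be assembled from your chosen DNS record name; the value below is a placeholder, not a real deployment name.

```shell
# Placeholder DNS record name; substitute your actual value.
dns_record_name="lho-app-dev.example.com"

# Redirect URI format expected by the Azure AD app registration.
redirect_uri="https://${dns_record_name}/login/oauth2/code/azure"
echo "$redirect_uri"   # → https://lho-app-dev.example.com/login/oauth2/code/azure
```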

 


I. Installation Guide

 

Step 1. Log in to AWS and open AWS CloudShell

Step 2. Prepare Lakehouse Optimizer deployment archive.

wget https://bplmdemoappstg.blob.core.windows.net/deployment/vm-aws/lho-centos.zip
unzip lho-centos.zip -d lho
cd lho
chmod +x deploy-lm.sh
sed -i.bak 's/\r$//' *.sh

 

Step 3. Build the argument list and run deployment script.

There are multiple options for deploying LHO on AWS. Please review the options below and build an argument list that best configures LHO for your environment’s needs.

 

The base arguments for all deployments are below:

email_certbot - An admin email address that Let’s Encrypt notifies about certificate expiry, as a backup to the automatically renewing certificates LHO configures.

aws_region - The AWS region you wish to deploy LHO into. It should be the same as, or as close as possible to, the region of your Databricks workspaces.

databricks_account_id - Your Databricks account id.

databricks_principal_guid - The client ID of the Databricks service principal created as part of the prerequisites.

dns_record_name - A friendly, descriptive DNS name for the application, e.g. lho-app-dev.yourdomain.com. An 'A' record is created in an AWS Hosted Zone in the account this deployment runs in, so ensure that the hosted zone exists in that AWS account.

name_prefix - Prefix for AWS resource names. Note that this is also used to name the S3 bucket, and bucket names must be globally unique across all of AWS, so we recommend a specific prefix rather than a generic one, e.g. lho-<your company name here> instead of lho.

acr_username - Container registry username to authenticate and pull down the LHO app container. Contact Blueprint support if you do not have this information.

Azure Entra ID (Active Directory) Single Sign-On Enabled Options

If you are also setting up Azure AD SSO, two additional arguments must be passed in: 'tenant_id' and 'service_principal'.

service_principal - Your Azure AD app registration client (app) ID. Steps to create this are above, under 'Azure Active Directory Single Sign-On Prerequisites'.

tenant_id - Your Azure AD tenant id.

LHO for smaller environments

If you are deploying a smaller-footprint version of LHO, you can use the --docker_hosted_sql switch. This deploys a SQL Server container on the same host VM as the LHO app container. The switch can be used in conjunction with any combination of the Azure AD arguments as well as the LHO agent arguments below.

Options for LHO Agent authorization

For documentation on determining which authorization solution works best, see:

Configuring option 1, leveraging an IAM user for LHO agent access

After creating an IAM user and assigning the LHO agent policy (outlined in the documentation linked above):

--iam_agent_username - Username of the AWS IAM user created and assigned the LHO agent policy to access DynamoDB and SQS.

Configuring option 2, using role trusts

First, run the code snippet below to create a role trust file: a newline-separated list of all the instance profile role ARNs in use in your Databricks environments. You may also simply add the account root ARN to have the LHO agent trust all roles in that account.

Change the ARN values defined in the role_array variable declaration to suit your needs.
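A minimal sketch of such a snippet; the ARN values in role_array are placeholders that you should replace with the instance profile role ARNs (or the account root ARN) from your own environment.

```shell
# Placeholder ARNs; edit role_array to match your Databricks environments.
# Adding the account root ARN trusts all roles in that account.
role_array=(
  "arn:aws:iam::123456789012:role/databricks-instance-profile-role"
  "arn:aws:iam::123456789012:root"
)

# Write the ARNs as a newline-separated list to the role trust file.
mkdir -p ~/lho
printf '%s\n' "${role_array[@]}" > ~/lho/agent_trusted_roles.txt
cat ~/lho/agent_trusted_roles.txt
```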

Your argument list:

--role_trusts_file - The location of the created role trust file. If you followed the steps above, the location is ~/lho/agent_trusted_roles.txt.

Argument recap

There are multiple argument options available to configure the LHO application to suit your needs. Beyond the required base arguments, you can mix and match the additional options discussed above. For example, you can run an Azure AD SSO version of LHO that enables both the IAM user and role trust authentication solutions.
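To illustrate how the pieces combine, the sketch below assembles a full argument list and prints the resulting command rather than running it. The flag spellings are taken from the argument names above, and every value is a placeholder assumption, not a working configuration.

```shell
# Placeholder argument list; replace every value with your own.
args=(
  --email_certbot admin@example.com
  --aws_region us-east-1
  --databricks_account_id 11111111-2222-3333-4444-555555555555
  --databricks_principal_guid aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
  --dns_record_name lho-app-dev.example.com
  --name_prefix lho-examplecorp
  --acr_username example-registry-user
  --tenant_id 99999999-8888-7777-6666-555555555555     # Azure AD SSO (optional)
  --service_principal ffffffff-0000-1111-2222-333333333333
  --docker_hosted_sql                                  # smaller-footprint option
  --iam_agent_username lho-agent-user                  # agent option 1
)

# Print the final command for review before executing it yourself.
echo "./deploy-lm.sh ${args[*]}"
```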

 

Step 4. Script input fields:

The result of the process is:

 

Step 5. Installation complete

Once the installation is complete, you will see the following output.

The URL to login to LHO will be printed in the Cloudshell output.

Please copy the app URL that you will use to log in to LHO.

If you received a link with the http protocol, change it to https.

e.g.:
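A minimal shell illustration of the scheme swap; the URL below is a placeholder, not your actual app URL.

```shell
# Placeholder for the URL printed by the deployment script.
app_url="http://lho-app-dev.example.com"

# Swap the http scheme for https before using the link.
app_url="${app_url/http:/https:}"
echo "$app_url"   # → https://lho-app-dev.example.com
```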

 

 

Step 6. SSH login (optional)

Once the script is done, you will also have the SSH key for accessing the VM in the ~/.ssh folder of your AWS CloudShell session.

ssh -i ~/.ssh/<BLPLM-APP-KEY> centos@<BPLM-APP-VM>

 

You can use this to log in to the host VM and gather logs from the container via docker logs (substitute the LHO app container’s name or ID):

docker logs -n 500 -f <container-name>

 


II. First-Time Login Guide

 

Step 1. Login to LHO App

Use the login URL provided when the installation completed.

 

Step 2. Grant permissions

If this is the first time you are logging in to LHO with your user, LHO’s App Service will ask for permissions. Click Accept.

 

Approval Required Troubleshooting

Depending on how your IT department has configured your Azure subscription, you might also come across the following “Approval required” screen. The approval must be granted by the IT department; it is not related to any configuration done by the LHO installation process.

Please follow this guide to avoid the dialog:

 

If you still have issues signing in and see the screens below, please follow the next steps:

 

Step 3. Add app role

Open the created Azure app → App roles → Create an app role with the value bplm-admin

 

Step 4. Assign user and group

Open the created Azure app’s overview page → Managed application in local directory → Assign users and groups → Add user/group → add the new role bplm-admin → select your user and save

 

Step 5. Add the redirect uri to the registered app

Open the registered app’s Overview page in the Azure Portal

Open Redirect URIs page

Click on Add platform

Choose Web and insert new redirect URI


Step 6. Configure License

Once logged in, you will be redirected to the License panel. You will need to add a new assignment for the bplm-admin role.

 

Copy the License Token and provide it to Blueprint in order to receive a trial or permanent license for your deployment.

Once you receive the license, add the License Key and Public Key in this panel.

Once this is done, LHO is ready to start monitoring your assets.

 


III. Configure Databricks Workspace

The following actions are required in order to enable Lakehouse Monitor to gather cost and telemetry data:

  1. Go to → Settings → Provisioning & Permissions

  2. Generate your Databricks token

  3. Add the token

 

This setup is the quickest option to get your Databricks environment monitored. There are also other configuration options for LHO, for example enabling monitoring on assets one-by-one. For more configuration options, please contact Blueprint or follow the more advanced topics in the documentation material.

 

 


IV. Load Consumption Data

Step 1. Navigate to the Consumption Data panel.

This page is available only to users with the Billing Admin role.

 

Step 2. Load Now consumption data

LHO supports loading consumption (cost/usage detail) data both from your Databricks account for your DBU charges and from AWS Cost Explorer for your AWS cloud charges, either on demand or on a scheduled basis.

At this step, for the purposes of this tutorial, select Run Now and load data for the past 30 days, or 2 months at most. Depending on the size of your environment this process might take a long time, so we recommend loading a smaller date interval; the goal is to see cost and telemetry data in LHO as soon as possible.

Loading consumption data for large environments for the past 12 months can take 12 hours or more.

 

Step 3. Scheduled load consumption data

Most likely, Databricks resources are used on a daily basis in your infrastructure. We therefore recommend creating a scheduled daily consumption data load so that LHO reports updated costs every day.

Recommended schedule configuration:

  • load data: incrementally

  • frequency: daily

You can configure multiple schedules based on your particular needs.

 


V. Explore Cost and Telemetry Insights

Once all previous steps are completed, your LHO instance is ready to monitor your cloud infrastructure.

 

 


VI. Automatically grant access consent for all Active Directory Users (optional)

The following guide will help you configure the login process so that users with a valid AD account signing in via single sign-on are logged in automatically, without having to click through “grant permissions” dialogs or contact IT for further approvals.


VII. Assign User Roles in Lakehouse Optimizer (optional)

If Azure Active Directory is used for authentication, each user can also be assigned different roles supported by Lakehouse Optimizer.

The following article provides further configuration details:

What roles are there in the LHO app?

How can I assign LHO roles to users?

 

 

