Monitoring Lakehouse Optimizer (LHO) using AWS CloudWatch

Introduction

In this guide, we will create a Python-based canary in AWS CloudWatch Synthetics. The canary is a script that runs on a schedule and uses the LHO API to monitor your application's availability and performance: it verifies that the application can log in and list the jobs, clusters, and pipelines in every workspace visible to the user account that runs the script.

Step 1: Create a Python virtualenv

python -m venv myenv

Step 2: Install the Requests package

cd myenv
source bin/activate
pip install requests

Step 3: Create the script folder structure needed for the canary

deactivate               # deactivate the virtualenv
cd ..                    # exit the virtualenv folder
mkdir -p canary/python
cd canary/python

Step 4: Create the canary script

  • Create a file called script.py and open it for editing

  • Paste the contents of the following code block, then set the values of _url, username, and password

import requests

def basic_custom_script():
    # Set variables
    messages = ""
    errors = ""
    _url = 'YOUR LHO URL'
    username = 'LHO DATABRICKS USERNAME'
    password = 'LHO DATABRICKS PASSWORD'

    # Log in and capture the session cookie
    response = requests.post(
        f'{_url}/api/1.0/public/login',
        json={'username': username, 'password': password}
    )
    jsessionid = response.cookies.get('JSESSIONID')
    cookies = {'JSESSIONID': jsessionid}

    # Check that subscriptions can be listed
    subscriptions = requests.get(f'{_url}/api/1.0/subscriptions', cookies=cookies)
    if subscriptions.status_code != 200:
        errors += '\tSubscriptions request failure\r\n'
    if len(subscriptions.json()) == 0:
        errors += '\tNo subscriptions found\r\n'
    for sub in subscriptions.json():
        messages += f"Found subscription: {sub['displayName']}\r\n"

    # Collect the workspaces of every subscription, keyed by subscription ID
    # so the ID can be passed to the API calls below
    workspaces_map = {}
    for sub in subscriptions.json():
        workspaces = requests.get(
            f"{_url}/api/1.0/subscriptions/{sub['subscriptionId']}/workspaces",
            cookies=cookies
        )
        if workspaces.status_code != 200:
            errors += f"Subscription {sub['displayName']} - Workspace request failure {workspaces.text}\r\n"
            continue
        if len(workspaces.json()) == 0:
            errors += f"Subscription {sub['displayName']} - No workspaces in subscription\r\n"
            continue
        workspaces_map[sub['subscriptionId']] = []
        for space in workspaces.json():
            workspaces_map[sub['subscriptionId']].append(space)
            messages += f"Subscription {sub['displayName']} - found workspace: {space['displayName']}\r\n"

    # Check that clusters, jobs, and pipelines can be listed in every workspace
    for subscription, workspaces in workspaces_map.items():
        for workspace in workspaces:
            clusters = requests.get(
                f"{_url}/api/1.0/clusters?subscriptionId={subscription}&workspaceId={workspace['workspaceId']}",
                cookies=cookies
            )
            if clusters.status_code != 200:
                errors += f"Subscription {subscription} - workspace {workspace['displayName']} - Cluster request failure {clusters.text}\r\n"
            elif len(clusters.json()) == 0:
                errors += f"Subscription {subscription} - workspace {workspace['displayName']} - No clusters in workspace\r\n"
            else:
                for cluster in clusters.json():
                    messages += f"Subscription {subscription} - workspace {workspace['displayName']} - found cluster: {cluster['clusterName']}\r\n"

            jobs = requests.get(
                f"{_url}/api/1.0/jobs?subscriptionId={subscription}&workspaceId={workspace['workspaceId']}",
                cookies=cookies
            )
            if jobs.status_code != 200:
                errors += f"Subscription {subscription} - workspace {workspace['displayName']} - Job request failure {jobs.text}\r\n"
            elif len(jobs.json()) == 0:
                errors += f"Subscription {subscription} - workspace {workspace['displayName']} - No jobs in workspace\r\n"
            else:
                for job in jobs.json():
                    messages += f"Subscription {subscription} - workspace {workspace['displayName']} - found job: {job['jobName']}\r\n"

            pipelines = requests.get(
                f"{_url}/api/1.0/pipelines?subscriptionId={subscription}&workspaceId={workspace['workspaceId']}",
                cookies=cookies
            )
            if pipelines.status_code != 200:
                errors += f"Subscription {subscription} - workspace {workspace['displayName']} - Pipeline request failure {pipelines.text}\r\n"
            elif len(pipelines.json()['pipelines']) == 0:
                errors += f"Subscription {subscription} - workspace {workspace['displayName']} - No pipelines in workspace\r\n"
            else:
                for pipeline in pipelines.json()['pipelines']:
                    messages += f"Subscription {subscription} - workspace {workspace['displayName']} - found pipeline: {pipeline['name']}\r\n"

    if len(errors) != 0:
        raise Exception(errors)
    return messages

def handler(event, context):
    return basic_custom_script()
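If you want to sanity-check the script locally before packaging it (optional, assuming you run it from the folder containing script.py with the virtualenv activated), a minimal smoke test looks like this:

import script

# Invoke the canary entry point directly; prints the collected messages
# or raises with the accumulated errors
print(script.handler(None, None))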

Step 5: Copy the contents of the 'site-packages' folder to the Python folder

cd ../..   # back to the folder that contains both the myenv venv and the canary folder
cp -rfv myenv/lib/python3.11/site-packages/* canary/python/   # adjust python3.11 to your Python version

Step 6: Archive the Python folder into a ZIP and upload it into an S3 bucket

cd canary
zip -r canary.zip python
  • Create a new S3 bucket if you don't have one already, and upload the canary.zip file to it.
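If you prefer to script the upload instead of using the console, a minimal sketch with boto3 is shown below; the bucket name and object key are placeholders to replace with your own:

import boto3

# Upload the packaged canary to S3; assumes your AWS credentials are configured
s3 = boto3.client('s3')
s3.upload_file('canary.zip', 'your-canary-bucket', 'canary/canary.zip')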

Step 7: Create a new canary in CloudWatch Synthetics

  • In the AWS Management Console, navigate to CloudWatch Synthetics, and select "Canaries" from the left-hand menu.

  • Click the "Create canary" button to start the creation process.

Step 8: Configure the basics of your canary

  • Select "Import from S3"

  • Specify a name for your canary

  • Select the latest Python runtime from the dropdown menu, e.g. syn-python-selenium-1.3

  • Provide the S3 path to the archive uploaded in Step 6

  • Specify the entry point of the canary as script.handler, where script is the name of the script file (script.py, without the extension) and handler is the function inside it that runs the canary code

Step 9: Schedule the canary

  • Set the canary to run continuously on a fixed schedule, e.g. every 5, 10, or 15 minutes.

Step 10: Select an IAM role

  • Choose an existing IAM role that has the necessary permissions or create a new one.

Step 11: Create the canary

  • Click the "Create canary" button to create the canary.
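Steps 8 through 11 can also be performed programmatically rather than in the console. A sketch using boto3, where the canary name, bucket, key, artifact location, and role ARN are placeholders:

import boto3

synthetics = boto3.client('synthetics')
synthetics.create_canary(
    Name='lho-monitor',                           # placeholder canary name
    Code={
        'S3Bucket': 'your-canary-bucket',         # bucket from Step 6
        'S3Key': 'canary/canary.zip',
        'Handler': 'script.handler',              # entry point from Step 8
    },
    ArtifactS3Location='s3://your-canary-bucket/artifacts/',
    ExecutionRoleArn='arn:aws:iam::123456789012:role/your-canary-role',
    Schedule={'Expression': 'rate(5 minutes)'},   # schedule from Step 9
    RuntimeVersion='syn-python-selenium-1.3',
)

# create_canary does not start the canary; start it explicitly
synthetics.start_canary(Name='lho-monitor')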

Step 12: Create alerts for the canary

  • Once the canary is created, you can create alerts for it using the "Failed" metric of the canary.

  • In the AWS Management Console, navigate to CloudWatch Alarms, and click the "Create alarm" button.

  • Select the "Failed" metric for your canary from the "CloudWatchSynthetics" namespace.

  • Set "Statistic" to "Sample Count", "Period" to "5 minutes", and the threshold type to "Static" with the condition "greater than 0".

  • Choose the action you want to take when the alarm is triggered, such as sending an email or triggering an AWS Lambda function.

  • Click the "Create alarm" button to create the alert.
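The same alarm can also be created with boto3. A sketch assuming an existing SNS topic for notifications; the alarm name, canary name, and topic ARN are placeholders:

import boto3

cloudwatch = boto3.client('cloudwatch')
cloudwatch.put_metric_alarm(
    AlarmName='lho-canary-failed',
    Namespace='CloudWatchSynthetics',             # namespace for canary metrics
    MetricName='Failed',
    Dimensions=[{'Name': 'CanaryName', 'Value': 'lho-monitor'}],
    Statistic='SampleCount',
    Period=300,                                   # 5 minutes
    EvaluationPeriods=1,
    Threshold=0,
    ComparisonOperator='GreaterThanThreshold',    # alarm when Failed > 0
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:your-alert-topic'],
)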

And that's it! You've successfully created a Python-based canary in AWS CloudWatch Synthetics and set up alerts to notify you when it fails. You can use this canary to monitor the availability and performance of your application and make sure it's always up and running.