
Terms:

  • LHM Application Role - an IAM role assigned to the EC2 instance (VM) where the Lakehouse Monitor (LHM) is deployed. The role grants either sts:AssumeRole permission for cross-account access or regular permission policies for direct resource access.

  • LHM Agent Role - an IAM role that is assumed by the Databricks Workspace Instance Profile Roles enabled for the Databricks workloads monitored by LHM.

  • LHM Application host AWS Account - the AWS account where the LHM Application (VM) is deployed and where the DynamoDB and SQS artifacts are also created.

  • Databricks Workspace AWS account - AWS accounts hosting Databricks workspaces

Databricks consumption data: log delivery for billable usage in S3

The S3 bucket that stores the billable usage (Databricks consumption data) requires an S3 bucket policy that specifies the scope of access granted to the LHM Application. The scope can be one of:

  1. Full AWS organization

  2. Full AWS Account where LHM App is hosted

  3. Exactly the IAM Role of the LHM Application in the AWS Account hosting it

    Depending on the client's security configuration for the S3 bucket, two options are available for cross-account access.

    a) Bucket policy and KMS key policy: applicable when custom KMS keys are used. The custom key and the bucket must belong to the same AWS region.

    Bucket policy:

    # Full AWS organization
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "S3ReadObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::<bucket>/<path_prefix>/*",
                "Condition": {
                    "StringEquals": {
                        "aws:PrincipalOrgID": "<org_id>"
                    }
                }
            },
            {
                "Sid": "S3ListBucket",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:ListBucket",
                "Resource": "arn:aws:s3:::<bucket>",
                "Condition": {
                    "StringEquals": {
                        "aws:PrincipalOrgID": "<org_id>"
                    },
                    "StringLike": {
                        "s3:prefix": "<path_prefix>/*"
                    }
                }
            }
        ]
    }

# Full AWS Account where LHM App is hosted
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ReadObject",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<LHM_App_Host_AWS_Account_Id>:root"            
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<bucket>/<path_prefix>/*"
        },
        {
            "Sid": "S3ListBucket",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<LHM_App_Host_AWS_Account_Id>:root"            
            },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::<bucket>",
            "Condition": {
                "StringLike": {
                    "s3:prefix": "<path_prefix>/*"
                }
            }
        }
    ]
}

# Exactly the IAM Role of the LHM Application in the AWS Account hosting it
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ReadObject",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<LHM_App_Host_AWS_Account_Id>:role/<LHM_App_IAM_Role>"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<bucket>/<path_prefix>/*"
        },
        {
            "Sid": "S3ListBucket",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<LHM_App_Host_AWS_Account_Id>:role/<LHM_App_IAM_Role>"            
            },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::<bucket>",
            "Condition": {
                "StringLike": {
                    "s3:prefix": "<path_prefix>/*"
                }
            }
        }
    ]
}
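
The three bucket policy variants above differ only in the Principal and in the organization condition. As a rough illustration (not part of Lakehouse Monitor), the following Python sketch generates each variant from its parameters. The function name build_bucket_policy and its arguments are hypothetical.

```python
import json


def build_bucket_policy(bucket, path_prefix, *, org_id=None, account_id=None, role_name=None):
    """Return a bucket policy dict for one of the three access scopes.

    Hypothetical helper: pass org_id for the organization scope, account_id
    alone for the account scope, or account_id plus role_name for the role scope.
    """
    scope_condition = {}
    if org_id is not None:
        # Scope 1: any principal, restricted to the AWS organization.
        principal = "*"
        scope_condition = {"StringEquals": {"aws:PrincipalOrgID": org_id}}
    elif account_id is not None and role_name is not None:
        # Scope 3: exactly the LHM Application IAM role.
        principal = {"AWS": f"arn:aws:iam::{account_id}:role/{role_name}"}
    elif account_id is not None:
        # Scope 2: the whole account hosting the LHM Application.
        principal = {"AWS": f"arn:aws:iam::{account_id}:root"}
    else:
        raise ValueError("provide org_id, or account_id (optionally with role_name)")

    read_statement = {
        "Sid": "S3ReadObject",
        "Effect": "Allow",
        "Principal": principal,
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket}/{path_prefix}/*",
    }
    if scope_condition:
        read_statement["Condition"] = dict(scope_condition)

    list_statement = {
        "Sid": "S3ListBucket",
        "Effect": "Allow",
        "Principal": principal,
        "Action": "s3:ListBucket",
        "Resource": f"arn:aws:s3:::{bucket}",
        # Listing is always limited to the billable-usage prefix.
        "Condition": {**scope_condition, "StringLike": {"s3:prefix": f"{path_prefix}/*"}},
    }
    return {"Version": "2012-10-17", "Statement": [read_statement, list_statement]}


def render_policy(policy):
    """Serialize a policy dict as indented JSON, ready to paste into the console."""
    return json.dumps(policy, indent=4)
```

Generating the policy this way keeps the GetObject and ListBucket statements consistent with each other when the bucket name or prefix changes.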

KMS key policy:

# Exactly the IAM Role of the LHM Application in the AWS Account hosting it
{
    "Version": "2012-10-17",
    "Id": "key-consolepolicy-3",
    "Statement": [
        {
           ... the default statement for local trusting ...
        },
        {
            "Sid": "Allow use of the key to LHM App IAM Role ",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<LHM_App_Host_AWS_Account_Id>:role/<LHM_App_IAM_Role>"            
            },
            "Action": "kms:Decrypt",
            "Resource": "*"
        }
    ]
}

LHM Application IAM Role permission policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<bucket>/<path_prefix>/*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::<bucket>",
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "<path_prefix>/*"
                    ]
                }
            }
        },
        {
            "Sid": "DecryptKMSbucket",
            "Action": [
                "kms:Decrypt"
            ],
            "Effect": "Allow",
            "Resource": "<ARN_OF_CUSTOM_KMS_KEY_IN_SAME_REGION_AS_BUCKET>"
        }
    ]
}

Configuring Lakehouse Monitor to read from S3:

CONSUMPTION_BILLABLE_USAGE_PATH=s3a://<bucket>/<path_prefix>/billable-usage/csv
STORAGE_AWS_S3_REGION=<bucket_region>

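b) Assume Role for S3: applicable when AWS managed KMS keys are used. The key policy of an AWS managed key cannot be edited to grant cross-account access, so the LHM Application assumes a role in the account that owns the bucket instead.

Permission policy for the S3 and KMS role in the AWS account that owns the S3 bucket: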

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ReadObject",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<bucket>/<path_prefix>/*"
        },
        {
            "Sid": "S3ListBucket",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::<bucket>",
            "Condition": {
                "StringLike": {
                    "s3:prefix": "<path_prefix>/*"
                }
            }
        },
        {
            "Sid": "DecryptKMSbucket",
            "Action": [
                "kms:Decrypt"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:kms:<bucket_region>:<s3_aws_account_id>:key/*"
        }
    ]
}

Trust policy for the S3 role (only the variant that trusts a specific remote role is shown; for the account-ID or PrincipalOrgID variants, see the examples above):

# Exactly the IAM Role of the LHM Application in the AWS Account hosting it
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<LHM_App_Host_AWS_Account_Id>:role/<LHM_App_IAM_Role>"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
 

LHM Application IAM Role permission policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::<s3_aws_account_id>:role/<s3_role_name>"
        }
    ]
}

Configuring Lakehouse Monitor to read from S3:

CONSUMPTION_BILLABLE_USAGE_PATH=s3a://<bucket>/<path_prefix>/billable-usage/csv
STORAGE_AWS_S3_REGION=<bucket_region>
CROSS_ACCOUNT_ASSUME_IAM_ROLE_S3_DBX_BILLING_APP=arn:aws:iam::<s3_aws_account_id>:role/<s3_role_name>
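
With option b, the LHM Application first assumes the S3 role and then reads the bucket with the resulting temporary credentials. The sketch below illustrates that flow for troubleshooting purposes, assuming a boto3 STS client is passed in. The helper name assume_role_client_kwargs and the session name are hypothetical, and LHM's own implementation may differ.

```python
def assume_role_client_kwargs(sts_client, role_arn, session_name="lhm-billing-check"):
    """Assume role_arn and return credential keyword arguments for boto3.client().

    sts_client is expected to behave like boto3.client("sts"); it is injected
    so the function can be exercised without AWS access.
    """
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    credentials = response["Credentials"]
    # Temporary credentials always include a session token.
    return {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }
```

With boto3 installed, the returned dictionary can be passed as keyword arguments to boto3.client("s3", region_name=<bucket_region>, ...). A list_objects_v2 call with Prefix set to <path_prefix>/ should then succeed if the trust and permission policies above are in place.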

DynamoDB and SQS:

Both the LHM Application and the LHM Agent running in the Databricks workspaces require access to the DynamoDB tables and SQS queue, which are created in the same AWS account as the LHM Application. This account is referred to as <LHM_App_AWS_Account_Id> in the permission policies below (it is the same account as <LHM_App_Host_AWS_Account_Id> elsewhere on this page):

# LHM Agent IAM Role in the application-host AWS account
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "WriteToDynamoDbAndSqs",
            "Effect": "Allow",
            "Action": [
                "dynamodb:BatchWriteItem",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
                "sqs:SendMessage"
            ],
            "Resource": [
                "arn:aws:sqs:<optional_region_or_*>:<LHM_App_AWS_Account_Id>:bplm*",
                "arn:aws:dynamodb:<optional_region_or_*>:<LHM_App_AWS_Account_Id>:table/bplm*"
            ]
        }
    ]
}

# LHM Application (VM) IAM Role in the application-host AWS account:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DynamoAndSQS",
            "Effect": "Allow",
            "Action": [
                "sqs:DeleteMessage",
                "dynamodb:CreateTable",
                "sqs:GetQueueUrl",
                "dynamodb:UpdateTimeToLive",
                "dynamodb:DescribeTable",
                "sqs:ReceiveMessage",
                "dynamodb:Scan",
                "dynamodb:Query",
                "sqs:CreateQueue"
            ],
            "Resource": [
                "arn:aws:sqs:<optional_source_region_or_*>:<LHM_App_AWS_Account_Id>:bplm*",
                "arn:aws:dynamodb:<optional_source_region_or_*>:<LHM_App_AWS_Account_Id>:table/bplm*"
            ]
        }
    ]
}

Trust policy for the LHM Agent IAM Role in the application-host AWS account:

The DynamoDB tables and SQS queue are accessed by LHM agents running in multiple AWS accounts that host Databricks workspaces, so the trust policy can take either of the two forms below:

# LHM Agent IAM Role fully trusts a list of AWS accounts
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                  "arn:aws:iam::<Dbx_Workspace_AWS_Account1_id>:root",
                  "arn:aws:iam::<Dbx_Workspace_AWS_Account2_id>:root",
                  ...
                ]          
            },
            "Action": "sts:AssumeRole",
            "Condition": {}
        }
    ]
}

or

# LHM Agent IAM Role trusts a Databricks workspace Instance Profile IAM Role 
# in a particular AWS account

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<Dbx_Workspace_AWS_Account>:role/<Dbx_Wksp_Instance_Profile_IAM_Role>"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
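
The account list in the first trust policy grows with every Databricks workspace account that is onboarded. A minimal sketch, assuming the account IDs are kept in a plain Python list, of how that policy could be generated. build_agent_trust_policy is a hypothetical helper, not an LHM or AWS API.

```python
import json
import re


def build_agent_trust_policy(workspace_account_ids):
    """Return a trust policy that lets whole workspace accounts assume the LHM Agent role."""
    account_ids = list(workspace_account_ids)
    if not account_ids:
        raise ValueError("at least one Databricks workspace account ID is required")
    for account_id in account_ids:
        # AWS account IDs are 12-digit strings; leading zeros are significant.
        if not re.fullmatch(r"\d{12}", account_id):
            raise ValueError(f"not a 12-digit AWS account ID: {account_id!r}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam::{account_id}:root" for account_id in account_ids]
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Example account IDs are placeholders.
    print(json.dumps(build_agent_trust_policy(["111122223333", "444455556666"]), indent=4))
```

Because the output comes from json.dumps, it is always strict JSON, which avoids hand-editing mistakes such as a trailing comma after the last key.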

Permission policy for the Databricks Workspace Instance Profile Roles that will assume the LHM_Agent_IAM_Role:

(see more info here: instance profile)

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::<LHM_App_Host_AWS_Account_ID_or_*>:role/<LHM_Agent_IAM_Role>"
        }
    ]
}

In the Lakehouse Monitor .env config file, provide the ARN (AWS account ID and IAM Role name) of the LHM Agent role that will be shipped to each monitored Databricks workspace.

Note that the LHM Agent role, DynamoDB, SQS and the LHM Application (VM) all belong to the same AWS account, LHM_App_Host_AWS_Account_ID:

CROSS_ACCOUNT_ASSUME_IAM_ROLE_AGENT=arn:aws:iam::<LHM_App_Host_AWS_Account_ID>:role/<LHM_Agent_IAM_Role>
STORAGE_AWS_REGION=<dynamodb_and_sqs_region>
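
Since the agent role must live in the application-host account, it can be useful to check the account ID embedded in the configured ARN. The sketch below parses .env-style lines and splits a role ARN into its account ID and role name. parse_env and parse_role_arn are hypothetical helpers, and the parser handles only simple KEY=value lines.

```python
import re

# Matches IAM role ARNs without a role path, e.g. arn:aws:iam::111122223333:role/name.
ROLE_ARN_PATTERN = re.compile(
    r"^arn:aws:iam::(?P<account_id>\d{12}):role/(?P<role_name>[\w+=,.@-]+)$"
)


def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def parse_role_arn(arn):
    """Return (account_id, role_name) for an IAM role ARN, or raise ValueError."""
    match = ROLE_ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    return match["account_id"], match["role_name"]
```

Comparing the parsed account ID against the account of the LHM Application VM catches the common mistake of pointing CROSS_ACCOUNT_ASSUME_IAM_ROLE_AGENT at a role in a workspace account.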

AWS Cost Explorer

The LHM Application IAM Role assumes an IAM Role in the Databricks Workspace AWS Account. That role's permission policy grants access to the Cost Explorer data of that AWS account:

IAM_Role_Cost_Explorer permission policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowGetCostAndUsages",
            "Effect": "Allow",
            "Action": "ce:GetCostAndUsage",
            "Resource": "*"
        }
    ]
}

Trust policy for the IAM_Role_Cost_Explorer that allows the LHM Application IAM Role in the app-host AWS Account to assume the cost explorer role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<LHM_App_Host_AWS_Account_ID>:role/<LHM_App_IAM_Role>"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

LHM application IAM Role permission policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AssumeCostExplorerRole",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::<Databricks_Wksp_AWS_Account_Id>:role/<IAM_Role_Cost_Explorer>"
        }
    ]
}

In the Lakehouse Monitor .env config file, provide the ARN of the Cost Explorer role to assume:

CROSS_ACCOUNT_ASSUME_IAM_ROLE_COST_EXPLORER_APP=arn:aws:iam::<Databricks_Wksp_AWS_Account_Id>:role/<IAM_Role_Cost_Explorer>
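
The ce:GetCostAndUsage permission above corresponds to the Cost Explorer GetCostAndUsage API. As an illustration of what the assumed role allows, the sketch below builds a daily cost request and reads the totals from the response. It assumes a client that behaves like boto3.client("ce") created with the assumed-role credentials. The function names and the choice of the UnblendedCost metric are assumptions, not necessarily what LHM queries.

```python
from datetime import date, timedelta


def build_cost_request(end_day, days=7):
    """Build GetCostAndUsage parameters for the `days` days before end_day.

    The End date is exclusive in the Cost Explorer API.
    """
    start_day = end_day - timedelta(days=days)
    return {
        "TimePeriod": {"Start": start_day.isoformat(), "End": end_day.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
    }


def fetch_daily_costs(ce_client, end_day, days=7):
    """Return {day: cost} using a Cost Explorer client (injected for testability)."""
    response = ce_client.get_cost_and_usage(**build_cost_request(end_day, days))
    return {
        result["TimePeriod"]["Start"]: float(result["Total"]["UnblendedCost"]["Amount"])
        for result in response["ResultsByTime"]
    }
```

An AccessDeniedException from this call usually points at either the trust policy of the Cost Explorer role or a missing sts:AssumeRole statement on the LHM Application role.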
