Accessing S3 data securely
This article shows how to use role-based access to authenticate with Amazon Simple Storage Service (S3) and access your data securely from ClickHouse Cloud.
Introduction
Before setting up secure S3 access, it helps to understand how it works: ClickHouse services access private S3 buckets by assuming a role within your AWS account.
This approach allows you to manage all access to S3 buckets in a single place (the IAM policy of the assumed role) without having to go through all individual bucket policies to add or remove access.
Setup
Obtaining the ClickHouse service IAM role ARN
1 - Log in to your ClickHouse Cloud account.
2 - Select the ClickHouse service you want to connect from.
3 - Select the Settings tab.
4 - Scroll down to the Network security information section at the bottom of the page.
5 - Copy the Service role ID (IAM) value for the service.
Setting up IAM assume role
Option 1: Deploying with CloudFormation stack
1 - Log in to your AWS Account in the web browser with an IAM user that has enough permissions to create & manage IAM roles.
2 - Visit this URL to populate the CloudFormation stack.
3 - Enter the IAM role of the ClickHouse service, which you copied in the previous section.
4 - Configure the CloudFormation stack. Below is additional information about these parameters.
| Parameter | Default Value | Description |
|---|---|---|
| RoleName | ClickHouseAccess-001 | The name of the new role that ClickHouse Cloud will use to access your S3 bucket. |
| Role Session Name | * | Role Session Name can be used as a shared secret to further protect your bucket. |
| ClickHouse Instance Roles | | Comma-separated list of ClickHouse service IAM roles that can use this secure S3 integration. |
| Bucket Access | Read | Sets the level of access for the provided buckets. |
| Bucket Names | | Comma-separated list of bucket names that this role will have access to. Note: use the bucket name, not the full bucket ARN. |
5 - Select the I acknowledge that AWS CloudFormation might create IAM resources with custom names checkbox.
6 - Click the Create stack button in the bottom right.
7 - Make sure the CloudFormation stack completes with no error.
8 - Select the Outputs of the CloudFormation stack.
9 - Copy the RoleArn value for this integration. This is needed to configure access to your S3 bucket in the next step.
Option 2: Manually create IAM role
1 - Log in to your AWS Account in the web browser with an IAM user that has permission to create & manage IAM roles.
2 - Browse to the IAM Service Console.
3 - Create a new IAM role with the following trust and IAM policies, replacing {ClickHouse_IAM_ARN} with the IAM role ARN belonging to your ClickHouse instance and {BUCKET_NAME} with the name of the bucket.
Trust policy
IAM policy
4 - Copy the new IAM role ARN after creation. This is needed to configure access to your S3 bucket in the next step.
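The two policy documents from step 3 can be sketched in Python as follows. This is a minimal illustration, not the exact policies from the ClickHouse documentation: the helper names `build_trust_policy` and `build_read_policy` are hypothetical, the set of S3 actions is an assumed read-only baseline, and the ARN and bucket name are placeholders standing in for {ClickHouse_IAM_ARN} and {BUCKET_NAME}.

```python
import json


def build_trust_policy(clickhouse_iam_arn: str) -> dict:
    """Trust policy: lets the ClickHouse service role assume the new role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": clickhouse_iam_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_read_policy(bucket_name: str) -> dict:
    """IAM policy: read-only access to one bucket (assumed action set)."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level actions apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


# Placeholder values; substitute your own service role ARN and bucket name.
trust = build_trust_policy(
    "arn:aws:iam::111111111111:role/example-clickhouse-service-role"
)
policy = build_read_policy("example-bucket")

print(json.dumps(trust, indent=2))
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the trust relationship and permissions editors of the IAM console. Note that the bucket-level statement targets the bucket ARN while the object-level statement targets `bucket/*`; mixing these up is a common cause of access-denied errors.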
Access your S3 bucket with the ClickHouseAccess role
ClickHouse Cloud has a new feature that allows you to specify extra_credentials as part of the S3 table function. Below is an example of how to run a query using the newly created role copied from above.
Below is an example query that uses the role_session_name as a shared secret to query data from a bucket. If the role_session_name is not correct, this operation will fail.
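As an illustration of both variants, the following Python sketch assembles the query text for the `s3` table function with `extra_credentials`, with and without `role_session_name`. The helper names `sql_quote` and `describe_s3_query` are hypothetical, and the bucket URL, file format, account ID, and session name are placeholders, not values from a real deployment.

```python
def sql_quote(value: str) -> str:
    """Render a Python string as a single-quoted ClickHouse string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def describe_s3_query(url, data_format, role_arn, role_session_name=None):
    """Build a DESCRIBE TABLE query that reads S3 through an assumed role."""
    credentials = [f"role_arn = {sql_quote(role_arn)}"]
    if role_session_name is not None:
        # Optional shared secret; must match what the role's trust policy expects.
        credentials.append(f"role_session_name = {sql_quote(role_session_name)}")
    return (
        f"DESCRIBE TABLE s3({sql_quote(url)}, {sql_quote(data_format)}, "
        f"extra_credentials({', '.join(credentials)}))"
    )


# Placeholder values; substitute your own object URL and role ARN.
URL = "https://s3.amazonaws.com/BUCKETNAME/BUCKETOBJECT.csv"
ROLE_ARN = "arn:aws:iam::111111111111:role/ClickHouseAccess-001"

basic_query = describe_s3_query(URL, "CSVWithNames", ROLE_ARN)
secret_query = describe_s3_query(
    URL, "CSVWithNames", ROLE_ARN, role_session_name="secret-role-name"
)

print(basic_query)
print(secret_query)
```

Either printed statement can be run in the ClickHouse Cloud SQL console or any ClickHouse client. The second form only succeeds when the session name matches the one the role was configured to accept.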
To reduce data transfer costs, it's recommended that your S3 bucket is in the same region as your ClickHouse Cloud service. For more information, refer to S3 pricing.
Advanced action control
For stricter access control, it's possible to restrict the bucket policy to only accept requests that originate from ClickHouse Cloud's VPC endpoints using the aws:SourceVpce condition. To obtain the VPC endpoints for your ClickHouse Cloud region, open a terminal and run:
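A scripted way to do this lookup is sketched below in Python. It rests on two assumptions that should be checked against the Cloud IP Addresses page: that the address list is published at `https://api.clickhouse.cloud/static-ips.json`, and that it contains an `aws` list whose entries carry `region` and `s3_endpoints` keys. The function names are hypothetical, and the sample data is made up to show the assumed shape.

```python
import json
from urllib.request import urlopen

# Assumed location of the published address list; verify before relying on it.
STATIC_IPS_URL = "https://api.clickhouse.cloud/static-ips.json"


def s3_endpoints_for_region(doc: dict, region: str) -> list:
    """Return the S3 VPC endpoint IDs listed for one AWS region."""
    endpoints = []
    for entry in doc.get("aws", []):
        if entry.get("region") == region:
            endpoints.extend(entry.get("s3_endpoints", []))
    return endpoints


def fetch_s3_endpoints(region: str, url: str = STATIC_IPS_URL) -> list:
    """Download the address list and extract the endpoints for a region."""
    with urlopen(url, timeout=10) as response:
        return s3_endpoints_for_region(json.load(response), region)


# Offline example using made-up data in the assumed layout.
sample = {
    "aws": [
        {"region": "us-east-1", "s3_endpoints": ["vpce-0example1", "vpce-0example2"]},
        {"region": "eu-west-1", "s3_endpoints": ["vpce-0example3"]},
    ]
}
print(s3_endpoints_for_region(sample, "us-east-1"))
```

To query the live list, call `fetch_s3_endpoints("us-east-1")` with the region of your ClickHouse Cloud service. An empty result means either that the region is not listed or that the assumed JSON layout does not match.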
Then, add a deny rule to the IAM policy with the returned endpoints:
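One way to express such a deny rule is sketched below in Python. It appends a statement that denies `s3:GetObject` unless the request arrives through one of the listed VPC endpoints, using the `aws:SourceVpce` condition key. The helper name `add_vpce_deny_rule`, the `Sid`, the choice of denied action, and the endpoint IDs are all illustrative assumptions; adjust them to your own policy.

```python
import json


def add_vpce_deny_rule(policy: dict, vpc_endpoint_ids: list) -> dict:
    """Return a copy of the policy with a deny rule for non-listed VPC endpoints."""
    updated = json.loads(json.dumps(policy))  # deep copy; input stays unchanged
    updated["Statement"].append(
        {
            "Sid": "DenyOutsideClickHouseVpce",
            "Effect": "Deny",
            "Action": ["s3:GetObject"],
            "Resource": "*",
            "Condition": {
                # Matches when the request's endpoint is none of the listed IDs.
                "StringNotEquals": {"aws:SourceVpce": list(vpc_endpoint_ids)}
            },
        }
    )
    return updated


# Placeholder policy and endpoint IDs; substitute the values for your region.
base_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::example-bucket/*"],
        }
    ],
}
restricted = add_vpce_deny_rule(base_policy, ["vpce-0example1", "vpce-0example2"])
print(json.dumps(restricted, indent=2))
```

Because an explicit deny overrides any allow in AWS policy evaluation, this statement blocks object reads from any other network path, including your own tooling, so test it on a non-critical bucket first.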
For more details on accessing endpoints for ClickHouse Cloud services, see Cloud IP Addresses.