uktrade/s3-dropbox
s3-dropbox

A bearer token authenticated dropbox that drops its payloads into an S3 bucket, designed to run as an AWS Lambda function via a Lambda Function URL.

This is a simple single-function dropbox. Payloads are expected to be small: a limit of 10 KB is enforced.
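A size limit like this could be enforced before anything is written to S3. The following is a minimal sketch, not the code in main.py: the names MAX_PAYLOAD_BYTES, extract_body and check_payload_size are hypothetical, "10 KB" is assumed to mean 10,240 bytes, and the 413 response is an assumed choice of status code. It relies on the body and isBase64Encoded fields of the Lambda Function URL event.

```python
import base64
from typing import Optional

# Assumption: "10 KB" is read as 10 * 1024 bytes; the real limit may differ.
MAX_PAYLOAD_BYTES = 10 * 1024


def extract_body(event: dict) -> bytes:
    """Return the raw request body from a Lambda Function URL event."""
    body = event.get("body") or ""
    # Function URLs base64-encode binary bodies and flag them in the event.
    if event.get("isBase64Encoded"):
        return base64.b64decode(body)
    return body.encode("utf-8")


def check_payload_size(event: dict) -> Optional[dict]:
    """Return a 413 response if the body is too large, otherwise None."""
    if len(extract_body(event)) > MAX_PAYLOAD_BYTES:
        return {"statusCode": 413, "body": "Payload too large"}
    return None
```

Measuring the decoded body rather than the base64 text means the limit applies to what would actually be stored.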

Tip

Looking for the FastAPI-based s3-dropbox? We moved away from that to AWS Lambda. You can access the most recent FastAPI-based code in the git history.


Authentication

There are 2 layers of authentication:

  • Clients that send data must pass an Authorization HTTP header in the format Bearer <token>, where <token> is the client token generated by the create_token.py script. The corresponding server token, which is a hashed form of the client token, should be stored in the AUTH_TOKEN environment variable.

  • The CDN in front of the Lambda Function URL must add an HTTP header in the format Bearer <token>, where <token> is the client token generated by the create_token.py script. The corresponding server token, which is a hashed form of the client token, should be stored in the CDN_TOKEN environment variable. The name of the HTTP header should be stored in the CDN_TOKEN_HTTP_HEADER_NAME environment variable. A typical name would be x-cdn-authorization.

    This is in place so that a web application firewall (WAF) can be attached to the CDN to provide additional protections, for example a web access control list (web ACL) that restricts connections to certain IP addresses.

    The CDN_TOKEN should be generated by a run of the create_token.py script that is separate from the run that generates AUTH_TOKEN.
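The relationship between a client token and its hashed server token can be sketched as follows. This is an illustration only: it assumes the server token is a hex-encoded SHA-256 digest of the client token, which may not match what create_token.py actually does, and the function names create_token_pair and is_valid_bearer are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def create_token_pair() -> Tuple[str, str]:
    """Return (client_token, server_token).

    Assumption: the server token is the SHA-256 hex digest of the client
    token. The real create_token.py may use a different scheme.
    """
    client_token = secrets.token_urlsafe(32)
    server_token = hashlib.sha256(client_token.encode()).hexdigest()
    return client_token, server_token


def is_valid_bearer(header_value: Optional[str], server_token: str) -> bool:
    """Check a 'Bearer <token>' header value against a stored server token."""
    prefix = "Bearer "
    if not header_value or not header_value.startswith(prefix):
        return False
    presented_hash = hashlib.sha256(header_value[len(prefix):].encode()).hexdigest()
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(presented_hash, server_token)
```

Under this scheme the same check would run twice per request: once against AUTH_TOKEN for the Authorization header, and once against CDN_TOKEN for the header named by CDN_TOKEN_HTTP_HEADER_NAME. Storing only the hash means a leaked environment variable does not directly reveal a usable client token.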

Deployment

Deployment is fairly manual (AKA ClickOps):

  1. Create an S3 bucket for the data to be dropped in.

  2. Create a Python Lambda function and copy and paste the code from main.py into the AWS Console, making sure to "deploy" the code. The only Python dependency is boto3, which is already available in AWS Lambda.

  3. Enable Function URL for the Lambda Function, disabling authentication.

  4. Create a CloudFront distribution with origin of the Lambda Function URL, and a behaviour pointing to this origin at a specific path. A typical path would be "/v1/drop".

    The default behaviour, i.e. for other paths, can be set to a "blackhole" origin that points to a non-existent domain, for example "blackhole.invalid".

  5. Create a WAF attached to the CloudFront distribution with appropriate protections. For example, one that allows only specific IP addresses through.

  6. Set environment variables on the Lambda function as described below.

  7. (Safely) pass the client token to the client that will be connecting.

  8. Configure the origin of the CloudFront distribution to send its client token, in the format Bearer <token>, in the HTTP header named by CDN_TOKEN_HTTP_HEADER_NAME.

  9. Set permissions on the execution role of the Lambda function as described below.
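Once deployed, a client would send its payload through the CDN with its bearer token. The sketch below uses only the Python standard library. The function name build_drop_request and the example URL are hypothetical, and the POST method is an assumption, since the accepted HTTP method is not stated here.

```python
import urllib.request


def build_drop_request(url: str, client_token: str, payload: bytes) -> urllib.request.Request:
    """Build a request that drops a payload, authenticated with a bearer token.

    Assumption: the dropbox accepts POST requests.
    """
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Authorization": f"Bearer {client_token}"},
    )


# Example usage (hypothetical URL, not executed here):
#
#   request = build_drop_request(
#       "https://dropbox.example.test/v1/drop", "<client token>", b'{"hello": "world"}'
#   )
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```

The client only sets the Authorization header. The second header, named by CDN_TOKEN_HTTP_HEADER_NAME, is added by the CDN rather than by the client.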

Required environment variables

Configuration is via environment variables, and 4 are required to be explicitly set for s3-dropbox to function.

  • BUCKET

    The S3 bucket name to upload files to

  • AUTH_TOKEN

    The server-side token created by create_token.py, corresponding to the plain text bearer token given to the client.

  • CDN_TOKEN

    The server-side token created by create_token.py, corresponding to the plain text token that the CDN sets in the HTTP header named by CDN_TOKEN_HTTP_HEADER_NAME.

    The CDN_TOKEN should be generated by a run of the create_token.py script that is separate from the run that generates AUTH_TOKEN.

  • CDN_TOKEN_HTTP_HEADER_NAME

    The name of the HTTP header that contains the client token created by create_token.py. This is typically added by the CDN running in front of the Lambda Function URL, for example CloudFront.
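Reading the four required variables and failing early when one is missing could look like the following. This is a sketch under assumptions, not the code in main.py: load_config and REQUIRED_VARIABLES are hypothetical names, and raising RuntimeError is one possible way of reporting the problem.

```python
import os
from typing import Mapping

REQUIRED_VARIABLES = ("BUCKET", "AUTH_TOKEN", "CDN_TOKEN", "CDN_TOKEN_HTTP_HEADER_NAME")


def load_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, raising if any are missing or empty."""
    missing = [name for name in REQUIRED_VARIABLES if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARIABLES}
```

Passing the environment as a parameter keeps the function easy to exercise in tests with a plain dictionary.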

Optional environment variables

These environment variables can be used to configure s3-dropbox, but typically do not have to be explicitly set.

  • S3_ENDPOINT_URL

    The endpoint of S3 or of an S3-compatible service. If this is unset, the AWS S3 endpoints are used, so it typically only needs to be set during testing outside of AWS Lambda.

  • AWS_* environment variables.

    These are technically required because boto3, the Python AWS SDK, uses them. However, they are automatically populated by the AWS Lambda environment, so they typically only need to be explicitly set during testing outside of AWS Lambda.
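One way an optional endpoint override can be wired into boto3 is sketched below. The helper names s3_client_kwargs and make_s3_client are hypothetical, and main.py may do this differently. boto3.client does accept an endpoint_url keyword argument, and it picks up credentials and region from the AWS_* variables by itself.

```python
import os
from typing import Mapping


def s3_client_kwargs(environ: Mapping[str, str] = os.environ) -> dict:
    """Return extra keyword arguments for boto3.client("s3", ...)."""
    endpoint_url = environ.get("S3_ENDPOINT_URL")
    # Only override the endpoint when one is configured, so that the
    # default AWS S3 endpoints are used inside AWS Lambda.
    return {"endpoint_url": endpoint_url} if endpoint_url else {}


def make_s3_client():
    """Create an S3 client configured from the environment."""
    # Imported here so that s3_client_kwargs can be used without boto3 installed.
    import boto3

    return boto3.client("s3", **s3_client_kwargs())
```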

Permissions

The only permission that the Lambda IAM execution role needs (in addition to any usual logging permissions) is s3:PutObject on the bucket specified in the BUCKET environment variable.
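As an illustration, a policy granting only this permission can be generated as below. The function name put_object_policy and the bucket name are hypothetical, and the ARN assumes the standard aws partition. Note that s3:PutObject applies to objects, so the resource is the bucket ARN followed by /*.

```python
import json


def put_object_policy(bucket: str) -> dict:
    """Return an IAM policy document allowing s3:PutObject on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                # Object-level actions need the /* suffix on the bucket ARN.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, for illustration only.
    print(json.dumps(put_object_policy("my-dropbox-bucket"), indent=2))
```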

Running type checking and tests

Python requirements must be installed and a local S3-like service started:

pip install -r requirements-dev.txt
./start-services

Then to run type checking:

mypy .

Then to run the tests:

pytest

Or to run the tests with output capturing disabled, so that anything printed during the tests is shown:

pytest -s

Running locally outside of tests

At the time of writing, it hasn't been necessary to run the function locally outside of tests.
