Skip to content

Lambda for restructuring objects created in an S3 bucket

License

Notifications You must be signed in to change notification settings

buoyant-data/lambda-s3-restructure

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

S3 Restructure Lambda

This Lambda function responds to S3 bucket notifications and will copy the objects to a new directory structure based on the rules defined in environment variables.

Building

Building and testing the Lambda can be done with cargo: cargo test build.

In order to deploy this in AWS Lambda, it must first be built with the cargo lambda command line tool, e.g.:

cargo lambda build --release --output-format zip

This will produce the file: target/lambda/lambda-delta-optimize/bootstrap.zip

Infrastructure

The deployment.tf file contains the necessary Terraform to provision the function, a DynamoDB table for locking, and IAM permissions. This Terraform does not provision an S3 bucket to optimize.

After configuring the necessary authentication for Terraform, the following steps can be used to provision:

cargo lambda build --release --output-format zip
terraform init
terraform plan
terraform apply
ℹ️

Terraform configures the Lambda to run with the smallest amount of memory allowed. For a sizable table, this may not be sufficient for larger tables.

Environment variables

The following environment variables must be set for the function to run properly

Name Value Notes

INPUT_PATTERN

required but empty by default

A prefix mapping that is compatible with the routefinder syntax, e.g. some/prefix/with/:name/:filename which makes the name and filename parameters available in the output path.

OUTPUT_TEMPLATE

required byt empty by default

A liquid template which produces a compatible path for outputting the file into the S3 bucket.

EXCLUDE_REGEX

optional regular expression for keys to exclude from consideration

⚠️

The parameters must all be present and match in the input and output patterns. If there are variable segments of the INPUT_PATTERN which must be ignored, use the special :ignore parameter name to ignore that segment but still allow it to match the output pattern

Handlebars built-ins

There are a few special patterns which are available for the OUTPUT_TEMPLATE to make it easy to provide more natural dynamic outputs.

Name Value Notes

year

Current year (UTC)

month

Current month (UTC)

day

Current day (UTC)

ds

Date stamp (UTC)

Formatted with %Y-%m-%d e.g. 2023-08-21

region

AWS Region defined by runtime

The value of AWS_REGION or unknown if that variable is not available

Licensing

This repository is intentionally licensed under the AGPL 3.0. If your organization is interested in re-licensing this function for re-use, contact me via email for commercial licensing terms: rtyler@buoyantdata.com