Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add runtime for AWS Lambda #1127

Merged
merged 34 commits into from
Feb 1, 2024
Merged

Add runtime for AWS Lambda #1127

merged 34 commits into from
Feb 1, 2024

Conversation

jleedev
Copy link
Collaborator

@jleedev jleedev commented Jan 11, 2024

This adds the lambda-web crate to adapt the actix App to speak to Lambda by way of the lambda_runtime crate.

Briefly: AWS Lambda has native support for scripting languages to execute a function directly; compiled languages must embed a runtime to fetch incoming events from Lambda and post the responses. This detects the environment variables to start up in Lambda mode instead of the normal HTTP server, and is added as an optional feature.

Lambda has five (!) distinct ways of routing HTTP requests to a function; this supports some of them. (Specifically, the most obvious way to do this is with a Function URL, which is newest and simplest, and perhaps with CloudFront, which speaks to the Function URL and not Lambda directly.)

The error handling could probably be refined, I was just trying to get this to compile.

(Supported: API Gateway HTTP API with payload format version 2.0; API Gateway REST API; Lambda function URLs / Not supported: API Gateway HTTP API with payload format version 1.0; Application Load Balancer)

Necessary for #1102 to be able to run the released packages directly, and only having to configure the appropriate environment.

TODOs

  • Update Book Documentation
  • Update README
  • Add CI test(s)

Copy link
Member

@nyurik nyurik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! My biggest concern is how can we test this, especially as part of the CI?

martin/Cargo.toml Show resolved Hide resolved

match () {
#[cfg(feature = "lambda")]
() if is_running_on_lambda() => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not a simple block like this? Why do you need a match statement?

#[cfg(feature = "lambda")]
if is_running_on_lambda() {
    ....
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. we need to document that a new env var variable has magical properties per https://github.com/hanabu/lambda-web/blob/ddceea3b8ffcfae465fc8d0059aaa52a8ff94f8e/src/lib.rs#L34

if AWS_LAMBDA_RUNTIME_API is defined, runs under AWS lambda runtime

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was frustrated that an if/else wouldn't work with the conditional compilation, and this is a way to express that only one of the code paths will run. But an if statement and a return would probably be clearer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may refactor it later, no worries. I think we should focus on the main issue - Testing, and ideally - testing in CI. We need a clear way to ensure that after some person adds some random code, we don't have to do too much work to make sure it continues to work in all expected scenarios.

@jleedev
Copy link
Collaborator Author

jleedev commented Jan 11, 2024

looks great! My biggest concern is how can we test this, especially as part of the CI?

Looks like AWS SAM (Serverless Application Model) is the harness to use: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-using-automated-tests.html

Hah, this was surprisingly painless:

~/sam-app$ cat template.yaml
AWSTemplateFormatVersion: 2010-09-09
Transform: 'AWS::Serverless-2016-10-31'
Resources:
  MartinFunction:
    Type: 'AWS::Serverless::Function'
    Properties:
      PackageType: Image
      Architectures:
        - arm64
      ImageUri: martin
      ImageConfig:
        Command:
          - 'https://jleedev-planet.s3.us-east-2.amazonaws.com/demotiles.pmtiles'
      FunctionUrlConfig:
        AuthType: NONE
~/sam-app$ sam local generate-event apigateway http-api-proxy | jq '.rawPath="/catalog"|.requestContext.http.method="GET"' >event.json 
~/sam-app$ sam local invoke --event event.json 
[…]
{"body": "eyJ0aWxlcyI6eyJkZW1vdGlsZXMiOnsiY29udGVudF90eXBlIjoiYXBwbGljYXRpb24veC1wcm90b2J1ZiIsImNvbnRlbnRfZW5jb2RpbmciOiJnemlwIiwibmFtZSI6Im1hcGxpYnJlIiwiZGVzY3JpcHRpb24iOiIiLCJhdHRyaWJ1dGlvbiI6IlJlbmRlcmVkIHdpdGggPGEgaHJlZj1cImh0dHBzOi8vd3d3Lm1hcHRpbGVyLmNvbS9kZXNrdG9wL1wiPk1hcFRpbGVyIERlc2t0b3A8L2E+In19LCJzcHJpdGVzIjp7fSwiZm9udHMiOnt9fQ==", "cookies": [], "headers": {"content-type": "application/json", "vary": "Origin, Access-Control-Request-Method, Access-Control-Request-Headers"}, "isBase64Encoded": true, "statusCode": 200}
~/sam-app$ sam local start-lambda
~/sam-app$ AWS_ACCESS_KEY_ID=x AWS_SECRET_ACCESS_KEY=x AWS_REGION=us-east-1 aws --endpoint http://localhost:3001 lambda invoke --function-name MartinFunction --payload file://event.json --cli-binary-format raw-in-base64-out /dev/stdout 
{"body": "eyJ0aWxlcyI6eyJkZW1vdGlsZXMiOnsiY29udGVudF90eXBlIjoiYXBwbGljYXRpb24veC1wcm90b2J1ZiIsImNvbnRlbnRfZW5jb2RpbmciOiJnemlwIiwibmFtZSI6Im1hcGxpYnJlIiwiZGVzY3JpcHRpb24iOiIiLCJhdHRyaWJ1dGlvbiI6IlJlbmRlcmVkIHdpdGggPGEgaHJlZj1cImh0dHBzOi8vd3d3Lm1hcHRpbGVyLmNvbS9kZXNrdG9wL1wiPk1hcFRpbGVyIERlc2t0b3A8L2E+In19LCJzcHJpdGVzIjp7fSwiZm9udHMiOnt9fQ==", "cookies": [], "headers": {"content-type": "application/json", "vary": "Origin, Access-Control-Request-Method, Access-Control-Request-Headers"}, "isBase64Encoded": true, "statusCode": 200}{
    "StatusCode": 200
}

Copy link
Member

@nyurik nyurik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much simpler, thanks! I added a few more comments, but this is looking better and better, almost ready to go. I will add a few TODOs to the top comment.

@@ -9,7 +9,7 @@ use martin::{read_config, Config, MartinResult};

const VERSION: &str = env!("CARGO_PKG_VERSION");

async fn start(args: Args) -> MartinResult<Server> {
async fn start(args: Args) -> MartinResult<impl Future<Output = MartinResult<()>>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks ... scary. Could you add a comment explaining why you have future of a future here, or if we can somehow simplify this? Basically here we have a desugared Future<Result<Future<Result>>>>, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, the actix Server type was just being used as a future, and desugars to this. I note that the main function is already awkward with the .await.unwrap().await.unwrap().

The only reason to do anything with the server besides awaiting it immediately would be to pause or stop it, and also to print out the listening address as we do here.

Defining a type alias for what the "Server" is would reduce the clutter in the other file, but this function can really be changed to return MartinResult<()>, since this is essentially the main.

pub fn new_server(
config: SrvConfig,
state: ServerState,
) -> MartinResult<(LocalBoxFuture<'static, MartinResult<()>>, String)> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing as above - any way to simplify the result of this, or at least lots of comments?

@nyurik
Copy link
Member

nyurik commented Jan 15, 2024

@jleedev hi, let me know if you need any help with this PR, i really do want to get this merged in, but no pressure! :)

@jleedev
Copy link
Collaborator Author

jleedev commented Jan 17, 2024

Current status:

Figured out how to build and test a Lambda function on GitHub actions. The sam local invoke command is straightforward but starts and stops the server each time, which would be suboptimal for running a full test suite. The sam local start-lambda --warm-containers EAGER is better, but requires actually using the AWS client to send the Invoke RPC. Rigging that up with bash and the aws cli is not extremely terrible.

Also requires having a (about ten line) yaml SAM template to actually run it, which is sort of useless boilerplate; the point of this change is so that the binary will "just work" when tossed into Lambda, and perhaps a sample template that someone can launch, which isn't really the same as a template whose only purpose is to run the integration tests.

At any rate, I propose adding a CI test that invokes the lambda once, to verify that it works in a minimal sense, and a followup can integrate the test suite fully. Translating the normal HTTP calls to Lambda invocations, etc.

I'll address the type comments above. If you could make the documentation changes that would be good I think.

@nyurik
Copy link
Member

nyurik commented Jan 17, 2024

Thanks, sounds good - a simple Lambda smoke test would be a great start that we can improve upon later. Adding a doc page is easy - I created a simple placeholder https://github.com/jleedev/martin/pull/1 - but the actual "how to" is a mystery to me - i have never used lambdas.

Verifies that the (amd64) container image can start up under AWS Lambda
and respond to an HTTP event with a 200. Does not supply any
configuration or test any handlers.
@jleedev
Copy link
Collaborator Author

jleedev commented Jan 17, 2024

Well, that worked in that SAM actually started martin with no arguments and thus failed during the Init phase. And then neglected to fail the CI.

@nyurik
Copy link
Member

nyurik commented Jan 17, 2024

I'll have to think of which CI step to move this to. Most likely it will be part of the linux one, maybe even a separate test that simply downloads the linux binary from the temp storage, and runs it without postgres database - e.g. it will simply use a small test pmtiles. Ideally, it would be great if we can simulate an S3 bucket that the lambda accesses - in that case we can place a sample pmtiles into it, and the lambda runner will run martin which will have just one CLI parameter - the URL of the s3 file.

@nyurik
Copy link
Member

nyurik commented Jan 18, 2024

@jleedev for CI, add something like this into ci.yml around line 291:

  test-aws-lambda:
    name: Test AWS Lambda
    runs-on: ubuntu-latest
    needs: [ build ]
    steps:
      - name: Checkout sources
        uses: actions/checkout@v4
      - name: Download build artifact build-x86_64-unknown-linux-gnu
        uses: actions/download-artifact@v3
        with:
          name: build-x86_64-unknown-linux-gnu
          path: target/
      - run: chmod +x target/martin
      - run: test/test-aws-lambda.sh
        env:
          MARTIN_BIN=target/martin

@nyurik
Copy link
Member

nyurik commented Jan 18, 2024

P.S. Actually an even better path would be to write a simple script similar to tests/test.sh, e.g. tests/test-aws-lambda.sh -- which would have some of the bits and pieces of the test.sh script:

#!/usr/bin/env bash
set -euo pipefail

MARTIN_PORT="${MARTIN_PORT:-3111}"
MARTIN_URL="http://localhost:${MARTIN_PORT}"
MARTIN_ARGS="${MARTIN_ARGS:---listen-addresses localhost:${MARTIN_PORT}}"  # adjust this to use lambda?
MARTIN_BIN="${MARTIN_BIN:-target/debug/martin} ${MARTIN_ARGS}"

function wait_for {
  # copy this function from test.sh
}

function kill_process {
  # copy this function from test.sh
}

$MARTIN_BIN "${ARG[@]}"  &
MARTIN_PROC_ID=`jobs -p | tail -n 1`
trap "echo 'Stopping Martin server $MARTIN_PROC_ID...'; kill -9 $MARTIN_PROC_ID 2> /dev/null || true; echo 'Stopped Martin server $MARTIN_PROC_ID';" EXIT HUP INT TERM
wait_for $MARTIN_PROC_ID Martin "$MARTIN_URL/health"

# do whatever tests needed

kill_process $MARTIN_PROC_ID Martin

You may also add the logging parts to keep it even more consistent

@nyurik
Copy link
Member

nyurik commented Jan 31, 2024

@jleedev please run just fmt-md to fix some markdown issues. Thx!

On the hopes that al2023 won't complain about:

    martin: /lib64/libm.so.6: version `GLIBC_2.35' not found (required by martin)
@jleedev
Copy link
Collaborator Author

jleedev commented Jan 31, 2024

Regarding this libc thing: I used the docker images that the workflow produced in my fork, and it ran successfully on AWS Lambda. The docker image uses musl.

The release tarball uses gnu. I just tried on CloudShell, and it fails in the same way as on Lambda:

$ curl -OL https://github.com/maplibre/martin/releases/download/v0.13.0/martin-x86_64-unknown-linux-gnu.tar.gz
$ tar -xzvf martin-x86_64-unknown-linux-gnu.tar.gz     
$ ./martin 
./martin: /lib64/libm.so.6: version `GLIBC_2.35' not found (required by ./martin)
$ cat /etc/os-release 
NAME="Amazon Linux"
VERSION="2023"
ID="amzn"
ID_LIKE="fedora"
VERSION_ID="2023"
PLATFORM_ID="platform:al2023"
PRETTY_NAME="Amazon Linux 2023"

So the Lambda documentation (and CI step) should specify to pull from the container image, not the release tarballs.

@nyurik
Copy link
Member

nyurik commented Jan 31, 2024

So the Lambda documentation (and CI step) should specify to pull from the container image, not the release tarballs.

We have both -gnu and -musl in the releases. Do they both fail? BTW, feel free to ping me on slack to discuss

@jleedev
Copy link
Collaborator Author

jleedev commented Jan 31, 2024

Aha, I was confusing the "Artifacts" section of the run with the Release assets. So the CI must extract the cross-build zip artifact to get at the musl build, and the documentation can refer to the musl release (it already does).

Copy link
Member

@nyurik nyurik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome!!! i think its almost done. I added a few nits that should be very easy to fix, and just to clarify for myself if they are needed and how they are being used. Thanks again!

.github/files/lambda-function/Makefile Show resolved Hide resolved
.github/files/lambda-layer/bootstrap Show resolved Hide resolved
tar -C src/bin/ -xzf martin-aarch64-unknown-linux-musl.tar.gz martin
```

Every zip-based Lambda function runs a file called `bootstrap`. `vim src/bootstrap`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better use heredoc for this - easier, and does not require familiarity with vim. Might also use #!/usr/bin/env sh, but not certain which shebang is better.

cat <<EOF > src/bootstrap
#!/bin/sh
set -eu
exec martin -c ${_HANDLER}.yaml
EOF

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like AL2023's /bin/sh is "GNU bash, version 5.2.15(1)-release (x86_64-amazon-linux-gnu)", and they also have a merged / and /usr. So it should be fine.

I'm amused to learn that #!/bin/sh is actually not a POSIX compliant thing to do, as the path to sh cannot be assumed.

Anyway, /bin/sh causes Bash to start up in its posix mode, and refers to Dash on Debian and Ubuntu systems, but this small shell script shouldn't encounter any differences :)

exec martin -c ${_HANDLER}.yaml
```

Write the SAM template. `vim template.yaml`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use heredoc as above


2. Click Deploy, wait for the success banner, and visit your function URL.

### Extra things
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"things" here looks weird - either Extras or TODO ?

Suggested change
### Extra things
### TODO


Run `sam deploy --guided`.

1. Stack Name: [martin-layer]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

broken link?


Add your layer:

1. Click “add a layer” (green banner at the top, or the very bottom).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in cases like these, you may want to add screenshot images, but we can do this in a separate PR.

### Extra things

- Lambda has a default timeout of 3 seconds, and 128 MB of memory, maybe this is suboptimal.
- Martin will cache tiles in memory, but this might be useless for an ephemeral server.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is useless - if martin binary will handle more than one request, it will be better to cache at least some things - for example, when using pmtiles, cache the directory structure.

.unwrap_or_else(|e| on_error(e))
.await
.unwrap_or_else(|e| on_error(e));
start(Args::parse()).await.unwrap_or_else(|e| on_error(e));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably better to inline the on_error here. Also, I noticed the same issue in martin-cp if you want to inline it there at the same time.

Suggested change
start(Args::parse()).await.unwrap_or_else(|e| on_error(e));
start(Args::parse()).await.unwrap_or_else(|e| {
if log_enabled!(log::Level::Error) {
error!("{e}");
} else {
eprintln!("{e}");
}
std::process::exit(1);
});

tests/test-aws-lambda.sh Show resolved Hide resolved
@jleedev
Copy link
Collaborator Author

jleedev commented Feb 1, 2024

Made all the changes.

@nyurik
Copy link
Member

nyurik commented Feb 1, 2024

@jleedev there are a few minor comments in the diff view

jleedev and others added 4 commits February 1, 2024 11:45
Co-authored-by: Yuri Astrakhan <yuriastrakhan@gmail.com>
Co-authored-by: Yuri Astrakhan <yuriastrakhan@gmail.com>
@jleedev
Copy link
Collaborator Author

jleedev commented Feb 1, 2024

Think I got them all now!

Copy link
Member

@nyurik nyurik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome work, thank you!!!

We may want to create follow up issues to remind later of any further tasks, but this is totally up to you :)

tests/test-aws-lambda.sh Show resolved Hide resolved
@nyurik nyurik enabled auto-merge (squash) February 1, 2024 18:32
@nyurik nyurik merged commit 9f9f181 into maplibre:main Feb 1, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants