From 70f0517cec5c9830f07a7dff9f2f7b8565a793e3 Mon Sep 17 00:00:00 2001 From: Lana Brindley Date: Fri, 10 Sep 2021 01:18:59 +1000 Subject: [PATCH] AWS Lambda tutorial review (#353) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Edit index * Edit create data API * Small changes to index * Edit 3rd party ingest * Small edits * Update timescaledb/tutorials/aws-lambda/3rd-party-api-ingest.md Co-authored-by: Attila Tóth Co-authored-by: Attila Tóth --- .../aws-lambda/3rd-party-api-ingest.md | 272 +++++---- .../tutorials/aws-lambda/create-data-api.md | 523 +++++++++--------- timescaledb/tutorials/aws-lambda/index.md | 39 +- .../tutorials/page-index/page-index.js | 9 +- 4 files changed, 429 insertions(+), 414 deletions(-) diff --git a/timescaledb/tutorials/aws-lambda/3rd-party-api-ingest.md b/timescaledb/tutorials/aws-lambda/3rd-party-api-ingest.md index a9480674f67e..6ba61cce9c25 100644 --- a/timescaledb/tutorials/aws-lambda/3rd-party-api-ingest.md +++ b/timescaledb/tutorials/aws-lambda/3rd-party-api-ingest.md @@ -1,30 +1,27 @@ -# Pull data from third-party API and ingest into TimescaleDB (Docker) -In this section, you build a data pipeline which pulls data from a third-party finance API and loads it into TimescaleDB. - -**Required libraries:** - -* pandas -* requests -* psycopg2 -* pgcopy - -This tutorial requires multiple libraries. This can make your deployment package size -larger than the 250MB limit of Lambda. With a Docker container, your package -size can be up to 10GB which gives you much more flexibility regarding what -libraries and dependencies you can use. For more about AWS Lambda container support, see the -[AWS documentation](https://docs.aws.amazon.com/lambda/latest/dg/images-create.html). - -To complete this tutorial, you need to complete these procedures: -1. Create ETL function -1. Add requirements.txt -1. Create the Dockerfile -1. Upload the image to ECR -1. 
Create Lambda function from the container
-
-## Create ETL function
-This is an example function that pulls data from a finance API called Alpha Vantage, and
-inserts the data into TimescaleDB. The connection is made using the values from
-environment variables:
+# Pull and ingest data from a third-party API
+This tutorial builds a data pipeline that pulls data from a third-party finance
+API and loads it into TimescaleDB.
+
+This tutorial requires multiple libraries. This can make your deployment package
+size larger than the 250 MB limit of Lambda. You can use a Docker
+container to extend the package size up to 10 GB, giving you much more
+flexibility in libraries and dependencies. For more about AWS Lambda container
+support, see the [AWS documentation](https://docs.aws.amazon.com/lambda/latest/dg/images-create.html).
+
+The libraries used in this tutorial:
+* [`pandas`][pandas]
+* [`requests`][requests]
+* [`psycopg2`][psycopg2]
+* [`pgcopy`][pgcopy]
+
+## Create an ETL function
+Extract, transform, and load (ETL) functions are used to pull data from one
+source and ingest it into another. In this tutorial, the ETL function
+pulls data from a finance API called Alpha Vantage, and inserts the data into
+TimescaleDB. The connection is made using the values from environment variables. 
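The configuration in the function below is read directly from `os.environ`. As an illustration only (this validation helper is not part of the tutorial's code), the same variables can be collected up front so that a missing variable produces one readable error instead of a bare `KeyError`:

```python
import os

# The variable names the ETL function expects to find in the environment.
REQUIRED_VARS = ("DB_USER", "DB_PASS", "DB_HOST", "DB_PORT", "DB_NAME", "APIKEY")

def load_config(environ=None):
    """Collect the settings the ETL function reads from the environment.

    Raises a single readable error naming every missing variable.
    Hypothetical helper, shown for illustration.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if name not in environ]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Called with no arguments inside the Lambda runtime, it reads the variables you configure on the function later in this tutorial.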
+ +This is the ETL function used in this tutorial: + ```python # function.py: import csv @@ -32,14 +29,14 @@ import pandas as pd import psycopg2 from pgcopy import CopyManager import os - + config = {'DB_USER': os.environ['DB_USER'], 'DB_PASS': os.environ['DB_PASS'], 'DB_HOST': os.environ['DB_HOST'], 'DB_PORT': os.environ['DB_PORT'], 'DB_NAME': os.environ['DB_NAME'], 'APIKEY': os.environ['APIKEY']} - + conn = psycopg2.connect(database=config['DB_NAME'], host=config['DB_HOST'], user=config['DB_USER'], @@ -47,24 +44,24 @@ conn = psycopg2.connect(database=config['DB_NAME'], port=config['DB_PORT']) columns = ('time', 'price_open', 'price_close', 'price_low', 'price_high', 'trading_volume', 'symbol') - + def get_symbols(): """Read symbols from a csv file. - + Returns: [list of strings]: symbols """ with open('symbols.csv') as f: reader = csv.reader(f) return [row[0] for row in reader] - + def fetch_stock_data(symbol, month): """Fetches historical intraday data for one ticker symbol (1-min interval) - + Args: symbol (string): ticker symbol month (int): month value as an integer 1-24 (for example month=4 will fetch data from the last 4 months) - + Returns: list of tuples: intraday (candlestick) stock data """ @@ -76,7 +73,7 @@ def fetch_stock_data(symbol, month): .format(symbol=symbol, slice=slice, interval=interval,apikey=apikey) df = pd.read_csv(CSV_URL) df['symbol'] = symbol - + df['time'] = pd.to_datetime(df['time'], format='%Y-%m-%d %H:%M:%S') df = df.rename(columns={'time': 'time', 'open': 'price_open', @@ -86,7 +83,7 @@ def fetch_stock_data(symbol, month): 'volume': 'trading_volume'} ) return [row for row in df.itertuples(index=False, name=None)] - + def handler(event, context): symbols = get_symbols() for symbol in symbols: @@ -97,15 +94,12 @@ def handler(event, context): mgr = CopyManager(conn, 'stocks_intraday', columns) mgr.copy(stock_data) conn.commit() - ``` -## Add requirements.txt -Add a text file to your project called `requirements.txt` that includes all 
the libraries that -you need installed. For example, if you need pandas, requests, psycopg2 and pgcopy -your `requirements.txt` looks like this: +## Add a requirements file +When you have created the ETL function, you need to include the libraries you want to install. You can do this by creating a text file in your project called `requirements.txt` that lists the libraries. This is the `requirements.txt` file used in this tutorial: -``` +```txt pandas requests psycopg2-binary @@ -113,60 +107,63 @@ pgcopy ``` -We use `psycopg2-binary` instead of `psycopg2` in the `requirements.txt` file. The binary version -of the library contains all its dependencies, which means that you don’t need to install them separately. +This example uses `psycopg2-binary` instead of `psycopg2` in the +`requirements.txt` file. The binary version of the library contains all its +dependencies, so that you don’t need to install them separately. ## Create the Dockerfile - -Dockerfile: -```dockerfile -# Use a AWS Lambda base image -FROM public.ecr.aws/lambda/python:3.8 - -# Copy all project files to the root folder -COPY function.py . -COPY requirements.txt . - -# Install libraries -RUN pip install -r requirements.txt - -CMD ["function.handler"] -``` +When you have the requirements set up, you can create the Dockerfile for the project. + +### Procedure: Creating the Dockerfile +1. Use an AWS Lambda base image: + ```docker + FROM public.ecr.aws/lambda/python:3.8 + ``` +1. Copy all project files to the root directory: + ```docker + COPY function.py . + COPY requirements.txt . + ``` +1. Install the libraries using the requirements file: + ```docker + RUN pip install -r requirements.txt + CMD ["function.handler"] + ``` ## Upload the image to ECR -To connect the container image to a Lambda function, you need to start by uploading -it to the AWS Elastic Container Registry (ECR). 
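The registry host in the ECR commands that follow is derived from your AWS account ID and region. A sketch of the format, with a made-up 12-digit account ID standing in for the value elided from the tutorial's commands:

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Image URI format used by `docker tag` and `docker push` for ECR.

    account_id is your 12-digit AWS account number -- the value that
    appears immediately before `.dkr.ecr` in the commands below.
    """
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"

# Hypothetical account ID, for illustration only:
print(ecr_image_uri("123456789012", "us-east-1", "lambda-image"))
# prints 123456789012.dkr.ecr.us-east-1.amazonaws.com/lambda-image:latest
```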
- -Login to Docker CLI: -```bash -aws ecr get-login-password --region us-east-1 \ -| docker login --username AWS \ ---password-stdin .dkr.ecr.us-east-1.amazonaws.com -``` - -Build image: -```bash -docker build -t lambda-image . -``` - -Create repository in ECR: -```bash -aws ecr create-repository --repository-name lambda-image -``` - -Tag your image to match the repository name, and deploy the image to Amazon ECR -using the `docker push` command: -```bash -docker tag lambda-image:latest .dkr.ecr.us-east-1.amazonaws.com/lambda-image:latest -docker push .dkr.ecr.us-east-1.amazonaws.com/lambda-image:latest - -``` - -## Create Lambda function from the container -You can use the same Lambda `create-function` command as you used earlier, but this -time you need to define the `--package-type` parameter as `image`, and add the ECR -Image URI using the `--code` flag: +To connect the container image to a Lambda function, you need to upload it to +the AWS Elastic Container Registry (ECR). + +### Procedure: Uploading the image to ECR +1. Log in to the Docker command line interface: + ```bash + aws ecr get-login-password --region us-east-1 \ + | docker login --username AWS \ + --password-stdin .dkr.ecr.us-east-1.amazonaws.com + ``` +1. Build the image: + ```bash + docker build -t lambda-image . + ``` +1. Create a repository in ECR. In this example, the repository is + called `lambda-image`: + ```bash + aws ecr create-repository --repository-name lambda-image + ``` +1. Tag your image using the same name as the repository: + ```bash + docker tag lambda-image:latest .dkr.ecr.us-east-1.amazonaws.com/lambda-image:latest + ``` +1. Deploy the image to Amazon ECR with Docker: + ```bash + docker push .dkr.ecr.us-east-1.amazonaws.com/lambda-image:latest + ``` + +## Create a Lambda function from the container +To create a Lambda function from your container, you can use the Lambda +`create-function` command. 
You need to define the `--package-type` parameter as +`image`, and add the ECR Image URI using the `--code` flag: ```bash aws lambda create-function --region us-east-1 \ @@ -174,53 +171,54 @@ aws lambda create-function --region us-east-1 \ --code ImageUri= --role arn:aws:iam::818196790983:role/Lambda ``` -## Schedule your Lambda function -If you want to run your Lambda function periodically, you can set up an EventBridge trigger. -Create a new rule with a cron-like expression. For example, if you want to run the function everyday at 9am, -you can use this expression: `cron(0 9 * * ? *)`. - -```bash -aws events put-rule --name schedule-lambda --schedule-expression 'cron(0 9 * * ? *)' -``` - -If you encounter the `Parameter ScheduleExpression is not valid` error message, have a look at the [cron expression examples in the EventBridge docs](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-create-rule-schedule.html#eb-cron-expressions). - -Grant the necessary permissions for the Lambda function: -```bash -aws lambda add-permission --function-name \ ---statement-id my-scheduled-event --action 'lambda:InvokeFunction' \ ---principal events.amazonaws.com -``` - -To add the function to the EventBridge rule, create a `targets.json` file containing a memorable, unique string, -and the ARN of the Lambda Function: -```json -[ - { - "Id": "docker_lambda_trigger", - "Arn": "" - } -] -``` - -When you have finished, use the `events put-target` command to add the target (the lambda function to be invoked) to the rule. -```bash -aws events put-targets --rule schedule-lambda --targets file://targets.json -``` - -To check if the rule is connected to the Lambda function, in the AWS console, navigate to Amazon EventBridge > Events > Rules, and -select your rule. The Lambda function’s name is listed under `Target(s)`. 
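The schedule expression `cron(0 9 * * ? *)` used above fires once a day at 09:00 UTC, because EventBridge evaluates `cron` expressions in UTC. As a sanity check (a sketch for a fixed daily rule only, not a general cron parser):

```python
from datetime import datetime, timedelta, timezone

def next_daily_run(now, hour=9, minute=0):
    """Next firing time of a fixed daily rule such as cron(0 9 * * ? *).

    EventBridge evaluates cron expressions in UTC, so `now` should be a
    timezone-aware UTC datetime. Illustrative helper, not a cron parser.
    """
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # already past today's slot
    return candidate

now = datetime(2021, 9, 10, 10, 30, tzinfo=timezone.utc)  # 10:30 UTC, past 09:00
print(next_daily_run(now))
# prints 2021-09-11 09:00:00+00:00
```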
-
-![aws eventbridge lambda](https://assets.timescale.com/docs/images/tutorials/aws-lambda-tutorial/targets.png)
+## Schedule the Lambda function
+If you want to run your Lambda function according to a schedule, you can set up
+an EventBridge trigger. This creates a rule using a [`cron` expression][cron-examples].
+
+### Procedure: Scheduling the Lambda function
+1. Create the schedule. In this example, the function runs every day at
+   9:00 AM UTC (EventBridge evaluates `cron` expressions in UTC):
+   ```bash
+   aws events put-rule --name schedule-lambda --schedule-expression 'cron(0 9 * * ? *)'
+   ```
+1. Grant the necessary permissions for the Lambda function:
+   ```bash
+   aws lambda add-permission --function-name \
+   --statement-id my-scheduled-event --action 'lambda:InvokeFunction' \
+   --principal events.amazonaws.com
+   ```
+1. Add the function to the EventBridge rule by creating a `targets.json` file
+   containing a memorable, unique string and the ARN of the Lambda function:
+   ```json
+   [
+       {
+           "Id": "docker_lambda_trigger",
+           "Arn": ""
+       }
+   ]
+   ```
+1. Add the Lambda function, referred to in this command as the `target`, to
+   the rule:
+   ```bash
+   aws events put-targets --rule schedule-lambda --targets file://targets.json
+   ```
+
+If you get an error saying `Parameter ScheduleExpression is not valid`, you
+might have made a mistake in the cron expression. Check the
+[cron expression examples][cron-examples] documentation.
+
+You can check if the rule is connected correctly to the Lambda function in the
+AWS console. Navigate to Amazon EventBridge → Events → Rules, and click the rule
+you created. The Lambda function's name is listed under `Target(s)`:
 
-## Conclusion
-AWS Lambda is a popular tool of choice for running your data pipelines. It’s scalable, serverless and works like magic. I hope this tutorial was useful to get you started using TimescaleDB with AWS Lambda. 
+Lambda function target in AWS Console
 
-## Resources
-* [AWS CLI Version 2 References](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/index.html)
-* [Creating Lambda container images](https://docs.aws.amazon.com/lambda/latest/dg/images-create.html)
-* [Getting started with AWS Lambda](https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html)
-* [Analyze historical intraday stock data](/tutorials/analyze-intraday-stocks)
-* [Analyze cryptocurrency market data](/tutorials/analyze-cryptocurrency-data)
+[pandas]: https://pandas.pydata.org/
+[requests]: https://docs.python-requests.org/en/master/
+[psycopg2]: https://github.com/jkehler/awslambda-psycopg2
+[pgcopy]: https://github.com/G-Node/pgcopy
+[cron-examples]: https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-create-rule-schedule.html#eb-cron-expressions
diff --git a/timescaledb/tutorials/aws-lambda/create-data-api.md b/timescaledb/tutorials/aws-lambda/create-data-api.md
index 2026a87d896f..53a6e65c50d4 100644
--- a/timescaledb/tutorials/aws-lambda/create-data-api.md
+++ b/timescaledb/tutorials/aws-lambda/create-data-api.md
@@ -1,229 +1,235 @@
 # Create a data API for TimescaleDB
-This tutorial covers how to create an API to fetch data from your TimescaleDB instance. We are using API Gateway to
-trigger a Lambda function which fetches the requested data from TimescaleDB and returns it in JSON format.
+This tutorial covers creating an API to fetch data from your TimescaleDB
+instance. It uses an API gateway to trigger a Lambda function, which then
+fetches the requested data from TimescaleDB and returns it in JSON format.
 
 ## Connect to TimescaleDB from Lambda
-To connect to the TimescaleDB instance, you need to use a database connector library. For this tutorial we
-chose [`psycopg2`](https://www.psycopg.org/docs/).
+To connect to the TimescaleDB instance, you need to use a database connector
+library. This tutorial uses [`psycopg2`][psycopg2]. 
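For orientation, this is the shape of the `psycopg2` connection the Lambda function makes later in this tutorial. It is shown only as a sketch -- the actual `connect()` call needs a live database and the compiled layer described next, so it is left commented out:

```python
import os

def connection_kwargs(environ=None):
    """Build psycopg2.connect() keyword arguments from the environment.

    Uses the same variable names this tutorial configures on the Lambda
    function later (DB_NAME, DB_USER, DB_HOST, DB_PORT, DB_PASS).
    """
    environ = os.environ if environ is None else environ
    return {
        "database": environ["DB_NAME"],
        "user": environ["DB_USER"],
        "host": environ["DB_HOST"],
        "port": environ["DB_PORT"],
        "password": environ["DB_PASS"],
    }

# Inside the handler you would then connect with:
#   import psycopg2
#   conn = psycopg2.connect(**connection_kwargs())
```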
+
+The `psycopg2` database connector is not part of the standard Python library,
+and is not included in AWS Lambda, so you need to manually include the library
+in your deployment package to make it available to use. This tutorial uses
+[Lambda Layers][lambda-layers] to include `psycopg2`. A Lambda Layer is an
+archive containing additional code, such as libraries or dependencies. Layers
+help you use external libraries in your function code that are not available
+otherwise.
+
+Additionally, `psycopg2` needs to be built and compiled with statically linked
+libraries, something that you can't do directly in a Lambda function or layer. A
+workaround to this issue is to download the
+[compiled version of the library][lambda-psycopg2] and use that as a Lambda Layer.
+
+### Procedure: Adding the psycopg2 library as a Lambda layer
+1. Download and unzip the compiled `psycopg2` library:
+   ```bash
+   wget https://github.com/jkehler/awslambda-psycopg2/archive/refs/heads/master.zip
+   unzip master.zip
+   ```
+1. In the directory you downloaded the library to, copy the `psycopg2` files
+   into a new directory called `python`. Make sure you copy the directory that
+   matches your Python version:
+   ```bash
+   cd awslambda-psycopg2-master/
+   mkdir python
+   cp -r psycopg2-3.8/ python/
+   ```
+1. Zip the `python` directory and upload the zipped file as a Lambda layer:
+   ```bash
+   zip -r psycopg2_layer.zip python/
+   aws lambda publish-layer-version --layer-name psycopg2 \
+   --description "psycopg2 for Python3.8" --zip-file fileb://psycopg2_layer.zip \
+   --compatible-runtimes python3.8
+   ```
+1. 
At the AWS Lambda console, check to see if your `psycopg2` has been uploaded
+   as a Lambda layer:
+
+   ![aws layers](https://assets.timescale.com/docs/images/tutorials/aws-lambda-tutorial/layers.png)
 
-Because `psycopg2` is not part of the standard Python library, and it’s not included in AWS Lambda either, you need to
-manually include this library in your deployment package so that it is available to use. We are going to use [Lambda Layers](https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html) for this purpose.
-
-
-**What’s a Lambda Layer?** A Lambda Layer is an archive containing additional code, such as libraries or dependencies.
-Layers help you use external libraries in your function code that would not be available otherwise.
-
-
-One issue is that `psycopg2` needs to be built and compiled with statically linked libraries, something that you can't
-do directly in a Lambda function or Layer. A workaround to this issue is to download the [compiled version of the library](https://github.com/jkehler/awslambda-psycopg2) and use that as a Lambda Layer.
-
-### Procedure: Adding psycopg2 library as a Lambda layer
-
-Download and unzip the compiled `psycopg2` library.
-```bash
-wget https://github.com/jkehler/awslambda-psycopg2/archive/refs/heads/master.zip
-unzip master.zip
-```
-
-Cd into the folder and copy the psycopg2 files into a new directory called python.
-Note: copy the folder which fits your Python version
-```bash
-cd awslambda-psycopg2-master/
-Mkdir python
-cp -r psycopg2-3.8/ python/
-```
-
-Zip the python folder and upload the zipped file as a lambda layer using the `lambda publish-layer-version` command.
-```bash
-zip -r psycopg2_layer.zip python/
-aws lambda publish-layer-version --layer-name psycopg2 \
---description "psycopg2 for Python3.8" --zip-file fileb://psycopg2_layer.zip \
---compatible-runtimes python3.8
-```
-
-Check the AWS Lambda console to see if your `psycopg2` has been uploaded as a Lambda Layer. 
- -![aws layers](https://assets.timescale.com/docs/images/tutorials/aws-lambda-tutorial/layers.png) +## Create a function to fetch and return data from the database +When the layer is available to your Lambda function, you can create an API to +return data from the database. This section shows you how to create the Python +function that returns data from the database and uploads it to AWS Lambda. + +### Procedure: Creating a function to fetch and return data from the database +1. Create a new directory called `timescaledb_api`, to store the function + code, and change into the new directory: + ```bash + mkdir timescaledb_api + cd timescaledb_api + ``` +1. In the new directory, create a new function called `function.py`, with this + content: + ```python + import json + import psycopg2 + import psycopg2.extras + import os + + def lambda_handler(event, context): + + db_name = os.environ['DB_NAME'] + db_user = os.environ['DB_USER'] + db_host = os.environ['DB_HOST'] + db_port = os.environ['DB_PORT'] + db_pass = os.environ['DB_PASS'] + + conn = psycopg2.connect(user=db_user, database=db_name, host=db_host, + password=db_pass, port=db_port) + sql = "SELECT * FROM stocks_intraday" -Now that the layer is available to your Lambda function, we can create our first API to return data from the database. + cursor = conn.cursor(cursor_factory=psycopg2.extras.DictCursor) -## Create a function to fetch and return data from the database -In this step, you create the Python function that returns data from the database and upload it to AWS Lambda. + cursor.execute(sql) -Create a new project folder called `timescaledb_api` where you will put the function code. 
-```bash -mkdir timescaledb_api -cd timescaledb_api -``` + result = cursor.fetchall() + list_of_dicts = [] + for row in result: + list_of_dicts.append(dict(row)) -Create *function.py* with this content: -```python -import json -import psycopg2 -import psycopg2.extras -import os - -def lambda_handler(event, context): - - db_name = os.environ['DB_NAME'] - db_user = os.environ['DB_USER'] - db_host = os.environ['DB_HOST'] - db_port = os.environ['DB_PORT'] - db_pass = os.environ['DB_PASS'] - - conn = psycopg2.connect(user=db_user, database=db_name, host=db_host, - password=db_pass, port=db_port) - - sql = "SELECT * FROM stocks_intraday" - - cursor = conn.cursor(cursor_factory=psycopg2.extras.DictCursor) - - cursor.execute(sql) - - result = cursor.fetchall() - list_of_dicts = [] - for row in result: - list_of_dicts.append(dict(row)) - - return { + return { 'statusCode': 200, 'body': json.dumps(list_of_dicts, default=str), 'headers': { "Content-Type": "application/json" - } - } -``` + } + } + ``` ## Upload the function in AWS Lambda -ZIP the Python file and upload it to Lambda using the *create-function* AWS command. - -```bash -zip function.zip function.py -aws lambda create-function --function-name simple_api_function \ ---runtime python3.8 --handler function.lambda_handler \ ---role arn:aws:iam::818196790983:role/Lambda --zip-file fileb://function.zip -``` +When you have created the function, you can zip the Python file and upload it to +Lambda using the `create-function` AWS command. + +## Procedure: Uploading the function to AWS Lanbda +1. At the command prompt, zip the function directory: + ```bash + zip function.zip function.py + ``` +1. Upload the function: + ```bash + aws lambda create-function --function-name simple_api_function \ + --runtime python3.8 --handler function.lambda_handler \ + --role arn:aws:iam::818196790983:role/Lambda --zip-file fileb://function.zip + ``` +1. 
You can check that the function has been uploaded correctly by looking at
+   the AWS console:
+
+   ![aws lambda uploaded](https://assets.timescale.com/docs/images/tutorials/aws-lambda-tutorial/lambda_function.png)
+1. If you make changes to your function code, you need to zip the file again
+   and use the `update-function-code` command to upload the changes:
+   ```bash
+   zip function.zip function.py
+   aws lambda update-function-code --function-name simple_api_function --zip-file fileb://function.zip
+   ```
 
-If you go to the AWS console you should see the uploaded Lambda function.
-![aws lambda uploaded](https://assets.timescale.com/docs/images/tutorials/aws-lambda-tutorial/lambda_function.png)
-
-
-Whenever you want to apply changes to your function code, you can just zip the file again and use the
-*update-function-code* command.
-
-```bash
-zip function.zip function.py
-aws lambda update-function-code --function-name simple_api_function --zip-file fileb://function.zip
-```
+
 
 ## Add database configuration to AWS Lambda
-Before we can test that the function works, we need to provide database connection information. You may have noticed in
-the Python code above, that we specified retrieving values from environment variables, something you need to specify
-within the Lambda environment
-
-**Define environment variables**
-
-To upload your connection details, you can use the *update-function-configuration* command with the --environment
-parameter. This command needs a JSON file as an input that contains the variables required for the script.
-
-Example json file, `env.json`:
-
-```json
-{
-    "Variables": {"DB_NAME": "db",
+Before you can use the function, you need to ensure it can connect to the
+database. In the Python code above, you specified retrieving values from
+environment variables, and you also need to specify these within the Lambda
+environment.
+
+### Procedure: Adding database configuration to AWS Lambda with environment variables
+1. 
Create a JSON file that contains the variables required for the function: + ```json + { + "Variables": {"DB_NAME": "db", "DB_USER": "user", "DB_HOST": "host", "DB_PORT": "5432", "DB_PASS": "pass"} -} -``` - -Update the configuration using this JSON file: -```bash -aws lambda update-function-configuration \ ---function-name simple_api_function --environment file://env.json -``` - -When uploaded to AWS Lambda, you can reach these variables using *os.environ* in the function: -```python -import os -config = {'DB_USER': os.environ['DB_USER'], + } + ``` +1. Upload your connection details. In this example, the JSON file that contains + the variables is saved at `file://env.json`: + ``` bash + aws lambda update-function-configuration \ + --function-name simple_api_function --environment file://env.json + ``` +1. When the configuration is uploaded to AWS Lambda, you can reach the + variables using the `os.environ` parameter in the function: + ```python + import os + config = {'DB_USER': os.environ['DB_USER'], 'DB_PASS': os.environ['DB_PASS'], 'DB_HOST': os.environ['DB_HOST'], 'DB_PORT': os.environ['DB_PORT'], 'DB_NAME': os.environ['DB_NAME']} -``` - -Now your function code is uploaded along with the database connection details. Let's see if it retrieves the data we expect! -Run the *lambda invoke* command with *--function-name* parameter and a name for the output file. 
- -```bash -aws lambda invoke --function-name simple_api_function output.json -``` - -Lambda function output: -```json -{ - "statusCode": 200, - "body": "[ - { - \"bucket_day\": \"2021-02-01 00:00:00\", - \"symbol\": \"AAPL\", - \"avg_price\": 135.32576933380264, - \"max_price\": 137.956910987, - \"min_price\": 131.131547781 - }, - { - \"bucket_day\": \"2021-01-18 00:00:00\", - \"symbol\": \"AAPL\", - \"avg_price\": 136.7006897398394, - \"max_price\": 144.628477898, - \"min_price\": 126.675666886 - }, - { - \"bucket_day\": \"2021-05-24 00:00:00\", - \"symbol\": \"AAPL\", - \"avg_price\": 125.4228325920157, - \"max_price\": 128.32, - \"min_price\": 123.21 - }, - ... - ]", - "headers": { - "Content-Type": "application/json" - } -} -``` - -## Create a new API Gateway -Now that the Lambda function works, let’s create the API Gateway. -In AWS terms, you are setting up a [custom Lambda integration](https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-custom-integrations.html). - -Create the API using the `apigateway create-rest-api` command: -```bash -aws apigateway create-rest-api --name 'TestApiTimescale' --region us-east-1 -{ - "id": "4v5u26yw85", - "name": "TestApiTimescale2", - "createdDate": "2021-08-23T13:21:13+02:00", - "apiKeySource": "HEADER", - "endpointConfiguration": { - "types": [ - "EDGE" - ] - }, - "disableExecuteApiEndpoint": false -} -``` -One important field to note in the response is the “id”. You will need to reference this id to make changes to the API Gateway. - -You also need to get the id of the root (/) resource to add a new GET endpoint. -Call the `get-resources` command to get the root resource id. - -```bash -aws apigateway get-resources --rest-api-id --region us-east-1 -{ - "items": [ + ``` + +## Test the database connection +When your function code is uploaded along with the database connection details, you can check to see if it retrieves the data you expect it to. + +### Procedure: Testing the database connection +1. 
Invoke the function. Make sure you include the name of the function, and + provide a name for an output file. In this example, the output file is + called `output.json`: + ```bash + aws lambda invoke --function-name simple_api_function output.json + ``` +1. If your function is working correctly, your output file looks like this: + ```json + { + "statusCode": 200, + "body": "[ + { + \"bucket_day\": \"2021-02-01 00:00:00\", + \"symbol\": \"AAPL\", + \"avg_price\": 135.32576933380264, + \"max_price\": 137.956910987, + \"min_price\": 131.131547781 + }, + { + \"bucket_day\": \"2021-01-18 00:00:00\", + \"symbol\": \"AAPL\", + \"avg_price\": 136.7006897398394, + \"max_price\": 144.628477898, + \"min_price\": 126.675666886 + }, + { + \"bucket_day\": \"2021-05-24 00:00:00\", + \"symbol\": \"AAPL\", + \"avg_price\": 125.4228325920157, + \"max_price\": 128.32, + \"min_price\": 123.21 + }, + ... + ]", + "headers": { + "Content-Type": "application/json" + } + } + ``` + +## Create a new API gateway +Now that you have confirmed that the Lambda function works, you can create the +API gateway. In AWS terms, you are setting up a +[custom Lambda integration][custom-lambda-integration]. + +### Procedure: Creating a new API gateway +1. Create the API. In this example, the new API is called `TestApiTimescale`. + Take note of the `id` field in the response, you need to use this to make + changes later on: + ```bash + aws apigateway create-rest-api --name 'TestApiTimescale' --region us-east-1 + { + "id": "4v5u26yw85", + "name": "TestApiTimescale2", + "createdDate": "2021-08-23T13:21:13+02:00", + "apiKeySource": "HEADER", + "endpointConfiguration": { + "types": [ + "EDGE" + ] + }, + "disableExecuteApiEndpoint": false + } + ``` +1. 
Retrieve the `id` of the root resource, to add a new GET endpoint: + ```bash + aws apigateway get-resources --rest-api-id --region us-east-1 + { + "items": [ { "id": "hs26aaaw56", "path": "/" @@ -237,64 +243,58 @@ aws apigateway get-resources --rest-api-id --region us-east-1 "GET": {} } } - ] -} - -``` - -Create a new resource with the desired name (in this example ticker). - -```bash -aws apigateway create-resource --rest-api-id \ ---region us-east-1 --parent-id --path-part ticker -{ - "id": "r9cakv", - "parentId": "hs26aaaw56", - "pathPart": "ticker", - "path": "/ticker" -} -``` - -Create a GET request for the root resource. -```bash -aws apigateway put-method --rest-api-id \ ---region us-east-1 --resource-id \ ---http-method GET --authorization-type "NONE" \ ---request-parameters method.request.querystring.symbol=false -``` - -Set up a *200 OK* response to the method request of GET /ticker?symbol={symbol}. -```bash -aws apigateway put-method-response --region us-east-1 \ ---rest-api-id --resource-id r9cakv \ ---http-method GET --status-code 200 -``` - -Connect the API Gateway to the Lambda function. -```bash -aws apigateway put-integration --region us-east-1 \ ---rest-api-id --resource-id \ ---http-method GET --type AWS --integration-http-method POST \ ---uri arn:aws:lambda:us-east-1:818196790983:function:simple_timescale/invocations \ ---request-templates file://path/to/integration-request-template.json -``` + ] + } + ``` +1. Create a new resource. In this example, the new resource is called `ticker`: + ```bash + aws apigateway create-resource --rest-api-id \ + --region us-east-1 --parent-id --path-part ticker + { + "id": "r9cakv", + "parentId": "hs26aaaw56", + "pathPart": "ticker", + "path": "/ticker" + } + ``` +1. 
Create a GET request for the root resource: + ```bash + aws apigateway put-method --rest-api-id \ + --region us-east-1 --resource-id \ + --http-method GET --authorization-type "NONE" \ + --request-parameters method.request.querystring.symbol=false + ``` +1. Set up a `200 OK` response to the method request + of `GET /ticker?symbol={symbol}`: + ```bash + aws apigateway put-method-response --region us-east-1 \ + --rest-api-id --resource-id r9cakv \ + --http-method GET --status-code 200 + ``` +1. Connect the API Gateway to the Lambda function: + ```bash + aws apigateway put-integration --region us-east-1 \ + --rest-api-id --resource-id \ + --http-method GET --type AWS --integration-http-method POST \ + --uri arn:aws:lambda:us-east-1:818196790983:function:simple_timescale/invocations \ + --request-templates file://path/to/integration-request-template.json + ``` +1. Pass the Lambda function output to the client as a `200 OK` response: + ```bash + aws apigateway put-integration-response --region us-east-1 \ + --rest-api-id / --resource-id \ + --http-method GET --status-code 200 --selection-pattern "" + ``` +1. Deploy the API: + ```bash + aws apigateway create-deployment --rest-api-id --stage-name test + ``` -Pass the Lambda function output to the client as *200 OK* response. -```bash -aws apigateway put-integration-response --region us-east-1 \ ---rest-api-id / --resource-id \ ---http-method GET --status-code 200 --selection-pattern "" -``` - -Finally, deploy. -```bash -aws apigateway create-deployment --rest-api-id --stage-name test -``` ## Test the API - -Let’s make a GET request to the API endpoint using `curl`. 
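The endpoint can also be called programmatically. A sketch using only the Python standard library -- the host is the example invoke URL from this tutorial, so substitute your own deployment's URL:

```python
from urllib.parse import urlencode

# Example invoke URL from this tutorial; substitute your own API's URL.
BASE_URL = "https://hlsu4rwrkl.execute-api.us-east-1.amazonaws.com/test/ticker"

def ticker_url(symbol):
    """Build the GET /ticker?symbol={symbol} request URL."""
    return BASE_URL + "?" + urlencode({"symbol": symbol})

print(ticker_url("MSFT"))
# prints https://hlsu4rwrkl.execute-api.us-east-1.amazonaws.com/test/ticker?symbol=MSFT

# With the API deployed, you could then fetch and decode the JSON response:
#   import json, urllib.request
#   rows = json.load(urllib.request.urlopen(ticker_url("MSFT")))
```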
+You can test that the API is working correctly by making a GET request to the
+endpoint with `curl`:
 ```bash
-curl 'https://hlsu4rwrkl.execute-api.us-east-1.amazonaws.com/test/ticker?symbol=MSFT’
+curl 'https://hlsu4rwrkl.execute-api.us-east-1.amazonaws.com/test/ticker?symbol=MSFT'
 [
   {
     "time": "2021-07-12 20:00:00",
@@ -306,8 +306,13 @@ curl 'https://hlsu4rwrkl.execute-api.us-east-1.amazonaws.com/test/ticker?symbol=
     "symbol": "MSFT"
   }
 ]
-
 ```
-If you did everything correctly, you should see the output of the Lambda function, which in this example, is the
-latest stock price of MSFT (Microsoft) in JSON format.
\ No newline at end of file
+If everything is working properly, you see the output of the Lambda function. In
+this example, it's the latest stock price of MSFT (Microsoft) in JSON format.
+
+
+[psycopg2]: https://www.psycopg.org/docs/
+[lambda-layers]: https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html
+[lambda-psycopg2]: https://github.com/jkehler/awslambda-psycopg2
+[custom-lambda-integration]: https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-custom-integrations.html
diff --git a/timescaledb/tutorials/aws-lambda/index.md b/timescaledb/tutorials/aws-lambda/index.md
index bf077b0f4cf3..80f896c15b27 100644
--- a/timescaledb/tutorials/aws-lambda/index.md
+++ b/timescaledb/tutorials/aws-lambda/index.md
@@ -1,27 +1,38 @@
-# TimescaleDB with AWS Lambda
+# TimescaleDB with AWS Lambda
+This section contains tutorials for working with AWS Lambda and TimescaleDB.
-In this section, you will find tutorials for working with AWS Lambda and TimescaleDB.
-
-* [Create a data API for TimescaleDB (AWS Lambda + API Gateway)](/tutorials/aws-lambda/create-data-api)
-* [Pull data from 3rd party API and ingest into TimescaleDB (AWS Lambda + Docker)](/tutorials/aws-lambda/3rd-party-api-ingest)
-
-In some cases you may need to use Lambda to run a Docker image when your dependencies grow beyond the supported size,
-something we cover in the second tutorial.
+* [Create a data API for TimescaleDB](/tutorials/aws-lambda/create-data-api)
+  using AWS Lambda and API Gateway.
+* [Pull data from a third-party API and ingest into TimescaleDB](/tutorials/aws-lambda/3rd-party-api-ingest)
+  using AWS Lambda and Docker. This is useful if your function has a lot of dependencies.

 ## Prerequisites
-To complete this tutorial, you will need an AWS account and AWS CLI.
+To complete this tutorial, you need an AWS account and access to the AWS
+command-line interface (CLI).
+
+To check if you have the AWS CLI installed, use this command at the command prompt.
+If it is installed, the command shows the version number, like this:
-To check if you have AWS CLI installed, run the `aws --version` command:
 ```bash
 aws --version
 aws-cli/2.2.18 Python/3.8.8 Linux/5.10.0-1044-oem exe/x86_64.ubuntu.20 prompt/off
 ```
-If you do not have it installed, please follow the [instructions provided by AWS here.](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html)
+If you do not have it installed, follow [these instructions](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) from AWS.
-
-It’s also required that you complete the [Analyze intraday stock data tutorial](https://docs.timescale.com/timescaledb/latest/tutorials/analyze-intraday-stocks/) first because it sets up the needed tables and data that is used in this tutorial.
+
+Make sure you have completed the [Analyze intraday stock data tutorial](https://docs.timescale.com/timescaledb/latest/tutorials/analyze-intraday-stocks/).
This tutorial needs the tables and data that you set up in that tutorial.
+
 ## Programming language
+The code examples in this tutorial use Python, but you can use any language
+[supported by AWS Lambda](https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html).
+
+## Resources
+For more information about the topics in this tutorial, check out these resources:
-The code examples throughout the tutorial use Python as the programming language but you can use any other language
-you prefer as long as it’s [supported by AWS Lambda](https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html).
+* [AWS CLI Version 2 References](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/index.html)
+* [Creating Lambda container images](https://docs.aws.amazon.com/lambda/latest/dg/images-create.html)
+* [Getting started with AWS Lambda](https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html)
+* [Analyze historical intraday stock data](/tutorials/analyze-intraday-stocks)
+* [Analyze cryptocurrency market data](/tutorials/analyze-cryptocurrency-data)
diff --git a/timescaledb/tutorials/page-index/page-index.js b/timescaledb/tutorials/page-index/page-index.js
index 869acd636dbf..d20f85578825 100644
--- a/timescaledb/tutorials/page-index/page-index.js
+++ b/timescaledb/tutorials/page-index/page-index.js
@@ -113,7 +113,7 @@ module.exports = [
     {
       title: 'Grafana',
       href: 'grafana',
-      exceprt: 'Getting Started with Grafana and TimescaleDB',
+      excerpt: 'Getting Started with Grafana and TimescaleDB',
       children: [
         {
           href: 'installation',
@@ -162,7 +162,8 @@ module.exports = [
          href: 'simulate-iot-sensor-data',
          excerpt: 'Simulate IoT Sensor Data with TimescaleDB',
        },
-       { title: 'TimescaleDB with AWS Lambda',
+       {
+         title: 'TimescaleDB with AWS Lambda',
          href: 'aws-lambda',
          excerpt: 'Tutorial for using TimescaleDB with AWS Lambda',
          children: [
@@ -172,9 +173,9 @@ module.exports = [
            excerpt: 'Create a data API for TimescaleDB with AWS Lambda and API Gateway',
          },
          {
-           title: 'Pull data from 3rd party API and ingest',
+           title: 'Pull and ingest data from a third-party API',
            href: '3rd-party-api-ingest',
-           excerpt: 'Pull and ingest 3rd party data into TimescaleDB with AWS Lambda',
+           excerpt: 'Pull and ingest data from a third-party API into TimescaleDB with AWS Lambda',
          }
        ],
      },
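The `/ticker` endpoint in the patch above returns a JSON array of rows. As a quick offline check of how a client consumes that response, here is a minimal Python sketch; the sample body reuses only the `time` and `symbol` fields that appear in the tutorial's example output, and any price columns in a real response depend on your hypertable schema:

```python
import json

# Minimal sketch: parse a response in the shape returned by
# GET /ticker?symbol=MSFT. Only "time" and "symbol" are copied from the
# tutorial's sample output; real rows also carry price columns.
sample_body = """
[
  {
    "time": "2021-07-12 20:00:00",
    "symbol": "MSFT"
  }
]
"""

rows = json.loads(sample_body)
latest = rows[0]
print(f"Latest {latest['symbol']} row: {latest['time']}")
# prints: Latest MSFT row: 2021-07-12 20:00:00
```

In a real client you would replace `sample_body` with the body returned by the `curl` request shown in the patch.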
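The API Gateway wiring in the create-data-api patch is a fixed sequence of `aws apigateway` calls, and the ordering matters: create the resource, then the method, then the responses and integration, then the deployment. A helper like the following sketch can assemble that sequence for review before anything is run; the function name and the `my-api-id` value are illustrative placeholders, while `ticker`, `r9cakv`, `hs26aaaw56`, the `symbol` querystring, the Lambda ARN, and the `test` stage mirror the patch:

```python
# Sketch: build the aws-cli invocations used above as argument lists, in the
# order the tutorial runs them. Nothing here talks to AWS; the REST API ID,
# resource IDs, and Lambda URI are supplied by the caller.
def gateway_commands(rest_api_id, parent_id, resource_id, lambda_uri,
                     region="us-east-1", stage="test"):
    common = ["--rest-api-id", rest_api_id, "--region", region]
    return [
        ["aws", "apigateway", "create-resource", *common,
         "--parent-id", parent_id, "--path-part", "ticker"],
        ["aws", "apigateway", "put-method", *common,
         "--resource-id", resource_id, "--http-method", "GET",
         "--authorization-type", "NONE",
         "--request-parameters", "method.request.querystring.symbol=false"],
        ["aws", "apigateway", "put-method-response", *common,
         "--resource-id", resource_id, "--http-method", "GET",
         "--status-code", "200"],
        ["aws", "apigateway", "put-integration", *common,
         "--resource-id", resource_id, "--http-method", "GET",
         "--type", "AWS", "--integration-http-method", "POST",
         "--uri", lambda_uri],
        ["aws", "apigateway", "put-integration-response", *common,
         "--resource-id", resource_id, "--http-method", "GET",
         "--status-code", "200", "--selection-pattern", ""],
        ["aws", "apigateway", "create-deployment",
         "--rest-api-id", rest_api_id, "--stage-name", stage],
    ]

cmds = gateway_commands(
    "my-api-id", "hs26aaaw56", "r9cakv",
    "arn:aws:lambda:us-east-1:818196790983:function:simple_timescale/invocations")
for cmd in cmds:
    print(" ".join(cmd))
```

This only prints the commands; running them still requires the real REST API ID and resource IDs from your own gateway.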