Skip to content

Table API workflow

Jerry Lee edited this page Jun 17, 2021 · 1 revision

Table API

To have a fully functioning pipeline, there are a few things you need to do to prepare the data.

  1. Onboard source into table
  2. Create queries to pull the data from this source
  3. Write an API to use these queries to pull data
  4. Clean up the data you added

Onboarding a source into the table

  1. Insert a row into the RDS Source table with the source metadata
  2. Populate the RDS Event table with dummy data about the source
  3. Make sure you put the queries you used in the queries/ folder! The file name should be in the format <#>_ (ex: 04_onboard_login_traffic)

Create queries to pull data

Based on the dummy data inserted into the Event table, formulate an SQL query to pull the data. The data should be aggregated. For example, for login traffic, you may want to have something like this:

Date # people
6/14/2021 200

To get this, you need something like the query

SELECT date, count(distinct userID) FROM Event JOIN Source on Event.SourceId = Source.id WHERE date = '2021-06-14';

Test your query on MySQL Workbench and make sure it outputs the correct result

Write an API to use the pulled data

You will need to make your API endpoint in ExpressJS. If you are modifying the table, stub the returned result. Otherwise, query your dummy data from the table using the query you created from the previous step. Parse the data and return a JSON

Clean up the data you added

NOTE: For the purpose of this internship, you do not need to clean up the data you added. I, Jerry will be wiping the Event table when the pipeline is ready to go into production. Remove all dummy data you added into the RDS Event table. Make sure you keep the RDS Source table though!