-
Notifications
You must be signed in to change notification settings - Fork 0
Table API workflow
To have a fully functioning pipeline, there are a few things you need to do to prepare the data.
- Onboard source into table
- Create queries to pull the data from this source
- Write an API to use these queries to pull data
- Clean up the data you added
- Insert a row into the RDS Source table with the source metadata
- Populate the RDS Event table with dummy data about the source
- Make sure you put the queries you used in the
queries/
folder! The file name should be in the format <#>_ (ex: 04_onboard_login_traffic)
Based on the dummy data inserted into the Event table, formulate an SQL query to pull the data. The data should be aggregated. For example, for login traffic, you may want to have something like this:
Date | # people |
---|---|
6/14/2021 | 200 |
To get this, you need something like the query
SELECT date, count(distinct userID) FROM Event JOIN Source on Event.SourceId = Source.id WHERE date = '2021-06-14';
Test your query on MySQL Workbench and make sure it outputs the correct result
You will need to make your API endpoint in ExpressJS. If you are modifying the table, stub the returned result. Otherwise, query your dummy data from the table using the query you created from the previous step. Parse the data and return a JSON
NOTE: For the purpose of this internship, you do not need to clean up the data you added. I, Jerry will be wiping the Event table when the pipeline is ready to go into production. Remove all dummy data you added into the RDS Event table. Make sure you keep the RDS Source table though!