This Mini-Challenge was completed as part of the "Data Wrangling" course.
We were tasked with building a data pipeline which creates two clean data frames with the following daily data:
- per country (Global)
  - new cases
  - total cases
  - new deaths
  - total deaths
- per canton from the start of the 2nd wave (Switzerland)
  - new cases since 1 June
  - total cases since 1 June
  - new deaths since 1 June
  - total deaths since 1 June
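
The "new" and "since 1 June" columns listed above can be derived from cumulative totals. The following is only a minimal sketch with pandas, not the pipeline's actual code: the column names (`canton`, `date`, `total_cases`, `new_cases`), the helper functions, and the numbers in `demo` are all hypothetical, and the sketch assumes one row per region and day with no gaps in the cumulative series.

```python
import pandas as pd


def add_daily_new(df, group_col, total_col, new_col):
    """Derive daily new counts from a cumulative total, per group."""
    out = df.sort_values([group_col, "date"]).copy()
    # Difference to the previous day within each group.
    out[new_col] = out.groupby(group_col)[total_col].diff()
    # The first day of a group has no previous day: use the total itself.
    out[new_col] = out[new_col].fillna(out[total_col]).astype(int)
    return out.reset_index(drop=True)


def restrict_since(df, start, group_col, new_col, total_col):
    """Keep rows from `start` onwards and restart the running total there."""
    out = df[df["date"] >= pd.Timestamp(start)].copy()
    out[total_col] = out.groupby(group_col)[new_col].cumsum()
    return out.reset_index(drop=True)


# Made-up illustration data (hypothetical numbers, not real case counts).
demo = pd.DataFrame({
    "canton": ["ZH", "ZH", "ZH", "ZH", "BE", "BE", "BE"],
    "date": pd.to_datetime([
        "2020-05-30", "2020-05-31", "2020-06-01", "2020-06-02",
        "2020-05-31", "2020-06-01", "2020-06-02",
    ]),
    "total_cases": [100, 103, 105, 110, 50, 50, 54],
})

daily = add_daily_new(demo, "canton", "total_cases", "new_cases")
second_wave = restrict_since(daily, "2020-06-01", "canton", "new_cases", "total_cases")
print(second_wave)
```

The same two helpers would apply to deaths by passing different column names. Real data may contain missing days or corrections to earlier totals, which this sketch does not handle.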
We were only allowed to use the raw data provided on GitHub by the following two institutions:
- Johns Hopkins University (Global) - https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports
- Canton of Zurich (Switzerland) - https://github.com/openZH/covid_19/blob/master/COVID19_Fallzahlen_CH_total_v2.csv
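
Because the global source is a folder of daily report files rather than a single file, a pipeline has to enumerate one file per day. The sketch below shows one way to build those download links. It rests on two assumptions not stated above: that the daily reports are named `MM-DD-YYYY.csv`, and that raw file contents are served from `raw.githubusercontent.com`. The names `JHU_BASE`, `OPENZH_URL`, and `daily_report_urls` are hypothetical.

```python
from datetime import date, timedelta

# Assumed raw-content base URL for the daily reports folder linked above.
JHU_BASE = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports"
)

# Assumed raw-content URL for the single Swiss file linked above.
OPENZH_URL = (
    "https://raw.githubusercontent.com/openZH/covid_19/master/"
    "COVID19_Fallzahlen_CH_total_v2.csv"
)


def daily_report_urls(start, end):
    """Return one URL per day from `start` to `end` (both inclusive)."""
    n_days = (end - start).days
    urls = []
    for offset in range(n_days + 1):
        day = start + timedelta(days=offset)
        # Assumed file naming scheme: MM-DD-YYYY.csv
        urls.append(f"{JHU_BASE}/{day.strftime('%m-%d-%Y')}.csv")
    return urls


urls = daily_report_urls(date(2020, 1, 22), date(2020, 1, 24))
for url in urls:
    print(url)
```

Each URL could then be passed to a CSV reader such as `pandas.read_csv`, with the report date taken from the file name.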
You can find the current revision of the data pipeline here.