COVID-19 County Mask Mandate Analysis

See Election2_dataset.ipynb, final.csv & covidRegressionandRF.ipynb

Team: Red Zone

Members: Jimmy Greer, Ben Altshuler, TeKisha Sampson & Jason Goddard

Comm Protocol:

Trello for PM, Slack, Group-Text, email
Use one of the above as heads-up on working status, new branches, major commits, new files, pull requests, and merge conflicts

Objective: Did counties with mask mandates see fewer COVID-19 cases than those without? Are there other, more relevant features that relate to the death rate? The analysis is based on cases as a percentage of each county's population.

Intro Deck: Red Zone's COVID-19 Mask Mandate Intro

Datasets

CSV	Keys	Summary
POPULATION_TEST.csv	FIPS	2020 population by county in the United States
us-counties.csv	FIPS	cumulative 2020 cases and deaths by county, as of the last day of the year
county_mask_mandate_data.csv	county_fips, state_fips	whether a mask mandate was implemented in each county in the United States
elec_results_2020.csv	state_fips	red/blue designation of each state by 2020 presidential election result

Requirements

pandas, Matplotlib, scikit-learn, NumPy & PostgreSQL 13.x

EDA

Most of the source data is in CSV format. Pandas reads each source into a separate dataframe, which is then cleaned and merged. The county population merge takes place in SQL on a Postgres instance started by the user. Finally, a complete CSV is passed to the ML segment.

County Mask Mandate
-dropping multiple columns
-county_start_date to 1 or 0 in new column
-add column for duration
US Counties
-dropping multiple columns
-groupby counties to get sum of cases and deaths
Population Test
-concatenate state and county codes into fips
Election Results
-ETL strip
-merge by category into County Mask Mandate data on state_fips
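
The cleaning steps above can be sketched in pandas roughly as follows. This is an illustration on made-up rows, not the notebook's actual code: only county_fips, county_start_date and fips come from the project, while mask_mandate, mandate_days, STATE, COUNTY and the 2020-12-31 cutoff for the duration are assumptions.

```python
import pandas as pd

# Toy stand-in for county_mask_mandate_data.csv (values are made up).
mandates = pd.DataFrame({
    "county_fips": ["01001", "01003", "06037"],
    "county_start_date": ["2020-07-16", None, "2020-06-18"],
})

# county_start_date -> 1 or 0 in a new column (hypothetical name).
start = pd.to_datetime(mandates["county_start_date"])
mandates["mask_mandate"] = start.notna().astype(int)

# Duration in days from the start date to an assumed end-of-2020 cutoff;
# counties without a mandate get 0.
cutoff = pd.Timestamp("2020-12-31")
mandates["mandate_days"] = (cutoff - start).dt.days.fillna(0).astype(int)

# Toy stand-in for us-counties.csv: group by county and sum cases and deaths.
cases = pd.DataFrame({
    "fips": ["01001", "01001", "06037"],
    "cases": [300, 200, 20000],
    "deaths": [6, 4, 400],
})
totals = cases.groupby("fips", as_index=False)[["cases", "deaths"]].sum()

# Toy stand-in for the population file: concatenate zero-padded state
# (2 digits) and county (3 digits) codes into a 5-character fips key.
# STATE and COUNTY are assumed column names.
population = pd.DataFrame({
    "STATE": [1, 6],
    "COUNTY": [1, 37],
    "population": [10000, 200000],
})
population["fips"] = (
    population["STATE"].astype(str).str.zfill(2)
    + population["COUNTY"].astype(str).str.zfill(3)
)
```

Keeping fips as a zero-padded string avoids losing the leading zero for states such as Alabama (01) when the keys are later joined.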

SQL

The population table is merged in on the fips keys so that features can be measured fairly against cases as a percentage of each county's population.

Note: although much of the merging was done in Python, the ERD below shows the simple mapping that we worked from throughout.
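
As a rough sketch of the merge on fips keys, the join and the percentage calculation might look like the following. The project runs on PostgreSQL; this example swaps in Python's built-in sqlite3 with an in-memory database so it runs without a server. The table names, column names and values are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres instance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE county_cases (fips TEXT PRIMARY KEY, cases INTEGER, deaths INTEGER);
    CREATE TABLE county_population (fips TEXT PRIMARY KEY, population INTEGER);
    -- Made-up rows for illustration only.
    INSERT INTO county_cases VALUES ('01001', 500, 10), ('06037', 20000, 400);
    INSERT INTO county_population VALUES ('01001', 10000), ('06037', 200000);
""")

# Join cases to population on fips and express cases and deaths
# as a percentage of county population.
rows = conn.execute("""
    SELECT c.fips,
           100.0 * c.cases / p.population AS case_pct,
           100.0 * c.deaths / p.population AS death_pct
    FROM county_cases AS c
    JOIN county_population AS p ON p.fips = c.fips
    ORDER BY c.fips
""").fetchall()
conn.close()

for fips, case_pct, death_pct in rows:
    print(fips, round(case_pct, 2), round(death_pct, 2))
```

Multiplying by 100.0 rather than 100 forces floating-point division, which matters because both count columns are integers.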

Machine Learning

See covidRegressionandRF.ipynb
Using a classification model, Logistic Regression, we'd like to see whether we can predict the level of infection in a county from its mask mandate status. To pinpoint correlation, we add population size and 2020 presidential election results as features. Because the dataset is a manageable size, we believe we can employ Logistic Regression from the start.

Logistic Regression Classification Reports:

With high f1-scores across the bins and an accuracy of 92.9%, this would be a good predictive model, provided we have the mask mandate status, Blue/Red State status and population of a given county.

Additionally, we ran a Random Forest model. With an f1-score of 0.60, it performed worse, so we lean towards the Logistic Regression above.
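
For readers who want to see the shape of the Logistic Regression setup described in this section, here is a minimal sketch with scikit-learn. It does not reproduce the notebook or its results: the data is synthetic, the two-bin target and its 6.5% threshold are assumptions, and log-scaling the population is a choice made for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 600

# Synthetic features standing in for the three named in this section.
mask_mandate = rng.integers(0, 2, size=n)            # 1 = mandate, 0 = none
blue_state = rng.integers(0, 2, size=n)              # 1 = blue, 0 = red
population = rng.lognormal(mean=10, sigma=1, size=n)

# Synthetic case percentage, built to be lower where a mandate exists.
case_pct = 8.0 - 3.0 * mask_mandate + rng.normal(0, 1.0, size=n)

# Bin the case percentage into two classes (assumed threshold).
y = (case_pct >= 6.5).astype(int)

X = np.column_stack([mask_mandate, blue_state, np.log(population)])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features, then fit the classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Per-class precision, recall and f1-score on the held-out counties.
print(classification_report(y_test, model.predict(X_test)))
```

The same train/test split could be reused with RandomForestClassifier to compare the two models on equal footing.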

Additional Analysis

To narrow down the likely drivers, we took a simple approach based on linear association. The results closely matched what we saw in the data visualization and pointed to the mask mandate as the driving feature in the dataset.
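
The README does not say which measure of linear association was used. One common option is the Pearson correlation coefficient, sketched here with NumPy on made-up values; the variable names are hypothetical.

```python
import numpy as np

# Made-up values: mandate flag and case percentage for six counties.
mask_mandate = np.array([1, 1, 1, 0, 0, 0])
case_pct = np.array([4.0, 5.0, 6.0, 8.0, 9.0, 10.0])

# Pearson correlation: the off-diagonal entry of the 2x2 correlation matrix.
r = np.corrcoef(mask_mandate, case_pct)[0, 1]
print(f"Pearson r = {r:.3f}")
```

A negative r here means counties with a mandate tend to have a lower case percentage in this toy data. A correlation like this describes association only and does not by itself establish causation.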

Dashboard

2020 COVID-19 Analysis. Cases, Deaths & Mandates via Tableau

Consider:
-Filtering Mask Mandates map down to 0-5% as well as 10-29%
-Highlighting Case % and Death % maps by mask mandate

These visualizations tell a story that aligns with the data analysis.

Exported CSV

final.csv
-via Election2_dataset.ipynb

Jimmygjr10/Covid19_Mask_Mandate