Links to mini projects on data science topics like data wrangling, EDA, ML and data storytelling (Ongoing)
- Data munging on World Bank projects data (working with JSON) https://github.com/chocolocked/DataWrangling_JsonMiniProject
- User engagement study using SQL https://modeanalytics.com/chocolocked/reports/04cfdba20eac
- EDA examining normal range of human body temperature https://github.com/chocolocked/EDA_HumanBodyTemp
- EDA examining the impact of race on resume callback rates https://github.com/chocolocked/EDA_RacialDiscri
- EDA examining hospital readmission data and making recommendations for reduction https://github.com/chocolocked/EDA_HospitalReadm-
- Linear Regression on Boston Housing Dataset https://github.com/chocolocked/LinearReg_Boston
- Logistic Regression on Weight vs. Height with Hyperparameter Tuning https://github.com/chocolocked/ML_LogisticReg
- Naive Bayes for predicting fresh Rotten Tomatoes movie reviews, with grid search for hyperparameter tuning https://github.com/chocolocked/ML_NaiveBayes
- Wine Customer Segmentation using Clustering Methods: K-means, Silhouette Method; and Visualizations with PCA https://github.com/chocolocked/ML_Clustering
- RDD operations, text analysis on Julius Caesar, and Linear Regression & Random Forest on the Boston Housing Dataset with the PySpark machine learning library https://github.com/chocolocked/ML_PySpark
- User adoption analysis, feature engineering, and predictive modeling https://github.com/chocolocked/DataChallenge_relax
- Three-part Ultimate data challenge https://github.com/chocolocked/DataChallenge_ultimate
  - Part 1. Time series analysis of user engagement trends
  - Part 2. Experimental design and hypothesis testing
  - Part 3. EDA, feature engineering, pipeline model building, and recommendations for improving rider retention using LIME (Local Interpretable Model-agnostic Explanations)