This repository contains different machine learning projects on various datasets, from exploratory data analysis and visualization to prediction and classification.

Breast Cancer Prediction

  • Dataset used : Click here to download
  • Two versions of this project are uploaded. The difference between them is the feature selection method.
  • Both versions contain an in-depth look at the dataset: exploratory data analysis, visualization, and data preparation.
  • Both also try several data preparation methods, such as handling outliers and data imbalance.
  • Version 1 : This notebook uses a simple feature selection method based on each feature's correlation with the target attribute.
  • Version 2 : This notebook uses logistic regression for feature selection, based on each column's accuracy.
  • Achieved ~100% accuracy.
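
The Version 2 idea (score each column by the accuracy of a logistic regression trained on that column alone, then keep the high scorers) can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the notebook's code: the column names, the toy values, the alternating train/test split, and the 0.9 cut-off are all assumptions made for the example.

```python
from math import exp

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + exp(-z))

def fit_single_feature(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1) = sigmoid(w * z + b) on one standardized column."""
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5 or 1.0
    zs = [(x - mean) / std for x in xs]
    w = b = 0.0
    for _ in range(epochs):  # plain batch gradient descent on the log loss
        errors = [sigmoid(w * z + b) - y for z, y in zip(zs, ys)]
        w -= lr * sum(e * z for e, z in zip(errors, zs)) / len(zs)
        b -= lr * sum(errors) / len(zs)
    return w, b, mean, std

def accuracy(xs, ys, model):
    w, b, mean, std = model
    preds = [1 if sigmoid(w * (x - mean) / std + b) >= 0.5 else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

# Toy data, made up for illustration (not taken from the breast cancer dataset).
features = {
    "radius": [1.0, 1.2, 1.1, 1.3, 3.0, 3.2, 3.1, 3.3],
    "noise":  [5.0, 1.0, 4.0, 2.0, 5.0, 1.0, 4.0, 2.0],
}
labels = [0, 0, 0, 0, 1, 1, 1, 1]

train = range(0, len(labels), 2)  # even rows for fitting (assumed split)
test = range(1, len(labels), 2)   # odd rows for scoring

scores = {}
for name, column in features.items():
    model = fit_single_feature([column[i] for i in train],
                               [labels[i] for i in train])
    scores[name] = accuracy([column[i] for i in test],
                            [labels[i] for i in test], model)

# Keep only the columns whose single-feature accuracy clears the cut-off.
selected = [name for name, acc in scores.items() if acc >= 0.9]
print(scores, selected)
```

Scoring on rows that were not used for fitting keeps a column from looking useful only because the model memorized it. On a real dataset, a library implementation and cross-validation would likely be a better fit than this hand-rolled loop.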

Red Wine Quality Prediction

  • Dataset used : Click here to download
  • The notebook contains an in-depth look at the dataset: exploratory data analysis, visualization, and data preparation.
  • It uses a simple feature selection method based on each feature's correlation with the target attribute.
  • Tried different classification algorithms and got ~98% accuracy.
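
Correlation-based feature selection, as mentioned above, could look something like the sketch below: compute the Pearson correlation of every column with the target and keep the columns whose absolute correlation clears a threshold. The column names, the toy values, and the 0.5 threshold are assumptions for illustration only; the notebook may use a different threshold or a library routine.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)

def select_by_correlation(columns, target, threshold=0.5):
    """Keep features whose |correlation| with the target meets the threshold."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return {name: r for name, r in scores.items() if abs(r) >= threshold}

# Toy data, made up for illustration (not taken from the wine dataset).
columns = {
    "alcohol": [9.0, 9.5, 10.0, 11.0, 12.0, 12.5],
    "noise":   [3.0, 1.0, 2.0, 3.0, 1.0, 2.0],
    "acidity": [0.9, 0.8, 0.7, 0.5, 0.4, 0.3],
}
quality = [0, 0, 0, 1, 1, 1]  # 1 = "good" wine in this toy example

selected = select_by_correlation(columns, quality, threshold=0.5)
print(selected)
```

Taking the absolute value matters here: a strongly negative correlation (the toy `acidity` column) is as informative as a strongly positive one.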

Loan Answer Prediction

  • Dataset used :
  • The notebook contains data preprocessing by scaling, transformation into one-hot vectors, data preparation for model building, and model evaluation.
  • It is an end-to-end implementation of the project and uses different classification algorithms.
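
As a rough illustration of the two preprocessing steps named above, the sketch below scales a numeric column and turns a categorical column into one-hot vectors. The README does not say which scaling method the notebook uses, so min-max scaling is an assumption here, and the column names and values are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Return the sorted categories and one 0/1 vector per value."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]

# Hypothetical loan-application columns, made up for illustration.
income = [2000, 4000, 6000]
education = ["Graduate", "Not Graduate", "Graduate"]

scaled_income = min_max_scale(income)
categories, education_vectors = one_hot(education)

# Each model-ready row is the scaled number followed by the one-hot vector.
rows = [[s] + vec for s, vec in zip(scaled_income, education_vectors)]
print(categories)
print(rows)
```

In practice the scaling bounds and the category list should be learned from the training split only and then reused on the test split, so that no information leaks from the evaluation data.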

Customer Segmentation - K-means

  • Dataset used : Click here to download
  • This notebook contains an in-depth explanation of the K-means clustering algorithm, with a visualization of how it works on a randomly generated dataset.
  • It also uses K-means to segment customers into 3 different clusters.
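
The K-means loop itself (assign every point to its nearest centroid, move each centroid to the mean of its points, repeat until nothing changes) can be sketched in a few lines. This is an illustrative, dependency-free version under stated assumptions: the customer features (age, annual income) and values are made up, and the farthest-first initialization is a simple deterministic choice for the example rather than what the notebook necessarily does.

```python
from math import dist

def farthest_first(points, k):
    """Deterministic init: start at the first point, then repeatedly add
    the point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iter=100):
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster put
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical customers as (age, annual income in thousands).
customers = [(20, 15), (22, 18), (21, 16),
             (40, 60), (42, 62), (41, 58),
             (60, 120), (62, 118), (61, 122)]

labels, centroids = kmeans(customers, k=3)
print(labels)
print(centroids)
```

K-means is sensitive to feature scale and to initialization. Real customer data would normally be scaled first, and library implementations typically run several initializations and keep the best result.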

Patient Drugs Prediction

  • Dataset used : Click here to download
  • The notebook contains an end-to-end implementation of a Decision Tree Classifier and also prints the resulting tree.
  • Tree : Click here to view
  • It covers data preprocessing, model building, and model evaluation.
  • Got 98%+ accuracy.
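
To show what building and printing a decision tree involves, here is a small from-scratch sketch that splits on numeric features by Gini impurity and prints the result as indented rules. It is not the notebook's code, which presumably relies on a library classifier. The feature names, drug labels, toy rows, and the `max_depth` default are all assumptions for illustration.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini
    impurity the most, or None if no split improves on the parent."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two neighbours
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build(rows, labels, depth=0, max_depth=3):
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left],
                  depth + 1, max_depth),
            build([r for r, _ in right], [y for _, y in right],
                  depth + 1, max_depth))

def predict(tree, row):
    while isinstance(tree, tuple):  # internal nodes are tuples, leaves are labels
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def show(tree, names, indent=""):
    if not isinstance(tree, tuple):
        print(f"{indent}-> {tree}")
        return
    f, t, left, right = tree
    print(f"{indent}{names[f]} <= {t}:")
    show(left, names, indent + "  ")
    print(f"{indent}{names[f]} > {t}:")
    show(right, names, indent + "  ")

# Hypothetical patients as (age, na_to_k); values and labels are made up.
names = ["age", "na_to_k"]
rows = [(25, 20.0), (40, 18.0), (60, 16.0),
        (30, 9.0), (35, 8.0), (55, 10.0), (65, 7.0)]
drugs = ["drugY", "drugY", "drugY", "drugX", "drugX", "drugB", "drugB"]

tree = build(rows, drugs)
show(tree, names)
train_accuracy = sum(predict(tree, r) == y for r, y in zip(rows, drugs)) / len(rows)
print(train_accuracy)
```

Accuracy measured on the training rows, as in this toy run, is optimistic. A held-out test split gives a more honest estimate of how the tree generalizes.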

Nutrition Facts for McDonald's Menu Analysis