Overview of Projects

Hi, I am Tugra. I am an engineer who is curious about machine learning and artificial intelligence, and I stepped into this field through a master's program in Istanbul 😊. My goal is to learn something new every day and build more projects 💪, so here is a small portfolio for you. Below you can find a description of every project I have pushed.

Project 1: Bengaluru House Project : Predict House Prices

In this project, I try out LazyPredict, a low-code Python module, to see how it works. LazyPredict lets you see which models fit the data best with a few lines of code and without any parameter tuning. This gives you some insight into which model or models suit your data before you invest in hyperparameter tuning. See the LazyPredict documentation for details.

Project steps:

Get the results of the regression algorithms in LazyPredict. Here are the top 3 models that fitted the data:
1. MLPRegressor in LazyPredict = Adj. R2: 0.87 RMSE: 23.38
2. BayesianRidge in LazyPredict = Adj. R2: 0.84 RMSE: 26.65
3. LinearRegression in LazyPredict = Adj. R2: 0.84 RMSE: 26.67
Then I trained an MLP (Multi-Layer Perceptron) model with the Sklearn module.
GridSearchCV was used for hyperparameter tuning.
Finally, I evaluated the tuned model on the test set. RMSE: 22.45 R2 of tuned model: 0.902
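The scores above are adjusted R2 and RMSE. As a reminder of what these two numbers measure, here is a minimal sketch in plain Python. The function names and the sample values are made up for illustration and are not taken from the project.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: the typical size of a prediction error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2(y_true, y_pred):
    """Share of the target variance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r2(r2_value, n_samples, n_features):
    """R2 penalised for the number of features used by the model."""
    return 1 - (1 - r2_value) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical prices and predictions, for illustration only.
y_true = [50.0, 60.0, 70.0, 80.0, 90.0]
y_pred = [52.0, 58.0, 71.0, 79.0, 93.0]

score = r2(y_true, y_pred)
print(f"RMSE: {rmse(y_true, y_pred):.2f}")
print(f"R2: {score:.3f}")
print(f"Adj. R2 (2 features): {adjusted_r2(score, len(y_true), 2):.3f}")
```

Adjusted R2 is always at most R2, and the gap grows as more features are used for the same number of samples, which is why it is the fairer number for comparing many models at once.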

Project 2: (End-to-End) Celebrity Face Recognition : Image Classification with SVM

In this machine learning project, I classify images of celebrities, restricting the classification to only 5 people. The project covers every step from data collection (image scraping) to deployment on AWS. Random Forest, Logistic Regression and Support Vector Machine algorithms were used for this study, and GridSearchCV was used for model selection and parameter tuning.
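One way to do model selection with GridSearchCV across these three algorithms is sketched below. It is only an illustration: the data is a synthetic stand-in generated with make_classification (not the celebrity images), and the parameter grids are small assumed values, not the ones used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for image feature vectors (5 classes, like 5 people).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10,
    n_classes=5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models with small, illustrative parameter grids.
candidates = {
    "svm": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [1, 10], "svc__kernel": ["linear", "rbf"]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100]},
    ),
    "logistic_regression": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [1, 10]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=5)
    search.fit(X_train, y_train)
    results[name] = search
    print(name, round(search.best_score_, 3), search.best_params_)

# Select the model with the best cross-validated accuracy, then check it
# once on the held-out test set.
best_name = max(results, key=lambda n: results[n].best_score_)
test_accuracy = results[best_name].score(X_test, y_test)
print("selected:", best_name, "test accuracy:", round(test_accuracy, 3))
```

Keeping the test set out of the search means the final accuracy is measured on data that played no part in choosing the model or its parameters.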

Chosen people:

Cristiano Ronaldo
Cheki Chen
Brad Pitt
Johnny Depp
Lionel Messi

Here is the folder structure:

UI: Contains the UI website code
server: Python Flask server
model: Contains the Python notebook for model building
google_image_scrapping: Code to scrape Google for images
datasets: Dataset used for model training, which includes the celebrity images

Technologies used in this project:

Python ↙️
Numpy and OpenCV for data cleaning ↙️
Matplotlib & Seaborn for data visualization ↙️
Sklearn for model building ↙️
Jupyter Notebook and Visual Studio Code as IDE ↙️
Python Flask for the HTTP server ↙️
HTML/CSS/JavaScript for UI ↙️

A Screenshot after model deployment

Project 3: Data Analysis Project On TABLEAU : Sales Insight

The case study is based on a computer hardware business that is facing challenges in a dynamically changing market. The sales director decides to invest in a data analysis project and would like to build a Tableau dashboard that gives him real-time sales insights.

Simply put, the insights could be:

Revenue breakdown by city
Revenue breakdown by year and month
Top 5 customers by revenue and sales quantity
Top 5 products by revenue, etc.

Project Steps:

Import the "db_dumb.sql" file into a MySQL database and get familiar with the data.
Connect MySQL to Tableau.
In Tableau, do the data cleaning, ETL (extract, transform, load), currency normalization, handling of invalid values, etc.
And... make the dashboard!

Note: "db_dumb.sql" was used for the revenue analysis; "db_dumb_version_2.sql" was used for the profit analysis.
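Before building the dashboard, the same insights can be checked with plain SQL aggregations. The sketch below uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL. The table name, column names and rows are hypothetical and do not reflect the schema of the actual dump file.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (
        customer TEXT, city TEXT, order_date TEXT,
        quantity INTEGER, revenue REAL
    );
    INSERT INTO sales VALUES
        ('Customer A', 'City X', '2019-01-15', 10, 1000.0),
        ('Customer A', 'City X', '2019-02-03', 5, 450.0),
        ('Customer B', 'City Y', '2019-01-20', 8, 900.0),
        ('Customer C', 'City Y', '2020-03-11', 2, 300.0),
        ('Customer D', 'City Z', '2020-03-12', 7, 650.0);
    """
)

# Revenue breakdown by city, largest first.
revenue_by_city = conn.execute(
    "SELECT city, SUM(revenue) FROM sales GROUP BY city ORDER BY 2 DESC"
).fetchall()

# Revenue breakdown by year and month.
revenue_by_month = conn.execute(
    """
    SELECT strftime('%Y', order_date) AS year,
           strftime('%m', order_date) AS month,
           SUM(revenue)
    FROM sales GROUP BY year, month ORDER BY year, month
    """
).fetchall()

# Top 5 customers by revenue, with their total sales quantity.
top_customers = conn.execute(
    """
    SELECT customer, SUM(revenue) AS total_revenue, SUM(quantity)
    FROM sales GROUP BY customer ORDER BY total_revenue DESC LIMIT 5
    """
).fetchall()

print(revenue_by_city)
print(revenue_by_month)
print(top_customers)
```

Running the queries first gives reference numbers to compare against the Tableau charts, which helps catch mistakes in joins or filters.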

Revenue Dashboard

Profit Dashboard

Project 4: Breast Cancer : ML Classification Project

The goal is to predict whether a cancer diagnosis is benign or malignant based on several observations/features. A Support Vector Classification algorithm was used for this study, with GridSearchCV to find the best model parameters.

30 features are used, for example:

radius (mean of distances from center to points on the perimeter)
texture (standard deviation of gray-scale values)
perimeter
area
smoothness (local variation in radius lengths)
compactness (perimeter^2 / area - 1.0)
concavity (severity of concave portions of the contour)
concave points (number of concave portions of the contour)
symmetry
fractal dimension ("coastline approximation" - 1)

Datasets are linearly separable using all 30 input features

Number of Instances: 569

Class Distribution: 212 Malignant, 357 Benign

Target class:

Malignant
Benign

Dataset: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
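The approach described above can be sketched as follows. This assumes the copy of the Wisconsin Diagnostic dataset that ships with scikit-learn (load_breast_cancer) rather than a download from the UCI page, and the parameter grid is an illustrative guess, not the grid used in the project.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# scikit-learn bundles a copy of the Wisconsin Diagnostic dataset.
data = load_breast_cancer()
X, y = data.data, data.target  # target: 0 = malignant, 1 = benign
n_malignant = int((y == 0).sum())
n_benign = int((y == 1).sum())
print(X.shape, {"malignant": n_malignant, "benign": n_benign})

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale inside the pipeline so that each CV fold is scaled using only
# its own training part.
pipeline = make_pipeline(StandardScaler(), SVC())
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.001],
    "svc__kernel": ["rbf"],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("best params:", search.best_params_)
print("test accuracy:", round(test_accuracy, 3))
```

SVMs are sensitive to feature scale, and the 30 features here span very different ranges (area versus smoothness, for example), so the scaling step matters as much as the grid itself.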

Project 5: Fashion MNIST Dataset : Image Classification with CNN

A basic deep learning project that predicts the class of clothing images with a Convolutional Neural Network, using the Fashion MNIST dataset.

Dataset : https://www.kaggle.com/zalando-research/fashionmnist

Project 6: (End-to-End) Predicting Zomato Restaurant Ratings : Regression Task / Flask Deployment

The main goals of this project are:

Perform extensive Exploratory Data Analysis (EDA) on the Zomato dataset
Build an appropriate machine learning model that helps Zomato restaurants predict their ratings based on certain features
Deploy the machine learning model via Flask so that it can make live predictions of restaurant ratings
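The deployment step can be pictured with a minimal Flask sketch like the one below. The route name, the JSON field names and the predict_rating stub are all hypothetical; in the real project the stub would be replaced by a call to the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_rating(features):
    """Hypothetical stand-in for the trained model; returns a dummy rating."""
    # A real app would call something like model.predict(...) here.
    return 3.5


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    if not payload or "features" not in payload:
        return jsonify(error="expected JSON with a 'features' field"), 400
    rating = predict_rating(payload["features"])
    return jsonify(rating=rating)


# Exercise the endpoint without starting a server.
client = app.test_client()
response = client.post(
    "/predict", json={"features": {"votes": 120, "online_order": 1}}
)
print(response.status_code, response.get_json())
```

Flask's test client makes it possible to check the endpoint logic before the app is ever served; calling app.run() would start the development server for manual testing.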

Project 7: Flight Price Prediction with XGBoost: Regression Task

A regression task to predict flight prices from the given inputs.
Performed EDA and some feature engineering techniques such as one-hot encoding.
The Sklearn and PyCaret modules were used in the modelling part.
Applied the LightGBM, CatBoost and XGBoost algorithms with GridSearchCV.
Source: Kaggle
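As a reminder of what one-hot encoding does to a categorical column, here is a minimal sketch in plain Python. The helper name and the flight records are invented for illustration and do not come from the Kaggle dataset.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


# Hypothetical flight records, for illustration only.
flights = [
    {"airline": "AirA", "stops": 0},
    {"airline": "AirB", "stops": 1},
    {"airline": "AirA", "stops": 2},
]

encoded_flights = one_hot_encode(flights, "airline")
for row in encoded_flights:
    print(row)
```

In practice, pandas.get_dummies or scikit-learn's OneHotEncoder do the same job and also handle categories that appear only at prediction time.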

Project 8: Credit Card Fraud Detection : Deep Learning Task

The dataset is from Kaggle
Applied some visualizations to perform EDA
Handled the imbalanced data using downsampling and SMOTE
Modelled with Decision Trees (Sklearn) and a Convolutional Neural Network (Keras)
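The two rebalancing ideas mentioned above can be sketched in plain Python. This is a simplified illustration, not the imbalanced-learn implementation of SMOTE: the function names, the 2-D toy data and the value of k are assumptions made for the example.

```python
import math
import random


def downsample(majority, minority, rng):
    """Randomly drop majority rows until both classes have the same size."""
    return rng.sample(majority, len(minority)), list(minority)


def smote_like(minority, n_new, k, rng):
    """Create synthetic minority points by interpolating between a point
    and one of its k nearest minority neighbours (simplified SMOTE)."""
    synthetic = []
    for _ in range(n_new):
        point = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not point),
            key=lambda p: math.dist(point, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from point to neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(point, neighbour))
        )
    return synthetic


rng = random.Random(0)
# Hypothetical 2-D data: 20 "normal" rows and 4 "fraud" rows.
normal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
fraud = [(5.0, 5.0), (5.5, 4.5), (6.0, 5.5), (4.5, 6.0)]

down_normal, down_fraud = downsample(normal, fraud, rng)
print(len(down_normal), len(down_fraud))

new_fraud = smote_like(fraud, n_new=len(normal) - len(fraud), k=3, rng=rng)
print(len(normal), len(fraud) + len(new_fraud))
```

Downsampling throws away majority data, while SMOTE adds artificial minority rows; either way, resampling should be applied to the training split only so that the test set keeps the real class ratio.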

Project 9: Stock Market Analysis Using K-means : Unsupervised Task

Fetched data for 27 companies from Yahoo Finance using the pandas-datareader module
Used a pipeline to apply normalization, PCA and K-means
Further studies on stock prices are planned, such as more effective visualizations and Deep Learning methods like LSTM
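A pipeline of this shape can be sketched with scikit-learn as below. Several things here are assumptions: the data is synthetic (nothing is fetched from Yahoo Finance), "normalize" is taken to mean scikit-learn's Normalizer, and the number of PCA components and clusters are illustrative values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Synthetic stand-in for daily price movements (close minus open):
# 27 companies x 250 trading days, built from 3 hidden "sector" patterns.
rng = np.random.default_rng(0)
sector_patterns = rng.normal(size=(3, 250))
sector_of_company = np.arange(27) % 3
movements = sector_patterns[sector_of_company] + 0.3 * rng.normal(size=(27, 250))

pipeline = make_pipeline(
    Normalizer(),         # scale each company's row to unit length
    PCA(n_components=2),  # compress 250 days into 2 components
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
labels = pipeline.fit_predict(movements)

for cluster in range(3):
    print(cluster, np.flatnonzero(labels == cluster).tolist())
```

Normalizing each row first makes the clustering depend on the shape of a company's movements rather than on its price level, so expensive and cheap stocks that move together can land in the same cluster.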

