zaaachos/Overfit-level-I

Welcome to Overfit_Level_I! This repository showcases my work for the Kaggle competition S04E02, Multi-Class Prediction of Obesity Risk. The goal of the competition is to predict an individual's obesity risk, a condition linked to cardiovascular disease, from a set of personal factors.

My top-performing model ranked in the top 30% of 2,840 competitors. The work combines extensive Exploratory Data Analysis and Feature Engineering with scikit-learn, SHAP, CatBoost, XGBoost, and LightGBM, plus several ensemble strategies. You can read my analysis and development in my submission_notebook.
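As an illustration of one common ensemble strategy, the sketch below blends per-class probabilities from several models by weighted averaging (soft voting). It is not taken from the notebook: the function name `soft_vote`, the weights, and the probability values are assumptions made up for this example, and the notebook may use different strategies.

```python
from collections import defaultdict


def soft_vote(model_probs, weights=None):
    """Blend class probabilities from several models (hypothetical helper).

    model_probs: list of dicts, one per model, mapping class label -> probability.
    weights: optional list of per-model weights; defaults to equal weights.
    Returns (best_label, blended_probabilities).
    """
    if not model_probs:
        raise ValueError("at least one model is required")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")

    total = sum(weights)
    blended = defaultdict(float)
    for probs, weight in zip(model_probs, weights):
        for label, p in probs.items():
            # Weighted mean of the probability each model assigns to this label.
            blended[label] += weight * p / total

    best_label = max(blended, key=blended.get)
    return best_label, dict(blended)


# Made-up probabilities for a single sample from three models.
catboost_probs = {"Normal_Weight": 0.6, "Overweight_Level_I": 0.4}
xgboost_probs = {"Normal_Weight": 0.3, "Overweight_Level_I": 0.7}
lightgbm_probs = {"Normal_Weight": 0.5, "Overweight_Level_I": 0.5}

label, blended = soft_vote([catboost_probs, xgboost_probs, lightgbm_probs])
```

With equal weights the blended prediction follows the average of the three models; passing `weights=[3, 1, 1]` would let the first model dominate instead.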

ML App

One of the highlights of this project is a Dockerized Flask application, which you can interact with through HuggingFace Spaces. Check it out here: OverfitLevelI

Environment Setup

To get started with this project, follow these steps to set up your environment:

  • Clone the Repository:
git clone https://github.com/zaaachos/Overfit-Level-I.git
  • Install Dependencies:

It is highly recommended to use conda for your virtual environment:

conda create -n obesityEnv python=3.9
conda activate obesityEnv

Navigate to the webapp project directory and install the necessary dependencies by running:

pip install -r requirements.txt
  • Run the Application Locally:

Once dependencies are installed, you can run the Flask application locally by executing:

python app.py

This will start the Flask server, and you can access the application at http://localhost:5000/obesityRiskForm in your web browser.
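To confirm that the server is actually up before opening the browser, a small standard-library check like the one below can help. This is only a sketch: the helper names `form_url` and `is_up` are hypothetical, and it assumes the app listens on the default host and port shown above.

```python
import urllib.request


def form_url(host="localhost", port=5000):
    """Build the URL of the form page (path taken from the README)."""
    return f"http://{host}:{port}/obesityRiskForm"


def is_up(url, timeout=2.0):
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses,
        # since urllib's URLError and HTTPError both derive from OSError.
        return False


if __name__ == "__main__":
    url = form_url()
    print(url, "is reachable" if is_up(url) else "is not reachable")
```

If the check fails, make sure `python app.py` is still running and that nothing else is using port 5000.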

Example video

ExampleGif

Next steps

  • Codify the work that is provided in my notebook
  • Make a web app using Flask and test it with Postman
  • Dockerize and deploy it to Docker Hub
  • Deploy to a cloud service (HuggingFace Spaces)
  • Deploy to another cloud service with GitHub Actions
  • Write a notebook README that summarises all implemented approaches, along with graphs