diegodebrito/data-science-toolbox

Data Science Toolbox

This folder contains procedures, classes, and functions that I use frequently in my Data Science projects. The notebooks contain example applications. By design, this is a permanent work in progress.

Contents

  • Reference Guide and Notes: notes on useful Data Science strategies and techniques, each of which is demonstrated in the individual notebooks. This file serves as a quick reference guide to the contents of the other files.
  • Quick Baseline Analysis: quick data cleaning and local validation with a Random Forest model for a first iteration.
  • Exploratory Data Analysis: mostly templates for the different techniques and plots used in EDA.
  • Model Selection and Hyperparameter Tuning: model selection techniques and hyperparameter tuning strategies for Gradient Boosting Models (XGBoost, LightGBM).
  • Building Pipelines: custom classes based on BaseEstimator (Scikit-Learn) and some pipeline templates for Machine Learning.
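
To give a flavor of what the Quick Baseline Analysis and Building Pipelines notebooks cover, here is a minimal sketch. It is not taken from the notebooks: the `MedianImputer` class, the synthetic data, and all parameter values are illustrative assumptions. It shows a custom transformer built on Scikit-Learn's `BaseEstimator`, chained with a Random Forest in a `Pipeline` and validated locally with cross-validation.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline


class MedianImputer(BaseEstimator, TransformerMixin):
    """Hypothetical example transformer: fill NaNs with per-column medians."""

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Medians are learned from the data passed to fit() only, so each
        # cross-validation fold uses statistics from its own training split.
        self.medians_ = np.nanmedian(X, axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float).copy()
        rows, cols = np.where(np.isnan(X))
        X[rows, cols] = self.medians_[cols]
        return X


# Synthetic data (an assumption for illustration): the label depends on the
# first two features, and roughly 10% of the entries are set to missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X[rng.random(X.shape) < 0.1] = np.nan

# Cleaning step and model live in one pipeline, so both are refit per fold.
baseline = Pipeline([
    ("impute", MedianImputer()),
    ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
])

scores = cross_val_score(baseline, X, y, cv=5, scoring="accuracy")
print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Keeping the cleaning step inside the pipeline means the same object can later be passed to a hyperparameter search without changing the validation logic.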
