A comprehensive machine learning pipeline for churn prediction in telecom customers using CatBoost, featuring data preprocessing, exploratory data analysis, feature engineering, and model evaluation.

yyigitturan/Telco-Customer-Churn-Feature-Engineering

Telecom Customer Churn Prediction using CatBoost

Overview

This project presents a complete machine learning pipeline for predicting customer churn in the telecom industry. It covers data preprocessing, exploratory data analysis (EDA), feature engineering, and model evaluation, with a CatBoost classifier as the predictive model.

Dataset

The project uses the "Telco Customer Churn" dataset, which describes each customer through demographics, account information, and subscribed services.
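As a rough illustration of the kind of cleanup this dataset tends to need, the sketch below builds a few toy rows and prepares them for modeling. The column names (`tenure`, `TotalCharges`, `Churn`, and so on) and the blank `TotalCharges` value are assumptions based on the publicly distributed Telco Customer Churn CSV, not something this README specifies. The helper `clean_telco` is hypothetical, and filling zero-tenure customers with 0.0 is only one possible imputation choice.

```python
import pandas as pd

# Toy rows mimicking the public Telco Customer Churn CSV (assumed layout,
# not data taken from this repository).
raw = pd.DataFrame(
    {
        "customerID": ["0001-A", "0002-B", "0003-C", "0004-D"],
        "gender": ["Female", "Male", "Male", "Female"],
        "tenure": [1, 34, 0, 45],
        "Contract": ["Month-to-month", "One year", "Two year", "One year"],
        "MonthlyCharges": [29.85, 56.95, 52.55, 42.30],
        "TotalCharges": ["29.85", "1889.5", " ", "1840.75"],  # stored as text
        "Churn": ["Yes", "No", "No", "No"],
    }
)


def clean_telco(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: numeric TotalCharges and a 0/1 Churn target."""
    out = df.copy()
    # Blank strings become NaN instead of raising an error.
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce")
    # Assumption: a customer with zero tenure has not been billed yet.
    zero_tenure = out["tenure"] == 0
    out.loc[zero_tenure, "TotalCharges"] = out.loc[zero_tenure, "TotalCharges"].fillna(0.0)
    # Encode the target as integers for the classifier.
    out["Churn"] = out["Churn"].map({"Yes": 1, "No": 0})
    return out


clean = clean_telco(raw)
print(clean.dtypes)
print(clean.isna().sum())
```

With a real file, `raw` would come from `pd.read_csv(...)` on whatever path the CSV is stored under.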

Key Features

  • Data Preprocessing: Handling missing values, encoding categorical variables, and standardizing features.
  • Exploratory Data Analysis (EDA): Visual and statistical analysis to understand the data distribution and relationships between variables.
  • Feature Engineering: Creating new features to improve model performance and give insight into customer behavior.
  • Model Building: Training a CatBoost gradient boosting classifier as the predictive model.
  • Model Evaluation: Assessing model performance with metrics like accuracy, precision, recall, F1 score, and AUC.
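To make the feature engineering step above more concrete, here is a minimal sketch of three derived features: a tenure band, an average monthly spend, and a count of subscribed add-on services. These particular features, the bin edges, and the helper `add_features` are illustrative assumptions, not necessarily the ones this project creates. The column names again follow the public Telco CSV.

```python
import pandas as pd

# Toy frame with assumed column names from the public Telco CSV.
df = pd.DataFrame(
    {
        "tenure": [1, 34, 2, 45, 72],
        "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 110.00],
        "TotalCharges": [29.85, 1889.50, 108.15, 1840.75, 7900.00],
        "OnlineSecurity": ["No", "Yes", "Yes", "Yes", "Yes"],
        "TechSupport": ["No", "No", "No", "Yes", "Yes"],
        "StreamingTV": ["No", "No", "No", "No", "Yes"],
    }
)

SERVICE_COLS = ["OnlineSecurity", "TechSupport", "StreamingTV"]


def add_features(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper that returns a copy with three extra columns."""
    out = frame.copy()
    # Bucket tenure (in months) into coarse bands; the edges are arbitrary.
    out["tenure_band"] = pd.cut(
        out["tenure"],
        bins=[-1, 12, 24, 48, 72],
        labels=["0-1y", "1-2y", "2-4y", "4-6y"],
    ).astype(str)
    # Average spend per month; the +1 avoids division by zero for new customers.
    out["avg_monthly_spend"] = out["TotalCharges"] / (out["tenure"] + 1)
    # Number of add-on services the customer subscribes to.
    out["n_services"] = (out[SERVICE_COLS] == "Yes").sum(axis=1)
    return out


features = add_features(df)
print(features[["tenure_band", "avg_monthly_spend", "n_services"]])
```

String-valued bands such as `tenure_band` can be passed to CatBoost as categorical features directly, which is one reason to keep them as labels rather than one-hot encoding them.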

Requirements

The project requires Python and the following Python libraries:

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • catboost
  • scikit-learn

Usage

Install the required libraries (see Installation), then run the analysis script. The pipeline proceeds through preprocessing, EDA, feature engineering, model training, and evaluation, and reports accuracy, precision, recall, F1 score, and AUC.

Installation

pip install pandas numpy matplotlib seaborn catboost scikit-learn
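After installing, a quick check like the following can confirm that every library is importable. It is a small convenience sketch, not part of the project; `missing_packages` is a hypothetical helper. Note that the pip package `scikit-learn` is imported under the module name `sklearn`.

```python
import importlib.util

# pip package name -> module name used in "import ..."
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "catboost": "catboost",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```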
