Customer Churn Prediction

Churn prediction, the task of identifying customers who are likely to stop using a service, is an important and commercially valuable problem in many industries.

Description

This project predicts a churn score for a website's customers based on features such as:

User demographic information
Browsing behavior
Historical purchase data, among other information

This project aims to identify customers who are likely to leave so that we can retain them with certain incentives.

DataSet:

The dataset comes from a hackathon. The raw dataset can be downloaded here: Link
A cleaned and processed version of the data can be accessed here: Link
The classes [customer will EXIT (1) or NOT (0)] are roughly balanced, with a 5:4 ratio.
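
As a quick worked example of what a 5:4 ratio implies, the sketch below converts the ratio into class shares and into the accuracy of a trivial classifier that always predicts the larger class. The README does not say which class is the larger one, so the code speaks only of a "larger" and a "smaller" class; the function name is illustrative and not part of the repository.

```python
from fractions import Fraction


def class_shares(ratio_a: int, ratio_b: int):
    """Turn a class ratio such as 5:4 into exact class fractions."""
    total = ratio_a + ratio_b
    return Fraction(ratio_a, total), Fraction(ratio_b, total)


# 5:4 ratio as stated in the README; which class is larger is not stated.
larger, smaller = class_shares(5, 4)

# A classifier that always predicts the larger class is right this often,
# so any useful model has to beat this baseline.
majority_baseline_accuracy = float(larger)

print(f"larger class share:  {float(larger):.1%}")
print(f"smaller class share: {float(smaller):.1%}")
print(f"majority baseline accuracy: {majority_baseline_accuracy:.1%}")
```

With shares this close, plain accuracy is still a meaningful metric, but it does not say how many leaving customers are actually caught.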

Notebook:

The notebooks contain the EDA, data processing, and model-building ideas.

Notebook	Colab	Kaggle
Customer Churn Modeling
Exploratory data analysis

Models

The final model is an ensemble of several classifiers:
- KNN
- Random Forest
- AdaBoost
- XGBoost
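
A minimal sketch of how such a stacked ensemble could be assembled with scikit-learn follows. It is not the project's actual training code: the data is synthetic, the hyperparameters are arbitrary, the logistic-regression meta-learner is an assumption, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the example needs only one dependency.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with a roughly 5:4 class ratio (NOT the project's dataset).
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           weights=[4 / 9, 5 / 9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Base learners mirroring the list above; KNN is scaled because it is distance based.
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    # Assumption: gradient boosting from scikit-learn used in place of XGBoost.
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# The meta-learner is trained on out-of-fold predictions of the base learners.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
churn_recall = recall_score(y_test, y_pred, pos_label=1)
print(f"recall on class 1: {churn_recall:.2f}")
```

If xgboost is installed, `xgboost.XGBClassifier` can be dropped into `base_learners` in place of the gradient-boosting stand-in, since it follows the scikit-learn estimator interface.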

Project Pipeline

Techstack

Python version: 3.7
Packages: pandas, numpy, sklearn, xgboost, fastapi, seaborn
Cloud: Heroku

Usage [running this locally]:

conda create -n envname python=3.7
conda activate envname
git clone https://github.com/d0r1h/Churn-Analysis.git
cd Churn-Analysis
pip install -r requirements.txt
python app.py

To download and preprocess the dataset automatically, run the following commands (the leading ! is notebook syntax; omit it in a terminal):

!pip install datasets
!python src/preprocess.py

Results

Although XGBoost reaches a good test accuracy of about 93%, the focus is on the customers who are leaving (class 1), so that we can retain them with a discount offer on membership.
The ensemble (stacking classifier) achieves 94% recall on the customers who are likely to leave, which is higher than XGBoost.
The confusion matrices of the final classifier (stacking ensemble) and the XGBoost classifier follow.

Score table for the different classifiers
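
To make the accuracy-versus-recall distinction in this section concrete, the sketch below computes both metrics from confusion-matrix counts. The counts are made-up illustrative numbers, not the project's results, and the model labels are hypothetical.

```python
def accuracy(tn: int, fp: int, fn: int, tp: int) -> float:
    """Share of all customers that were classified correctly."""
    return (tp + tn) / (tn + fp + fn + tp)


def recall(tn: int, fp: int, fn: int, tp: int) -> float:
    """Share of actual churners (class 1) that the model caught."""
    return tp / (tp + fn)


# Hypothetical confusion matrices for 100 staying and 100 leaving customers.
model_a = dict(tn=95, fp=5, fn=15, tp=85)   # more accurate overall
model_b = dict(tn=80, fp=20, fn=5, tp=95)   # catches more churners

for name, cm in (("model A", model_a), ("model B", model_b)):
    print(f"{name}: accuracy={accuracy(**cm):.3f}, recall={recall(**cm):.3f}")
```

In this toy case model A wins on accuracy while model B misses fewer leaving customers, which is the trade-off that motivates ranking classifiers by class 1 recall when the goal is retention.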

Inference Demo:

The application is deployed on Heroku and can be accessed at https://churn01.herokuapp.com/. Sample data for testing the app is here.
