The primary objective of this project is to develop a model that can effectively predict the likelihood of a customer defaulting on insurance premium payments.
- 🤓 Description
- 💻 Dataset Overview
- 📊 Exploratory Data Analysis
- 🛠️ Feature Engineering
- 🏗️ Model Building
- ✨ Recommendations
- 📗 Notebooks
- 📧 Contact Information
As the premium paid by customers is the major revenue source for insurance companies, defaulting on these payments results in significant revenue losses. Hence, insurance companies would like to know upfront which types of customers would default on premium payments.
The objectives of this project are to:
- Build a model that can predict the likelihood of a customer defaulting on premium payments.
- Identify the factors that drive higher default rates.
- Propose a strategy for reducing default rates by using the model and other insights from the analysis.
The dataset source file can be found through the following link:
The dataset contains 17 variables. The data dictionary below explains each variable:
Data Dictionary
- id: Unique customer ID
- perc_premium_paid_by_cash_credit: What % of the premium was paid by cash payments?
- age_in_days: age of the customer in days
- Income: Income of the customer
- Marital Status: Married (1), Unmarried (0)
- Veh_owned: Number of vehicles owned (1-3)
- Count_3-6_months_late: Number of times premium was paid 3-6 months late
- Count_6-12_months_late: Number of times premium was paid 6-12 months late
- Count_more_than_12_months_late: Number of times premium was paid more than 12 months late
- Risk_score: Risk score of customer (similar to credit score)
- No_of_dep: Number of dependents in the family of the customer (1-4)
- Accommodation: Owned (1), Rented (0)
- no_of_premiums_paid: Number of premiums paid till date
- sourcing_channel: Channel through which customer was sourced
- residence_area_type: Residence type of the customer
- premium: Total premium amount paid till now
- default: (Y variable) 0 indicates that the customer has defaulted on the premium and 1 indicates that the customer has not defaulted on the premium
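Because the `default` column is coded with 0 for a default and 1 for no default, it is easy to read it the wrong way round. The sketch below shows one way the columns from the data dictionary could be loaded and made explicit. The sample rows are synthetic and for illustration only, the names `load_rows`, `default_rate`, `age_years`, `cash_share`, and `has_defaulted` are hypothetical, and treating `perc_premium_paid_by_cash_credit` as a fraction between 0 and 1 is an assumption about the file.

```python
import csv
import io

# Synthetic rows for illustration only; these are NOT taken from the real dataset.
SAMPLE_CSV = """id,age_in_days,perc_premium_paid_by_cash_credit,default
1,12045,0.10,1
2,21900,0.85,0
3,16425,0.40,1
4,25550,0.95,0
"""


def load_rows(text):
    """Parse CSV text into dicts with typed fields and derived columns."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "id": int(raw["id"]),
            # age_in_days is converted to years for easier reading.
            "age_years": int(raw["age_in_days"]) / 365.25,
            # Assumption: the value is stored as a fraction between 0 and 1.
            "cash_share": float(raw["perc_premium_paid_by_cash_credit"]),
            # Per the data dictionary, 0 means defaulted and 1 means not defaulted.
            "has_defaulted": int(raw["default"]) == 0,
        })
    return rows


def default_rate(rows):
    """Return the share of rows where the customer defaulted."""
    if not rows:
        return 0.0
    return sum(r["has_defaulted"] for r in rows) / len(rows)


rows = load_rows(SAMPLE_CSV)
print(f"default rate in the synthetic sample: {default_rate(rows):.2f}")
```

Deriving an explicit boolean such as `has_defaulted` early on keeps later summaries from silently reporting the non-default rate instead.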
A significant amount of data pre-processing was required prior to data visualization. These steps can be seen in the following section.
The Univariate and Bivariate analysis can be seen here.
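As a rough illustration of what a bivariate check might look like, the sketch below compares default rates across buckets of the cash-payment share. It is not the analysis from the notebook: the records are synthetic, the bucket boundaries (thirds) are arbitrary, and the names `bucket` and `default_rate_by_bucket` are hypothetical.

```python
from collections import defaultdict

# Synthetic (cash_share, default) pairs for illustration only.
# As in the data dictionary, default == 0 means the customer defaulted.
records = [
    (0.05, 1), (0.20, 1),
    (0.45, 1), (0.55, 0),
    (0.80, 0), (0.90, 1), (0.95, 0),
]


def bucket(cash_share):
    """Assign a cash-payment share to an arbitrary low/medium/high bucket."""
    if cash_share < 1 / 3:
        return "low"
    if cash_share < 2 / 3:
        return "medium"
    return "high"


def default_rate_by_bucket(pairs):
    """Return {bucket name: share of customers in that bucket who defaulted}."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [defaults, total]
    for cash_share, default_flag in pairs:
        entry = counts[bucket(cash_share)]
        if default_flag == 0:
            entry[0] += 1
        entry[1] += 1
    return {name: defaults / total for name, (defaults, total) in counts.items()}


rates = default_rate_by_bucket(records)
for name in ("low", "medium", "high"):
    print(f"{name}: {rates[name]:.2f}")
```

The same grouping could be applied to any other variable in the data dictionary, such as the late-payment counts or the sourcing channel.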
The step-by-step data cleaning and wrangling can be observed in this section.
The data model preparation and linear regression steps can be seen here.
Based on the analysis, the following recommendations can be made to help insurance companies continue to receive their premium payments:
- Improve access to non-cash payment services. This can be achieved by placing special emphasis on payment options in all marketing campaigns.
- Offer additional no-claim discounts and services to customers who pay their premiums via non-cash methods.
- Notify policy holders by phone and mail of policy expiration and renewal dates starting four months in advance, and send electronic reminders every month until the due date. On the policy expiration date, insurance agents should contact policy holders with a further reminder.
- Insurance companies should consolidate insurance packages per policy holder to reduce the number of policies held by each individual client. Discounts should also be given to clients who hold many policies, to reduce their hesitance to complete their premium payments.
- Since older customers tend to default more than younger ones, their registered next of kin should, where applicable, be contacted and informed about the policies available.
- Since the highest income earners churn the fastest, insurance companies should contact those clients on a regular basis, possibly offering additional trial services to encourage them to remain loyal.
The Notebook for the "Data Exploration" can be accessed below:
The Notebook for the "Exploratory Data Analysis" can be accessed below:
The Notebook for the "Feature Engineering" can be accessed below:
The Notebook for the "Model Building" can be accessed below:
- Email: sean_dhanasar@msn.com
- LinkedIn: Sean Dhanasar