nalkpas/MSE246-2018-Project

MSE246-2018-Project

This project implements and compares several models for predicting loan default and loss at default. We use data from the U.S. SBA 504 loan program, consisting of 150,000 loans issued between 1990 and 2014, and augment it with several macroeconomic factors, including the Consumer Price Index and yearly S&P 500 returns.
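As a rough illustration of the augmentation step, the sketch below joins yearly macroeconomic factors onto loan records with pandas. The column names (`approval_year`, `cpi`, `sp500_return`) and the join key are assumptions, and the numbers are placeholders rather than real CPI or S&P 500 figures. The actual processing lives in data_processed_hujia.ipynb and may differ.

```python
import pandas as pd

# Hypothetical loan records; column names are assumptions for illustration.
loans = pd.DataFrame({
    "loan_id": [1, 2, 3],
    "approval_year": [1995, 1995, 2008],
    "defaulted": [0, 1, 1],
})

# Placeholder macro values, NOT real CPI or S&P 500 figures.
macro = pd.DataFrame({
    "year": [1995, 2008],
    "cpi": [100.0, 140.0],
    "sp500_return": [0.10, -0.20],
})

# A left join keeps every loan, even when a year has no macro data.
augmented = loans.merge(
    macro, how="left", left_on="approval_year", right_on="year"
).drop(columns="year")

print(augmented)
```

A left join is used here so that loans are never silently dropped. Rows from years without macro data would show up with missing values instead.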

Data Processing

The data_processed_final folder contains our final processed data, created with data_processed_hujia.ipynb. The data_exploration.ipynb notebook contains code for preliminary analysis and for generating exploratory graphs.

Logistic Model

The logistic model.ipynb notebook in the logistic_model folder contains code for tuning and analyzing our logistic model. logistic_roc.csv contains the validation ROC curve.
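For readers unfamiliar with how a validation ROC curve is produced, here is a minimal sketch using only the standard library. It sweeps a threshold over predicted default probabilities and records (false positive rate, true positive rate) pairs. The toy labels and scores, the function names, and the `fpr`/`tpr` column names are assumptions; the layout of logistic_roc.csv may differ.

```python
import csv
import io


def roc_points(labels, scores):
    """Return (fpr, tpr) pairs obtained by sweeping a threshold over scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Sort by descending score so each step lowers the threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last loan sharing this score (handles ties).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy validation labels (1 = default) and predicted default probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(labels, scores)

# Write the curve in CSV form; the column names are an assumption.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["fpr", "tpr"])
writer.writerows(points)

print(buffer.getvalue())
print("AUC:", auc(points))
```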

Neural Network

The neural_network folder contains our attempts at implementing a binary classification neural network. NNprocessing.py contains neural network-specific preprocessing. static_net.py and dynamic_net.py are first attempts, exploring PyTorch's support for dynamic computational graphs. default_net.py contains our final implementation, which uses batch normalization, dropout, and the Adam optimizer. nn_eval.py analyzes the model's parameters and evaluates its validation performance. Unfortunately, we were unable to implement a fully functioning neural network.
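To make the ingredients named above concrete, the following is a minimal PyTorch sketch of a binary classifier that combines batch normalization, dropout, and the Adam optimizer. It is not the code in default_net.py: the class name `DefaultNet`, the layer sizes, the dropout rate, the learning rate, and the synthetic data are all assumptions chosen for illustration.

```python
import torch
from torch import nn


class DefaultNet(nn.Module):
    """Small feed-forward binary classifier (layer sizes are hypothetical)."""

    def __init__(self, n_features, hidden=32, p_drop=0.2):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden, 1),  # one logit per loan
        )

    def forward(self, x):
        return self.layers(x).squeeze(1)


torch.manual_seed(0)
X = torch.randn(64, 10)        # synthetic features, not SBA data
y = (X[:, 0] > 0).float()      # synthetic "default" labels

model = DefaultNet(n_features=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()  # applies the sigmoid internally

model.train()
losses = []
for _ in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# eval() disables dropout and switches batch norm to its running statistics.
model.eval()
with torch.no_grad():
    probs = torch.sigmoid(model(X))

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

Calling `model.eval()` before validation matters here. If the model is left in training mode, dropout and batch-statistics normalization stay active and make validation metrics noisy.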

Hazard Model

The hazard model is in the hazard_lifelines_michelle.ipynb notebook in the data_processed_final folder.
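As background on what a hazard model estimates, the sketch below computes a simple empirical per-period hazard and the implied survival curve in plain Python. This is a simplified, non-parametric illustration under assumed toy data. It does not use the lifelines library and is not the model fitted in the notebook. The function names and the data are hypothetical.

```python
from collections import Counter


def empirical_hazard(durations, defaulted):
    """Per-period hazard: defaults in period t divided by loans at risk at t."""
    events = Counter(t for t, d in zip(durations, defaulted) if d)
    exits = Counter(durations)  # defaults and censored loans both leave the risk set
    at_risk = len(durations)
    hazard = {}
    for t in sorted(exits):
        hazard[t] = events.get(t, 0) / at_risk
        at_risk -= exits[t]
    return hazard


def survival(hazard):
    """Kaplan-Meier style survival curve: running product of (1 - hazard)."""
    curve, s = {}, 1.0
    for t in sorted(hazard):
        s *= 1.0 - hazard[t]
        curve[t] = s
    return curve


# Toy data: loan age in years at default (1) or at end of observation (0).
durations = [1, 2, 2, 3, 3, 3]
defaulted = [1, 1, 0, 0, 1, 0]

h = empirical_hazard(durations, defaulted)
s = survival(h)

for t in sorted(h):
    print(f"year {t}: hazard {h[t]:.3f}, survival {s[t]:.3f}")
```

Loans that have not defaulted by the end of the observation window are treated as censored. They count toward the at-risk set while observed but never contribute a default event.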

Loss Model

The loss model is in the loss folder, in loss_model_michelle.ipynb. The 1_and_5_year_loss_michelle.ipynb notebook contains the tranche loss simulation code. Graphs generated in that notebook were also saved as screenshots in the graphs folder.
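The idea behind a tranche loss simulation can be sketched as a small Monte Carlo experiment: simulate portfolio losses, then map each one onto a tranche defined by attachment and detachment points. Everything below is an assumption made for illustration, including independent defaults, equal loan sizes, a fixed loss severity, the 5%-15% and 15%-100% tranche boundaries, and all numeric parameters. The notebook's simulation presumably draws on the fitted default and loss models instead.

```python
import random


def tranche_loss(portfolio_loss, attach, detach):
    """Fraction of the tranche [attach, detach] wiped out by a portfolio loss.

    All three arguments are fractions of total portfolio notional.
    """
    absorbed = min(max(portfolio_loss - attach, 0.0), detach - attach)
    return absorbed / (detach - attach)


def simulate_portfolio_loss(n_loans, p_default, loss_given_default, rng):
    """One scenario: independent defaults, equal loan sizes, fixed severity."""
    defaults = sum(rng.random() < p_default for _ in range(n_loans))
    return defaults * loss_given_default / n_loans


rng = random.Random(0)  # fixed seed for reproducibility

# Hypothetical parameters: 500 loans, 20% default probability, 50% severity.
scenarios = [simulate_portfolio_loss(500, 0.2, 0.5, rng) for _ in range(2000)]

# Hypothetical tranches: 5%-15% (junior) and 15%-100% (senior).
junior = [tranche_loss(loss, 0.05, 0.15) for loss in scenarios]
senior = [tranche_loss(loss, 0.15, 1.00) for loss in scenarios]

mean_junior = sum(junior) / len(junior)
mean_senior = sum(senior) / len(senior)
print(f"mean junior tranche loss: {mean_junior:.3f}")
print(f"mean senior tranche loss: {mean_senior:.3f}")
```

Assuming independent defaults understates tail risk, because real loan defaults are correlated through the economy. A more faithful simulation would draw defaults conditional on shared macroeconomic scenarios.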
