logistic-regression-with-Donorchoose-dataset

In this project I applied the logistic regression algorithm to the DonorsChoose dataset to predict whether a given project will be approved for funding.

I created 4 datasets. Each dataset contains the text features encoded with a different encoding technique.
Set 1 | Text features encoded with a simple Bag of Words vectorizer.
Set 2 | Text features encoded with a TFIDF vectorizer.
Set 3 | Text features encoded with an Average Word2Vec vectorizer.
Set 4 | Text features encoded with a TFIDF Word2Vec vectorizer.
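
The two Word2Vec encodings pool word vectors into one fixed-length vector per text. The sketch below illustrates that idea in plain Python and is not the project's code. The 3-dimensional `WORD_VECTORS` table, the sample tokens, and all function names are hypothetical, and the IDF formula is one common smoothed variant chosen as an assumption.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional word vectors; a trained Word2Vec model would supply real ones.
WORD_VECTORS = {
    "students": [1.0, 0.0, 0.0],
    "need": [0.0, 1.0, 0.0],
    "books": [0.0, 0.0, 1.0],
}


def average_w2v(tokens, vectors):
    """Plain mean of the vectors of all tokens that have a vector."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        if tok in vectors:
            total = [t + v for t, v in zip(total, vectors[tok])]
            count += 1
    # A text with no known words maps to the zero vector.
    return [t / count for t in total] if count else total


def idf_table(corpus):
    """Smoothed IDF per token: ln((1 + N) / (1 + df)) + 1 (assumed variant)."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1 for tok, df in doc_freq.items()}


def tfidf_w2v(tokens, vectors, idf):
    """Weighted mean of word vectors, with weight = term frequency * IDF."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, freq in Counter(tokens).items():
        if tok in vectors and tok in idf:
            weight = (freq / len(tokens)) * idf[tok]
            total = [t + weight * v for t, v in zip(total, vectors[tok])]
            weight_sum += weight
    return [t / weight_sum for t in total] if weight_sum else total


corpus = [["students", "need", "books"], ["students", "need", "pencils"]]
idf = idf_table(corpus)
print(average_w2v(corpus[0], WORD_VECTORS))
print(tfidf_w2v(corpus[0], WORD_VECTORS, idf))
```

In this toy corpus "books" appears in only one text, so the TFIDF-weighted vector leans more towards it than the plain average does.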

Logistic regression (LR) is then applied to all 4 datasets.
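
To make the modelling step concrete, here is a minimal sketch of logistic regression fitted by batch gradient descent on the log loss. It is only an illustration of the algorithm, not the project's training code. The function names, learning rate, epoch count, and the one-feature toy data are all assumptions.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log loss w.r.t. the linear score is (prediction - label).
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Toy data: a single feature that separates the two classes.
X_toy = [[-2.0], [-1.0], [1.0], [2.0]]
y_toy = [0, 0, 1, 1]
w, b = fit_logistic_regression(X_toy, y_toy)
print(predict_proba(X_toy, w, b))
```

With real encoded text the feature vectors are high-dimensional and sparse, so a library implementation with regularization would normally be used instead of a hand-written loop like this one.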

Conclusion after applying LR to all datasets

LR predicts funding approval with an AUC score of 0.72 on Set 2. On that set, 1/3 of points are misclassified as false positives and 1/2 of points are misclassified as false negatives.
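
For readers unfamiliar with these metrics, the sketch below computes an AUC score and the false positive and false negative rates from predicted scores. The labels and scores are made-up toy values, not the project's results, and the function names are hypothetical. It also assumes the fractions are read as per-class rates at a 0.5 threshold, which the text above does not state.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def error_rates(y_true, scores, threshold=0.5):
    """False positive and false negative rates when predicting 1 for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "false_positive_rate": fp / (fp + tn),  # share of true negatives predicted positive
        "false_negative_rate": fn / (fn + tp),  # share of true positives predicted negative
    }


# Toy labels and scores, chosen only for illustration.
y_toy = [0, 0, 0, 1, 1, 1, 1]
scores_toy = [0.1, 0.4, 0.7, 0.3, 0.45, 0.8, 0.9]
print(roc_auc(y_toy, scores_toy))
print(error_rates(y_toy, scores_toy))
```

On this toy data the AUC is 0.75, with a false positive rate of 1/3 and a false negative rate of 1/2, which shows that a moderate AUC can coexist with high error rates at a fixed threshold.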