aisheh90/My-Work-on-Titanic-Dataset

My-Work-on-Titanic-Dataset in Python

This is an assignment for the Machine Learning course (CS763). The task is to apply different machine learning techniques to the Titanic dataset [1] using the scikit-learn library. The two Jupyter notebooks give full details on each step of the work.

The data processing steps are the same in both notebooks and follow [2-3]. In the first notebook, 'train_test.ipynb', I used the train and test data as provided on the Kaggle website. In the second notebook, 'train_valid_test.ipynb', I split the train data into train and validation sets, with 20% held out for validation.
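The 20% split described above can be sketched with scikit-learn's `train_test_split`. This is a minimal illustration, not the notebook's actual code: the synthetic data from `make_classification` stands in for the processed Titanic features, and the `random_state` and `stratify` settings are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the processed Titanic training data (assumption:
# the real notebook loads and cleans the Kaggle train.csv instead).
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

# Hold out 20% of the training data for validation.
# stratify=y keeps the class ratio similar in both parts (an assumption,
# the README does not say whether the split was stratified).
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_valid.shape)
```

The validation part is then used to compare models before predicting on the Kaggle test data.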

  • Results of 'train_test.ipynb' (Kaggle test results):

    • DT ... 0.74162
    • Bagging DT ... 0.75598
    • AdaBoost ... 0.75598
    • Voting 5 models ... 0.76555
    • Voting all models ... 0.79425

    So voting over all models had the best score in this notebook.
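As a rough sketch of how such a voting ensemble can be built in scikit-learn, the example below combines a decision tree, bagged decision trees, and AdaBoost with `VotingClassifier`. The choice of three models, hard (majority) voting, the hyperparameters, and the synthetic data are all assumptions for illustration; the README does not state which models or voting mode the notebook used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the processed Titanic features (assumption).
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical member models; hyperparameters are illustrative only.
estimators = [
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("bag", BaggingClassifier(DecisionTreeClassifier(random_state=0),
                              n_estimators=50, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
]

# Hard voting: each model casts one vote and the majority class wins.
voter = VotingClassifier(estimators=estimators, voting="hard")
voter.fit(X_train, y_train)

voting_accuracy = voter.score(X_test, y_test)
print(f"Voting accuracy: {voting_accuracy:.5f}")
```

Adding more members to `estimators` gives the "voting all models" variant; whether that helps depends on how different the members' errors are.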

  • Results of 'train_valid_test.ipynb' (Kaggle test results):

    • DT2 ... 0.74162
    • Random Forest ... 0.79904
    • Naive Bayes ... 0.76555
    • KNN ... 0.76555
    • Voting all ... 0.77990
    • AdaBoost ... 0.73684

    So Random Forest had the best score across the two notebooks.
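A comparison like the one above can be sketched by fitting each model on the training part and scoring it on the held-out validation part. This is only an illustration under assumptions: the data is synthetic, the models use default hyperparameters, and the scores it prints are unrelated to the Kaggle results listed above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the processed Titanic features (assumption).
X, y = make_classification(n_samples=1000, n_features=8, random_state=1)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Hypothetical model set with default settings, named after the list above.
models = {
    "DT": DecisionTreeClassifier(random_state=1),
    "Random Forest": RandomForestClassifier(random_state=1),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(),
}

# Fit each model and record its accuracy on the validation set.
valid_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    valid_scores[name] = accuracy_score(y_valid, model.predict(X_valid))

best_name = max(valid_scores, key=valid_scores.get)
for name, score in sorted(valid_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.5f}")
print("Best on validation:", best_name)
```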

References:
