Machine Learning Competition in Python for la Universidad Carlos III de Madrid. Model Algorithm Selection and Score Accuracy optimization.

dizquedavid/ML_UC3M_Kaggle_Copetition

ML_UC3M_Kaggle_Copetition

Repository for Universidad Carlos III de Madrid Machine Learning Kaggle Competition

David Campos Brandao:

Brief:

Here I compare the accuracies of different algorithms when predicting on an obscure data set provided by our professor. I created a training pipeline and a scoring pipeline, alongside a hyperparameter tuning pipeline, which together allow for flexibility and scalability in small Data Science projects. I also comment extensively on my thought process, justifying certain steps and noting improvements needed for the future.
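As a rough illustration of the idea (not the code from the notebook), the sketch below tunes and compares two candidate models with scikit-learn. It assumes a classification task, uses synthetic data in place of the competition data, and the candidate models, parameter grids, and the helper name `tune_and_score` are all hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): the competition data is not reproduced here.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical candidates and grids, chosen only for illustration.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"model__C": [0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"model__n_estimators": [50, 100], "model__max_depth": [3, None]},
    ),
}


def tune_and_score(candidates, X_train, y_train, X_val, y_val):
    """Tune each candidate with cross-validation, then score it on held-out data."""
    results = {}
    for name, (estimator, grid) in candidates.items():
        # Training pipeline: scaling followed by the candidate model.
        pipeline = Pipeline([("scale", StandardScaler()), ("model", estimator)])
        # Tuning step: exhaustive search over the grid with 3-fold CV.
        search = GridSearchCV(pipeline, grid, cv=3, scoring="accuracy")
        search.fit(X_train, y_train)
        # Scoring step: accuracy of the refitted best model on the validation split.
        results[name] = {
            "best_params": search.best_params_,
            "cv_accuracy": search.best_score_,
            "val_accuracy": search.score(X_val, y_val),
        }
    return results


results = tune_and_score(candidates, X_train, y_train, X_val, y_val)
for name, summary in results.items():
    print(name, summary)
```

Keeping the candidates in a dictionary means another algorithm can be added to the comparison by adding one entry, which is one way to get the flexibility described above.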

This code includes some data exploration, feature reduction, model comparisons, and visualizations: the bare minimum for a DS project of this type. I may update the notebook if anything else comes to mind.
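For readers unfamiliar with feature reduction, here is a minimal sketch of one common approach: univariate selection with scikit-learn's `SelectKBest`. This is an assumed technique for illustration; the notebook may use a different method. The synthetic data, the feature names, and the helper `select_top_features` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in data (assumption): 8 features, 3 of them informative.
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=3, n_redundant=0, random_state=1
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]


def select_top_features(X, y, names, k):
    """Return the names of the k features with the highest ANOVA F-scores."""
    selector = SelectKBest(score_func=f_classif, k=k)
    selector.fit(X, y)
    mask = selector.get_support()  # boolean mask over the input columns
    return [name for name, keep in zip(names, mask) if keep]


selected = select_top_features(X, y, feature_names, k=3)
print(selected)
```

A list of selected feature names like this one could be saved per target and reloaded at scoring time, so training and prediction use the same columns.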

Information about the Project (Context & Credit)

This notebook details the code used for a Kaggle Competition I participated in during 2019. The competition was created by our Machine Learning professor, Dr. Pablo Martinez Olmos (a professor for the Master's in Big Data Analytics at Universidad Carlos III in Madrid). We worked in pairs to present a submission to the Kaggle Competition.

In the original submission I ranked above 80% of the 70-competitor pool. I am revisiting the code and updating it for my GitHub repository. I am also making changes as I see fit, improving the code with things I have learned since then. Yolanda Ibañez Perez was my original partner in this competition; credit goes to her for the work presented here.

Repository Structure

  • kaggle_competition_dcamposb_notebook.ipynb: The Jupyter Notebook with All the Code
  • Data: Where all the Working Data is
    • Training and Validation Data Sets
    • Unlabeled Data Set for final Prediction
    • 2019 Data Result for Competition Submission
    • 2020 Data Result for Competition Submission
  • optimal_feature_lists_for_targets:
    • Files Holding Special Information about feature reduction process
  • notebook_mardkdown: Where the notebook markdown is
    • If the Notebook doesn't open on GitHub, open the markdown.
    • Make sure to download the whole file, with the images too

If interested in reading the code:

If you want to check out or download the notebook, simply download the file kaggle_competition_dcamposb_notebook.ipynb.

If you want to check the final markdown, just go into the folder notebook_mardkdown and check out the knitted Jupyter notebook.
