Machine Learning For Official Statistics and SDGs

hiver-py/MLOS


This repository contains the practicals for each module, along with interactive tools built in Shiny that help build intuition for the methods. The material was used in a six-week machine learning course for official statistics and the SDGs, hosted by UN-SIAP. The code was written by Christophe Bontemps and Patrick Jonsson, with help and inspiration from Pascal Lavergne.

Module 0: You've seen this before

  • Linear and non-linear regression
  • Supervised vs unsupervised learning
  • k-Nearest Neighbors
  • Statistical Learning vs Machine Learning
  • Cross validation

This module includes an interactive Shiny application that visualizes KNN regression.
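As a rough illustration of the idea behind KNN regression, here is a minimal sketch in plain Python. It is not taken from the course practicals: the function name `knn_regress` and the toy data are assumptions made for this example. The prediction at a query point is the average of the targets of its `k` nearest training points.

```python
def knn_regress(train_x, train_y, query, k):
    """Predict y at `query` as the mean of the k nearest training targets."""
    # Rank training indices by distance to the query (ties keep input order).
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


# Toy data, made up for illustration: y is roughly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

pred_k1 = knn_regress(xs, ys, 2.9, k=1)  # follows the single closest point: 6.2
pred_k5 = knn_regress(xs, ys, 2.9, k=5)  # averages every point: about 6.02
print(pred_k1, pred_k5)
```

A small `k` tracks the nearest observations closely, while `k` equal to the sample size collapses to the overall mean. Cross validation is one common way to pick a value in between.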

Module 2: Classification

  • How classification works
  • Supervised vs unsupervised classification
  • Examples of classifiers
  • Measures of fit
  • Logit as a classifier
  • How to choose the "best" model

This module includes an interactive Shiny application that visualizes a fitted logistic regression curve, its decision boundary, and accuracy measures.
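The three quantities mentioned above (fitted curve, decision boundary, accuracy) can be sketched in a few lines of plain Python. This is a hedged illustration, not the course's code: the helper `fit_logistic`, the gradient-descent settings, and the toy data are all assumptions. With one feature, the model is p(x) = sigmoid(b0 + b1 * x), and the 0.5 decision boundary sits at x = -b0 / b1.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Fit intercept b0 and slope b1 by gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        b0 -= lr * sum(errors) / n
        b1 -= lr * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1


# Toy data, made up for illustration: a centred feature with overlapping classes.
xs = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
boundary = -b0 / b1  # the x value where the predicted probability equals 0.5
predictions = [1 if sigmoid(b0 + b1 * x) >= 0.5 else 0 for x in xs]
accuracy = sum(p == y for p, y in zip(predictions, ys)) / len(ys)
print(boundary, accuracy)
```

Because the two classes overlap near zero, the two points closest to the boundary are misclassified and the accuracy is 0.75 rather than 1. Accuracy is only one measure of fit; others, such as sensitivity and specificity, are computed from the same predictions.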

Module 3: Regression-based methods

  • Linear Regression and all its friends
  • Selection of regressors
  • Penalization Methods
  • How to choose the best model?
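To make the idea of penalization concrete, here is a minimal sketch of ridge (L2) shrinkage for a single regressor without an intercept, written in plain Python. Ridge is used here as one example of a penalization method; which penalties the module actually covers is not stated above. The function name `ridge_slope` and the data are assumptions for this illustration. The penalized least-squares slope has the closed form sum(x*y) / (sum(x*x) + lam).

```python
def ridge_slope(xs, ys, lam):
    """Slope minimising sum((y - b*x)^2) + lam * b^2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Toy data, made up for illustration: y = 2 * x exactly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for lam in (0.0, 14.0, 1000.0):
    # lam = 0 recovers ordinary least squares; larger lam shrinks the slope toward 0.
    print(lam, ridge_slope(xs, ys, lam))
```

With `lam = 0` the slope is the least-squares value of 2.0, and with `lam = 14` (equal to the sum of squared `x`) it is halved to 1.0. Choosing `lam` is a model-selection problem in its own right.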

Module 4: Tree-Based Methods

  • Decision Trees: Construction and visualization
  • Selecting hyperparameters for trees
  • From trees to forest
  • Bagging & Feature sampling
  • Random forest

This module includes an interactive Shiny application that visualizes how the complexity parameter affects the complexity of a decision tree.
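The effect of a complexity parameter can be sketched with a tiny one-dimensional regression tree in plain Python. This is a simplified illustration under stated assumptions, not the application's code: a split is kept only if it reduces the sum of squared errors by at least `cp` times the root node's error, which is a simplified pre-pruning rule loosely modelled on the `cp` argument of R's `rpart`. Whether the course uses `rpart` is an assumption, and the function names and data are hypothetical.

```python
def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def count_leaves(points, cp, root_sse):
    """Grow a 1-D regression tree on (x, y) pairs sorted by x; return its leaf count."""
    ys = [y for _, y in points]
    parent = sse(ys)
    best_gain, best_i = 0.0, None
    for i in range(1, len(points)):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical x values
        gain = parent - sse(ys[:i]) - sse(ys[i:])
        if gain > best_gain:
            best_gain, best_i = gain, i
    # Stop if no split helps, or if the relative improvement is below cp.
    if best_i is None or root_sse <= 0 or best_gain / root_sse < cp:
        return 1
    left = count_leaves(points[:best_i], cp, root_sse)
    right = count_leaves(points[best_i:], cp, root_sse)
    return left + right


# Toy data, made up for illustration: a noisy step between x = 4 and x = 5.
data = list(zip(range(1, 9), [1.0, 1.2, 0.8, 1.0, 5.0, 5.2, 4.8, 5.0]))
root = sse([y for _, y in data])

leaves = {cp: count_leaves(data, cp, root) for cp in (0.5, 0.001, 0.0)}
print(leaves)  # {0.5: 2, 0.001: 4, 0.0: 8}
```

A large `cp` keeps only the one split that captures the step, while `cp = 0` grows the tree until every observation has its own leaf.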

Module 5: Advanced Methods

  • Support Vector Machines
  • K-means clustering
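As a brief illustration of K-means clustering, here is a minimal one-dimensional sketch of Lloyd's algorithm in plain Python. It is not from the course material: the function name `kmeans_1d`, the fixed starting centers, and the data are assumptions chosen so the example is deterministic. The algorithm alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Lloyd's algorithm in one dimension, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center (ties go to the first).
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: abs(p - centers[c]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster (keep it if empty).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, clusters


# Toy data, made up for illustration: two well-separated groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[1.0, 3.0])
print(centers)   # [2.0, 11.0]
print(clusters)  # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

Even from a poor start, with both initial centers inside the first group, the centers settle on the two group means after a few iterations. In general the result depends on the starting centers, so several random restarts are commonly used.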
