SC4/SM8 Advanced Topics in Statistical Machine Learning

Times: Fridays 1000-1100, 1400-1500, Hilary Term 2020.
Venue: Large Lecture Theatre LG.01, Department of Statistics, University of Oxford
Lecturer: Yee Whye Teh
Website: https://github.com/ywteh/advml2020

Announcements:

5/3: Razvan Pascanu (DeepMind) will be giving a guest lecture on optimisation and data-efficiency in deep learning.
5/3: Problem sheet 3 available now, and due 5/3 noon.
28/2: Lecture times moved to 2-4pm (back to back).
27/2: Arthur Gretton (Gatsby Unit, UCL) will be giving a guest lecture on GANs and MMD.
20/2: Problem sheet 2 available now, and due 20/2 noon.
13/2: Nicolas Heess and Leonard Hasenclever (DeepMind) will be giving a guest lecture, 1-3pm.
7/2: Direct link to lecture recordings.
7/2: Slides for weeks 1-3 and problem sheet 1 answers uploaded.
Part C/OMMS: due Thursday noon week 3. MSc: class Friday 3pm week 3.

Links:

Course Information

Aims and objectives:

Machine learning is widely used across the sciences, engineering and society to construct methods for identifying interesting patterns and predicting accurately from large datasets. This course introduces several widely used machine learning frameworks and describes their underpinning statistical principles and properties. The course studies both unsupervised and supervised learning, and covers several advanced and state-of-the-art topics in detail. It will also cover computational considerations of machine learning algorithms and how they can scale to large datasets.

Prerequisites:

A8 Probability and A9 Statistics.
Some material from this year's SB2.2 Statistical Machine Learning syllabus (PCA and the basics of clustering, mainly taught in the first three lectures of SB2.2, also in HT2019) will be used, but SB2.2 is not a prerequisite and background notes will be provided.

Synopsis:

Review of unsupervised and supervised learning.
Duality in convex optimization and support vector machines.
Kernel methods and reproducing kernel Hilbert spaces. Representer theorem. Representation of probabilities in RKHS.
Kernel PCA.
Deep learning. Representation learning. Neural networks and computation graphs. Automatic differentiation. Stochastic gradient descent.
Probabilistic machine learning: latent variable models, EM algorithm, mixtures, mixtures of experts, probabilistic PCA.
Variational inference, deep generative models, variational auto-encoders.
Bayesian learning: Laplace approximation, variational Bayes, Latent Dirichlet Allocation.
Collaborative filtering models, probabilistic matrix factorization.
Gaussian processes for regression and classification. Bayesian optimization.

Textbooks and Background Reading

Bishop, Pattern Recognition and Machine Learning, Springer.
Murphy, Machine Learning: A Probabilistic Perspective, MIT Press.
Hastie, Tibshirani and Friedman, The Elements of Statistical Learning, Springer. ebook
Shalev-Shwartz and Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press.
Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press. website

Background Review Aids:

Matrix and Gaussian identities - short useful reference for machine learning.
Linear Algebra Review and Reference - useful selection for machine learning.
Video reviews on Linear Algebra by Zico Kolter
Video reviews on Multivariate Calculus and SVD by Aaditya Ramdas
The Matrix Cookbook - extensive reference.

Software

R

Installation
[Introduction](http://cran.r-project.org/doc/manuals/R-intro.pdf)

Python

Jupyter notebooks

website

Knowledge of Python is not required for this course, but some descriptive examples in lectures may be done in Python. Students interested in further Python training are referred to the free University IT online courses.

Special Guest Lectures:

There will be a series of special guest lectures on (even more) advanced topics in machine learning. These will be 1.5-2 hours in length, with the first half a more pedagogical introduction to an area and the second half a research seminar. These are not examinable.

Some Thursdays in LG.01
Feb 13 1300-1500 week 4: Nicolas Heess and Leonard Hasenclever (DeepMind) on reinforcement learning and control
Feb 27 1330-1500 week 6: Arthur Gretton (Gatsby Unit, UCL)
Mar 5 week 7: Razvan Pascanu (DeepMind) on looking at data efficiency from the learning algorithm perspective
Mar 6 Friday 1530-1630 week 7: Max Welling (Amsterdam, Qualcomm) (departmental distinguished seminar)
Mar 12 1300-1500 week 8: Silvia Chiappa (DeepMind)

Course Materials

The course materials will appear here before the course starts. They consist of notes, slides, and Jupyter notebooks. Notes may not be exhaustive and should be used in conjunction with the slides. All materials may be updated during the course and are thus best read on screen. Please email me any typos or corrections.

Notes:

Notes up to kernels. Updated 23/1/2020.

Slides:

slides01: Admin, PCA, K-means, empirical risk minimisation
slides02: Convex duality, SVMs, kernels

Problem Sheets:

sheet1 due Thursday noon week 3

Other Resources:

Information:

Information for Part C and OMMS Students

weblearn intercollegiate class signup
minerva intercollegiate class list
Problem sheets are due Thursday noon weeks 3,5,7,TT1
Classes are on weeks 4,6,8,TT1
Class tutors and TAs:

Information for MSc Statistical Science Students

Classes are Fridays 1500-1600 weeks 3,5,8,TT1 in LG.01

Information for DPhil and CDT students

Assessment will be via reproducibility challenge projects.
The aim is to reproduce recent ML conference papers.
Papers assigned in weeks 7,8.
4-page reports, open-source software repositories, and 20-minute presentations are due in early TT.
