This repo records my answers to all questions from the exercises of CS229 (Autumn 2017): http://cs229.stanford.edu/syllabus.html
I tried to record all details in Jupyter notebooks. If you see mistakes, please let me know.
As for reinforcement learning, I previously implemented value iteration, policy iteration, SARSA, and Q-learning in JavaScript for the gridworld at https://github.com/zyxue/rljs, with a web demo at https://rljs.herokuapp.com/.
You might also be interested in an earlier version of CS229: https://see.stanford.edu/Course/CS229.
This project is considered complete.