alxthm/rl-cheatsheet

Reinforcement Learning Cheat Sheet

Important concepts and algorithms in RL, all summarized in one place. A PDF version is also available here.

Contents

  1. Bandits: settings, exploration-exploitation, UCB, Thompson Sampling
  2. RL Framework: Markov Decision Process, Markov Property, Bellman Equations
  3. Dynamic Programming: Policy Evaluation, Policy Iteration, Value Iteration
  4. Value-Based
    1. Tabular environments: Tabular Q-learning, SARSA, TD-learning, eligibility traces
    2. Approximate Q-learning: DQN, prioritized experience replay, Double DQN, Rainbow, DRQN
  5. Policy Gradients
    1. On-Policy: REINFORCE, Actor-Critic (with compatible functions, GAE), A2C/A3C, TRPO, PPO
    2. Off-Policy: Policy gradient theorem, ACER, importance sampling
    3. Continuous Action Spaces: DDPG, Q-Prop

References

Contributing

Contributions are welcome! If you find a typo or an error, feel free to raise an issue.

If you would like to change the sources directly (e.g. add an algorithm or a new section), start by cloning the repository:

git clone https://github.com/alxthm/rl-cheatsheet.git

Work locally

Since all the sources and figures are included in the repo, you can make modifications and build the document locally. For this, you need a full TeX distribution (if you don't have one, you can install it here). You can then edit the LaTeX files with any editor or IDE (e.g. Visual Studio Code).

Work on Overleaf

If you would rather not install LaTeX, you can use Overleaf instead. Compress the rl-cheatsheet folder into a zip archive and upload it to Overleaf (New Project -> Upload Project).
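
As a rough sketch of the compression step, assuming the Info-ZIP `zip` command is installed and the clone lives in a folder named rl-cheatsheet, something like the following should work. The `make_overleaf_zip` helper is hypothetical and not part of the repository; leaving out the `.git` directory is a choice made here to keep the upload small, not a requirement stated by the project.

```shell
# Hypothetical helper (not part of the repository): zip a cloned folder for Overleaf.
make_overleaf_zip() {
  repo_dir="${1:-rl-cheatsheet}"
  if [ ! -d "$repo_dir" ]; then
    echo "Directory '$repo_dir' not found; run this from its parent folder." >&2
    return 1
  fi
  # Skip the .git history: the build only needs the sources and figures.
  zip -r -q "$repo_dir.zip" "$repo_dir" -x "$repo_dir/.git/*"
}

# Usage, from the directory that contains the clone:
#   make_overleaf_zip rl-cheatsheet
```

The resulting rl-cheatsheet.zip is the file to select in the Upload Project dialog.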