PPO

Overview

This is the code for this video on YouTube by Siraj Raval. OpenAI Gym and TensorFlow are required as dependencies.
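
As a quick sanity check that both dependencies are installed, a minimal sketch along these lines (not part of this repo) creates a Gym environment and steps it with random actions. It assumes the classic pre-0.26 Gym API; newer Gym/Gymnasium versions return different values from reset() and step().

```python
# Minimal dependency check: not from this repo, just a sketch assuming the
# classic Gym API (reset() -> obs, step() -> obs, reward, done, info).
import gym
import tensorflow as tf

env = gym.make("CartPole-v1")          # any Gym environment will do
obs = env.reset()
for _ in range(100):
    action = env.action_space.sample() # random placeholder policy
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
print("Gym and TensorFlow", tf.__version__, "imported fine")
```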

A PPO implementation for OpenAI Gym environments, based on the one in Unity ML-Agents: https://github.com/Unity-Technologies/ml-agents
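
For reference, the core of PPO is the clipped surrogate objective from Schulman et al. The sketch below is a generic TensorFlow version of that loss, not the exact code in this repo or in ML-Agents; the function name and epsilon value are illustrative.

```python
import tensorflow as tf

def clipped_surrogate_loss(ratio, advantage, epsilon=0.2):
    """Generic PPO clipped objective (negated so an optimizer can minimize it).

    ratio     : pi_theta(a|s) / pi_theta_old(a|s) for the sampled actions
    advantage : advantage estimates A_t (e.g. from GAE)
    """
    unclipped = ratio * advantage
    clipped = tf.clip_by_value(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    return -tf.reduce_mean(tf.minimum(unclipped, clipped))
```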

Notable changes include:

  • Ability to continuously display progress with a non-stochastic (deterministic) policy during training
  • Works with OpenAI Gym environments
  • Option to record episodes
  • State normalization over a given number of frames
  • Frame skip
  • Faster reward discounting (see the sketch after this list)
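
The last two items are standard tricks; the helpers below are a rough sketch of what they typically look like, with names and details that are illustrative rather than copied from the repo. Frame skip repeats one action for several environment steps and sums the rewards, and the discounted return is computed in a single backward pass instead of re-summing the tail for every timestep.

```python
import numpy as np

def discount_rewards(rewards, gamma=0.99):
    """Discounted returns in one backward pass: O(n) instead of O(n^2)."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def frame_skip_step(env, action, skip=4):
    """Repeat `action` for `skip` frames, summing rewards (classic Gym API)."""
    total_reward, done, info = 0.0, False, {}
    for _ in range(skip):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done, info
```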

Credits

Credits for this code go to embersarc. I've merely created a wrapper to get people started.