This project was created to run reinforcement learning experiments in gym environments. The idea is to build a universal framework that can be used to run the games as well as to conveniently test and compare different RL algorithms. It currently works with Atari environments from gym and uses raw pixel input to predict discrete actions as output.



Download and install Anaconda. Run:

conda create -n ml-games python=3.6 anaconda 
conda activate ml-games 
pip install -U numpy
pip install tensorflow
pip install gym
pip install --no-index -f atari_py


On Debian/Ubuntu-based Linux, download and install Anaconda, then run:

sudo apt install -y python3-dev zlib1g-dev libjpeg-dev cmake swig python-pyglet python3-opengl libboost-all-dev libsdl2-dev libosmesa6-dev patchelf ffmpeg xvfb
conda create -n ml-games python=3.6 anaconda 
source activate ml-games 
pip install -U numpy
pip install tensorflow
pip install 'gym[atari]'


Clone the project and activate conda env:

git clone
conda activate ml-games 
cd ml-games

Run the selected game with the chosen model and number of games. Examples:

python -game Breakout-v0 -model RandomModel
python -game Pong-v0 -model PolicyGradientsModel -n_games 20000

Game statistics and recordings will be saved in images/ and videos/.
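
The flags in the examples above (-game, -model, -n_games) could be parsed along these lines. This is only a minimal sketch using the standard argparse module; the function name build_parser and all default values are assumptions and are not taken from the project itself.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the run examples."""
    parser = argparse.ArgumentParser(description="Run a gym game with a chosen model.")
    # Single-dash long options mirror the flags used in the examples.
    parser.add_argument("-game", default="Pong-v0",
                        help="gym environment id (default is an assumption)")
    parser.add_argument("-model", default="RandomModel",
                        help="model class name (default is an assumption)")
    parser.add_argument("-n_games", type=int, default=100,
                        help="number of games to play (default is an assumption)")
    return parser


# Parse an explicit argument list instead of sys.argv to keep the sketch testable.
args = build_parser().parse_args(
    ["-game", "Pong-v0", "-model", "PolicyGradientsModel", "-n_games", "20000"]
)
print(args.game, args.model, args.n_games)
```

Declaring -n_games with type=int means the value arrives as a number rather than a string, so it can be used directly as a loop bound.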


PolicyGradientsModel - Random moves vs. model after 10k games:

Repo overview and contribution

(Gamer) - runs the games, tracks statistics and gets actions from the model. At the selected frequency it plots game statistics to images/ and saves gameplay recordings to videos/.
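
The interaction between Gamer and a model can be pictured roughly as the loop below. This is a hedged sketch, not the project's actual code: StubEnv, StubModel and play_games are hypothetical names, and the environment is a tiny stand-in that imitates the classic gym step() return value (observation, reward, done, info) so that the example runs without gym installed. The model method names are the ones listed for RandomModel in this overview.

```python
import random


class StubEnv:
    """Hypothetical stand-in for a gym environment (classic 4-tuple step API)."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]  # placeholder observation

    def step(self, action):
        self.t += 1
        observation = [float(self.t)]
        reward = 1.0 if action == 0 else 0.0  # arbitrary toy reward
        done = self.t >= self.episode_length
        return observation, reward, done, {}


class StubModel:
    """Hypothetical model that picks random actions and learns nothing."""

    def __init__(self, n_actions):
        self.n_actions = n_actions

    def predict_action(self, observation):
        return random.randrange(self.n_actions)

    def get_step_results(self, observation, reward, done, info):
        pass


def play_games(env, model, n_games):
    """Play n_games episodes and return the total reward of each one."""
    scores = []
    for _ in range(n_games):
        observation = env.reset()
        done, total = False, 0.0
        while not done:
            action = model.predict_action(observation)
            observation, reward, done, info = env.step(action)
            # Feed the step outcome back so a learning model can use it.
            model.get_step_results(observation, reward, done, info)
            total += reward
        scores.append(total)
    return scores


print(play_games(StubEnv(), StubModel(n_actions=4), n_games=3))
```

The per-game totals returned here are the kind of statistic that could then be plotted at the selected frequency.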

models/ (RandomModel) - baseline model that always makes random moves. Created to test that everything works; contains only the mandatory methods:

  • predict_action(observation) - receives the current game state from Gamer and returns the predicted action.
  • get_step_results(observation, reward, done, info) - receives the results after the action step is done.
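
A model that exposes only these two mandatory methods might look like the sketch below. The method names and their parameters come from the list above; the class name MyRandomModel, the constructor arguments and the internal counters are assumptions made for illustration.

```python
import random


class MyRandomModel:
    """Hypothetical baseline model with only the two mandatory methods."""

    def __init__(self, n_actions, seed=None):
        # n_actions and seed are assumed constructor arguments.
        self.n_actions = n_actions
        self.rng = random.Random(seed)
        self.total_reward = 0.0
        self.games_finished = 0

    def predict_action(self, observation):
        # A random baseline ignores the observation entirely.
        return self.rng.randrange(self.n_actions)

    def get_step_results(self, observation, reward, done, info):
        # Nothing to learn here; just keep simple counters.
        self.total_reward += reward
        if done:
            self.games_finished += 1


model = MyRandomModel(n_actions=6, seed=0)
action = model.predict_action(observation=None)
model.get_step_results(observation=None, reward=1.0, done=True, info={})
print(action, model.total_reward, model.games_finished)
```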

models/ (PolicyGradientsModel) - RL model that predicts the best action from the current observation and learns from its own experience. Based on a simple neural network and the Policy Gradients approach. Contains the same methods as RandomModel and many more.

Extracts command-line arguments, loads and initializes objects and runs the project according to user preferences.
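
The text does not show how PolicyGradientsModel turns game rewards into a learning signal. One common ingredient of policy gradient methods is the discounted return, where each step is credited with its own reward plus a decayed share of the rewards that follow. The sketch below illustrates that idea only; the function name discount_rewards and the default gamma are assumptions, not the project's implementation.

```python
def discount_rewards(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    discounted = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        discounted[t] = running
    return discounted


print(discount_rewards([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```

In a typical policy gradient update these returns weight the log-probabilities of the actions that were taken, so that actions followed by higher returns become more likely.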

New models can be created and added to models/. They should contain the same methods as RandomModel, and in order to run they should be added to