Simple implementation and comparison of three reinforcement learning models.

Phrungck/reinforcement-learning-models

Reinforcement Learning Algorithms (2021 coding-style)

Implements Q-Learning, SARSA, and the Cross-Entropy Method using numpy and torch, and compares their performance on frozenlake-deterministic, frozenlake-stochastic, and cliffwalking.
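
As a rough illustration of the Cross-Entropy Method named above, the sketch below shows its core loop step: keep the best-scoring episodes (the "elite" set) and refit the policy to the actions those episodes took. This is not the repository's code. The repository lists torch as a dependency, presumably for this part, whereas the sketch swaps in a tabular, count-based policy so it runs with the standard library alone. The names `select_elite`, `refit_policy`, `percentile`, and `smoothing` are hypothetical.

```python
def select_elite(episodes, percentile=70):
    """Keep episodes whose total reward reaches the given percentile.

    `episodes` is a list of (total_reward, steps) pairs, where `steps`
    is a list of (state, action) pairs. This layout is an assumption.
    """
    rewards = sorted(total for total, _ in episodes)
    # Nearest-rank style threshold: index into the sorted rewards.
    idx = min(len(rewards) - 1, int(len(rewards) * percentile / 100))
    threshold = rewards[idx]
    elite = [steps for total, steps in episodes if total >= threshold]
    return elite, threshold


def refit_policy(elite, n_states, n_actions, smoothing=1.0):
    """Re-estimate action probabilities per state from elite episodes.

    `smoothing` is added to every count so unseen actions keep a
    non-zero probability (an assumption of this sketch).
    """
    counts = [[smoothing] * n_actions for _ in range(n_states)]
    for steps in elite:
        for state, action in steps:
            counts[state][action] += 1.0
    return [[c / sum(row) for c in row] for row in counts]
```

Because the threshold depends on whichever episodes happened to be sampled in a batch, the elite set can change considerably from one iteration to the next when few episodes succeed.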

Dependencies

  • OpenAI gym
  • matplotlib
  • numpy
  • collections (Python standard library)
  • torch
  • itertools (Python standard library)
  • plotting

Deterministic Frozenlake Results

[Figure: deterministic FrozenLake results]

Stochastic Frozenlake Results

[Figure: stochastic FrozenLake results]

Cliffwalking Results

[Figure: CliffWalking results]

Changing Parameters

[Figure: results after changing parameters]

In all results, SARSA and Q-Learning outperformed the Cross-Entropy Method on the CliffWalking environment. Changing the hyperparameters changed the results significantly. Notably, with a higher alpha (the learning rate), Q-Learning and SARSA exceeded the baseline results.

Increasing alpha while reducing gamma (the discount factor) produced nearly identical values for all variants of Q-Learning and SARSA. The Cross-Entropy Method, however, became more erratic in the process.
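
To make the roles of alpha and gamma concrete, here is a minimal sketch of the standard tabular update rules for Q-Learning and SARSA. It is not taken from the repository. The function names, the list-of-lists Q-table, and the `done` flag are assumptions chosen for illustration. Alpha scales how far each estimate moves toward its target, and gamma scales how much of the next state's value enters that target.

```python
def q_learning_update(Q, s, a, r, s_next, alpha, gamma, done=False):
    """Off-policy update: bootstrap from the best action in s_next."""
    # No bootstrapping from a terminal state.
    future = 0.0 if done else max(Q[s_next])
    target = r + gamma * future
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma, done=False):
    """On-policy update: bootstrap from the action actually taken next."""
    future = 0.0 if done else Q[s_next][a_next]
    target = r + gamma * future
    Q[s][a] += alpha * (target - Q[s][a])


# Two states, two actions; values are arbitrary illustration data.
Q_ql = [[0.0, 0.0], [1.0, 2.0]]
Q_sarsa = [[0.0, 0.0], [1.0, 2.0]]

q_learning_update(Q_ql, s=0, a=0, r=-1.0, s_next=1, alpha=0.5, gamma=0.9)
sarsa_update(Q_sarsa, s=0, a=0, r=-1.0, s_next=1, a_next=0, alpha=0.5, gamma=0.9)
```

The two rules differ only in the bootstrap term. Lowering gamma shrinks that term in both, which pulls the two targets toward the immediate reward and is one plausible reading of why the variants ended up with similar values.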
