danqiye1/bandit

Reinforcement Learning

This repository contains code implementations for a reinforcement learning course.

Multi-armed Bandits

This repository implements a set of algorithms to solve the multi-armed bandit problem:

  1. Epsilon Greedy (epsilon_greedy.py)
  2. Optimistic Initial Value (optimistic_initial_value.py)
  3. Upper Confidence Bound (ucb.py)
  4. Thompson Sampling (thompson.py)
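To make the list above concrete, here is a minimal sketch of the action-selection rule behind each algorithm. It is not taken from the repository's files: the function names (`epsilon_greedy`, `ucb1`, `thompson_bernoulli`), their signatures, the UCB1 exploration constant, and the Bernoulli-reward assumption for Thompson Sampling are all illustrative choices.

```python
import math
import random


def epsilon_greedy(estimates, epsilon, rng):
    """With probability epsilon explore uniformly; otherwise pick the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda a: estimates[a])


def ucb1(estimates, counts, t):
    """Pick the arm maximizing estimate + sqrt(2 ln t / n); untried arms go first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(estimates)),
        key=lambda a: estimates[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def thompson_bernoulli(successes, failures, rng):
    """Sample each arm's Beta(1 + s, 1 + f) posterior and pick the largest draw.

    Assumes rewards are 0 or 1 and a uniform Beta(1, 1) prior on each arm.
    """
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=lambda a: draws[a])
```

Optimistic Initial Value needs no separate rule in this sketch: it can be expressed as `epsilon_greedy(estimates, 0.0, rng)` with every entry of `estimates` initialized well above any achievable reward, so that each arm is tried until its estimate falls to a realistic level.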

In addition, two sample bandit interfaces are included as examples of how the algorithms (the agent) can interact with bandits (the environment).
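The agent-environment interaction can be sketched roughly as follows. This is a hypothetical interface, not the one defined in this repository: the class `BernoulliBandit`, its `pull` method and `n_arms` property, and the helper `run_epsilon_greedy` are names invented for illustration.

```python
import random


class BernoulliBandit:
    """Hypothetical environment: each arm pays 1.0 with a fixed hidden probability."""

    def __init__(self, probs, seed=None):
        self.probs = list(probs)
        self._rng = random.Random(seed)

    @property
    def n_arms(self):
        return len(self.probs)

    def pull(self, arm):
        """Return a reward of 1.0 or 0.0 for the chosen arm."""
        return 1.0 if self._rng.random() < self.probs[arm] else 0.0


def run_epsilon_greedy(bandit, steps, epsilon=0.1, seed=None):
    """Hypothetical agent loop: choose an arm, observe a reward, update the estimate."""
    rng = random.Random(seed)
    estimates = [0.0] * bandit.n_arms
    counts = [0] * bandit.n_arms
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(bandit.n_arms)  # explore
        else:
            arm = max(range(bandit.n_arms), key=lambda a: estimates[a])  # exploit
        reward = bandit.pull(arm)
        counts[arm] += 1
        # Incremental sample mean: new = old + (reward - old) / n
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return estimates, counts, total
```

A call such as `run_epsilon_greedy(BernoulliBandit([0.2, 0.8], seed=1), steps=2000)` returns the per-arm estimates, pull counts, and total reward. Keeping the environment behind a single `pull` method means any of the selection rules can be swapped in without touching the bandit.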
