Implementation of a semi-Markov Decision Process (SMDP) with options (primitive actions and temporally extended actions toward a designated landmark), where the options have a restricted initiation set.
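As a rough illustration, here is a minimal sketch of the kind of option record this implies: a restricted initiation set, an internal policy, and a termination condition. All names (Option, initiation_set, primitive_option, ...) are hypothetical, not identifiers from this repository.

```python
# Hypothetical option record: initiation set, internal policy, termination.
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

State = Tuple[int, int]   # grid cell (row, col)
Action = int              # primitive action index

@dataclass
class Option:
    name: str
    initiation_set: Set[State]             # states where the option may start
    policy: Dict[State, Action]            # deterministic internal policy
    termination: Callable[[State], float]  # beta(s): probability of stopping

def primitive_option(action: Action, states: Set[State]) -> Option:
    """A primitive action viewed as a one-step option: available
    everywhere and terminating with probability 1 after one step."""
    return Option(
        name=f"primitive-{action}",
        initiation_set=states,
        policy={s: action for s in states},
        termination=lambda s: 1.0,
    )
```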
Repository contents:
- valueIteration = generic value iteration solver used throughout the Gao Lab (a sketch of the SMDP update appears at the end of this README)
- setUp = primitive and landmark option setup; the option space setup basics in setUp are used by landmarkOptionSetUp.py
- executive = transition, reward, and option space functions used in option value iteration (see the value iteration sketch below)
- tests = test files for option setup, value iteration, and the value iteration helper functions (transition, reward, optionSpace)
- main = generates improved policies (termination improvement for the policy produced in valueLearning; see the termination improvement sketch below)
- visualization = drawHeatMap.py and a screenshot of the resulting heat map, modeled after Figure 2 of "Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning" (Sutton, Precup, and Singh, 1999); a hypothetical rendering sketch appears at the end of this README
  - numbers = landmark number relating to the option
  - black arrow = primitive option
  - blue arrow = primitive option within the policy of the landmark option
*note: the initial setUp and the value iteration over semi-MDP options are both run with the generic value iteration solver
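A minimal sketch of the generic SMDP value iteration update, assuming executive-style model functions: reward(s, o) is the expected discounted reward accumulated while o runs, trans(s, o) maps each successor state to its probability with the discount already folded in, and option_space(s) lists the options whose initiation set contains s. All names and signatures are assumptions for illustration, not this repository's API.

```python
# Sketch of generic SMDP value iteration over options, assuming precomputed
# option models; the discount is folded into trans, as is standard for SMDPs.
from typing import Callable, Dict, Hashable, Iterable, List

State = Hashable
Option = Hashable

def smdp_value_iteration(
    states: List[State],
    option_space: Callable[[State], Iterable[Option]],
    reward: Callable[[State, Option], float],
    trans: Callable[[State, Option], Dict[State, float]],
    tol: float = 1e-8,
) -> Dict[State, float]:
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality update over the options available in s
            best = max(
                reward(s, o) + sum(p * V[s2] for s2, p in trans(s, o).items())
                for o in option_space(s)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```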
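The termination improvement in main follows the interruption idea from the paper above: an executing option is cut short in any state where continuing it is worth less than the greedy value of switching options. A hedged sketch, assuming continuation_value(s, o) = reward(s, o) + the expected discounted value of the states where o terminates:

```python
# Sketch of termination improvement (option interruption): stop option o
# early whenever the value of continuing it falls below the greedy value V(s).
from typing import Callable, Dict, Hashable

State = Hashable
Option = Hashable

def should_interrupt(
    s: State,
    o: Option,
    continuation_value: Callable[[State, Option], float],
    V: Dict[State, float],
) -> bool:
    """True if option o should be terminated early in state s."""
    return continuation_value(s, o) < V[s]
```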
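Finally, a hypothetical sketch of the kind of figure drawHeatMap.py produces: state values rendered as a heat map with option arrows overlaid in the legend colors above. Grid size, values, and arrows are placeholders, not output from this repository.

```python
# Placeholder heat map with policy arrows: black = primitive option,
# blue = primitive step inside a landmark option's policy.
import numpy as np
import matplotlib.pyplot as plt

values = np.random.rand(10, 10)  # stand-in for the converged value function
fig, ax = plt.subplots()
ax.imshow(values, cmap="hot")

# (row, col, d_row, d_col, color) arrows, invented for illustration
arrows = [(2, 3, 0, 1, "black"), (5, 5, -1, 0, "blue")]
for r, c, dr, dc, color in arrows:
    ax.annotate("", xy=(c + dc, r + dr), xytext=(c, r),
                arrowprops=dict(arrowstyle="->", color=color))
ax.set_title("State values with option arrows")
plt.show()
```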