Deep Reinforcement Learning (DRL) has produced agents that exhibit remarkably complex and intelligent behavior, most visibly in computer games such as Dota 2 and StarCraft II. An exciting direction for research is to develop digital agents that can eventually be deployed in physical robots, a task that OpenAI's Dactyl project demonstrates requires high-fidelity training environments. In this project, we designed a physically realistic simulation with high-dimensional sensory data sources and trained an agent in it using a refined deep Q-network (DQN).
train.py
- Main training entry point; currently trains agents with the refined DQN
- Command: "py train.py"
train_DQN.py
- Trains agents in the Unity environment with the refined DQN (core update sketched below)
- Command: "py train_DQN.py"
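For reference, the core update these DQN scripts perform looks roughly like the following minimal sketch. It is an illustration, not the repo's actual code; names such as q_net, target_net, and the batch tensors are assumptions.

    import torch
    import torch.nn.functional as F

    def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
        # batch: tensors sampled from the replay buffer (illustrative names)
        states, actions, rewards, next_states, dones = batch
        # Q(s, a) for the actions that were actually taken
        q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
        # TD target: r + gamma * max_a' Q_target(s', a'), zeroed at episode ends
        with torch.no_grad():
            next_q = target_net(next_states).max(dim=1).values
            targets = rewards + gamma * next_q * (1 - dones.float())
        loss = F.mse_loss(q_values, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()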
train_DoubleDQN.py
- Trains agents in the Unity environment with Double DQN (target computation sketched below)
- Command: "py train_DoubleDQN.py"
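Double DQN changes only how the TD target is formed: the online network selects the next action and the target network evaluates it, which reduces the overestimation bias of standard DQN. A sketch of just that target computation, using the same assumed names as above:

    with torch.no_grad():
        # Online network picks the greedy action for s'
        best_actions = q_net(next_states).argmax(dim=1, keepdim=True)
        # Target network evaluates it, decoupling selection from evaluation
        next_q = target_net(next_states).gather(1, best_actions).squeeze(1)
        targets = rewards + gamma * next_q * (1 - dones.float())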
train_vanilla.py
- Trains agents in the Unity environment with vanilla Q-learning (update rule sketched below)
- Command: "py train_vanilla.py"
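Vanilla Q-learning here refers to the classic tabular update rule. A minimal sketch, assuming discrete, hashable states and a fixed number of actions (all names are illustrative):

    from collections import defaultdict
    import numpy as np

    # Hypothetical table for an environment with 4 discrete actions
    Q = defaultdict(lambda: np.zeros(4))

    def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.99):
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        td_target = reward + gamma * np.max(Q[next_state])
        Q[state][action] += alpha * (td_target - Q[state][action])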
utils.py
- Utility module containing the helper functions needed to train our agents
experience_replay.py
- Stores and samples experiences according to our prioritized replay buffer (simplified sketch below)
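For orientation, proportional prioritized replay (Schaul et al., 2015) can be sketched as below. This is a simplified list-based version; the repo's buffer may differ (e.g., it may use a sum tree for faster sampling), and all names and hyperparameters here are assumptions.

    import numpy as np

    class PrioritizedReplayBuffer:
        def __init__(self, capacity, alpha=0.6):
            self.capacity = capacity
            self.alpha = alpha  # how strongly priorities skew sampling
            self.data, self.priorities = [], []
            self.pos = 0

        def add(self, transition):
            # New experiences get max priority so each is sampled at least once
            p = max(self.priorities, default=1.0)
            if len(self.data) < self.capacity:
                self.data.append(transition)
                self.priorities.append(p)
            else:
                self.data[self.pos] = transition
                self.priorities[self.pos] = p
            self.pos = (self.pos + 1) % self.capacity

        def sample(self, batch_size, beta=0.4):
            probs = np.array(self.priorities) ** self.alpha
            probs /= probs.sum()
            idxs = np.random.choice(len(self.data), batch_size, p=probs)
            # Importance-sampling weights correct the non-uniform sampling bias
            weights = (len(self.data) * probs[idxs]) ** (-beta)
            weights /= weights.max()
            return [self.data[i] for i in idxs], idxs, weights

        def update_priorities(self, idxs, td_errors, eps=1e-5):
            for i, err in zip(idxs, td_errors):
                self.priorities[i] = abs(err) + eps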
networkparams.pt
- Configuration file for defining network parameters and topology
networks.py
- PyTorch network definitions built from the parameters in networkparams.pt (illustrative sketch below)
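As an illustration of the pattern, a Q-network built from a parameter dictionary might look like the sketch below. The actual contents of networkparams.pt are not documented here, so the keys and the torch.load usage are assumptions.

    import torch
    import torch.nn as nn

    def build_q_network(params):
        # params is assumed to look like:
        # {"obs_dim": 64, "hidden": [128, 128], "n_actions": 4}
        layers, in_dim = [], params["obs_dim"]
        for h in params["hidden"]:
            layers += [nn.Linear(in_dim, h), nn.ReLU()]
            in_dim = h
        layers.append(nn.Linear(in_dim, params["n_actions"]))
        return nn.Sequential(*layers)

    # Hypothetical usage, assuming the file holds a plain dict saved with torch.save
    params = torch.load("networkparams.pt")
    q_net = build_q_network(params)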
PolicyGradients.ipynb
- Attempt at using policy gradients as an alternative learning mechanism for training intelligent agents in our environment (core loss sketched below)
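For contrast with the value-based scripts above, the heart of a REINFORCE-style policy gradient update is sketched below; the notebook's actual variant may differ, and all names are illustrative.

    import torch

    def reinforce_loss(log_probs, rewards, gamma=0.99):
        # log_probs: log pi(a_t|s_t) for one episode; rewards: r_t per step
        returns, G = [], 0.0
        for r in reversed(rewards):
            G = r + gamma * G
            returns.insert(0, G)
        returns = torch.tensor(returns)
        # Normalizing returns is a common variance-reduction trick
        returns = (returns - returns.mean()) / (returns.std() + 1e-8)
        # Minimizing -sum(log_prob * G_t) ascends the expected return
        return -(torch.stack(log_probs) * returns).sum()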
QLearning.ipynb
- Generalized Q-learning for solving Markov decision process (MDP) problems
QLearningAgentAndy.py
- Implementation of generic Q-learning on basic example environments from Unity
\runs
- TensorBoard logs from training runs
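- Viewing command (assuming TensorBoard is installed): "tensorboard --logdir runs"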
\environments
- Built Unity environments
Future work
- Accommodate continuous action spaces
- Accommodate multi-dimensional action spaces
- Accommodate visual observations (first-person and bird's-eye)
- Experiment with more robust neural network architectures
- Experiment with different optimizers