A deep reinforcement learning agent that catches yellow bananas and avoids blue ones in a large square world simulated with Unity ML-Agents.
This project is my solution to the navigation project of Udacity's Deep Reinforcement Learning Nanodegree.
The agent is a monkey moving in a 2D arena with blue and yellow bananas scattered on the floor. Each time the agent hits a banana, it is rewarded as follows:
- for the yellow bananas it receives a reward of +1
- for the blue bananas it receives a reward of -1
The state space has 37 continuous dimensions, including:
- the agent's velocity
- ray-based perception of objects around the agent's forward direction
The action space has 1 discrete dimension; the possible actions are:
- 0: move forward
- 1: move backward
- 2: turn left
- 3: turn right
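As a small illustration of the action encoding, the sketch below maps the four action ids to readable names and samples one at random. The names `Action` and `random_action` are hypothetical and do not come from this repository; only the ids and their meanings are taken from the list above.

```python
import random
from enum import IntEnum


class Action(IntEnum):
    """The four discrete actions, keyed by their integer id (names are illustrative)."""
    FORWARD = 0
    BACKWARD = 1
    LEFT = 2
    RIGHT = 3


def random_action(rng: random.Random) -> Action:
    """Pick one of the four actions uniformly at random."""
    return Action(rng.randrange(len(Action)))


rng = random.Random(0)  # seeded only so the sketch is reproducible
action = random_action(rng)
print(action.name, int(action))
```

Because `Action` is an `IntEnum`, `int(action)` gives the plain integer id that a discrete-action environment would expect.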
The task is episodic.
The task is considered solved when the agent achieves an average score of at least +13 over 100 consecutive episodes.
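One way to check this criterion during training is to keep a sliding window of the most recent episode scores. The sketch below assumes the criterion means the mean score over the last 100 episodes; `first_solved_episode` is a hypothetical helper, not a function from this repository.

```python
from collections import deque
from typing import Iterable, Optional


def first_solved_episode(scores: Iterable[float],
                         target: float = 13.0,
                         window: int = 100) -> Optional[int]:
    """Return the 1-based episode at which the mean of the last `window`
    scores first reaches `target`, or None if that never happens."""
    recent = deque(maxlen=window)  # automatically drops the oldest score
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None


# Example: 50 poor episodes followed by 100 good ones.
print(first_solved_episode([0.0] * 50 + [14.0] * 100))
```

The check only fires once the window is full, so a lucky streak in the first few episodes cannot count as solving the task.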
- Download the pre-compiled Unity environment for your platform: Linux, Mac OSX, Windows (32-bit), or Windows (64-bit).
- Decompress the archive at your preferred location (e.g. inside your working copy of this repository).
- Open getting-started.ipynb. This notebook installs the dependencies and explores the environment, concluding with a demonstration of an agent that chooses actions randomly.
- Follow the instructions in the notebook. You will need to specify the path to the environment executable that you downloaded at the beginning.
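A wrong path to the environment executable is an easy mistake at this step, so it can help to validate it before handing it to the notebook. The following is a minimal sketch under that assumption; `resolve_env_path` is a hypothetical helper, and the path you pass in depends on where you decompressed the archive.

```python
from pathlib import Path


def resolve_env_path(raw_path: str) -> str:
    """Expand `~`, make the path absolute, and fail early if nothing is there.

    `exists()` is used rather than `is_file()` because on some platforms the
    environment may be a directory bundle rather than a single file.
    """
    path = Path(raw_path).expanduser().resolve()
    if not path.exists():
        raise FileNotFoundError(f"Unity environment not found at {path}")
    return str(path)


# Usage: replace "." with the location of the executable you downloaded.
print(resolve_env_path("."))
```

The returned string is what you would then supply wherever the notebook asks for the environment path.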
The code is organized as follows (in hierarchical order from abstract to detailed):
- Report.ipynb: notebook illustrating the results of this project.
- deep_monkey.py: includes the high-level functions used in the notebook for training, plotting results and saving model checkpoints.
- agent.py: includes the classes that model the agent and its dependencies. Implements a high-level Agent which generalizes each variant of the original DQN algorithm.
- model.py: includes the neural network class, implemented using PyTorch, which the agent uses to approximate the Q-function.
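To illustrate how an agent can turn Q-function estimates into one of the four actions, here is a minimal epsilon-greedy sketch in plain Python. Epsilon-greedy is the exploration strategy commonly paired with DQN, but this README does not state which strategy agent.py uses, so treat both the strategy and the name `epsilon_greedy` as assumptions; the Q-values below are made up for the example.

```python
import random
from typing import Sequence


def epsilon_greedy(q_values: Sequence[float],
                   epsilon: float,
                   rng: random.Random) -> int:
    """Return a random action id with probability `epsilon`,
    otherwise the id of the action with the highest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    # exploit: index of the largest estimated action value
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-values for the actions 0..3 in some state.
q = [0.1, 0.5, 0.2, 0.0]
print(epsilon_greedy(q, epsilon=0.0, rng=random.Random(0)))
```

With `epsilon=0.0` the choice is purely greedy; in training, epsilon is typically started high and decayed so the agent explores early and exploits later.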
A working Python 3 environment is required. You can easily set one up by installing [Anaconda](https://www.anaconda.com/download/).
If you are using Anaconda, it is suggested to create a new environment as follows:
conda create --name deepmonkey python=3.6
Activate the environment:
source activate deepmonkey
Start the Jupyter server:
jupyter notebook --no-browser --ip 127.0.0.1 --port 8888 --port-retries=0