Reproducing results from the Hindsight Experience Replay Paper in PyTorch
- implemented the bit flip environment
- Task: given a starting string of $n$ bits and a target string of the same length, flip bits in the starting string until it matches the target string. The number of flips allowed is equal to the number of bits in the string, $n$
- implemented a deep Q-network (DQN) with one hidden layer of 256 nodes
- implemented hindsight-experience-replay with goal selection rule of $s_T$, i.e. the new goal is the last state achieved in the sequence of flips (in the file dqn-her.ipynb)
- implemented a baseline DQN network without hindsight-experience-replay (in the file dqn.ipynb)
- Compared success rate against number of training episodes for bit lengths $n = 6, 7, 8$ for both DQN and DQN+HER (higher bit lengths were not run locally because no GPU was available)
- Higher bit length using Google Colab notebook
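
The bit-flip task and the $s_T$ goal selection rule described above can be sketched in plain Python. This is a minimal illustration, not the code from the notebooks: the names `BitFlipEnv`, `sparse_reward`, `relabel_final`, and `rollout_random` are hypothetical, the transition layout is an assumption, and the reward (-1 until the goal is reached, 0 on success) is assumed to follow the sparse reward used in the HER paper. The Q-network itself is omitted, so a random policy stands in for the agent.

```python
import random
from typing import List, Tuple

Bits = Tuple[int, ...]
# Assumed transition layout: (state, action, reward, next_state, goal, done)
Transition = Tuple[Bits, int, float, Bits, Bits, bool]


def sparse_reward(state: Bits, goal: Bits) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if state == goal else -1.0


class BitFlipEnv:
    """Bit-flip task: reach `goal` from a random start in at most n flips."""

    def __init__(self, n: int, seed: int = 0) -> None:
        self.n = n
        self.rng = random.Random(seed)
        self.state: Bits = ()
        self.goal: Bits = ()
        self.steps = 0

    def reset(self) -> Tuple[Bits, Bits]:
        """Sample a fresh start string and target string of n bits each."""
        self.state = tuple(self.rng.randint(0, 1) for _ in range(self.n))
        self.goal = tuple(self.rng.randint(0, 1) for _ in range(self.n))
        self.steps = 0
        return self.state, self.goal

    def step(self, action: int) -> Tuple[Bits, float, bool]:
        """Flip the bit at index `action`; the episode ends on success or after n flips."""
        bits = list(self.state)
        bits[action] ^= 1
        self.state = tuple(bits)
        self.steps += 1
        reward = sparse_reward(self.state, self.goal)
        done = reward == 0.0 or self.steps >= self.n
        return self.state, reward, done


def relabel_final(episode: List[Transition]) -> List[Transition]:
    """HER with the s_T rule: replay the episode as if its last state were the goal."""
    new_goal = episode[-1][3]  # s_T, the last state achieved
    relabeled = []
    for state, action, _, next_state, _, _ in episode:
        reward = sparse_reward(next_state, new_goal)
        relabeled.append((state, action, reward, next_state, new_goal, reward == 0.0))
    return relabeled


def rollout_random(env: BitFlipEnv) -> List[Transition]:
    """Collect one episode with a uniformly random policy (stand-in for the DQN)."""
    state, goal = env.reset()
    episode: List[Transition] = []
    done = False
    while not done:
        action = env.rng.randrange(env.n)
        next_state, reward, done = env.step(action)
        episode.append((state, action, reward, next_state, goal, done))
        state = next_state
    return episode


if __name__ == "__main__":
    env = BitFlipEnv(n=6, seed=1)
    episode = rollout_random(env)
    hindsight = relabel_final(episode)
    # Both the original and the relabeled transitions would go into the replay buffer.
    print(len(episode), "original transitions,", len(hindsight), "hindsight transitions")
```

Under these assumptions, the last relabeled transition always carries reward 0, so every episode contributes at least one successful example to the replay buffer even when the original goal was never reached.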