This is the official code that produced the results in the 2021 IEEE TNNLS paper "Hierarchical Reinforcement Learning with Universal Policies for Multi-Step Robotic Manipulation". Link to video demo. Feel free to experiment with the code and raise issues.
@ARTICLE{9366328,
author={X. {Yang} and Z. {Ji} and J. {Wu} and Y. -K. {Lai} and C. {Wei} and G. {Liu} and R. {Setchi}},
journal={IEEE Transactions on Neural Networks and Learning Systems},
title={Hierarchical Reinforcement Learning With Universal Policies for Multistep Robotic Manipulation},
year={2021},
volume={},
number={},
pages={1-15},
doi={10.1109/TNNLS.2021.3059912}}
- Tested on Ubuntu 16.04 and 18.04
- Later Ubuntu versions should work as well.
- The project was developed and tested on Linux; it has not been tested on Windows.
- Python 3
- Make sure you `pip install gym==0.21.0`. The multigoal gym code has been removed from later releases.
- Mujoco and mujoco-py (150, 200, & 210) should all work properly
- Pytorch
- Others
python -m pip install -r requirements.txt
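Because the multigoal environments are only available in the pinned gym release, it can help to confirm the installed version before running anything. The following is a minimal sketch and not part of the repository: the helper name `check_pin` is hypothetical, and it relies only on the standard-library `importlib.metadata` (Python 3.8 or later).

```python
from importlib import metadata

REQUIRED_GYM = "0.21.0"  # version pinned in the requirements above


def check_pin(package: str, required: str) -> str:
    """Return a short status message for `package` against a pinned version.

    Hypothetical helper for illustration; not part of the repository.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed; expected {required}"
    if installed == required:
        return f"{package} {installed} matches the pin"
    return f"{package} {installed} found; expected {required}"


if __name__ == "__main__":
    print(check_pin("gym", REQUIRED_GYM))
```

If the message reports a different version, reinstalling with the pinned version shown above should restore the multigoal environments.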
- Clone the repository to wherever you like.
- In a terminal, run `export PYTHONPATH=$PYTHONPATH:$PATH_OF_THE_PROJECT_ROOT`. Replace `$PATH_OF_THE_PROJECT_ROOT` with something like `/home/someone/UOF-paper-code`.
- (Optional) Activate your conda environment if desired.
- From the project root:
- Evaluate the pre-trained UOF agent
python run_uof.py --task-id 0
- Evaluate the pre-trained HAC agent
python run_hac.py --task-id 0
- Train your own UOF agent
python run_uof.py --task-id 0 --train
- Train your own HAC agent
python run_hac.py --task-id 0 --train
- If you know what you are doing, just modify the arguments in the script as you like.
- More algorithm-related parameters can be found in the config files.
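The commands above can also be scripted, for example to evaluate several task ids in a row. The sketch below builds the same command lines and sets `PYTHONPATH` for the child process. The helper names `build_command`, `build_env` and `run` are hypothetical, the project path is a placeholder, and only the `--task-id` and `--train` flags shown above are used.

```python
import os
import subprocess
import sys


def build_command(agent: str, task_id: int, train: bool = False) -> list:
    """Build the argument list for run_uof.py or run_hac.py."""
    if agent not in ("uof", "hac"):
        raise ValueError("agent must be 'uof' or 'hac'")
    cmd = [sys.executable, f"run_{agent}.py", "--task-id", str(task_id)]
    if train:
        cmd.append("--train")
    return cmd


def build_env(project_root: str, base_env=None) -> dict:
    """Copy the environment and append the project root to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    if existing:
        env["PYTHONPATH"] = existing + os.pathsep + project_root
    else:
        env["PYTHONPATH"] = project_root
    return env


def run(agent: str, task_id: int, project_root: str, train: bool = False):
    """Launch the chosen script from the project root and wait for it."""
    return subprocess.run(
        build_command(agent, task_id, train),
        cwd=project_root,
        env=build_env(project_root),
        check=True,
    )


# Example (placeholder path): evaluate the pre-trained UOF agent on task 0.
# run("uof", 0, "/home/someone/UOF-paper-code")
```

Passing the environment explicitly keeps the `PYTHONPATH` change local to the launched process, so the calling shell is left untouched.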
This table gives the correspondence between the pre-trained policies provided in this repo and the performance reported in the paper figures. The provided UOF policies were trained with AAES and a demonstration proportion of 0.75.
| Task id | Paper result (red curves) |
|---|---|
| 0 | Fig. 4a, 5a, 9a, 9c, 10a |
| 1 | Fig. 5b, 6, 7, 8, 9b, 9d, 10b |
| 2 | Fig. 10c |
| 3 | Fig. 10d |
| 4 | Section VII-G |
| 5 | Section VII-G |
| 6 | Section VII-G |
| 7 | Section VII-G |
run_uof.py
For `run_hac.py`, ignore the `--multi-inter` and `--no-aaes` arguments.
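When one argument list is reused for both scripts, the UOF-only switches can be dropped before calling `run_hac.py`. This is a minimal sketch, assuming that `--multi-inter` and `--no-aaes` are boolean switches that take no value, which the text above does not state. The helper `strip_uof_only` is hypothetical and not part of the repository.

```python
# Flags that the note above says to ignore for run_hac.py.
UOF_ONLY_FLAGS = {"--multi-inter", "--no-aaes"}


def strip_uof_only(args):
    """Return a copy of `args` without the UOF-only switches.

    Assumes both flags are value-less switches (an assumption, see above).
    """
    return [a for a in args if a not in UOF_ONLY_FLAGS]


shared = ["--task-id", "0", "--no-aaes", "--train", "--multi-inter"]
hac_args = strip_uof_only(shared)
```

If either flag turns out to take a value, the value would also need to be removed, so check the argument parser in the scripts first.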