GitHub - IanYangChina/UOF-paper-code: Official implementation for the UOF paper (algorithm & environment)

Official implementation for the UOF paper (algorithm & environment)

This repository contains the official code that produced the results in the 2021 IEEE TNNLS paper "Hierarchical Reinforcement Learning with Universal Policies for Multi-Step Robotic Manipulation". A video demo is linked from the repository page. Feel free to experiment with the code and raise issues.

If you use our code, please consider citing our paper as follows:

@ARTICLE{9366328,
  author={X. {Yang} and Z. {Ji} and J. {Wu} and Y. -K. {Lai} and C. {Wei} and G. {Liu} and R. {Setchi}},
  journal={IEEE Transactions on Neural Networks and Learning Systems}, 
  title={Hierarchical Reinforcement Learning With Universal Policies for Multistep Robotic Manipulation}, 
  year={2021},
  volume={},
  number={},
  pages={1-15},
  doi={10.1109/TNNLS.2021.3059912}}

Main Dependencies:

- Ubuntu 16.04 and 18.04 were tested.
  - Later Ubuntu versions should work as well.
  - The project was developed and tested on Linux; its behaviour on Windows is unknown.
- Python 3
- gym 0.21.0: make sure you run `pip install gym==0.21.0`. The multi-goal gym code was removed from later releases.
- MuJoCo and mujoco-py (versions 150, 200, and 210 should all work)
- PyTorch
- Others: `python -m pip install -r requirements.txt`
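
As a quick sanity check before running anything, the gym pin from the dependency list above can be verified from Python. The sketch below is not part of the repository: the helper names `installed_version` and `meets_pin` are made up for illustration, and it assumes Python 3.8+ so that `importlib.metadata` is available.

```python
from importlib import metadata
from typing import Optional

REQUIRED_GYM = "0.21.0"  # version pinned in the dependency list above


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_pin(installed: Optional[str], required: str = REQUIRED_GYM) -> bool:
    """True only when the installed version string equals the pinned one."""
    return installed is not None and installed == required


if __name__ == "__main__":
    found = installed_version("gym")
    if meets_pin(found):
        print(f"gym {found} matches the pinned version")
    else:
        print(f"gym is {found!r}; expected {REQUIRED_GYM} "
              f"(pip install gym=={REQUIRED_GYM})")
```

An exact string comparison is used on purpose, since the README asks for one specific release rather than a minimum version.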

Get started:

- Clone the repository to wherever you like.
- In a terminal, run `export PYTHONPATH=$PYTHONPATH:$PATH_OF_THE_PROJECT_ROOT`. Replace `$PATH_OF_THE_PROJECT_ROOT` with something like `/home/someone/UOF-paper-code`.
- (Optional) Activate your conda environment if desired.
- From the project root:
  - Evaluate the pre-trained UOF agent: `python run_uof.py --task-id 0`
  - Evaluate the pre-trained HAC agent: `python run_hac.py --task-id 0`
  - Train your own UOF agent: `python run_uof.py --task-id 0 --train`
  - Train your own HAC agent: `python run_hac.py --task-id 0 --train`
- If you know what you are doing, just modify the arguments in the script as you like.
- More algorithm-related parameters can be found in the config files.
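
The steps above can also be scripted, for example to launch several tasks in a row. The following is a minimal sketch, not something shipped with the repository: `env_with_project_root` and `build_command` are hypothetical helper names, and only the script names and the `--task-id` and `--train` flags come from this README.

```python
import os
import sys
from typing import Dict, List, Optional


def env_with_project_root(project_root: str,
                          base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and append project_root to PYTHONPATH (mirrors the export step)."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = f"{existing}{os.pathsep}{project_root}" if existing else project_root
    return env


def build_command(algorithm: str, task_id: int, train: bool = False) -> List[str]:
    """Build the argument list for run_uof.py or run_hac.py."""
    if algorithm not in ("uof", "hac"):
        raise ValueError("algorithm must be 'uof' or 'hac'")
    cmd = [sys.executable, f"run_{algorithm}.py", "--task-id", str(task_id)]
    if train:
        cmd.append("--train")  # without this flag the pre-trained agent is evaluated
    return cmd


if __name__ == "__main__":
    # Only prints the commands; nothing is launched here.
    for algo in ("uof", "hac"):
        print(" ".join(build_command(algo, 0)))
        print(" ".join(build_command(algo, 0, train=True)))
```

To actually launch a run, the two helpers could be combined as `subprocess.run(build_command("uof", 0), cwd=project_root, env=env_with_project_root(project_root), check=True)`, where `project_root` is the path of your clone.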

Task id (pre-trained policy) - paper result relation

The table below maps each pre-trained policy provided in this repo to the paper figures that report its performance. The provided UOF policies were trained with AAES and a demonstration proportion of 0.75.


| Task id | Paper result (red curves) |
| --- | --- |
| 0 | Fig. 4a, 5a, 9a, 9c, 10a |
| 1 | Fig. 5b, 6, 7, 8, 9b, 9d, 10b |
| 2 | Fig. 10c |
| 3 | Fig. 10d |
| 4 | Section VII-G |
| 5 | Section VII-G |
| 6 | Section VII-G |
| 7 | Section VII-G |

Full argument list:

run_uof.py


| Arguments | Description |
| --- | --- |
| `--task-id i` | Task id; the pre-trained policies listed above cover ids 0 to 7 |
| `--render` | Use this flag to render the task |
| `--train` | Use this flag to train an agent from scratch |
| `--multi-inter` | Use this flag to train separate high-level policies for each goal |
| `--no-aaes` | Use this flag to turn off the AAES exploration strategy |
| `--no-demo` | Use this flag to turn off the Abstract Demonstrations |
| `--demo-proportion j` | Use this flag to set the proportion of episodes that use demonstrations (the provided policies used 0.75) |

For `run_hac.py`, ignore the `--multi-inter` and `--no-aaes` arguments.
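
For readers who want to wrap or extend the scripts, the flags listed above could be declared with `argparse` roughly as follows. This is an illustrative sketch only, not the parser used in `run_uof.py`: the function name `build_parser`, the `dest` names, and both default values are assumptions (the 0.75 default simply reuses the proportion quoted for the pre-trained policies).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented for run_uof.py."""
    parser = argparse.ArgumentParser(description="Sketch of the run_uof.py flags")
    # Default task id of 0 is an assumption, taken from the example commands.
    parser.add_argument("--task-id", dest="task_id", type=int, default=0,
                        help="task id")
    parser.add_argument("--render", action="store_true",
                        help="render the task")
    parser.add_argument("--train", action="store_true",
                        help="train an agent from scratch")
    parser.add_argument("--multi-inter", dest="multi_inter", action="store_true",
                        help="train separate high-level policies for each goal")
    parser.add_argument("--no-aaes", dest="no_aaes", action="store_true",
                        help="turn off the AAES exploration strategy")
    parser.add_argument("--no-demo", dest="no_demo", action="store_true",
                        help="turn off the abstract demonstrations")
    # Default of 0.75 is an assumption, not a documented default.
    parser.add_argument("--demo-proportion", dest="demo_proportion", type=float,
                        default=0.75,
                        help="proportion of episodes that use demonstrations")
    return parser


if __name__ == "__main__":
    # Parse an explicit example list instead of the real command line.
    args = build_parser().parse_args(["--task-id", "1", "--train", "--no-aaes"])
    print(args)
```

Boolean switches are modelled with `store_true`, so each one is off unless passed, which matches the "use this flag to ..." wording in the table.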

