
Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24)


LAMDA-RL/ACT


Advantage-Conditioned Transformer

This is the official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning, which was accepted at AAAI 2024.

Dependencies

offlinerllib==0.1.4
UtilsRL==0.6.5
gym==0.23.1
mujoco-py==2.1.2.14
torch==2.0.1
D4RL==1.1
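
Before running the experiments, it can help to confirm that the local environment matches the pinned versions. The snippet below is a minimal sketch of such a check, not part of the repository: it assumes each package is registered under the distribution name shown in the list above (this may differ for packages installed from source, such as D4RL), and the helper names `installed_version` and `find_mismatches` are hypothetical.

```python
from importlib import metadata

# Pinned versions copied from the dependency list above.
PINNED = {
    "offlinerllib": "0.1.4",
    "UtilsRL": "0.6.5",
    "gym": "0.23.1",
    "mujoco-py": "2.1.2.14",
    "torch": "2.0.1",
    "D4RL": "1.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned=PINNED, lookup=installed_version):
    """Map each package that deviates from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Ignore local build tags such as "2.0.1+cu118" when comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty output means every listed package was found at its pinned version; any printed line points to a package worth reinstalling before reporting a reproduction issue.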

How to Reproduce the Results

Below are the commands for reproducing the results. Feel free to contact me if anything goes wrong in your local dev environment.

  • For D4RL tasks
    python3 reproduce/sequence_rvs/run_sequence_rvs_onestep.py \
        --config reproduce/sequence_rvs/config/onestep/mujoco/${env_name}-v2.py \
        --iql_tau ${iql_tau}
    Here the value of --iql_tau for each task can be found in the Appendix of the paper. In the actual benchmarking we incorporated model selection for the critics. If you want to do the same, use reproduce/sequence_rvs/run_iql_pretrain.py to first pre-train the critics, select the best-fitted one, and add --load_path to load the selected critics.
  • For the 2048 game
    python3 reproduce/sequence_rvs/run_sequence_rvs_stoc.py \
        --config reproduce/sequence_rvs/config/onestep/stoc_toy/2048-v0.py 
  • For the delayed rewards tasks
    python3 reproduce/sequence_rvs/run_sequence_rvs_onestep.py \
        --config reproduce/sequence_rvs/config/onestep/delayed/base.py \
        --task walker2d-medium-expert-v2
  • For the stochastic mujoco tasks
    python3 reproduce/sequence_rvs/run_sequence_rvs_onestep.py \
        --config reproduce/sequence_rvs/config/onestep/stochastic_mujoco/
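
When sweeping the D4RL command over several environments, assembling the argument list in Python avoids shell-quoting mistakes. The following is a rough sketch rather than a script shipped with the repository: `build_command` is a hypothetical helper, the environment name is inferred from the `--task` example above, and the tau value 0.7 is only a placeholder (the actual per-task values are the ones reported in the Appendix).

```python
import shlex

# Script and config directory taken from the D4RL command above.
SCRIPT = "reproduce/sequence_rvs/run_sequence_rvs_onestep.py"
CONFIG_DIR = "reproduce/sequence_rvs/config/onestep/mujoco"


def build_command(env_name, iql_tau, load_path=None):
    """Build the argv list for one D4RL run; load_path is optional."""
    cmd = [
        "python3", SCRIPT,
        "--config", f"{CONFIG_DIR}/{env_name}-v2.py",
        "--iql_tau", str(iql_tau),
    ]
    if load_path is not None:
        # Only pass --load_path when pre-trained critics were selected.
        cmd += ["--load_path", load_path]
    return cmd


if __name__ == "__main__":
    # Placeholder values: replace with the task and tau you want to run.
    print(shlex.join(build_command("walker2d-medium-expert", 0.7)))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.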

Citation

@inproceedings{act,
  author = {Chen-Xiao Gao and Chenyang Wu and Mingjun Cao and Rui Kong and Zongzhang Zhang and Yang Yu},
  title = {ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning},
  booktitle = {Proceedings of the Thirty-Eighth {AAAI} Conference on Artificial Intelligence},
  year = {2024},
}
