OfflineArcher

Research Code for the Offline Experiments of "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"

Yifei Zhou, Andrea Zanette, Jiayi Pan, Aviral Kumar, Sergey Levine

[Figure: ArCHer diagram]

This repo supports the following methods:

And the following environments

Quick Start

1. Install Dependencies

conda create -n archer python==3.10
conda activate archer

git clone https://github.com/andreazanette/OfflineArcher
cd OfflineArcher
python -m pip install -e .

2. Download Datasets and Oracles

The offline datasets and oracle checkpoints used in the paper can be found here. Create an "oracles" folder and a "datasets" folder, and place the oracle and the dataset in the corresponding folder. The oracle for Twenty Questions should be named 20q_t5_oracle.pt, and the dataset should be named twenty_questions.json.
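As a quick sanity check before launching, the folder layout described above can be sketched in Python. This is only an illustrative helper, not part of the repository: the function name prepare_data_dirs is hypothetical, and it assumes both folders sit at the repository root, which this README does not state explicitly.

```python
from pathlib import Path

# Expected file per folder, as named in this README.
# Assumption: both folders live at the repository root.
EXPECTED_FILES = {
    "oracles": "20q_t5_oracle.pt",
    "datasets": "twenty_questions.json",
}


def prepare_data_dirs(repo_root="."):
    """Create the two folders if needed and return the expected files still missing."""
    root = Path(repo_root)
    missing = []
    for folder, filename in EXPECTED_FILES.items():
        target_dir = root / folder
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / filename
        if not target.is_file():
            missing.append(target)
    return missing


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root after downloading the files.
    for path in prepare_data_dirs():
        print(f"missing: {path}")
```

Running it from the repository root creates any folder that does not exist yet and prints each expected file that has not been put in place. It prints nothing when both files are present.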

3. Run Experiments

You can run experiments directly by running the launch scripts. For example, to launch Offline ArCHer on Twenty Questions, simply run

. submit_OfflineArcher_TwentyQuestions.sh

The code uses the PyTorch Lightning framework. Please refer to the PyTorch Lightning documentation (https://lightning.ai/docs/pytorch/stable/) for additional information, such as the flags that can be passed when launching the code. For example, to run on GPU 0, add --trainer.devices=[0] to the launch script.

4. Citing ArCHer

@misc{zhou2024archer,
      title={ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL}, 
      author={Yifei Zhou and Andrea Zanette and Jiayi Pan and Sergey Levine and Aviral Kumar},
      year={2024},
      eprint={2402.19446},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
