
Model Training Unstable (webshop, gpt2) #14

Open

RobertXWL opened this issue Sep 11, 2024 · 1 comment

@RobertXWL

I encountered an issue while trying to reproduce the results by loading the gpt2_bc_webshop_history.pt checkpoint and running the run.py script. Training was launched on 8 GPUs with the following parameters:

"""
epochs: 50
actor_epochs: 3
batch_size: 8
grad_accum_steps: 4
capacity: 10000
critic_lr: 6e-5
lm_lr: 3e-5
rollout_size: 512
gamma: 0.9
tau: 0.1
agent_type: "archer"
webshop_lower: 2000
webshop_upper: 2100
"""

However, I noticed that during training the eval_rollout.mean value barely increases, and in many cases the run either collapses (with rewards dropping to zero) or performance deteriorates. To mitigate this, I tried lowering the learning rate and reducing the number of actor updates, which seemed to prevent the collapse; a sketch of that adjustment follows.
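Concretely, the mitigation amounted to an edit along these lines relative to the parameters above (the specific values here are illustrative assumptions, not a verified stable configuration):

"""
# Illustrative only: smaller learning rates and fewer actor updates.
# These exact values are guesses, not taken from a verified run.
lm_lr: 1e-5        # lowered from 3e-5
critic_lr: 2e-5    # lowered from 6e-5
actor_epochs: 1    # fewer actor updates per iteration (was 3)
"""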

I would like to understand the potential reason for this behavior and whether my parameter settings are appropriate. Could you clarify whether I am missing something, or suggest adjustments to make training more stable?

@YifeiZhou02 (Owner)

Hi, thanks for your interest in our work. Have you tried using the provided hyperparameters (https://github.com/YifeiZhou02/ArCHer/blob/master/scripts/config/archer_webshop.yaml)? In general, a smaller learning rate, larger gradient accumulation, and a larger rollout size will make training more stable. Please allow one or two days of running to see improvements.
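To make those three directions concrete, a change relative to the parameters posted above might look like the sketch below. The values are illustrative assumptions, not the contents of archer_webshop.yaml; the linked config remains the authoritative reference.

"""
# Hypothetical stability-oriented adjustments (values are assumptions;
# consult scripts/config/archer_webshop.yaml for the actual settings).
lm_lr: 1e-5             # smaller learning rate (was 3e-5)
grad_accum_steps: 16    # larger gradient accumulation (was 4)
rollout_size: 1024      # larger rollout size (was 512)
"""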
