I encountered an issue while trying to reproduce the results by loading the gpt2_bc_webshop_history.pt model and running the run.py script. Training was launched on 8 GPUs with the following parameters:

"""
epochs=50
actor_epochs=3
batch_size=8
grad_accum_steps=4
capacity=10000
critic_lr=6e-5
lm_lr=3e-5
rollout_size=512
gamma=0.9
tau=0.1
agent_type="archer"
webshop_lower: 2000
webshop_upper: 2100
"""
However, during training the eval_rollout.mean value barely increases, and in many cases training either collapses (with rewards dropping to zero) or performance deteriorates. To mitigate this, I tried lowering the learning rate and reducing the number of actor updates, which seemed to prevent the collapse.
Could you help me understand the likely cause of this behavior and whether my parameter settings are appropriate? Am I missing something, or are there adjustments that would make the training more stable?
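For reference, the collapse looks roughly like the following. This is only a hypothetical monitoring sketch I put together (detect_collapse and the history values are made up for illustration and are not part of the ArCHer codebase); eval_rollout.mean stands for the logged evaluation metric:

"""
# Hypothetical helper illustrating the failure mode: eval_rollout.mean
# flatlines and then rewards drop to (and stay at) zero partway through training.
from typing import List

def detect_collapse(eval_means: List[float], window: int = 5, eps: float = 1e-6) -> bool:
    # Flag a run as collapsed if the last `window` eval rollout means are ~zero.
    if len(eval_means) < window:
        return False
    return all(abs(m) < eps for m in eval_means[-window:])

# Example trace: small early improvement, then rewards collapse to zero.
history = [0.08, 0.10, 0.12, 0.11, 0.0, 0.0, 0.0, 0.0, 0.0]
print(detect_collapse(history))  # True -> stop early / reload the last good checkpoint
"""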
Hi, thanks for your interest in our work. Have you tried the provided hyperparameters (https://github.com/YifeiZhou02/ArCHer/blob/master/scripts/config/archer_webshop.yaml)? In general, a smaller learning rate, larger gradient accumulation, and a larger rollout size will make training more stable. Please allow one or two days of running to see improvements.
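To make the direction of these adjustments concrete, here is a minimal sketch (with illustrative numbers I chose myself; the actual keys and values live in archer_webshop.yaml and may differ) of how the posted settings could be shifted toward a smaller learning rate, more gradient accumulation, and a larger rollout size, together with the resulting effective batch size:

"""
# Illustrative only: the posted hyperparameters and one possible "more stable"
# variant following the advice above. Not the official ArCHer configuration.
posted = {
    "lm_lr": 3e-5,          # actor (language model) learning rate
    "critic_lr": 6e-5,      # critic learning rate
    "batch_size": 8,        # per-GPU batch size
    "grad_accum_steps": 4,  # gradient accumulation steps
    "rollout_size": 512,    # rollouts collected per training iteration
    "num_gpus": 8,
}

# Smaller learning rates, larger gradient accumulation, larger rollout size.
adjusted = dict(posted, lm_lr=1e-5, critic_lr=2e-5, grad_accum_steps=16, rollout_size=1024)

def effective_batch(cfg):
    # Samples contributing to one optimizer step across all GPUs.
    return cfg["batch_size"] * cfg["grad_accum_steps"] * cfg["num_gpus"]

print("posted effective batch:  ", effective_batch(posted))    # 8 * 4 * 8  = 256
print("adjusted effective batch:", effective_batch(adjusted))  # 8 * 16 * 8 = 1024
"""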