Replies: 1 comment
-
The optimizer step and gradient zeroing are handled internally by the DeepSpeed engine, I think (not 100% sure), so it is okay to skip them. I believe that even if they were executed, these calls would be no-ops. To reduce confusion we could still call them, but I don't have the bandwidth to test a complete run. Would you be able to verify? cc @yiyixuxu
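A rough way to verify this, if someone has the bandwidth: run a tiny loop once with a plain Accelerate config and once under a DeepSpeed config, and check whether the explicit `optimizer.step()` still changes parameters. The sketch below is my own illustration (toy model, toy data, hypothetical file name), not code from the repo, and the comparison is only meaningful for ZeRO stages 1/2, since stage 3 partitions the parameters.

```python
# Rough verification sketch (mine, not from the repo): run it once with a plain
# Accelerate config and once under DeepSpeed, e.g.
#   accelerate launch --use_deepspeed check_step.py
# and compare the final print. The toy model/data are placeholders, and the
# "no-op" comments state the expectation above, not a verified fact.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataloader = DataLoader(TensorDataset(torch.randn(32, 16), torch.randn(32, 1)), batch_size=8)
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

print("distributed_type:", accelerator.distributed_type)
print("optimizer class:", type(optimizer).__name__)  # under DeepSpeed: Accelerate's DeepSpeed wrapper

x, y = next(iter(dataloader))
loss = torch.nn.functional.mse_loss(model(x), y)
# Under DeepSpeed, accelerator.backward() delegates to the engine, which is
# expected to perform the optimizer step and gradient zeroing itself.
accelerator.backward(loss)

before = [p.detach().clone() for p in model.parameters()]
optimizer.step()       # expected to be a no-op when DeepSpeed manages the step
optimizer.zero_grad()  # likewise expected to be a no-op
changed = any(not torch.equal(b, p.detach()) for b, p in zip(before, model.parameters()))
print("explicit optimizer.step() changed parameters:", changed)
```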
-
At line 1448 of this script, optimizer.step() and zero_grad() are not executed when DeepSpeed is enabled. I checked the related Accelerate documentation and found no mention of this behavior. Is this correct?
diffusers/examples/cogvideo/train_cogvideox_lora.py
Lines 1448 to 1450 in d9029f2
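For context, the guarded update being asked about follows roughly the pattern sketched below. This is a paraphrase under my own toy setup, not the verbatim code at lines 1448 to 1450, and the exact guard condition used in the script may differ.

```python
# Paraphrased sketch of the guarded update the question points at (not the
# verbatim code at lines 1448-1450; the guard condition, toy model, and
# scheduler are illustrative stand-ins).
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator
from accelerate.utils import DistributedType

accelerator = Accelerator()
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
lr_scheduler = torch.optim.lr_scheduler.ConstantLR(optimizer)
dataloader = DataLoader(TensorDataset(torch.randn(32, 16), torch.randn(32, 1)), batch_size=8)
model, optimizer, dataloader, lr_scheduler = accelerator.prepare(
    model, optimizer, dataloader, lr_scheduler
)

for x, y in dataloader:
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)

    # The guard in question: when DeepSpeed is active, the engine is expected to
    # have stepped and zeroed gradients inside accelerator.backward(), so the
    # explicit calls are skipped.
    if accelerator.distributed_type != DistributedType.DEEPSPEED:
        optimizer.step()
        optimizer.zero_grad()
    lr_scheduler.step()
```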