Accumulate grads (larger batch size for low GPU memory)
Is your feature request related to a problem? Please describe.
When training with limited hardware resources (GPU memory), it may be useful to accumulate gradients, since the batch size cannot be increased due to the low memory available. It does not make training faster, but since very small batch sizes raise convergence concerns, it can be very useful.
Describe the solution you'd like
The trainer would have to accumulate the gradients for several batches before performing the weight update (opt.step()). More references can be found here:
https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#accumulate-grad-batches
https://kozodoi.me/python/deep%20learning/pytorch/tutorial/2021/02/19/gradient-accumulation.html
Describe alternatives you've considered
The implementation is pretty straightforward: just call opt.step() not after every batch, but after every N batches, to simulate a batch size N times larger.
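For illustration, here is a minimal sketch of what this could look like in a plain PyTorch training loop. The model, loss, optimizer, and dataloader below are dummy placeholders rather than the actual CoquiTTS trainer objects, and accumulation_steps is a hypothetical config value:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy model/data just to keep the sketch self-contained; in real TTS training
# these would come from the trainer and its config.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
dataloader = DataLoader(TensorDataset(torch.randn(64, 10), torch.randn(64, 1)), batch_size=4)

accumulation_steps = 4  # N: simulates a batch size N times larger

optimizer.zero_grad()
for step, (inputs, targets) in enumerate(dataloader):
    outputs = model(inputs)
    # Scale the loss so the accumulated gradient matches one large batch.
    loss = criterion(outputs, targets) / accumulation_steps
    loss.backward()  # gradients accumulate in param.grad across iterations

    if (step + 1) % accumulation_steps == 0:
        optimizer.step()       # weight update only every N batches
        optimizer.zero_grad()  # reset the accumulated gradients
```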
Adapting the code to PyTorch Lightning could be a fruitful direction for many engineering reasons, but it would be harder, and it is up to the core CoquiTTS developers to decide; IMHO it would be great.
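For reference, in PyTorch Lightning this behaviour is just a Trainer argument (accumulate_grad_batches, see the docs linked above); the module and dataloader in the commented line are hypothetical placeholders:

```python
import pytorch_lightning as pl

# Accumulate gradients over 4 batches before each optimizer step.
trainer = pl.Trainer(accumulate_grad_batches=4)
# trainer.fit(lightning_module, train_dataloader)  # requires wrapping the model in a LightningModule
```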
Additional context