Currently the mini-batch size N is limited by GPU memory. For example, when training a large model I cannot use a large mini-batch size, because my GPU cannot hold N training samples at once.
Is it possible for Caffe to support a mini-batch size that is a multiple of the input data batch size? My understanding is that it would just need to accumulate the gradients over several batches of input data before doing a model update step. Right?
I wonder whether Caffe will support this functionality, or whether it already does (I am new to Caffe, so I may have missed something). Or is there some difficulty I have overlooked in implementing it?
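To make the idea concrete, here is a minimal framework-agnostic sketch of what I mean (plain NumPy, not Caffe code; the toy linear model, `accum_steps`, and the other names are purely illustrative): gradients are summed over several small batches and a single parameter update is applied afterwards.

```python
import numpy as np

rng = np.random.default_rng(0)
params = np.zeros(10)          # toy linear model weights
learning_rate = 0.01
accum_steps = 4                # small batches accumulated per update (illustrative)
small_batch_size = 8           # what actually fits in memory

def grad_of_batch(params, x, y):
    """Gradient of the mean squared error for the linear model y ~ x @ params."""
    residual = x @ params - y
    return x.T @ residual / len(y)

grad_accum = np.zeros_like(params)
for step in range(100):
    # Draw a small synthetic batch; in practice this would come from the data layer.
    x = rng.standard_normal((small_batch_size, params.size))
    y = rng.standard_normal(small_batch_size)
    grad_accum += grad_of_batch(params, x, y)
    if (step + 1) % accum_steps == 0:
        # Average over accum_steps small batches so the step matches one
        # mini-batch of accum_steps * small_batch_size samples, then reset.
        params -= learning_rate * (grad_accum / accum_steps)
        grad_accum[:] = 0
```

The only extra state is the gradient accumulator, so the memory cost stays that of a single small batch.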
@shelhamer Thanks for your information. This is great!
Is there now a way to control the data batch size and the mini-batch size separately, based on the new gradient accumulation implementation? I think this needs an extra parameter in the proto files, and it also requires some changes in the solver, right? Have these already been done, or will they be done soon?
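If I understand the accumulation implementation correctly, it is exposed as an `iter_size` field of the solver proto, so the effective mini-batch is `iter_size` times the data layer's `batch_size`. A small sketch of setting it through pycaffe's protobuf bindings, assuming a Caffe build with the accumulation merged; the file names are placeholders:

```python
# Sketch: read a solver prototxt, set the accumulation factor, and write it back.
# Assumes pycaffe is on PYTHONPATH; "solver.prototxt" / "solver_accum.prototxt"
# are placeholder file names.
from caffe.proto import caffe_pb2
from google.protobuf import text_format

solver = caffe_pb2.SolverParameter()
with open("solver.prototxt") as f:
    text_format.Merge(f.read(), solver)

# Accumulate gradients over 4 data batches per parameter update, so the
# effective mini-batch is 4 * (data layer batch_size).
solver.iter_size = 4

with open("solver_accum.prototxt", "w") as f:
    f.write(text_format.MessageToString(solver))
```

Setting the field directly in the solver prototxt would of course work just as well; the point is only that the data batch size stays in the net definition while the accumulation factor lives in the solver.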