From what I can see here, using a batch size of 16 should be numerically equivalent to using a batch size of 4 with `iter_size` 4? And other solver settings, such as the learning rate, should not affect this result?

What is the theoretical slowdown when we use batch accumulation?

How does it work internally? Does it store the result of each forward pass in GPU memory (`iter_size` times in total), then average/merge those results into one effective batch, and then do the backward pass and weight update using that one batch?
As I understand it, batch accumulation is just an alias for the `iter_size` parameter in the Caffe solver. See DIGITS/digits/model/tasks/caffe_train.py, line 741 at commit 6b8995d.
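To the internals question: as far as I understand Caffe, it does not keep `iter_size` batches of forward results in memory. Each sub-iteration runs its own forward and backward pass on one small batch, the resulting parameter gradients are summed into the existing diff buffers, and after `iter_size` sub-iterations the solver divides the accumulated diffs by `iter_size` and applies a single weight update. Here is a runnable Python sketch of that loop under those assumptions (the `Param` class, `forward_backward`, and `solver_step` are illustrative stand-ins, not the actual Caffe API):

```python
import numpy as np

class Param:
    """Stand-in for a Caffe blob: weights (data) plus accumulated gradient (diff)."""
    def __init__(self, shape, rng):
        self.data = rng.standard_normal(shape)
        self.diff = np.zeros(shape)

rng = np.random.default_rng(0)
w = Param(8, rng)

def forward_backward(X, y):
    """Toy linear least-squares layer: returns the loss and ADDS its gradient
    into w.diff, the way Caffe accumulates into the diff buffer."""
    err = X @ w.data - y
    w.diff += X.T @ err / len(y)
    return 0.5 * np.mean(err ** 2)

def solver_step(batches, base_lr=0.1):
    """One solver step with iter_size = len(batches) (plain SGD; momentum,
    weight decay, etc. omitted)."""
    w.diff[:] = 0.0                     # diffs accumulate, so zero once per step
    loss = sum(forward_backward(X, y) for X, y in batches)
    w.diff /= len(batches)              # normalize by iter_size
    w.data -= base_lr * w.diff          # single weight update
    return loss / len(batches)

# batch_size=4, iter_size=4: four small forward/backward passes, one update.
X = rng.standard_normal((16, 8)); y = rng.standard_normal(16)
batches = [(X[i:i + 4], y[i:i + 4]) for i in range(0, 16, 4)]
print(solver_step(batches))
```

So GPU memory stays at the small-batch footprint (plus the gradient buffers, which exist anyway), and the theoretical FLOP count is about the same as one big batch; in practice the slowdown comes from the smaller batches underutilizing the GPU.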
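On the equivalence question: for losses that are averaged over the batch, summing the gradients of four batches of 4 and dividing by `iter_size` gives the same gradient as one batch of 16, up to floating-point summation order (layers whose behavior depends on the batch contents, e.g. BatchNorm statistics, are the exception). A quick NumPy check with a linear least-squares layer, assuming the divide-by-`iter_size` normalization sketched above:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((16, 8))   # one "big" batch of 16 samples
y = rng.standard_normal(16)
w = rng.standard_normal(8)

def grad(Xb, yb, w):
    # Gradient of 0.5 * mean((Xb @ w - yb)**2) with respect to w.
    return Xb.T @ (Xb @ w - yb) / len(yb)

# Single pass with batch_size = 16.
g_big = grad(X, y, w)

# Accumulated passes with batch_size = 4 and iter_size = 4.
g_acc = np.zeros_like(w)
for i in range(0, 16, 4):
    g_acc += grad(X[i:i + 4], y[i:i + 4], w)
g_acc /= 4                          # solver divides the accumulated diff by iter_size

print(np.allclose(g_big, g_acc))    # True (up to float round-off)
```

Given that normalization, the same learning rate produces an equivalent update in both configurations, so solver settings like the learning rate should not need rescaling.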