
[feature request] manual mini-batching and batch dimension scaling #1437

Open

mbabadi opened this issue Oct 8, 2018 · 6 comments
Labels: documentation, help wanted

@mbabadi

mbabadi commented Oct 8, 2018

In models with mixed levels of nesting (e.g. global_plate > local_plate_1 > local_plate_2 > ...), mini-batching across different batch dimensions requires introducing a proper scale factor for each batch dimension. Pyro handles these scale factors automatically if mini-batching is done via pyro.iarange(..., size=..., subsample_size=...) or pyro.iarange(..., size=..., subsample=...). The latter construct is flexible and allows arbitrary mini-batching schemes, including big-data situations where the full data tensor cannot be loaded all at once.
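To make the automatic case concrete, here is a minimal sketch (the model, names, and sizes are all made up for illustration, using the iarange API of this era): each iarange with subsample_size rescales the log-probabilities inside it by size / subsample_size, so the scale factors for nested batch dimensions compose automatically:

```python
import torch
import pyro
import pyro.distributions as dist

def model(data):  # data: hypothetical tensor of shape [100, 1000]
    mu = pyro.sample("mu", dist.Normal(0., 1.))  # global latent
    # each iarange rescales the enclosed log-probs by size / subsample_size,
    # so obs is effectively scaled by (100 / 10) * (1000 / 50) = 200
    with pyro.iarange("outer", size=100, subsample_size=10, dim=-2) as i:
        with pyro.iarange("inner", size=1000, subsample_size=50, dim=-1) as j:
            pyro.sample("obs", dist.Normal(mu, 1.), obs=data[i][:, j])
```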

Mini-batching, however, is often done manually and externally, not via pyro.iarange. In such cases, the appropriate scale factors must also be applied manually via poutine.scale. This is consistent: manual mini-batching calls for manual scaling. However, most of the examples (DMM, VAE, ...) place little to no emphasis on this issue and neglect scaling altogether. While convergence is not a big deal when working with adaptive optimizers, neglecting the scale factors leads to wrong ELBO estimates.
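A minimal sketch of the manual pattern (the model and sizes are hypothetical, and this assumes poutine.scale can be used as a context manager, as in recent Pyro versions): the minibatch is drawn outside the model, so the observation log-prob must be rescaled by N / batch_size to keep the ELBO an unbiased estimate of the full-data objective:

```python
import torch
import pyro
import pyro.distributions as dist
import pyro.poutine as poutine

N, batch_size = 10000, 100  # full dataset size and minibatch size (assumed)

def model(minibatch):  # minibatch: tensor of shape [batch_size], drawn externally
    mu = pyro.sample("mu", dist.Normal(0., 1.))
    # without this scale factor, the ELBO treats the minibatch as the whole dataset
    with poutine.scale(scale=N / batch_size):
        with pyro.iarange("data", len(minibatch)):
            pyro.sample("obs", dist.Normal(mu, 1.), obs=minibatch)
```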

  • Adding a word of caution to the examples about scale factors, and/or throwing in poutine.scale when mini-batching manually, to set a good precedent for new users?
@eb8680
Member

eb8680 commented Oct 12, 2018

Good points. Most of our examples don't do nested subsampling, but maybe we could add a manually-batched version of our LDA example or something similar? If you have another example where this is relevant, we'd definitely welcome a PR.

@fritzo
Member

fritzo commented Oct 16, 2018

@mbabadi agreed, we could improve the docs about subsampling. I'm inclined to recommend that users use pyro.iarange(..., subsample=...) whenever minibatching is done, as that clarifies the intention of the code. Do you know of any cases where minibatching cannot be done through pyro.iarange(..., subsample=...)?

@neerajprad
Member

neerajprad commented Oct 16, 2018

Do you know of any cases where minibatching cannot be done through pyro.iarange(..., subsample=...)?

I think one use case is when we are running inference on the GPU with a large dataset (i.e. calling data.cuda() on the whole dataset at once would take a lot of GPU memory). The torch data loaders work great for this, since they spin up a pool of workers that keep pulling minibatches off the dataset and transferring them to the GPU incrementally. We are using data loaders in our examples, but many of our datasets are probably small enough that they could be transferred directly in one shot.
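A minimal sketch of that pattern (the dataset and sizes are made up): the full dataset stays in host memory, and each worker-produced batch is moved to the GPU only when it is consumed:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000000, 10))  # stays in host (CPU) memory
loader = DataLoader(dataset, batch_size=512, shuffle=True, num_workers=4)

for (batch,) in loader:
    batch = batch.cuda()  # only one 512-row batch lives on the GPU at a time
    # ... take one SVI step on `batch` here ...
```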

@fritzo
Member

fritzo commented Oct 16, 2018

@neerajprad The use case you suggest cannot be accomplished via iarange(..., subsample_size=...), but it can be accomplished via iarange(..., subsample=...) (that is the motivating use case behind the subsample kwarg).

@neerajprad
Member

The use case you suggest cannot be accomplished via iarange(..., subsample_size=...), but it can be accomplished via iarange(..., subsample=...) (that is the motivating use case behind the subsample kwarg).

Ahh, my bad. In that case, we should probably just change our examples to use subsample=, which will do the correct scaling.

@mbabadi
Author

mbabadi commented Oct 17, 2018

@fritzo @neerajprad I also cannot think of anything that cannot be accomplished with iarange(..., subsample=...)! A callable subsampler can take care of both incremental data loading and optionally sending the minibatch to CUDA. It would be great if you could encourage the use of this motif in the examples.
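A minimal sketch of that motif (load_rows is a hypothetical out-of-core loader; all sizes are made up): indices are drawn externally, only the corresponding slice of the data is loaded and sent to CUDA, and the indices are handed to iarange via subsample= so that Pyro applies the scale factor N / len(ind) automatically:

```python
import torch
import pyro
import pyro.distributions as dist

N = 1000000  # full dataset size (assumed)

def get_minibatch(batch_size):
    ind = torch.randperm(N)[:batch_size]
    return ind, load_rows(ind).cuda()  # load_rows: hypothetical incremental loader

def model(ind, minibatch):
    mu = pyro.sample("mu", dist.Normal(0., 1.))
    with pyro.iarange("data", size=N, subsample=ind):
        # log-probs are automatically rescaled by N / len(ind)
        pyro.sample("obs", dist.Normal(mu, 1.), obs=minibatch)
```

(In Pyro 0.3+, iarange was renamed pyro.plate, with the same subsample semantics.)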

@eb8680 added the help wanted label and removed the help wanted and good first issue labels on Oct 17, 2018