
Params inside iarange #238

Closed
ngoodman opened this issue Oct 11, 2017 · 9 comments

@ngoodman (Collaborator)

The current behavior of params created inside an iarange (aka vectorized map_data) is a bit nonsensical: a vector of params of length batch_size is created once and then used on every iteration, which means a param gets reused for whatever data point happens to be selected at that index in a batch. This comes up, e.g., when there is a mean-field model for a local variable.
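
For concreteness, a minimal sketch of the problematic pattern (a hypothetical model: the Normal likelihood, sizes, and names are illustrative, in iarange-era syntax):

import torch
import pyro
import pyro.distributions as dist

def model(data):
    with pyro.iarange("data", len(data), subsample_size=10) as ind:
        # A vector of 10 params is created on first use, then reused for
        # whatever data points happen to land at these batch indices, so
        # no param stays attached to a particular data point.
        loc = pyro.param("loc", torch.zeros(10))
        pyro.sample("obs", dist.Normal(loc, torch.ones(10)), obs=data[ind])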

(There is a reasonable workaround: create the full vector of needed params globally, and only select from it inside the iarange.)

A better behavior would be to automatically associate params with their index into the data tensor, pulling out the appropriate params for a given batch (and creating them as needed). It might be tricky to get this working, and especially to do it efficiently.

I'm flagging this as a low-priority bug for now. We can think through a solution after launch.

@martinjankowiak (Collaborator)

the other attitude to take is #alwaysamortize ; )

@fritzo (Member) commented Nov 8, 2017

It seems pretty simple to write this as:

with pyro.iarange("data", size, subsample_size) as ind:
    ps = pyro.param("ps", Variable(torch.ones(size), requires_grad=True)).index_select(0, ind)
    # ...do stuff with ps...

ngoodman added this to the 0.2 release milestone Dec 20, 2017

@fritzo (Member) commented Apr 13, 2018

I believe the only way to get this to work automatically is to add a notion of .event_dim to parameters. However, the batch_dim/event_dim distinction has proven to be confusing in the context of distributions. It seems like it will be easier for us to document this subtle use of param inside iarange than to implement automatic machinery that is itself confusing and in need of documentation.

I'll remove this from the 0.2 milestone (if you don't mind). This will give us a chance to see how users adapt to other automatic dimension handling, e.g. in enumeration. Given what we learn after the 0.2 release, there may be a clear path to resolving this issue in Pyro 0.3.

fritzo removed this from the 0.2.0 release milestone Apr 14, 2018

@ngoodman (Collaborator, Author)

We can wait until after 0.2, but I do think this is a significant issue, both because it makes irange semantically diverge from iarange and because it messes up mean-field guides.

@fritzo (Member) commented Oct 23, 2018

Here is a proposal for refactoring towards a more plate-friendly pyro.param statement.

I think we could modify param behavior by inspecting the cond_indep_stack. For each vectorized frame we could .expand(frame.size) at the appropriate dim. Determining the appropriate dim is tricky because param currently has no notion of event_dim, so we would need to add something akin to .independent() or an event_dim kwarg to param statements. For each non-vectorized frame we could do simple autonaming, like appending '_{}'.format(frame.counter). For consistency, we should probably also apply this autonaming to sample statements and avoid most of our current '_{}'.format(i) grossness.

To correctly handle subsampling, we would need to give pyro.param access to the subsampling tensors. This information can't live in the cond_indep_stack, but it could be looked up in a trace via trace.nodes['_subsample_{}'.format(frame.name)]. This is a little gross, but param could simply find that trace in the _PYRO_STACK. Once this information is available, vectorized param would slice its tensors by the subsample index, and sequential param would index into subsample_index[frame.counter] to create the correct param name.

To account for global params, we could use various autonaming hacks. E.g. we could avoid expanding in vectorize_particles if the cond_indep_stack's frame.name.startswith('_pyro_'). Or we could add a kwarg pyro.param(..., global=True) to avoid the proposed .expand() behavior. @eb8680 could probably find cleaner patterns based on pyro.contrib.autoname.
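
A rough sketch of the frame-inspection idea (a hypothetical helper: frame.vectorized, frame.dim, and frame.size follow Pyro's CondIndepStackFrame, but the event_dim handling and expansion rule here are assumptions, not existing API):

def _expand_param_over_frames(value, cond_indep_stack, event_dim=0):
    # Hypothetically expand a param along each vectorized frame,
    # leaving the rightmost event_dim dims untouched.
    for frame in cond_indep_stack:
        if not frame.vectorized:
            continue
        dim = frame.dim - event_dim  # frame.dim counts from the right
        shape = list(value.shape)
        while len(shape) < -dim:
            shape.insert(0, 1)       # pad batch dims on the left
        shape[dim] = frame.size
        value = value.expand(shape)  # broadcast view, no copy
    return value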

@eb8680 @neerajprad @ngoodman WDYT?

@ngoodman (Collaborator, Author)

What if we add an abstract index argument to param (and sample)? I.e., something along the lines of pyro.param("foo", index=[I1, I2], ...), where each Ij can be either a literal or an abstract index introduced by a plate (assuming we go with the proposal to have plate bind a name for the index it introduces).

That is, within an iteration the indices are just strings/numbers that get appended to the name, while within a plate block the index is an object that tells us which dimension to expand along. If a param is created without an index, it is treated as global.
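
A sketch of what that proposed syntax might look like (the index kwarg and the plate-bound index object are hypothetical, not existing Pyro API):

import torch
import pyro
import pyro.distributions as dist

def guide(data):
    with pyro.plate("data", len(data)) as idx:
        # Hypothetical index kwarg: ties each param slot to a data point
        # and tells pyro which dim to expand (and subsample) along.
        loc = pyro.param("loc", torch.zeros(()), index=[idx])
        pyro.sample("z", dist.Normal(loc, 1.0))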

This by itself doesn't tell us what to do about subsampling, though as @eb8680 pointed out we should probably rethink that anyhow... maybe @fritzo's suggestion is an answer?

@neerajprad (Member)

I like the idea of a more plate-friendly param!

> This information can't live in the cond_indep_stack, but it could be looked up in a trace via trace.nodes['_subsample_{}'.format(frame.name)]. This is a little gross, but param could simply find that trace in the _PYRO_STACK.

It should mostly be fine, but I hope the cost doesn't add up, since we would do this every time the model is run. In any case, I think we can explore other ways of exposing this information to param.

> E.g. we could avoid expanding in vectorize_particles if the cond_indep_stack's frame.name.startswith('_pyro_'). Or we could add a kwarg pyro.param(..., global=True) to avoid the proposed .expand() behavior.

Given our existing use cases, it might be easier to add a kwarg to plate, say global_params=True, to denote that any params inside it do not need to be expanded. Then it would be easy for us to wrap a user model inside a plate without making any changes to the model code.
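
A sketch of how that kwarg might be used to wrap a user model (global_params is the proposed kwarg, not existing API; the wrapper and names are illustrative):

import pyro

def vectorize(model, num_particles):
    # Hypothetical wrapper: with the proposed kwarg, params created
    # inside the wrapped model would not be expanded along this plate.
    def wrapped_model(*args, **kwargs):
        with pyro.plate("num_particles", num_particles, dim=-2, global_params=True):
            return model(*args, **kwargs)
    return wrapped_model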

I have not thought this through fully, but I think this could also be extended to support iparam (e.g. by wrapping inside a plate with global_params=False); see #1213, #1330.

@fritzo (Member) commented Oct 23, 2018

OK, I've illustrated possible syntax changes in #1491 (based on #1486). The intent is to:

  1. illustrate current syntax in "Add test demonstrating current plate syntax" (#1486);
  2. work towards consensus on proposed syntax changes in "Add plate magic for autoname and subsampling" (#1491);
  3. decide which changes we actually want to implement.

@eb8680 (Member) commented Oct 23, 2018

In addition to a batch_shape/event_shape distinction, it seems to me that there are at least two types of broadcasting operations we'd need to distinguish between: one that simply broadcasts, and one that actually replicates the parameter tensor along a dimension to allow different values at each site. This distinction corresponds to the difference between torch.Tensor.expand() and torch.Tensor.repeat().
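
For concreteness, the difference in plain PyTorch (this much is standard API, not a proposal):

import torch

p = torch.zeros(1, 3)
e = p.expand(4, 3)  # broadcast view: all four rows share one underlying row
r = p.repeat(4, 1)  # materialized copy: four rows free to take different values
assert e.data_ptr() == p.data_ptr()   # shared storage
assert r.data_ptr() != p.data_ptr()   # fresh storage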

For example, consider the case of SVI with vectorized num_particles and local parameters created by a param statement within a model plate. Presumably we'd want to repeat along the model plate, but only expand along the num_particles plate. It's easy enough to see that here, but what about more complicated settings, especially when a model may be used under multiple different broadcasting interpretations?

There's fundamental ambiguity in the identity of each broadcasting dimension, and moreover expand/repeat decisions made in one setting will affect all future decisions that could be made about that parameter. Is there a design decision we could make that would remove that ambiguity? We can't simply create all parameters globally, since the existence, shape, or initial values of local parameters may depend on computation within a model. Should we only allow repeat when a parameter is first created, and assume all broadcasts thereafter are expands? That's a possibility, but it introduces a dependence on the context in which the parameter was created, which sounds like a recipe for very tricky bugs.
