
Vectorize ELBO computation over num particles #1176

Merged
merged 7 commits, Jun 8, 2018

Conversation

@neerajprad (Member) commented Jun 5, 2018

Resolves #791.

My initial plan was to write a separate SVI class for this, but it turned out that modifying the base ELBO class was the simplest way to percolate this change to all the other loss classes.

I have modified only a few tests that were hand-parallelizing over num particles; we can do a broader sweep after this PR is merged. That is not strictly necessary, though, since those tests are fast anyway. I also modified the conjugate_gradient test, which was not parallelized.

@fritzo (Member) left a comment:

Looks good so far!

@@ -32,7 +45,63 @@ class ELBO(object):
def __init__(self,
num_particles=1,
max_iarange_nesting=float('inf'),
auto_vectorize=False,
Member:

nit: maybe call this vectorize_particles in case we want to vectorize some other aspect?

with pyro.iarange("num_particles_vectorized", self.num_particles, dim=-self.max_iarange_nesting):
return fn(*args, **kwargs)

return poutine.broadcast(wrapped_fn)
@fritzo (Member), Jun 5, 2018:

Is it worth checking whether fn is already wrapped in poutine.broadcast, to avoid wrapping it a second time? Alternatively, this avoidance check could be done inside poutine.broadcast.

Member Author:

Good idea. I'll check it in poutine.broadcast itself.

Member Author:

I had to add fn.__broadcasted__ as an attribute to mark a function that has already been broadcasted.

Member:

Shouldn't broadcast be idempotent?

Member:

Yes, broadcast should be idempotent; I'm asking only whether it is worth avoiding the double wrapping for efficiency purposes. If it costs more than 5 lines of code, then it's probably not worth avoiding the double wrap.

Member Author:

Agreed, I'll remove this.
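The guard discussed in this thread can be sketched as a plain decorator idempotence check (a minimal toy sketch, independent of Pyro internals; only the `__broadcasted__` attribute name comes from the discussion above):

```python
import functools

def broadcast(fn):
    """Idempotent wrapper: a function already marked with
    __broadcasted__ is returned unchanged instead of being
    wrapped a second time."""
    if getattr(fn, "__broadcasted__", False):
        return fn  # already wrapped: avoid double wrapping

    @functools.wraps(fn)
    def wrapped_fn(*args, **kwargs):
        # The real poutine.broadcast applies a BroadcastMessenger here;
        # this toy version just calls through.
        return fn(*args, **kwargs)

    wrapped_fn.__broadcasted__ = True
    return wrapped_fn
```

Applying `broadcast` twice then returns the same object, which is the efficiency question raised below.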

strict_enumeration_warning=True):
self.num_particles = num_particles
self.max_iarange_nesting = max_iarange_nesting
self.auto_vectorize = auto_vectorize
if self.auto_vectorize:
Member:

nit: if self.vectorize_particles and self.num_particles > 1: ...

Runs the guide and runs the model against the guide with
the result packaged as a trace generator.
"""
if self.auto_vectorize:
Member:

This may error if self.num_particles == 1 since max_iarange_nesting will not have been incremented. Instead gate on if self.vectorize_particles and self.num_particles > 1, as used elsewhere, or simplify the gate above.

Member Author:

The _vectorized_num_particles method does a special check for num_particles=1, so it should work fine. Note that vectorize_particles also calls poutine.broadcast (the two effects are hard to decouple), so even when the user has num_particles=1 with vectorize_particles=True, we would like to wrap the model and guide inside poutine.broadcast so that the behavior is consistent.
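A rough sketch of that dispatch, with toy stand-ins for `poutine.broadcast` and `pyro.iarange` (hypothetical names and structure, not the actual Pyro source):

```python
from contextlib import contextmanager

@contextmanager
def iarange(name, size, dim):
    # Toy stand-in for pyro.iarange: the real version declares a
    # tensor batch dimension used for broadcasting.
    yield

def broadcast(fn):
    # Toy stand-in for poutine.broadcast: the real version expands
    # sample-site batch shapes to match enclosing iarange dims.
    def wrapped(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapped

def vectorize_particles(fn, num_particles, max_iarange_nesting):
    """Always apply broadcast so that num_particles=1 behaves
    consistently; only add the particle iarange for >1 particles."""
    if num_particles == 1:
        return broadcast(fn)

    def wrapped_fn(*args, **kwargs):
        with iarange("num_particles_vectorized", num_particles,
                     dim=-max_iarange_nesting):
            return fn(*args, **kwargs)

    return broadcast(wrapped_fn)
```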

@jpchen (Member) left a comment:

looks good! do you have any basic speed comparisons, at least on the test examples?

@@ -15,6 +23,11 @@ class ELBO(object):
:func:`pyro.iarange` contexts. This is only required to enumerate over
sample sites in parallel, e.g. if a site sets
``infer={"enumerate": "parallel"}``.
:param bool auto_vectorize: Vectorize ELBO computation over `num_particles`.
Member:

can you add what it defaults to

check_site_shape(site, self.max_iarange_nesting)

yield model_trace, guide_trace
guide_trace = poutine.trace(guide).get_trace(*args, **kwargs)
Member:

is this method identical sans the for loop over num_particles? the diff shows the whole method has changed but from a cursory glance it looks the same

Member:

they are identical. have you tried the old "cross your eyes at the side-by-side diff" trick?

Member:

i do the "cmd-f a randomly sampled substring and see if it appears on both sides". 90% of the time, it works every time.

Member Author:

Yup, that's right! Trace_ELBO, TraceGraph_ELBO and TraceEnum_ELBO mostly differ only in how they generate a single trace, which is reused when we want to generate traces vectorized over num_particles.

@neerajprad (Member Author):

> looks good! do you have any basic speed comparisons, at least on the test examples?

@jpchen - This will be many orders of magnitude faster than sequentially sampling traces num particles times.

e.g. for test_conjugate_gradients.py, 103 ms for 10k particles when they are parallelized vs. more than 2 minutes otherwise.

I intend to do some comparisons against hand parallelization, which is what we should really be comparing against. I did not observe any significant difference for most models, but there are cases where we end up adding up to 20% more time compared to hand parallelization. It is still very fast though (milliseconds vs. seconds or minutes).
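For intuition, the vectorized path replaces a Python loop over particles with one batched computation carrying an extra leading particle dimension; a toy NumPy illustration (not the Pyro benchmark itself):

```python
import numpy as np

rng = np.random.default_rng(0)
num_particles = 10_000

# Sequential: one draw per particle; Python-loop overhead scales
# linearly with num_particles.
sequential = np.stack([rng.normal(size=3) for _ in range(num_particles)])

# Vectorized: a single batched draw with a leading particle dimension,
# analogous to wrapping the model in an outer iarange.
vectorized = rng.normal(size=(num_particles, 3))

assert sequential.shape == vectorized.shape == (num_particles, 3)
```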

@@ -248,7 +248,14 @@ def broadcast(fn=None):
... return sample
"""
msngr = BroadcastMessenger()
return msngr(fn) if fn is not None else msngr
if fn is not None:
Member:

I guess we can omit this logic until we see a real performance cost of double wrapping.

@martinjankowiak (Collaborator) left a comment:

this is super awesome!

lgtm

question: i guess as the code stands we expect the user to know when auto-vectorization will break (e.g. in cases with stochastic control flow etc.)?

@eb8680 added the Blocked label Jun 5, 2018
@neerajprad force-pushed the parallel-numparticles branch from b9c04e6 to 6e1ac43 on June 5, 2018, 23:21
@eb8680 (Member) commented Jun 5, 2018:

> question: i guess as the code stands we expect the user to know when auto-vectorization will break (e.g. in cases with stochastic control flow etc.)?

Yeah, we'd have to do a bit of static analysis for automatic validation.

@neerajprad (Member Author):

> question: i guess as the code stands we expect the user to know when auto-vectorization will break (e.g. in cases with stochastic control flow etc.)?

That's true. I think if an iarange does not specify a size arg, or if the size arg is not consistent with the sample site's observed batch shape, we should throw an error in the broadcast poutine itself (by having some kind of strict mode under which this operates). That will ensure that we are not doing something odd or silently failing in those situations.
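Such a strict-mode check might look roughly like this (a hypothetical sketch, not Pyro's implementation; the function name and signature are made up for illustration):

```python
def check_iarange_size(iarange_size, batch_shape, dim):
    """Raise if an observed batch dimension disagrees with the
    enclosing iarange's declared size (hypothetical strict-mode check).

    dim is a negative index into batch_shape, as in pyro.iarange.
    """
    if len(batch_shape) < -dim:
        # Site does not extend into this dim; broadcasting will expand it.
        return
    site_size = batch_shape[dim]
    if site_size not in (1, iarange_size):
        raise ValueError(
            "Site batch size {} at dim {} conflicts with iarange size {}"
            .format(site_size, dim, iarange_size))
```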

@fritzo (Member) left a comment:

Thanks for simplifying the enumeration tests!

pyro.sample("x3", dist.Bernoulli(q).expand_by([num_particles]), infer={"enumerate": enumerate3})
pyro.sample("x1", dist.Bernoulli(q), infer={"enumerate": enumerate1})
pyro.sample("x2", dist.Bernoulli(q), infer={"enumerate": enumerate2})
pyro.sample("x3", dist.Bernoulli(q), infer={"enumerate": enumerate3})

kl = sum(kl_divergence(dist.Bernoulli(q), dist.Bernoulli(p)) for p in [0.1, 0.2, 0.3])
expected_loss = kl.item()
expected_grad = grad(kl, [q])[0]

elbo = TraceEnum_ELBO(max_iarange_nesting=1,
Member:

You can now set max_iarange_nesting=0 since it is incremented internally.

Member Author:

Done.

pyro.sample("x1", dist.Bernoulli(0.1).expand_by([num_particles]))
pyro.sample("x2", dist.Bernoulli(0.2).expand_by([num_particles]))
pyro.sample("x3", dist.Bernoulli(0.3).expand_by([num_particles]))
pyro.sample("x1", dist.Bernoulli(0.1))
Member:

nit: outdent

Member Author:

My bad, fixed.

pyro.sample("y", d.expand_by([inner_dim, 1, num_particles]))
with context1, context2:
pyro.sample("z", d.expand_by([inner_dim, outer_dim, num_particles]))
context1 = pyro.iarange("outer", outer_dim, dim=-2)
Member:

Can we change these dims to -1,-2 now?

Member Author:

Done.

@@ -26,35 +26,35 @@ def test_subsample_gradient(Elbo, reparameterized, subsample):
pyro.clear_param_store()
data = torch.tensor([-0.5, 2.0])
subsample_size = 1 if subsample else len(data)
num_particles = 50000
precision = 0.06
Normal = dist.Normal if reparameterized else fakes.NonreparameterizedNormal

def model(subsample):
@fritzo (Member), Jun 6, 2018:

I think you should wrap this model and guide with @poutine.broadcast rather than rely on the implicit @poutine.broadcast implied by num_particles>1 and vectorize_particles=True. While it is correct without the explicit decorator, it is better style to be explicit, and users may look to our tests for idiomatic usage.

@neerajprad (Member Author), Jun 6, 2018:

vectorize_particles with num_particles=1 would still use @poutine.broadcast, so as not to have some special surprising behavior for num_particles=1. That was part of the reason why I thought of calling it auto_vectorize, to emphasize that it's a bit more general than adding an outer iarange and could be used as a replacement for @poutine.broadcast with or without num_particles. That said, vectorize_particles is a clearer arg name, but it makes the broadcasting side effect seem unexpected. I will change to using @poutine.broadcast, unless there is a better name we can use for this arg.


@pytest.mark.parametrize('enumerate_', [None, "sequential", "parallel"])
def test_enum_discrete_vectorized_num_particles(enumerate_):
num_particles = 50
Member:

Can you @pytest.mark.parametrize('num_particles', [1, 10]) to test the single-particle edge case?

Member Author:

Good idea, will do.
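The suggested parametrization would look like this (test body elided to a placeholder assertion):

```python
import pytest

@pytest.mark.parametrize('num_particles', [1, 10])
@pytest.mark.parametrize('enumerate_', [None, "sequential", "parallel"])
def test_enum_discrete_vectorized_num_particles(enumerate_, num_particles):
    # Body elided; the point is that num_particles=1 exercises the
    # single-particle edge case through the same vectorized code path.
    assert num_particles >= 1
```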

@neerajprad force-pushed the parallel-numparticles branch from 31371c5 to 76fa97c on June 6, 2018, 22:24
@neerajprad removed the Blocked label Jun 6, 2018
@fritzo previously approved these changes Jun 7, 2018
elbo.loss_and_grads(self.model, self.guide, reparameterized=reparameterized)

for i in range(1, N + 1):
for param_prefix in ["loc_q_%d", "log_sig_q_%d", "kappa_q_%d"]:
if i == N and param_prefix == 'kappa_q_%d':
continue
actual_grad = pyro.param(param_prefix % i).grad
assert_equal(actual_grad, 0.0 * actual_grad, prec=0.20, msg="".join([
assert_equal(actual_grad, 0.0 * actual_grad, prec=0.10, msg="".join([
Member:

Nice!

@@ -15,6 +23,11 @@ class ELBO(object):
:func:`pyro.iarange` contexts. This is only required to enumerate over
sample sites in parallel, e.g. if a site sets
``infer={"enumerate": "parallel"}``.
:param bool vectorize_particles: Vectorize ELBO computation over `num_particles`.
Member:

consider adding more info:
"Whether to vectorize the ELBO computation over num_particles. Defaults to False. This requires static structure in model and guide."

Member Author:

Thanks, will update.

@neerajprad (Member Author):

The failure is due to the observations library being unable to download MNIST, and is unrelated. I am fine with merging this PR as is, once tests pass. I will address the slowness in .expand and add more warnings separately.

@jpchen (Member) left a comment:

lgtm

@fritzo merged commit 865bff6 into pyro-ppl:dev Jun 8, 2018
breakdawn pushed a commit to breakdawn/pyro that referenced this pull request Aug 15, 2018