
Fixes for PyTorch 1.7 release #2683

Merged 11 commits into dev on Nov 17, 2020

Conversation

@fritzo (Member) commented Oct 28, 2020

See release notes

Tasks

  • replace .expand(...) -> .expand(...).clone() if the result must support .__setitem__()

  • update to use torch.fft; see Deprecate old fft functions pytorch/pytorch#44876 (comment)
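The gist of the torch.fft migration (a sketch of the API change, not the exact Pyro diff) is that the old function-style torch.rfft returned a real tensor with a trailing (re, im) dimension, while the new torch.fft module returns complex tensors directly:

```python
import torch

t = torch.randn(8)
# Old, deprecated in 1.7:  torch.rfft(t, signal_ndim=1)
# returned a real tensor of shape (5, 2) packing (re, im) pairs.
f = torch.fft.rfft(t)        # new API: complex output of length n // 2 + 1
r = torch.fft.irfft(f, n=8)  # round-trip back to the real signal
```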

  • fix examples/sparse_regression.py

    To reproduce, run python -m pdb -cc examples/sparse_regression.py --num-steps=2 --num-data=50 --num-dimensions 20

    Traceback (most recent call last):
    File "/Users/fobermey/opt/miniconda3/envs/pyro/lib/python3.7/site-packages/numpy/lib/shape_base.py", line 867, in split
      len(indices_or_sections)
    TypeError: object of type 'int' has no len()
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
    File "/Users/fobermey/opt/miniconda3/envs/pyro/lib/python3.7/pdb.py", line 1697, in main
      pdb._runscript(mainpyfile)
    File "/Users/fobermey/opt/miniconda3/envs/pyro/lib/python3.7/pdb.py", line 1566, in _runscript
      self.run(statement)
    File "/Users/fobermey/opt/miniconda3/envs/pyro/lib/python3.7/bdb.py", line 585, in run
      exec(cmd, globals, locals)
    File "<string>", line 1, in <module>
    File "/Users/fobermey/github/pyro-ppl/pyro/examples/sparse_regression.py", line 4, in <module>
      import argparse
    File "/Users/fobermey/github/pyro-ppl/pyro/examples/sparse_regression.py", line 290, in main
      median['sigma'].double())
    File "/Users/fobermey/opt/miniconda3/envs/pyro/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
      return func(*args, **kwargs)
    File "/Users/fobermey/github/pyro-ppl/pyro/examples/sparse_regression.py", line 173, in compute_posterior_stats
      active_quadratic_dims = np.split(active_quadratic_dims, active_quadratic_dims.shape[0])
    File "<__array_function__ internals>", line 6, in split
    File "/Users/fobermey/opt/miniconda3/envs/pyro/lib/python3.7/site-packages/numpy/lib/shape_base.py", line 871, in split
      if N % sections:
    ZeroDivisionError: integer division or modulo by zero
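The ZeroDivisionError comes from numpy: np.split(a, sections) with an integer sections computes N % sections, so sections == 0 (i.e. no active quadratic dimensions) divides by zero. A sketch of the failure and one obvious guard (the guard is an assumption, not necessarily the fix that landed):

```python
import numpy as np

active = np.empty((0, 2))  # no active quadratic dimensions
try:
    np.split(active, active.shape[0])  # sections == 0 -> N % 0
except ZeroDivisionError:
    pass
# one possible guard: skip the split entirely when nothing is active
chunks = [] if active.shape[0] == 0 else np.split(active, active.shape[0])
```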
    
  • fix multi-chain MCMC, failing in the baseball example

    The baseball example is failing. To reproduce:

    python examples/baseball.py --num-samples=200 --warmup-steps=100 --jit
    
    RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).
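The detach-before-serialize workaround that the error message suggests can be sketched as follows (sanitize_args is a hypothetical helper name, not the function in the Pyro codebase):

```python
import torch

def sanitize_args(args):
    # Detach tensors before pickling them across process boundaries;
    # this drops requires_grad without copying the underlying data.
    return [a.detach() if torch.is_tensor(a) else a for a in args]

x = torch.ones(3, requires_grad=True) * 2.0  # non-leaf, requires_grad=True
safe = sanitize_args([x, "label", 1])
```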
    

Tested

Ran tests locally against torch==1.7. (Note: CI still uses PyTorch 1.6, since that is the oldest supported version.)

  • pytest -vx --stage unit
  • pytest -vx --stage integration
  • make test-tutorials
  • pytest -vx --stage test_examples (currently failing)

@fritzo fritzo added the WIP label Oct 28, 2020
@fritzo fritzo added this to the 1.5.1 milestone Oct 28, 2020
@fehiepsi fehiepsi self-assigned this Nov 14, 2020
@fehiepsi fehiepsi removed their assignment Nov 15, 2020
@fritzo (Member Author) commented Nov 16, 2020

Thanks for helping, @fehiepsi 🎉

@fritzo (Member Author) commented Nov 16, 2020

@martinjankowiak can you please take a look at your failing sparse_regression.py example?

@neerajprad @fehiepsi any idea how to fix multi-chain mcmc in the baseball example?

@fritzo fritzo mentioned this pull request Nov 16, 2020
@fritzo (Member Author) commented Nov 16, 2020

@neerajprad I'm seeing weird unexpected .requires_grad in the baseball example. I think PyTorch 1.7 might be overly propagating .requires_grad during jitting, even to .shape (which under the jit is a tensor). 😕

@neerajprad (Member) replied:

> @neerajprad I'm seeing weird unexpected .requires_grad in the baseball example. I think PyTorch 1.7 might be overly propagating .requires_grad during jitting, even to .shape (which under the jit is a tensor). 😕

I'll take a look at this, @fritzo.

Review thread on pyro/infer/mcmc/api.py:

    # at https://github.com/pytorch/pytorch/issues/10375
    # This also resolves "RuntimeError: Cowardly refusing to serialize non-leaf tensor which
    # requires_grad", which happens with `jit_compile` under PyTorch 1.7
    args = [arg.clone().detach() if torch.is_tensor(arg) else arg for arg in args]
Member commented:

@neerajprad Could you double-check if this is a good solution? This resolves the issues for:

  • cuda + jit/nojit
  • cpu + jit

There is no issue with cpu + nojit, so should we only apply this under the `if self.num_chains > 1` check above?

Member commented:

How about we remove the num_chains restriction and simply detach (instead of cloning)? That should be very cheap, and we can do it as a sanity measure anyway. I think @fritzo has correctly identified that the jit is incorrectly propagating up requires_grad, so this seems like a bigger problem.

Member commented:

Thanks! It seems that .detach() solves the issue. Removing the num_chains restriction helps for single-chain too (without detach, baseball failed even with num_chains=1!). I think the problem comes from this line of Binomial.log_prob. If I change k * self.logits to (k + 0) * self.logits, the problem goes away without having to detach... That observation agrees with: the jit is incorrectly propagating up requires_grad.

Member commented:

Thanks, @fehiepsi. I wonder if the problem is due to the cached logits. I have seen this previously for transforms where the cached value has its requires_grad set during backward, but in that case we get an error saying that JIT cannot insert a constant with a requires_grad attribute set. That doesn't explain why adding 0 helps take care of this.

Member commented:

I'm not sure. I remember that the issue still happened when I changed self.logits to self.probs.

(Outdated review thread on pyro/infer/mcmc/api.py, resolved.)
@fritzo (Member Author) commented Nov 17, 2020

Thanks again for fixing the baseball example @fehiepsi! I think this is ready to merge.

@neerajprad neerajprad merged commit ae55140 into dev Nov 17, 2020
@fritzo fritzo deleted the torch-1.7-fixes branch September 27, 2021 14:47
4 participants