# This is a combination of 197 commits.
# This is the 1st commit message: Add Gaussian negative log likelihood loss # This is the commit message #2: flake8 compliance of test file # This is the commit message #3: flake8 compliance loss math description # This is the commit message #4: flake8 compliance loss docstring # This is the commit message #5: Fix tests and docs # This is the commit message #6: Add loss to init script # This is the commit message #7: Change eps # This is the commit message #8: Fix test and docs # This is the commit message #9: Cleaner docs and fix tests # This is the commit message #10: Update docs for var clamping change # This is the commit message #11: Fix overridetests # This is the commit message #12: Fix reduction mode bug and var view bug # This is the commit message #13: Update class init to have kwargs # This is the commit message #14: Add note and reference to docs # This is the commit message #15: Fix typos # This is the commit message #16: Preserve memory format in qconv op (#49533) Summary: * qconv used to return NHWC no matter the input format * this change returns NCHW format if the input was NCHW Pull Request resolved: https://github.com/pytorch/pytorch/pull/49533 Test Plan: pytest test/quantization/test_quantized_op.py::\ TestQuantizedConv::test_qconv2d_preserve_mem_format Fixes https://github.com/pytorch/pytorch/issues/47295 Reviewed By: kimishpatel Differential Revision: D25609205 Pulled By: axitkhurana fbshipit-source-id: 83f8ca4a1496a8a4612fc3da082d727ead257ce7 # This is the commit message #17: Added linalg.inv (#48261) Summary: This PR adds `torch.linalg.inv` for NumPy compatibility. `linalg_inv_out` uses in-place operations on provided `result` tensor. I modified `apply_inverse` to accept tensor of Int instead of std::vector, that way we can write a function similar to `linalg_inv_out` but removing the error checks and device memory synchronization. I fixed `lda` (leading dimension parameter which is max(1, n)) in many places to handle 0x0 matrices correctly. 
Zero batch dimensions are also working and tested. Ref https://github.com/pytorch/pytorch/issues/42666 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48261 Reviewed By: ngimel Differential Revision: D25690129 Pulled By: mruberry fbshipit-source-id: edb2d03721f22168c42ded8458513cb23dfdc712 # This is the commit message #18: Mod lists to neutral+descriptive terms in caffe2/docs (#49803) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49803 Per "https://fb.workplace.com/groups/e/permalink/3320810064641820/" we can no longer use the terms "whitelist" and "blacklist", and editing any file containing them results in a critical error signal. Let's embrace the change. This diff changes "blacklist" to "blocklist" in a number of non-interface contexts (interfaces would require more extensive testing and might interfere with reading stored data, so those are deferred until later). Test Plan: Sandcastle Reviewed By: vkuzo Differential Revision: D25686924 fbshipit-source-id: 117de2ca43a0ea21b6e465cf5082e605e42adbf6 # This is the commit message #19: Improve docs for scatter and gather functions (#49679) Summary: - Add warning about non-unique indices - And note that these functions don't broadcast - Add missing `torch.scatter` and `torch.scatter_add` doc entries - Fix parameter descriptions - Improve code examples to make indexing behaviour easier to understand Closes gh-48214 Closes gh-26191 Closes gh-37130 Closes gh-34062 xref gh-31776 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49679 Reviewed By: mruberry Differential Revision: D25693660 Pulled By: ngimel fbshipit-source-id: 4983e7b4efcbdf1ab9f04e58973b4f983e8e43a4 # This is the commit message #20: removes more unused THC functions (#49788) Summary: per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/49788 Reviewed By: mruberry Differential Revision: D25693328 Pulled By: ngimel fbshipit-source-id: 244a096214d110e4c1a94f2847ff8457f1afb0d1 # This is 
the commit message #21: [pt][quant] Make the CUDA fake quantize logic consistent with CPU fake quantize logic (#49808) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49808 In PyTorch, it uses `dst = std::nearbyint(src * inv_scale) + zero_point` instead of the LEGACY `dst = std::nearbyint(src * inv_scale + zero_point)`. However, the CUDA implementation doesn't match this. This Diff makes the CPU and CUDA implementation consistent. - FBGEMM code pointer: https://github.com/pytorch/FBGEMM/blob/master/include/fbgemm/QuantUtils.h#L76-L80 - PyTorch code pointer: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/quantized/affine_quantizer.cpp#L306 Test Plan: CI Reviewed By: dskhudia Differential Revision: D25694235 fbshipit-source-id: 0a615e559132aafe18543deac1ea5028dd840cb9 # This is the commit message #22: [numpy] `torch.erfinv`: promote integer inputs to float (#49155) Summary: Reference: https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49155 Reviewed By: ngimel Differential Revision: D25664234 Pulled By: mruberry fbshipit-source-id: 630fd1d334567d78c8130236a67dda0f5ec02560 # This is the commit message #23: [reland] Early terminate when CUDA assert were thrown (#49799) Summary: this is a reland of https://github.com/pytorch/pytorch/issues/49527. fixed slow test not running properly in py36 because capture_output is introduced in py37. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49799 Reviewed By: janeyx99 Differential Revision: D25692616 Pulled By: walterddr fbshipit-source-id: 9c5352220d632ec8d7464e5f162ffb468a0f30df # This is the commit message #24: Fix typo in complex autograd docs (#49755) Summary: Update complex autograd docs to fix a typo Pull Request resolved: https://github.com/pytorch/pytorch/pull/49755 Reviewed By: mruberry Differential Revision: D25692649 Pulled By: soulitzer fbshipit-source-id: 43c2113b4c8f2d1828880102189a5a9b887dc784 # This is the commit message #25: Revert D25690129: [pytorch][PR] Added linalg.inv Test Plan: revert-hammer Differential Revision: D25690129 (https://github.com/pytorch/pytorch/commit/8554b58fbdd865c760d92bfa50c1119cc8fc65e9) Original commit changeset: edb2d03721f2 fbshipit-source-id: 8679ea18e637423d35919544d2b047a62ac3abd8 # This is the commit message #26: Creation of test framework for Sparse Operators (#48488) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/48488 Reviewed By: ngimel Differential Revision: D25696487 Pulled By: mruberry fbshipit-source-id: dc4f57c6628f62b74dd321f3f6b0fff86f25b040 # This is the commit message #27: Revert D25692616: [pytorch][PR] [reland] Early terminate when CUDA assert were thrown Test Plan: revert-hammer Differential Revision: D25692616 (https://github.com/pytorch/pytorch/commit/e6a215592ea5b7f7f7e59e89116b507089bfb8d0) Original commit changeset: 9c5352220d63 fbshipit-source-id: dade8068cad265d15ee908d98abe0de5b81a195d # This is the commit message #28: [quant][graphmode][fx] Standalone module support {input/output}_quantized_idxs (#49754) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49754 This PR adds the support for {input/output}_quantized_idxs for standalone module. 
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float input and produces float output, and quantizes the input and dequantizes the output internally. If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized input and produces quantized output; the input is quantized in the parent module and the output is dequantized in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d. For more details, please see the test case. Test Plan: python test/test_quantization.py TestQuantizeFx.test_standalone_module Imported from OSS Reviewed By: raghuramank100 Differential Revision: D25684692 fbshipit-source-id: 900360e01c0e35b26fe85f4a887dc1fd6f7bfb66 # This is the commit message #29: Clip small scales to fp16 min Summary: When the FC output min max range is very small, we want to enforce a cutoff on the scale parameter to better generalize for future values that could fall beyond the original range. Test Plan: More analysis about the output distributions can be found in N425166 An example workflow using fp16 min clipping is f240972205 Reviewed By: jspark1105 Differential Revision: D25681249 fbshipit-source-id: c4dfbd3ee823886afed06e6c2eccfc29d612f7e6 # This is the commit message #30: Revert D25684692: [quant][graphmode][fx] Standalone module support {input/output}_quantized_idxs Test Plan: revert-hammer Differential Revision: D25684692 (https://github.com/pytorch/pytorch/commit/89b4899ea5363fd69872c0cabf0dedea2dc533c8) Original commit changeset: 900360e01c0e fbshipit-source-id: 8b65fa8fbc7b364fbddb5f23cc696cd9b7db98cd # This is the commit message #31: [numpy] `torch.digamma` : promote integer inputs to float (#48302) Summary: **BC-breaking Note:** This PR updates PyTorch's digamma function to be consistent with SciPy's special.digamma function.
This changes the result of the digamma function on the nonpositive integers, where the gamma function is not defined. Since the gamma function is undefined at these points, the (typical) derivative of the logarithm of the gamma function is also undefined at these points, and for negative integers this PR updates digamma to return NaN. For zero, however, it returns -inf to be consistent with SciPy. Interestingly, SciPy made a similar change, which was noticed by at least one user: https://github.com/scipy/scipy/issues/9663#issue-396587679. SciPy's returning of negative infinity at zero is intentional: https://github.com/scipy/scipy/blob/59347ae8b86bcc92c339efe213128f64ab6df98c/scipy/special/cephes/psi.c#L163 This change is consistent with the C++ standard for the gamma function: https://en.cppreference.com/w/cpp/numeric/math/tgamma **PR Summary:** Reference https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48302 Reviewed By: ngimel Differential Revision: D25664087 Pulled By: mruberry fbshipit-source-id: 1168e81e218bf9fe5b849db0e07e7b22e590cf73 # This is the commit message #32: early termination of CUDA tests (#49869) Summary: This is follow up on https://github.com/pytorch/pytorch/issues/49799. * uses `torch.cuda.synchronize()` to validate CUDA assert instead of inspecting error message. * remove non CUDA tests. hopefully can reproduce why slow_tests fails but not normal test. since the test still runs for >1min. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49869 Reviewed By: mruberry Differential Revision: D25714385 Pulled By: walterddr fbshipit-source-id: 04f8ccb50d8c9ee42826a216c49baf90285b247f # This is the commit message #33: [*.py] Rename "Arguments:" to "Args:" (#49736) Summary: I've written custom parsers and emitters for everything from docstrings to classes and functions. 
However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings. ```sh (pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done Args: 1095 Arguments: 0336 ``` It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per: - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md) - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md) - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst) Therefore, only `Args:` is valid. This PR replaces them throughout the codebase. PS: For related PRs, see tensorflow/tensorflow/pull/45420 PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation. 
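For illustration only, the rename described above can be sketched as a small Python helper. This is a hypothetical script, not the tooling used for this PR; the function name `rename_arguments_header` and the sample docstring are assumptions made for the example. It rewrites only lines that consist solely of the `Arguments:` header, so prose that merely mentions the word is left alone.

```python
import re

# Hypothetical helper (not the PR's actual tooling): match a line that
# contains nothing but optional indentation and the "Arguments:" header.
_HEADER = re.compile(r"^([ \t]*)Arguments:[ \t]*$", re.MULTILINE)


def rename_arguments_header(source: str) -> str:
    """Return `source` with standalone `Arguments:` headers renamed to `Args:`."""
    # Keep the original indentation (group 1) and swap only the header text.
    return _HEADER.sub(r"\1Args:", source)


# Made-up sample input for the sketch.
example = '''def double(x):
    """Double x.

    Arguments:
        x: value to double
    """
    return 2 * x
'''

converted = rename_arguments_header(example)
print(converted)
```

Anchoring the pattern to a whole line is a deliberate choice here: it avoids rewriting sentences such as "see Arguments: x" inside ordinary prose or string literals.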
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736 Reviewed By: albanD Differential Revision: D25710534 Pulled By: soumith fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619 # This is the commit message #34: Support the `in` operator with str (#47057) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47057 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D24863370 Pulled By: ansley fbshipit-source-id: 5d17165b06052f0a4676537c5f6757083185a591 # This is the commit message #35: [NNC] masked fill (#49627) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49627 There was a bug in the test that was hidden by the `If eager mode doesn't support a dtype/op/device combo` try / catch, so cuda wasn't being tested. The fix is just to rename `aten::masked_fill` to `aten_masked_fill`. Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D25696409 Pulled By: eellison fbshipit-source-id: 83de1f5a194df54fe317b0035d4a6c1aed1d19a0 # This is the commit message #36: [JIT] Constant prop getattr (#49806) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49806 Fix for https://github.com/pytorch/pytorch/issues/47089 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D25696791 Pulled By: eellison fbshipit-source-id: 914c17b8effef7f4f341775ac2b8150ee4703efd # This is the commit message #37: fx quant: hook up ConvTranspose{n}d (#49717) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49717 Quantization of `ConvTranspose{n}d` is supported in Eager mode. This PR adds the support for FX graph mode. Note: this currently only works in `qnnpack` because per-channel weights are not supported by quantized conv transpose. In a future PR we should throw an error when someone tries to quantize a ConvTranspose model with per-channel weight observers until this is fixed.
Test Plan: ``` python test/test_quantization.py TestQuantizeFxOps.test_conv_transpose_1d python test/test_quantization.py TestQuantizeFxOps.test_conv_transpose_2d ``` Imported from OSS Reviewed By: jerryzh168 Differential Revision: D25674636 fbshipit-source-id: b6948156123ed55db77e6337bea10db956215ae6 # This is the commit message #38: fx quant: split linear test cases (#49740) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49740 1. Separates the module and functional linear test cases. 2. Combines the test case which tests for linear bias observation into the main linear test case, as requested in https://github.com/pytorch/pytorch/pull/49628. Test Plan: ``` python test/test_quantization.py TestQuantizeFxOps.test_linear_module python test/test_quantization.py TestQuantizeFxOps.test_linear_functional ``` Imported from OSS Reviewed By: jerryzh168 Differential Revision: D25681272 fbshipit-source-id: 0ed0ebd5afb8cdb938b530f7dbfbd79798eb9318 # This is the commit message #39: Implement torch.linalg.qr (#47764) Summary: I am opening this PR early to have a place to discuss design issues. The biggest difference between `torch.qr` and `numpy.linalg.qr` is that the former `torch.qr` takes a boolean parameter `some=True`, while the latter takes a string parameter `mode='reduced'` which can be one of the following: `reduced` this is completely equivalent to `some=True`, and both are the default. `complete` this is completely equivalent to `some=False`. `r` this returns only `r` instead of a tuple `(r, q)`. We have already decided that we don't want different return types depending on the parameters, so I propose to return `(r, empty_tensor)` instead. I **think** that in this mode it will be impossible to implement the backward pass, so we should raise an appropriate error in that case. `raw` in this mode, it returns `(h, tau)` instead of `(q, r)`. 
Internally, `h` and `tau` are obtained by calling lapack's `dgeqrf` and are later used to compute the actual values of `(q, r)`. The numpy docs suggest that these might be useful to call other lapack functions, but at the moment none of them is exposed by numpy and I don't know how often it is used in the real world. I suppose implementing the backward pass needs attention too: the most straightforward solution is to use `(h, tau)` to compute `(q, r)` and then use the normal logic for `qr_backward`, but there might be faster alternatives. `full`, `f` alias for `reduced`, deprecated since numpy 1.8.0. `economic`, `e` similar to `raw` but it returns only `h` instead of `(h, tau)`. Deprecated since numpy 1.8.0. To summarize: * `reduced`, `complete` and `r` are straightforward to implement. * `raw` needs a bit of extra care, but I don't know how high a priority it is: since it is used rarely, we might want to not support it right now and maybe implement it in the future? * I think we should just leave `full` and `economic` out, and possibly add a note to the docs explaining what you need to use instead /cc mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/47764 Reviewed By: ngimel Differential Revision: D25708870 Pulled By: mruberry fbshipit-source-id: c25c70a23a02ec4322430d636542041e766ebe1b # This is the commit message #40: Fix errata (#49903) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49903 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D25718411 Pulled By: ansley fbshipit-source-id: 0cc365c5a53077752dc1c5a5c4a65b873baa3604 # This is the commit message #41: Update gather documentation to allow index.shape[k] <= input.shape[k] rather than ==.
(#41887) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41887 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D22680014 Pulled By: gchanan fbshipit-source-id: b162fccabc22a1403c0c43c1131f0fbf4689a79d # This is the commit message #42: Enable tests using named temp files on Windows (#49640) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49640 Reviewed By: ngimel Differential Revision: D25681548 Pulled By: malfet fbshipit-source-id: 0e2b25817c98d749920cb2b4079033a2ee8c1456 # This is the commit message #43: added fuse_op and list_construct - list_unpack pass Summary: Added fuse_op and list_construct and list_unpack pass Test Plan: jit_graph_opt_test.py jit_graph_optimizer_test.cc sparsenn_fused_operator_test.py Reviewed By: qizzzh Differential Revision: D25715079 fbshipit-source-id: fa976be53135a83f262b8f2e2eaedadd177f46c4 # This is the commit message #44: Clean up type annotations in caffe2/torch/nn/modules (#49938) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49938 Test Plan: Sandcastle tests Reviewed By: xush6528 Differential Revision: D25718705 fbshipit-source-id: 6a9e3e6d17aa458726cd32aa0a71a63c51b601d9 # This is the commit message #45: [Tensorexpr]Copying header files in tensorexpr dir (#49933) Summary: Previously header files from jit/tensorexpr were not copied, this PR should enable copying. This will allow other OSS projects like Glow to use TE.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49933 Reviewed By: Krovatkin, mruberry Differential Revision: D25725927 Pulled By: protonu fbshipit-source-id: 9d5a0586e9b73111230cacf044cd7e8f5c600ce9 # This is the commit message #46: Clean up some type annotations in caffe2/torch/quantization (#49942) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49942 Upgrades type annotations from Python2 to Python3 Test Plan: Sandcastle tests Reviewed By: vkuzo Differential Revision: D25717551 fbshipit-source-id: 1b63dc485ecf6641641b05f7ce095ae1d2d87346 # This is the commit message #47: Revert D25718705: Clean up type annotations in caffe2/torch/nn/modules Test Plan: revert-hammer Differential Revision: D25718705 (https://github.com/pytorch/pytorch/commit/891759f8609f300203d41cccc7337089b38858bd) Original commit changeset: 6a9e3e6d17aa fbshipit-source-id: 1a4ef0bfdec8eb8e7ce149bfbdb34a4ad8d964b6 # This is the commit message #48: added List as an option to the unflattened_size (#49838) Summary: Fixes https://github.com/pytorch/pytorch/issues/49743 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49838 Reviewed By: mruberry Differential Revision: D25727971 Pulled By: ngimel fbshipit-source-id: 60142dae84ef107f0083676a2a78ce6b0472b7e1 # This is the commit message #49: Fix auto exponent issue for torch.pow (#49809) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49809 Fixes https://github.com/pytorch/xla/issues/2688 #46936 Test Plan: Imported from OSS Reviewed By: nikithamalgifb Differential Revision: D25724176 Pulled By: anjali411 fbshipit-source-id: 16287a1f481e9475679b99d6fb45de840da225be # This is the commit message #50: Adding JIT support for cuda streams and events (#48020) Summary: ======= This PR addresses the following: * Adds JIT support for CUDA Streams * Adds JIT support for CUDA Events * Adds JIT support for CUDA Stream context manager Testing: ====== python test/test_jit.py -v TestCUDA Pull 
Request resolved: https://github.com/pytorch/pytorch/pull/48020 Reviewed By: navahgar Differential Revision: D25725749 Pulled By: nikithamalgifb fbshipit-source-id: b0addeb49630f8f0c430ed7badeca43bb9d2535c # This is the commit message #51: Remove THPWrapper (#49871) Summary: Remove `THPWrapper` from PyTorch C code since it is not used anymore and because we have dropped Python 2 compatibility, its usage can be replaced by capsule objects (`PyCapsule_New`, `PyCapsule_CheckExact`, `PyCapsule_GetPointer` and `PyCapsule_GetDestructor`). Pull Request resolved: https://github.com/pytorch/pytorch/pull/49871 Reviewed By: mruberry Differential Revision: D25715038 Pulled By: albanD fbshipit-source-id: cc3b6f967bbe0dc42c692adf76dff4e4b667fdd5 # This is the commit message #52: Enable test_fusions TanhQuantize (#49970) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49970 enable test_fusions:test_tanhquantize Test Plan: https://internalfb.com/intern/testinfra/testrun/6755399469176694 Reviewed By: hyuen Differential Revision: D25732684 fbshipit-source-id: b8479e43b5248ba5510f0c78c993d534d3ffc2b0 # This is the commit message #53: [numpy] `torch.rsqrt` : promote integer inputs to float (#47909) Summary: Reference https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47909 Reviewed By: ngimel Differential Revision: D25730876 Pulled By: mruberry fbshipit-source-id: c87a8f686e1dd64e511640e0278021c4a584ccf2 # This is the commit message #54: Accept input tensor with 0-dim batch size for MultiLabelMarginLoss (#46975) Summary: Fix for one of the layers listed in https://github.com/pytorch/pytorch/issues/12013 or https://github.com/pytorch/pytorch/issues/38115 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46975 Reviewed By: mruberry Differential Revision: D25719980 Pulled By: ngimel fbshipit-source-id: 83414bad37c0b004bc7cced04df8b9c89bdba3e6 # This is the commit message #55: Fix a KaTeX crash
and many docstring issues (#49684) Summary: The first commit fixes the `MultiheadAttention` docstrings, which are causing a cryptic KaTeX crash. The second commit fixes many documentation issues in `torch/_torch_docs.py`, and closes gh-43667 (missing "Keyword arguments" headers). It also fixes a weird duplicate docstring for `torch.argmin`; there's more of these, it looks like they were written based on whether the C++ implementation has an overload. That makes little sense to a Python user though, and the content is simply duplicate. The `Shape:` heading for https://pytorch.org/docs/master/generated/torch.nn.MultiheadAttention.html looked bad, here's what it looks like with this PR: <img width="475" alt="image" src="https://user-images.githubusercontent.com/98330/102797488-09a44e00-43b0-11eb-8788-acdf4e936f2f.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/49684 Reviewed By: ngimel Differential Revision: D25730909 Pulled By: mruberry fbshipit-source-id: d25bcf8caf928e7e8e918017d119de12e10a46e9 # This is the commit message #56: Remove incorrect usage of layout(std430) on uniform buffers, correctly now treated as error in the latest release of Vulkan SDK. (#49572) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49572 Differential Revision: D25729888 Test Plan: Imported from OSS Reviewed By: SS-JIA Pulled By: AshkanAliabadi fbshipit-source-id: 15dd4acef3dfae72f03e7e3085b1ff5936becf3d # This is the commit message #57: quant docs: add common errors section (#49902) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49902 Adds a common errors section, and details the two errors we see often on the discuss forums, with recommended solutions. Test Plan: build the docs on Mac OS, the new section renders correctly. 
Reviewed By: supriyar Differential Revision: D25718195 Pulled By: vkuzo fbshipit-source-id: c5ef2b24831d18d57bbafdb82d26d8fbf3a90781 # This is the commit message #58: [quant] Quantizable LSTM (#49671) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49671 - Introduces the `torch.nn.quantizable` namespace - Adds the `torch.nn.quantizable.LSTM` module The point of the `quantizable` namespace is to segregate the purely quantized modules with the modules that could be quantized through a normal quantization flow, but are not using the quantized kernels explicitly. That means the quantizable modules are functionally and numerically equivalent to the FP ones and can be used instead of the FP ones without any loss. The main difference between the `torch.nn.LSTM` and the `torch.nn.quantizable.LSTM` is that the former one does not support observation for the linear layers, because all the computation is internal to the `aten` namespace. The `torch.nn.quantizable.LSTM`, however, uses explicit linear layers that can be observed for further quantization. Test Plan: Imported from OSS Differential Revision: D25663870 Reviewed By: vkuzo Pulled By: z-a-f fbshipit-source-id: 70ff5463bd759b9a7922571a5712d3409dfdfa06 # This is the commit message #59: [PyTorch] Decouple version numbers from c10 and caffe2 targets (#49905) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49905 There's size regression in model delivery in D25682312. Only the model version numbers are used. However, the dependency of the entire c10 (128 KB) is pulled in. This diff is to decouple the version numbers to a separate header file, versions.h. Other targets referring to version numbers only can have deps of ```caffe2:version_headers```. 
ghstack-source-id: 119161467 Test Plan: CI Reviewed By: xcheng16, guangyfb Differential Revision: D25716601 fbshipit-source-id: 07634bcf46eacfefa4aa75f2e4c9b9ee30c6929d # This is the commit message #60: Revert D25719980: [pytorch][PR] Accept input tensor with 0-dim batch size for MultiLabelMarginLoss Test Plan: revert-hammer Differential Revision: D25719980 (https://github.com/pytorch/pytorch/commit/6b56b71e61e14bf4de5b371f0d8f2f2029065b31) Original commit changeset: 83414bad37c0 fbshipit-source-id: 27eddd711a2b9e0adbc08bfab12100562e63ac21 # This is the commit message #61: Improve `torch.flatten` docs and add tests to test_view_ops (#49501) Summary: Addresses https://github.com/pytorch/pytorch/issues/39474 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49501 Reviewed By: mruberry Differential Revision: D25734450 Pulled By: soulitzer fbshipit-source-id: 993667dd07acd81a4616465e0a3b94bde449193e # This is the commit message #62: Fix inf norm grad (reland) (#48611) Summary: Reland of https://github.com/pytorch/pytorch/issues/48122 Does this result in a regression? No significant regression observed. 
Timer script: ``` import torch from torch.utils.benchmark import Timer setup=""" a = torch.rand((2, 2), requires_grad=True) gradient = torch.ones(2) """ stmt=""" torch.autograd.grad(torch.norm(a, dim=(0,), keepdim=False), a, gradient) """ timer = Timer(stmt, setup) print(timer.timeit(10000)) print(timer.collect_callgrind(100)) ``` Note: small matrix, keepdim is False, and dims is non-empty Before change ``` Runtime 37.37 us 1 measurement, 10000 runs , 1 thread All Noisy symbols removed Instructions: 15279045 15141710 Baseline: 4257 3851 100 runs per measurement, 1 thread ``` After change ``` Runtime 36.08 us 1 measurement, 10000 runs , 1 thread All Noisy symbols removed Instructions: 15296974 15153534 Baseline: 4257 3851 100 runs per measurement, 1 thread ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/48611 Reviewed By: albanD, mruberry Differential Revision: D25309997 Pulled By: soulitzer fbshipit-source-id: 5fb950dc9259234342985c0e84ada25a7e3814d6 # This is the commit message #63: Revert D25734450: [pytorch][PR] Improve `torch.flatten` docs and add tests to test_view_ops Test Plan: revert-hammer Differential Revision: D25734450 (https://github.com/pytorch/pytorch/commit/730965c246192c94c804e5ac4a95f175dca2fb18) Original commit changeset: 993667dd07ac fbshipit-source-id: 603af25311fc8b29bb033167f3b2704da79c3147 # This is the commit message #64: Remove flops warnings from the default profiler use case (#49896) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49896 Add missing check for with_flops option set Test Plan: python test/test_profiler.py CI Reviewed By: xuzhao9, ngimel Differential Revision: D25716930 Pulled By: ilia-cher fbshipit-source-id: 0da0bbb6c1a52328f665237e503406f877b41449 # This is the commit message #65: [c10/**] Fix typos (#49815) Summary: All pretty minor. I avoided renaming `class DestructableMock` to `class DestructibleMock` and similar such symbol renames (in this PR). 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49815 Reviewed By: VitalyFedyunin Differential Revision: D25734507 Pulled By: mruberry fbshipit-source-id: bbe8874a99d047e9d9814bf92ea8c036a5c6a3fd # This is the commit message #66: Back out "[pytorch][PR] Preserve memory format in qconv op" (#49994) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49994 Revert preserving memory format in qconv op because it is negatively affecting performance, will revert revert after fixing all issues Test Plan: pytest fbcode/caffe2/test/quantization/test_quantized_op.py Reviewed By: kimishpatel Differential Revision: D25731279 fbshipit-source-id: 908dbb127210a93b27ada7ccdfa531177edf679a # This is the commit message #67: Making ops c10-full: list of optional tensors (#49138) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49138 See for details: https://fb.quip.com/QRtJAin66lPN We need to model optional types explicitly, mostly for schema inference. So we cannot pass a `Tensor?[]` as `ArrayRef<Tensor>`, instead we need to pass it as an optional type. This PR changes it to `torch::List<c10::optional<Tensor>>`. It also makes the ops c10-full that were blocked by this. ## Backwards Compatibility - This should not break the Python API because the representation in Python is the same and python_arg_parser just transforms the python list into a `List<optional<Tensor>>` instead of into a `List<Tensor>`. - This should not break serialized models because there's some logic that allows loading a serialized `List<Tensor>` as `List<optional<Tensor>>`, see https://github.com/pytorch/pytorch/pull/49138/files#diff-9315f5dd045f47114c677174dcaa2f982721233eee1aa19068a42ff3ef775315R57 - This will break backwards compatibility for the C++ API. There is no implicit conversion from `ArrayRef<Tensor>` (which was the old argument type) to `List<optional<Tensor>>`. 
One common call pattern is `tensor.index({indices_tensor})`, where `indices_tensor` is another `Tensor`. That will continue working because the `{}` initializer_list constructor for `List<optional<Tensor>>` can take `Tensor` elements that are implicitly converted to `optional<Tensor>`. Another common call pattern was `tensor.index(indices_tensor)`, where the `Tensor` was previously implicitly converted to an `ArrayRef<Tensor>`. Implicitly converting `Tensor -> optional<Tensor> -> List<optional<Tensor>>` would require two implicit conversions, and C++ doesn't allow chaining two user-defined implicit conversions. So those call sites have to be rewritten to `tensor.index({indices_tensor})`. ghstack-source-id: 119269131 Test Plan:

## Benchmarks (C++ instruction counts):

### Forward

#### Script

```py
from torch.utils.benchmark import Timer

counts = Timer(
    stmt="""
        auto t = {{op call to measure}};
    """,
    setup="""
        using namespace torch::indexing;
        auto x = torch::ones({4, 4, 4});
    """,
    language="cpp",
).collect_callgrind(number=1_000)
print(counts)
```

#### Results

| Op call | before | after | delta | delta (%) |
|---------|--------|-------|-------|-----------|
| x[0] = 1 | 11566015 | 11566015 | 0 | 0.00% |
| x.index({0}) | 6807019 | 6801019 | -6000 | -0.09% |
| x.index({0, 0}) | 13529019 | 13557019 | 28000 | 0.21% |
| x.index({0, 0, 0}) | 10677004 | 10692004 | 15000 | 0.14% |
| x.index({"..."}) | 5512015 | 5506015 | -6000 | -0.11% |
| x.index({Slice(None, None, None)}) | 6866016 | 6936016 | 70000 | 1.02% |
| x.index({None}) | 8554015 | 8548015 | -6000 | -0.07% |
| x.index({false}) | 22400000 | 22744000 | 344000 | 1.54% |
| x.index({true}) | 27624088 | 27264393 | -359695 | -1.30% |
| x.index({"...", 0, true, Slice(1, None, 2), torch::tensor({1, 2})}) | 123472000 | 123463306 | -8694 | -0.01% |

### Autograd

#### Script

```py
from torch.utils.benchmark import Timer

counts = Timer(
    stmt="""
        auto t = {{op call to measure}};
    """,
    setup="""
        using namespace torch::indexing;
        auto x = torch::ones({4, 4, 4}, torch::requires_grad());
    """,
    language="cpp",
).collect_callgrind(number=1_000)
print(counts)
```

Note: the script measures the **forward** path of an op call with autograd enabled (i.e. calls into VariableType). It does not measure the backward path.

#### Results

| Op call | before | after | delta | delta (%) |
|---------|--------|-------|-------|-----------|
| x.index({0}) | 14839019 | 14833019 | -6000 | 0.00% |
| x.index({0, 0}) | 28342019 | 28370019 | 28000 | 0.00% |
| x.index({0, 0, 0}) | 24434004 | 24449004 | 15000 | 0.00% |
| x.index({"..."}) | 12773015 | 12767015 | -6000 | 0.00% |
| x.index({Slice(None, None, None)}) | 14837016 | 14907016 | 70000 | 0.47% |
| x.index({None}) | 15926015 | 15920015 | -6000 | 0.00% |
| x.index({false}) | 36958000 | 37477000 | 519000 | 1.40% |
| x.index({true}) | 41971408 | 42426094 | 454686 | 1.08% |
| x.index({"...", 0, true, Slice(1, None, 2), torch::tensor({1, 2})}) | 168184392 | 164545682 | -3638710 | -2.16% |

Reviewed By: bhosmer Differential Revision: D25454632 fbshipit-source-id: 28ab0cffbbdbdff1c40b4130ca62ee72f981b76d # This is the commit message #68: Add type annotations to _tensorboard_vis.py and hipify_python.py (#49834) Summary: closes gh-49833 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49834 Reviewed By: mruberry Differential Revision: D25725341 Pulled By: malfet fbshipit-source-id: 7454c7afe07a3ff829826afe02aba05b7f649d9b # This is the commit message #69: Run test_type_hints first (#49748) Summary: Since it is sort of a linter check and fails frequently Pull Request resolved: https://github.com/pytorch/pytorch/pull/49748 Reviewed By: vkuzo Differential Revision: D25682980 Pulled By: malfet fbshipit-source-id: 7dba28242dced0277bad56dc887d3273c1e9e575 # This is the commit message #70: Update update_s3_htmls.yml (#49934) Summary: It is now running for forks, and generates a lot of failure messages to owners of forks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49934 Reviewed By: mruberry Differential Revision: D25739552 Pulled By: seemethere fbshipit-source-id: 0f9cc430316c0a5e9972de3cdd06d225528c81c2 # This is the commit message #71: Improve `torch.flatten` docs and add tests to test_view_ops (#49501) Summary: Addresses https://github.com/pytorch/pytorch/issues/39474 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49501 Reviewed By: mrshenli Differential Revision: D25740586 Pulled By: soulitzer fbshipit-source-id: 3d7bdbab91eb208ac9e6832bb766d9d95a00c103 # This is the commit message #72: move to non-legacy magma v2 headers (#49978) Summary: We recently (https://github.com/pytorch/pytorch/issues/7582) dropped magma v1 support, but we were still including the legacy compatibility headers and using functions only provided by them. This changes the includes to the new magma_v2 header and fixes the triangular solve functions to use the v2-style magma_queue-using API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49978 Reviewed By: mrshenli Differential Revision: D25752499 Pulled By: ngimel fbshipit-source-id: 26d916bc5ce63978b341aefb072af228f140637d # This is the commit message #73: Enforce c10-fullness for all ops (#49619) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49619 This is a minimal-change PR that enforces that all operators are c10-full by making it the default. This does not clean up any code yet, that will happen in PRs stacked on top. But this PR already ensures that there are no non-c10-full ops left and there will be no non-c10-full ops introduced anymore. 
ghstack-source-id: 119269182 Test Plan: waitforsandcastle Reviewed By: bhosmer Differential Revision: D25650198 fbshipit-source-id: efc53e884cb53193bf58a4834bf148453e689ea1 # This is the commit message #74: .circleci: Ignore unbound variables for conda (#50053) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50053 For some reason conda likes to re-activate the conda environment when attempting this install which means that a deactivate is run and some variables might not exist when that happens, namely CONDA_MKL_INTERFACE_LAYER_BACKUP from libblas so let's just ignore unbound variables when it comes to the conda installation commands Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: samestep Differential Revision: D25760737 Pulled By: seemethere fbshipit-source-id: 9e7720eb8a4f8028dbaa7bcfc304e5c1ca73ad08 # This is the commit message #75: Construct CppSignatureGroup from NativeFunction (#49245) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49245 This will make it easier to implement the POC in https://github.com/peterbell10/pytorch/commit/d534f7d4c555a37fd178c143098b8537a5a05d61 see also https://github.com/pytorch/pytorch/pull/45666 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D25594005 Pulled By: ezyang fbshipit-source-id: e458d3dc3a765ec77425761b9b17f23769cecf9e # This is the commit message #76: Tighten up error checking on manual_kernel_registration (#49341) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49341 I noticed that #49097 was using manual_kernel_registration incorrectly, so this diff tightens up the testing so that: 1. We don't generate useless wrapper functions when manual_kernel_registration is on (it's not going to be registered, so it does nothing). 2. 
manual_kernel_registration shouldn't affect generation of functions in Functions.h; if you need to stop bindings, use manual_cpp_binding 3. Structured and manual_kernel_registration are a hard error 4. We raise an error if you set dispatch and manual_kernel_registration at the same time. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D25594003 Pulled By: ezyang fbshipit-source-id: 655b10e9befdfd8bc95f1631b2f48f995a31a59a # This is the commit message #77: codegen: Resolve overload ambiguities created by defaulted arguments (#49348) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49348 This is a redux of #45666 post refactor, based off of https://github.com/peterbell10/pytorch/commit/d534f7d4c555a37fd178c143098b8537a5a05d61 Credit goes to peterbell10 for the implementation. Fixes #43945. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D25594004 Pulled By: ezyang fbshipit-source-id: c8eb876bb3348308d6dc8ba7bf091a2a3389450f # This is the commit message #78: Move default or no default logic into native.argument (#49489) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49489 Previously, it was done at a use site, but that meant other use sites don't get the right logic. Pushing it in makes sure everyone gets it. I also fixed one case of confusion where defn() was used to define a decl(). If you want to define a declaration with no defaults, say no_default().decl() which is more direct and will give us code reviewers a clue if you should have pushed this logic in. Signed-off-by: Edward Z. 
Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D25595407 Pulled By: ezyang fbshipit-source-id: 89c664f0ed4d95699794a0d3123d54d0f7e4cba4 # This is the commit message #79: Make use_c10_dispatcher: full mandatory for structured kernels (#49490) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49490 No reason to let people to do the legacy thing for the brand new kernel. This simplifies the codegen. I have to port the two structured kernels to this new format. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D25595406 Pulled By: ezyang fbshipit-source-id: b5931873379afdd0f3b00a012e0066af05de0a69 # This is the commit message #80: Add trace batching forward/backward rule (#49979) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49979 Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D25734379 Pulled By: ejguan fbshipit-source-id: 8f9346afaf324e7ab17bafd6ecc97eed8442fd38 # This is the commit message #81: [pytorch] add threshold_backward batching for vmap (#49881) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49881 title Test Plan: pytest test/test_vmap.py -v -k "BatchedGrad" Reviewed By: zou3519 Differential Revision: D25711289 fbshipit-source-id: f1856193249fda70da41e36e15bc26ea7966b510 # This is the commit message #82: torch.xlogy: Use wrapped_scalar_tensor / gpu_with_scalars to speed up GPU kernel. (#49926) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49926 While investigating https://github.com/pytorch/pytorch/issues/49758, I changed the xlogy kernel to use the recommended wrapped_scaler_tensor pattern instead of moving the scalar to the GPU as a tensor. 
While this doesn't avoid a synchronization (there is no synchronization in the move, as its done via fill), this does significantly speed up the GPU kernel (almost ~50%, benchmark in PR comments). From looking at the nvprof output, it looks like this code path avoids broadcasting. Aside: this seems unnecessary, as there is nothing special from the point-of-view of broadcasting whether the Tensor is ()-sized or marked as a wrapped_scalar. Still, this is a useful change to make as we avoid extra kernel launches and dispatches to create and fill the tensor. Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D25724215 Pulled By: gchanan fbshipit-source-id: 4adcd5d8b3297502672ffeafc77e8af80592f460 # This is the commit message #83: [BE] unified run_process_no_exception code (#49774) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49774 Reviewed By: janeyx99 Differential Revision: D25756811 Pulled By: walterddr fbshipit-source-id: 4d2b3bd772572764ff96e5aad70323b58393e332 # This is the commit message #84: prohibit assignment to a sparse tensor (#50040) Summary: Fixes https://github.com/pytorch/pytorch/issues/48225 by prohibiting assignment to a sparse Tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50040 Reviewed By: mrshenli Differential Revision: D25757125 Pulled By: zou3519 fbshipit-source-id: 3db6f48932eb10bf6ca5e97a6091afcabb60e478 # This is the commit message #85: Suppress "statement is unreachable" warning (#49495) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49495 Compiling PyTorch currently generates a large number of warnings like this: ``` caffe2/aten/src/ATen/core/builtin_function.h(105): warning: statement is unreachable ``` The offending code ``` std::string pretty_print_schema() const override { TORCH_INTERNAL_ASSERT(false); return ""; } ``` has an unreachable return which prevents a "no return" warning. 
We resolve the situation by using NVCC's pragma system to suppress this warning within this function. Test Plan: The warning appears when running: ``` buck build mode/dev-nosan //caffe2/torch/fb/sparsenn:test ``` As well as a number of other build commands. Reviewed By: ngimel Differential Revision: D25546542 fbshipit-source-id: 71cddd4fdb5fd16022a6d7b2daf0e6d55e6e90e2 # This is the commit message #86: [ONNX] Handle Sub-block index_put in _jit_pass_onnx_remove_inplace_ops_for_onnx (#48734) Summary: For the added UT and existing UTs, this code is independent and ready for review. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48734 Reviewed By: izdeby Differential Revision: D25502677 Pulled By: bzinodev fbshipit-source-id: 788b4eaa5e5e8b5df1fb4956fbd25928127bb199 # This is the commit message #87: Dont inlinine intermediates on cpu (#49565) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49565 Test Plan: Imported from OSS Reviewed By: Krovatkin, ZolotukhinM Differential Revision: D25688271 Pulled By: eellison fbshipit-source-id: 9ea7858e2db4fb31292e04440fc72ee04623c688 # This is the commit message #88: Drop unused imports from scripts (#49956) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49956 From ``` ./python/libcst/libcst codemod remove_unused_imports.RemoveUnusedImportsWithGlean --no-format caffe2/ ``` Test Plan: Standard sandcastle tests Reviewed By: xush6528 Differential Revision: D25727347 fbshipit-source-id: 74d0a08aa0cfd0f492688a2b8278a0c65fd1deba # This is the commit message #89: Drop unused imports from leftovers (#49953) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49953 From ``` ./python/libcst/libcst codemod remove_unused_imports.RemoveUnusedImportsWithGlean --no-format caffe2/ ``` Test Plan: Standard sandcastle tests Reviewed By: xush6528 Differential Revision: D25727348 fbshipit-source-id: b3feef80b9b4b535f1bd4060dace5b1a50bd5e69 # This is the commit message 
#90: Clean up some type annotations in caffe2/contrib/aten/gen_op (#49945) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49945 Upgrades type annotations from Python2 to Python3 Test Plan: Sandcastle tests Reviewed By: xush6528 Differential Revision: D25717502 fbshipit-source-id: 718d93e8614e9d050f4da1c6bd4ac892bab98154 # This is the commit message #91: [ONNX] Modified var_mean symbolic to support more combinations of dims (#48949) Summary: Based on the existing implementation of var_mean, values of dim have to be sequential and start with zero. The formats listed below cause scenarios with incompatible dimensions for the Sub node. -> dim[1, 2] -> dim[0, 2] -> dim[2, 0] The changes in this PR allow such formats to be supported in var_mean Pull Request resolved: https://github.com/pytorch/pytorch/pull/48949 Reviewed By: houseroad Differential Revision: D25540272 Pulled By: SplitInfinity fbshipit-source-id: 59813a77ff076d138655cc8c17953358f62cf137 # This is the commit message #92: introduce a flag to disable aten::cat in TE (#49579) Summary: introduce a flag to disable aten::cat in TE Pull Request resolved: https://github.com/pytorch/pytorch/pull/49579 Reviewed By: eellison Differential Revision: D25763758 Pulled By: Krovatkin fbshipit-source-id: c4f4a8220964813202369a3383057e77e7f10cb0 # This is the commit message #93: Complex backward for indexing, slicing, joining, and mutating ops (#49552) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49552 This PR: 1. Migrates independent autograd test for `hstack`, `dstack`, `vstack`, `movedim`, `moveaxis` from `test_autograd.py` to the new `OpInfo` based tests. 2. Migrates autograd test for `gather`, `index_select` from the method_tests to the new `OpInfo` based tests. 3. Enables complex backward for `stack, gather, index_select, index_add_` and adds tests for complex autograd for all the above-mentioned ops.
Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D25682511 Pulled By: anjali411 fbshipit-source-id: 5d8f89db4a9ec340ab99a6196987d44a23e2c6c6 # This is the commit message #94: [FX] fix Graph python_code return type annotation (#49931) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49931 This fixes #49932. The `maybe_return_annotation` was not being passed by reference, so it was never getting modified. Test Plan: Imported from OSS Reviewed By: jamesr66a Differential Revision: D25725582 Pulled By: esqu1 fbshipit-source-id: 4136ff169a269d6b98f0b8e14d95d19e7c7cfa71 # This is the commit message #95: [TensorExpr] Fix LLVM 10 build after LLVM API changes Summary: Use `llvm::CodeGenFileType` for llvm-10+ Test Plan: local build Reviewed By: asuhan Differential Revision: D25694990 fbshipit-source-id: c35d973ef2669929715a94da5dd46e4a0457c4e8 # This is the commit message #96: unit test for fc parallelization aot (#50056) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50056 buck test //caffe2/caffe2/contrib/fakelowp/test:test_chunkingnnpi -- --fallback-classic Test Plan: https://our.intern.facebook.com/intern/testinfra/testrun/7036874446100155 Reviewed By: venkatacrc Differential Revision: D25731079 fbshipit-source-id: 4aa4ffc641659cd90bf4670d28cb43e43ae76dcd # This is the commit message #97: Fix return value of _vmap_internals._get_name (#49951) Summary: This appears to have been a copy-paste error. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49951 Reviewed By: mrshenli Differential Revision: D25757099 Pulled By: zou3519 fbshipit-source-id: e47cc3b0694645bd0025326bfe45852ef0266adf # This is the commit message #98: Fix grammar typo in readme.md (#50000) Summary: missing ` Pull Request resolved: https://github.com/pytorch/pytorch/pull/50000 Reviewed By: ezyang Differential Revision: D25759608 Pulled By: mrshenli fbshipit-source-id: 4dbe06b8978ae5b2b9b66cde163dab4bd8ee2257 # This is the commit message #99: Fixing error in Readme.md. (#50033) Summary: Fix incorrect command in readme. Fix incorrect url in readme. Add url for dockerfile. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50033 Reviewed By: ezyang Differential Revision: D25759567 Pulled By: mrshenli fbshipit-source-id: 2a3bc88c8717a3890090ddd0d6657f49d14ff05a # This is the commit message #100: Revert D25763758: [pytorch][PR] introduce a flag to disable aten::cat in TE Test Plan: revert-hammer Differential Revision: D25763758 (https://github.com/pytorch/pytorch/commit/9e0b4a96e48132190220820684033a77a92e8a33) Original commit changeset: c4f4a8220964 fbshipit-source-id: 98775ad9058b81541a010e646b0cf4864854be3e # This is the commit message #101: Patch death tests/fork use after D25292667 (part 3) Summary: (Note: this ignores all push blocking failures!) Test Plan: unit tests Differential Revision: D25775357 fbshipit-source-id: 0ae3c59181bc123d763ed9c0d05c536998ae5ca0 # This is the commit message #102: fixes indices computation for trilinear interpolate backwards (#50084) Summary: https://github.com/pytorch/pytorch/issues/48675 had some typos in indices computations so that results for trilinear interpolation where height is not equal to width were wrong. This PR fixes it. 
cc xwang233 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50084 Reviewed By: BIT-silence Differential Revision: D25777083 Pulled By: ngimel fbshipit-source-id: 71be545628735fe875b7ea30bf6a09df4f2fae5c # This is the commit message #103: Run mypy on more test files (#49658) Summary: Improves one annotation for `augment_model_with_bundled_inputs` Also add a comment to not work on caffe2 type annotations, that's not worth the effort - those ignores can stay as they are. xref gh-16574 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49658 Reviewed By: heitorschueroff Differential Revision: D25757721 Pulled By: ezyang fbshipit-source-id: 44c396d8da9ef3f41b97f9c46a528f0431c4b463 # This is the commit message #104: Run mypy over test/test_utils.py (#49654) Summary: This caught one incorrect annotation in `cpp_extension.load`. xref gh-16574. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49654 Reviewed By: heitorschueroff Differential Revision: D25757691 Pulled By: ezyang fbshipit-source-id: 145ce3ae532cc585d9ca3bbd5381401bad0072e2 # This is the commit message #105: quant: ensure observers do not crash for empty Tensors (#49800) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49800 Ensures that having a Tensor with 0 elements does not crash observers. Note: it's illegal to pass Tensors with 0 elements to reductions such as min and max, so we gate this out before the logic hits min/max. This should not be hit often in practice, but it's coming up during debugging of some RCNN models with test inputs. 
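The gating described above (skip empty inputs before calling min/max) can be illustrated with a minimal sketch. This is not the PyTorch observer code: plain Python lists stand in for tensors, and `update_min_max` is a hypothetical helper invented for illustration.

```python
from typing import Optional, Sequence, Tuple

Range = Optional[Tuple[float, float]]


def update_min_max(values: Sequence[float], running: Range) -> Range:
    """Hypothetical stand-in for an observer update on a 1-D input.

    Python's built-in min()/max() raise ValueError on an empty sequence,
    much like reductions on a 0-element tensor are illegal, so the empty
    case is gated out before the reduction is reached.
    """
    if len(values) == 0:
        # Nothing to observe: keep the running statistics unchanged.
        return running
    lo, hi = min(values), max(values)
    if running is None:
        return (lo, hi)
    return (min(running[0], lo), max(running[1], hi))
```

With this gate, an empty input leaves the running range untouched instead of raising.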
Test Plan: ``` python test/test_quantization.py TestObserver.test_zero_numel ``` Imported from OSS Reviewed By: jerryzh168 Differential Revision: D25693230 fbshipit-source-id: d737559697c98bd923356edacba895835060bb38 # This is the commit message #106: quant: nice error message on convtranspose with per-channel weight (#49899) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49899 The per-channel weight observer is not supported in conv transpose yet. Adding an error message which fails instantly instead of making the user wait until after calibration/training finishes. Test Plan: ``` python test/test_quantization.py TestPostTrainingStatic.test_convtranspose_per_channel_fails_early python test/test_quantization.py TestQuantizeFx.test_convtranspose_per_channel_fails_early ``` Imported from OSS Reviewed By: jerryzh168 Differential Revision: D25717151 fbshipit-source-id: 093e5979030ec185e3e0d56c45d7ce7338bf94b6 # This is the commit message #107: quant: throw a nice error message for allclose with quantized inputs (#49802) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49802 Currently `torch.allclose` is not supported with quantized inputs. Throw a nice error message instead of a cryptic one. Test Plan: ``` torch.allclose(x_fp32, y_fp32) torch.allclose(x_int8, y_int8) ``` Imported from OSS Reviewed By: supriyar Differential Revision: D25693538 fbshipit-source-id: 8958628433adfca3ae6ce215f3e3ec3c5e29994c # This is the commit message #108: eager quant: fix error with removing forward hooks (#49813) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49813 https://github.com/pytorch/pytorch/issues/49739 reports a crash where removing forward hooks results in a ``` RuntimeError: OrderedDict mutated during iteration ``` Unfortunately I cannot repro this inside the PyTorch module, but the issue author has a good point and we should not mutate the dict inside of the iteration.
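The failure mode is easy to reproduce outside PyTorch. A minimal sketch using only the standard library follows; the `hooks` dictionary and both helper names are hypothetical and are not the code changed by this commit.

```python
from collections import OrderedDict


def remove_all_unsafe(hooks: OrderedDict) -> None:
    # Deleting entries while iterating over the same OrderedDict raises
    # RuntimeError once the iterator notices the mutation.
    for key in hooks:
        del hooks[key]


def remove_all_safe(hooks: OrderedDict) -> None:
    # Snapshot the keys first, so the loop iterates over a separate list
    # and the dict can be mutated freely.
    for key in list(hooks.keys()):
        del hooks[key]
```

Iterating over a snapshot of the keys is one common way to avoid the error; collecting the keys to delete and removing them after the loop would work as well.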
Test Plan: ``` // test plan from https://github.com/pytorch/pytorch/pull/46871 which // originally added this python test/test_quantization.py TestEagerModeQATOps ``` Imported from OSS Reviewed By: jerryzh168 Differential Revision: D25698725 fbshipit-source-id: 13069d0d5017a84038c8f7be439a3ed537938ac6 # This is the commit message #109: [JIT] Remove buffer metadata serialization forward-compat gate (#49990) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49990 **Summary** This commit removes the forward-compatibility gate for buffer metadata serialization. It was introduced to allow versions of fbcode binaries statically linked against older versions of PyTorch (without buffer metadata in JIT) to deserialize archives produced by new versions of PyTorch. Enough time has probably passed that these old binaries don't exist anymore, so it should be safe to remove the gate. **Test Plan** Internal tests. Test Plan: Imported from OSS Reviewed By: xw285cornell Differential Revision: D25743199 Pulled By: SplitInfinity fbshipit-source-id: 58d82ab4362270b309956826e36c8bf9d620f081 # This is the commit message #110: Add an option to disable aten::cat in TE (re-revert) (#50101) Summary: This reverts commit ace78ddb6a2bdbf03f08c69767eba57306dd69ed. Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/50101 Reviewed By: eellison Differential Revision: D25784785 Pulled By: Krovatkin fbshipit-source-id: cbb3d377e03303f6c8c71f4c59c6d90ab40d55f7 # This is the commit message #111: [distributed] Provide parameter to pass GPU ID in barrier function (#49069) Summary: For a multi GPU node, rank and corresponding GPU mapping can be different. Provide optional parameter to specify the GPU device number for the allreduce operation in barrier function. Add test cases to validate barrier device_ids. 
Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com> Fixes https://github.com/pytorch/pytorch/issues/48110 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49069 Reviewed By: mrshenli Differential Revision: D25658528 Pulled By: rohan-varma fbshipit-source-id: 418198b6224c8c1fd95993b80c072a8ff8f02eec # This is the commit message #112: [RPC] Relax some profiling tests (#49983) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49983 We have observed very rare flakiness in some profiling tests recently, i.e.: . However, we were not able to reproduce these even with thousands of runs on the CI machines where the failure was originally reported. As a result, relaxing these tests and re-enabling them to reduce failure rates. ghstack-source-id: 119352019 Test Plan: CI Reviewed By: mrshenli Differential Revision: D25739416 fbshipit-source-id: 4dbb6b30f20d3af94ba39f4a7ccf4fb055e440bc # This is the commit message #113: support building with conda installed libraries (#50080) Summary: This should fix a bunch of share library compilation error when installed in conda lib, lib64 folder. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50080 Reviewed By: seemethere Differential Revision: D25781923 Pulled By: walterddr fbshipit-source-id: 78a74925981d65243b98bb99a65f1f2766e87a2f # This is the commit message #114: Fix store based barrier to only use 'add'. (#49930) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49930 Certain store implementations don't work well when we use get() and add() on the same key. To avoid this issue, we only use add() in the store based barrier. The buggy store implementations can't be properly fixed due to legacy reasons. Test Plan: 1) unit tests. 
2) waitforbuildbot Reviewed By: osalpekar Differential Revision: D25725386 fbshipit-source-id: 1535e2629914de7f78847b730f8764f92cde67e7 # This is the commit message #115: [caffe2][a10] Move down pragma pop to properly suppress warning 4522 (#49233) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49233 As the comments on line 160, say we should suppress this overly aggressive warning with MSVC: ``` caffe2\tensorbody.h_ovrsource#header-mode-symlink-tree-only,headers\aten\core\tensorbody.h(1223): warning C4522: 'at::Tensor': multiple assignment operators specified ``` However, in order to remove the warning, the closing brace of the class must be between the`#pragma warning` push and its corresponding pop. Move the pop down to ensure that. Test Plan: Built locally using clang for Windows without buck cache, confirmed the warning resolved Reviewed By: bhosmer Differential Revision: D25422447 fbshipit-source-id: c1e1c66fb8513af5f9d4e3c1dc48d0070c4a1f84 # This is the commit message #116: Drop unused imports from caffe2/python (#49980) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49980 From ``` ./python/libcst/libcst codemod remove_unused_imports.RemoveUnusedImportsWithGlean --no-format caffe2/ ``` Test Plan: Standard sandcastle tests Reviewed By: xush6528 Differential Revision: D25727359 fbshipit-source-id: c4f60005b10546423dc093d31d46deb418352286 # This is the commit message #117: Update MultiHeadAttention docstring (#49950) Summary: Fixes MultiHeadAttention docstring. 
Currently, https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html#torch.nn.MultiheadAttention is <img width="648" alt="Screen Shot 2020-12-29 at 21 06 43" src="https://user-images.githubusercontent.com/2459423/103311124-cd10cc00-4a19-11eb-89c9-0ee261364963.png"> and with the fix will be <img width="648" alt="Screen Shot 2020-12-29 at 22 41 35" src="https://user-images.githubusercontent.com/2459423/103315838-0dc31200-4a27-11eb-82e2-ca8f13d713a1.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/49950 Reviewed By: mrshenli Differential Revision: D25732573 Pulled By: zhangguanheng66 fbshipit-source-id: b362f3f617ab26b0dd25c3a0a7d4117e522e620c # This is the commit message #118: Revert D25757691: [pytorch][PR] Run mypy over test/test_utils.py Test Plan: revert-hammer Differential Revision: D25757691 (https://github.com/pytorch/pytorch/commit/c86cfcd81da46b5e8226441edb58f0b11a97f215) Original commit changeset: 145ce3ae532c fbshipit-source-id: 3dfd68f0c42fc074cde15c6213a630b16e9d8879 # This is the commit message #119: Enable distribution validation if __debug__ (#48743) Summary: Fixes https://github.com/pytorch/pytorch/issues/47123 Follows https://github.com/pyro-ppl/pyro/pull/2701 This turns on `Distribution` validation by default. The motivation is to favor beginners by providing helpful error messages. Advanced users focused on speed can disable validation by calling ```py torch.distributions.Distribution.set_default_validate_args(False) ``` or by disabling individual distribution validation via `MyDistribution(..., validate_args=False)`. In practice I have found many beginners forget or do not know about validation. Therefore I have [enabled it by default](https://github.com/pyro-ppl/pyro/pull/2701) in Pyro. I believe PyTorch could also benefit from this change. Indeed validation caught a number of bugs in `.icdf()` methods, in tests, and in PPL benchmarks, all of which have been fixed in this PR. 
## Release concerns - This may slightly slow down some models. Concerned users may disable validation. - This may cause new `ValueErrors` in models that rely on unsupported behavior, e.g. `Categorical.log_prob()` applied to continuous-valued tensors (only {0,1}-valued tenso…