Fix Torch tensor locality with autoray-registered `coerce` method (#5438)
### Before submitting

Please complete the following checklist when submitting a PR:

- [x] All new features must include a unit test. If you've fixed a bug or added code that should be tested, add a test to the test directory!
- [x] All new functions and code must be clearly commented and documented. If you do make documentation changes, make sure that the docs build and render correctly by running `make docs`.
- [x] Ensure that the test suite passes, by running `make test`.
- [x] Add a new entry to the `doc/releases/changelog-dev.md` file, summarizing the change, and including a link back to the PR.
- [x] The PennyLane source code conforms to [PEP8 standards](https://www.python.org/dev/peps/pep-0008/). We check all of our code against [Pylint](https://www.pylint.org/). To lint modified files, simply `pip install pylint`, and then run `pylint pennylane/path/to/file.py`.

------------------------------------------------------------------------------------------------------------

**Context:** When Torch has a GPU-backed data buffer, failures can occur when attempting to make autoray-dispatched calls to Torch methods with paired CPU data. In this case, with probabilities on the GPU and eigenvalues on the host (read from the observables), failures appeared with `qml.dot`, and can be reproduced with:

```python
import pennylane as qml
import torch
import numpy as np

torch_device = "cuda"
dev = qml.device("default.qubit.torch", wires=2, torch_device=torch_device)
ham = qml.Hamiltonian(
    torch.tensor([0.1, 0.2], requires_grad=True),
    [qml.PauliX(0), qml.PauliZ(1)],
)

@qml.qnode(dev, diff_method="backprop", interface="torch")
def circuit():
    qml.RX(np.zeros(5), 0)  # Broadcast the state by applying a broadcasted identity
    return qml.expval(ham)

res = circuit()
assert qml.math.allclose(res, 0.2)
```

This PR modifies the registered `coerce` method for Torch to automatically migrate mixed CPU-GPU data, always favouring the associated GPU. In addition, this method now also catches multi-GPU data, where tensors do not reside on the same device index, and will fail outright. As a longer-term solution, moving the Torch GPU dispatch calls to earlier in the stack would be more sound, but this fixes the aforementioned issue, at the expense of always migrating from CPU to GPU. A sketch of the new coercion behaviour is given at the end of this description.

**Description of the Change:** As above.

**Benefits:** Allows automatic data migration from host to device when using a GPU-backed tensor. In addition, catches multi-GPU tensor data when using Torch, and fails due to the non-local representation.

**Possible Drawbacks:** Auto-migration may not always be wanted. The alternative solution is to always be explicit about locality, and move the eigenvalue data onto the device at a higher layer in the stack.

**Related GitHub Issues:** #5269 introduced changes that resulted in GPU errors.
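For illustration, the new coercion behaviour is roughly the following. This is a minimal sketch, not the exact PennyLane implementation: the helper name `_torch_coerce` and the dtype-promotion details are illustrative assumptions; only `autoray.register_function` and the standard Torch calls are real APIs.

```python
# Minimal sketch of the coercion behaviour described above.
# Assumption: ``_torch_coerce`` and its dtype handling are illustrative,
# not the exact PennyLane implementation.
import autoray as ar
import torch


def _torch_coerce(tensors, like=None):
    """Unify device placement (and dtype) for a sequence of Torch tensors."""
    tensors = [torch.as_tensor(t) for t in tensors]

    # Gather the distinct CUDA devices the tensors live on
    cuda_devices = {t.device for t in tensors if t.device.type == "cuda"}

    if len(cuda_devices) > 1:
        # Tensors spread across multiple GPU indices: fail outright
        raise RuntimeError(
            f"Tensors reside on multiple GPU devices: {cuda_devices}. "
            "Move all tensors to a single device before dispatching."
        )

    if cuda_devices:
        # Mixed CPU/GPU data: favour the GPU and migrate CPU tensors onto it
        (device,) = cuda_devices
        tensors = [t.to(device) for t in tensors]

    # Promote to a common dtype so downstream ops (e.g. qml.dot) succeed
    dtype = tensors[0].dtype
    for t in tensors[1:]:
        dtype = torch.promote_types(dtype, t.dtype)
    return [t.to(dtype) for t in tensors]


# Register with autoray so qml.math dispatches to it for the Torch backend
ar.register_function("torch", "coerce", _torch_coerce)
```

With this behaviour, in the reproduction above the CPU-resident eigenvalues would be moved onto the CUDA device holding the probabilities before `qml.dot` is evaluated, rather than raising a device-mismatch error.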