Make numerical representations of operators differentiable #1749

Merged · 40 commits · Oct 18, 2021

Conversation

@glassnotes (Contributor) commented Oct 13, 2021

Context: Operator overhaul

Description of the Change: Converts the numerical representations of parametrized operators to use qml.math throughout, and returns matrix representations in the same interface in which the input parameters are specified.

Benefits: Operator representations are fully differentiable.

Possible Drawbacks: TODO

Related GitHub Issues: N/A

@github-actions commented:

Hello. You may have forgotten to update the changelog!
Please edit doc/releases/changelog-dev.md with:

  • A one-to-two sentence description of the change. You may include a small working example for new features.
  • A link back to this PR.
  • Your name (or GitHub username) in the contributors section.

@josh146 (Member) commented Oct 14, 2021

@glassnotes one good way to test these changes --- if you go into devices/default_qubit_<interface>.py, you'll see that there is a class attribute self.parametric_ops.

This instructs the backprop devices to ignore Operator.matrix for parametric ops, and instead they use an equivalent function inside the file devices/<interface>_ops.py.

If you clear the self.parametric_ops list, these devices will instead query Operator.matrix directly in backprop mode 🙂
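
For reference, a rough sketch of what that could look like for the TF device (the attribute name is taken from this comment; the empty-dict value, import path, and monkeypatching approach are my assumptions):

import pennylane as qml
from pennylane.devices.default_qubit_tf import DefaultQubitTF

# With nothing registered here, backprop mode should fall back to
# Operator.matrix for every parametric gate (assumption: parametric_ops
# is a plain class attribute that can be overridden).
DefaultQubitTF.parametric_ops = {}

dev = qml.device("default.qubit.tf", wires=1)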

@@ -17,7 +17,6 @@
import pennylane as qml
from pennylane.operation import DiagonalOperation
from pennylane.devices import DefaultQubit
from pennylane.devices import jax_ops
Member:

🎉

@josh146 (Member) left a comment:

mammoth task @glassnotes! 🦣

Comment on lines 197 to 199
if isinstance(unitary, DiagonalOperation):
return unitary.eigvals

Member:

If the rest of the method is identical to default.qubit, we can probably delete it, and just rely on inheritance!

Contributor (author):

Good catch! I think this could be removed from a few of the devices.

@@ -134,6 +134,7 @@ def _take_autograd(tensor, indices, axis=None):
ar.autoray._SUBMODULE_ALIASES["tensorflow", "arctan2"] = "tensorflow.math"
ar.autoray._SUBMODULE_ALIASES["tensorflow", "diag"] = "tensorflow.linalg"
ar.autoray._SUBMODULE_ALIASES["tensorflow", "kron"] = "tensorflow.experimental.numpy"
ar.autoray._SUBMODULE_ALIASES["tensorflow", "moveaxis"] = "tensorflow.experimental.numpy"
Member:

This is the only one that needed to be added? I'm somewhat surprised (in a good way); I thought more would be needed.

interface = qml.math.get_interface(theta)

c = qml.math.cos(theta / 2)
s = qml.math.sin(theta / 2)
Member:

Very, very minor suggestion, but perhaps the 1j* and -theta were included here previously as a speedup?

@glassnotes (Contributor, author), Oct 18, 2021:

Tried with torch: 7 microseconds for the 2x2 cases and 15 for the 4x4 cases. I'll take it! ⏩ 👍

Comment on lines 177 to 178
if len(qml.math.shape(theta)) > 0:
theta = theta[0]
Member:

wait, why does this line occur?

Contributor (author):

I might actually be able to remove these now; I'll check. In the original versions using qml.math.array(stuff, like=interface), I needed to include this for cases like

torch.tensor([0.3], requires_grad=True)

otherwise there would be an extra dimension in the returned matrix. But with stack this may no longer be a problem.
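
For illustration, a small sketch of the shape issue being described (the values are arbitrary):

import torch
import pennylane as qml

theta = torch.tensor([0.3], requires_grad=True)

# A length-1 parameter carries a leading dimension that would otherwise
# propagate into the matrix (e.g. stacking shape-(1,) entries gives an
# (n, 1) block instead of a flat row); indexing restores a scalar.
print(qml.math.shape(theta))     # (1,)
print(qml.math.shape(theta[0]))  # ()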

Member:

Looks like they are not covered by the test suite 🤔


return expanded_tensor.reshape((2 ** M, 2 ** M))
return qml.math.reshape(expanded_tensor, (2 ** M, 2 ** M))
Member:

woah, nice catch updating this. Which functions were calling this method?

Contributor (author):

It was used in some of the functions in PauliRot and MultiRZ.

@josh146 (Member) left a comment:

@glassnotes I just realized that changes to default.qubit.autograd are missing :)

Olivia Di Matteo and others added 2 commits October 14, 2021 12:47
@josh146 (Member) commented Oct 14, 2021

  File "/home/runner/work/pennylane/pennylane/pennylane/ops/qubit/qchem_ops.py", line 545, in _matrix
    e = cmath.exp(-1j * theta / 2)
TypeError: must be real number, not ArrayBox

looks like there are some additional QChem ops not converted yet? In particular, OrbitalRotation.
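
For context, the conversion amounts to replacing Python-level cmath calls with dispatched qml.math calls. A minimal sketch of the pattern (the helper names are illustrative, not the actual OrbitalRotation code):

import cmath
import pennylane as qml

def _phase_old(theta):
    # fails for ArrayBox / tf.Variable inputs, since cmath only accepts Python floats
    return cmath.exp(-1j * theta / 2)

def _phase_new(theta):
    # dispatched version: works for autograd, torch, tf, and jax inputs
    if qml.math.get_interface(theta) == "tensorflow":
        theta = qml.math.cast_like(theta, 1j)  # tf needs an explicit complex cast
    return qml.math.exp(-0.5j * theta)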

@glassnotes (Contributor, author):

> looks like there are some additional QChem ops not converted yet? In particular, OrbitalRotation.

Oops, I missed those because they weren't included in the parametric_ops dictionary! They must be pretty new as well :) Thanks for the heads up!

@josh146 (Member) commented Oct 14, 2021:

No worries! Sorry, I got a bit too invested in the CI, so I've already pushed a fix 😬 Waiting to see if the CI passes now.

@glassnotes (Contributor, author):

> I've already pushed a fix

No complaints here 🎉 Thanks!

@josh146 (Member) commented Oct 14, 2021:

All that remains is the Torch 'probabilities not summing to 1' issue 🤔

@@ -1321,8 +1321,8 @@ def circuit(a, b):
qml.CRX(b, wires=[0, 1])
return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

a = torch.tensor([-0.234], dtype=torch.float64, requires_grad=True)
b = torch.tensor([0.654], dtype=torch.float64, requires_grad=True)
a = torch.tensor(-0.234, dtype=torch.float64, requires_grad=True)
Contributor (author):

Removing the checks on the size of theta led to errors in test_backprop_gradient; I've thus adjusted this test, but I'm a bit confused because test_prob_differentiability creates the tensors in the same way and runs without issues.

Comment on lines 1051 to 1054
# It might be that they are in different interfaces, e.g.,
# Rot(0.2, 0.3, tf.Variable(0.5), wires=0)
# So we need to make sure the matrix comes out having the right type
interfaces = [qml.math.get_interface(p) for p in [phi, theta, omega]]
Member:

@glassnotes you could replace this with

Suggested change:

interfaces = [qml.math.get_interface(p) for p in [phi, theta, omega]]
interfaces = qml.math._multi_dispatch([phi, theta, omega])

_multi_dispatch was originally intended to be purely a private function, but since then I've realized it's kinda useful in general, so I might make it public.

What it does is iterate through a list of tensor-like objects and decide the overall interface to use. It also has some logic that 'prioritizes' certain interfaces over others.

E.g., if the list of tensors is [autograd, autograd, numpy, torch], it prioritizes torch over autograd.
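
A quick usage sketch, assuming the helper behaves as described above and returns the name of the selected interface:

import torch
import pennylane as qml
from pennylane import numpy as pnp

params = [pnp.array(0.2), 0.3, torch.tensor(0.5)]

# torch takes priority over autograd and plain Python numbers, so any
# matrix built from these parameters would be a torch tensor
print(qml.math._multi_dispatch(params))  # expected: "torch"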

Comment on lines 63 to 64
@classmethod
def _matrix(cls, *params):
Member:

Suggested change
@classmethod
def _matrix(cls, *params):
@classmethod
@functools.lru_cache()
def _matrix(cls, *params):

I think this should lead to speedups, since the call to generate the matrix will be cached for the same input object (e.g., if theta has the same value and same type!)

Contributor (author):

You're right! First timing run is the original, second is with the caching. Beautiful 🤩


Contributor (author):

However, this seems to cause trouble with the gradients of the Ising operators in the gradient tests for JAX/TF. Have you ever encountered this before?

    def __hash__(self):
      if ops.Tensor._USE_EQUALITY and ops.executing_eagerly_outside_functions():  # pylint: disable=protected-access
>       raise TypeError("Variable is unhashable. "
                        "Instead, use tensor.ref() as the key.")
E       TypeError: Variable is unhashable. Instead, use tensor.ref() as the key.

Member:

That's super weird 🤔 does qml.RX(tf.Variable(0.5), wires=0).matrix work okay?

Contributor (author):

No it does not :(

Member:

yeah, I just ran the test suite with lru_cache() turned on for only RX, this is what happened:


Huge shame :(

I wonder if we can turn it on programmatically for just autograd and torch?
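
(For illustration only, one possible approach is a wrapper that falls back to an uncached call whenever the arguments are unhashable; the decorator below is hypothetical and not part of this PR.)

import functools

def cache_if_hashable(fn):
    """Hypothetical decorator: use lru_cache when possible, but fall back to a
    direct call if the arguments cannot be hashed (e.g. tf.Variable)."""
    cached = functools.lru_cache(maxsize=None)(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return cached(*args, **kwargs)
        except TypeError:
            return fn(*args, **kwargs)

    return wrapper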

@josh146 (Member) left a comment:

Got the tests passing :) Interestingly, it looks like there are no tests for:

  • tf and PhaseShift
  • tf and ControlledPhaseShift
  • tf and PauliRot
  • tf and CRZ
  • tf and U1
  • tf and U2
  • tf and U3
  • tf and qml.utils.expand

which codecov is now picking up, since there are tf branches in those operations
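
As an illustration, a test along these lines could look like the following (a sketch only; the test name and exact placement are assumptions, mirroring the tf test added for qml.utils.expand later in this PR):

import numpy as np
import tensorflow as tf
import pennylane as qml

def test_phase_shift_matrix_tf():
    """PhaseShift.matrix should stay in the tf interface and match the known matrix."""
    phi = tf.Variable(0.3)
    mat = qml.PhaseShift(phi, wires=0).matrix

    assert qml.math.get_interface(mat) == "tensorflow"
    expected = np.diag([1, np.exp(1j * 0.3)])
    assert qml.math.allclose(qml.math.unwrap(mat), expected)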

@glassnotes (Contributor, author):

@josh146 CI is passing and all changes are in, this is ready for re-review. Codecov is still complaining about an overall decrease in coverage, but is not indicating any specific lines have been missed, as far as I can see.

@josh146 (Member) commented Oct 18, 2021

Will go for a review now 🙂

It looks like the reduction in coverage is coming from the devices, weird 🤔 Maybe codecov is just confused since the number of lines in the device class has dropped...

@josh146 (Member) left a comment:

Thanks @glassnotes, been really looking forward to this change!

Comment on lines +369 to +374
* All qubit operations have been re-written to use the `qml.math` framework
for internal classical processing and the generation of their matrix representations.
As a result these representations are now fully differentiable, and the
framework-specific device classes no longer need to maintain framework-specific
versions of these matrices.
[(#1749)](https://github.com/PennyLaneAI/pennylane/pull/1749)
Member:

💯 for avoiding 'interface' hahaha

Comment on lines 511 to 516
- The updates made to the operator matrix representations in order to
enable full differentiability in all interfaces now require that single
parameters be passed as rank-0 tensors in order for the matrix dimensions
to be correct (i.e., use `qml.RX(tf.Variable(0.3), wires=0)` instead of
`qml.RX(tf.Variable([0.3]), wires=0)`).
[(#1749)](https://github.com/PennyLaneAI/pennylane/pull/1749)
Member:

I'm almost tempted not to include this here, since technically this isn't a breaking change (we never guaranteed this behaviour in earlier versions).

Contributor (author):

I actually wasn't sure about it; let's just remove it then if the behaviour was never guaranteed anyways.

Comment on lines +968 to +974

if qml.math.get_interface(theta) == "tensorflow":
theta = qml.math.cast_like(theta, 1j)

exp_part = qml.math.exp(-0.5j * theta)

return qml.math.diag([1, 1, exp_part, qml.math.conj(exp_part)])
Member:

This is very minor, no action needed, but I just noticed that this could simply be written as

def _matrix(cls, *params):
    return qml.math.diag(cls._eigvals(*params))

since it is diagonal. Although, I don't know if there are any unintended issues associated with this.

[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]
Member:

woah

Member:

No action needed here, but I wonder if this could be done via qml.math.scatter_element_add:

U = qml.math.eye(16, like=interface)
indices = [(3, 3, 12, 12), (3, 12, 3, 12)]
values = qml.math.stack([c - 1.0, -s, s, c - 1.0])
U = qml.math.scatter_element_add(U, indices, values)

obtained_mat = qml.utils.expand(tf_mat, wires, list(range(4)))

assert qml.math.get_interface(obtained_mat) == "tensorflow"
assert qml.math.allclose(qml.math.unwrap(obtained_mat), expected_mat)
Member:

Thanks for adding these @glassnotes!
