Add finite difference function to compute the gradient and second-order derivatives of parametrized observables #1090
Conversation
Hello. You may have forgotten to update the changelog!
Looking great! Just left a couple of minor comments.
pennylane/_grad.py
Outdated
if (_np.array(_idx) > bounds).any():
    raise ValueError(
        "Indices in 'idx' can not be greater than {}; got {}".format(
            tuple(bounds), _idx
        )
    )
Any reason for letting this function raise this instead of letting numpy raise the error during calculation?
Yep, I'm in favour of removing this and just letting NumPy raise the error
In fact, you could probably remove the entire else: branch.
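For what it's worth, plain NumPy already produces a clear error for out-of-bounds indices, so dropping the explicit check seems safe. A quick standalone illustration (not the PR code):

import numpy as np

x = np.zeros(3)
try:
    x[5]  # out-of-bounds access
except IndexError as err:
    print(err)  # index 5 is out of bounds for axis 0 with size 3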
It's funny that implementing a finite difference function has required so many iterations and discussions. I wonder if there is a lesson to be learnt for us?
Code looks good. I suggest clarifying the docstrings further, particularly by adding example usage to the user-facing finite_diff. Some more testing of the various types of input functions and gradients would also be a good idea, in my opinion.
@@ -181,3 +182,206 @@ def _jacobian_function(*args, **kwargs):
        return _np.stack([_jacobian(func, arg)(*args, **kwargs) for arg in argnum]).T

    return _jacobian_function


def _fd_first_order_centered(f, argnum, delta, *args, idx=None, **kwargs):
@agran2018 Please avoid using excessive underscores _ when choosing function names. Docstrings can help explain details of the function; the names themselves should be concise and intuitive. In this case I suggest _first_order and _second_order, or possibly _fd_first_order and _fd_second_order.
Please see the discussion in the ADR about indicating the finite-difference type in the function names. We can discuss it here and go with the option you like the most.
@ixfoduap since this is a private function (and not user facing), it is generally okay to be a little bit more explicit/verbose.
But if it were a public, user-facing function, I would agree!
pennylane/_grad.py
Outdated
argnum (int): which argument to take a gradient with respect to
delta (float): step size used to evaluate the finite difference
idx (list[int]): if argument ``args[argnum]`` is an array, ``idx`` can
I'm very confused with having both argnum and idx. Is the idea that we can have a function f(x, y, z) and we may want to take a derivative with respect to y[3]? Then we'd call _first_order(f, 1, idx=3)?

If so, I think this needs to be better explained in the docstrings.
Yes, you got it right.
)

if idx is None:
    idx = [(), ()]
I'm confused. What exactly happens when the user doesn't pass idx? As far as I can tell, from line 288 where we make the assignment i, j = idx, the rest of the code treats i, j as integers, e.g., shift_i[i]. So what does this function do when i, j = [(), ()]?
Yep, this confused me as well. I think this might be sneakily using the NumPy feature that array[tuple()] simply returns the whole array.
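A quick check of this behaviour (standalone NumPy, nothing PR-specific):

import numpy as np

x = np.array([1.0, 2.0, 3.0])

# indexing with an empty tuple returns the whole array...
print(x[()])  # [1. 2. 3.]

# ...and assigning through it broadcasts to every component
x[()] = 0.0
print(x)      # [0. 0. 0.]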
Yep, this seems to be the case.
I didn't know this @agran2018, you taught me something 😆
Definitely worth commenting this within the code though, so that it is clear what is happening.
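For instance, something along these lines (a sketch of the suggested comment, not the committed code):

if idx is None:
    # () is a valid NumPy index that selects the whole array, so
    # defaulting to [(), ()] lets the diagonal code path below handle
    # the one-variable case without special-casing it
    idx = [(), ()]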
I think there is some confusion here. This function gives the second derivative of the Hamiltonian. We don't need the full Hessian of the Hamiltonian, as its second derivatives enter individually in the calculation of the second derivative of the energy. Then:

- For multivariable functions, idx is a list with the indices i and j involved in the second derivative.
- For one-variable functions, idx defaults to None, and by having idx = [(), ()] we can compute the second derivative using the code block implementing the diagonal case.

Indeed @josh146, I learned this from the first function you proposed :-).
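As a sanity check of the one-variable case, this is the centered second difference the diagonal code path reduces to (standalone sketch, independent of the helper names in this PR):

import numpy as np

def f(x):
    return np.sin(x)

x0, delta = 0.3, 1e-3

# centered second difference: f''(x) ~ (f(x+d) - 2 f(x) + f(x-d)) / d**2
d2 = (f(x0 + delta) - 2 * f(x0) + f(x0 - delta)) / delta ** 2
print(d2, -np.sin(x0))  # both ~ -0.2955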
derivative :math:`\frac{\partial^2 f(x)}{\partial x_i \partial x_j}`.
delta (float): step size used to evaluate the finite differences

Returns:
I think this could really use some example usage, especially given the subtleties in how to use argnum and idx, and the fact that idx takes different forms depending on the order of the derivative. Would be nice to have an example computing:

- Full gradient
- Single derivative
- Second derivative

for a function with several multi-dimensional arguments (see the sketch below).
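In plain NumPy, the three cases would look roughly like this; a standalone sketch of the math only, since the finite_diff signature is still being settled in this PR:

import numpy as np

delta = 1e-3

def f(x, y):
    # toy function of two array arguments
    return x[0] ** 2 * x[1] + y[0] * y[1]

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

# full gradient with respect to x: [2*x0*x1, x0**2] = [4.0, 1.0]
grad = np.zeros_like(x)
for i in range(x.size):
    s = np.zeros_like(x)
    s[i] = 0.5 * delta
    grad[i] = (f(x + s, y) - f(x - s, y)) / delta

# single derivative df/dy[1] = y[0] = 3.0
s = np.array([0.0, 0.5 * delta])
dy1 = (f(x, y + s) - f(x, y - s)) / delta

# second derivative d2f/(dx[0] dx[1]) = 2*x[0] = 2.0
si = np.array([0.5 * delta, 0.0])
sj = np.array([0.0, 0.5 * delta])
d2 = (f(x + si + sj, y) - f(x + si - sj, y)
      - f(x - si + sj, y) + f(x - si - sj, y)) / delta ** 2

print(grad, dy1, d2)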
Yes, not finished yet.
Agree, every user-facing function in PL should have an **Example** section after the arguments!
An example of the full hessian might also be nice.
Examples added.
@josh146, if it is OK with you, I would rather have an example for the Hessian in the docstring of the second-order derivative of the energy since this is something users would like to do to compute normal modes.
        + f(*args[:argnum], x - shift_i - shift_j, *args[argnum + 1 :], **kwargs)
    ) * delta ** -2

    return deriv2
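For readers following along, the four-point stencil this fragment closes looks roughly like the following; a reconstruction under assumed half-step shifts along components i and j, not the exact PR code:

import numpy as _np

def _fd_second_order_centered(f, argnum, delta, *args, idx=None, **kwargs):
    # Reconstruction sketch of the off-diagonal (cross) stencil only;
    # the actual PR code also handles idx=None and the i == j case.
    i, j = idx
    x = _np.array(args[argnum])

    shift_i = _np.zeros_like(x)
    shift_i[i] = 0.5 * delta  # assumed half-step along component i
    shift_j = _np.zeros_like(x)
    shift_j[j] = 0.5 * delta  # assumed half-step along component j

    deriv2 = (
        f(*args[:argnum], x + shift_i + shift_j, *args[argnum + 1 :], **kwargs)
        - f(*args[:argnum], x - shift_i + shift_j, *args[argnum + 1 :], **kwargs)
        - f(*args[:argnum], x + shift_i - shift_j, *args[argnum + 1 :], **kwargs)
        + f(*args[:argnum], x - shift_i - shift_j, *args[argnum + 1 :], **kwargs)
    ) * delta ** -2

    return deriv2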
If I understand correctly, for a function f(x, y) we cannot take a derivative with respect to x[i] and y[j], right? Only x[i] and x[j].
Yes, that's right.
Should we be supporting x[i] and y[j]? Probably doable.
Not essential for the kind of properties we want to look at (nuclear gradients, force constants, vibrational frequencies).
@pytest.mark.parametrize(
    ("x", "y", "argnum", "idx", "delta", "exp_grad"),
Do you need to parametrize x and y? 🤔
Probably we could give them explicitly in the list, but I think that would hurt the readability of the tests?
looks great @agran2018! Approved 💯 although conditional on:

- adding an example to the docstring
- considering whether df/dx[i]dy[j] makes sense
- exploring if we should remove dtype="O"
@@ -14,6 +14,7 @@
"""
This module contains the autograd wrappers :class:`grad` and :func:`jacobian`
"""
from functools import partial
import numpy as _np
I think this is a throwback to when we imported both vanilla numpy and autograd numpy:
import numpy as _np
from pennylane import numpy as np
At some point, I imagine the second line got deleted.
However, this is going to be fixed in #1131 :)
)

x = _np.array(args[argnum])
gradient = _np.zeros_like(x, dtype="O")
Should we make this a list, gradient = []? Then, further down, we can do

gradient.append(
    (f(*args[:argnum], x + shift, *args[argnum + 1 :], **kwargs)
    - f(*args[:argnum], x - shift, *args[argnum + 1 :], **kwargs)) * delta ** -1
)
return _np.array(gradient).reshape(x.shape)

I think this might be beneficial, in that we get a float array or an object array when needed, rather than it always being an object array. @agran2018 maybe worth changing, and seeing if the tests still pass?
@josh146, I tried this yesterday and it did not work for me:

    247     # return gradient
--> 248     return _np.array(gradient).reshape(x.shape)
    249

ValueError: cannot reshape array of size 4 into shape (2,)
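For reference, the failure is easy to reproduce once f itself returns an array; a minimal repro with a made-up f, just to trigger the size mismatch from the traceback above:

import numpy as np

def f(x):
    # array-valued function: each call returns a length-2 array
    return np.array([x.sum(), x.prod()])

x = np.array([1.0, 2.0])
delta = 0.1
gradient = []

for i in range(x.size):
    shift = np.zeros_like(x)
    shift[i] = 0.5 * delta
    gradient.append((f(x + shift) - f(x - shift)) / delta)

# the stacked list has shape (2, 2) -> size 4, while x.shape == (2,)
try:
    np.array(gradient).reshape(x.shape)
except ValueError as err:
    print(err)  # cannot reshape array of size 4 into shape (2,)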
Also, we need to keep track of the indices corresponding to the zero and non-zero components of the gradient if the list idx is given. This is important for constrained geometry optimization, which is very useful.
Oh, of course: the reshape should use the output shape of f 🤦
pennylane/_grad.py
Outdated
f (function): function with signature ``f(*args, **kwargs)``
argnum (int): which argument to take the derivative with respect to
delta (float): step size used to evaluate the finite differences
idx (list[int]): if argument ``args[argnum]`` is an array, ``idx`` specifies
Maybe an example in the docstring will help? E.g., for function f(x, y, z), argnum=1, idx=[3, 2] will differentiate f with respect to element (3, 2) of argument y.
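Spelled out with plain differences (hypothetical f; this is the quantity argnum=1, idx=[3, 2] would pick out):

import numpy as np

def f(x, y, z):
    # made-up function, just to spell out the convention
    return x * z + (y ** 2).sum()

x, z = 1.0, 2.0
y = np.ones((4, 3))
delta = 1e-5

# df/dy[3, 2] = 2 * y[3, 2] = 2.0
shift = np.zeros_like(y)
shift[3, 2] = 0.5 * delta
print((f(x, y + shift, z) - f(x, y - shift, z)) / delta)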
Context:
Adds a finite difference function to evaluate the gradient and second-order derivatives of callable functions. This is particularly useful for evaluating the derivative of molecular Hamiltonians with respect to the nuclear coordinates in order to perform molecular geometry optimizations.

Description of the Change:
The functions _fd_first_order_centered, _fd_second_order_centered and finite_diff() have been added to the _grad.py module.

Benefits:
Allows users to run VQAs to perform joint optimization of the circuit and Hamiltonian parameters.

Possible Drawbacks:
The finite difference approximation is susceptible to numerical errors.

Related GitHub Issues:
None