
Add finite difference function to compute the gradient and second-order derivatives of parametrized observables #1090

Merged

agran2018 merged 117 commits into master from gradient-hamiltonian on Mar 17, 2021

Conversation

@agran2018 (Contributor) commented Feb 17, 2021:

Context:
Adds a finite difference function to evaluate the gradient and second-order derivative of callable functions. This is particularly useful for evaluating the derivative of molecular Hamiltonians with respect to the nuclear coordinates in order to perform molecular geometry optimizations.

Description of the Change:
The functions _fd_first_order_centered, _fd_second_order_centered, and finite_diff() have been added to the _grad.py module.
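
For reference, a centered first-order difference approximates each derivative as (f(x + delta/2 e_i) - f(x - delta/2 e_i)) / delta. A minimal standalone sketch of the technique (illustrative only, not the PR's exact implementation):

    import numpy as np

    def central_gradient(f, x, delta=0.01):
        # Centered-difference gradient of a scalar function f of a 1-d array x.
        # Sketch only; the PR's _fd_first_order_centered also supports argnum/idx.
        grad = np.zeros_like(x, dtype=float)
        for i in range(x.size):
            shift = np.zeros_like(x)
            shift[i] = 0.5 * delta
            grad[i] = (f(x + shift) - f(x - shift)) / delta
        return grad

    f = lambda x: np.sin(x[0]) * x[1] ** 2
    print(central_gradient(f, np.array([0.5, 1.2])))
    # approximately [cos(0.5) * 1.44, 2 * sin(0.5) * 1.2]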

Benefits:
Allows users to run VQAs to perform the joint optimization of the circuit and Hamiltonian parameters.

Possible Drawbacks:
The finite difference approximation is susceptible to numerical errors.

Related GitHub Issues:
None

agran2018 and others added 23 commits February 8, 2021 18:21
…mbols of the molecule and a numpy array with atomic coordinates
…/pennylane into modify-molecular-hamiltonian
Co-authored-by: Josh Izaac <josh146@gmail.com>
Co-authored-by: ixfoduap <40441298+ixfoduap@users.noreply.github.com>
@agran2018 added the WIP 🚧 Work-in-progress and qchem ⚛️ Related to the QChem package labels on Feb 17, 2021
@agran2018 self-assigned this on Feb 17, 2021
@github-actions (bot) commented:

Hello. You may have forgotten to update the changelog!
Please edit .github/CHANGELOG.md with:

  • A one-to-two sentence description of the change. You may include a small working example for new features.
  • A link back to this PR.
  • Your name (or GitHub username) in the contributors section.

@agran2018 changed the title from "Add finite difference function to compute the first derivatives of parametrized observables" to "Add finite difference function to compute the gradient and second-order derivatives of parametrized observables" on Mar 12, 2021
@zeyueN (Contributor) left a comment:

Looking great! Just left a couple of minor comments.

Comment on lines 224 to 228
if (_np.array(_idx) > bounds).any():
    raise ValueError(
        "Indices in 'idx' can not be greater than {}; got {}".format(
            tuple(bounds), _idx
        )
@zeyueN (Contributor) commented Mar 13, 2021:

Any reason for letting this function raise this instead of letting numpy raise the error during calculation?

(Member) commented:

Yep, I'm in favour of removing this and just letting NumPy raise the error

(Member) commented:

In fact, you could probably remove the entire else: branch.

tests/test_finite_diff.py (review thread resolved)
@ixfoduap (Contributor) left a comment:

It's funny that implementing a finite difference function has required so many iterations and discussions. I wonder if there is a lesson in this for us to learn?

Code looks good. I suggest clarifying the docstrings further, particularly by adding example usage to the user-facing finite_diff. In my opinion, some more testing of the various types of input functions and gradients would also be a good idea.

.github/CHANGELOG.md (two outdated review threads, resolved)
@@ -181,3 +182,206 @@ def _jacobian_function(*args, **kwargs):
        return _np.stack([_jacobian(func, arg)(*args, **kwargs) for arg in argnum]).T

    return _jacobian_function


def _fd_first_order_centered(f, argnum, delta, *args, idx=None, **kwargs):
(Contributor) commented:

@agran2018 Please avoid using excessive underscores (_) when choosing function names. Docstrings can help explain details of the function; the names themselves should be concise and intuitive.

In this case I suggest _first_order and _second_order, or possibly _fd_first_order and _fd_second_order

@agran2018 (author) commented:

Please see this discussion in the ADR about indicating the finite-difference type in the function names. We can discuss it here and go with the one you like most.

(Member) commented:

@ixfoduap since this is a private function (and not user facing), it is generally okay to be a little bit more explicit/verbose.

But if it were a public, user-facing function, I would agree!

Comment on lines 194 to 196
argnum (int): which argument to take a gradient with respect to
delta (float): step size used to evaluate the finite difference
idx (list[int]): if argument ``args[argnum]`` is an array, ``idx`` can
(Contributor) commented:

I'm very confused by having both argnum and idx. Is the idea that we can have a function f(x, y, z) and we may want to take a derivative with respect to y[3]? Then we'd call _first_order(f, 1, idx=3)?

If so, I think this needs to be better explained in the docstrings

@agran2018 (author) commented:

Yes, you got it right.
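
To make the convention concrete, a hypothetical call using the private signature from the diff, _fd_first_order_centered(f, argnum, delta, *args, idx=None); the function and values here are made up for illustration:

    import numpy as np
    from pennylane._grad import _fd_first_order_centered  # private helper added in this PR

    def f(x, y, z):
        return x * np.sum(y ** 2) + z  # toy function of three arguments

    x, z = 2.0, 1.0
    y = np.array([0.1, 0.2, 0.3, 0.4])

    # Derivative with respect to y[3] only: argnum=1 selects y, idx selects the entry.
    # The docstring declares idx as list[int], hence idx=[3] rather than idx=3.
    df_dy3 = _fd_first_order_centered(f, 1, 0.01, x, y, z, idx=[3])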

pennylane/_grad.py (outdated; review thread resolved)
)

if idx is None:
    idx = [(), ()]
(Contributor) commented:

I'm confused. What exactly happens when the user doesn't pass idx? As far as I can tell from line 288, where we make the assignment i, j = idx, the rest of the code treats i, j as integers, e.g., shift_i[i]. So what does this function do when i, j = [(), ()]?

(Member) commented:

Yep, this confused me as well. I think this might be sneakily using a feature of NumPy that array[tuple()] simply returns the whole array.

(Member) commented:

Yep, this seems to be the case.
[screenshot: a NumPy session confirming that indexing an array with an empty tuple returns the whole array]

I didn't know this @agran2018, you taught me something 😆

Definitely worth commenting this within the code though, so that it is clear what is happening.
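
The behaviour in question is plain NumPy and easy to reproduce (illustrative snippet):

    import numpy as np

    x = np.arange(4.0)
    print(x[()])    # [0. 1. 2. 3.] -- an empty-tuple index returns the whole array
    print(x[(2,)])  # 2.0           -- a one-element tuple indexes a single entry

    # In-place updates through an empty-tuple index touch every component, so
    # with i = j = (), a statement like shift_i[i] += 0.5 * delta shifts the
    # entire one-variable argument at once, as explained in the reply below.
    shift = np.zeros(3)
    shift[()] += 0.5
    print(shift)    # [0.5 0.5 0.5]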

@agran2018 (author) commented:

I think there is some confusion here. This function gives the second derivative of the Hamiltonian. We don't need the full Hessian of the Hamiltonian as they enter individually in the calculation of the second derivative of the energy. Then,

  • For multivariable functions, idx is a list with the indices i and j involved in the second derivative.
  • For one-variable functions, idx defaults to None, and by setting idx = [(), ()] we can compute the second derivative of one-variable functions using the code block implementing the diagonal case.

Indeed @josh146, I learned this from the first function you proposed :-).

derivative :math:`\frac{\partial^2 f(x)}{\partial x_i \partial x_j}`.
delta (float): step size used to evaluate the finite differences

Returns:
(Contributor) commented:

I think this could really use some example usage, especially given the subtleties in how to use argnum and idx, and the fact that idx takes different forms depending on the order of the derivative. Would be nice to have an example computing:

  • Full gradient
  • Single derivative
  • Second derivative

for a function with several multi-dimensional arguments

@agran2018 (author) commented:

Yes, not finished yet.

(Member) commented:

Agree, every user-facing function in PL should have an **Example** section after the arguments!

(Member) commented:

An example of the full hessian might also be nice.

@agran2018 (author) commented Mar 16, 2021:

Examples added.

@josh146, if it is OK with you, I would rather have an example for the Hessian in the docstring of the second-order derivative of the energy since this is something users would like to do to compute normal modes.

+ f(*args[:argnum], x - shift_i - shift_j, *args[argnum + 1 :], **kwargs)
) * delta ** -2

return deriv2
(Contributor) commented:

If I understand correctly, for a function f(x, y) we cannot take a derivative with respect to x[i] and y[j], right? Only x[i] and x[j]

@agran2018 (author) commented:

Yes, that's right.

(Member) commented:

Should we be supporting x[i] and y[j]? Probably doable.

@agran2018 (author) commented Mar 16, 2021:

Not essential for the kind of properties we want to look at (nuclear gradients, force constants, vibrational frequencies).
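
For context, the centered second-order stencil under discussion, written out as a standalone sketch (half-steps of delta/2 match the delta ** -2 normalization in the hunk above; this is illustrative, not the PR's verbatim code):

    import numpy as np

    def central_second(f, x, i, j, delta=0.01):
        # Centered approximation of d^2 f / (dx_i dx_j) for a scalar f of a
        # 1-d array x. For i == j this reduces to the usual diagonal stencil
        # (f(x + delta e_i) - 2 f(x) + f(x - delta e_i)) / delta**2.
        shift_i = np.zeros_like(x)
        shift_j = np.zeros_like(x)
        shift_i[i] = 0.5 * delta
        shift_j[j] = 0.5 * delta
        return (
            f(x + shift_i + shift_j)
            - f(x + shift_i - shift_j)
            - f(x - shift_i + shift_j)
            + f(x - shift_i - shift_j)
        ) * delta ** -2

    f = lambda x: np.sin(x[0]) * x[1] ** 2
    print(central_second(f, np.array([0.5, 1.2]), 0, 1))  # ~ 2 * cos(0.5) * 1.2

Note that both indices address entries of the same argument x, matching the thread's point that cross-argument derivatives such as d^2 f / (dx[i] dy[j]) are not covered by this scheme.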



@pytest.mark.parametrize(
    ("x", "y", "argnum", "idx", "delta", "exp_grad"),
(Contributor) commented:

Do you need to parametrize x and y? 🤔

@agran2018 (author) commented:

We could probably give them explicitly in the list, but I think that would hurt the readability of the tests?

tests/test_finite_diff.py (review thread resolved)
agran2018 and others added 3 commits March 15, 2021 15:42
Co-authored-by: ixfoduap <40441298+ixfoduap@users.noreply.github.com>
@josh146 (Member) left a comment:

Looks great @agran2018! Approved 💯, although conditional on:

  • adding an example to the docstring
  • considering whether df/dx[i]dy[j] makes sense
  • exploring if we should remove dtype="O".

@@ -14,6 +14,7 @@
"""
This module contains the autograd wrappers :class:`grad` and :func:`jacobian`
"""
from functools import partial
import numpy as _np
(Member) commented:

I think this is a throwback to when we previously imported both vanilla NumPy and autograd NumPy:

import numpy as _np
from pennylane import numpy as np

At some point, I imagine the second line got deleted.

However, this is going to be fixed in #1131 :)

)

x = _np.array(args[argnum])
gradient = _np.zeros_like(x, dtype="O")
(Member) commented:

Should we make this a list, gradient = []?

Then, further down, we can do

        gradient.append(
            (f(*args[:argnum], x + shift, *args[argnum + 1 :], **kwargs)
            - f(*args[:argnum], x - shift, *args[argnum + 1 :], **kwargs)) * delta ** -1
        )

    return _np.array(gradient).reshape(x.shape)

I think this might be beneficial, in that we get a float array or an object array when needed, rather than it always being an object array. @agran2018 maybe worth changing, and seeing if the tests still pass?

@agran2018 (author) commented:

@josh146, I tried this yesterday and it did not work for me:

   247     # return gradient
--> 248     return _np.array(gradient).reshape(x.shape)
    249 
    250 

ValueError: cannot reshape array of size 4 into shape (2,)

Also, we need to keep track of the indices corresponding to the zero and non-zero components of the gradient if the list idx is given. This is important for constrained geometry optimization, which is very useful.

(Member) commented:

Oh, of course - the reshape target should be the output shape of f 🤦
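
A minimal reconstruction of why the reshape fails when f itself returns an array (the function and shapes here are made up for illustration):

    import numpy as np

    def f(x):
        # f returns a length-2 array, e.g. a vector of Hamiltonian coefficients
        return np.array([x.sum(), x.prod()])

    x = np.array([1.0, 2.0])
    delta = 0.01

    gradient = []
    for i in range(x.size):
        shift = np.zeros_like(x)
        shift[i] = 0.5 * delta
        gradient.append((f(x + shift) - f(x - shift)) / delta)

    print(np.array(gradient).shape)  # (2, 2) -- four entries in total
    # np.array(gradient).reshape(x.shape) raises:
    # ValueError: cannot reshape array of size 4 into shape (2,)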

f (function): function with signature ``f(*args, **kwargs)``
argnum (int): which argument to take the derivative with respect to
delta (float): step size used to evaluate the finite differences
idx (list[int]): if argument ``args[argnum]`` is an array, ``idx`` specifies
(Member) commented:

Maybe an example in the docstring will help? E.g., for function f(x, y, z), argnum=1, idx=[3, 2] will differentiate f with respect to element (3, 2) of argument y.

pennylane/_grad.py (several outdated review threads, resolved)

@agran2018 added the review-ready 👌 label (PRs which are ready for review by someone from the core team) and removed the WIP 🚧 Work-in-progress label on Mar 16, 2021
@agran2018 merged commit a98f252 into master on Mar 17, 2021
@agran2018 deleted the gradient-hamiltonian branch on March 17, 2021 at 19:30