Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDF5-Property framework integration #461

Merged
merged 82 commits into from
Feb 10, 2022

Conversation

mrossinek
Copy link
Member

@mrossinek mrossinek commented Dec 15, 2021

Summary

Closes #302

Details and comments

This PR adds HDF5 saving/loading functionality to the qiskit_nature.property module.

Open tasks (no guarantee for completeness):

  • unittests
  • documentation
  • from_hdf5 and to_hdf5 methods in all other Property sub-classes
  • store Qiskit Nature version number provide some versioning mechanism
  • update hdf5 files in unittest suite I don't think we should do this just yet. What do you think?

@mrossinek mrossinek self-assigned this Dec 15, 2021
@coveralls
Copy link

coveralls commented Dec 20, 2021

Pull Request Test Coverage Report for Build 1822761483

  • 458 of 560 (81.79%) changed or added relevant lines in 31 files are covered.
  • 4 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-0.2%) to 82.491%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit_nature/properties/second_quantization/electronic/integrals/integral_property.py 32 33 96.97%
qiskit_nature/properties/second_quantization/electronic/magnetization.py 11 12 91.67%
qiskit_nature/properties/second_quantization/vibrational/types.py 3 4 75.0%
qiskit_nature/properties/second_quantization/vibrational/vibrational_energy.py 22 23 95.65%
qiskit_nature/properties/second_quantization/vibrational/integrals/vibrational_integrals.py 31 33 93.94%
qiskit_nature/properties/property.py 18 21 85.71%
qiskit_nature/properties/second_quantization/electronic/electronic_structure_driver_result.py 18 21 85.71%
qiskit_nature/properties/second_quantization/electronic/angular_momentum.py 25 29 86.21%
qiskit_nature/properties/second_quantization/electronic/dipole_moment.py 40 44 90.91%
qiskit_nature/properties/second_quantization/driver_metadata.py 10 15 66.67%
Files with Coverage Reduction New Missed Lines %
qiskit_nature/drivers/second_quantization/electronic_structure_driver.py 1 93.75%
qiskit_nature/drivers/second_quantization/vibrational_structure_driver.py 1 87.5%
qiskit_nature/properties/property.py 1 85.37%
qiskit_nature/properties/second_quantization/electronic/electronic_structure_driver_result.py 1 94.2%
Totals Coverage Status
Change from base Build 1822613020: -0.2%
Covered Lines: 8636
Relevant Lines: 10469

💛 - Coveralls

This needs to be removed in order to ensure that the stable tutorials
remain working.
The PseudoProperty class effectively removes everything which defines
the Property class (the interpret method). So instead of having such a
pseudo-class, all previous PseudoProperty subclass are now directly
Property subclasses and the `interpret()` method existence is handled
via the `Interpretable` Protocol.
@mrossinek
Copy link
Member Author

A single leading underscore is fine; the reason to add the trailing one too would be to indicate that it is a Protocol method.

I did not find any common naming conventions supporting this. For our case here I think it is also important that these methods are documented in the public API so we should leave the method names as they are now.

While the previous return types were not wrong, for documentation
purposes using these concrete implementations is a bit nicer.
@kevinsung
Copy link
Contributor

I did not find any common naming conventions supporting this.

As I mentioned above, Python itself uses hidden dunder names like __iter__ for its protocol methods. This is not merely a convention; hiding the protocol methods is a deliberate choice that ensures there is only one obvious way to execute the protocol. The protocol method and how to implement it should be sufficiently documented with the protocol itself.

Within the Qiskit ecosystem, Qiskit Experiments has already chosen to use dunder names for its JSON serialization protocol. However, dunder names are specifically reserved, so this was a poor choice (see qiskit-community/qiskit-experiments#630). Using a single underscore on both sides seems to be the next best thing, hence my suggestion.

@woodsp-ibm
Copy link
Member

Using a single underscore on both sides seems to be the next best thing, hence my suggestion.

I do not think its appropriate here as this appears to make the methods of the interface private - that's what a leading underscore signifies as I am sure you know. Here the interface (Protocol) is used across packages and as such represents a public interface for others to use/call. I can find no example where I see such naming as you have stated being proposed/recommended in general, though I can imagine if you had a Protocol that you wanted to keep private within a package that this might be applicable, its not the case here. While I might agree with your comment/example, I don't think this is an exactly apples to apples comparison. PEP544 example methods are regularly named - with exception only when its referring to how this works with the built-ins (dunder named). And of course as a user you never call those rather you use len(), hash(), +, * etc.

@kevinsung
Copy link
Contributor

Here the interface (Protocol) is used across packages and as such represents a public interface for others to use/call.

I'm only suggesting to hide the protocol methods (to_hdf5), not the public function used to execute the protocol (save_to_hdf5).

And of course as a user you never call those rather you use len(), hash(), +, * etc.

Yes, this is the exactly the behavior I am suggesting. Just like in Python itself, there would be only one way to execute the protocol, instead of two. The user would call the publicly exposed protocol function rather than the hidden methods.

@mrossinek
Copy link
Member Author

I do not see a reason why the Protocol methods should be hidden in our context.

@woodsp-ibm Could you please have a final pass over this PR? I would like to unblock the driver refactoring PRs which depend on this.

Instead of hard-coding the expected HDF5 file structure, we test that
`to_hdf5` and `from_hdf5` work consistently with each other.
The `test_to_hdf5` tests ensure that this method executes correctly
(i.e. without errors). The `from_hdf5` tests ensure that first writing
and subsequently reading a property from a file produces an identical
instance.

In the future, once version numbers of certain properties may increase,
we should store HDF5 files and compare those against expected instances.
The README test previously failed because the iteration over the
auxiliary operator observables in the ElectronicStructureResult is
currently unable to handle the lack of certain properties which have
always been evaluated for legacy reasons (AngularMomentum,
Magnetization).
Even if we were to default them to an empty list instead of None, while
the zip command would execute normally, no results would be printed
since zip stops after the shortest length.

That being said, fixing ElectronicStructureResult is not the solution
right now in any case, since a user would be unable to manually request
the computation of AngularMomentum and Magnetization before we resolve
the issue ý¿¿¼£�qiskit-community#312
Thus, this commit reverts the exclusion of these auxiliary operators in
the case of `settings.dict_aux_operators`.
Copy link
Member

@woodsp-ibm woodsp-ibm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks for the hard work!

@mrossinek mrossinek merged commit 98de3b0 into qiskit-community:main Feb 10, 2022
@mrossinek mrossinek deleted the hdf5-property-saving branch February 10, 2022 16:13
@mrossinek mrossinek mentioned this pull request Feb 10, 2022
Anthony-Gandon pushed a commit to Anthony-Gandon/qiskit-nature that referenced this pull request May 25, 2023
* WIP: HDF5-integration for property framework

This is an initial proof-of-concept implementation.
This needs to be reviewed extensively.

* Clean up interface

* refactor: let ElectronicIntegrals derive from PseudoProperty

* refactor: let Molecule derive from PseudoProperty

* feat: implement DriverMetadata HDF5 methods

* feat: add toggle to include PseudoProperty objects in GroupedProperty iteration

* feat: implement ParticleNumber HDF5 methods

* feat: implement AngularMomentum and Magnetization HDF5 methods

* feat: implement ElectronicBasisTransform HDF5 methods

* fix: lint

* fix: fix unittests

* remove comment

* fix: spell

* feat: also store Qiskit Nature version

* handle potential error cases in Property.import_and_build_from_hdf5

* Fix copyright

* feat: introduce individual version numbers per Property class

* WIP: HDF5 integration into vibrational properties

* Fix copyright

* fix: property tutorial

* Update typehints

* Run black

* Remove @AbstractMethod from Property.from_hdf5

This needs to be removed in order to ensure that the stable tutorials
remain working.

* Add more missing typehints

* refactor: introduce HDF5Storable Protocol

* refactor: remove PseudoProperty in favor of Interpretable Protocol

The PseudoProperty class effectively removes everything which defines
the Property class (the interpret method). So instead of having such a
pseudo-class, all previous PseudoProperty subclass are now directly
Property subclasses and the `interpret()` method existence is handled
via the `Interpretable` Protocol.

* Fix linters

* Fix ASTransformer caught error type

* Fix copyright

* Fix imports

* More guards against Property type

* More import fixes

* fix: property tutorial

* fix: ElectronicStructureDriverResult.__str__

* refactor: remove Property base class where not needed

* test: basic hdf5 method unittests

* test: *StructureDriverResult from_hdf5 methods

* docs: HDF5 documentation

* refactor: formally deprecate PseudoProperty class

* docs: actual HDF5 save and load examples

* Fix spell

* fix: avoid name clash with multiple atoms of same kind

* fix: ElectronicEnergy.from_hdf5 group access

* Update unittest HDF5 resource

* fix: ParticleNumber.from_hdf5 occupation dataset access

* Fix qiskit-community#519

This is actually required for the matrices loaded during
ElectronicIntegrals.from_hdf5 to be in the correct order!

* Update qiskit_nature/properties/property.py

Co-authored-by: Steve Wood <40241007+woodsp-ibm@users.noreply.github.com>

* Update docs

* Rename save_to_hdf5(..., force -> replace)

* refactor: fix DriverMetadata HDF5 attribute names

* refactor: make from_hdf5 a staticmethod

* feat: store Molecule.units in HDF5

* fix: update expected HDF5 result

* docs: include backwards compatibility expectations

* feat: add skip_unreadable_data toggle to HDF5 loading methods

* Fix spell

Apparently Sphinx can now use `kwds` instead of `kwargs`

* docs: explicitly request error raising

* docs: use `:func:` instead of `:class:`

* docs: ensure *StructureDriverResults are documented

* refactor: enforce keyword arguments in hdf5 module

* Update driver return types

While the previous return types were not wrong, for documentation
purposes using these concrete implementations is a bit nicer.

* Run black

* Add reno

* feat: use Molecule.units during from_hdf5

* test: refactor _hdf5 method tests

Instead of hard-coding the expected HDF5 file structure, we test that
`to_hdf5` and `from_hdf5` work consistently with each other.
The `test_to_hdf5` tests ensure that this method executes correctly
(i.e. without errors). The `from_hdf5` tests ensure that first writing
and subsequently reading a property from a file produces an identical
instance.

In the future, once version numbers of certain properties may increase,
we should store HDF5 files and compare those against expected instances.

* fix: README test

The README test previously failed because the iteration over the
auxiliary operator observables in the ElectronicStructureResult is
currently unable to handle the lack of certain properties which have
always been evaluated for legacy reasons (AngularMomentum,
Magnetization).
Even if we were to default them to an empty list instead of None, while
the zip command would execute normally, no results would be printed
since zip stops after the shortest length.

That being said, fixing ElectronicStructureResult is not the solution
right now in any case, since a user would be unable to manually request
the computation of AngularMomentum and Magnetization before we resolve
the issue ý¿¿¼£�qiskit-community#312
Thus, this commit reverts the exclusion of these auxiliary operators in
the case of `settings.dict_aux_operators`.

* refactor: use only public API in PropertyTest

* refactor: update type hints

* fix: update TestVibrationalStructureDriverResult to G16 Rev.C01

Co-authored-by: Steve Wood <40241007+woodsp-ibm@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HDF5 integration for the Property framework
4 participants