Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT: towards new h5netcdf/netcdf4 features #9509

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

kmuehlbauer
Copy link
Contributor

@kmuehlbauer kmuehlbauer commented Sep 17, 2024

.. ipython:: python
:okwarning:

# Writing complex valued data
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should keep the example and highlight the new auto_complex=True of netcdf4-python?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be very useful to have an example.

Is it still true that a warning is raised? Is that warning xarray or h5netcdf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be no warning raised if capable versions are detected.

@@ -566,29 +566,12 @@ This is not CF-compliant but again facilitates roundtripping of xarray datasets.
Invalid netCDF files
~~~~~~~~~~~~~~~~~~~~

The library ``h5netcdf`` allows writing some dtypes (booleans, complex, ...) that aren't
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It boils down, that we are approaching feature equality between netcdf4-python and h5netcdf. It seems, that only reference objects can't be handled by netcdf-c.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Todo: check boolean enums

xarray/backends/h5netcdf_.py Outdated Show resolved Hide resolved
xarray/backends/netCDF4_.py Outdated Show resolved Hide resolved
@@ -537,6 +537,7 @@ def _choose_float_dtype(
if dtype.itemsize <= 2 and np.issubdtype(dtype, np.integer):
return np.float32
# For all other types and circumstances, we just use float64.
# Todo: with nc-complex from netcdf4-python >= 1.7.0 this is available
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: Add a test for this and fix it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe leave it to dedicated follow-up PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be an issue? nc-complex still stores the data as floats, just either in a compound type or with an extra dimension (netCDF-3).

Although, HDF5 1.15 will have native support for complex numbers, so that may be a future consideration (I suspect it will still be some time before netCDF itself gets native support)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, maybe it will work out of the box.

xarray/tests/test_backends.py Outdated Show resolved Hide resolved
@kmuehlbauer kmuehlbauer marked this pull request as ready for review September 18, 2024 08:21
@kmuehlbauer kmuehlbauer reopened this Sep 18, 2024
@kmuehlbauer kmuehlbauer added the run-upstream Run upstream CI label Sep 18, 2024
@kmuehlbauer
Copy link
Contributor Author

@ZedThree @bzah it would be great if you could have a look here. This brings nc-complex (netcdf4/h5netcdf) and enum handling (h5netcdf) to the backends. This is not released yet in h5netcdf, but in latest main-branch (checked with upstream CI run). If you have anything to add, please let me know.

Copy link
Contributor

@ZedThree ZedThree left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

.. ipython:: python
:okwarning:

# Writing complex valued data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be very useful to have an example.

Is it still true that a warning is raised? Is that warning xarray or h5netcdf?

@@ -1230,6 +1231,7 @@ def to_netcdf(
compute: bool = True,
multifile: Literal[False] = False,
invalid_netcdf: bool = False,
auto_complex: bool | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does auto_complex = None mean? Is it the same as auto_complex = False?
EDIT: Ah, to avoid passing the argument to h5netcdf? But invalid_netcdf is just a plain bool, so is that handled differently for netCDF4?

Should we also consider defaulting to True? I'm definitely interested in finding out if there are any cases where this results in unwanted behaviour

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not sure, if I handled that correctly. It's more or less a security measure against feeding the kwarg to versions which can't handle those yet. I think this was the simplest solution without adding a whole bunch of boilerplate code. But maybe there is an easier solution to that.

xarray/backends/netCDF4_.py Show resolved Hide resolved
xarray/backends/netCDF4_.py Outdated Show resolved Hide resolved
Comment on lines +444 to +445
if auto_complex is not None:
kwargs["auto_complex"] = auto_complex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to avoid passing auto_complex to the h5netcdf engine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To not feed it to old versions which can't digest it.

@@ -537,6 +537,7 @@ def _choose_float_dtype(
if dtype.itemsize <= 2 and np.issubdtype(dtype, np.integer):
return np.float32
# For all other types and circumstances, we just use float64.
# Todo: with nc-complex from netcdf4-python >= 1.7.0 this is available
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be an issue? nc-complex still stores the data as floats, just either in a compound type or with an extra dimension (netCDF-3).

Although, HDF5 1.15 will have native support for complex numbers, so that may be a future consideration (I suspect it will still be some time before netCDF itself gets native support)

Co-authored-by: Peter Hill <zed.three@gmail.com>
@kmuehlbauer
Copy link
Contributor Author

Thanks @ZedThree for taking the time to review. I've added couple of comments.

Comment on lines +1411 to +1413
if auto_complex is not None:
kwargs["auto_complex"] = auto_complex

Copy link
Contributor

@ZedThree ZedThree Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing the discussion above about making auto_complex: bool, then this could be:

Suggested change
if auto_complex is not None:
kwargs["auto_complex"] = auto_complex
if auto_complex:
if engine != "netcdf4":
raise ValueError(
f"unrecognised option 'auto_complex' for engine {engine}"
)
kwargs["auto_complex"] = auto_complex

This would mean we couldn't default to having this on though.

EDIT: Ah, but this also needs to handle earlier versions without the argument too. Hmmm, trickier!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-upstream Run upstream CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Writing complex numbers to netCDF
2 participants