Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: enable H5NetCDFStore to work with already open h5netcdf.File a… #3618

Merged
merged 7 commits into from
Jan 20, 2020

Conversation

kmuehlbauer
Copy link
Contributor

@kmuehlbauer kmuehlbauer commented Dec 13, 2019

enable H5NetCDFStore to work with open h5netcdf.File and h5netcdf.Group objects, add test

  • Tests added
  • Passes black . && mypy . && flake8
  • Fully documented, including whats-new.rst for all changes and api.rst for new API

@kmuehlbauer kmuehlbauer changed the title ENH: enable H5NetCDFStore to work with already open h5netcdf.File a… WIP: ENH: enable H5NetCDFStore to work with already open h5netcdf.File a… Dec 14, 2019
@kmuehlbauer
Copy link
Contributor Author

I'm unsure how this should be referenced in the docs/news because the netCDF4 counterpart doesn't have any mention about that feature too.

It seems that everything runs smoothly, the one error looks like it is unrelated. Please let me know, if the added test is enough and if you want me to do anything else. I'll leave the WIP mark in place so far.

Ping @shoyer, he might see if there are immediate problems with the code.

@kmuehlbauer
Copy link
Contributor Author

Happy New Year to everyone especially at pydata.

I'd be happy to actively follow any reviews to sort things out and to get this addition pulled in. Many thanks!

Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks good to me, with a minor suggestion.

It would be nice to mention this capability (along with the similar feature for netCDF4) in the IO/netCDF docs somewhere, but it doesn't have to happen in this PR.

xarray/backends/h5netcdf_.py Outdated Show resolved Hide resolved
self._group = group
self._mode = mode
self.format = None
self._filename = self.ds._root.filename
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, we don't want to rely on a private API here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately h5netcdf.Group doesn't seem to have a filename property like netCDF4.Dataset. So there is currently no direct means of extracting the filename from h5netcdf.Group using public API. Trying to to facilitate find_root_and_group now.

@kmuehlbauer
Copy link
Contributor Author

kmuehlbauer commented Jan 13, 2020

@TomNicholas After commit #1689db493f10262555196f658c52e370aacb4a33 (PR #3677) I get mypy TypCheck error:

xarray/core/dataset.py:3610: error: Item "Mapping[...]" of "Union[Dataset, Mapping[...]]" has no attribute "to_dataset"

Should I create an issue on that?

@kmuehlbauer
Copy link
Contributor Author

@shoyer I tried to address your review comments. I've added a temporary (backwards compatible) fix for #3680 to get along. Also using find_root_and_group to get hold of the filename is some kind of workaround until there is a better option.

I've added an item to whats-new.rst. For the API docs I did not find a fitting position because H5NetCDFStore and NetCDF4DataStore are nowhere mentioned there. So this might better be added in another PR by experienced xarray devs.

Unfortunately mypy checks are breaking due to some recent change (PR #3677).

@kmuehlbauer
Copy link
Contributor Author

@shoyer Please let me know if this needs rebasing after #3690 gets merged, or if I can anything do to finalize this.

@kmuehlbauer
Copy link
Contributor Author

Took the chance to rebase on latest master.

@kmuehlbauer kmuehlbauer changed the title WIP: ENH: enable H5NetCDFStore to work with already open h5netcdf.File a… ENH: enable H5NetCDFStore to work with already open h5netcdf.File a… Jan 14, 2020
@kmuehlbauer
Copy link
Contributor Author

@dcherian Following up on #3702 it would be really great if this enhancement could make it into 0.15.0 If I can be of any further assistance please ping me.

@dcherian dcherian mentioned this pull request Jan 17, 2020
11 tasks
@shoyer
Copy link
Member

shoyer commented Jan 20, 2020

This looks great, thanks for figuring this out!

@shoyer shoyer merged commit 5c97641 into pydata:master Jan 20, 2020
dcherian added a commit to dcherian/xarray that referenced this pull request Jan 20, 2020
…e-meta

* 'master' of github.com:pydata/xarray:
  Feature/align in dot (pydata#3699)
  ENH: enable `H5NetCDFStore` to work with already open h5netcdf.File a… (pydata#3618)
  One-off isort run (pydata#3705)
  hardcoded xarray.__all__ (pydata#3703)
  Bump mypy to v0.761 (pydata#3704)
  remove DataArray and Dataset constructor deprecations for 0.15  (pydata#3560)
  Tests for variables with units (pydata#3654)
dcherian added a commit to dcherian/xarray that referenced this pull request Jan 21, 2020
* upstream/master: (23 commits)
  Feature/align in dot (pydata#3699)
  ENH: enable `H5NetCDFStore` to work with already open h5netcdf.File a… (pydata#3618)
  One-off isort run (pydata#3705)
  hardcoded xarray.__all__ (pydata#3703)
  Bump mypy to v0.761 (pydata#3704)
  remove DataArray and Dataset constructor deprecations for 0.15  (pydata#3560)
  Tests for variables with units (pydata#3654)
  Add an example notebook using apply_ufunc to vectorize 1D functions (pydata#3629)
  Use encoding['dtype'] over data.dtype when possible within CFMaskCoder.encode (pydata#3652)
  allow passing any iterable to drop when dropping variables (pydata#3693)
  Typo on DataSet/DataArray.to_dict documentation (pydata#3692)
  Fix mypy type checking tests failure in ds.merge (pydata#3690)
  Explicitly convert result of pd.to_datetime to a timezone-naive type (pydata#3688)
  ds.merge(da) bugfix (pydata#3677)
  fix docstring for combine_first: returns a Dataset (pydata#3683)
  Add option to choose mfdataset attributes source. (pydata#3498)
  How do I add a new variable to dataset. (pydata#3679)
  Add map_blocks example to whats-new (pydata#3682)
  Make dask names change when chunking Variables by different amounts. (pydata#3584)
  raise an error when renaming dimensions to existing names (pydata#3645)
  ...
@kmuehlbauer kmuehlbauer deleted the enh_h5netcdfstore branch January 23, 2020 07:40
dcherian added a commit to fujiisoup/xarray that referenced this pull request Jan 25, 2020
* 'master' of github.com:pydata/xarray: (27 commits)
  bump min deps for 0.15 (pydata#3713)
  setuptools-scm and isort tweaks (pydata#3720)
  Allow binned coordinates on 1D plots y-axis. (pydata#3685)
  apply_ufunc: Add meta kwarg + bump dask to 2.2 (pydata#3660)
  setuptools-scm and one-liner setup.py (pydata#3714)
  Feature/align in dot (pydata#3699)
  ENH: enable `H5NetCDFStore` to work with already open h5netcdf.File a… (pydata#3618)
  One-off isort run (pydata#3705)
  hardcoded xarray.__all__ (pydata#3703)
  Bump mypy to v0.761 (pydata#3704)
  remove DataArray and Dataset constructor deprecations for 0.15  (pydata#3560)
  Tests for variables with units (pydata#3654)
  Add an example notebook using apply_ufunc to vectorize 1D functions (pydata#3629)
  Use encoding['dtype'] over data.dtype when possible within CFMaskCoder.encode (pydata#3652)
  allow passing any iterable to drop when dropping variables (pydata#3693)
  Typo on DataSet/DataArray.to_dict documentation (pydata#3692)
  Fix mypy type checking tests failure in ds.merge (pydata#3690)
  Explicitly convert result of pd.to_datetime to a timezone-naive type (pydata#3688)
  ds.merge(da) bugfix (pydata#3677)
  fix docstring for combine_first: returns a Dataset (pydata#3683)
  ...
dcherian added a commit to dcherian/xarray that referenced this pull request Jan 27, 2020
* master:
  Add support for CFTimeIndex in get_clean_interp_index (pydata#3631)
  sel with categorical index (pydata#3670)
  bump min deps for 0.15 (pydata#3713)
  setuptools-scm and isort tweaks (pydata#3720)
  Allow binned coordinates on 1D plots y-axis. (pydata#3685)
  apply_ufunc: Add meta kwarg + bump dask to 2.2 (pydata#3660)
  setuptools-scm and one-liner setup.py (pydata#3714)
  Feature/align in dot (pydata#3699)
  ENH: enable `H5NetCDFStore` to work with already open h5netcdf.File a… (pydata#3618)
  One-off isort run (pydata#3705)
  hardcoded xarray.__all__ (pydata#3703)
  Bump mypy to v0.761 (pydata#3704)
  remove DataArray and Dataset constructor deprecations for 0.15  (pydata#3560)
  Tests for variables with units (pydata#3654)
  Add an example notebook using apply_ufunc to vectorize 1D functions (pydata#3629)
  Use encoding['dtype'] over data.dtype when possible within CFMaskCoder.encode (pydata#3652)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants