-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add open_datatree to xarray #8697
Conversation
…-datatree-attempt2
Adds additional ignore to mypy Adds additional ignore to doctests Excludes xarray/datatree_ from all pre-commmit.ci
First stab. Will need to add/move tests.
I mistakenly thought we wanted to use the hidden version of datatree_ and we do not.
We do not want to expose open_datatree at top level until all of the code is migrated.
xarray/backends/api.py
Outdated
engine : str, optional | ||
Xarray backend engine to us. Valid options include `{"netcdf4", "h5netcdf", "zarr"}`. | ||
kwargs : | ||
Additional keyword arguments passed to :py:meth:`~xarray.open_dataset` for each group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't exist yet. at that location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you explain why? This should exist, if you use :py:func:
instead of :py:meth:
(you can use sphobjinv
to find the right role)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it doesn't exist at the top level because we wanted to migrate over the code before allowing a direct import.
from xarray import open_datatree
Until all of the code was merged, we said it would be imported from xarray.core.datatree. From #8572 First comment "EDIT: We decided it should return an xarray.DataTree object, or even xarray.core.datatree.DataTree object. So we can start by just copying the basic version in datatree/io.py right now which just calls open_dataset many times."
So I had thought xarray.core.datatree.DataTree made more sense for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the second part of your comment, makes me think I might not understand what you were asking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess what I'm asking is: open_dataset
definitely exists (and it should be possible to link to), so I suppose you meant to write xarray.open_datatree
?
Additional keyword arguments passed to :py:meth:`~xarray.open_dataset` for each group. | |
Additional keyword arguments passed to :py:func:`~xarray.open_datatree` for each group. |
Also, once that's fixed, does this cause the docs build to fail? If not I believe it would be fine to leave as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reflection, I think this should still be open_dataset
. That is what is in the original docs. And this is describing open_datatree
, where it's calling open_dataset
with these **kwargs
each time. So I think this commit should be reverted. As to :py:meth:
vs :py:func:
I don't know and will look at the difference unless you tell me before I figure it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure which tests to migrate/write with this change.
I think we can start by moving test_io.py
to a new folder xarray/tests/datatree/test_io.py
. This keeps a distinction between testing functionality we have collectively "okayed" and what we haven't got to yet. We can think about potentially integrating those tests into xarray/tests/test_backends.py
when we actually integrate the internals of open_datatree
with the backends internals.
I was able to open datatrees with each of the engines, netcdf4, h5netcdf and zarr.
Great! These all have tests right?
I haven't moved any documentation. I remember hearing that we could add it and mark it as experimental? Is that the correct way forward?
Yeah I'm not sure what to do for this. We could put it in the main docs but with a big experimental warning on it, but then we haven't got to the rest of the datatree functionality yet. I think I would favour moving the code first, then exposing the documentation after. Perhaps the datatree docs should be moved piecemeal to a dedicated section in the documentation (under "For developers/contributors" maybe), then moved into the main docs only once everything is done?
Also starts fixing simple mypy errors
I was thinking no, but when I went to write them. I think these are the tests I would write. They make me slightly uncomfortable only in that they use the |
Future us problem, I like that. But it also makes sense to me |
Add some typing for mygrated tests. Adds display_expand_groups to core options.
This is cargo-cult. I wonder if there's a different CI test that wanted these and since this is now excluded at the top level. I'm putting them back until migration into main codebase.
ec91d63
to
a4bad61
Compare
Marking as ready for review, but mostly to start comments coming. |
puts common parts in common.
Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
I didn't see that originally. Will they still cause problems when this is squashed merged? I assume you'd want me to get rid of the 5 commits before Jan 31st (when you merged #8688)? do you want to give me a git-wizard hint? |
The only way that I know of is interactive rebasing, but that would mean that you'd have to remove all merge commits and maybe even resolve merge conflicts again. So if you checked that github doesn't show any weird things in this PR it maybe doesn't matter. |
Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
This reverts commit aab1744.
I don't see anything weird, no conflicts or anything, just those old commits with different lineage. edit: I think it's fine if it's squash merged. If not, I did rebase this onto the current pydata/main and the resulting PR would be clean, but I don't actually want to push that on top of this. |
@flamingbear bear is there a comment thread for the mypy issue you raised earlier? I have no idea why mypy needs to be explicitly told about the types here, but mypy seems to pass CI on this branch... 🤷♂️ |
Also @flamingbear this now has two approvals - is it ready to merge from your end? |
I think that's because it's excluded in this PR. I've moved the that test module in the PR you linked to.
It is ready on my end. Merge away. 🙇 |
Let's start a habit of adding a |
Did some pattern matching because I'm not building the docs without error anymore locally. |
Head branch was pushed to by a user without write access
@TomNicholas I was hoping to kick off the CI again by fixing the rst formating since the macos 3.11 had timed out and was hanging. But I also turned off automerge. |
I'm kinda late with this, but one concern is that when merging the So my question is: should we remove that exclude and instead put the whole repo into build artifacts, or should we revert this and wait until we have done the series of PRs copying over the |
If we are not advertising any of this to be imported by users until the move is complete then does the distinction matter? |
if we're fine with it being in a broken state (because |
I personally think that's fine - we're likely going to have |
Draft: I'd like to open this up and start a discussion on a couple of things.
Here is a first stab at adding open_datatree to the backend of xarray.
I'm not sure which tests to migrate/write with this change. The only datatree tests that have open_datatree are the ones in tests/test_io.py. While those (and all of the datatree tests) run and pass, they are still located in _datatree, and they don't seem to fit in the test_backends.py. I do see that Tom did a good job of naming the tests so that they would fit with the existing tests in many places.Migrated tests over into xarray/tests/datatree once approved and merged can be added into the existing tests where appropriate (from Tom).No check boxes checked yet.
whats-new.rst
api.rst