Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable __setitem__ for dask versions that support it. #5174

Merged
merged 27 commits into from
May 1, 2021
Merged

Enable __setitem__ for dask versions that support it. #5174

merged 27 commits into from
May 1, 2021

Conversation

tammasloughran
Copy link
Contributor

The basics seem to work for me with just these changes. But I haven't tested much to see if it plays nicely with the rest of xarray. Looks like dask's item assignment is still mostly limited to 1D arrays and slices of multi dimensional arrays—no fancy stuff.

@pep8speaks
Copy link

pep8speaks commented Apr 16, 2021

Hello @tammasloughran! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-05-01 11:11:23 UTC

xarray/core/indexing.py Outdated Show resolved Hide resolved
xarray/core/indexing.py Outdated Show resolved Hide resolved
@max-sixty
Copy link
Collaborator

Looks good! Not sure whether there's a reasonable way of testing this...

Are there any current tests for the error that break on the new version (i.e. because it doesn't raise an error)?

@tammasloughran
Copy link
Contributor Author

Are there any current tests for the error that break on the new version (i.e. because it doesn't raise an error)?

I don't quite understand understand what you mean. Like this?

@max-sixty
Copy link
Collaborator

I don't quite understand understand what you mean. Like this?

Forgive me not being clear. The current test with the version check is great. And I hadn't realized there was a released dask version — even better that the test is already running on both versions of the code.

xarray/core/indexing.py Outdated Show resolved Hide resolved
xarray/tests/test_indexing.py Outdated Show resolved Hide resolved
xarray/tests/test_indexing.py Outdated Show resolved Hide resolved
xarray/tests/test_indexing.py Outdated Show resolved Hide resolved
@dcherian
Copy link
Contributor

Thanks @tammasloughran This is a great PR!

@tammasloughran
Copy link
Contributor Author

tammasloughran commented Apr 19, 2021

I don't quite understand understand what you mean. Like this?

Forgive me not being clear. The current test with the version check is great. And I hadn't realized there was a released dask version — even better that the test is already running on both versions of the code.

Well... I see now there is also a duplicate test here, which fails when using the newer dask version. I don't understand why xarray goes to all the trouble of directly unit testing an external library. Makes more sense to me to only test the xarray wrapper for the dask functionality. I could add another version check there, or remove that test. What do you think?

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
@dcherian
Copy link
Contributor

I see now there is also a duplicate test here, which fails when using the newer dask version. I

That's testing the Variable object which is the lowest-level Xarray object. It would be a good place to add the assignment tests

xarray/core/indexing.py Outdated Show resolved Hide resolved
@tammasloughran tammasloughran requested a review from shoyer April 19, 2021 21:08
Copy link
Member

@andersy005 andersy005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tammasloughran, I left a few inline suggestions

xarray/core/indexing.py Outdated Show resolved Hide resolved
xarray/core/indexing.py Outdated Show resolved Hide resolved
@max-sixty
Copy link
Collaborator

After those two suggestions, let's merge?

tammasloughran and others added 4 commits April 21, 2021 23:02
Co-authored-by: Anderson Banihirwe <axbanihirwe@ualr.edu>
Co-authored-by: Anderson Banihirwe <axbanihirwe@ualr.edu>
@max-sixty
Copy link
Collaborator

@tammasloughran would be great to get this in by #5232 if you have any time, it's great work. I think it's fairly close to the tests passing.

xarray/core/indexing.py Outdated Show resolved Hide resolved
@Illviljan
Copy link
Contributor

The errors seems unrelated? Seems to rather be a mismatch in the CI?

A dask method __dask_distributed_pack__ is complaining:

>       from distributed.protocol.core import dumps_msgpack
E       ImportError: cannot import name 'dumps_msgpack' from 'distributed.protocol.core' (/usr/share/miniconda/envs/xarray-tests/lib

But those seems to have been removed in dask/distributed#4677. Further reading in dask/dask-yarn#147 (comment)

Co-authored-by: Illviljan <14371165+Illviljan@users.noreply.github.com>
@keewis
Copy link
Collaborator

keewis commented Apr 30, 2021

A dask method __dask_distributed_pack__ is complaining:

that seems to have been a conda issue (but it's gone now), and the remaining failure is a known flaky test, so CI looks good

Comment on lines 113 to 144
if LooseVersion(dask.__version__) >= LooseVersion("2021.04.0+17"):
arr = Variable(("x"), da.array([1, 2, 3, 4]))
expected = Variable(("x"), da.array([99, 2, 3, 4]))
arr[0] = 99 # Indexing by integers
assert_identical(arr, expected)
arr = Variable(("x"), da.array([1, 2, 3, 4]))
expected = Variable(("x"), da.array([99, 99, 99, 4]))
arr[2::-1] = 99 # Indexing by slices
assert_identical(arr, expected)
arr = Variable(("x"), da.array([1, 2, 3, 4]))
expected = Variable(("x"), da.array([99, 99, 3, 99]))
arr[[0, -1, 1]] = 99 # Indexing by a list of integers
assert_identical(arr, expected)
arr = Variable(("x"), da.array([1, 2, 3, 4]))
expected = Variable(("x"), da.array([99, 99, 99, 4]))
arr[np.arange(3)] = 99 # Indexing by a 1-d numpy array of integers
assert_identical(arr, expected)
arr = Variable(("x"), da.array([1, 2, 3, 4]))
expected = Variable(("x"), da.array([1, 99, 99, 99]))
arr[[False, True, True, True]] = 99 # Indexing by a list of booleans
assert_identical(arr, expected)
arr = Variable(("x"), da.array([1, 2, 3, 4]))
expected = Variable(("x"), da.array([1, 99, 99, 99]))
arr[np.arange(4) > 0] = 99 # Indexing by a 1-d numpy array of booleans
assert_identical(arr, expected)
arr = Variable(("x"), da.array([1, 2, 3, 4]))
expected = Variable(("x"), da.array([99, 99, 99, 99]))
arr[arr > 0] = 99 # Indexing by one broadcastable Array of booleans
assert_identical(arr, expected)
else:
with pytest.raises(TypeError, match=r"stored in a dask array"):
v[:1] = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: I was wondering if we could use @pytest.mark.parametrize here to parametrize this test so as to avoid code duplication?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree, but I think I'd prioritize merging quickly. For a first PR, it's great code overall. We could add a TODO to the current code so it doesn't look like best practice to the next person.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I concur...

@max-sixty
Copy link
Collaborator

@tammasloughran this just needs a whatsnew now!

@max-sixty
Copy link
Collaborator

Excellent — thank you @tammasloughran !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants