FIX: correct dask array handling in _calc_idxminmax #3922
Conversation
Hello @kmuehlbauer! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 (Comment last updated at 2020-05-09 14:03:57 UTC)
xarray/core/computation.py (Outdated)

if isinstance(array.data, dask_array_type):
    res = array.map_blocks(
        lambda a, b: a[b], coordarray, indx, dtype=indx.dtype
    ).compute()
What breaks if you don't call `.compute()`?
further down, this will break:
# The dim is gone but we need to remove the corresponding coordinate.
del res.coords[dim]
# Copy attributes from argmin/argmax, if any
res.attrs = indx.attrs
My dask knowledge is lacking, so I was not able to come up with a better solution, unfortunately.
ah this was quite broken. I just pushed a commit. Please see if that works. Clearly we need to add some tests with dask-backed objects.
@dcherian Thanks, I'll have a look first thing next morning (CET).
+1 for adding dask tests
@dcherian This seemed to work until computation:
Full Traceback:

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in compute(self, **kwargs)
839 """
840 new = self.copy(deep=False)
--> 841 return new.load(**kwargs)
842
843 def persist(self, **kwargs) -> "DataArray":
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in load(self, **kwargs)
813 dask.array.compute
814 """
--> 815 ds = self._to_temp_dataset().load(**kwargs)
816 new = self._from_temp_dataset(ds)
817 self._variable = new._variable
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in load(self, **kwargs)
654
655 # evaluate all the dask arrays simultaneously
--> 656 evaluated_data = da.compute(*lazy_data.values(), **kwargs)
657
658 for k, data in zip(lazy_data, evaluated_data):
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/base.py in compute(*args, **kwargs)
435 keys = [x.__dask_keys__() for x in collections]
436 postcomputes = [x.__dask_postcompute__() for x in collections]
--> 437 results = schedule(dsk, keys, **kwargs)
438 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
439
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
74 pools[thread][num_workers] = pool
75
---> 76 results = get_async(
77 pool.apply_async,
78 len(pool._pool),
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
484 _execute_task(task, data) # Re-execute locally
485 else:
--> 486 raise_exception(exc, tb)
487 res, worker_id = loads(res_info)
488 state["cache"][key] = res
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in reraise(exc, tb)
314 if exc.__traceback__ is not tb:
315 raise exc.with_traceback(tb)
--> 316 raise exc
317
318
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
220 try:
221 task, data = loads(task_info)
--> 222 result = _execute_task(task, data)
223 id = get_id()
224 result = dumps((result, id))
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
119 # temporaries by their reference count and can execute certain
120 # operations in-place.
--> 121 return func(*(_execute_task(a, cache) for a in args))
122 elif not ishashable(arg):
123 return arg
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/optimization.py in __call__(self, *args)
980 if not len(args) == len(self.inkeys):
981 raise ValueError("Expected %d args, got %d" % (len(self.inkeys), len(args)))
--> 982 return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
983
984 def __reduce__(self):
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in get(dsk, out, cache)
149 for key in toposort(dsk):
150 task = dsk[key]
--> 151 result = _execute_task(task, cache)
152 cache[key] = result
153 result = _execute_task(out, cache)
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
119 # temporaries by their reference count and can execute certain
120 # operations in-place.
--> 121 return func(*(_execute_task(a, cache) for a in args))
122 elif not ishashable(arg):
123 return arg
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/computation.py in <lambda>(ind, coord)
1387 res = indx.copy(
1388 data=indx.data.map_blocks(
-> 1389 lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
1390 )
1391 )
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in __getitem__(self, key)
642 else:
643 # xarray-style array indexing
--> 644 return self.isel(indexers=self._item_key_to_dict(key))
645
646 def __setitem__(self, key: Any, value: Any) -> None:
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in isel(self, indexers, drop, **indexers_kwargs)
1020 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "isel")
1021 if any(is_fancy_indexer(idx) for idx in indexers.values()):
-> 1022 ds = self._to_temp_dataset()._isel_fancy(indexers, drop=drop)
1023 return self._from_temp_dataset(ds)
1024
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in _isel_fancy(self, indexers, drop)
1962 # Note: we need to preserve the original indexers variable in order to merge the
1963 # coords below
-> 1964 indexers_list = list(self._validate_indexers(indexers))
1965
1966 variables: Dict[Hashable, Variable] = {}
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in _validate_indexers(self, indexers)
1805
1806 if v.ndim > 1:
-> 1807 raise IndexError(
1808 "Unlabeled multi-dimensional array cannot be "
1809 "used for indexing: {}".format(k)
IndexError: Unlabeled multi-dimensional array cannot be used for indexing: array_bin
@kmuehlbauer what's the code that's running to generate that traceback? I can try and help in lieu of @dcherian. Thanks for giving this a go! And CC @toddrjen if they have any insight.
@max-sixty Thanks! I'll really appreciate your help. I've tracked the possible source down to a dimension problem. I've tried to create a minimal example as follows, using the current implementation:

# create dask backed 3d array
darray = da.from_array(np.random.RandomState(0).randn(10*20*30).reshape(10, 20, 30), chunks=(10, 20, 30), name='data_arr')
array = xr.DataArray(darray, dims=["x", "y", 'z'])
array = array.assign_coords({'x': (['x'], np.arange(10)),
'y': (['y'], np.arange(20)),
'z': (['z'], np.arange(30)),
})
func = lambda x, *args, **kwargs: x.argmax(*args, **kwargs)
indx = func(array, dim='z', axis=None, keep_attrs=True, skipna=False)
coordarray = array['z']
res = indx.copy(
data=indx.data.map_blocks(
lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
)
)
print(res)
# the following line breaks
print(res.compute())

# using only a 2-dim array, everything works as intended
array2d = array.sel(y=0, drop=True)
indx = func(array2d, dim='z', axis=None, keep_attrs=True, skipna=False)
coordarray = array['z']
res = indx.copy(
data=indx.data.map_blocks(
lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
)
)
print(res)
# this works for two dim data
print(res.compute())
The issue is that indexing a DataArray with an unlabeled multi-dimensional array is not supported; that's the IndexError above. To get your 3D example (and potentially every N-D example) to work, simply fall back to the wrapped array's integer indexing (using array.z.data instead of the coordarray DataArray):

In [2]: darray = da.from_array(
...: np.random.RandomState(0).randn(10 *20 * 30).reshape(10, 20, 30),
...: chunks=(1, 20, 30), # so we actually have multiple blocks
...: name='data_arr'
...: )
...: array = xr.DataArray(
...: darray,
...: dims=["x", "y", 'z'],
...: coords={"x": np.arange(10), "y": np.arange(20), "z": np.arange(30)},
...: )
...: array
Out[2]:
<xarray.DataArray 'data_arr' (x: 10, y: 20, z: 30)>
dask.array<data_arr, shape=(10, 20, 30), dtype=float64, chunksize=(1, 20, 30), chunktype=numpy.ndarray>
Coordinates:
* x (x) int64 0 1 2 3 4 5 6 7 8 9
* y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
* z (z) int64 0 1 2 3 4 5 6 7 8 9 10 ... 20 21 22 23 24 25 26 27 28 29
In [3]: indx = array.argmin(dim='z', keep_attrs=True, skipna=False)
...: res = indx.copy(
...: data=indx.data.map_blocks(
...: lambda ind, coord: coord[(ind,)],
...: array.z.data,
...: dtype=array.z.dtype
...: )
...: )
In [4]: res.compute()
Out[4]:
<xarray.DataArray 'data_arr' (x: 10, y: 20)>
array([[20, 3, 3, 11, 20, 17, 3, 27, 24, 1, 7, 4, 22, 14, 7, 18,
5, 18, 7, 19],
[10, 21, 25, 3, 15, 25, 28, 8, 10, 9, 13, 3, 24, 17, 19, 23,
12, 19, 19, 28],
[ 1, 26, 10, 9, 16, 8, 17, 8, 6, 24, 28, 13, 23, 22, 26, 13,
28, 11, 6, 16],
[ 6, 9, 26, 27, 1, 2, 21, 8, 10, 19, 14, 14, 20, 25, 24, 4,
18, 12, 20, 2],
[22, 5, 12, 17, 13, 23, 23, 8, 27, 22, 1, 19, 26, 16, 12, 17,
19, 28, 8, 12],
[20, 8, 25, 13, 4, 12, 23, 13, 27, 18, 15, 28, 10, 10, 0, 12,
5, 14, 5, 27],
[29, 0, 19, 7, 15, 2, 8, 8, 13, 4, 12, 1, 7, 19, 14, 0,
3, 7, 12, 9],
[ 9, 8, 4, 9, 17, 6, 7, 5, 29, 0, 15, 28, 22, 6, 24, 24,
20, 0, 24, 23],
[ 1, 19, 12, 20, 4, 26, 5, 13, 21, 26, 25, 10, 5, 1, 11, 21,
6, 18, 4, 21],
[15, 27, 13, 7, 25, 3, 14, 14, 17, 15, 11, 4, 16, 22, 22, 23,
0, 16, 26, 13]])
Coordinates:
* x (x) int64 0 1 2 3 4 5 6 7 8 9
* y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

Note that in this case […]
@keewis Thanks a bunch for the explanation. Would we be on the safe side if we use your proposed N-D example? It also works in the 2D case.
👍
@keewis I checked with my datasets, works like a charm. I'll try to add dask tests for this as @shoyer suggested. Where should these tests go? Currently the idxmax/idxmin tests are in test_dataarray and test_dataset:

xarray/xarray/tests/test_dataarray.py, line 4512 in b3bafee
xarray/xarray/tests/test_dataarray.py, line 4608 in b3bafee
xarray/xarray/tests/test_dataset.py, lines 4603 to 4610 in 1416d5a

Any pointers?
I'd put the […] Edit: that's easy for […]
…on, attach dim name to result
@keewis I've started by adding the dask tests to the existing idxmin/idxmax tests in test_dataarray.
it seems you can't use argmin / argmax with dask arrays of datetime64 values:

In [24]: time = np.asarray(pd.date_range("2019-07-17", periods=10))
...: array = xr.DataArray(
...: time,
...: dims="x",
...: coords={"x": np.arange(time.size) * 4},
...: ).chunk({})
...: array
Out[24]:
<xarray.DataArray (x: 10)>
dask.array<xarray-<this-array>, shape=(10,), dtype=datetime64[ns], chunksize=(10,), chunktype=numpy.ndarray>
Coordinates:
* x (x) int64 0 4 8 12 16 20 24 28 32 36
In [25]: array.compute().argmin(dim="x")
Out[25]:
<xarray.DataArray ()>
array(0)
In [26]: array.argmin(dim="x")
---------------------------------------------------------------------------
UFuncTypeError Traceback (most recent call last)
<ipython-input-26-e665d5b1b9b4> in <module>
----> 1 array.argmin(dim="x")
.../xarray/core/common.py in wrapped_func(self, dim, axis, skipna, **kwargs)
44
45 def wrapped_func(self, dim=None, axis=None, skipna=None, **kwargs):
---> 46 return self.reduce(func, dim, axis, skipna=skipna, **kwargs)
47
48 else:
.../xarray/core/dataarray.py in reduce(self, func, dim, axis, keep_attrs, keepdims, **kwargs)
2260 """
2261
-> 2262 var = self.variable.reduce(func, dim, axis, keep_attrs, keepdims, **kwargs)
2263 return self._replace_maybe_drop_dims(var)
2264
.../xarray/core/variable.py in reduce(self, func, dim, axis, keep_attrs, keepdims, allow_lazy, **kwargs)
1573
1574 if axis is not None:
-> 1575 data = func(input_data, axis=axis, **kwargs)
1576 else:
1577 data = func(input_data, **kwargs)
.../xarray/core/duck_array_ops.py in f(values, axis, skipna, **kwargs)
302
303 try:
--> 304 return func(values, axis=axis, **kwargs)
305 except AttributeError:
306 if not isinstance(values, dask_array_type):
.../xarray/core/duck_array_ops.py in f(*args, **kwargs)
45 else:
46 wrapped = getattr(eager_module, name)
---> 47 return wrapped(*args, **kwargs)
48
49 else:
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/reductions.py in wrapped(x, axis, split_every, out)
1002
1003 def wrapped(x, axis=None, split_every=None, out=None):
-> 1004 return arg_reduction(
1005 x, chunk, combine, agg, axis, split_every=split_every, out=out
1006 )
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/reductions.py in arg_reduction(x, chunk, combine, agg, axis, split_every, out)
980 tmp = Array(graph, name, chunks, dtype=x.dtype)
981 dtype = np.argmin([1]).dtype
--> 982 result = _tree_reduce(tmp, agg, axis, False, dtype, split_every, combine)
983 return handle_out(out, result)
984
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/reductions.py in _tree_reduce(x, aggregate, axis, keepdims, dtype, split_every, combine, name, concatenate, reduced_meta)
243 if concatenate:
244 func = compose(func, partial(_concatenate2, axes=axis))
--> 245 return partial_reduce(
246 func,
247 x,
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/reductions.py in partial_reduce(func, x, split_every, keepdims, dtype, name, reduced_meta)
314 if is_arraylike(meta) and meta.ndim != len(out_chunks):
315 if len(out_chunks) == 0:
--> 316 meta = meta.sum()
317 else:
318 meta = meta.reshape((0,) * len(out_chunks))
~/.conda/envs/xarray/lib/python3.8/site-packages/numpy/core/_methods.py in _sum(a, axis, dtype, out, keepdims, initial, where)
36 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,
37 initial=_NoValue, where=True):
---> 38 return umr_sum(a, axis, dtype, out, keepdims, initial, where)
39
40 def _prod(a, axis=None, dtype=None, out=None, keepdims=False,
UFuncTypeError: ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('<M8[ns]')

I guess that's a dask bug.

Edit: you can reproduce it without xarray. MWE with only numpy / dask.array:

In [32]: time = np.asarray(pd.date_range("2019-07-17", periods=10))
...: np.argmin(da.from_array(time))
---------------------------------------------------------------------------
UFuncTypeError Traceback (most recent call last)
<ipython-input-32-190cb901ff65> in <module>
1 time = np.asarray(pd.date_range("2019-07-17", periods=10))
----> 2 np.argmin(da.from_array(time))
<__array_function__ internals> in argmin(*args, **kwargs)
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/core.py in __array_function__(self, func, types, args, kwargs)
1348 if da_func is func:
1349 return handle_nonmatching_names(func, args, kwargs)
-> 1350 return da_func(*args, **kwargs)
1351
1352 @property
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/reductions.py in wrapped(x, axis, split_every, out)
1002
1003 def wrapped(x, axis=None, split_every=None, out=None):
-> 1004 return arg_reduction(
1005 x, chunk, combine, agg, axis, split_every=split_every, out=out
1006 )
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/reductions.py in arg_reduction(x, chunk, combine, agg, axis, split_every, out)
980 tmp = Array(graph, name, chunks, dtype=x.dtype)
981 dtype = np.argmin([1]).dtype
--> 982 result = _tree_reduce(tmp, agg, axis, False, dtype, split_every, combine)
983 return handle_out(out, result)
984
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/reductions.py in _tree_reduce(x, aggregate, axis, keepdims, dtype, split_every, combine, name, concatenate, reduced_meta)
243 if concatenate:
244 func = compose(func, partial(_concatenate2, axes=axis))
--> 245 return partial_reduce(
246 func,
247 x,
~/.conda/envs/xarray/lib/python3.8/site-packages/dask/array/reductions.py in partial_reduce(func, x, split_every, keepdims, dtype, name, reduced_meta)
314 if is_arraylike(meta) and meta.ndim != len(out_chunks):
315 if len(out_chunks) == 0:
--> 316 meta = meta.sum()
317 else:
318 meta = meta.reshape((0,) * len(out_chunks))
~/.conda/envs/xarray/lib/python3.8/site-packages/numpy/core/_methods.py in _sum(a, axis, dtype, out, keepdims, initial, where)
36 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,
37 initial=_NoValue, where=True):
---> 38 return umr_sum(a, axis, dtype, out, keepdims, initial, where)
39
40 def _prod(a, axis=None, dtype=None, out=None, keepdims=False,
UFuncTypeError: ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('<M8[ns]')
@keewis OK, how should I handle this? Shall we xfail these tests then?
I think so? For now, xfail if the dtype is datetime64.
…ray, xfail dask tests for dtype dateime64 (M)
Seems that everything goes well, besides the datetime64 issue. If this is ready for merge, should I extend the idxmin/idxmax section of whats-new.rst? And how should I distribute the credit to all contributors @dcherian, @keewis, @max-sixty?
I'd say add a new entry. Also, I think we're all just reviewers, so adding just your name should be fine.
This is ready from my end for final review. Should be merged before #3936, IMHO.
If your tests are passing now, it's likely that they're computing things to make things work. We should add the raise_if_dask_computes() context.
@dcherian Thanks for the suggestion with the dask compute context, I'll have a look at it tomorrow. Nevertheless, I've debugged locally, and the res output of idxmin/idxmax does hold dask data. Anyway, I'll revert to the former working status and leave a comment referencing this PR in the code.
I suggest updating the tests before reverting anything. This solution may work ...
@dcherian I've tried to apply the raise_if_dask_computes() context. I'm now totally unsure how to proceed from here. Any guidance very much appreciated.
@kmuehlbauer I've pushed a commit adding the decorator to just the 2D tests.
I think @shoyer is right here.
@dcherian Thanks for explaining the decorator a bit more. So it's indeed simpler than I thought. I'll revert to the map_blocks approach.
Not a problem. Thanks for working on this!
A simpler option than using map_blocks could be to convert the coordinate into a dask array and index it directly.
…ask_computes()` context to idxmin-tests
@shoyer Thanks for the hint. I'm currently experimenting with the different possibilities. non-dask:
dask:
The relevant code inside idxmin/idxmax: # This will run argmin or argmax.
# indx will be dask if array is dask
indx = func(array, dim=dim, axis=None, keep_attrs=keep_attrs, skipna=skipna)
# separated out for debugging
# with the current test layout coords will not be dask since the array's coords are not dask
coords = array[dim]
# try to make the coords dask-backed as per @shoyer's suggestion
# the below fails silently; it cannot be forced even by trying
# something like dask.array.asarray(), which errors out with
# "Cannot assign to the .data attribute of dimension coordinate a.k.a IndexVariable 'x'"
if isinstance(indx.data, dask_array_type):
coords = coords.chunk({})
res = coords[(indx,)]

It seems that the map_blocks approach is the only one which seems to work throughout the tests, except one: it fails with the array set to array.astype("object") in the test fixture. Reason: dask gets computed within argmin/argmax. I'll revert to map_blocks now and add the raise_if_dask_computes() context.
Error log of the compute error:
Yeah, interestingly we don't raise an error when trying to chunk IndexVariables. I've pushed a commit where we extract the underlying numpy array, chunk that, index it, and then wrap it up in a DataArray o_O.
The compute error is from here: Lines 48 to 60 in 6a6f2c8
I think we'll have to rethink the skipna conditions for dask arrays so that the compute doesn't happen. Or figure out why we do this check in the first place. Hmm...
This reverts commit 58901b9.
…k-issues

* upstream/master: (22 commits)
  support darkmode (pydata#4036)
  Use literal syntax instead of function calls to create the data structure (pydata#4038)
  Add template xarray object kwarg to map_blocks (pydata#3816)
  Transpose coords by default (pydata#3824)
  Remove broken test for Panel with to_pandas() (pydata#4028)
  Allow warning with cartopy in docs plotting build (pydata#4032)
  Support overriding existing variables in to_zarr() without appending (pydata#4029)
  chore: Remove unnecessary comprehension (pydata#4026)
  fix to_netcdf docstring typo (pydata#4021)
  Pint support for DataArray (pydata#3643)
  Apply blackdoc to the documentation (pydata#4012)
  ensure Variable._repr_html_ works (pydata#3973)
  Fix handling of abbreviated units like msec (pydata#3998)
  full_like: error on non-scalar fill_value (pydata#3979)
  Fix some code quality and bug-risk issues (pydata#3999)
  DOC: add pandas.DataFrame.to_xarray (pydata#3994)
  Better chunking error messages for zarr backend (pydata#3983)
  Silence sphinx warnings (pydata#3990)
  Fix distributed tests on upstream-dev (pydata#3989)
  Add multi-dimensional extrapolation example and mention different behavior of kwargs in interp (pydata#3956)
  ...
The test fails for object arrays because we compute eagerly in _nan_argminmax_object. To solve this we could […]
For now, I bumped up the expected number of computes in the tests.
Thanks @dcherian for getting back to this. Unfortunately, this adventure went beyond my capabilities. Nevertheless, I hope to catch up on learning xarray's internals.
* Added chunks='auto' option in dataset.py
* FIX: correct dask array handling in _calc_idxminmax (#3922)
* FIX: correct dask array handling in _calc_idxminmax
* FIX: remove unneeded import, reformat via black
* fix idxmax, idxmin with dask arrays
* FIX: use array[dim].data in `_calc_idxminmax` as per @keewis suggestion, attach dim name to result
* ADD: add dask tests to `idxmin`/`idxmax` dataarray tests
* FIX: add back fixture line removed by accident
* ADD: complete dask handling in `idxmin`/`idxmax` tests in test_dataarray, xfail dask tests for dtype dateime64 (M)
* ADD: add "support dask handling for idxmin/idxmax" in whats-new.rst
* MIN: reintroduce changes added by #3953
* MIN: change if-clause to use `and` instead of `&` as per review-comment
* MIN: change if-clause to use `and` instead of `&` as per review-comment
* WIP: remove dask handling entirely for debugging purposes
* Test for dask computes
* WIP: re-add dask handling (map_blocks-approach), add `with raise_if_dask_computes()` context to idxmin-tests
* Use dask indexing instead of map_blocks.
* Better chunk choice.
* Return -1 for _nan_argminmax_object if all NaNs along dim
* Revert "Return -1 for _nan_argminmax_object if all NaNs along dim". This reverts commit 58901b9.
* Raise error for object arrays
* No error for object arrays. Instead expect 1 compute in tests.

Co-authored-by: dcherian <deepak@cherian.net>

* fix the failing flake8 CI (#4057)
* rename d and l to dim and length
* Fixed typo in rasterio docs (#4063)
* Added chunks='auto' option in dataset.py. Added changes to whats-new.rst
* Added chunks='auto' option in dataset.py. Added changes to whats-new.rst
* Error fix, catch chunks=None
* Minor reformatting + flake8 changes
* Added isinstance(chunks, (Number, str)) in dataset.py, passing
* format changes
* added auto-chunk test for dataarrays
* Assert chunk sizes equal in auto-chunk test

Co-authored-by: Kai Mühlbauer <kmuehlbauer@users.noreply.github.com>
Co-authored-by: dcherian <deepak@cherian.net>
Co-authored-by: keewis <keewis@users.noreply.github.com>
Co-authored-by: clausmichele <31700619+clausmichele@users.noreply.github.com>
Co-authored-by: Keewis <keewis@posteo.de>
Fixes dask handling for the idxmin/idxmax implementation in #3871.
- Passes isort -rc . && black . && mypy . && flake8
- Fully documented, including whats-new.rst for all changes and api.rst for new API