fix datetime_to_numeric and Variable._to_numeric #2668
xarray/core/duck_array_ops.py
else:
print(type(array), array.dtype)
print(array)
return array.astype(dtype)
Some tests pass a np.ndarray with object dtype whose entries are datetime.timedelta objects, such as
<class 'numpy.ndarray'> object
[datetime.timedelta(0) datetime.timedelta(days=1)
datetime.timedelta(days=2)]
Is this expected behavior?
I don't think astype() does any conversion for object arrays.
@fujiisoup this function might be helpful for casting the object arrays:
xarray/xarray/core/variable.py
Lines 132 to 136 in d4c4682
def _possibly_convert_objects(values):
    """Convert arrays of datetime.datetime and datetime.timedelta objects into
    datetime64 and timedelta64, according to the pandas convention.
    """
    return np.asarray(pd.Series(values.ravel())).reshape(values.shape)
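For illustration, here is a minimal sketch of how a helper like this could be applied to the object array printed earlier in the thread. The function name `possibly_convert_objects` and the sample data are assumptions made for this example, and the resulting timedelta64 resolution depends on the installed pandas version.

```python
import datetime

import numpy as np
import pandas as pd


def possibly_convert_objects(values):
    # Same idea as the helper quoted above: let pandas infer a
    # datetime64/timedelta64 dtype from an object array.
    return np.asarray(pd.Series(values.ravel())).reshape(values.shape)


# Assumed sample data mirroring the array shown in the review comment.
deltas = np.array(
    [datetime.timedelta(days=d) for d in range(3)], dtype=object
)

converted = possibly_convert_objects(deltas)

# Once the array has a timedelta64 dtype, dividing by a unit works.
days = converted / np.timedelta64(1, "D")
```

After the conversion the array no longer has object dtype, so the later division by `np.timedelta64(1, datetime_unit)` operates on a proper timedelta64 array.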
Tests are still failing.
I probably do not fully understand how cftime works yet...
@spencerkclark what convention do you use for missing values in cftime arrays?
Currently there is nothing built in to xarray to support missing values for cftime arrays. I think we would have to come up with a convention.
OK, in that case let's only worry about NaT for datetime64/timedelta64 arrays.
(at least for now, until we can figure out how to handle missing values)
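As a rough illustration of what handling NaT means for datetime64/timedelta64 arrays, the sketch below uses plain NumPy and assumed sample data. It is not the code from this PR; it only shows that NaT turns into NaN once the differences are divided by a unit timedelta.

```python
import numpy as np

# Assumed sample data: one missing entry encoded as NaT.
times = np.array(
    ["2000-01-01", "NaT", "2000-01-03"], dtype="datetime64[ns]"
)

# Use an explicit offset here; a plain min() may itself return NaT.
offset = times[0]
deltas = times - offset  # timedelta64[ns]; the NaT entry stays NaT

# Dividing by a unit timedelta yields floats, and NaT becomes NaN.
numeric = deltas / np.timedelta64(1, "D")
```

The missing value therefore survives the conversion as NaN, which is the usual float convention for missing data.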
Hello @fujiisoup! Thanks for updating the PR. Cheers! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on February 10, 2019 at 15:04 Hours UTC
I couldn't make |
Sounds good to me. It's definitely best to handle numpy arrays and xarray.Variable objects separately.
Thanks @fujiisoup for your work cleaning up this logic; I have a few initial comments/questions.
doc/whats-new.rst
@@ -40,6 +40,8 @@ Enhancements
Bug fixes
~~~~~~~~~

- Bug fix for interpolation with an datetime array. (:issue:`2668`)
I think @dcherian caught this bug before it made it into a release, so maybe we do not need a bug-fix note?
xarray/tests/test_interp.py

@requires_cftime
@requires_scipy
def test_cftime_interp():
Do you think you can find a better name for this test? It does not seem to use cftime.
@@ -24,6 +24,7 @@
from .coordinates import (
    DatasetCoordinates, LevelCoordinatesSource,
    assert_coordinate_consistent, remap_label_indexers)
from .duck_array_ops import datetime_to_numeric
Within differentiate, we use datetime_to_numeric on Variable objects; I think you should switch to the new Variable-specific method there.
xarray/core/duck_array_ops.py
array = array - offset

if datetime_unit:
    array = array / np.timedelta64(1, datetime_unit)
I think we need to move this to after we potentially convert an array of datetime.timedelta objects to np.timedelta64 (otherwise this will raise an error).
To explain this a little more -- for cftime dates, array - offset (a few lines above) will return an array of datetime.timedelta objects. When datetime_to_numeric was used with xarray.Variable objects, we did not have to worry about converting to np.timedelta64, because that would happen automatically. In the case of pure NumPy or dask arrays, we do.
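The ordering described here can be sketched roughly as follows. This is an illustrative outline rather than the code merged in this PR: the name `datetime_to_numeric_sketch` is hypothetical, standard-library `datetime.datetime` objects stand in for cftime dates (subtracting them also yields `datetime.timedelta` objects), and the object array is cast with NumPy's `astype` at an assumed microsecond resolution instead of going through pandas.

```python
import datetime

import numpy as np


def datetime_to_numeric_sketch(array, offset=None, datetime_unit=None,
                               dtype=float):
    """Hypothetical outline of the conversion order discussed above."""
    if offset is None:
        offset = array.min()
    array = array - offset

    # For object arrays, the subtraction yields datetime.timedelta
    # objects, so convert them to timedelta64 *before* dividing.
    if array.dtype.kind == "O":
        array = array.astype("timedelta64[us]")

    if datetime_unit:
        array = array / np.timedelta64(1, datetime_unit)

    return array.astype(dtype)


# Object array of datetimes (stand-in for cftime dates).
object_times = np.array(
    [datetime.datetime(2000, 1, d) for d in (1, 2, 3)], dtype=object
)
object_result = datetime_to_numeric_sketch(object_times, datetime_unit="D")

# Native datetime64 array for comparison.
np_times = np.array(["2000-01-01", "2000-01-03"], dtype="datetime64[ns]")
np_result = datetime_to_numeric_sketch(np_times, datetime_unit="D")
```

Both inputs go through the same unit division; only the object-dtype path needs the extra conversion step.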
Thanks, @spencerkclark. It looks like I need more time for this (probably a week).
Thanks @fujiisoup, no worries. There was only one place where I saw a potential issue for cftime dates, which I noted above. I made a PR to your branch with the fix (fujiisoup#9); let me know if there were other places with issues and I can sort those out too.
Some minor changes to try and help with cftime dates
Thanks, @spencerkclark, for the PR. I will finish it up tonight (hopefully).
This is looking pretty good @fujiisoup.
In #2667 you mentioned:
Then, it would be probably nice if datetime_to_numeric only accepts np.ndarray and da.array not Variable or DataArray, and move this function to duck_array_ops.
By da.array I'm assuming you mean dask arrays? Should we strive to have compatibility in datetime_to_numeric there? Right now for dask arrays with dtype datetime64[ns] it eagerly loads the data, while for cftime dask arrays there is an error. I think both issues would be reasonably straightforward to fix, but do you think it's worth the effort?
Again I'm happy to help out, particularly on the cftime side.
xarray/tests/test_interp.py
@@ -573,3 +573,17 @@ def test_cftime_to_non_cftime_error():

    with pytest.raises(TypeError):
        da.interp(time=0.5)


@requires_cftime
Sorry, please remove this decorator too.
# Conflicts: # xarray/core/dataset.py # xarray/core/missing.py
Sorry for leaving this for so long.
As I'm not a heavy user of |
No worries @fujiisoup, this is totally fine with me. I think the most common use case for datetime arrays is in indexes, which currently need to be stored in memory anyway, so I don't think the dask use case would come up very often.
Thanks @spencerkclark for reviewing. Merged.
Thanks for wrapping this up @fujiisoup!
* master:
  typo in whats_new (pydata#2763)
  Update computation.py to use Python 3 function signatures (pydata#2756)
  add h5netcdf+dask tests (pydata#2737)
  Fix name loss when masking (pydata#2749)
  fix datetime_to_numeric and Variable._to_numeric (pydata#2668)
  Fix mypy errors (pydata#2753)
  enable internal plotting with cftime datetime (pydata#2665)
  remove references to cyordereddict (pydata#2750)
  BUG: Pass kwargs to the FileManager for pynio engine (pydata#2380) (pydata#2732)
  reintroduce pynio/rasterio/iris to py36 test env (pydata#2738)
  Fix CRS being WKT instead of PROJ.4 (pydata#2715)
  Refactor (part of) dataset.py to use explicit indexes (pydata#2696)
whats-new.rst for all changes and api.rst for new API
Started fixing #2667