Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataset.resample fails with certain time offset strings provided to the loffset parameter #8399

Closed
5 tasks done
kafitzgerald opened this issue Nov 1, 2023 · 3 comments · Fixed by #8422
Closed
5 tasks done

Comments

@kafitzgerald
Copy link
Contributor

What happened?

resample fails with offset aliases provided to the loffset argument that do not result in unambiguous timedelta values following #7206. We're running into this over on NCAR/geocat-comp.

I realize the loffset argument is slated to be deprecated anyway, but wanted to at least document this for others who might run into it. Especially since #7596 is still open and the time offset arithmetic gets a little tricky with cftime.

What did you expect to happen?

The operation to complete without error.

Minimal Complete Verifiable Example

import xarray as xr
import numpy as np

dates = xr.cftime_range(start="0001", periods=24, freq="MS", calendar="noleap")
da = xr.DataArray(np.arange(24), coords=[dates], dims=["time"], name="foo")
dar = da.resample({'time':'QS-DEC'},loffset='MS').mean()

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

<stdin>:1: FutureWarning: Following pandas, the `loffset` parameter to resample will be deprecated in a future version of xarray.  Switch to using time offset arithmetic.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/dataarray.py", line 7087, in resample
    return self._resample(
           ^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/common.py", line 1055, in _resample
    return resample_cls(
           ^^^^^^^^^^^^^
  File "/Users//miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/resample.py", line 49, in __init__
    super().__init__(*args, **kwargs)
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 729, in __init__
    grouper_.factorize(squeeze)
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 377, in factorize
    ) = self._factorize(squeeze)
        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 546, in _factorize
    full_index, first_items, codes_ = self._get_index_and_items()
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 519, in _get_index_and_items
    first_items, codes = self.first_items()
                         ^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/groupby.py", line 531, in first_items
    return self.index_grouper.first_items(self.group_as_index)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/xarray/core/resample_cftime.py", line 155, in first_items
    labels = labels + pd.to_timedelta(self.loffset)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/pandas/core/tools/timedeltas.py", line 223, in to_timedelta
    return _coerce_scalar_to_timedelta_type(arg, unit=unit, errors=errors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/username/miniconda3/envs/pandas/lib/python3.12/site-packages/pandas/core/tools/timedeltas.py", line 233, in _coerce_scalar_to_timedelta_type
    result = Timedelta(r, unit)
             ^^^^^^^^^^^^^^^^^^
  File "timedeltas.pyx", line 1820, in pandas._libs.tslibs.timedeltas.Timedelta.__new__
  File "timedeltas.pyx", line 653, in pandas._libs.tslibs.timedeltas.parse_timedelta_string
ValueError: unit abbreviation w/o a number

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.0 | packaged by conda-forge | (main, Oct 3 2023, 08:36:57) [Clang 15.0.7 ]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.10.1
pandas: 2.1.2
numpy: 1.26.0
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.8.0
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None

@kafitzgerald kafitzgerald added bug needs triage Issue that has not been reviewed by xarray team member labels Nov 1, 2023
Copy link

welcome bot commented Nov 1, 2023

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

@spencerkclark
Copy link
Member

spencerkclark commented Nov 2, 2023

Thanks for the report @kafitzgerald — indeed we should see about getting this to work despite the deprecation. It may just be a matter of inserting the appropriate logic around this line:

labels = labels + pd.to_timedelta(self.loffset)

In other words, in the case that self.loffset is a string we could use xarray.coding.cftime_offsets.to_offset instead of pd.to_timedelta. I think we'd be happy to take a PR if you're up for it.

@spencerkclark spencerkclark added topic-cftime and removed needs triage Issue that has not been reviewed by xarray team member labels Nov 2, 2023
@kafitzgerald
Copy link
Contributor Author

Thanks, I'll take a look and see if I can submit a quick PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants