Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve base and loffset arguments in resample #7444

Merged
merged 18 commits into from
Mar 8, 2023

Conversation

spencerkclark
Copy link
Member

@spencerkclark spencerkclark commented Jan 16, 2023

While pandas is getting set to remove the base and loffset arguments in resample, we have not had a chance to emit a deprecation warning for them yet in xarray (#7420). This PR preserves their functionality in xarray and should hopefully give users some extra time to adapt. Deprecation warnings for each are added so that we can eventually remove them.

I've taken the liberty to define a TimeResampleGrouper object, since we need some way to carry the loffset argument through the resample chain, even though it will no longer be allowed on the pd.Grouper object. Currently it is not particularly complicated, so hopefully it would be straightforward to adapt to what is envisioned in #6610 (comment).

While pandas is getting set to remove these, we have not had a
chance to emit a deprecation warning yet for them in xarray.
This should hopefully give users some extra time to adapt.
@spencerkclark spencerkclark changed the title WIP: preserve base and loffset arguments in resample` WIP: preserve base and loffset arguments in resample Jan 16, 2023
Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the checks on base are backwards? It's raising a warning when base is None which makes the doctests fail.

xarray/core/common.py Outdated Show resolved Hide resolved
xarray/core/common.py Show resolved Hide resolved
@keewis
Copy link
Collaborator

keewis commented Mar 7, 2023

both the failing upstream-dev CI and the failing docs CI seem to be unrelated to this. The TimeResampleGrouper seems to do something different from the one in #7561, but I think that's up to that PR to figure out (since we don't yet expose it as public API).

So considering that, how much do you think is needed until we can merge this? And can I help with anything?

@dcherian
Copy link
Contributor

dcherian commented Mar 7, 2023

TimeResampleGrouper seems to do something different from the one in #7561,

The version in that PR is just a container for the factorize method. I'm waiting for this to get merged before continuing

@dcherian
Copy link
Contributor

dcherian commented Mar 7, 2023

Shall we merge?

AFAICT all groupby and resample tests now pass on upstream-dev:

=========================== short test summary info ============================
FAILED xarray/tests/test_backends.py::TestNetCDF4Data::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestNetCDF4ViaDaskData::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestZarrDictStore::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestZarrDirectoryStore::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestZarrKVStoreV3::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestZarrDirectoryStoreV3::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestZarrDirectoryStoreV3::test_write_read_select_write - KeyError: 'var1'
FAILED xarray/tests/test_backends.py::TestZarrDirectoryStoreV3FromPath::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestZarrDirectoryStoreV3FromPath::test_write_read_select_write - KeyError: 'var1'
FAILED xarray/tests/test_backends.py::TestScipyInMemoryData::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestScipyFileObject::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestScipyFilePath::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestNetCDF3ViaNetCDF4Data::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestNetCDF4ClassicViaNetCDF4Data::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestGenericNetCDFData::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestH5NetCDFData::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestH5NetCDFFileObject::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_backends.py::TestH5NetCDFViaDaskData::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[360_day] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[365_day] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[366_day] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[all_leap] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[gregorian] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[julian] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[noleap] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[proleptic_gregorian] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_cftimeindex.py::test_to_datetimeindex_out_of_range[standard] - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_coding_times.py::test_should_cftime_be_used_source_outside_range - Failed: DID NOT RAISE <class 'ValueError'>
FAILED xarray/tests/test_conventions.py::TestCFEncodedDataStore::test_roundtrip_cftime_datetime_data - AssertionError: assert 'days since 1-01-01' == 'days since 0001-01-01'
  - days since 0001-01-01
  ?            ---
  + days since 1-01-01
FAILED xarray/tests/test_dataarray.py::TestDataArray::test_sel_float - NotImplementedError: float16 indexes are not supported
FAILED xarray/tests/test_dataarray.py::TestDataArray::test_to_and_from_cdms2_sgrid - ValueError: operands could not be broadcast together with shapes (3,) (4,)
FAILED xarray/tests/test_units.py::TestVariable::test_aggregation[int64-method_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestVariable::test_aggregation[int64-method_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[float64-function_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[float64-function_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[int64-function_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[int64-function_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[int64-method_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[int64-method_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataArray::test_unary_operations[float64-round] - TypeError: no implementation found for 'numpy.round' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataArray::test_unary_operations[int64-round] - TypeError: no implementation found for 'numpy.round' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataset::test_aggregation[int64-method_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
FAILED xarray/tests/test_units.py::TestDataset::test_aggregation[int64-method_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
= 43 failed, 15089 passed, 1223 skipped, 168 xfailed, 74 xpassed, 473 warnings in 671.46s (0:11:11) =

@spencerkclark spencerkclark marked this pull request as ready for review March 8, 2023 02:29
@spencerkclark spencerkclark changed the title WIP: preserve base and loffset arguments in resample Preserve base and loffset arguments in resample Mar 8, 2023
@spencerkclark
Copy link
Member Author

@keewis and @dcherian indeed I just wanted to give it another once over. I think it should be ready for review now. Thanks for your patience on this!

@keewis
Copy link
Collaborator

keewis commented Mar 8, 2023

the docs failure is real, I think we need to update

da.resample(time="81T", closed="right", label="right", base=3).mean()
to use the new parameter (not sure how exactly, though)

@spencerkclark
Copy link
Member Author

Thanks @keewis -- I pushed a fix. The documentation build succeeded. The other build failures look unrelated.

@keewis
Copy link
Collaborator

keewis commented Mar 8, 2023

I agree, those do indeed seem unrelated, and I have no idea why the python 3.11 macos CI takes that long to run.

deprecated and will be removed in a future version of xarray. Using the
``origin`` or ``offset`` parameters is recommended as a replacement for using
the ``base`` parameter and using time offset arithmetic is recommended as a
replacement for using the ``loffset`` parameter (:pull:`8459`). By `Spencer
Copy link
Contributor

@dcherian dcherian Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to suggest adding an example of time offset arithmetic here. But then thought we could just link to docs. But turns out we have no docs on this AFAICT! I'll open an issue (see #7596)

@dcherian dcherian merged commit 6d771fc into pydata:main Mar 8, 2023
@spencerkclark spencerkclark deleted the deprecate-base branch March 8, 2023 16:57
dcherian added a commit to dcherian/xarray that referenced this pull request Mar 9, 2023
* main:
  Preserve `base` and `loffset` arguments in `resample` (pydata#7444)
  ignore the `pkg_resources` deprecation warning (pydata#7594)
  Update contains_cftime_datetimes to avoid loading entire variable array (pydata#7494)
  Support first, last with dask arrays (pydata#7562)
  update the docs environment (pydata#7442)
  Add xCDAT to list of Xarray related projects (pydata#7579)
  [pre-commit.ci] pre-commit autoupdate (pydata#7565)
  fix nczarr when libnetcdf>4.8.1 (pydata#7575)
  use numpys SupportsDtype (pydata#7521)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

⚠️ Nightly upstream-dev CI failed ⚠️: pandas removed deprecated keyword arguments
3 participants