-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quarter offset implemented (base is now latest pydata-master). #2721
Conversation
Perfect, @jwenfai, thanks -- I have a busy start to this week, but hopefully I'll get some time to review this by next weekend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good to me @jwenfai. I only have a few mostly minor comments/suggestions. Please go ahead and add a what's new entry.
Would you mind also adding a single test case to the test for cftime_range
with a quarterly offset? I'm pretty sure things should work fine based on your thorough testing of the offset operations, but it would be good to have at least one test verifying that they work in cftime_range
, since that is essentially the only place they are used in practice in xarray.
xarray/coding/cftime_offsets.py
Outdated
+----------+--------------------------------------------------------------------+ | ||
| Q(S)-FEB | Quarter frequency, anchored at the end (or beginning) of February | | ||
+----------+--------------------------------------------------------------------+ | ||
| Q(S)-MAR | Quarter frequency, anchored at the end (or beginning) of January | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and below, fix the full month names at the end of each entry (e.g. this should be March
instead of January
).
xarray/coding/cftime_offsets.py
Outdated
|
||
|
||
def _get_day_of_month(other, day_option): | ||
"""Find the day in `other`'s month that satisfies a DateOffset's onOffset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: update this docstring to use the appropriate names for cftime offsets (e.g. DateOffset
->BaseCFTimeOffset
, day_opt
->day_option
).
xarray/coding/cftime_offsets.py
Outdated
self.month = self._default_month | ||
else: | ||
self.month = month | ||
if not isinstance(self.month, int): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe create a helper function for this checking logic (e.g. def _validate_month(month)
), since the same is used in YearOffset
below?
xarray/coding/cftime_offsets.py
Outdated
elif day_option == 'end': | ||
days_in_month = _days_in_month(other) | ||
return days_in_month | ||
elif isinstance(day_option, np.integer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is not currently used, so let's refrain from adding it for now.
xarray/coding/cftime_offsets.py
Outdated
other : cftime.datetime | ||
n : number of periods to increment, before adjusting for rolling | ||
month : int reference month giving the first month of the year | ||
day_option : 'start', 'end', 'business_start', 'business_end', or int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the options that are currently not supported (e.g .'business_start'
etc.).
xarray/coding/cftime_offsets.py
Outdated
|
||
See Also | ||
-------- | ||
get_day_of_month : Find the day in a month provided an offset. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: in your definition this function has a leading underscore, i.e. _get_day_of_month
.
xarray/tests/test_cftime_offsets.py
Outdated
@@ -562,19 +707,22 @@ def test_onOffset(calendar, date_args, offset, expected): | |||
|
|||
|
|||
@pytest.mark.parametrize( | |||
('year_month_args', 'sub_day_args', 'offset'), | |||
('year_quarter_month_args', 'sub_day_args', 'offset'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this argument name was meant to denote that it is a length-two tuple with the "year and month" of the date tested, so there is no need to change the name.
xarray/coding/cftime_offsets.py
Outdated
month = 2 corresponds to dates like 2/28/2007, 5/31/2007, ... | ||
month = 3 corresponds to dates like 3/31/2007, 6/30/2007, ... | ||
""" | ||
# In pandas, _from_name_startingMonth = 1 used when freq='QS' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...this is indeed confusing behavior in pandas. Reluctantly I'm inclined to match it for consistency, so maybe let's go with _default_month = 3
here? In the _FREQUENCIES
dictionary you can then just use partial(QuarterBegin, month=1)
for the 'QS'
entry, so that it behaves properly in to_offset
.
xarray/coding/cftime_offsets.py
Outdated
""" | ||
# In pandas, QuarterOffset._from_name suffix == 'DEC' | ||
# See _lite_rule_alias in pandas._libs.tslibs.frequencies | ||
_default_month = 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my comment in QuarterBegin
above, I'm inclined to match pandas' behavior here for consistency, so maybe let's go with _default_month = 3
here. In the _FREQUENCIES
dictionary you can then just use partial(QuarterEnd, month=12)
for the 'Q'
entry, so that it behaves properly in to_offset
.
Fixes implemented and tests are all passing. I had to modify the tests in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Co-Authored-By: jwenfai <jwenfai@gmail.com>
Sure, I'll add resample support back in once #2593 is merged. |
…ss redundancy in CFTimeIndex resampling tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding back in the resample support -- just a few more minor comments.
doc/whats-new.rst
Outdated
@@ -47,6 +47,8 @@ Enhancements | |||
report showing what exactly differs between the two objects (dimensions / | |||
coordinates / variables / attributes) (:issue:`1507`). | |||
By `Benoit Bovy <https://github.com/benbovy>`_. | |||
- CFTimeIndex now supports QuarterBegin and QuarterEnd offsets. (:issue:`2663`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this duplicate entry.
xarray/coding/cftime_offsets.py
Outdated
|
||
def __init__(self, n=1, normalize=False, month=None): | ||
BaseCFTimeOffset.__init__(self, n) | ||
self.normalize = normalize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intentionally put off adding support for the normalize
option in the other offsets, because it was not necessary for cftime_range
(and by extension resample
). Even for these quarterly offsets, you would need to do some more work to fully support/test it (e.g. you would need some modifications to __apply__
as well -- note the use of the @apply_wraps
decorator in pandas). Let's maybe hold off on adding it until there is a need.
xarray/coding/cftime_offsets.py
Outdated
return _get_day_of_month(other, self._day_option) | ||
|
||
|
||
def _is_normalized(datetime): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment related to the normalize
option; as a result I think we can hold off on adding this function for now.
…ffsets.py. Removed redundant lines in whats-new.rst.
I didn't realize support for normalization was more involved; normalization-related code should all be removed now. Duplicate entry on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last couple small things; otherwise this looks good to me.
xarray/coding/cftime_offsets.py
Outdated
day_option : 'start', 'end' | ||
'start': returns 1 | ||
'end': returns last day of the month | ||
int: returns the day in the month indicated by `other`, or the last of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've since removed this option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
'8D', '8001D', | ||
'2MS', '3MS', | ||
'2QS-AUG', '3QS-SEP', | ||
'3AS-MAR', '4A-MAY']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems in refactoring these tests (which in general I'm fine with), we lost test cases where pandas resampling would fail. Do you know of one you could add back in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added them back in. For reference, pandas resampling is failing when resampling to small daily frequencies (probably sub-monthly) such as 3D or 8D and the value for base is 24. The frequency of the original time range does not matter.
…back in that raises ValueError when resampling (base=24 when resampling to daily freq, e.g., '8D').
Hello @jwenfai! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on February 26, 2019 at 00:48 Hours UTC |
@jwenfai I made a couple minor edits to the docstrings/comments to bring them up to date (let me know if those look ok). If there are no more comments from others in the next few days, I'll go ahead and merge this. |
@spencerkclark Thanks for improving the docstring and cleaning up the extra whitespaces. They look ok to me. |
Going to go ahead and merge this now; thanks @jwenfai! |
* upstream/master: Rework whats-new for 0.12 Add whats-new for 0.12.1 Release 0.12.0 enable loading remote hdf5 files (pydata#2782) Push back finalizing deprecations for 0.12 (pydata#2809) Drop failing tests writing multi-dimensional arrays as attributes (pydata#2810) some docs updates (pydata#2746) Add support for cftime.datetime coordinates with coarsen (pydata#2778) Don't use deprecated np.asscalar() (pydata#2800) Improve name concat (pydata#2792) Add `Dataset.drop_dims` (pydata#2767) Quarter offset implemented (base is now latest pydata-master). (pydata#2721) Add use_cftime option to open_dataset (pydata#2759) Bugfix/reduce no axis (pydata#2769) 'standard' now refers to 'gregorian' in cftime_range (pydata#2771)
…a#2721) * Quarter offset implemented (base is now latest pydata-master). * Fixed issues raised in review (pydata#2721 (review)) * Updated whats-new.rst with info on quarter offset support. * Updated whats-new.rst with info on quarter offset support. * Update doc/whats-new.rst Co-Authored-By: jwenfai <jwenfai@gmail.com> * Added support for quarter frequencies when resampling CFTimeIndex. Less redundancy in CFTimeIndex resampling tests. * Removed normalization code (unnecessary for cftime_range) in cftime_offsets.py. Removed redundant lines in whats-new.rst. * Removed invalid option from _get_day_of_month docstring. Added tests back in that raises ValueError when resampling (base=24 when resampling to daily freq, e.g., '8D'). * Minor edits to docstrings/comments * lint
…a#2721) * Quarter offset implemented (base is now latest pydata-master). * Fixed issues raised in review (pydata#2721 (review)) * Updated whats-new.rst with info on quarter offset support. * Updated whats-new.rst with info on quarter offset support. * Update doc/whats-new.rst Co-Authored-By: jwenfai <jwenfai@gmail.com> * Added support for quarter frequencies when resampling CFTimeIndex. Less redundancy in CFTimeIndex resampling tests. * Removed normalization code (unnecessary for cftime_range) in cftime_offsets.py. Removed redundant lines in whats-new.rst. * Removed invalid option from _get_day_of_month docstring. Added tests back in that raises ValueError when resampling (base=24 when resampling to daily freq, e.g., '8D'). * Minor edits to docstrings/comments * lint
…ns with size>1 (#2757) * Quarter offset implemented (base is now latest pydata-master). (#2721) * Quarter offset implemented (base is now latest pydata-master). * Fixed issues raised in review (#2721 (review)) * Updated whats-new.rst with info on quarter offset support. * Updated whats-new.rst with info on quarter offset support. * Update doc/whats-new.rst Co-Authored-By: jwenfai <jwenfai@gmail.com> * Added support for quarter frequencies when resampling CFTimeIndex. Less redundancy in CFTimeIndex resampling tests. * Removed normalization code (unnecessary for cftime_range) in cftime_offsets.py. Removed redundant lines in whats-new.rst. * Removed invalid option from _get_day_of_month docstring. Added tests back in that raises ValueError when resampling (base=24 when resampling to daily freq, e.g., '8D'). * Minor edits to docstrings/comments * lint * Add `Dataset.drop_dims` (#2767) * ENH: Add Dataset.drop_dims() * Drops full dimensions and any corresponding variables in a Dataset * Fixes GH1949 * DOC: Add Dataset.drop_dims() documentation * Improve name concat (#2792) * Added tests of desired name inferring behaviour * Infers names * updated what's new * Don't use deprecated np.asscalar() (#2800) It got deprecated in numpy 1.16 and throws a ton of warnings due to that. All the function does is returning .item() anyway, which is why it got deprecated. * Add support for cftime.datetime coordinates with coarsen (#2778) * some docs updates (#2746) * Friendlier io title. * Fix lists. * Fix *args, **kwargs "inline emphasis..." * misc * Reference xarray_extras for csv writing. Closes #2289 * Add metpy accessor. Closes #461 * fix transpose docstring. Closes #2576 * Revert "Fix lists." This reverts commit 39983a5. * Revert "Fix *args, **kwargs" This reverts commit 1b9da35. * Add MetPy to related projects. * Add Weather and Climate specific page. * Add hvplot. * Note open_dataset, mfdataset open files as read-only (closes #2345). * Update metpy 1 Co-Authored-By: dcherian <dcherian@users.noreply.github.com> * Update doc/weather-climate.rst Co-Authored-By: dcherian <dcherian@users.noreply.github.com> * Drop failing tests writing multi-dimensional arrays as attributes (#2810) These aren't valid for netCDF files. Fixes GH2803 * Push back finalizing deprecations for 0.12 (#2809) 0.12 will already have a big change in dropping Python 2.7 support. I'd rather wait a bit longer to finalize these deprecations to minimize the impact on users. * enable loading remote hdf5 files (#2782) * attempt at loading remote hdf5 * added a couple tests * rewind bytes after reading header * addressed comments for tests and error message * fixed pep8 formatting * created _get_engine_from_magic_number function, new tests * added description in whats-new * fixed test failure on windows * same error on windows and nix * Release 0.12.0 * Add whats-new for 0.12.1 * Rework whats-new for 0.12 * DOC: Update donation links * DOC: remove outdated warning (#2818) * Allow expand_dims() method to support inserting/broadcasting dimensions with size>1 (#2757) * Make using dim_kwargs for python 3.5 illegal -- a ValueError is thrown * dataset.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * dataarray.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * Add alternative option to passing a dict to the dim argument, which is now an optional kwarg, passing in each new dimension as its own kwarg * Add expand_dims enhancement from issue 2710 to whats-new.rst * Fix test_dataarray.TestDataArray.test_expand_dims_with_greater_dim_size tests to pass in python 3.5 using ordered dicts instead of regular dicts. This was needed because python 3.5 and earlier did not maintain insertion order for dicts * Restrict core logic to use 'dim' as a dict--it will be converted into a dict on entry if it is a str or a sequence of str * Don't cast dim values (coords) as a list since IndexVariable/Variable will internally convert it into a numpy.ndarray. So just use IndexVariable((k,), v) * TypeErrors should be raised for invalid input types, rather than ValueErrors. * Force 'dim' to be OrderedDict for python 3.5 * Allow expand_dims() method to support inserting/broadcasting dimensions with size>1 (#2757) * use .size attribute to determine the size of a dimension, rather than converting to a list, which can be slow for large iterables * Make using dim_kwargs for python 3.5 illegal -- a ValueError is thrown * dataset.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * dataarray.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * Add alternative option to passing a dict to the dim argument, which is now an optional kwarg, passing in each new dimension as its own kwarg * Add expand_dims enhancement from issue 2710 to whats-new.rst * Fix test_dataarray.TestDataArray.test_expand_dims_with_greater_dim_size tests to pass in python 3.5 using ordered dicts instead of regular dicts. This was needed because python 3.5 and earlier did not maintain insertion order for dicts * Restrict core logic to use 'dim' as a dict--it will be converted into a dict on entry if it is a str or a sequence of str * Don't cast dim values (coords) as a list since IndexVariable/Variable will internally convert it into a numpy.ndarray. So just use IndexVariable((k,), v) * TypeErrors should be raised for invalid input types, rather than ValueErrors. * Force 'dim' to be OrderedDict for python 3.5 * Allow expand_dims() method to support inserting/broadcasting dimensions with size>1 (#2757) * Move enhancement description up to 0.12.1 * use .size attribute to determine the size of a dimension, rather than converting to a list, which can be slow for large iterables * Make using dim_kwargs for python 3.5 illegal -- a ValueError is thrown * dataset.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * dataarray.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * Add alternative option to passing a dict to the dim argument, which is now an optional kwarg, passing in each new dimension as its own kwarg * Add expand_dims enhancement from issue 2710 to whats-new.rst * Fix test_dataarray.TestDataArray.test_expand_dims_with_greater_dim_size tests to pass in python 3.5 using ordered dicts instead of regular dicts. This was needed because python 3.5 and earlier did not maintain insertion order for dicts * Restrict core logic to use 'dim' as a dict--it will be converted into a dict on entry if it is a str or a sequence of str * Don't cast dim values (coords) as a list since IndexVariable/Variable will internally convert it into a numpy.ndarray. So just use IndexVariable((k,), v) * TypeErrors should be raised for invalid input types, rather than ValueErrors. * Force 'dim' to be OrderedDict for python 3.5
whats-new.rst
for all changes andapi.rst
for new APIWaiting for reviews before editing
whats-new.rst
.