Quarter offset implemented (base is now latest pydata-master). #2721

jwenfai · 2019-01-27T22:08:32Z

Closes Implement quarterly frequencies for cftime #2663
Tests added
Fully documented, including whats-new.rst for all changes and api.rst for new API

Waiting for reviews before editing whats-new.rst.

spencerkclark · 2019-01-27T22:10:06Z

Perfect, @jwenfai, thanks -- I have a busy start to this week, but hopefully I'll get some time to review this by next weekend.

spencerkclark

This looks really good to me @jwenfai. I only have a few mostly minor comments/suggestions. Please go ahead and add a what's new entry.

Would you mind also adding a single test case to the test for cftime_range with a quarterly offset? I'm pretty sure things should work fine based on your thorough testing of the offset operations, but it would be good to have at least one test verifying that they work in cftime_range, since that is essentially the only place they are used in practice in xarray.

spencerkclark · 2019-02-02T12:43:57Z

xarray/coding/cftime_offsets.py

+    +----------+--------------------------------------------------------------------+
+    | Q(S)-FEB | Quarter frequency, anchored at the end (or beginning) of February  |
+    +----------+--------------------------------------------------------------------+
+    | Q(S)-MAR | Quarter frequency, anchored at the end (or beginning) of January   |


Here and below, fix the full month names at the end of each entry (e.g. this should be March instead of January).

spencerkclark · 2019-02-02T12:52:27Z

xarray/coding/cftime_offsets.py

+
+
+def _get_day_of_month(other, day_option):
+    """Find the day in `other`'s month that satisfies a DateOffset's onOffset


nit: update this docstring to use the appropriate names for cftime offsets (e.g. DateOffset->BaseCFTimeOffset, day_opt->day_option).

spencerkclark · 2019-02-02T13:14:32Z

xarray/coding/cftime_offsets.py

+            self.month = self._default_month
+        else:
+            self.month = month
+        if not isinstance(self.month, int):


Maybe create a helper function for this checking logic (e.g. def _validate_month(month)), since the same is used in YearOffset below?

spencerkclark · 2019-02-02T13:34:54Z

xarray/coding/cftime_offsets.py

+    elif day_option == 'end':
+        days_in_month = _days_in_month(other)
+        return days_in_month
+    elif isinstance(day_option, np.integer):


This case is not currently used, so let's refrain from adding it for now.

spencerkclark · 2019-02-02T13:42:20Z

xarray/coding/cftime_offsets.py

+    other : cftime.datetime
+    n : number of periods to increment, before adjusting for rolling
+    month : int reference month giving the first month of the year
+    day_option : 'start', 'end', 'business_start', 'business_end', or int


Remove the options that are currently not supported (e.g .'business_start' etc.).

spencerkclark · 2019-02-02T13:43:50Z

xarray/coding/cftime_offsets.py

+
+    See Also
+    --------
+    get_day_of_month : Find the day in a month provided an offset.


nit: in your definition this function has a leading underscore, i.e. _get_day_of_month.

spencerkclark · 2019-02-02T13:56:35Z

xarray/tests/test_cftime_offsets.py

@@ -562,19 +707,22 @@ def test_onOffset(calendar, date_args, offset, expected):


 @pytest.mark.parametrize(
-    ('year_month_args', 'sub_day_args', 'offset'),
+    ('year_quarter_month_args', 'sub_day_args', 'offset'),


nit: this argument name was meant to denote that it is a length-two tuple with the "year and month" of the date tested, so there is no need to change the name.

spencerkclark · 2019-02-02T14:28:26Z

xarray/coding/cftime_offsets.py

+    month = 2 corresponds to dates like 2/28/2007, 5/31/2007, ...
+    month = 3 corresponds to dates like 3/31/2007, 6/30/2007, ...
+    """
+    # In pandas, _from_name_startingMonth = 1 used when freq='QS'


Hmm...this is indeed confusing behavior in pandas. Reluctantly I'm inclined to match it for consistency, so maybe let's go with _default_month = 3 here? In the _FREQUENCIES dictionary you can then just use partial(QuarterBegin, month=1) for the 'QS' entry, so that it behaves properly in to_offset.

spencerkclark · 2019-02-02T14:31:07Z

xarray/coding/cftime_offsets.py

+    """
+    # In pandas, QuarterOffset._from_name suffix == 'DEC'
+    # See _lite_rule_alias in pandas._libs.tslibs.frequencies
+    _default_month = 12


Similar to my comment in QuarterBegin above, I'm inclined to match pandas' behavior here for consistency, so maybe let's go with _default_month = 3 here. In the _FREQUENCIES dictionary you can then just use partial(QuarterEnd, month=12) for the 'Q' entry, so that it behaves properly in to_offset.

jwenfai · 2019-02-02T21:09:25Z

Fixes implemented and tests are all passing. I had to modify the tests in test_cftime_offsets.py now that _default_month = 3 for QuarterBegin and QuarterEnd.

spencerkclark

Thanks @jwenfai. Once #2593 is merged, would you mind adding back in your changes to fully support resample for these offsets?

doc/whats-new.rst

Co-Authored-By: jwenfai <jwenfai@gmail.com>

jwenfai · 2019-02-02T23:46:43Z

Sure, I'll add resample support back in once #2593 is merged.

…ss redundancy in CFTimeIndex resampling tests.

…o quarter-offset

spencerkclark

Thanks for adding back in the resample support -- just a few more minor comments.

spencerkclark · 2019-02-03T12:17:49Z

doc/whats-new.rst

@@ -47,6 +47,8 @@ Enhancements
  report showing what exactly differs between the two objects (dimensions /
  coordinates / variables / attributes)  (:issue:`1507`).
  By `Benoit Bovy <https://github.com/benbovy>`_.
+- CFTimeIndex now supports QuarterBegin and QuarterEnd offsets. (:issue:`2663`)


Delete this duplicate entry.

spencerkclark · 2019-02-03T12:40:48Z

xarray/coding/cftime_offsets.py

+
+    def __init__(self, n=1, normalize=False, month=None):
+        BaseCFTimeOffset.__init__(self, n)
+        self.normalize = normalize


I intentionally put off adding support for the normalize option in the other offsets, because it was not necessary for cftime_range (and by extension resample). Even for these quarterly offsets, you would need to do some more work to fully support/test it (e.g. you would need some modifications to __apply__ as well -- note the use of the @apply_wraps decorator in pandas). Let's maybe hold off on adding it until there is a need.

spencerkclark · 2019-02-03T12:54:21Z

xarray/coding/cftime_offsets.py

+        return _get_day_of_month(other, self._day_option)
+
+
+def _is_normalized(datetime):


See my comment related to the normalize option; as a result I think we can hold off on adding this function for now.

…ffsets.py. Removed redundant lines in whats-new.rst.

jwenfai · 2019-02-03T19:01:04Z

I didn't realize support for normalization was more involved; normalization-related code should all be removed now. Duplicate entry on whats-new.rst is gone as well.

spencerkclark

Last couple small things; otherwise this looks good to me.

spencerkclark · 2019-02-04T00:18:35Z

xarray/coding/cftime_offsets.py

+    day_option : 'start', 'end'
+        'start': returns 1
+        'end': returns last day of the month
+        int: returns the day in the month indicated by `other`, or the last of


We've since removed this option.

Good catch.

spencerkclark · 2019-02-04T00:22:31Z

xarray/tests/test_cftimeindex_resample.py

+    '8D', '8001D',
+    '2MS', '3MS',
+    '2QS-AUG', '3QS-SEP',
+    '3AS-MAR', '4A-MAY'])


It seems in refactoring these tests (which in general I'm fine with), we lost test cases where pandas resampling would fail. Do you know of one you could add back in?

Added them back in. For reference, pandas resampling is failing when resampling to small daily frequencies (probably sub-monthly) such as 3D or 8D and the value for base is 24. The frequency of the original time range does not matter.

…back in that raises ValueError when resampling (base=24 when resampling to daily freq, e.g., '8D').

pep8speaks · 2019-02-26T00:47:05Z

Hello @jwenfai! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 26, 2019 at 00:48 Hours UTC

spencerkclark · 2019-02-26T00:53:37Z

@jwenfai I made a couple minor edits to the docstrings/comments to bring them up to date (let me know if those look ok). If there are no more comments from others in the next few days, I'll go ahead and merge this.

jwenfai · 2019-02-26T00:58:25Z

@spencerkclark Thanks for improving the docstring and cleaning up the extra whitespaces. They look ok to me.

spencerkclark · 2019-03-02T01:41:11Z

Going to go ahead and merge this now; thanks @jwenfai!

* upstream/master: Rework whats-new for 0.12 Add whats-new for 0.12.1 Release 0.12.0 enable loading remote hdf5 files (pydata#2782) Push back finalizing deprecations for 0.12 (pydata#2809) Drop failing tests writing multi-dimensional arrays as attributes (pydata#2810) some docs updates (pydata#2746) Add support for cftime.datetime coordinates with coarsen (pydata#2778) Don't use deprecated np.asscalar() (pydata#2800) Improve name concat (pydata#2792) Add `Dataset.drop_dims` (pydata#2767) Quarter offset implemented (base is now latest pydata-master). (pydata#2721) Add use_cftime option to open_dataset (pydata#2759) Bugfix/reduce no axis (pydata#2769) 'standard' now refers to 'gregorian' in cftime_range (pydata#2771)

…a#2721) * Quarter offset implemented (base is now latest pydata-master). * Fixed issues raised in review (pydata#2721 (review)) * Updated whats-new.rst with info on quarter offset support. * Updated whats-new.rst with info on quarter offset support. * Update doc/whats-new.rst Co-Authored-By: jwenfai <jwenfai@gmail.com> * Added support for quarter frequencies when resampling CFTimeIndex. Less redundancy in CFTimeIndex resampling tests. * Removed normalization code (unnecessary for cftime_range) in cftime_offsets.py. Removed redundant lines in whats-new.rst. * Removed invalid option from _get_day_of_month docstring. Added tests back in that raises ValueError when resampling (base=24 when resampling to daily freq, e.g., '8D'). * Minor edits to docstrings/comments * lint

…ns with size>1 (#2757) * Quarter offset implemented (base is now latest pydata-master). (#2721) * Quarter offset implemented (base is now latest pydata-master). * Fixed issues raised in review (#2721 (review)) * Updated whats-new.rst with info on quarter offset support. * Updated whats-new.rst with info on quarter offset support. * Update doc/whats-new.rst Co-Authored-By: jwenfai <jwenfai@gmail.com> * Added support for quarter frequencies when resampling CFTimeIndex. Less redundancy in CFTimeIndex resampling tests. * Removed normalization code (unnecessary for cftime_range) in cftime_offsets.py. Removed redundant lines in whats-new.rst. * Removed invalid option from _get_day_of_month docstring. Added tests back in that raises ValueError when resampling (base=24 when resampling to daily freq, e.g., '8D'). * Minor edits to docstrings/comments * lint * Add `Dataset.drop_dims` (#2767) * ENH: Add Dataset.drop_dims() * Drops full dimensions and any corresponding variables in a Dataset * Fixes GH1949 * DOC: Add Dataset.drop_dims() documentation * Improve name concat (#2792) * Added tests of desired name inferring behaviour * Infers names * updated what's new * Don't use deprecated np.asscalar() (#2800) It got deprecated in numpy 1.16 and throws a ton of warnings due to that. All the function does is returning .item() anyway, which is why it got deprecated. * Add support for cftime.datetime coordinates with coarsen (#2778) * some docs updates (#2746) * Friendlier io title. * Fix lists. * Fix *args, **kwargs "inline emphasis..." * misc * Reference xarray_extras for csv writing. Closes #2289 * Add metpy accessor. Closes #461 * fix transpose docstring. Closes #2576 * Revert "Fix lists." This reverts commit 39983a5. * Revert "Fix *args, **kwargs" This reverts commit 1b9da35. * Add MetPy to related projects. * Add Weather and Climate specific page. * Add hvplot. * Note open_dataset, mfdataset open files as read-only (closes #2345). * Update metpy 1 Co-Authored-By: dcherian <dcherian@users.noreply.github.com> * Update doc/weather-climate.rst Co-Authored-By: dcherian <dcherian@users.noreply.github.com> * Drop failing tests writing multi-dimensional arrays as attributes (#2810) These aren't valid for netCDF files. Fixes GH2803 * Push back finalizing deprecations for 0.12 (#2809) 0.12 will already have a big change in dropping Python 2.7 support. I'd rather wait a bit longer to finalize these deprecations to minimize the impact on users. * enable loading remote hdf5 files (#2782) * attempt at loading remote hdf5 * added a couple tests * rewind bytes after reading header * addressed comments for tests and error message * fixed pep8 formatting * created _get_engine_from_magic_number function, new tests * added description in whats-new * fixed test failure on windows * same error on windows and nix * Release 0.12.0 * Add whats-new for 0.12.1 * Rework whats-new for 0.12 * DOC: Update donation links * DOC: remove outdated warning (#2818) * Allow expand_dims() method to support inserting/broadcasting dimensions with size>1 (#2757) * Make using dim_kwargs for python 3.5 illegal -- a ValueError is thrown * dataset.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * dataarray.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * Add alternative option to passing a dict to the dim argument, which is now an optional kwarg, passing in each new dimension as its own kwarg * Add expand_dims enhancement from issue 2710 to whats-new.rst * Fix test_dataarray.TestDataArray.test_expand_dims_with_greater_dim_size tests to pass in python 3.5 using ordered dicts instead of regular dicts. This was needed because python 3.5 and earlier did not maintain insertion order for dicts * Restrict core logic to use 'dim' as a dict--it will be converted into a dict on entry if it is a str or a sequence of str * Don't cast dim values (coords) as a list since IndexVariable/Variable will internally convert it into a numpy.ndarray. So just use IndexVariable((k,), v) * TypeErrors should be raised for invalid input types, rather than ValueErrors. * Force 'dim' to be OrderedDict for python 3.5 * Allow expand_dims() method to support inserting/broadcasting dimensions with size>1 (#2757) * use .size attribute to determine the size of a dimension, rather than converting to a list, which can be slow for large iterables * Make using dim_kwargs for python 3.5 illegal -- a ValueError is thrown * dataset.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * dataarray.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * Add alternative option to passing a dict to the dim argument, which is now an optional kwarg, passing in each new dimension as its own kwarg * Add expand_dims enhancement from issue 2710 to whats-new.rst * Fix test_dataarray.TestDataArray.test_expand_dims_with_greater_dim_size tests to pass in python 3.5 using ordered dicts instead of regular dicts. This was needed because python 3.5 and earlier did not maintain insertion order for dicts * Restrict core logic to use 'dim' as a dict--it will be converted into a dict on entry if it is a str or a sequence of str * Don't cast dim values (coords) as a list since IndexVariable/Variable will internally convert it into a numpy.ndarray. So just use IndexVariable((k,), v) * TypeErrors should be raised for invalid input types, rather than ValueErrors. * Force 'dim' to be OrderedDict for python 3.5 * Allow expand_dims() method to support inserting/broadcasting dimensions with size>1 (#2757) * Move enhancement description up to 0.12.1 * use .size attribute to determine the size of a dimension, rather than converting to a list, which can be slow for large iterables * Make using dim_kwargs for python 3.5 illegal -- a ValueError is thrown * dataset.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * dataarray.expand_dims() method take dict like object where values represent length of dimensions or coordinates of dimesnsions * Add alternative option to passing a dict to the dim argument, which is now an optional kwarg, passing in each new dimension as its own kwarg * Add expand_dims enhancement from issue 2710 to whats-new.rst * Fix test_dataarray.TestDataArray.test_expand_dims_with_greater_dim_size tests to pass in python 3.5 using ordered dicts instead of regular dicts. This was needed because python 3.5 and earlier did not maintain insertion order for dicts * Restrict core logic to use 'dim' as a dict--it will be converted into a dict on entry if it is a str or a sequence of str * Don't cast dim values (coords) as a list since IndexVariable/Variable will internally convert it into a numpy.ndarray. So just use IndexVariable((k,), v) * TypeErrors should be raised for invalid input types, rather than ValueErrors. * Force 'dim' to be OrderedDict for python 3.5

Quarter offset implemented (base is now latest pydata-master).

0a58cba

spencerkclark reviewed Feb 2, 2019

View reviewed changes

jwenfai added 4 commits February 2, 2019 14:46

Fixed issues raised in review (pydata#2721 (review))

ae82957

Updated whats-new.rst with info on quarter offset support.

87c9690

Merge branch 'master' into quarter-offset

cd1bffb

Updated whats-new.rst with info on quarter offset support.

3ab3b3c

spencerkclark reviewed Feb 2, 2019

View reviewed changes

doc/whats-new.rst Outdated Show resolved Hide resolved

Update doc/whats-new.rst

46e9540

Co-Authored-By: jwenfai <jwenfai@gmail.com>

jwenfai added 3 commits February 2, 2019 23:15

Merge branch 'pydata-master' into quarter-offset

018e5f2

Added support for quarter frequencies when resampling CFTimeIndex. Le…

39386fa

…ss redundancy in CFTimeIndex resampling tests.

Merge remote-tracking branch 'ouranosinc-upstream/quarter-offset' int…

3465c00

…o quarter-offset

spencerkclark reviewed Feb 3, 2019

View reviewed changes

Removed normalization code (unnecessary for cftime_range) in cftime_o…

0d086a0

…ffsets.py. Removed redundant lines in whats-new.rst.

spencerkclark approved these changes Feb 4, 2019

View reviewed changes

Removed invalid option from _get_day_of_month docstring. Added tests …

2d0e2b1

…back in that raises ValueError when resampling (base=24 when resampling to daily freq, e.g., '8D').

spencerkclark mentioned this pull request Feb 19, 2019

0.12.0 release #2776

Closed

spencerkclark added 2 commits February 20, 2019 16:50

Merge branch 'master' into quarter-offset

66e103e

Minor edits to docstrings/comments

7a5e0cb

lint

0a1f219

spencerkclark merged commit 627a881 into pydata:master Mar 2, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Quarter offset implemented (base is now latest pydata-master). #2721

Quarter offset implemented (base is now latest pydata-master). #2721

jwenfai commented Jan 27, 2019 •

edited by spencerkclark

Loading

spencerkclark commented Jan 27, 2019

spencerkclark left a comment

spencerkclark Feb 2, 2019

spencerkclark Feb 2, 2019

spencerkclark Feb 2, 2019

spencerkclark Feb 2, 2019

spencerkclark Feb 2, 2019

spencerkclark Feb 2, 2019

spencerkclark Feb 2, 2019

spencerkclark Feb 2, 2019

spencerkclark Feb 2, 2019

jwenfai commented Feb 2, 2019

spencerkclark left a comment

jwenfai commented Feb 2, 2019

spencerkclark left a comment

spencerkclark Feb 3, 2019

spencerkclark Feb 3, 2019

spencerkclark Feb 3, 2019

jwenfai commented Feb 3, 2019

spencerkclark left a comment

spencerkclark Feb 4, 2019

jwenfai Feb 4, 2019

spencerkclark Feb 4, 2019

jwenfai Feb 4, 2019

pep8speaks commented Feb 26, 2019 •

edited

Loading

spencerkclark commented Feb 26, 2019

jwenfai commented Feb 26, 2019

spencerkclark commented Mar 2, 2019



		def _get_day_of_month(other, day_option):
		"""Find the day in `other`'s month that satisfies a DateOffset's onOffset

		return _get_day_of_month(other, self._day_option)


		def _is_normalized(datetime):

Quarter offset implemented (base is now latest pydata-master). #2721

Quarter offset implemented (base is now latest pydata-master). #2721

Conversation

jwenfai commented Jan 27, 2019 • edited by spencerkclark Loading

spencerkclark commented Jan 27, 2019

spencerkclark left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jwenfai commented Feb 2, 2019

spencerkclark left a comment

Choose a reason for hiding this comment

jwenfai commented Feb 2, 2019

spencerkclark left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jwenfai commented Feb 3, 2019

spencerkclark left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pep8speaks commented Feb 26, 2019 • edited Loading

Comment last updated on February 26, 2019 at 00:48 Hours UTC

spencerkclark commented Feb 26, 2019

jwenfai commented Feb 26, 2019

spencerkclark commented Mar 2, 2019

jwenfai commented Jan 27, 2019 •

edited by spencerkclark

Loading

pep8speaks commented Feb 26, 2019 •

edited

Loading