REF: Refactor Date/TimeLikeOps #24038

Merged
merged 2 commits into from
Dec 2, 2018
224 changes: 222 additions & 2 deletions pandas/core/arrays/datetimelike.py
@@ -10,11 +10,12 @@
from pandas._libs.tslibs.period import (
    DIFFERENT_FREQ_INDEX, IncompatibleFrequency, Period)
from pandas._libs.tslibs.timedeltas import Timedelta, delta_to_nanoseconds
-from pandas._libs.tslibs.timestamps import maybe_integer_op_deprecated
+from pandas._libs.tslibs.timestamps import (
+    RoundTo, maybe_integer_op_deprecated, round_nsint64)
import pandas.compat as compat
from pandas.errors import (
    AbstractMethodError, NullFrequencyError, PerformanceWarning)
-from pandas.util._decorators import deprecate_kwarg
+from pandas.util._decorators import Appender, deprecate_kwarg

from pandas.core.dtypes.common import (
    is_bool_dtype, is_datetime64_any_dtype, is_datetime64_dtype,
@@ -80,6 +81,189 @@ def _get_attributes_dict(self):
        return {k: getattr(self, k, None) for k in self._attributes}


class DatelikeOps(object):
    """
    Common ops for DatetimeIndex/PeriodIndex, but not TimedeltaIndex.
[Review comment, Member] Index --> Array/Index

    """

    def strftime(self, date_format):
        from pandas import Index
        return Index(self.format(date_format=date_format),
                     dtype=compat.text_type)
    strftime.__doc__ = """
[Review comment, Member] Not a big deal, but is there a reason not to put this docstring in the normal docstring place and then accomplish the formatting with @Substitution(...)?

        Convert to Index using specified date_format.

        Return an Index of formatted strings specified by date_format, which
        supports the same string format as the python standard library. Details
        of the string format can be found in `python string format doc <{0}>`__

        Parameters
        ----------
        date_format : str
            Date format string (e.g. "%Y-%m-%d").

        Returns
        -------
        Index
            Index of formatted strings

        See Also
        --------
        to_datetime : Convert the given argument to datetime.
        DatetimeIndex.normalize : Return DatetimeIndex with times to midnight.
        DatetimeIndex.round : Round the DatetimeIndex to the specified freq.
        DatetimeIndex.floor : Floor the DatetimeIndex to the specified freq.

        Examples
        --------
        >>> rng = pd.date_range(pd.Timestamp("2018-03-10 09:00"),
        ...                     periods=3, freq='s')
        >>> rng.strftime('%B %d, %Y, %r')
        Index(['March 10, 2018, 09:00:00 AM', 'March 10, 2018, 09:00:01 AM',
               'March 10, 2018, 09:00:02 AM'],
              dtype='object')
        """.format("https://docs.python.org/3/library/datetime.html"
                   "#strftime-and-strptime-behavior")


class TimelikeOps(object):
    """
    Common ops for TimedeltaIndex/DatetimeIndex, but not PeriodIndex.
    """

    _round_doc = (
[Review comment, Member] Can we avoid having these in the TimedeltaArray/Index namespace?

        """
        Perform {op} operation on the data to the specified `freq`.

        Parameters
        ----------
        freq : str or Offset
            The frequency level to {op} the index to. Must be a fixed
            frequency like 'S' (second) not 'ME' (month end). See
            :ref:`frequency aliases <timeseries.offset_aliases>` for
            a list of possible `freq` values.
        ambiguous : 'infer', bool-ndarray, 'NaT', default 'raise'
            Only relevant for DatetimeIndex:

            - 'infer' will attempt to infer fall dst-transition hours based on
              order
            - bool-ndarray where True signifies a DST time, False designates
              a non-DST time (note that this flag is only applicable for
              ambiguous times)
            - 'NaT' will return NaT where there are ambiguous times
            - 'raise' will raise an AmbiguousTimeError if there are ambiguous
              times

            .. versionadded:: 0.24.0
        nonexistent : 'shift', 'NaT', default 'raise'
            A nonexistent time does not exist in a particular timezone
            where clocks moved forward due to DST.

            - 'shift' will shift the nonexistent time forward to the closest
              existing time
            - 'NaT' will return NaT where there are nonexistent times
            - 'raise' will raise an NonExistentTimeError if there are
              nonexistent times

            .. versionadded:: 0.24.0

        Returns
        -------
        DatetimeIndex, TimedeltaIndex, or Series
[Review comment, Member] Array/Index/Series

            Index of the same type for a DatetimeIndex or TimedeltaIndex,
            or a Series with the same index for a Series.

        Raises
        ------
        ValueError if the `freq` cannot be converted.

        Examples
        --------
        **DatetimeIndex**

        >>> rng = pd.date_range('1/1/2018 11:59:00', periods=3, freq='min')
        >>> rng
        DatetimeIndex(['2018-01-01 11:59:00', '2018-01-01 12:00:00',
                       '2018-01-01 12:01:00'],
                      dtype='datetime64[ns]', freq='T')
        """)

    _round_example = (
        """>>> rng.round('H')
        DatetimeIndex(['2018-01-01 12:00:00', '2018-01-01 12:00:00',
                       '2018-01-01 12:00:00'],
                      dtype='datetime64[ns]', freq=None)

        **Series**

        >>> pd.Series(rng).dt.round("H")
        0   2018-01-01 12:00:00
        1   2018-01-01 12:00:00
        2   2018-01-01 12:00:00
        dtype: datetime64[ns]
        """)

    _floor_example = (
        """>>> rng.floor('H')
        DatetimeIndex(['2018-01-01 11:00:00', '2018-01-01 12:00:00',
                       '2018-01-01 12:00:00'],
                      dtype='datetime64[ns]', freq=None)

        **Series**

        >>> pd.Series(rng).dt.floor("H")
        0   2018-01-01 11:00:00
        1   2018-01-01 12:00:00
        2   2018-01-01 12:00:00
        dtype: datetime64[ns]
        """
    )

    _ceil_example = (
        """>>> rng.ceil('H')
        DatetimeIndex(['2018-01-01 12:00:00', '2018-01-01 12:00:00',
                       '2018-01-01 13:00:00'],
                      dtype='datetime64[ns]', freq=None)

        **Series**

        >>> pd.Series(rng).dt.ceil("H")
        0   2018-01-01 12:00:00
        1   2018-01-01 12:00:00
        2   2018-01-01 13:00:00
        dtype: datetime64[ns]
        """
    )

    def _round(self, freq, mode, ambiguous, nonexistent):
        # round the local times
        values = _ensure_datetimelike_to_i8(self)
        result = round_nsint64(values, mode, freq)
        result = self._maybe_mask_results(result, fill_value=NaT)

        attribs = self._get_attributes_dict()
        attribs['freq'] = None
        if 'tz' in attribs:
            attribs['tz'] = None
        return self._ensure_localized(
[Review comment, Member] Do the array classes have _ensure_localized?

            self._shallow_copy(result, **attribs), ambiguous, nonexistent
[Review comment, Member] The array classes don't have _shallow_copy anymore. Use _simple_new directly?

        )
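
For readers unfamiliar with the three rounding modes that `_round` forwards to `round_nsint64`, the following is a minimal pure-Python sketch of what floor, ceil and round-half-to-even mean on integer values. It is not the pandas implementation: `round_to_multiple` and its string mode names are hypothetical, and it works on plain integers (minutes here) rather than int64 nanosecond arrays.

```python
def round_to_multiple(value, unit, mode):
    """Round an integer to a multiple of `unit` (hypothetical helper).

    mode is one of "floor", "ceil" or "half_even"; these names are
    illustrative and are not the pandas RoundTo constants.
    """
    # divmod floors toward minus infinity, so remainder is always >= 0.
    quotient, remainder = divmod(value, unit)
    if mode == "floor":
        return quotient * unit
    if mode == "ceil":
        return (quotient + (1 if remainder else 0)) * unit
    if mode == "half_even":
        twice = 2 * remainder
        # Round up past the midpoint; on an exact tie pick the even multiple.
        if twice > unit or (twice == unit and quotient % 2 == 1):
            quotient += 1
        return quotient * unit
    raise ValueError("unknown mode: {!r}".format(mode))


# Minutes since midnight for 11:59, 12:00 and 12:01, rounded to the hour.
minutes = [719, 720, 721]
floored = [round_to_multiple(m, 60, "floor") for m in minutes]
ceiled = [round_to_multiple(m, 60, "ceil") for m in minutes]
nearest = [round_to_multiple(m, 60, "half_even") for m in minutes]
```

With these inputs the results line up with the `floor`, `ceil` and `round` docstring examples in this diff (11:00/12:00/12:00, 12:00/12:00/13:00 and 12:00 three times, expressed in minutes).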

    @Appender((_round_doc + _round_example).format(op="round"))
    def round(self, freq, ambiguous='raise', nonexistent='raise'):
        return self._round(
            freq, RoundTo.NEAREST_HALF_EVEN, ambiguous, nonexistent
        )

    @Appender((_round_doc + _floor_example).format(op="floor"))
    def floor(self, freq, ambiguous='raise', nonexistent='raise'):
        return self._round(freq, RoundTo.MINUS_INFTY, ambiguous, nonexistent)

    @Appender((_round_doc + _ceil_example).format(op="ceil"))
    def ceil(self, freq, ambiguous='raise', nonexistent='raise'):
        return self._round(freq, RoundTo.PLUS_INFTY, ambiguous, nonexistent)
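
The three methods above share one docstring template that is filled in with `.format(op=...)` and attached by a decorator, which is also roughly what the reviewer's `@Substitution(...)` suggestion for `strftime` amounts to. A minimal sketch of that pattern follows. The decorator `append_doc`, the shortened templates and the class name are hypothetical stand-ins, not the pandas `Appender` or `Substitution` implementations.

```python
_ROUND_DOC = """
Perform {op} operation on the data to the specified `freq`.
"""

_ROUND_EXAMPLE = """>>> rng.round('H')
"""


def append_doc(addendum):
    """Hypothetical decorator: append `addendum` to a function's docstring."""
    def decorator(func):
        func.__doc__ = (func.__doc__ or "") + addendum
        return func
    return decorator


class TimelikeOpsSketch(object):
    # The template is formatted once, at class-definition time.
    @append_doc((_ROUND_DOC + _ROUND_EXAMPLE).format(op="round"))
    def round(self, freq):
        raise NotImplementedError("illustration only")


round_doc = TimelikeOpsSketch.round.__doc__
```

One caveat of `str.format`-based templates is that any literal braces in the docstring text would have to be doubled, so this approach suits templates like the ones here that contain only the `{op}` placeholder.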


class DatetimeLikeArrayMixin(ExtensionOpsMixin, AttributesMixin):
    """
    Shared Base/Mixin class for DatetimeArray, TimedeltaArray, PeriodArray
@@ -1023,3 +1207,39 @@ def validate_dtype_freq(dtype, freq):
            raise IncompatibleFrequency('specified freq and dtype '
                                        'are different')
    return freq


def _ensure_datetimelike_to_i8(other, to_utc=False):
    """
    Helper for coercing an input scalar or array to i8.

    Parameters
    ----------
    other : 1d array
    to_utc : bool, default False
        If True, convert the values to UTC before extracting the i8 values
        If False, extract the i8 values directly.

    Returns
    -------
    i8 1d array
    """
    from pandas import Index
[Review comment, Member] Is there any way to avoid this? We've pretty assiduously kept the EA subclasses Index-ignorant so far

[Reply, Contributor Author] That's what the old version did. I haven't looked at which is more appropriate here.

    from pandas.core.arrays import PeriodArray

    if lib.is_scalar(other) and isna(other):
        return iNaT
    elif isinstance(other, (PeriodArray, ABCIndexClass)):
        # convert tz if needed
        if getattr(other, 'tz', None) is not None:
            if to_utc:
                other = other.tz_convert('UTC')
            else:
                other = other.tz_localize(None)
    else:
        try:
            return np.array(other, copy=False).view('i8')
        except TypeError:
            # period array cannot be coerced to int
            other = Index(other)
    return other.asi8
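
The `to_utc` flag decides whether a tz-aware input contributes its UTC instant or its local wall-clock time to the integer result. The sketch below illustrates that difference for a single standard-library `datetime`, assuming a fixed UTC-5 offset. `scalar_to_i8` is a hypothetical scalar analogue written for illustration, not part of pandas.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1)


def scalar_to_i8(ts, to_utc=False):
    """Hypothetical scalar analogue of the tz branch above (not pandas code)."""
    if ts.tzinfo is not None:
        if to_utc:
            # Like tz_convert('UTC'): keep the instant, express it in UTC.
            ts = ts.astimezone(timezone.utc)
        # Like tz_localize(None): drop the tz, keep the wall-clock fields.
        ts = ts.replace(tzinfo=None)
    # Whole microseconds since the epoch, scaled to nanoseconds.
    return (ts - EPOCH) // timedelta(microseconds=1) * 1000


# Noon on 2018-01-01 in a fixed UTC-5 zone (an assumption for the example).
fixed_offset_noon = datetime(2018, 1, 1, 12, 0,
                             tzinfo=timezone(timedelta(hours=-5)))
wall_i8 = scalar_to_i8(fixed_offset_noon)
utc_i8 = scalar_to_i8(fixed_offset_noon, to_utc=True)
```

Here `utc_i8` exceeds `wall_i8` by exactly five hours in nanoseconds. `_round` calls the helper with the default `to_utc=False`, which matches its "round the local times" comment.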
4 changes: 3 additions & 1 deletion pandas/core/arrays/datetimes.py
@@ -156,7 +156,9 @@ def wrapper(self, other):
    return compat.set_function_name(wrapper, opname, cls)


-class DatetimeArrayMixin(dtl.DatetimeLikeArrayMixin):
+class DatetimeArrayMixin(dtl.DatetimeLikeArrayMixin,
+                         dtl.TimelikeOps,
+                         dtl.DatelikeOps):
    """
    Assumes that subclass __new__/__init__ defines:
        tz
2 changes: 1 addition & 1 deletion pandas/core/arrays/timedeltas.py
@@ -129,7 +129,7 @@ def method(self, other):
    return method


-class TimedeltaArrayMixin(dtl.DatetimeLikeArrayMixin):
+class TimedeltaArrayMixin(dtl.DatetimeLikeArrayMixin, dtl.TimelikeOps):
    _typ = "timedeltaarray"
    __array_priority__ = 1000
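
The net effect of the two class-definition changes is that the datetime array gains both ops mixins while the timedelta array gains only the rounding one. A minimal sketch of that composition follows, using hypothetical stand-in classes whose method bodies are placeholders rather than the pandas mixins themselves.

```python
class RoundingOpsSketch(object):
    """Stand-in for TimelikeOps: rounding applies to datetimes and timedeltas."""

    def round(self, freq):
        return "round to {}".format(freq)


class FormattingOpsSketch(object):
    """Stand-in for DatelikeOps: strftime only makes sense for dates."""

    def strftime(self, date_format):
        return "format with {}".format(date_format)


class DatetimeArraySketch(RoundingOpsSketch, FormattingOpsSketch):
    """Gets both mixins, mirroring the new DatetimeArrayMixin bases."""


class TimedeltaArraySketch(RoundingOpsSketch):
    """Gets only the rounding mixin, mirroring TimedeltaArrayMixin."""


datetime_has_strftime = hasattr(DatetimeArraySketch(), "strftime")
timedelta_has_strftime = hasattr(TimedeltaArraySketch(), "strftime")
```

Splitting the ops into separate mixins keeps `strftime` out of the timedelta namespace without duplicating the rounding methods.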
