ENH: Add full tz aware support for IntervalIndex #18537

Closed
jschendel opened this issue Nov 28, 2017 · 3 comments · Fixed by #18558
Labels
API Design Bug Interval Interval data type Timezones Timezone data dtype
Comments

@jschendel
Member

jschendel commented Nov 28, 2017

Problem description

The fixes in #18424 allow an IntervalIndex to be constructed with tz aware timestamps, but there are a few attributes/methods that still do not work properly with tz aware data:

  • An IntervalIndex can contain elements with differing tz, which should be disallowed
  • IntervalIndex.mid returns tz naive
  • Anything using _get_next_label or _get_previous_label will raise
    • get_indexer when passed an IntervalIndex
    • get_loc for overlapping/non-monotonic IntervalIndex
    • contains always returns False for overlapping/non-monotonic IntervalIndex because the error is swallowed by a try/except
    • etc.

Code Sample, a copy-pastable example if possible

Setup:

In [2]: start = pd.Timestamp('2017-01-01', tz='US/Eastern')
   ...: index = pd.interval_range(start, periods=3)
   ...: index
   ...:
Out[2]:
IntervalIndex([(2017-01-01, 2017-01-02], (2017-01-02, 2017-01-03], (2017-01-03, 2017-01-04]]
              closed='right',
              dtype='interval[datetime64[ns, US/Eastern]]')

IntervalIndex.mid returns tz naive:

In [3]: index.mid
Out[3]:
DatetimeIndex(['2017-01-01 17:00:00', '2017-01-02 17:00:00',
               '2017-01-03 17:00:00'],
              dtype='datetime64[ns]', freq=None)
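
For reference, a minimal sketch of the expected behaviour, assuming the midpoint is computed from the left and right endpoints with timedelta arithmetic. The helper name `tz_aware_mid` is hypothetical and is not how pandas implements `IntervalIndex.mid`; it only shows that this kind of arithmetic keeps the timezone.

```python
import pandas as pd


def tz_aware_mid(left: pd.DatetimeIndex, right: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Hypothetical helper: midpoint of each (left, right) pair, keeping the tz."""
    # Subtracting two tz-aware indexes gives a TimedeltaIndex; adding half of it
    # back to the tz-aware left endpoints preserves the timezone.
    return left + (right - left) / 2


left = pd.date_range('2017-01-01', periods=3, tz='US/Eastern')
right = pd.date_range('2017-01-02', periods=3, tz='US/Eastern')
mid = tz_aware_mid(left, right)
print(mid)
```

The result stays in US/Eastern, so the first midpoint is noon local time rather than the naive 17:00 shown above.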

get_indexer raises when passed an IntervalIndex:

In [4]: index.get_indexer(index[:-1])
---------------------------------------------------------------------------
TypeError: cannot determine next label for type <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

Differing tz:

In [5]: left = pd.date_range('2017-01-01', periods=3, tz='US/Eastern')
   ...: right = pd.date_range('2017-01-02', periods=3, tz='US/Pacific')
   ...: index = pd.IntervalIndex.from_arrays(left, right)
   ...: index
   ...:
Out[5]:
IntervalIndex([(2017-01-01, 2017-01-02], (2017-01-02, 2017-01-03], (2017-01-03, 2017-01-04]]
              closed='right',
              dtype='interval[datetime64[ns, US/Eastern]]')

In [6]: index.left.dtype
Out[6]: datetime64[ns, US/Eastern]

In [7]: index.right.dtype
Out[7]: datetime64[ns, US/Pacific]
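
One possible shape for the missing validation, as a sketch only: compare the dtypes of the two endpoint arrays before building the index and raise if they differ. The function name `check_matching_tz` and the error message are assumptions for illustration, not part of pandas.

```python
import pandas as pd


def check_matching_tz(left: pd.DatetimeIndex, right: pd.DatetimeIndex) -> None:
    """Hypothetical check: raise if left and right do not share a dtype (and so a tz)."""
    # Comparing the string form of the dtype covers both the tz-aware case
    # (e.g. 'datetime64[ns, US/Eastern]') and the naive case ('datetime64[ns]').
    if str(left.dtype) != str(right.dtype):
        raise ValueError(
            f"left and right must have the same time zone, "
            f"got {left.dtype} and {right.dtype}"
        )


left = pd.date_range('2017-01-01', periods=3, tz='US/Eastern')
right = pd.date_range('2017-01-02', periods=3, tz='US/Pacific')

try:
    check_matching_tz(left, right)
except ValueError as err:
    print(err)
```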
@jschendel
Member Author

Side question: Should a plain Interval be able to hold timestamps with different tz? Seems a bit strange, but since different tz are comparable, I don't want to outright declare this a bug.

In [2]: left = pd.Timestamp('2017-01-01', tz='US/Eastern')
   ...: right = pd.Timestamp('2017-01-02', tz='US/Pacific')
   ...: iv = pd.Interval(left, right)
   ...:

In [3]: iv
Out[3]: Interval('2017-01-01', '2017-01-02', closed='right')

In [4]: iv.left
Out[4]: Timestamp('2017-01-01 00:00:00-0500', tz='US/Eastern')

In [5]: iv.right
Out[5]: Timestamp('2017-01-02 00:00:00-0800', tz='US/Pacific')

Note that an Interval cannot be constructed from a combination of aware and naive timestamps, as the two aren't comparable:

In [6]: left = pd.Timestamp('2017-01-01', tz='US/Eastern')
   ...: right = pd.Timestamp('2017-01-01')
   ...: iv = pd.Interval(left, right)
   ...:
---------------------------------------------------------------------------
TypeError: Cannot compare tz-naive and tz-aware timestamps

@jreback
Contributor

jreback commented Nov 28, 2017

no, an interval must have the same tz in left/right (could be None in both as well)
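
A rough sketch of that rule for a scalar Interval, assuming the check is done by comparing the tz of the two endpoints before construction. `make_interval` is a hypothetical wrapper for illustration, not a pandas function.

```python
import pandas as pd


def make_interval(left: pd.Timestamp, right: pd.Timestamp, closed: str = 'right') -> pd.Interval:
    """Hypothetical wrapper: build an Interval only if both endpoints share a tz."""
    # str(None) == 'None', so two naive timestamps also pass this check,
    # while a naive/aware mix or two different zones is rejected.
    if str(left.tz) != str(right.tz):
        raise ValueError(
            f"left and right must have the same time zone, got {left.tz} and {right.tz}"
        )
    return pd.Interval(left, right, closed=closed)


# Same tz on both sides: allowed
iv = make_interval(pd.Timestamp('2017-01-01', tz='US/Eastern'),
                   pd.Timestamp('2017-01-02', tz='US/Eastern'))

# Differing tz: rejected
try:
    make_interval(pd.Timestamp('2017-01-01', tz='US/Eastern'),
                  pd.Timestamp('2017-01-02', tz='US/Pacific'))
except ValueError as err:
    print(err)
```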

@jschendel
Member Author

@jreback : okay, will open a new issue for that to give it more visibility/searchability.

@jreback jreback added Difficulty Intermediate Interval Interval data type Timezones Timezone data dtype labels Nov 28, 2017
@jreback jreback added this to the Next Major Release milestone Nov 28, 2017
@jreback jreback modified the milestones: Next Major Release, 0.22.0 Nov 29, 2017