-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix incorrect DTI/TDI indexing; warn before dropping tzinfo #22549
Conversation
Codecov Report
@@ Coverage Diff @@
## master #22549 +/- ##
==========================================
+ Coverage 92.05% 92.05% +<.01%
==========================================
Files 169 169
Lines 50783 50788 +5
==========================================
+ Hits 46749 46754 +5
Misses 4034 4034
Continue to review full report at Codecov.
|
pandas/_libs/tslibs/timestamps.pyx
Outdated
if self.tz is not None: | ||
# GH#21333 | ||
warnings.warn("Converting to Period representation will " | ||
"drop timezone information.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Warning type and stack level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This questions applies to everywhere where you placed warnings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea what warning type to use. Suggestions?
As to stack level, I tried a bunch to get that to work with tm.assert_produces_warning and eventually threw in the towel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you have it, UserWarning
makes sense, but I think being explicit about it is good.
I know what you mean regarding stacklevel
. We generally try to get one that makes sense, and if the tests don't cooperate, we can just use check_stacklevel=False
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the warnings are raised in direct user-api methods, normally putting a stacklevel=2
should do the correct thing.
What did you not get working in the tests? (which kind of code sample)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What did you not get working in the tests?
IIRC the problem is that the affected code paths issues both the new FutureWarning
and in some cases also a PerformanceWarning
. tm.assert_produces_warning
doesn't support multiple expected warnings, and my attempt to modify it led to tests failing the stacklevel checks.
pandas/core/indexes/datetimes.py
Outdated
@@ -1201,6 +1205,12 @@ def get_loc(self, key, method=None, tolerance=None): | |||
key = Timestamp(key, tz=self.tz) | |||
return Index.get_loc(self, key, method, tolerance) | |||
|
|||
if isinstance(key, timedelta): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if/elif here?
@@ -487,7 +488,8 @@ def get_loc(self, key, method=None, tolerance=None): | |||
------- | |||
loc : int | |||
""" | |||
if is_list_like(key): | |||
if is_list_like(key) or (isinstance(key, datetime) and key is not NaT): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if we use isna else for NaT checking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in this context isna
is less clear. Since there is a specific na-like object we are catching here, we should be explicit about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok this comment looks good
pandas/core/indexes/timedeltas.py
Outdated
@@ -487,7 +488,8 @@ def get_loc(self, key, method=None, tolerance=None): | |||
------- | |||
loc : int | |||
""" | |||
if is_list_like(key): | |||
if is_list_like(key) or (isinstance(key, datetime) and key is not NaT): | |||
# GH#20464 for datetime case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this more explicit, meaning datetime dtype for all-NaT
with tm.assert_produces_warning(UserWarning): | ||
# warning that timezone info will be lost | ||
result = ts.to_period()[0] | ||
expected = ts[0].to_period() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in future PR these should be parameterized
@@ -41,6 +41,14 @@ def test_getitem(self): | |||
tm.assert_index_equal(result, expected) | |||
assert result.freq == expected.freq | |||
|
|||
def test_timestamp_invalid_key(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you parameterize over a datetime as well
@@ -41,6 +41,14 @@ def test_getitem(self): | |||
tm.assert_index_equal(result, expected) | |||
assert result.freq == expected.freq | |||
|
|||
def test_timestamp_invalid_key(self): | |||
# GH#20464 | |||
tdi = pd.timedelta_range(0, periods=10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have a test that indexes with NaT? (both Timedelta and Datetime dtype)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have one for timedelta, just added one for datetime
Hello @jbrockmendel! Thanks for updating the PR.
|
Looks like circleCI is a hypothesis timeout |
lgtm. rebase after @mroeschke CalendarDay change. ping on green. |
Ping |
thanks! |
Slight hodge-podge.
#21333 is about
to_period
method on DatetimeIndex silently dropping timezone info. This PR makesDatetimeIndex.to_period
andTimestamp.to_period
issue a warning when losing timezone info.#20464 is entirely unrelated, notes that
pd.date_range('1970-01-01', period=2).get_loc(pd.Timedelta(0))
returns0
instead of raising; ditto for indexing a TDI with a Timestamp.git diff upstream/master -u -- "*.py" | flake8 --diff