Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: prevent coercion to datetime64[ns] when a Series is initialized with both tz-naive and tz-aware #18361

Merged
merged 4 commits into from
Nov 23, 2017

Conversation

jamestran201
Copy link

@jamestran201 jamestran201 commented Nov 18, 2017

@pep8speaks
Copy link

pep8speaks commented Nov 18, 2017

Hello @tmnhat2001! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on November 23, 2017 at 02:49 Hours UTC

for i in range(n):
val = values[i]
if _checknull_with_nat(val):
iresult[i] = NPY_NAT
elif PyDateTime_Check(val):
if val.tzinfo is not None:
if tz_naive_or_tz_aware is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this accomplishing that ‘inferred_tz’ isn’t?

@jreback
Copy link
Contributor

jreback commented Nov 19, 2017

so where you are fixing is not the correct location, its a bit after the actual error, rather try here:
https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/inference.pyx#L459

a mixed datetime64 & datetime64 that is tz-aware are in the same bucket, rather this needs disambiguation, and ultimately should be 'mixed'.

Note we have no tests for this routine, so that is the first thing we need. Pls add to
pandas/tests/dtypes/test_inference.py

@jreback jreback added Bug Datetime Datetime data dtype Timezones Timezone data dtype labels Nov 19, 2017
@jamestran201
Copy link
Author

my bad, should have checked out that routine

@jamestran201
Copy link
Author

np.datetime64 objects don't have timezone info right? I just want to make sure not to leave it out.

@jreback
Copy link
Contributor

jreback commented Nov 21, 2017

no, no.datetime64 don’t support time zone (they print with a local time zone which is extra confusing)

@codecov
Copy link

codecov bot commented Nov 22, 2017

Codecov Report

Merging #18361 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18361      +/-   ##
==========================================
- Coverage   91.35%   91.34%   -0.02%     
==========================================
  Files         163      163              
  Lines       49691    49691              
==========================================
- Hits        45397    45388       -9     
- Misses       4294     4303       +9
Flag Coverage Δ
#multiple 89.14% <ø> (ø) ⬆️
#single 39.67% <ø> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fedc503...d6f9181. Read the comment docs.

@@ -178,5 +178,5 @@ Other
^^^^^

- Improved error message when attempting to use a Python keyword as an identifier in a numexpr query (:issue:`18221`)
-
- Bug in :func:`inference.pyx.infer_to_datetimelike_array` that coerces tz-naive and tz-aware values in the same array to datetime64[ns] (:issue:`16406`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

give a user perspective here (IOW don't reference internal functions).

@@ -3238,7 +3240,8 @@ def _try_cast(arr, take_fast_path):
else:
subarr = maybe_convert_platform(data)

subarr = maybe_cast_to_datetime(subarr, dtype)
if try_cast_to_datetime:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this changed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry about that, i was trying out something similar to infer_datetimelike_array in that function before you told me to try in inference.pyx. I'll change it back to the original version.

assert lib.infer_datetimelike_array(arr) == "date"

@pytest.mark.parametrize(
"data",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add strings, ints, float, and a python object

e.g.

class Foo():
    pass;
Foo()

should all return mixed (as this is a date infered)

@@ -949,3 +949,8 @@ def test_get_level_values_box(self):
index = MultiIndex(levels=levels, labels=labels)

assert isinstance(index.get_level_values(0)[0], Timestamp)

def test_tz_naive_and_tz_aware_mix(self):
s = Series([Timestamp('20130101'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the issue number as a comment

do a full comparision with assert_series_equal, IOW construct the expected

move this test to
test_constructors.py

@jreback jreback added this to the 0.22.0 milestone Nov 23, 2017
@jreback jreback merged commit 41004d9 into pandas-dev:master Nov 23, 2017
@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

@tmnhat2001 nice patch! keep em coming!

@jamestran201
Copy link
Author

thanks

@jamestran201 jamestran201 deleted the tz-16406 branch November 24, 2017 00:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Datetime Datetime data dtype Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: construction of tz-naive and tz-aware should not coerce
5 participants