BUG+DEPR: undeprecate item, fix dt64/td64 output type #30175
stacklevel=2,
)
return self.values.item()
if len(self) == 1:
Was adding this condition discussed somewhere? I would have thought we'd just keep the existing behaviour.
This is the bugfix part. For dt64, dt64tz, and td64 we're currently incorrectly returning int.
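For illustration, the integer result can be reproduced with plain NumPy. This is a minimal sketch, assuming nanosecond-resolution arrays (the resolution pandas used for these dtypes at the time); the date value is arbitrary and not taken from the PR.

```python
import numpy as np

# A one-element datetime64[ns] array (arbitrary example date).
arr = np.array(["2019-12-10"], dtype="datetime64[ns]")

# datetime.datetime cannot hold nanosecond precision, so ndarray.item()
# returns the raw integer count of nanoseconds since the epoch instead.
value = arr.item()
print(type(value))

# The same fallback applies to timedelta64[ns].
delta = np.array([1], dtype="timedelta64[ns]").item()
print(type(delta))
```

Both calls yield a plain int, which is why the raw result would need wrapping in Timestamp/Timedelta.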
Hmm doesn't this break non-DTA though?
>>> type(pd.Series(range(1)).item())
<class 'int'>
>>> type(pd.Series(range(1))[0])
<class 'numpy.int64'>
I thought one of the points of item was to return a Python object (at least in the NumPy world).
^ current behavior
Yes, I think we should keep the behaviour of item to return a Python scalar (where possible of course, so for datetime/timedelta it is fine to return a pandas Timestamp/Timedelta I think).
ok, will update.
Has this been resolved?
I think the concerns raised by @WillAyd and @jorisvandenbossche have been addressed
This issue is #29250
Co-Authored-By: Stephan Hoyer <shoyer@google.com>
if not needs_i8_conversion(self.dtype):
# numpy returns ints instead of datetime64/timedelta64 objects,
# which we need to wrap in Timestamp/Timedelta/Period regardless.
return self.values.item()
This won't work for ExtensionArrays. We can discuss adding item to the interface, but I would rather (or at least for now) let ExtensionArrays take the path you have below that uses iteration (which should already handle the conversion to a Python scalar).
I would be +1 on adding to EA arrays, why have inconsistency in code paths.
"This won't work for ExtensionArrays"
This uses .values, so will convert to ndarray and then call item. So it shouldn't be any more broken than what we have now.
I have no objection to adding item to EAs separately.
"So it shouldn't be any more broken than what we have now."
And to fix that, the only thing that is needed is adding "and not is_extension_array_dtype(self.dtype)" to the above if check.
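The resulting control flow might look roughly like the sketch below. It is a hypothetical stand-alone function (item_sketch is not a name from the PR), it approximates needs_i8_conversion with a check on the NumPy dtype kind, and it uses to_numpy() in place of .values; only is_extension_array_dtype is the real public helper from pandas.api.types.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_extension_array_dtype


def item_sketch(ser: pd.Series):
    """Hypothetical stand-in for the branch under review, not the PR's code."""
    if len(ser) != 1:
        raise ValueError("can only convert an array of size 1 to a Python scalar")
    dtype = ser.dtype
    # Assumed approximation of needs_i8_conversion for numpy-backed data:
    # kind "M" is datetime64, kind "m" is timedelta64.
    is_np_datetimelike = isinstance(dtype, np.dtype) and dtype.kind in "mM"
    if not (is_extension_array_dtype(dtype) or is_np_datetimelike):
        # Plain numpy dtypes: ndarray.item() already yields a Python scalar.
        return ser.to_numpy().item()
    # Extension and datetime-like dtypes: iteration boxes the element
    # (e.g. into Timestamp/Timedelta) instead of exposing the raw integer.
    return next(iter(ser))


int_result = item_sketch(pd.Series([1]))
ts_result = item_sketch(pd.Series(pd.to_datetime(["2019-12-10"])))
cat_result = item_sketch(pd.Series(["a"], dtype="category"))
```

With this shape, integer data still comes back as a Python int, while datetime-like and extension-backed data go through the iteration path.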
I'm happy to do that here, will need tests in a follow-up.
lgtm, question
)
return self.values.item()
if not (
is_extension_array_dtype(self.dtype) or needs_i8_conversion(self.dtype)
Isn't this redundant? All needs_i8_conversion dtypes are already EA.
No, dt64 and td64 need i8 conversion but are not EA.
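A quick way to see the distinction, sketched with the public is_extension_array_dtype helper from pandas.api.types; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_extension_array_dtype

# Timezone-naive datetime64 and timedelta64 are plain numpy dtypes.
naive_dt = np.dtype("datetime64[ns]")
naive_td = np.dtype("timedelta64[ns]")
# Timezone-aware datetimes use a pandas extension dtype.
aware_dt = pd.DatetimeTZDtype(tz="UTC")

print(is_extension_array_dtype(naive_dt))  # False
print(is_extension_array_dtype(naive_td))  # False
print(is_extension_array_dtype(aware_dt))  # True
```

So a check on the extension dtype alone would miss the naive dt64/td64 cases, which is why both conditions appear in the if.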
also needs a rebase
LGTM, assuming that https://github.com/pandas-dev/pandas/pull/30175/files/0d2867d5518ac8e5032e7dfca5d0adb5811fa84d#diff-9c718a39eb63c1bfac4fbeacf2906ffd
Thanks @jbrockmendel! (will look into adding an EA test)
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Does the un-deprecation need a dedicated discussion, or is there consensus on that?
I'd be happy to split this into separate bug/depr pieces if reviewers prefer.