Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Make DTI/TDI/PI argsort match their underlying arrays #37965

Merged
merged 5 commits into from
Dec 17, 2020

Conversation

jbrockmendel
Copy link
Member

The fix to keep resample working is a kludge I'd like to make unnecessary before merging; cc @mroeschke

@@ -374,7 +374,8 @@ def _set_grouper(self, obj: FrameOrSeries, sort: bool = False):
# possibly sort
if (self.sort or sort) and not ax.is_monotonic:
# use stable sort to support first, last, nth
indexer = self.indexer = ax.argsort(kind="mergesort")
# TODO: why does putting na_position="first" fix datetimelike cases?
indexer = self.indexer = ax.argsort(kind="mergesort", na_position="first")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

   def argsort(
        self,
        ascending: bool = True,
        kind: str = "quicksort",
        na_position: str = "last",

in arrays/base.py so this is strange

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it must involve the fact that on older numpys, dt64.argsort used to put NaT at the beginning instead of the end

@jreback jreback added API - Consistency Internal Consistency of API/Behavior ExtensionArray Extending pandas with custom dtypes or arrays. Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 20, 2020
@jbrockmendel
Copy link
Member Author

@mroeschke will you have time to take a look at this in the not-too-distant future? if not ill mothball, since its stuck in the mud.

@@ -374,7 +374,10 @@ def _set_grouper(self, obj: FrameOrSeries, sort: bool = False):
# possibly sort
if (self.sort or sort) and not ax.is_monotonic:
# use stable sort to support first, last, nth
indexer = self.indexer = ax.argsort(kind="mergesort")
# TODO: why does putting na_position="first" fix datetimelike cases?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not really familiar with the history of NaTs with respect to resample. I would have assumed NaT would/should have been dropped by this point since we just exclude them from the result #13164.

I'm okay with this solution as a stop gap solution.

@jreback
Copy link
Contributor

jreback commented Nov 26, 2020

lgtm. any user facing change?

@jbrockmendel
Copy link
Member Author

lgtm. any user facing change?

Changes DTi/TDI/PI.argsort, that should be it.

@jbrockmendel jbrockmendel marked this pull request as ready for review December 2, 2020 14:51
@jreback jreback added this to the 1.3 milestone Dec 17, 2020
@jreback jreback merged commit bffc7ad into pandas-dev:master Dec 17, 2020
@jbrockmendel jbrockmendel deleted the bug-argsort branch December 17, 2020 15:43
luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API - Consistency Internal Consistency of API/Behavior ExtensionArray Extending pandas with custom dtypes or arrays. Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API/BUG: DatetimeIndex.argsort does not match DatetimeArray.argsort
3 participants