ENH: Add ignore_index for df.sort_values and series.sort_values #30402

charlesdong1991 · 2019-12-22T12:14:23Z

xref API: add ignore_index keyword to .sort_* & .drop_duplicates #30114
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

This is the first step toward closing issue #30114; I will open a separate PR for drop_duplicates.
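For readers skimming the thread, here is a minimal sketch of the behaviour this PR adds, assuming pandas 1.0 or later. The frame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical example data; column names are illustrative only.
df = pd.DataFrame({"a": [3, 1, 2], "b": ["x", "y", "z"]})

# Default behaviour: the original row labels travel with the sorted rows.
kept = df.sort_values("a")

# With ignore_index=True the result is relabelled 0..n-1 after sorting.
reset = df.sort_values("a", ignore_index=True)

# The same keyword is available on Series.sort_values.
ser_reset = df["a"].sort_values(ignore_index=True)

print(kept.index.tolist())   # original labels, reordered
print(reset.index.tolist())  # fresh 0..n-1 labels
```

The values are sorted identically in both cases; only the index labels differ.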

pandas/core/frame.py

WillAyd · 2019-12-23T08:45:32Z

doc/source/whatsnew/v1.0.0.rst

@@ -474,6 +474,7 @@ Other API changes
  Supplying anything else than ``how`` to ``**kwargs`` raised a ``TypeError`` previously (:issue:`29388`)
 - When testing pandas, the new minimum required version of pytest is 5.0.1 (:issue:`29664`)
 - :meth:`Series.str.__iter__` was deprecated and will be removed in future releases (:issue:`28277`).
+- ``ignore_index`` is added in :meth:`DataFrame.sort_values` to be able to get reset index after sorting (:issue:`30114`)


Suggested change

- ``ignore_index`` is added in :meth:`DataFrame.sort_values` to be able to get reset index after sorting (:issue:`30114`)

- ``ignore_index`` is added in :meth:`DataFrame.sort_values` to be able to reset index after sorting (:issue:`30114`)

thanks! will change!

WillAyd · 2019-12-23T18:33:56Z

pandas/core/series.py

@@ -2855,6 +2858,9 @@ def _try_kind_sort(arr):

        result = self._constructor(arr[sortedIdx], index=self.index[sortedIdx])

+        if ignore_index:
+            result = result.reset_index(drop=True)


Is there a reason for this to differ from the frame implementation? It should be the same in both cases.

oh, sorry! I thought the other one was faster in the first place, and then later found basically no difference, and forgot to align them to keep consistency.

Changed! thanks! @WillAyd

ahh!! I recall why they are different: new_data in frame.py is a BlockManager, while in series.py the result is not. I have reverted the change.

use result.index = ibase.default_index(len(indexer))

this is NOT a BlockManager here, but a series

ahh, sorry for my bad English, I meant that new_data in frame.py is a BlockManager (I should have said "there" rather than "here" to be precise 😅), while here it is a Series. @jreback

you will see my comment in the PR about drop_duplicates. We do not want to use .reset_index directly at all. This causes yet another copy of the data which we can easily avoid here.

you have to be careful to avoid mutating things though.
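A rough sketch of the point above: relabel the already-sorted result in place instead of calling `.reset_index(drop=True)`, which builds another object. `pd.RangeIndex` is used here as a public stand-in for the internal `ibase.default_index` helper named in the review; that substitution is an assumption, as is the example data.

```python
import pandas as pd

# Hypothetical input; labels chosen so that reordering is visible.
s = pd.Series([3, 1, 2], index=["a", "b", "c"])

# sort_values already returns a new Series, so relabelling it does not
# touch the caller's object.
result = s.sort_values()

# Assign a fresh 0..n-1 index directly instead of result.reset_index(drop=True).
# pd.RangeIndex stands in for the internal ibase.default_index (assumption).
result.index = pd.RangeIndex(len(result))

print(result.index.tolist())
print(s.index.tolist())  # the original Series keeps its labels
```

The mutation caveat applies to the object being relabelled: assigning `.index` is only safe on a result that is already a fresh object, never on `self`.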

pandas/core/series.py

pandas/core/frame.py

jreback

looks pretty good, some comments.

pandas/core/generic.py

pandas/core/frame.py

jreback · 2019-12-24T16:28:11Z

doc/source/whatsnew/v1.0.0.rst

@@ -474,6 +474,7 @@ Other API changes
  Supplying anything else than ``how`` to ``**kwargs`` raised a ``TypeError`` previously (:issue:`29388`)
 - When testing pandas, the new minimum required version of pytest is 5.0.1 (:issue:`29664`)
 - :meth:`Series.str.__iter__` was deprecated and will be removed in future releases (:issue:`28277`).
+- ``ignore_index`` is added in :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` to be able to reset index after sorting (:issue:`30114`)


reverse the ordering a bit :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` have gained the ignore_index......

changed and moved!

jreback · 2019-12-24T16:28:21Z

doc/source/whatsnew/v1.0.0.rst

@@ -474,6 +474,7 @@ Other API changes
  Supplying anything else than ``how`` to ``**kwargs`` raised a ``TypeError`` previously (:issue:`29388`)
 - When testing pandas, the new minimum required version of pytest is 5.0.1 (:issue:`29664`)
 - :meth:`Series.str.__iter__` was deprecated and will be removed in future releases (:issue:`28277`).
+- ``ignore_index`` is added in :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` to be able to reset index after sorting (:issue:`30114`)




move this to other enhancements

jreback · 2019-12-24T16:30:09Z

pandas/core/series.py

@@ -2855,6 +2858,9 @@ def _try_kind_sort(arr):

        result = self._constructor(arr[sortedIdx], index=self.index[sortedIdx])

+        if ignore_index:
+            result = result.reset_index(drop=True)


use result.index = ibase.default_index(len(indexer))

this is NOT a BlockManager here, but a series

pandas/core/series.py

pandas/tests/frame/methods/test_sort_values.py

pandas/tests/series/methods/test_sort_values.py

WillAyd · 2019-12-26T14:05:44Z

pandas/core/series.py

@@ -2820,7 +2825,7 @@ def _try_kind_sort(arr):
                return arr.argsort(kind="quicksort")

        arr = self._values
-        sortedIdx = np.empty(len(self), dtype=np.int32)
+        sorted_index = np.empty(len(self), dtype=np.int32)


Orthogonal to this, but I wonder why this is specified as np.int32; it might be a bug for input with more than 2 ** 31 - 1 rows

numpy uses int32 for argsort, so this is the best we can do (to be honest, 2B is enough)
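The limit under discussion can be checked directly. This is a small sketch, not part of the PR: it prints the largest value an `np.int32` index buffer can hold and the dtype that `argsort` returns on the local NumPy build, which is worth confirming rather than assuming.

```python
import numpy as np

# Largest row position representable in a signed 32-bit index buffer.
int32_max = np.iinfo(np.int32).max  # 2**31 - 1 == 2_147_483_647

# Small illustrative array; argsort returns the positions that sort it.
values = np.array([30, 10, 20])
order = values.argsort(kind="quicksort")

print(int32_max)
print(order.dtype)  # dtype of the indices on this platform/build
print(order.tolist())
```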

jreback · 2019-12-27T16:12:27Z

thanks @charlesdong1991


charlesdong1991 added 8 commits December 3, 2018 17:43

remove \n from docstring

7e461a1

fix conflicts

1314059

Merge remote-tracking branch 'upstream/master'

8bcb313

Merge remote-tracking branch 'upstream/master' into fix_issue_30114

1f9a4ce

Merge remote-tracking branch 'upstream/master' into fix_issue_30114

5f924c3

Add ignore index for sort values

d0a134f

black reformat

6d52765

add tests

a31797d

gfyoung added the API Design and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Dec 22, 2019

remove type hint to see if test passes

b80f380

WillAyd requested changes Dec 23, 2019

charlesdong1991 added 7 commits December 23, 2019 11:00

code change based on WA review

b997d3f

restore reformat change on other parts

12d1260

revert change

4ff2493

change bool

e9d63f4

remove annotation

70ffec7

remove for series

b4245d7

add ignore_index for series

f9e7ec2

charlesdong1991 changed the title ~~ENH: Add ignore_index for df.sort_values~~ ENH: Add ignore_index for df.sort_values and series.sort_values Dec 23, 2019

WillAyd reviewed Dec 23, 2019

charlesdong1991 added 4 commits December 23, 2019 19:42

keep consistency

d95a89f

revert change

f241e67

resolve conflict

7f9846a

restore change

bbb4754

WillAyd reviewed Dec 24, 2019

pandas/core/series.py

WillAyd reviewed Dec 24, 2019

pandas/core/frame.py

jreback requested changes Dec 24, 2019

charlesdong1991 added 2 commits December 24, 2019 19:35

code change based on WA and JR reviews

3c37eb9

better english

4ce9f43

skip annotation

0f89aa2

charlesdong1991 mentioned this pull request Dec 24, 2019

TYPING: keywords in df.sort_values and sr.sort_values cannot be annotated #30451

Closed

jreback requested changes Dec 26, 2019

pandas/core/series.py

pandas/tests/frame/methods/test_sort_values.py

pandas/tests/series/methods/test_sort_values.py

code change on JR review

d02b651

WillAyd approved these changes Dec 26, 2019

jreback added this to the 1.0 milestone Dec 27, 2019

jreback approved these changes Dec 27, 2019

jreback merged commit f738581 into pandas-dev:master Dec 27, 2019

AlexKirko pushed a commit to AlexKirko/pandas that referenced this pull request Dec 29, 2019

ENH: Add ignore_index for df.sort_values and series.sort_values (pandas-dev#30402)

b4dea2c

charlesdong1991 mentioned this pull request Jan 3, 2020

CLN: Clean tests for *.sort_index, *.sort_values and df.drop_duplicates #30651

Merged

1 task

jorisvandenbossche mentioned this pull request Jun 22, 2020

ENH: add ignore_index argument to DataFrame.explode / Series.explode #34932

Closed
