DOC: Fix #24268 by updating description for keep in Series.nlargest #25358

Merged
merged 11 commits into from
Mar 5, 2019
24 changes: 19 additions & 5 deletions pandas/core/series.py
@@ -3100,11 +3100,18 @@ def nlargest(self, n=5, keep='first'):
When there are duplicate values that cannot all fit in a
Series of `n` elements:

- ``first`` : take the first occurrences based on the index order
- ``last`` : take the last occurrences based on the index order
- ``first`` : return the first `n` occurrences in the
Contributor

@TomAugspurger TomAugspurger Feb 18, 2019

Is "given index order" the correct description? I interpreted
#24268 (comment) as a modification on the result. So below the descriptions of `first` / `last` / `all`, I would note something like:

The `keep` parameter determines which ones to keep when there are duplicates.
Regardless of `keep`, the result will be sorted by the row label.

Contributor Author

I will include this, but when keep="last", the row labels are sorted in reverse order, even in the examples in #24268.
Also, I feel "given index order" is correct because the order is "given" by the user as input when it is not an implicit RangeIndex.
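
A minimal sketch of the ordering being discussed, assuming pandas is installed. The five-element Series of equal values is made up for illustration and is not taken from the PR; it only shows which row labels each `keep` option returns when the ties do not all fit in `n` elements.

```python
import pandas as pd

# Hypothetical data: every value ties, so `keep` alone decides the result.
s = pd.Series([1, 1, 1, 1, 1])

first = s.nlargest(3, keep="first")
last = s.nlargest(3, keep="last")
everything = s.nlargest(3, keep="all")

print(list(first.index))   # row labels in order of appearance
print(list(last.index))    # row labels in reverse order of appearance
print(len(everything))     # may exceed n, because all ties are kept
```

With an implicit RangeIndex this should show labels 0, 1, 2 for `first` and 4, 3, 2 for `last`, which is the reversal described above.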

Member

I understand what you are writing here, but I find it rather hard to read. Any reason we don't use the same terminology as the `unique` docstring?

Uniques are returned in order of appearance

^ obviously accounting for reversal with the last argument

Contributor Author

@WillAyd What do you suggest I do? If it means changing the `unique` docstring, I want to leave this as-is, because I still want to emphasize that the index order is reversed when keep="last".

Member

My suggested wording is "Return the first `n` occurrences in order of appearance" for `first` and then "Return the last `n` occurrences in reverse order of appearance" for `last`.

given index order.
- ``last`` : return the last `n` occurrences in the
reverse of the given index order.
- ``all`` : keep all occurrences. This can result in a Series of
size larger than `n`.

The `keep` parameter determines which ones to keep
Member

Don't need this sentence as it is stated above

when there are duplicates.
Regardless of `keep`, the result will be sorted
bharatr21 marked this conversation as resolved.
by the row label.

Returns
-------
Series
@@ -3196,11 +3203,18 @@ def nsmallest(self, n=5, keep='first'):
When there are duplicate values that cannot all fit in a
Series of `n` elements:

- ``first`` : take the first occurrences based on the index order
- ``last`` : take the last occurrences based on the index order
- ``first`` : return the first `n` occurrences in the
given index order.
- ``last`` : return the last `n` occurrences in the
reverse of the given index order.
- ``all`` : keep all occurrences. This can result in a Series of
size larger than `n`.

The `keep` parameter determines which ones to keep
when there are duplicates.
Regardless of `keep`, the result will be sorted
by the row label.

Returns
-------
Series
@@ -3238,7 +3252,7 @@ def nsmallest(self, n=5, keep='first'):
Monserat 5200
dtype: int64

The `n` largest elements where ``n=5`` by default.
The `n` smallest elements where ``n=5`` by default.
bharatr21 marked this conversation as resolved.

>>> s.nsmallest()
Monserat 5200