Updating `DataFrame.mode` docstring. #22404
pandas/core/frame.py
Outdated
Examples
--------
>>> df = pd.DataFrame({'A': [1, 2, 1, 2, 1, 2, 3]})
Is it standard to have a newline after the section title?
No, it's not, good catch. I added it to the TODO list for the validation script in #20298 (it happens very often).
Codecov Report
@@           Coverage Diff           @@
##           master   #22404   +/-   ##
=======================================
  Coverage   92.18%   92.18%
=======================================
  Files         169      169
  Lines       50820    50820
=======================================
  Hits        46850    46850
  Misses       3970     3970
Continue to review full report at Codecov.
pandas/core/frame.py
Outdated
dropna : boolean, default True
The axis to iterate over while searching for the mode.
To find the mode for each column, use ``axis='index'``.
To find the mode for each row, use ``axis='columns'``.
I like the first line of your summary. However, I would put back the two bullet points.
Agreed, fixed. We should probably also define a standard and be consistent when documenting `axis` (added it to #20298).
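For reference, a minimal sketch of the two `axis` settings discussed in this thread. The frame and its column names are made up for illustration and are not taken from the PR.

```python
import pandas as pd

# Hypothetical data, chosen so every column and every row has a single mode.
df = pd.DataFrame({'a': [1, 1, 2], 'b': [1, 2, 2], 'c': [3, 2, 2]})

# axis='index' iterates down the index, giving the mode of each column.
by_column = df.mode(axis='index')

# axis='columns' iterates across the columns, giving the mode of each row.
by_row = df.mode(axis='columns')

print(by_column)
print(by_row)
```

With this data, `by_column` has one row holding the mode of `a`, `b` and `c`, while `by_row` has one column holding the mode of each original row.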
pandas/core/frame.py
Outdated
spider arthropod 8 0.0
ostrich bird 2 NaN

By default, missing values are not considered, and the mode of winds
Think you mean to say `wings` here and not `winds`; side note - ostriches do still have wings :-)
`NaN` means unknown in this case, not zero, and I need missing values to show the behavior. But yeah, agree. ;)
pandas/core/frame.py
Outdated
ostrich bird 2 NaN

By default, missing values are not considered, and the mode of winds
are both 0 and 2. The second row of species and legs contains NaN,
Double backticks for `NaN`?
pandas/core/frame.py
Outdated
Notes
-----
Every column or row of the resulting DataFrame contains all its modes.
May just be me but I don't understand the Notes section - is this necessary or better explained via examples?
values with the mode in a dataframe ``df``, you can just do this:
``df.fillna(df.mode().iloc[0])``
The mode of a set of values is the value that appears most often.
It can be multiple values.
Can we add an example to show this?
The first example shows this (wings has two modes)
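A rough sketch of the behavior under discussion, for readers following along. Only the `spider` and `ostrich` rows appear in the diff above; the `falcon` and `horse` rows are assumptions added so that `wings` ends up with two modes.

```python
import numpy as np
import pandas as pd

# The falcon and horse rows are assumed for illustration;
# spider and ostrich match the rows quoted in the review.
df = pd.DataFrame([('bird', 2, 2),
                   ('mammal', 4, np.nan),
                   ('arthropod', 8, 0),
                   ('bird', 2, np.nan)],
                  index=('falcon', 'horse', 'spider', 'ostrich'),
                  columns=('species', 'legs', 'wings'))

# Missing values are ignored by default, so wings has two modes (0.0 and 2.0).
# species and legs have a single mode each and are padded with NaN.
modes = df.mode()

# With dropna=False, NaN counts as a value and becomes the only mode of wings.
modes_with_nan = df.mode(dropna=False)

# The idiom quoted from the docstring: fill gaps with the first mode per column.
filled = df.fillna(df.mode().iloc[0])

print(modes)
print(modes_with_nan)
print(filled)
```

Taking `.iloc[0]` picks the first (smallest) mode of each column, which is why the missing `wings` values are filled with 0 rather than 2 here.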
addressed changes from code review.
Hello @datapythonista! Thanks for updating the PR.
@WillAyd can you take a look and merge this if the comments from your review are addressed correctly? Thanks!
Supersedes #20241 (source branch does not exist anymore).