Updating DataFrame.mode docstring. #22404

Merged 12 commits on Sep 30, 2018

datapythonista
Member

Supersedes #20241 (source branch does not exist anymore).


Examples
--------
>>> df = pd.DataFrame({'A': [1, 2, 1, 2, 1, 2, 3]})

Member

Is it standard to have a newline after the section title?

Member Author

No, it's not, good catch. I added it to the TODO list for the validation script in #20298 (it happens very often).

codecov bot commented Aug 18, 2018

Codecov Report

Merging #22404 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #22404   +/-   ##
=======================================
  Coverage   92.18%   92.18%           
=======================================
  Files         169      169           
  Lines       50820    50820           
=======================================
  Hits        46850    46850           
  Misses       3970     3970
Flag         Coverage Δ
#multiple    90.6% <ø> (ø) ⬆️
#single      42.38% <ø> (ø) ⬆️

Impacted Files          Coverage Δ
pandas/core/frame.py    97.2% <ø> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 27de8e6...83858ff.

dropna : boolean, default True
The axis to iterate over while searching for the mode.
To find the mode for each column, use ``axis='index'``.
To find the mode for each row, use ``axis='columns'``.
Member

I like the first line of your summary. However, I would put back the two bullet points.

Member Author

Agree, fixed. And we should probably define a standard and be consistent when documenting axis (added it to #20298)
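
A minimal sketch of the two `axis` options discussed above. The frame below is hypothetical (it is not taken from the PR) and was chosen so that every column and every row has a single mode:

```python
import pandas as pd

# Hypothetical data, chosen so each column and each row has exactly one mode.
df = pd.DataFrame({"a": [1, 1, 2], "b": [1, 2, 2], "c": [3, 2, 3]})

# axis='index' (the default): search down each column.
per_column = df.mode(axis="index")
# One result row: a -> 1, b -> 2, c -> 3

# axis='columns': search across each row.
per_row = df.mode(axis="columns")
# One result column: row 0 -> 1, row 1 -> 2, row 2 -> 2

print(per_column)
print(per_row)
```

With `axis='index'` the result keeps the original column labels. With `axis='columns'` it keeps the original row labels and numbers the mode positions as columns.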

spider arthropod 8 0.0
ostrich bird 2 NaN

By default, missing values are not considered, and the mode of winds
Member

Think you mean to say wings here and not winds; side note - ostriches do still have wings :-)

Member Author

NaN means unknown in this case, not zero, and I need missing values to show the behavior. But yeah, agree. ;)

ostrich bird 2 NaN

By default, missing values are not considered, and the mode of winds
are both 0 and 2. The second row of species and legs contains NaN,
Member

Double backticks for NaN?
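
The missing-value behavior described in the excerpt above can be sketched as follows. Only the spider and ostrich rows appear in the diff excerpt. The falcon and horse rows are assumptions added to complete the frame, and `dropna` requires a pandas version that supports it:

```python
import numpy as np
import pandas as pd

# The spider and ostrich rows come from the excerpt.
# The falcon and horse rows are assumed for illustration.
df = pd.DataFrame(
    [
        ("bird", 2, 2),
        ("mammal", 4, np.nan),
        ("arthropod", 8, 0),
        ("bird", 2, np.nan),
    ],
    index=("falcon", "horse", "spider", "ostrich"),
    columns=("species", "legs", "wings"),
)

# Default (dropna=True): NaN is ignored, so wings has two modes (0.0 and 2.0).
# species and legs have one mode each, so their second row is padded with NaN.
default_modes = df.mode()

# dropna=False: NaN counts as a value and becomes the mode of wings.
modes_with_nan = df.mode(dropna=False)

print(default_modes)
print(modes_with_nan)
```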


Notes
-----
Every column or row of the resulting DataFrame contains all its modes.
Member

May just be me but I don't understand the Notes section - is this necessary or better explained via examples?

values with the mode in a dataframe ``df``, you can just do this:
``df.fillna(df.mode().iloc[0])``
The mode of a set of values is the value that appears most often.
It can be multiple values.
Member

Can we add an example to show this?

Member Author

The first example shows this (wings has two modes)
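
As a rough illustration of the `df.fillna(df.mode().iloc[0])` idiom quoted above, here is a sketch on a hypothetical frame (the column names and values are assumptions). Column `x` has two modes, so `iloc[0]` picks the first one for each column:

```python
import numpy as np
import pandas as pd

# Hypothetical data: x has two modes (1.0 and 2.0), y has a single mode ("b").
df = pd.DataFrame(
    {
        "x": [1, 1, 2, 2, np.nan],
        "y": ["a", "b", "b", None, "b"],
    }
)

# Each column of the result lists all modes of that column.
# Columns with fewer modes are padded with NaN.
modes = df.mode()

# Take the first mode of every column and use it to fill missing values.
first_modes = modes.iloc[0]
filled = df.fillna(first_modes)

print(modes)
print(filled)
```

Taking `iloc[0]` is a design choice: when a column has several modes, it selects the first row of the result, which holds the smallest mode because the modes are returned in sorted order.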

Member Author

@datapythonista left a comment

addressed changes from code review.

@pep8speaks

Hello @datapythonista! Thanks for updating the PR.

@datapythonista
Member Author

@WillAyd can you take a look and merge this if the comments from your review have been addressed correctly? Thanks!

@WillAyd WillAyd merged commit f849134 into pandas-dev:master Sep 30, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018