DOC: update the DataFrame.mode method docstring #20241
Changes from 4 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,8 +12,8 @@ | |
# pylint: disable=E1101,E1103 | ||
# pylint: disable=W0212,W0231,W0703,W0622 | ||
|
||
import functools | ||
import collections | ||
import functools | ||
import itertools | ||
import sys | ||
import types | ||
|
@@ -111,9 +111,9 @@ | |
by : str or list of str | ||
Name or list of names to sort by. | ||
|
||
- if `axis` is 0 or `'index'` then `by` may contain index | ||
- if ``axis`` is 0 or ``'index'`` then `by` may contain index | ||
levels and/or column labels | ||
- if `axis` is 1 or `'columns'` then `by` may contain column | ||
- if ``axis`` is 1 or ``'columns'`` then `by` may contain column | ||
levels and/or index labels | ||
|
||
.. versionchanged:: 0.23.0 | ||
|
@@ -5873,35 +5873,84 @@ def _get_agg_axis(self, axis_num): | |
|
||
def mode(self, axis=0, numeric_only=False): | ||
""" | ||
Gets the mode(s) of each element along the axis selected. Adds a row | ||
for each mode per label, fills in gaps with nan. | ||
|
||
Note that there could be multiple values returned for the selected | ||
axis (when more than one item share the maximum frequency), which is | ||
the reason why a dataframe is returned. If you want to impute missing | ||
values with the mode in a dataframe ``df``, you can just do this: | ||
``df.fillna(df.mode().iloc[0])`` | ||
Get the mode(s) of each element along the axis selected. | ||
|
||
Adds a row for each mode per label, filling gaps with NaN. | ||
Review comment: can add a line explaining what mode is
||
|
||
Parameters | ||
---------- | ||
axis : {0 or 'index', 1 or 'columns'}, default 0 | ||
* 0 or 'index' : get mode of each column | ||
* 1 or 'columns' : get mode of each row | ||
The axis to iterate over while searching for the mode. | ||
To find the mode for each column, iterate over rows (``axis=0``, | ||
default behaviour). | ||
Review comment: you don't need the parens, it doesn't read well
||
To find the mode for each row, iterate over columns (``axis=1``). | ||
numeric_only : boolean, default False | ||
if True, only apply to numeric columns | ||
If True, only apply to numeric dimensions. | ||
|
||
Returns | ||
------- | ||
modes : DataFrame (sorted) | ||
A DataFrame containing the modes. | ||
If ``axis=0``, there will be one column per column in the original | ||
DataFrame, with as many rows as there are modes. | ||
If ``axis=1``, there will be one row per row in the original | ||
DataFrame, with as many columns as there are modes. | ||
|
||
Notes | ||
----- | ||
There may be multiple values returned for the selected | ||
axis (when more than one item share the maximum frequency), which is | ||
Review comment: no parens, just use a comma. Capitalize DataFrame. You can leave off the impute sentence
||
the reason why a dataframe is returned. If you want to impute missing | ||
values with the mode in a dataframe ``df``, you can just do this: | ||
``df.fillna(df.mode().iloc[0])``. | ||
|
||
See Also | ||
-------- | ||
Series.mode : Return the highest frequency value in a Series. | ||
Series.value_counts : Returns a Series with all occuring values as | ||
Review comment: counts of values
||
indices and the number of occurences as values. | ||
|
||
Examples | ||
-------- | ||
>>> df = pd.DataFrame({'A': [1, 2, 1, 2, 1, 2, 3]}) | ||
>>> df.mode() | ||
A | ||
0 1 | ||
1 2 | ||
""" | ||
|
||
``mode`` returns a DataFrame with multiple rows if there is more than | ||
Review comment: you don't need this sentence as it's above
||
one mode. Missing entries are imputed with NaN. | ||
|
||
>>> grades = pd.DataFrame({ | ||
... 'Science': [80, 70, 80, 75, 80, 75, 85, 90, 80, 70], | ||
... 'Math': [70, 70, 75, 75, 80, 80, 85, 85, 90, 90] | ||
... }) | ||
>>> grades.apply(lambda x: x.value_counts()) | ||
Science Math | ||
70 2 2 | ||
75 2 2 | ||
Review comment: put the mode example first
||
80 4 2 | ||
85 1 2 | ||
90 1 2 | ||
>>> grades.mode() | ||
Science Math | ||
0 80.0 70 | ||
1 NaN 75 | ||
2 NaN 80 | ||
3 NaN 85 | ||
4 NaN 90 | ||
|
||
Use ``axis=1`` to apply mode over columns (get the mode of each row). | ||
|
||
>>> student_grades = pd.DataFrame.from_dict({ | ||
... 'Alice': [80, 85, 90, 85, 95], | ||
... 'Bob': [70, 80, 80, 75, 90] | ||
... }, 'index') | ||
>>> student_grades | ||
0 1 2 3 4 | ||
Alice 80 85 90 85 95 | ||
Bob 70 80 80 75 90 | ||
>>> student_grades.mode(axis=1) | ||
0 | ||
Alice 85 | ||
Bob 80 | ||
""" | ||
|
||
data = self if not numeric_only else self._get_numeric_data() | ||
|
||
def f(s): | ||
|
Review comment: try not to change unrelated things
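
For readers following the review, the behaviour the revised docstring describes (one column of sorted modes per input column, with shorter columns padded with NaN) can be sketched in plain Python. This is only an illustration under stated assumptions, not how pandas implements `DataFrame.mode`: `column_modes` is a hypothetical helper, the table is a plain dict of lists, and missing values are assumed to be float NaN.

```python
import math
from collections import Counter


def column_modes(table):
    """Return the sorted mode(s) of each column, padded with NaN to equal length.

    Hypothetical helper for illustration; `table` maps column labels to lists.
    """
    modes = {}
    for label, values in table.items():
        # Assumption: a float NaN marks a missing value and is not counted.
        counts = Counter(
            v for v in values if not (isinstance(v, float) and math.isnan(v))
        )
        top = max(counts.values(), default=0)
        # Every value that reaches the maximum frequency is a mode.
        modes[label] = sorted(v for v, c in counts.items() if c == top)
    # Pad shorter columns with NaN so all columns have the same number of rows.
    height = max((len(m) for m in modes.values()), default=0)
    return {
        label: m + [math.nan] * (height - len(m)) for label, m in modes.items()
    }


# Same data as the `grades` example in the docstring under review.
grades = {
    'Science': [80, 70, 80, 75, 80, 75, 85, 90, 80, 70],
    'Math': [70, 70, 75, 75, 80, 80, 85, 85, 90, 90],
}
result = column_modes(grades)
print(result['Math'])     # five modes, since every Math grade appears twice
print(result['Science'])  # one mode (80) followed by NaN padding
```

With this data, `Math` yields five tied modes while `Science` has a single mode, so `Science` is padded with four NaN entries, mirroring the `grades.mode()` output shown in the docstring.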