DOC: update the DataFrame.reindex_like docstring #22775

Merged
merged 19 commits into from
Nov 26, 2018
Changes from 16 commits
81 changes: 51 additions & 30 deletions pandas/core/frame.py
@@ -3909,43 +3909,59 @@ def shift(self, periods=1, freq=None, axis=0):
def set_index(self, keys, drop=True, append=False, inplace=False,
verify_integrity=False):
"""
An index is created with existing columns.
Review comment (Member):

I find this a bit confusing as a summary line, because "An index is created" seems to suggest that an index is what gets returned.
I found the previous "Set the DataFrame index using existing columns" a bit clearer. @datapythonista thoughts?
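To make the concern concrete, here is a minimal sketch (reusing the example frame from the docstring; the variable names are illustrative only) showing that `set_index` hands back a DataFrame rather than an index, and that the original frame is untouched with the default `inplace=False`:

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})

result = df.set_index('month')

# The return value is a DataFrame whose index was built from 'month'.
returns_frame = isinstance(result, pd.DataFrame)
result_index_name = result.index.name
result_columns = list(result.columns)

# With the default inplace=False, the original frame keeps 'month' as a column.
original_columns = list(df.columns)
```

This is presumably why a summary phrased around "setting" the index reads more naturally than one phrased around "creating" an index.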


Set the DataFrame index (row labels) using one or more existing
columns. By default yields a new object.
columns. The index can replace the existing index or expand on it.

Parameters
----------
keys : column label or list of column labels / arrays
drop : boolean, default True
Delete columns to be used as the new index
append : boolean, default False
Whether to append columns to existing index
inplace : boolean, default False
Modify the DataFrame in place (do not create a new object)
verify_integrity : boolean, default False
keys : str or list of str or array
Review comment (Member):

To be strict, the keys can be something other than a string (column names can also be integers, timestamps, ...).
That is also the reason 'label' was used before.
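A small sketch of the point about non-string labels, assuming a toy frame (not from the PR) whose column labels are an integer and a `Timestamp`:

```python
import pandas as pd

# Column labels here are an integer and a Timestamp, not strings.
ts = pd.Timestamp('2018-01-01')
df = pd.DataFrame({0: [10, 20, 30], ts: ['a', 'b', 'c']})

# Both labels are accepted as keys, so "str" is too narrow a type description.
by_int = df.set_index(0)
by_ts = df.set_index(ts)

int_index_values = by_int.index.tolist()
ts_index_values = by_ts.index.tolist()
```

Here the index names end up being `0` and the `Timestamp` respectively, which is why "label" describes the accepted type more accurately than "str".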

Column label or list of column labels / arrays that will
form the new index.
Review comment (Member):

Since this 'arrays' option is really something completely different (the arrays are the actual index values, not references to existing columns), I would put that in a separate sentence.
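The difference between the two kinds of key can be sketched as follows; the `quarters` array is a made-up example, not something from the PR:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10], 'sale': [55, 40, 84, 31]})

# A column label: the existing 'month' column becomes the index and,
# with the default drop=True, is removed from the columns.
by_label = df.set_index('month')

# An array: its values become the index directly. No existing column
# is looked up or consumed.
quarters = np.array(['Q1', 'Q2', 'Q3', 'Q4'])
by_array = df.set_index([quarters])

label_columns = list(by_label.columns)
array_columns = list(by_array.columns)
array_index_values = by_array.index.tolist()
```

Because the array case leaves all columns in place, describing it in its own sentence avoids implying that the array names existing columns.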

drop : bool, default True
Delete columns to be used as the new index.
append : bool, default False
Whether to append columns to existing index.
inplace : bool, default False
Modify the DataFrame in place (do not create a new object).
verify_integrity : bool, default False
Check the new index for duplicates. Otherwise defer the check until
necessary. Setting to False will improve the performance of this
method
method.

Returns
-------
DataFrame
Changed row labels.

See Also
--------
DataFrame.reset_index : Opposite of set_index.
DataFrame.reindex : Change to new indices or expand indices.
DataFrame.reindex_like : Change to same indices as other DataFrame.

Examples
--------
>>> df = pd.DataFrame({'month': [1, 4, 7, 10],
... 'year': [2012, 2014, 2013, 2014],
... 'sale':[55, 40, 84, 31]})
month sale year
0 1 55 2012
1 4 40 2014
2 7 84 2013
3 10 31 2014
... 'sale': [55, 40, 84, 31]})
>>> df
month year sale
0 1 2012 55
1 4 2014 40
2 7 2013 84
3 10 2014 31

Set the index to become the 'month' column:

>>> df.set_index('month')
sale year
year sale
month
1 55 2012
4 40 2014
7 84 2013
10 31 2014
1 2012 55
4 2014 40
7 2013 84
10 2014 31

Create a multi-index using columns 'year' and 'month':

@@ -3966,10 +3982,6 @@ def set_index(self, keys, drop=True, append=False, inplace=False,
2 2014 4 40
3 2013 7 84
4 2014 10 31

Returns
-------
dataframe : DataFrame
"""
inplace = validate_bool_kwarg(inplace, 'inplace')
if not isinstance(keys, list):
@@ -4037,6 +4049,8 @@ def set_index(self, keys, drop=True, append=False, inplace=False,
def reset_index(self, level=None, drop=False, inplace=False, col_level=0,
col_fill=''):
"""
An existing index is modified.
Review comment (Member):

I think we can also do better here for the summary line. From "An existing index is modified." alone, a user won't get much of a clue about what the method does.

Of course, given that the method does several different things, it might be difficult to find something that is still generally true but less vague than the above.

Trying to think of better descriptions:

- Remove index (level) (but that is also a bit short / cryptic)
- Reset index to default index or remove index level

(will think further on it)
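For context on "the method does several different things", here is a hedged sketch of three behaviors a summary line would need to cover, using a two-level index built from the docstring's example data:

```python
import pandas as pd

df = pd.DataFrame({'year': [2012, 2014, 2013, 2014],
                   'month': [1, 4, 7, 10],
                   'sale': [55, 40, 84, 31]}).set_index(['year', 'month'])

# No arguments: every index level moves back into the columns and the
# index becomes the default integer index.
all_reset = df.reset_index()

# level=...: only that level leaves the index and is inserted as a column.
one_level = df.reset_index(level='month')

# drop=True: the index is discarded instead of being inserted as columns.
dropped = df.reset_index(drop=True)

all_reset_columns = list(all_reset.columns)
one_level_columns = list(one_level.columns)
dropped_columns = list(dropped.columns)
```

The second suggestion above ("Reset index to default index or remove index level") covers the first two cases; the `drop=True` case is a variation on the first.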


For DataFrame with multi-level index, return new DataFrame with
labeling information in the columns under the index names, defaulting
to 'level_0', 'level_1', etc. if any are None. For a standard index,
@@ -4047,12 +4061,12 @@ def reset_index(self, level=None, drop=False, inplace=False, col_level=0,
----------
level : int, str, tuple, or list, default None
Only remove the given levels from the index. Removes all levels by
default
drop : boolean, default False
default.
drop : bool, default False
Do not try to insert index into dataframe columns. This resets
the index to the default integer index.
inplace : boolean, default False
Modify the DataFrame in place (do not create a new object)
inplace : bool, default False
Modify the DataFrame in place (do not create a new object).
col_level : int or str, default 0
If the columns have multiple levels, determines which level the
labels are inserted into. By default it is inserted into the first
@@ -4063,7 +4077,14 @@ def reset_index(self, level=None, drop=False, inplace=False, col_level=0,

Returns
-------
resetted : DataFrame
DataFrame
Changed row labels.

See Also
--------
DataFrame.set_index : Opposite of reset_index.
DataFrame.reindex : Change to new indices or expand indices.
DataFrame.reindex_like : Change to same indices as other DataFrame.

Examples
--------