Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: update the DataFrame.reindex_like docstring #22775

Merged
merged 19 commits into from
Nov 26, 2018

Conversation

math-and-data
Copy link
Contributor

  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single
  • The validation script passes: scripts/validate_docstrings.py
    Errors due to the fact that this method .reindex_like()can use the same input parameters as .reindex() (I avoided copy/paste and referred to the other method instead)
        Errors in parameters section
                Parameter "other" has no description
                Parameter "method" has no description
                Parameter "copy" has no description

@pep8speaks
Copy link

pep8speaks commented Sep 19, 2018

Hello @math-and-data! Thanks for updating the PR.

Comment last updated on November 04, 2018 at 21:16 Hours UTC

@math-and-data math-and-data changed the title Docstring reindex_like DOC: update the DataFrame.reindex_like docstring Sep 19, 2018
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on the docstrings!

Added some first comments.

Can you make sure to run the validation script on all docstrings you edited? (not only for reindex_like)

pandas/core/generic.py Outdated Show resolved Hide resolved
pandas/core/generic.py Show resolved Hide resolved

Returns
-------
reindexed : same as input
Object
same object type as input, but with changed indices on each axis
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave out the "Object" line, and directly put the explanation on that line.

And can you use Capital first character and end point punctuation?

pandas/core/generic.py Show resolved Hide resolved
pandas/core/generic.py Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Sep 20, 2018

Codecov Report

Merging #22775 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22775      +/-   ##
==========================================
+ Coverage   92.29%   92.29%   +<.01%     
==========================================
  Files         161      161              
  Lines       51500    51501       +1     
==========================================
+ Hits        47531    47534       +3     
+ Misses       3969     3967       -2
Flag Coverage Δ
#multiple 90.69% <100%> (ø) ⬆️
#single 42.42% <66.66%> (+0.11%) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 97.02% <ø> (ø) ⬆️
pandas/core/generic.py 96.84% <100%> (-0.01%) ⬇️
pandas/io/parsers.py 95.36% <0%> (-0.19%) ⬇️
pandas/io/pytables.py 92.3% <0%> (-0.05%) ⬇️
pandas/core/base.py 97.6% <0%> (-0.01%) ⬇️
pandas/core/indexes/multi.py 95.49% <0%> (-0.01%) ⬇️
pandas/core/reshape/pivot.py 96.55% <0%> (ø) ⬆️
pandas/io/json/normalize.py 96.87% <0%> (ø) ⬆️
pandas/io/packers.py 88.08% <0%> (ø) ⬆️
... and 41 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3e01c38...7df1f79. Read the comment docs.

@math-and-data
Copy link
Contributor Author

  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single
  • The validation script passes: scripts/validate_docstrings.py
  • .reindex_like: formatting of bulleted list leads to missing dot
Errors found:
        Errors in parameters section
                Parameter "method" description should finish with "."
  • .reindex: formatting of bulleted list leads to missing dot
Errors found:
        Errors in parameters section
                Parameters {index, columns} not documented
                Unknown parameters {index, columns}
                Parameter "index, columns" description should finish with "."
                Parameter "method" description should finish with "."
  • .set_index, .reset_index:
Docstring for "pandas.DataFrame.reset_index" correct. :)

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, added some comments about formatting

pandas/core/frame.py Show resolved Hide resolved
keys : column label or list of column labels / arrays
keys : str or list of str or array
Column label or list of column labels / arrays that will
form the new index.
drop : boolean, default True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you replace all the boolean by bool? I think the validation script should warn about it now.

--------
DataFrame.set_index : opposite of reset_index
DataFrame.reindex : change to new indices or expand indices
DataFrame.reindex_like : change to same indices as other DataFrame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you capitalize the first letter of each description, and add a period at the end please

Conform the object to the same index on all axes. Optional
filling logic, placing NA/NaN in locations having no value
in the previous index. A new object is produced unless the
new index is equivalent to the current one and copy=False.

Parameters
----------
other : Object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess other can only be Series and DataFrame? Or can we be more specific?

valid
* backfill / bfill: use next valid observation to fill gap
* nearest: use nearest valid observations to fill gap

copy : boolean, default True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, bool instead of boolean

method : string or None
Its row and column indices are used to define the new indices
of this object.
method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove the None as it's already clear with the optional


.. versionadded:: 0.21.0 (list-like tolerance)

Notes
-----
Like calling s.reindex(index=other.index, columns=other.columns,
method=...)
Like calling `.reindex(index=other.index, columns=other.columns, ...)`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you finish with a period.

--------
DataFrame.set_index : set row labels
DataFrame.reset_index : remove row labels or move them to new columns
DataFrame.reindex_like : change to same indices as other DataFrame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, capitalize and finish with periods.

DataFrame.set_index : set row labels
DataFrame.reset_index : remove row labels or move them to new columns
DataFrame.reindex : change to new indices or expand indices
DataFrame.reindex_like : change to same indices as other DataFrame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you capitalize and finish with period these descriptions.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great changes, few more minor things, and I think we're ready to merge. Thanks!

pandas/core/generic.py Outdated Show resolved Hide resolved

Returns
-------
reindexed : same as input
Same object type as input, but with changed indices on each axis.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a line before that with Series or DataFrame. This description should be indented in the next line.

pandas/core/generic.py Outdated Show resolved Hide resolved
pandas/core/generic.py Outdated Show resolved Hide resolved
@@ -3714,27 +3769,27 @@ def reindex(self, *args, **kwargs):
New labels / index to conform to. Preferably an Index object to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the (should be specified using keywords) to the description. As a normal sentence, doesn't need to be in brackets.

DataFrame.set_index : set row labels
DataFrame.reset_index : remove row labels or move them to new columns
DataFrame.reindex : change to new indices or expand indices
DataFrame.reindex_like : change to same indices as other DataFrame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you capitalize and finish with period these descriptions.

DataFrame.set_index : set row labels
DataFrame.reset_index : remove row labels or move them to new columns
DataFrame.reindex : change to new indices or expand indices
DataFrame.reindex_like : change to same indices as other DataFrame

Returns
-------
reindexed : %(klass)s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
reindexed : %(klass)s
%(klass)s

And add a short description of what is being returned in the next line.

datapythonista and others added 6 commits November 4, 2018 21:09
Co-Authored-By: math-and-data <math-and-data@users.noreply.github.com>
Co-Authored-By: math-and-data <math-and-data@users.noreply.github.com>
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, really nice work.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@math-and-data Nice enhancements! I added a few more comments, specifically on the wording of the summary lines.

Also, can you see if you can remove set_index and reindex from

-k"-axes -combine -itertuples -join -nunique -pivot_table -quantile -query -reindex -reindex_axis -replace -round -set_index -stack -to_stata"

?

(so the doctests are run on our CI to make sure the examples are correct)

inplace : boolean, default False
Modify the DataFrame in place (do not create a new object)
verify_integrity : boolean, default False
keys : str or list of str or array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be strict, the keys can be something else as a string ... (column names can also be integers, timestamps, ..)
That also the reason that there was 'label' before.

verify_integrity : boolean, default False
keys : str or list of str or array
Column label or list of column labels / arrays that will
form the new index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this 'arrays' option is really something completely different (it are the actual index values passed as an array, not referring to one of the existing columns), I would put that in a separate sentence.

@@ -3909,43 +3909,59 @@ def shift(self, periods=1, freq=None, axis=0):
def set_index(self, keys, drop=True, append=False, inplace=False,
verify_integrity=False):
"""
An index is created with existing columns.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this a bit confusing as the summary line (because it seems to suggest that an index is returned, since that is created).
I found the previous "Set the DataFrame index using existing columns" a bit clearer. @datapythonista thoughts?

@@ -4037,6 +4049,8 @@ def set_index(self, keys, drop=True, append=False, inplace=False,
def reset_index(self, level=None, drop=False, inplace=False, col_level=0,
col_fill=''):
"""
An existing index is modified.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can also do better here for the summary method, as just from this line "An existing index is modified.", a user won't get much clue of what the method is doing.

But of course, given that the method is doing different things, it might be difficult to have something that is still generally true but less vague as the above ...

Trying to think of better descriptions:

Remove index (level) (but that is also a bit short / cryptic)
Reset index to default index or remove index level

(will think further on it)


.. versionadded:: 0.21.0 (list-like tolerance)

Notes
-----
Like calling s.reindex(index=other.index, columns=other.columns,
method=...)
Like calling ``.reindex(index=other.index,columns=other.columns,...)``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Like calling ``.reindex(index=other.index,columns=other.columns,...)``.
Like calling ``.reindex(index=other.index, columns=other.columns,...)``.

@datapythonista
Copy link
Member

@math-and-data do you have time to make the changes asked in the last review?

@datapythonista
Copy link
Member

@jorisvandenbossche addressed the comments from your code review. Let me know if there is anything else that needs to be changed.


Reset the index of the DataFrame, and use the default one instead.
If the DataFrame has a MultiIndex, this method can remove one or more
of them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think levels would be clearer than of them

Return an object with matching indices as other object.

Conform the object to the same index on all axes. Optional
filling logic, placing NA/NaN in locations having no value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just say NaN


Returns
-------
reindexed : same as input
Object with input data type, but with changed indices on each axis.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace everything before the comma with Same type as caller


Examples
--------
>>> df_weather_station_1 = pd.DataFrame([[24.3, 75.7, 'high'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the _weather_station bit adds any value; would be OK to just use df1, df2, etc...

@datapythonista
Copy link
Member

@WillAyd made the changes, can you take a look?

@WillAyd WillAyd merged commit ede0dae into pandas-dev:master Nov 26, 2018
@WillAyd
Copy link
Member

WillAyd commented Nov 26, 2018

Thanks @math-and-data and @datapythonista !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants