DOC: Fix double blank lines in Series docstrings #23632

Closed
datapythonista opened this issue Nov 11, 2018 · 9 comments · Fixed by #23698
Labels
CI Continuous Integration Code Style Code style, linting, code_checks Docs good first issue

Comments

@datapythonista
Member

In the docstrings, we use single blank lines to separate sections (or inside sections). For example:

Notes
-----
This is a notes section.

Examples
--------
This is the examples section, there is just a blank line between the
end of the notes section and this one.

>>> print('only one blank line before the previous paragraph and this code')
only one blank line before the previous paragraph and this code

But in some cases two blank lines are incorrectly left:

Notes
-----
This is a notes section.


Examples
--------
Now there are two blank lines before examples.


>>> print('and two blank lines before the code too')
and two blank lines before the code too

This happens fairly often, and we should remove the extra blank lines to keep the format consistent. In this issue we'll fix only the Series docstrings that have this problem.
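One way to script the cleanup, as a rough sketch rather than anything the pandas tooling provides: the hypothetical helper `collapse_blank_lines` below keeps the first blank line of every run and drops the rest. It works on plain text only, so it assumes the docstring has already been located in the source, and it does not remove a single trailing blank line.

```python
def collapse_blank_lines(doc: str) -> str:
    """Collapse each run of blank lines into one blank line.

    Hypothetical helper for illustration; not part of pandas.
    """
    kept = []
    for line in doc.split("\n"):
        # Skip a blank line when the previously kept line is blank too.
        if not line.strip() and kept and not kept[-1].strip():
            continue
        kept.append(line)
    return "\n".join(kept)


before = "Notes\n-----\nText.\n\n\nExamples\n--------\nMore.\n"
after = collapse_blank_lines(before)
print(after)
```

Lines that contain only whitespace count as blank here, which matters for indented docstrings.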

To get the list, this can be used (after fixing, the same command shouldn't report anything):

$ ./scripts/validate_docstrings.py --prefix=pandas.Series --errors=GL03
pandas.Series.values: Use only one blank line to separate sections or paragraphs
pandas.Series.xs: Use only one blank line to separate sections or paragraphs
pandas.Series.round: Use only one blank line to separate sections or paragraphs
pandas.Series.product: Use only one blank line to separate sections or paragraphs
pandas.Series.apply: Use only one blank line to separate sections or paragraphs
pandas.Series.rolling: Use only one blank line to separate sections or paragraphs
pandas.Series.corr: Use only one blank line to separate sections or paragraphs
pandas.Series.kurt: Use only one blank line to separate sections or paragraphs
pandas.Series.mad: Use only one blank line to separate sections or paragraphs
pandas.Series.max: Use only one blank line to separate sections or paragraphs
pandas.Series.mean: Use only one blank line to separate sections or paragraphs
pandas.Series.median: Use only one blank line to separate sections or paragraphs
pandas.Series.min: Use only one blank line to separate sections or paragraphs
pandas.Series.prod: Use only one blank line to separate sections or paragraphs
pandas.Series.sem: Use only one blank line to separate sections or paragraphs
pandas.Series.skew: Use only one blank line to separate sections or paragraphs
pandas.Series.std: Use only one blank line to separate sections or paragraphs
pandas.Series.sum: Use only one blank line to separate sections or paragraphs
pandas.Series.var: Use only one blank line to separate sections or paragraphs
pandas.Series.kurtosis: Use only one blank line to separate sections or paragraphs
pandas.Series.compound: Use only one blank line to separate sections or paragraphs
pandas.Series.droplevel: Use only one blank line to separate sections or paragraphs
pandas.Series.rename_axis: Use only one blank line to separate sections or paragraphs
pandas.Series.argmin: Use only one blank line to separate sections or paragraphs
pandas.Series.argmax: Use only one blank line to separate sections or paragraphs
pandas.Series.swaplevel: Use only one blank line to separate sections or paragraphs
pandas.Series.append: Use only one blank line to separate sections or paragraphs
pandas.Series.update: Use only one blank line to separate sections or paragraphs
pandas.Series.tz_localize: Use only one blank line to separate sections or paragraphs
pandas.Series.dt.tz_localize: Use only one blank line to separate sections or paragraphs
pandas.Series.str.findall: Use only one blank line to separate sections or paragraphs
pandas.Series.str.match: Use only one blank line to separate sections or paragraphs
pandas.Series.str.partition: Use only one blank line to separate sections or paragraphs
pandas.Series.str.replace: Use only one blank line to separate sections or paragraphs
pandas.Series.str.rpartition: Use only one blank line to separate sections or paragraphs
pandas.Series.cat: Use only one blank line to separate sections or paragraphs
pandas.Series.hist: Use only one blank line to separate sections or paragraphs
pandas.Series.to_excel: Use only one blank line to separate sections or paragraphs
pandas.Series.to_hdf: Use only one blank line to separate sections or paragraphs
pandas.Series.as_matrix: Use only one blank line to separate sections or paragraphs
pandas.Series.from_array: Use only one blank line to separate sections or paragraphs
pandas.Series.ix: Use only one blank line to separate sections or paragraphs
pandas.Series.ptp: Use only one blank line to separate sections or paragraphs
@datapythonista datapythonista added Docs CI Continuous Integration Code Style Code style, linting, code_checks Effort Low good first issue labels Nov 11, 2018
@douglatornell
Contributor

At least some of the hits reported by ./scripts/validate_docstrings.py --prefix=pandas.Series --errors=GL03 are due to an extra blank line at the end of the docstring rather than double blank lines; for example pandas.Series.values.

@datapythonista
Member Author

datapythonista commented Nov 12, 2018

Thanks for reporting, @douglatornell. That makes sense, since we're actually checking for double line breaks, not double blank lines.

We should probably change the error message to be more descriptive, but in any case, those should be fixed too.
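To illustrate why a trailing blank line is reported too, here is a minimal sketch of a check for two consecutive blank lines. The function name `has_consecutive_blank_lines` is made up for this example, and the actual logic in validate_docstrings.py may differ. The point is only that a blank line followed by the indented line holding the closing quotes looks the same as a double blank line to a check like this.

```python
def has_consecutive_blank_lines(doc: str) -> bool:
    """Return True if two whitespace-only lines appear in a row.

    Illustrative only; an assumption about how such a check could work.
    """
    previous_blank = False
    for line in doc.split("\n"):
        blank = not line.strip()
        if blank and previous_blank:
            return True
        previous_blank = blank
    return False


# Docstring body that ends right after the last section.
clean = """Summary.

    Notes
    -----
    Text.
    """

# Same body with one extra blank line before the closing quotes.
trailing = """Summary.

    Notes
    -----
    Text.

    """

print(has_consecutive_blank_lines(clean))
print(has_consecutive_blank_lines(trailing))
```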

@douglatornell
Contributor

PR #23651 is a start at fixing this issue, but not all of the problems that validate_docstrings.py finds are located in pandas/core/series.py. I haven't managed to understand how the method injection mechanism works well enough to track down the remainder. The output of validate_docstrings.py for me is now:

$ ./scripts/validate_docstrings.py --prefix=pandas.Series --errors=GL03
pandas.Series.xs: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.product: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.rolling: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.kurt: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.mad: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.max: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.mean: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.median: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.min: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.prod: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.sem: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.skew: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.std: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.sum: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.var: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.kurtosis: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.compound: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.droplevel: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.rename_axis: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.argmin: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.argmax: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.tz_localize: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.dt.tz_localize: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.str.findall: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.str.match: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.str.partition: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.str.replace: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.str.rpartition: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.cat: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.hist: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.to_excel: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.to_hdf: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.as_matrix: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.ix: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings
pandas.Series.ptp: Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings

35 messages compared to 43 before c114376.

@datapythonista
Member Author

All the statistical functions (mean, max, ...) reuse the same docstring. If you want to spend a bit of time on that, you should be able to fix 10 docstrings with one change. The ones in Series.str should be easy to find: you just need to search for def str_cat instead of class Series/def cat.
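A toy illustration of why one change can fix many docstrings at once. Everything here (`_SHARED_DOC`, `make_stat_function`) is hypothetical and is not the pandas implementation; it only shows the general pattern of generating several functions from one docstring template, so a stray blank line in the template would show up in every generated docstring.

```python
import math
import statistics

# Hypothetical shared template; fixing it once fixes every function built from it.
_SHARED_DOC = """Return the {desc} of the values.

Parameters
----------
skipna : bool, default True
    Exclude NaN values before computing the result.

Returns
-------
float
"""


def make_stat_function(name, desc, func):
    """Build a statistic function whose docstring comes from the template."""
    def stat(values, skipna=True):
        if skipna:
            values = [v for v in values if not math.isnan(v)]
        return func(values)

    stat.__name__ = name
    stat.__doc__ = _SHARED_DOC.format(desc=desc)
    return stat


mean = make_stat_function("mean", "mean", statistics.mean)
median = make_stat_function("median", "median", statistics.median)

print(mean.__doc__)
```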

We can merge your changes and leave the rest for later. If you want to continue, you can look at the docstring content (on the documentation website, or by calling scripts/validate_docstrings.py pandas.Series.ptp or whatever method) and then grep the whole source for part of that text.
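The grep step can also be done from Python. This is a small sketch under the assumption that the docstring text appears literally in a .py file; the helper name `find_fragment` is invented for the example, and text assembled from templates at import time will not be found this way.

```python
from pathlib import Path


def find_fragment(root, fragment):
    """Return (path, line number) pairs for lines containing `fragment`.

    Hypothetical helper; searches every .py file under `root`.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue  # skip unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if fragment in line:
                hits.append((str(path), lineno))
    return hits
```

For example, `find_fragment("pandas", "some phrase from the docstring")` would list candidate locations to inspect by hand.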

Alternatively, if you add --format=azure, as in ./scripts/validate_docstrings.py --prefix=pandas.Series --errors=GL03 --format=azure, a file name and line number are reported. They won't always be correct, but they will be in some cases, when the code doesn't do tricky things.
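If there are many hits, the reported file names and line numbers can be pulled out with a short script. The sketch below assumes the `##vso[task.logissue ...]` line layout that appears in the output quoted in this thread; the function name `parse_azure_line` and the regular expression are illustrative and not part of the script.

```python
import re

# Pattern based on the azure-format lines quoted in this thread (an assumption).
_AZURE_LINE = re.compile(
    r"^##vso\[task\.logissue type=(?P<type>\w+);"
    r"sourcepath=(?P<path>[^;]+);"
    r"linenumber=(?P<line>\d+);"
    r"code=(?P<code>\w+);\]"
    r"(?P<message>.*)$"
)


def parse_azure_line(line):
    """Return a dict with path, line, code and message, or None if no match."""
    match = _AZURE_LINE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    return info


sample = (
    "##vso[task.logissue type=error;sourcepath=pandas/util/_decorators.py;"
    "linenumber=1645;code=GL03;]pandas.Series.argmin: "
    "Use only one blank line to separate sections or paragraphs"
)
info = parse_azure_line(sample)
print(info["path"], info["line"], info["code"])
```

Lines in the default output format do not match the pattern and yield `None`, so the same loop can be run over mixed output.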

@douglatornell
Contributor

I'm planning to sprint again tomorrow in Toronto, so I will see if I can advance this further. Thanks for the suggestions on how to track down the docstrings that aren't defined in pandas.Series. We'll see how it goes...

@datapythonista
Member Author

OK, cool. Let's use a new branch and open another PR, so we don't need to review today's changes again.

@douglatornell
Contributor

I've gotten the list of errors down to 2:

$ ./scripts/validate_docstrings.py --prefix=pandas.Series --errors=GL03 --format=azure
##vso[task.logissue type=error;sourcepath=pandas/util/_decorators.py;linenumber=1645;code=GL03;]pandas.Series.argmin: Use only one blank line to separate sections or paragraphs
##vso[task.logissue type=error;sourcepath=pandas/util/_decorators.py;linenumber=1714;code=GL03;]pandas.Series.argmax: Use only one blank line to separate sections or paragraphs

I think the issue for them is due to the deprecation warnings that are included in their docstrings. I'll see if I can sort that out in the next few days, and then make that and what I did today a new PR.

@datapythonista
Member Author

Thanks @douglatornell, don't worry about those. It's not worth spending time on fixing docstrings that will be removed in the next couple of months. We're making changes in #23650 so that they are ignored by the validation. So you can open the new PR with what you've got.

Thanks for taking care of this!

douglatornell added a commit to douglatornell/pandas that referenced this issue Nov 14, 2018
douglatornell added a commit to douglatornell/pandas that referenced this issue Nov 14, 2018
@douglatornell
Contributor

Done! Thanks for your help on this @datapythonista.

Sorry about the 2 commits above referencing this issue. I messed up on branching. PR #23698 is a clean, single commit 864720a.
