DOC: update the pandas.DataFrame.any docstring #20217

ghost · 2018-03-10T19:03:47Z

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

[+] PR title is "DOC: update the docstring"
[+] The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
[+] The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
[+] The html version looks good: python doc/make.py --single <your-function-or-method>
[+] It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

################################################################################
####################### Docstring (pandas.DataFrame.any) #######################
################################################################################

Return whether any element is True over requested axis.

Unlike :meth:`DataFrame.all`, this performs an *or* operation. If any of the
values along the specified axis is True, this will return True.

Parameters
----------
axis : int, default 0
    Select the axis which can be 0 for indices and 1 for columns.
skipna : boolean, default True
    Exclude NA/null values. If an entire row/column is NA, the result
    will be NA.
level : int or level name, default None
    If the axis is a MultiIndex (hierarchical), count along a
    particular level, collapsing into a Series.
bool_only : boolean, default None
    Include only boolean columns. If None, will attempt to use everything,
    then use only boolean data. Not implemented for Series.
**kwargs : any, default None
    Additional keywords have no effect but might be accepted for
    compatibility with numpy.

Returns
-------
any : Series or DataFrame (if level specified)

See Also
--------
pandas.DataFrame.all : Return whether all elements are True.

Examples
--------
**Series**

For Series input, the output is a scalar indicating whether any element
is True.

>>> pd.Series([True, False]).any()
True

**DataFrame**

Whether each column contains at least one True element (the default).

>>> pd.DataFrame({
...     "A": [1, 2, 3],
...     "B": [4, 5, 6]
... }).any()
A    True
B    True
dtype: bool

Aggregating over the columns.

>>> pd.DataFrame({
...     "A": [True, False, True],
...     "B": [4, 5, 6]
... }).any(axis='columns')
0    True
1    True
2    True
dtype: bool

>>> pd.DataFrame({
...     "A": [True, False, True],
...     "B": [4, 0, 6]
... }).any(axis='columns')
0    True
1    False
2    True
dtype: bool

`any` for an empty DataFrame is an empty Series.

>>> pd.DataFrame([]).any()
Series([], dtype: bool)

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	Use only one blank line to separate sections or paragraphs
	Errors in parameters section
		Parameters {'kwargs'} not documented
		Unknown parameters {'**kwargs'}

If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.

Checklist for other PRs (remove this part if you are doing a PR for the pandas documentation sprint):

closes #xxxx
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

TomAugspurger

Hmm, looks like some extra files were committed. I think we have another PR adding savefig to our gitignore.

Can you remove those files (rm -rf doc/source/savefig) and then update your PR? I think with git rm doc/source/savefig.

Reviewers: we should wait for at least one CI to finish since this is changing parameters passed through the functions making the docstrings.

TomAugspurger · 2018-03-10T19:52:51Z

pandas/core/generic.py

+_any_also = """\
+See Also
+--------
+pandas.DataFrame.all : Return whether all elements are True \


I don't think you need the trailing \ here do you?

To be clear, you do need the one on the first line, just not these.

yes, and the same for the ones below as well

@TomAugspurger if I don't put \, the line becomes longer than 79 characters and it doesn't pass the git diff origin/master -u -- "*.py" | flake8 --diff validation...

We want them, else we get long lines in the text docstring like:

One dimensional boolean pandas.Series is returned. Unlike pandas.DataFrame.all, pandas.DataFrame.any performs OR operation; in other word, if any of the values along the specified axis is True, pandas.DataFrame.any will return True.

@TomAugspurger & @jorisvandenbossche , let's say I removed \ and , then the result of git diff origin/master -u -- "*.py" | flake8 --diff is going to be pandas/core/generic.py:7834:80: E501 line too long (83 > 79 characters), since the line is pandas.DataFrame.all : Return whether all elements are True over requested axis. - I don't want to break the line with \n, instead, I'm using \. There is exactly same reason behind the cases I used \.
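
The trade-off discussed in this thread can be checked in plain Python. Inside a regular (non-raw) triple-quoted string, a backslash at the end of a line joins that line with the next one, so the source line stays short while the resulting docstring line gets longer. A minimal sketch, using the See Also text quoted above (the variable names are illustrative only and do not come from the PR):

```python
# Trailing backslash: the source is wrapped, but the resulting string is one long line.
with_continuation = """\
pandas.DataFrame.all : Return whether all elements are True \
over requested axis."""

# No trailing backslash: the string itself contains a line break.
without_continuation = """\
pandas.DataFrame.all : Return whether all elements are True
over requested axis."""

# The leading backslash after the opening quotes only suppresses the
# empty first line, which is why the reviewers want to keep that one.
print(repr(with_continuation))
print(repr(without_continuation))
print(len(with_continuation) > 79)
```

Wrapping the text without the backslash keeps both the source and the rendered text docstring within the line limit, which appears to be what the reviewers are asking for.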

TomAugspurger · 2018-03-10T19:53:34Z

pandas/core/generic.py

+_any_examples = """\
+Examples
+--------
+By default, any from an empty DataFrame is empty Series::


No double colon, just a .

@TomAugspurger according to this documentation, double colon is required to show code samples.

TomAugspurger · 2018-03-10T19:53:42Z

pandas/core/generic.py

+--------
+By default, any from an empty DataFrame is empty Series::
+
+    >> pd.DataFrame([]).any()


Three >. Doesn't need to be indented.

@TomAugspurger , you mean, without double-colon and three >, is it going to show code samples as required?
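
On the question above: docstring examples are written as doctest prompts, and the standard-library doctest parser only recognises a prompt of three chevrons followed by a space. The sketch below parses (without executing) the two variants from this thread; it uses only the standard library, and the variable names are illustrative.

```python
import doctest

# Variant from the diff under review: literal block with a two-chevron prompt.
two_chevrons = """
By default, any from an empty DataFrame is empty Series::

    >> pd.DataFrame([]).any()
"""

# Variant requested by the reviewer: plain sentence, three-chevron prompt.
three_chevrons = """
`any` for an empty DataFrame is an empty Series.

>>> pd.DataFrame([]).any()
Series([], dtype: bool)
"""

parser = doctest.DocTestParser()

# get_examples only parses the text; nothing here imports or runs pandas.
found_two = parser.get_examples(two_chevrons)
found_three = parser.get_examples(three_chevrons)

print(len(found_two))         # no doctest examples detected
print(len(found_three))       # one doctest example detected
print(found_three[0].source)  # the statement that would be run
```

Because the three-chevron form is already recognised as a doctest block, the trailing double colon and the extra indentation are not needed to mark it as code.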

TomAugspurger · 2018-03-10T19:53:49Z

pandas/core/generic.py

+
+Non-boolean values will always give True::
+
+    >> pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}).any()


TomAugspurger · 2018-03-10T19:53:59Z

pandas/core/generic.py

+
+It is performing OR along the specified axis::
+
+    >> pd.DataFrame({"A": [1, False, 3], "B": [4, 5, 6]}).any(axis=1)


TomAugspurger · 2018-03-10T19:54:03Z

pandas/core/generic.py

+    2    True
+    dtype: bool
+
+    >> pd.DataFrame({"A": [1, False, 3], "B": [4, False, 6]}).any(axis=1)


jorisvandenbossche

Can you also add an example with Series ? (the docstring is shared for both Series and DataFrame)
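
As a rough illustration of why one set of strings serves both classes, the sketch below fills a template from separate description, See Also and Examples pieces. The names _any_desc, _any_also and _any_examples appear in the diff under review; the template _bool_doc_sketch and the helper build_doc are hypothetical and are not taken from pandas/core/generic.py, whose actual mechanism may differ.

```python
# Hypothetical template, for illustration only.
_bool_doc_sketch = """\
%(desc)s

%(see_also)s

%(examples)s
"""

_any_desc = """\
Return whether any element is True over requested axis.

Unlike :meth:`DataFrame.all`, this performs an *or* operation. If any of the
values along the specified axis is True, this will return True."""

_any_also = """\
See Also
--------
pandas.DataFrame.all : Return whether all elements are True."""

_any_examples = """\
Examples
--------
>>> pd.Series([True, False]).any()
True"""


def build_doc(desc, see_also, examples):
    """Assemble one docstring from the shared pieces (hypothetical helper)."""
    return _bool_doc_sketch % {
        "desc": desc,
        "see_also": see_also,
        "examples": examples,
    }


doc = build_doc(_any_desc, _any_also, _any_examples)
print(doc)
```

Since the assembled text would be attached to both Series.any and DataFrame.any, the Examples piece needs to cover both kinds of input.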

jorisvandenbossche · 2018-03-10T20:46:43Z

pandas/core/generic.py

+_any_also = """\
+See Also
+--------
+pandas.DataFrame.all : Return whether all elements are True \


yes, and the same for the ones below as well

jorisvandenbossche · 2018-03-10T20:47:54Z

pandas/core/generic.py

+_any_desc = """\
+Return whether any element is True over requested axis.
+
+One dimensional pandas.Series having boolean values will be returned. \


pandas.Series -> Series

I would also say "boolean Series" instead of "Series having boolean values"

jorisvandenbossche · 2018-03-10T20:48:42Z

pandas/core/generic.py

+One dimensional pandas.Series having boolean values will be returned. \
+Unlike pandas.DataFrame.all, pandas.DataFrame.any performs OR operation; \
+in other word, if any of the values along the specified axis is True, \
+pandas.DataFrame.any will return True."""


Can you also mention here something that for Series the return value is a single boolean value?

jreback · 2018-03-11T14:47:03Z

this needs a rebase now

TomAugspurger · 2018-03-11T15:37:08Z

@smusali I'm doing the rebase / merge. 1 minute.

TomAugspurger · 2018-03-11T15:50:50Z

@smusali fixed the merge conflict. Also made an update to the grammar and examples.

I changed the examples to have consistent types for the columns. In general, having a mix like [4, False, 6] is less common than having all bools or all ints like [4, 0, 6].
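
The semantics shown in the docstring examples can be mimicked with built-in Python truthiness. This is only a plain-Python analogy under that assumption, not how pandas implements DataFrame.any; the dictionary of lists stands in for the DataFrame used in the examples.

```python
# Stand-in for the example DataFrame: column name -> column values.
columns = {
    "A": [True, False, True],
    "B": [4, 0, 6],
}

# Default direction: one result per column.
per_column = {name: any(values) for name, values in columns.items()}

# axis='columns' direction: one result per row, combining across the columns.
per_row = [any(row) for row in zip(*columns.values())]

# A mixed column gives the same answer, because False and 0 are both falsy.
mixed = [any(row) for row in zip([True, False, True], [4, False, 6])]

print(per_column)  # {'A': True, 'B': True}
print(per_row)     # [True, False, True]
print(mixed)       # [True, False, True]
```

The mixed and all-integer columns agree here, so switching the examples to consistent types changes how typical they look, not what they demonstrate.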

ghost · 2018-03-11T16:09:50Z

Done some requested changes and made some fixes - please, review, @TomAugspurger, @jreback and @jorisvandenbossche; thanks in advance!

Split Series and DataFrame Edgecase last in examples. Use axis='columns' Simplify extended description.

TomAugspurger · 2018-03-11T17:00:22Z

################################################################################
####################### Docstring (pandas.DataFrame.any) #######################
################################################################################

Return whether any element is True over requested axis.

Unlike :meth:`DataFrame.all`, this performs an *or* operation. If any of the
values along the specified axis is True, this will return True.

Parameters
----------
axis : int, default 0
    Select the axis which can be 0 for indices and 1 for columns.
skipna : boolean, default True
    Exclude NA/null values. If an entire row/column is NA, the result
    will be NA.
level : int or level name, default None
    If the axis is a MultiIndex (hierarchical), count along a
    particular level, collapsing into a Series.
bool_only : boolean, default None
    Include only boolean columns. If None, will attempt to use everything,
    then use only boolean data. Not implemented for Series.
**kwargs : any, default None
    Additional keywords have no effect but might be accepted for
    compatibility with numpy.

Returns
-------
any : Series or DataFrame (if level specified)

See Also
--------
pandas.DataFrame.all : Return whether all elements are True.

Examples
--------
**Series**

For Series input, the output is a scalar indicating whether any element
is True.

>>> pd.Series([True, False]).any()
True

**DataFrame**

Whether each column contains at least one True element (the default).

>>> pd.DataFrame({
...     "A": [1, 2, 3],
...     "B": [4, 5, 6]
... }).any()
A    True
B    True
dtype: bool

Aggregating over the columns.

>>> pd.DataFrame({
...     "A": [True, False, True],
...     "B": [4, 5, 6]
... }).any(axis='columns')
0    True
1    True
2    True
dtype: bool

>>> pd.DataFrame({
...     "A": [True, False, True],
...     "B": [4, 0, 6]
... }).any(axis='columns')
0    True
1    False
2    True
dtype: bool

`any` for an empty DataFrame is an empty Series.

>>> pd.DataFrame([]).any()
Series([], dtype: bool)

################################################################################
################################## Validation ##################################
################################################################################

ghost · 2018-03-11T17:25:21Z

@TomAugspurger, do you have any more change requests?

TomAugspurger · 2018-03-12T16:52:22Z

Just updated the examples a tad to show the dataframes.

smusali added 2 commits March 10, 2018 13:22

pandas.DataFrame.any

f514443

Updated pandas.core.generic.py...

140b164

TomAugspurger requested changes Mar 10, 2018

jorisvandenbossche reviewed Mar 10, 2018

jorisvandenbossche mentioned this pull request Mar 10, 2018

DOC: update the pandas.DataFrame.all docstring #20216

Merged

5 tasks

jreback added Docs Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Mar 10, 2018

smusali and others added 4 commits March 11, 2018 11:20

Removing extra files...

d6214c1

Requested changes...

0ab88c0

Merge remote-tracking branch 'upstream/master' into smusali-booldoc

e171319

Little update...

74db002

TomAugspurger added 2 commits March 11, 2018 10:47

Updated

f567b78

Merge branch 'booldoc' of https://github.com/smusali/pandas into smus…

39ac181

…ali-booldoc

smusali added 3 commits March 11, 2018 11:54

Added one more example related to Series and improved long desc...

9314366

Fixed conflict and udpated long_desc and example: added Series exampl…

9646f7b

…e...

Fixed some issues...

4579eea

Updated

9b971fd

Split Series and DataFrame Edgecase last in examples. Use axis='columns' Simplify extended description.

Explicit examples

484c409

TomAugspurger merged commit 3bed3eb into pandas-dev:master Mar 12, 2018

jorisvandenbossche mentioned this pull request Mar 14, 2018

DOC: Improved the docstring of Series.any() #20078

Closed

5 tasks

arminv mentioned this pull request Mar 15, 2018

DOC: update the pandas.DataFrame.cummax docstring #20336

Merged

5 tasks
