DOC: Formatting in Series.str.extractall #22565

lucadonini96 · 2018-09-01T12:34:03Z

In Series.str.extractall, corrected the formatting in the return value and added a period at the end of the parameter descriptions. I can also clarify the descriptions if that is useful.

closes #xxxx
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

################################################################################
################### Docstring (pandas.Series.str.extractall) ###################
################################################################################

For each subject string in the Series, extract groups from all
matches of regular expression pat. When each subject string in the
Series has exactly one match, extractall(pat).xs(0, level='match')
is the same as extract(pat).

.. versionadded:: 0.18.0

Parameters
----------
pat : str
    Regular expression pattern with capturing groups.
flags : int, default 0 (no flags)
    re module flags, e.g. re.IGNORECASE.

Returns
-------
DataFrame
    A DataFrame with one row for each match, and one column for each
    group. Its rows have a MultiIndex with first levels that come from
    the subject Series. The last level is named 'match' and indexes the
    matches in each item of the Series. Any capture group names in regular
    expression pat will be used for column names; otherwise capture
    group numbers will be used.

See Also
--------
extract : returns first match only (not all matches)

Examples
--------
A pattern with one group will return a DataFrame with one column.
Indices with no matches will not appear in the result.

>>> s = pd.Series(["a1a2", "b1", "c1"], index=["A", "B", "C"])
>>> s.str.extractall(r"[ab](\d)")
         0
  match
A 0      1
  1      2
B 0      1

Capture group names are used for column names of the result.

>>> s.str.extractall(r"[ab](?P<digit>\d)")
        digit
  match
A 0         1
  1         2
B 0         1

A pattern with two groups will return a DataFrame with two columns.

>>> s.str.extractall(r"(?P<letter>[ab])(?P<digit>\d)")
        letter digit
  match
A 0          a     1
  1          a     2
B 0          b     1

Optional groups that do not match are NaN in the result.

>>> s.str.extractall(r"(?P<letter>[ab])?(?P<digit>\d)")
        letter digit
  match
A 0          a     1
  1          a     2
B 0          b     1
C 0        NaN     1
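
Outside the docstring itself, the equivalence stated in its summary (extractall(pat).xs(0, level='match') versus extract(pat) when every subject string has exactly one match) can be checked with a minimal sketch. The Series contents and the pattern below are illustrative assumptions, not taken from the PR.

```python
import pandas as pd

# Illustrative data: every string contains exactly one match of the pattern.
s = pd.Series(["a1", "b2", "c3"], index=["A", "B", "C"])
pat = r"(?P<letter>[a-c])(?P<digit>\d)"

all_matches = s.str.extractall(pat)  # rows indexed by (original index, match)
first_match = s.str.extract(pat)     # one row per subject string

# Selecting match number 0 drops the 'match' level, leaving the original index.
selected = all_matches.xs(0, level="match")

# Compare values and index labels of the two results.
same_values = selected.to_numpy().tolist() == first_match.to_numpy().tolist()
same_index = list(selected.index) == list(first_match.index)
print(same_values, same_index)
```

If any string had zero matches or several, the two results would differ: extract keeps a row of NaN for a non-matching string, while extractall omits it.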

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
	Use only one blank line to separate sections or paragraphs
	Errors in parameters section
		Parameter "flags" description should start with a capital letter
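
Since the flagged description concerns the flags parameter, a short sketch of what it does may help when rewording it. The example assumes the standard re.IGNORECASE flag mentioned in the docstring; the Series contents are made up for illustration.

```python
import re

import pandas as pd

# Illustrative data mixing lowercase and uppercase letters.
s = pd.Series(["a1A2", "B1", "c1"], index=["A", "B", "C"])

# Without flags, [ab] only matches lowercase "a" and "b".
case_sensitive = s.str.extractall(r"[ab](\d)")

# With re.IGNORECASE, [ab] also matches uppercase "A" and "B".
case_insensitive = s.str.extractall(r"[ab](\d)", flags=re.IGNORECASE)

print(len(case_sensitive), len(case_insensitive))  # 1 3
```

The flags value is passed through to the re module, so flags can be combined with the bitwise OR operator, as in `re.IGNORECASE | re.MULTILINE`.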


datapythonista · 2018-09-01T13:37:56Z

Can you check comments in #22562 and see if they make sense for the changes here too?

pep8speaks · 2018-09-01T14:12:50Z

Hello @lucadonini96! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on September 01, 2018 at 14:15 Hours UTC

codecov · 2018-09-01T14:12:56Z

Codecov Report

Merging #22565 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #22565   +/-   ##
=======================================
  Coverage   92.04%   92.04%           
=======================================
  Files         169      169           
  Lines       50787    50787           
=======================================
  Hits        46745    46745           
  Misses       4042     4042

Flag	Coverage Δ
#multiple	`90.45% <ø> (ø)`	⬆️
#single	`42.29% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/core/strings.py	`98.63% <ø> (ø)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 98fb53c...c964407.

datapythonista

Good changes. I just added a couple of minor comments.

datapythonista · 2018-09-02T16:24:49Z

pandas/core/strings.py

@@ -1000,7 +1004,6 @@ def str_extractall(arr, pat, flags=0):
      1          a     2
    B 0          b     1
    C 0        NaN     1
-
    """


Can you remove this blank line at the end, before the closing quotes?

datapythonista · 2018-09-02T16:25:36Z

pandas/core/strings.py

+        A ``DataFrame`` with one row for each match, and one column for each
+        group. Its rows have a ``MultiIndex`` with first levels that come from
+        the subject ``Series``. The last level is named 'match' and indexes the
+        matches in each item of the ``Series``. Any capture group names in


I think it is better to use backticks for DataFrame and similar names, as in my opinion they are more "links to other pages" than "code".

I meant single backticks in the previous comment, instead of double backticks.
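
The two review requests (single backticks around object names, and no blank line before the closing quotes) could look roughly like the following. This is a hypothetical stub written for illustration, not the actual code in pandas/core/strings.py, and it assumes the documentation build treats single backticks as cross-references, as the comment above describes.

```python
def extractall_stub(pat, flags=0):
    """
    Hypothetical stub that only illustrates the requested formatting.

    Returns
    -------
    DataFrame
        A `DataFrame` with one row for each match. Its rows have a
        `MultiIndex` whose last level is named 'match'.
    """


doc = extractall_stub.__doc__

# Check that no double backticks remain in the docstring text.
has_double_backticks = "``" in doc
print(has_double_backticks)  # False
```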

jreback · 2018-09-18T12:27:59Z

@datapythonista

Corrected the formatting in the return value and added a period at the end of the parameter descriptions. Can also clarify descriptions if useful.

3f92a69

datapythonista added the Docs and Strings (String extension data type and string data) labels on Sep 1, 2018

Improved description of "flags" argument and general formatting.

ba1995e

pep8

c964407

datapythonista reviewed Sep 2, 2018


datapythonista merged commit c994e80 into pandas-dev:master Sep 18, 2018

aeltanawy pushed a commit to aeltanawy/pandas that referenced this pull request Sep 20, 2018

DOC: Updating the docstring of Series.str.extractall (pandas-dev#22565)

d03ef77

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018

DOC: Updating the docstring of Series.str.extractall (pandas-dev#22565)

ce3e7fd
