DEPR: Deprecate str.split return_type #10085

sinhrks · 2015-05-08T21:07:34Z

Closes #9847, Closes #9870. The default is expand=False, for compatibility with return_type='series'. It may be better to change the default to True in the future (and show a warning about it)?

CC @jreback @jorisvandenbossche @sreejata
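For readers skimming the thread, here is a minimal sketch of what the two values of expand do. The sample data is made up for illustration, and the behavior shown is that of a reasonably recent pandas release, not necessarily the exact output of the 0.16.1 branch under review.

```python
import pandas as pd

# Illustrative data only; not taken from the PR.
s = pd.Series(["a_b_c", "d_e", None])

# expand=False (the default): one list per row, the result stays a Series.
as_lists = s.str.split("_")

# expand=True: each piece gets its own column, the result is a DataFrame.
as_frame = s.str.split("_", expand=True)

print(type(as_lists).__name__)
print(type(as_frame).__name__, as_frame.shape)
```

Rows with fewer pieces than the widest row are padded with missing values in the expanded result.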

jreback · 2015-05-08T22:00:37Z

pandas/core/strings.py

@@ -742,10 +735,7 @@ def str_split(arr, pat=None, n=None, return_type='series'):
                n = 0
            regex = re.compile(pat)
            f = lambda x: regex.split(x, maxsplit=n)
-    if return_type == 'frame':
-        res = DataFrame((Series(x) for x in _na_map(f, arr)), index=arr.index)


Isn't this what expand=True does? (And by the way, #10081 can easily be fixed at the same time: just take out the use of Series here.)

Yes, the logic has been moved to _wrap_results_expand.

I think #10081 should be considered separately, because we can't simply use a workaround for all cases.

s = pd.Series([1.1, '2.2'])

# current behavior
s.str.split('.', expand=True)
#      0    1
# 0  NaN  NaN
# 1    2    2

# workaround
pd.read_table(StringIO(s.to_csv(None, index=None)), sep='.')
#    1  1.1
# 0  2    2

# numpy
pd.DataFrame(list(np.core.defchararray.split(s.values.astype(str), '.')))
#    0  1
# 0  1  1
# 1  2  2

mortada · 2015-05-08T23:58:31Z

doc/source/whatsnew/v0.16.1.txt

@@ -221,6 +221,28 @@ enhancements are performed to make string operation easier.
     idx.str.startswith('a')
     s[s.index.str.startswith('a')]

+
+- ``split`` now takes ``expand`` keyword to specify returning dimensionality. ``return_type`` is deprecated. (:issue:`9847`)


"returning dimensionality" should be "expanding dimensionality"

how about "specify whether to expand dimensionality"?

Sounds good!
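As a rough illustration of the deprecation described in the whatsnew entry, the mapping from the old keyword to the new one could look like the sketch below. The helper name resolve_expand, the message text, and the choice of FutureWarning are assumptions made for this example; this is not the code merged in the PR.

```python
import warnings


def resolve_expand(expand=False, return_type=None):
    """Hypothetical helper: translate the deprecated return_type into expand."""
    if return_type is None:
        # New-style call: nothing to translate.
        return expand
    # Old-style call: warn, then map 'series' -> False and 'frame' -> True.
    warnings.warn(
        "return_type is deprecated, use expand instead",
        FutureWarning,  # assumed warning class
        stacklevel=2,
    )
    if return_type not in ("series", "frame"):
        raise ValueError("return_type must be 'series' or 'frame'")
    return return_type == "frame"
```

Keeping the translation in one place means callers that still pass return_type get the same result as before, plus a warning pointing at their own call site.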


sinhrks · 2015-05-09T04:21:40Z

@jreback, @mortada Thanks, updated.

jorisvandenbossche · 2015-05-09T09:19:10Z

Looks good!

Another thing I encountered while reviewing: there is no good explanation of the n kwarg of split in the docstring, and the default differs between str_split and its docstring (default of None) and the signature of split itself (default of -1).

sinhrks · 2015-05-09T12:49:33Z

@jorisvandenbossche The explanation of n was written in the notes, so I moved it to the main part. I also modified the docstring to match str.split without changing the default values. Or would it be better to change the defaults, since None and -1 have the same meaning?

I also found that the current argument names differ from the standard str.split (sep and maxsplit)...
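The None versus -1 question comes from the standard library itself: str.split uses maxsplit=-1 for "no limit", while re.split uses maxsplit=0. A minimal sketch of normalizing the two, using only the standard library, is below. The helper split_one is hypothetical, and treating a single-character pattern as a literal separator is an assumption about the intended behavior, not a quote of the pandas source.

```python
import re


def split_one(text, pat=None, n=-1):
    """Hypothetical per-element split; n of None, 0 or -1 means 'no limit'."""
    no_limit = n is None or n in (0, -1)
    if pat is None or len(pat) == 1:
        # Literal separator (or whitespace): str.split uses -1 for 'no limit'.
        return text.split(pat, -1 if no_limit else n)
    # Longer patterns are treated as regexes: re.split uses 0 for 'no limit'.
    return re.split(pat, text, maxsplit=0 if no_limit else n)


print(split_one("a_b_c", "_"))
print(split_one("a_b_c", "_", n=1))
print(split_one("a--b--c", "-+", n=None))
```

Whichever default is exposed in the public signature, normalizing once at the top keeps the two code paths consistent.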

jreback · 2015-05-09T15:31:14Z

merged via 8b89842

@sinhrks thank you!

If you have minor doc changes, please feel free to make another PR.

Deprecated in #10085

sinhrks added the Strings label May 8, 2015

sinhrks added this to the 0.16.1 milestone May 8, 2015

sinhrks force-pushed the str_split_expand branch from 1346295 to e07da48 Compare May 8, 2015 21:25

jreback reviewed May 8, 2015

jreback added the Deprecate label May 8, 2015

jsexauer mentioned this pull request May 8, 2015

DEPR: Clean up list of deprecations from prior versions #6581

Closed

1 task

mortada reviewed May 8, 2015

sinhrks mentioned this pull request May 9, 2015

ENH: Index StringMethods should return MultiIndex when result dimension is more than one #10008

Closed

6 tasks

sinhrks force-pushed the str_split_expand branch from e07da48 to fe13f04 Compare May 9, 2015 04:13

sinhrks force-pushed the str_split_expand branch from fe13f04 to a5403a5 Compare May 9, 2015 12:43

DEPR: Deprecate str.split return_type

4e08839

sinhrks force-pushed the str_split_expand branch from a5403a5 to 4e08839 Compare May 9, 2015 12:46

This was referenced May 9, 2015

API: return_type argument in StringMethods.split() #9847

Closed

ENH: #9847, adding a "same" and "expand" param to StringMethods.split() #9870

Closed

jreback closed this May 9, 2015

sinhrks deleted the str_split_expand branch May 9, 2015 21:15

gfyoung mentioned this pull request Jul 19, 2016

CLN: Removed the return_type param in StringMethods.split #13701

Merged

jorisvandenbossche pushed a commit that referenced this pull request Jul 22, 2016

CLN: Removed the return_type param in StringMethods.split (#13701)

253ed5d

Deprecated in #10085

jreback mentioned this pull request Jul 24, 2016

DEPR: deprecations log for removed issues #13777

Closed

WillAyd mentioned this pull request Mar 11, 2018

DOC: update the pandas.Series.str.split docstring #20282

Merged

5 tasks
