DEPR: Deprecate str.split return_type #10085
@@ -742,10 +735,7 @@ def str_split(arr, pat=None, n=None, return_type='series'):
n = 0
regex = re.compile(pat)
f = lambda x: regex.split(x, maxsplit=n)
if return_type == 'frame':
res = DataFrame((Series(x) for x in _na_map(f, arr)), index=arr.index)
Isn't this what `expand=True` does? (And by the way, this could easily fix #10081 at the same time: just take out the use of `Series` here.)
Yes, the logic has been moved to `_wrap_results_expand`.
I think #10081 should be considered separately, because we can't simply use a workaround for all the cases.
s = pd.Series([1.1, '2.2'])
# current behavior
s.str.split('.', expand=True)
#      0    1
# 0  NaN  NaN
# 1    2    2
# workaround
pd.read_table(StringIO(s.to_csv(None, index=None)), sep='.')
#    1  1.1
# 0  2    2
# numpy
pd.DataFrame(list(np.core.defchararray.split(s.values.astype(str), '.')))
#    0  1
# 0  1  1
# 1  2  2
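To make the #10081 behavior above easier to reproduce, here is a minimal sketch comparing a direct split with a cast-to-string split. The `astype(str)` step is an illustrative assumption, not something this PR does, and it only makes sense when every element has a meaningful string form; depending on the pandas version, casting may also turn missing values into literal strings, which is one reason a workaround cannot cover all cases.

```python
import pandas as pd

s = pd.Series([1.1, "2.2"])

# The .str accessor treats the non-string element (1.1) as missing,
# so its row comes back as all NaN.
direct = s.str.split(".", expand=True)

# Assumption for illustration: cast everything to str before splitting.
casted = s.astype(str).str.split(".", expand=True)

print(direct)
print(casted)
```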
@@ -221,6 +221,28 @@ enhancements are performed to make string operation easier.
idx.str.startswith('a')
s[s.index.str.startswith('a')]

- ``split`` now takes ``expand`` keyword to specify returning dimensionality. ``return_type`` is deprecated. (:issue:`9847`)
"returning dimensionality" should be "expanding dimensionality"
how about "specify whether to expand dimensionality"?
Sounds good!
On May 8, 2015, at 8:59 PM, Sinhrks notifications@github.com wrote:
> In doc/source/whatsnew/v0.16.1.txt:
> - ``split`` now takes ``expand`` keyword to specify returning dimensionality. ``return_type`` is deprecated. (:issue:`9847`)
>
> how about "specify whether to expand dimensionality"?
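As an illustration of what "expand dimensionality" means in the wording discussed above, the following sketch shows the two result shapes of `str.split`. The sample data is made up for the example.

```python
import pandas as pd

s = pd.Series(["a_b", "c_d_e"])

# expand=False (the default): the result stays one-dimensional,
# a Series whose elements are lists of the split parts.
as_lists = s.str.split("_")

# expand=True: the split parts are spread over columns, giving a DataFrame.
# Rows with fewer parts are padded with missing values.
as_frame = s.str.split("_", expand=True)

print(as_lists)
print(as_frame)
```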
Looks good! Another thing I encountered while reviewing: there is no good explanation of the
@jorisvandenbossche Explanation of I also found current arg names are different from standard
Closes #9847, closes #9870. The default value is `expand=False`, for compatibility with `return_type='series'`. Might it be better to change the default to `True` in the future (and show a warning about it)?

CC @jreback @jorisvandenbossche @sreejata
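The compatibility mapping described here can be sketched as a small helper. This is only a sketch: `resolve_expand` is a hypothetical name, and the warning category and message are assumptions, not the code in this PR. It maps the deprecated `return_type='series'` to `expand=False` and `return_type='frame'` to `expand=True`, warning whenever the old keyword is used.

```python
import warnings


def resolve_expand(expand=False, return_type=None):
    """Hypothetical helper: translate the deprecated return_type into expand."""
    if return_type is None:
        # New-style call: honor the expand keyword as given.
        return expand

    # Old-style call: warn, then map onto the new keyword.
    warnings.warn(
        "return_type is deprecated, please use expand instead",
        FutureWarning,
        stacklevel=2,
    )
    if return_type == "series":
        return False
    if return_type == "frame":
        return True
    raise ValueError("return_type must be 'series' or 'frame'")


print(resolve_expand())             # no return_type given, default expand is used
print(resolve_expand(expand=True))  # explicit new-style keyword
```

Keeping the old default (`'series'`, hence `expand=False`) means existing callers see the same result shape, which is the compatibility concern raised in the description.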