ENH: Series.str.split can return a DataFrame instead of Series of lists #8663
looks good. pls add a release note in v0.15.1.txt I am not sure I love the option maybe
@@ -631,6 +631,9 @@ def str_split(arr, pat=None, n=None):
pat : string, default None
String or regular expression to split on. If None, splits on whitespace
n : int, default None (all)
to_df : Boolean, default False
If True, returns a DataFrame,
If False, returns an array with one dimension (elements are lists).
"array with one dimension" -> "series" ? (it does return a series no?)
very minor: "Boolean" doesn't need a capital B (that is more consistent)
for the name, I would certainly use edit: maybe
👍 for Eventually changing the default to True (to return a frame) sounds good. I think you're saying start deprecating in 0.16 right? And make the change later?
As a pure "what if" option, in
@immerrr But this I don't know if there are other examples in pandas of similar things?
ah, that is not what I meant :-) How does the word 'orient' relate to the fact it is a series or expanded to dataframe?
@jorisvandenbossche that is true. ok,
in boxplot you have the
ohh, I like that. @billletson want to change to that kw? @TomAugspurger yes, the idea is to change this in 0.16.0 (or maybe just deprecate and change the default later)
Speaking of making it the default, it makes a lot of sense to return frames by default when
Revised the kw, added a release note, as well as a couple more test cases.
looks good to me, @jorisvandenbossche ? @billletson can you also create a new issue to have the default changed to frame? pls reference this issue (i will mark it as 0.16.0)
yep, looks good. @billletson Thanks a lot!
ENH: Series.str.split can return a DataFrame instead of Series of lists
closes #8428.
Adds a flag that, when True, returns a DataFrame whose columns are the positions within the lists generated by the string splitting operation. When False, it returns a 1D numpy array, as before. Defaults to False so as not to break compatibility.
In the case with no splits, returns a single-column DataFrame rather than squashing to a Series.
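
A minimal sketch of the behaviour described in this commit message, for illustration only. It does not use the keyword added by this PR (whose name was still being revised in the conversation); instead it emulates the result with a hypothetical helper, `split_to_frame`, built on the existing list-returning `Series.str.split`. The helper name and the sample data are assumptions, and the helper assumes the input contains no missing values.

```python
import pandas as pd


def split_to_frame(series, pat=None):
    """Hypothetical helper: split each string and spread the pieces over columns."""
    # Existing behaviour: one list of pieces per element.
    pieces = series.str.split(pat)
    # One column per list position; shorter rows are padded with missing values.
    return pd.DataFrame(pieces.tolist(), index=series.index)


s = pd.Series(["a_b_c", "d_e", "f"])

as_lists = s.str.split("_")        # Series whose elements are lists
as_frame = split_to_frame(s, "_")  # DataFrame with columns 0, 1, 2

# With no splits at all, the result stays a single-column DataFrame
# instead of being squashed to a Series.
single = split_to_frame(pd.Series(["abc", "def"]), "_")

print(as_lists)
print(as_frame)
print(single)
```

Rows with fewer pieces than the longest row come back with missing values in the trailing columns, which is the main practical difference from the Series-of-lists result.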