Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add StringMethods.partition and rpartition #9773

Merged
merged 1 commit into from
May 7, 2015

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Apr 1, 2015

Derived from #9111.

@jorisvandenbossche
Copy link
Member

What do you think of having return_type defaults to 'frame'? (I know in str.split the default is 'series', but that is for back compatibility)

@shoyer
Copy link
Member

shoyer commented Apr 1, 2015

+1 for defaulting to return a frame.

@sinhrks
Copy link
Member Author

sinhrks commented Apr 2, 2015

OK, changed default to 'frame'.

@jreback jreback added API Design Strings String extension data type and string data labels Apr 2, 2015
@jreback
Copy link
Contributor

jreback commented Apr 2, 2015

lgtm, but not sure when would use partition over split

@jreback jreback added this to the 0.16.1 milestone Apr 2, 2015
@jorisvandenbossche
Copy link
Member

partition will always give three items back (only split on first or last occurence of separator) I think?

@sinhrks maybe add some examples to the docstring, and put a 'see also' to split explaining the difference. (+ a see also the the other (r)partition)

@sinhrks
Copy link
Member Author

sinhrks commented Apr 4, 2015

Added docstrings (examples and see also). partition is more specific than split, may useful when want to retain separators.

pat : string, default whitespace
String to split on.
return_type : {'series', 'frame'}, default 'frame'
If frame, returns a DataFrame (elements are strings)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small detail, but can you put a point . at the end of this sentence (as in html, the line breaks are not preserved, so the following If series, ... comes directly behind this, so without punctuation it looks strange

@sinhrks
Copy link
Member Author

sinhrks commented Apr 6, 2015

@jorisvandenbossche Thanks, fixed your points.

About return type, I'll prepare separate PR for existing funcs which is also incorrect.

@sinhrks
Copy link
Member Author

sinhrks commented Apr 6, 2015

Hmm, StringMethods docstrings are mostly derived from corresponding module level functions, and there are 2 policies:

  1. Describe the return type when called via StringMethods (describe what user will see)
  2. Describe the return type of the module level function (what the module level function returns)

First policy looks preferable, so OK to change all the descriptions like first one? I understand that these module level functions are not public.

@jorisvandenbossche
Copy link
Member

Yes, there is no real good solution to have it correct for both (unless starting with templating, but then the purpose of having a visual docstring you can correctly see in the source code for the module level functions is a bit lost). So I would say: most important that user sees the correct info, so I would indeed go with the first one.

@jorisvandenbossche
Copy link
Member

@sinhrks or, maybe what we could also do is this: replace the array with Series, and leave the module level function docstrings as is.
Eg for the example of str_findall: have a .replace('matches : array', 'matches : Series') on the docstring

BTW: it is not that there is a different policy in the docstrings apparently. Eg the str_extract you linked to is really returning already a Series/DataFrame. So it is just that the str_ functions have different return types.

@sinhrks
Copy link
Member Author

sinhrks commented Apr 9, 2015

Separate docstring issue to #9843 as we have to fix other methods, and also consider #9667.

@jorisvandenbossche
Copy link
Member

@sinhrks can you update this with expand=True/false instead of return_type? (see discussion in #9870)

@sinhrks sinhrks force-pushed the partition branch 2 times, most recently from b6b05a9 to cc46330 Compare May 3, 2015 01:20
@sinhrks
Copy link
Member Author

sinhrks commented May 3, 2015

OK, fixed this. And posting the behavior of expand in #9870.

@jreback
Copy link
Contributor

jreback commented May 6, 2015

looks fine. needs a rebase.

pat : string, default whitespace
String to split on.
expand : bool, default True
If True, return DataFrame/MultiIndex expanding dimensionality
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a . at the end of the sentence.

Or make it a list:

* If true ..
* If false ..

@sinhrks
Copy link
Member Author

sinhrks commented May 7, 2015

@jreback @jorisvandenbossche Thanks, fixed your points. Pls check again.

@jorisvandenbossche
Copy link
Member

Looks good!

jorisvandenbossche added a commit that referenced this pull request May 7, 2015
ENH: Add StringMethods.partition and rpartition
@jorisvandenbossche jorisvandenbossche merged commit 45f69cd into pandas-dev:master May 7, 2015
@sinhrks sinhrks deleted the partition branch May 7, 2015 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Strings String extension data type and string data
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants