Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename NDFrame.filter to .select? #26642

Open
topper-123 opened this issue Jun 4, 2019 · 6 comments
Open

Rename NDFrame.filter to .select? #26642

topper-123 opened this issue Jun 4, 2019 · 6 comments
Labels
API Design Needs Discussion Requires discussion from core team before further action

Comments

@topper-123
Copy link
Contributor

I've made a PR to remove NDFrame.select, see #26641. That method was deprecated in 0.21.

There was a discussion in #12401, where many participants agreed that removing the old select and having NDFrame.filter renamed to NDFrame.select would be a better naming scheme. I agree on that for several of the reasons mentioned in the thread.

What do people think of such a name change and a related deprecation of the name (not functionality) .filter?

@WillAyd
Copy link
Member

WillAyd commented Jun 4, 2019

I would agree with the naming scheme. So is your proposal to deprecate both with a warning for filter that it will eventually become select?

@topper-123
Copy link
Contributor Author

topper-123 commented Jun 4, 2019

.select was already deprecated in 0.21 and will be removed in 0.25. But yeah, after #26641 is merged, just change .filter to .select and make .filter a pass-through method like this:

def filter(self, items=None, like=None, regex=None, axis=None):
    warnings.warn(..., FutureWarning)
    return self.select(items=items, like=like, regex=regex, axis=axis)

@topper-123
Copy link
Contributor Author

topper-123 commented Jun 15, 2019

I've made a PR on this, see #26866,

There are some things I don't like in the current signature though:

  • like is very simple and matches anywhere in a string, so I would probably never use it, in fear of accepting something wrong. Should it be taken out?
  • .select currently doesn't accept a callable. A callable would be the more readable alternative to using regexes. Should .select accept callables, e.g. lambda x: x.startswith('abc')? Accepting callables would make .select be usable for non-string indexes also.
  • both like and regex internally convert all labels to strings using ensure_str. I think it should be the users responsibility to only use regex with string labels. Should use of ensure_str be removed?

@topper-123
Copy link
Contributor Author

topper-123 commented Jun 15, 2019

Thinking a bit more: most of the functionality of .filter is already in .loc. The only one missing is really regex matching. I wouldn´t be opposed to drop adding this .select method, but add a .select_regex method instead to complement .select_dtypes, and recommend all other axis filtering being done in .loc (and __getitem__).

This would be more focused and we`d avoid having similar functionality in different methods.

@WillAyd
Copy link
Member

WillAyd commented Jun 15, 2019

Yea there is a ton of overlap between all of these items. I like your proposal for .select_regex and pushing everything else to .loc, so long as we don't feel like we need to remove some functionality from .loc

@jorisvandenbossche
Copy link
Member

As I also commented on the PR, I personally like having more "complex" behaviour (things that need keyword arguments, like the regex) in a method like select, rather than further overloading loc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Needs Discussion Requires discussion from core team before further action
Projects
None yet
4 participants