Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: deprecate .select() #17633

Merged
merged 1 commit into from
Oct 4, 2017
Merged

Conversation

jreback
Copy link
Contributor

@jreback jreback commented Sep 22, 2017

xref #12401

@jreback jreback added API Design Indexing Related to indexing on series/frames, not to indexes themselves labels Sep 22, 2017
@jreback jreback added this to the 0.21.0 milestone Sep 22, 2017
@codecov
Copy link

codecov bot commented Sep 23, 2017

Codecov Report

Merging #17633 into master will decrease coverage by 0.04%.
The diff coverage is 84.53%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17633      +/-   ##
==========================================
- Coverage   91.22%   91.18%   -0.05%     
==========================================
  Files         163      163              
  Lines       49652    49709      +57     
==========================================
+ Hits        45294    45326      +32     
- Misses       4358     4383      +25
Flag Coverage Δ
#multiple 88.97% <84.53%> (-0.02%) ⬇️
#single 40.16% <41.23%> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 91.96% <100%> (-0.03%) ⬇️
pandas/core/common.py 91.13% <76.47%> (-0.86%) ⬇️
pandas/core/indexing.py 93.13% <83.07%> (-0.81%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.77% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6d8953...1581a05. Read the comment docs.

@codecov
Copy link

codecov bot commented Sep 23, 2017

Codecov Report

Merging #17633 into master will decrease coverage by 0.03%.
The diff coverage is 82.89%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17633      +/-   ##
==========================================
- Coverage   91.26%   91.22%   -0.04%     
==========================================
  Files         163      163              
  Lines       49869    49916      +47     
==========================================
+ Hits        45511    45535      +24     
- Misses       4358     4381      +23
Flag Coverage Δ
#multiple 89.02% <82.89%> (-0.02%) ⬇️
#single 40.24% <48.68%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/common.py 91.42% <ø> (ø) ⬆️
pandas/core/generic.py 92.1% <100%> (-0.03%) ⬇️
pandas/core/indexing.py 93% <79.68%> (-0.98%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.73% <0%> (-0.1%) ⬇️
pandas/io/formats/format.py 96.07% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 69024a0...ef031c1. Read the comment docs.

@jorisvandenbossche
Copy link
Member

I personally don't think we should recommend .loc(axis=..)[]. Why not just df.loc[] (or df.loc[:, ..] for axis=1) ?

@jreback
Copy link
Contributor Author

jreback commented Sep 25, 2017

I personally don't think we should recommend .loc(axis=..)[]. Why not just df.loc[] (or df.loc[:, ..] for axis=1) ?

those work, its an option though

@jreback jreback force-pushed the select branch 3 times, most recently from 9c6734a to 9c2b402 Compare October 1, 2017 16:20
@pep8speaks
Copy link

pep8speaks commented Oct 1, 2017

Hello @jreback! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on October 03, 2017 at 20:10 Hours UTC

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't feel comfortable with adding this extra 'try-and-see' guess to the loc method (whether the function should be applied on the object itself or on the index elements). In generally we would like to see less of those logics, not more IMO.

I also don't think it is really needed for deprecating select? It only means that the alternative will be a bit more convoluted.

(and if we add it, it should also be documented)

.. _whatsnew_0210.api_breaking.select:

Select method is deprecated
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to the 'deprecations' section?


df = DataFrame({'A': [1, 2, 3]}, index=['foo', 'bar', 'baz'])

.. code_block:: ipython
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code_block -> codeblock

@jreback
Copy link
Contributor Author

jreback commented Oct 1, 2017

I also don't think it is really needed for deprecating select? It only means that the alternative will be a bit more convoluted.

Well, we already have this ability to accept pretty arbitrary things for .loc callables. The alternative is basically to deprecate .select() and NOT offer any alternatives directly. Which is for sure simpler. This way it still 'works'.

and yes its needed for our current tests to pass. Yes its pretty fragile. I think removing this logic and simply deprecating .select entirely is an option.

@jorisvandenbossche
Copy link
Member

Well, we already have this ability to accept pretty arbitrary things for .loc callables.

It is not arbitrary, it is the calling object (series of frame). Which is clearly different from the index elements.

Yes its pretty fragile. I think removing this logic and simply deprecating .select entirely is an option.

I would personally go for that, instead of adding this fragile logic

There are still 'alternatives' to offer, although a bit more work to type. Eg:

s.loc[s.index.map(f).values]

it's only a bit annoying that you need the .values (and there are maybe others I don't think of)

We could also add the 'function' ability to filter ..

@jreback
Copy link
Contributor Author

jreback commented Oct 1, 2017

ok will revise tomorrow

@jreback
Copy link
Contributor Author

jreback commented Oct 1, 2017

@jorisvandenbossche updated, MUCH simpler.

@jreback jreback changed the title DEPR: deprecate .select() in favor of .loc(axis=)[] DEPR: deprecate .select() Oct 2, 2017
@jreback
Copy link
Contributor Author

jreback commented Oct 2, 2017

any final comments

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed looks a lot cleaner! I think there are some left-overs from the previous version, but maybe missed something

@@ -441,13 +441,22 @@ def _get_callable_name(obj):
return None


def _apply_if_callable(maybe_callable, obj, **kwargs):
def _apply_if_callable(maybe_callable, obj, axis=None, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the axis=None a left-over from the previous version of the PR? (it's not used anymore)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no used, but its ok to put in here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I fail to see how it is useful. It is just adding an unused argument, complicating the code?

@@ -2339,6 +2339,8 @@ def select(self, crit, axis=0):
"""
Return data corresponding to axis labels matching criteria

DEPRECATED: use df.loc[labels.map(crit)] to select via labels
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe df.index instead of labels ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed, I put the labels because you can actually select on any axis (though it does default to 0)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yes, that was a good reason :-)

@@ -2349,6 +2351,11 @@ def select(self, crit, axis=0):
-------
selection : type of caller
"""
warnings.warn("select is deprecated and will be removed in a "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quote 'select' ?

@@ -5756,7 +5766,7 @@ def _where(self, cond, other=np.nan, inplace=False, axis=None, level=None,
inplace = validate_bool_kwarg(inplace, 'inplace')

# align the cond to same shape as myself
cond = com._apply_if_callable(cond, self)
cond = com._apply_if_callable(cond, self, axis=axis)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above, the axis is not used

@@ -107,7 +109,8 @@ def __iter__(self):

def __getitem__(self, key):
if type(key) is tuple:
key = tuple(com._apply_if_callable(x, self.obj) for x in key)
key = tuple(com._apply_if_callable(x, self.obj, i)
for i, x in enumerate(key))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here possibly as well, and a bit below as well (leftover)

expected_weekdays = self.tsframe.reindex(index=index)

with tm.assert_produces_warning(FutureWarning,
check_stacklevel=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in principle the check_stacklevel=False should not be needed


.. ipython:: python

df.loc[df.index.map(lambda x: x in ['bar', 'baz'])]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need the .values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #17738, this was just changed to not need it

@jreback
Copy link
Contributor Author

jreback commented Oct 3, 2017

updated, any final comments?

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please either remove the unused code or either explain why it is useful/needed ?

@jreback
Copy link
Contributor Author

jreback commented Oct 3, 2017

its useful to have maybe_callable actually know that it has an axis. It should have originally been done like this.

@jorisvandenbossche
Copy link
Member

But it is not used by the callable. Can you please try to explain to me why you want it there? And "it is useful" is not clarifying

@jreback
Copy link
Contributor Author

jreback commented Oct 3, 2017

ok I took the axis bizness out. should be good to go now.

@jorisvandenbossche jorisvandenbossche merged commit 48d0460 into pandas-dev:master Oct 4, 2017
@jorisvandenbossche
Copy link
Member

Thanks @jreback !

ghost pushed a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017
alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
@topper-123 topper-123 mentioned this pull request Jun 3, 2019
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants