-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: Prune unnecessary indexing code #27576
Conversation
@@ -2077,6 +2072,8 @@ def _getitem_tuple(self, tup: Tuple): | |||
|
|||
# if the dim was reduced, then pass a lower-dim the next time | |||
if retval.ndim < self.ndim: | |||
# TODO: this is never reached in tests; can we confirm that | |||
# it is impossible? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@toobaz any thoughts on showing this is unreachable? If we can confirm this is the case, then this chunk of the code can be shared with the other _getitem_tuple method above
(pinging you since you wrote the wiki page on simplifying this code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this also wasnt hit in 0.24.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And for context, example case which triggers on that commit but not on master
Code
import numpy as np
from pandas import *
mi_int = DataFrame(
np.random.randn(3, 3),
columns=[[2, 2, 4], [6, 8, 10]],
index=[[4, 4, 8], [8, 10, 12]],
)
# this triggers the code path when testing against 9f0dc3befb
rs = mi_int.iloc[:, 2]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that path is ever reached. It should be reached by something like df.loc[1, [0, 1]]
, which however gets captured in the call above to self._getitem_lowerdim(tup)
(which doesn't make sense, because it is named "lowerdim", but then it resorts to _is_nested_tuple_indexer
). So to sum up: I don't know exactly what the aim of that axis -= 1
is, I don't think it is ever reached, but I think it should be reached in principle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(but no major objections to removing for now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pilkibun there is another un-hit block on lines 440-450. any idea what that was intended to catch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was related to Panel indexing, can probably go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@toobaz how about we fix getitem_lowerdim to make sense before removing this check. simplifying this code is going to take a few passes
No objection
@@ -2152,7 +2149,7 @@ def _convert_to_indexer( | |||
) | |||
|
|||
|
|||
class _ScalarAccessIndexer(_NDFrameIndexer): | |||
class _ScalarAccessIndexer(_NDFrameIndexerBase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only _NDFrameIndexer method these use is _tuplify, which is why this PR takes that out and makes it a function instead of a method. Much easier to reason about these classes now
pass | ||
elif is_timedelta64_dtype(self.dtype): | ||
# reassign a null value to iNaT | ||
if is_valid_nat_for_dtype(value, self.dtype): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is no longer reachable since we fixed a bug in the _set_with_engine call above in #27536.
@@ -3464,3 +3462,22 @@ def _sparsify(label_list, start=0, sentinel=""): | |||
|
|||
def _get_na_rep(dtype): | |||
return {np.datetime64: "NaT", np.timedelta64: "NaT"}.get(dtype, "NaN") | |||
|
|||
|
|||
def maybe_droplevels(index, key): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a doc-string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
@@ -2077,6 +2072,8 @@ def _getitem_tuple(self, tup: Tuple): | |||
|
|||
# if the dim was reduced, then pass a lower-dim the next time | |||
if retval.ndim < self.ndim: | |||
# TODO: this is never reached in tests; can we confirm that | |||
# it is impossible? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was related to Panel indexing, can probably go.
@@ -2320,6 +2314,12 @@ def _convert_key(self, key, is_setter: bool = False): | |||
return key | |||
|
|||
|
|||
def _tuplify(ndim: int, loc) -> tuple: | |||
tup = [slice(None, None) for _ in range(ndim)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a doc-string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstrings added. will take another look at removing the discussed lines in the next pass
@@ -2750,7 +2747,8 @@ def get_loc_level(self, key, level=0, drop_level=True): | |||
(1, None) | |||
""" | |||
|
|||
def maybe_droplevels(indexer, levels, drop_level): | |||
# different name to distinguish from maybe_droplevels | |||
def maybe_mi_droplevels(indexer, levels, drop_level: bool): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in future maybe pull this out to a module level function
thanks |
No description provided.