BUG: loc dropping levels when df has only one row #38150

Merged
merged 13 commits into from
Dec 30, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -637,6 +637,7 @@ Indexing
 - Bug in :meth:`DataFrame.loc` and :meth:`DataFrame.__getitem__` raising ``KeyError`` when columns were :class:`MultiIndex` with only one level (:issue:`29749`)
 - Bug in :meth:`Series.__getitem__` and :meth:`DataFrame.__getitem__` raising blank ``KeyError`` without missing keys for :class:`IntervalIndex` (:issue:`27365`)
 - Bug in setting a new label on a :class:`DataFrame` or :class:`Series` with a :class:`CategoricalIndex` incorrectly raising ``TypeError`` when the new label is not among the index's categories (:issue:`38098`)
+- Bug in :meth:`DataFrame.loc` dropping levels of :class:`MultiIndex` when :class:`DataFrame` used as input has only one row (:issue:`10521`)

 Missing
 ^^^^^^^
6 changes: 4 additions & 2 deletions pandas/core/indexing.py
@@ -838,8 +838,10 @@ def _getitem_nested_tuple(self, tup: Tuple):
             if self.name != "loc":
                 # This should never be reached, but lets be explicit about it
                 raise ValueError("Too many indices")
-            with suppress(IndexingError):
-                return self._handle_lowerdim_multi_index_axis0(tup)
+            if len(self.obj) > 1 or not any(isinstance(x, slice) for x in tup):
Contributor

hmm can we just handle this properly in _handle_lowerdim_multi_index_axis0?

Member Author

We could raise an IndexingError in there. We cannot call xs.

            axis = self.axis or 0
            return self._getitem_axis(tup, axis=axis)

Would this be preferable to the if condition outside?
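
A minimal sketch of the alternative raised in this comment, where the helper itself signals that it cannot handle the key and the caller falls through to the general path. `IndexingError`, `fast_path`, `general_path`, and `getitem_nested_tuple` are simplified stand-ins written for illustration, not the pandas implementation.

```python
from contextlib import suppress


class IndexingError(Exception):
    """Stand-in for pandas' IndexingError so the sketch runs on its own."""


def fast_path(tup):
    # Hypothetical fast path: refuses any selector tuple that contains a slice.
    if any(isinstance(x, slice) for x in tup):
        raise IndexingError("slices are not handled by the fast path")
    return ("fast", tup)


def general_path(tup):
    # Hypothetical general path that accepts any selector tuple.
    return ("general", tup)


def getitem_nested_tuple(tup):
    # Try the fast path first; if it raises IndexingError, the exception is
    # suppressed and execution continues with the general path.
    with suppress(IndexingError):
        return fast_path(tup)
    return general_path(tup)
```

With this shape the caller needs no `if` guard: `getitem_nested_tuple(("a", "b"))` takes the fast path, while `getitem_nested_tuple(("a", slice(None), "a"))` falls through to the general one.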

Contributor

why is the len(self.obj) needed here? that smells a bit

Member Author

Performance reasons, not necessary from a technical standpoint. Will remove it

Contributor

ok pls add a comment then

Member Author

Done

+                # GH#10521 IndexingError is not raised for slices for objs with one row
+                with suppress(IndexingError):
+                    return self._handle_lowerdim_multi_index_axis0(tup)

             # this is a series with a multi-index specified a tuple of
             # selectors
10 changes: 10 additions & 0 deletions pandas/tests/indexing/multiindex/test_loc.py
@@ -695,3 +695,13 @@ def test_loc_getitem_index_differently_ordered_slice_none():
         columns=["a", "b"],
     )
     tm.assert_frame_equal(result, expected)
+
+
+def test_loc_getitem_drops_levels_for_one_row_dataframe():
+    # GH#10521
+    df = DataFrame({"a": ["a"], "b": ["b"], "c": ["a"], "d": 0}).set_index(
+        ["a", "b", "c"]
+    )
+    expected = df.copy()
+    result = df.loc["a", :, "a"]
+    tm.assert_frame_equal(result, expected)
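
For context on the key used in this test: Python passes `df.loc["a", :, "a"]` to the indexer as a tuple in which the bare `:` is `slice(None, None, None)`, which is what the `isinstance(x, slice)` check in the indexing.py change looks for. The sketch below uses a hypothetical `KeyRecorder` class, not a pandas object, to show this.

```python
class KeyRecorder:
    """Hypothetical stand-in for an indexer: returns whatever key it receives."""

    def __getitem__(self, key):
        return key


recorder = KeyRecorder()

# The bare ":" arrives as slice(None, None, None) inside the key tuple.
key_with_slice = recorder["a", :, "a"]
key_without_slice = recorder["a", "b", "a"]

# Same check as in the diff: does any selector in the tuple have type slice?
has_slice = any(isinstance(x, slice) for x in key_with_slice)
no_slice = any(isinstance(x, slice) for x in key_without_slice)
```

Here `has_slice` is `True` and `no_slice` is `False`, so only the second key would reach the `suppress(IndexingError)` fast path under the guard shown in the diff.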