BUG: DataFrame.drop unexpectedly drops frequency #58846

Closed
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -388,6 +388,7 @@ Categorical
Datetimelike
^^^^^^^^^^^^
- Bug in :class:`Timestamp` constructor failing to raise when ``tz=None`` is explicitly specified in conjunction with timezone-aware ``tzinfo`` or data (:issue:`48688`)
- Bug in :meth:`DataFrame.drop` returning an index with ``freq=None`` when the dataframe has a ``DatetimeIndex`` (:issue:`58743`)
- Bug in :func:`date_range` where the last valid timestamp would sometimes not be produced (:issue:`56134`)
- Bug in :func:`date_range` where using a negative frequency value would not include all points between the start and end values (:issue:`56382`)
- Bug in :func:`tseries.api.guess_datetime_format` would fail to infer time format when "%Y" == "%H%M" (:issue:`57452`)
11 changes: 10 additions & 1 deletion pandas/core/indexes/base.py
@@ -39,6 +39,7 @@
IncompatibleFrequency,
OutOfBoundsDatetime,
Timestamp,
to_offset,
tz_compare,
)
from pandas._typing import (
@@ -6949,7 +6950,15 @@ def drop(
if errors != "ignore":
raise KeyError(f"{labels[mask].tolist()} not found in axis")
indexer = indexer[~mask]
return self.delete(indexer)
new_index = self.delete(indexer)

# check if we need to set the freq attribute
Member

instead of doing this here, override DatetimeIndex.drop

from pandas import DatetimeIndex

if isinstance(self, DatetimeIndex) and isinstance(new_index, DatetimeIndex):
Member

should only do this if self.freq is not None
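
The two review comments above (override `drop` in the datetime subclass, and only touch `freq` when the original index had one) can be sketched with toy classes. This is a minimal standard-library illustration, not pandas code: `ToyIndex` and `ToyDatetimeIndex` are hypothetical stand-ins, and a `timedelta` is assumed to model a fixed-width frequency.

```python
from datetime import datetime, timedelta


class ToyIndex:
    """Hypothetical stand-in for a generic index; not a pandas class."""

    def __init__(self, values):
        self.values = list(values)

    def drop(self, labels):
        # Generic behaviour: remove the labels, with no knowledge of frequency.
        labels = set(labels)
        return type(self)(v for v in self.values if v not in labels)


class ToyDatetimeIndex(ToyIndex):
    """Hypothetical datetime index with an optional fixed step (`freq`)."""

    def __init__(self, values, freq=None):
        super().__init__(values)
        self.freq = freq

    def drop(self, labels):
        result = super().drop(labels)
        # Guard: if the original had no freq, there is nothing to preserve.
        if self.freq is None:
            return result
        # Re-derive the step only when the remaining values are evenly spaced.
        gaps = {b - a for a, b in zip(result.values, result.values[1:])}
        if len(gaps) == 1:
            result.freq = gaps.pop()
        return result


day = timedelta(days=1)
idx = ToyDatetimeIndex([datetime(1970, 1, 1) + i * day for i in range(10)], freq=day)

# Dropping both ends leaves an evenly spaced run, so the step survives.
trimmed = idx.drop([idx.values[0], idx.values[-1]])

# An index created without a freq stays without one, even if evenly spaced.
no_freq = ToyDatetimeIndex(idx.values).drop([idx.values[0]])
```

Keeping the frequency logic in the subclass leaves the generic `drop` free of datetime-specific imports and type checks, which is the point of the first comment.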

new_index.freq = to_offset(new_index.inferred_freq)
Member

inferred_freq can be expensive. It might be cheaper to invert the indexer and call take; there is another PR to preserve freq in take, which we can optimize.
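
The idea in the comment above can be sketched as follows: instead of inferring a frequency from the surviving timestamps, invert the drop positions into the positions to keep, and check whether those positions are evenly spaced. This is a hedged, standard-library illustration only. The helper names (`kept_positions`, `step_of`, `take_with_freq`) are hypothetical, positions are assumed to be non-negative, and a `timedelta` stands in for a fixed-width frequency; it is not how pandas implements `take`.

```python
from datetime import datetime, timedelta


def kept_positions(n, drop_positions):
    """Invert a drop indexer: return the positions that survive, in order."""
    dropped = set(drop_positions)
    return [i for i in range(n) if i not in dropped]


def step_of(kept):
    """Return the constant step between kept positions, or None if irregular."""
    if len(kept) < 2:
        return None
    step = kept[1] - kept[0]
    if all(b - a == step for a, b in zip(kept, kept[1:])):
        return step
    return None


def take_with_freq(values, freq, drop_positions):
    """Take the surviving values; the new freq is step * freq when evenly spaced."""
    kept = kept_positions(len(values), drop_positions)
    step = step_of(kept)
    new_freq = None if (freq is None or step is None) else step * freq
    return [values[i] for i in kept], new_freq


day = timedelta(days=1)
values = [datetime(1970, 1, 1) + i * day for i in range(10)]

_, ends_dropped = take_with_freq(values, day, [0, 9])           # contiguous run
_, odds_dropped = take_with_freq(values, day, [1, 3, 5, 7, 9])  # every second value
_, hole = take_with_freq(values, day, [3])                      # gap in the middle
```

Here `ends_dropped` keeps the daily step, `odds_dropped` becomes a two-day step, and `hole` has no frequency. The check works on integer positions only, so it never has to compare or difference the timestamps themselves.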


return new_index

@final
def infer_objects(self, copy: bool = True) -> Index:
16 changes: 16 additions & 0 deletions pandas/tests/generic/test_generic.py
@@ -483,3 +483,19 @@ def test_flags_identity(self, frame_or_series):
assert obj.flags is obj.flags
obj2 = obj.copy()
assert obj2.flags is not obj.flags

@pytest.mark.parametrize("freq", ["YE", "ME", "D"])
def test_drop_method_freq_preservation(self, freq):
start = "1970-01-01"
Member

can you add a "# GH#???" reference

Author

Added # GH 58846

periods = 10

index = date_range(start=start, periods=10, freq=freq)
df = DataFrame((np.ones(len(index))), index=index)

# set inplace as false
test_df = df.drop(index=df.index[[0, periods - 1]], inplace=False)
tm.assert_equal(test_df.index.freq, index.freq)

# set inplace as true
df.drop(index=df.index[[0, periods - 1]], inplace=True)
tm.assert_equal(df.index.freq, index.freq)
4 changes: 2 additions & 2 deletions pandas/tests/indexes/datetimes/methods/test_delete.py
@@ -132,9 +132,9 @@ def test_delete_slice2(self, tz, unit):
assert result.freq == expected.freq
assert result.tz == expected.tz

# reset freq to None
# reset freq
result = ts.drop(ts.index[[1, 3, 5, 7, 9]]).index
expected = dti[::2]._with_freq(None)
expected = dti[::2]._with_freq("infer")
Member

the "infer" here should be unnecessary if dti already has a freq

Author

removed _with_freq("infer")

tm.assert_index_equal(result, expected)
assert result.name == expected.name
assert result.freq == expected.freq