BUG: first("1M") returning two months when first day is last day of month #38331

Merged
merged 6 commits into from
Dec 12, 2020

Conversation

phofl
Member

@phofl phofl commented Dec 6, 2020

@@ -8426,7 +8426,10 @@ def first(self: FrameOrSeries, offset) -> FrameOrSeries:
            return self

        offset = to_offset(offset)
        end_date = end = self.index[0] + offset
        if offset._day_opt == "end" and offset.is_on_offset(self.index[0]):
Member

Could we use rollforward?

Member Author

Instead of the if-else? Then no: this would work for offsets like 1M, but not for something like 10 Days. That is also what led me to add the offset._day_opt == "end" check. But we could use it like this:

        if offset._day_opt == "end":
            end_date = end = offset.rollforward(self.index[0])
        else:
            end_date = end = self.index[0] + offset

if that would be preferable.
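
To make the difference concrete, here is a minimal sketch (not code from this PR) comparing plain addition with rollforward for a month-end offset and a 10-day offset. It assumes that "1M" corresponds to pd.offsets.MonthEnd(1) and uses the offset classes directly; the variable names are illustrative only.

```python
import pandas as pd

first_day = pd.Timestamp("2010-03-31")  # already the last day of the month

month_end = pd.offsets.MonthEnd(1)  # assumed equivalent of "1M" in this thread
ten_days = pd.offsets.Day(10)

# Adding a month-end offset to a date that is already a month end
# jumps to the *next* month end, which is the reported bug.
added = first_day + month_end

# rollforward leaves a date that is already on the offset unchanged.
rolled = month_end.rollforward(first_day)

# Every timestamp is "on offset" for a Day offset, so rollforward returns
# the start date itself and the 10-day window would collapse.
rolled_days = ten_days.rollforward(first_day)
added_days = first_day + ten_days

print(added.date(), rolled.date())            # 2010-04-30 2010-03-31
print(rolled_days.date(), added_days.date())  # 2010-03-31 2010-04-10
```

This is why rollforward alone cannot replace the if-else: it yields the desired end date for month-end style offsets but not for fixed-length ones such as 10 days.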

Member

this is an OK solution, but ideally there should be a way to do this with only public methods.

but not for something like 10 Days

can you give an example? (possibly merits a test so I don't try to incorrectly simplify this somewhere down the line) if it's only for Tick offsets, special-casing wouldn't be that bad since we already special-case them a few lines down

Member Author

Tick works, thanks.

test_first_subset in pandas/tests/frame/methods/test_first_and_last.py covers the behavior and avoids the simplification (as I learned myself a bit earlier :))

Member

@jbrockmendel jbrockmendel left a comment

LGTM

@jreback jreback added Bug Frequency DateOffsets labels Dec 12, 2020
@jreback jreback modified the milestones: 1.3, 1.2 Dec 12, 2020
@jreback jreback merged commit 6ecc787 into pandas-dev:master Dec 12, 2020
@jreback
Contributor

jreback commented Dec 12, 2020

thanks @phofl

@phofl phofl deleted the 29623 branch December 12, 2020 23:08
@jreback
Contributor

jreback commented Dec 12, 2020

@meeseeksdev backport 1.2.x

@simonjayhawkins
Member

@phofl this didn't fix the "2M" case from the issue?

>>> import pandas as pd
>>>
>>> pd.__version__
'1.3.0.dev0+42.g36c4d5ccf1'
>>>
>>> x = pd.Series(1, index=pd.bdate_range("2010-03-31", periods=100))
>>>
>>> print(x.first("1M"))
2010-03-31    1
Freq: B, dtype: int64
>>> print(x.first("2M"))
2010-03-31    1
Freq: B, dtype: int64
>>>

@phofl
Member Author

phofl commented Dec 13, 2020

Unfortunately you are right. Opened #38446 to address this.
Unfortunately, I forgot to add a test for this case.
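
For the "2M" case, the end-date arithmetic can be sketched as follows. This is an illustration only, not necessarily the approach taken in #38446: it assumes "2M" corresponds to pd.offsets.MonthEnd(2), and the "step back one base unit, then add the full offset" adjustment is a hypothetical way to get an end date that scales with the multiple.

```python
import pandas as pd

first_day = pd.Timestamp("2010-03-31")  # last day of the month
two_months = pd.offsets.MonthEnd(2)     # assumed equivalent of "2M"

# Plain addition overshoots: two month ends past an existing month end.
too_far = first_day + two_months

# rollforward ignores the multiple and stays on the start date,
# which matches the single-row output reported above.
too_short = two_months.rollforward(first_day)

# Hypothetical adjustment: step back by one base unit (MonthEnd(1)),
# then add the full offset, so the start month counts as the first month.
adjusted = first_day - two_months.base + two_months

print(too_far.date(), too_short.date(), adjusted.date())
# 2010-05-31 2010-03-31 2010-04-30
```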

Labels
Bug Frequency DateOffsets
Successfully merging this pull request may close these issues.

BUG: incorrect output of first('1M') in case if first index is the last day of the month
4 participants