-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: try coerce result back to DatetimeBlock #22008
Conversation
@jbrockmendel . Should this work with tz-aware date times as well? |
Yes. If there's any way at all to do this without poking into core.internals, that would be preferred. The Series version of this works correctly right? This should be doing something like operating column-wise to dispatch to the Series implementation (which itself dispatches to the DatetimeIndex implementation). Can you include tests for subtraction while you're at it? |
Codecov Report
@@ Coverage Diff @@
## master #22008 +/- ##
==========================================
+ Coverage 91.99% 91.99% +<.01%
==========================================
Files 167 167
Lines 50578 50588 +10
==========================================
+ Hits 46530 46540 +10
Misses 4048 4048
Continue to review full report at Codecov.
|
pandas/core/internals.py
Outdated
@@ -1362,10 +1368,15 @@ def eval(self, func, other, errors='raise', try_cast=False, mgr=None): | |||
values, values_mask, other, other_mask = self._try_coerce_args( | |||
transf(values), other) | |||
except TypeError: | |||
# `Timestamp` operate with `DateOffset` must cast to `object`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would rather see you override eval in the appropriate blocks
ad96a28
to
a75f375
Compare
v2.0, override Below test fails, but not a regression. Will look deeper later. def test_timestamp_df_sub_timestamp(self):
expected = pd.DataFrame([pd.Timedelta('365d')])
result = pd.DataFrame([pd.Timestamp('2018')]) - pd.Timestamp('2017')
tm.assert_frame_equal(expected, result) |
docs said returning a new pandas/pandas/core/internals/__init__.py Lines 1333 to 1336 in 37c7458
pandas/pandas/core/internals/__init__.py Lines 1455 to 1457 in 37c7458
|
pandas/core/internals/__init__.py
Outdated
block = super().eval(func, other, try_cast=try_cast, **kwargs)[0] | ||
if try_cast: | ||
if isinstance(other, | ||
(tslibs.Timestamp, np.datetime64, datetime, date)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an ABCTimestamp to check? At least 3 times use similar isinstance()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an ABCTimestamp to check?
No, but Timestamp subclasses datetime, so could actually be removed from this check.
4c36407
to
8a3a20e
Compare
@@ -2785,6 +2787,17 @@ def set(self, locs, values, check=False): | |||
|
|||
self.values[locs] = values | |||
|
|||
def eval(self, func, other, try_cast=False, **kwargs): | |||
block = super(DatetimeBlock, self).eval(func, other, try_cast=try_cast, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens here if instead of calling super eval you delegate to self.holder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In DatetimeBlock - datetime
case and DatetimeBlock + DateOffset
case, it require the result block changing (DatetimeBlock
-> TimeDeltaBlock
, ObjectBlock
-> DatetimeBlock
or DatetimeTZBlock
).
Delegating to self.holder
couldn't change block type. Besides, self._holder
is unaware of other
.
In fact, what should be blamed is the last line in Block.eval
carelessly make a wrong block. A proper fix should be
- spliting a method to make block regarding to
other
. - override
make_block
withother
as argument. But this will lead toself.make_block
returning another type of block.
pandas/pandas/core/internals/__init__.py
Line 1456 in 0828c25
return [self.make_block(result)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, what should be blamed is the last line in Block.eval carelessly make a wrong block. A proper fix should be
There has been an effort recently to fix mismatches between the behaviors of arithmetic ops in Index vs Series vs DataFrame. For the timedelta and datetime cases, the canonical behavior is defined in TimedeltaIndex/DatetimeIndex.
The best case here to to avoid making this fix in core.internals
but instead do it in core.DataFrame (or core.ops). The next base case is to have the FooBlock dispatch to the appropriate Index subclass to ensure the behavior is canonical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will be great, but it may need a big rewrite of eval
and _try_coerce_args
, cause now values
is all coerced to ndarray[i8]
for DatetimeLike
. (May be benefited from numpy bi-ops, but confused to convert back.)
Is there an issue to xfer? This PR would depand on it.
@@ -1238,7 +1238,6 @@ class TestCanHoldElement(object): | |||
(2**63, 'complex128'), | |||
(True, 'bool'), | |||
(np.timedelta64(20, 'ns'), '<m8[ns]'), | |||
(np.datetime64(20, 'ns'), '<M8[ns]'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removed? (Besides no is exempt sub working)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR lead to datetime64 OP datetime64 = timedelta64
, regardless of OP
. Although meaningless, it will result datetime64 + datetime64 = timedelta64
.
The assumption datetime64 OP datetime64 = datetime64
is meaningless for all OP
. If allow this line, it will require extra OP
check in DatetimeBlock.eval
, which I don't think worth.
(operator.mod, 'complex128'), | ||
(operator.mod, '<m8[ns]'), | ||
(operator.pow, 'bool'), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls avoid formatting changes where convenient, makes it harder for reviewers to identify the actual changes.
'<M8[ns]' - '<M8[ns]' = '<m8[ns]' , other binop are invalid.
c82ea08
to
21c1c7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a lot of concerns over the changes in the impl here.
@@ -2737,7 +2739,7 @@ def _try_coerce_args(self, values, other): | |||
|
|||
def _try_coerce_result(self, result): | |||
""" reverse of try_coerce_args """ | |||
if isinstance(result, np.ndarray): | |||
if isinstance(result, (np.ndarray, Block)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a good change, please remove. the point is that these are scalars/arrays NEVER blocks.
@@ -2920,6 +2933,8 @@ def _try_coerce_result(self, result): | |||
if isinstance(result, np.ndarray): | |||
if result.dtype.kind in ['i', 'f', 'O']: | |||
result = result.astype('M8[ns]') | |||
elif isinstance(result, Block): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, NEVER do this
Two of the bugs this addresses (#21610, #8554) were closed by #22163. The remaining issue #12437 is still open. @holymonson can this be simplified if only aimed at that issue? |
This |
git diff upstream/master -u -- "*.py" | flake8 --diff
Hang up until
Block.eval
reformed.