Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Test on Cython 3.0 on numpydev #46029

Merged
merged 35 commits into from
Mar 28, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
cdd46ba
CI: Test on Cython 3.0 on numpydev
lithomas1 Feb 17, 2022
27a674b
try something
lithomas1 Feb 17, 2022
d22e588
try something else
lithomas1 Feb 17, 2022
646dced
update
lithomas1 Feb 24, 2022
8328092
Merge branch 'main' of https://github.com/pandas-dev/pandas into cyth…
lithomas1 Feb 24, 2022
4a87227
install cython from master
lithomas1 Feb 26, 2022
d5a9c7e
try to get further in the test suite
lithomas1 Feb 27, 2022
51ce2b2
Merge branch 'main' into cython-alpha-testing
lithomas1 Feb 27, 2022
5aa9f13
revert back to cython master branch
lithomas1 Feb 27, 2022
de88da9
try to get farther in the test suite
lithomas1 Mar 1, 2022
ae3e6fa
fixes to offsets.pyx
lithomas1 Mar 1, 2022
89891f7
Merge branch 'cython-alpha-testing' of github.com:lithomas1/pandas in…
lithomas1 Mar 1, 2022
dc7517a
fix some more binops
lithomas1 Mar 2, 2022
416bca5
fix some more tests
lithomas1 Mar 5, 2022
9599b74
Merge branch 'main' into cython-alpha-testing
lithomas1 Mar 5, 2022
127e97b
workaround cython bug?
lithomas1 Mar 8, 2022
84d742d
Merge branch 'cython-alpha-testing' of github-other.com:lithomas1/pan…
lithomas1 Mar 8, 2022
6a2cb90
Merge branch 'main' of https://github.com/pandas-dev/pandas into cyth…
lithomas1 Mar 8, 2022
fd22fdb
Merge branch 'main' of https://github.com/pandas-dev/pandas into cyth…
lithomas1 Mar 11, 2022
fa2c1e2
Merge branch 'main' of https://github.com/pandas-dev/pandas into cyth…
lithomas1 Mar 13, 2022
0e3b56a
Merge branch 'main' into cython-alpha-testing
lithomas1 Mar 14, 2022
a154bb2
Merge branch 'cython-alpha-testing' of github-other.com:lithomas1/pan…
lithomas1 Mar 14, 2022
7259c4c
standardize todos
lithomas1 Mar 14, 2022
f123a47
switch to master branch
lithomas1 Mar 16, 2022
313b8eb
Merge branch 'main' into cython-alpha-testing
lithomas1 Mar 16, 2022
00c492d
address comments
lithomas1 Mar 21, 2022
71cee4e
Merge branch 'main' of https://github.com/pandas-dev/pandas into cyth…
lithomas1 Mar 21, 2022
7f10301
Merge branch 'cython-alpha-testing' of github-other.com:lithomas1/pan…
lithomas1 Mar 21, 2022
92325d7
handle overflow correctly
lithomas1 Mar 22, 2022
eaea742
address comments
lithomas1 Mar 25, 2022
db69f58
Merge branch 'main' of https://github.com/pandas-dev/pandas into cyth…
lithomas1 Mar 25, 2022
f467452
test something
lithomas1 Mar 25, 2022
7bc1609
Merge branch 'pandas-dev:main' into cython-alpha-testing
lithomas1 Mar 26, 2022
9557ae9
Merge branch 'pandas-dev:main' into cython-alpha-testing
lithomas1 Mar 28, 2022
efd227a
Merge branch 'main' into cython-alpha-testing
jreback Mar 28, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion ci/deps/actions-310-numpydev.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,8 @@ dependencies:
- pytz
- pip
- pip:
- cython==0.29.24 # GH#34014
#- cython # TODO: don't install from master after Cython 3.0.0a11 is released
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob should make an issue for this

- "git+https://github.com/cython/cython.git@master"
- "--extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple"
- "--pre"
- "numpy"
Expand Down
18 changes: 18 additions & 0 deletions pandas/_libs/interval.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -404,6 +404,8 @@ cdef class Interval(IntervalMixin):
):
return Interval(self.left + y, self.right + y, closed=self.closed)
elif (
# __radd__ pattern
# TODO(cython3): remove this
isinstance(y, Interval)
and (
isinstance(self, numbers.Number)
Expand All @@ -414,6 +416,15 @@ cdef class Interval(IntervalMixin):
return Interval(y.left + self, y.right + self, closed=y.closed)
return NotImplemented

def __radd__(self, other):
if (
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge fan of doing these checks(probably has an adverse effect on performance). Unfortunately, Cython gets stuck in a infinite recursion look from one binop to the reverse binop, when the reverse binop calls the regular binop and the regular binop raises NotImplemented.

Not sure if this is intended by Cython or a bug, but this works around it.
(The code path is not hit for Cython's < 3.0, so I think its fine to land as is and open an issue as a followup).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if you implement infinite recursion, then you get infinite recursion. No surprise here.

Consider extracting a separate function for the actual implementation instead, and make both special methods call that, in their own way. Then you can catch the "can't handle that" cases in one place, and otherwise run through the addition code and be done, rather than risking infinite recursion in the first place.

isinstance(other, numbers.Number)
or PyDelta_Check(other)
or is_timedelta64_object(other)
):
return Interval(self.left + other, self.right + other, closed=self.closed)
return NotImplemented

def __sub__(self, y):
if (
isinstance(y, numbers.Number)
Expand All @@ -427,9 +438,16 @@ cdef class Interval(IntervalMixin):
if isinstance(y, numbers.Number):
return Interval(self.left * y, self.right * y, closed=self.closed)
elif isinstance(y, Interval) and isinstance(self, numbers.Number):
# __radd__ semantics
# TODO(cython3): remove this
return Interval(y.left * self, y.right * self, closed=y.closed)
return NotImplemented

def __rmul__(self, other):
if isinstance(other, numbers.Number):
return Interval(self.left * other, self.right * other, closed=self.closed)
return NotImplemented

def __truediv__(self, y):
if isinstance(y, numbers.Number):
return Interval(self.left / y, self.right / y, closed=self.closed)
Expand Down
25 changes: 25 additions & 0 deletions pandas/_libs/tslibs/nattype.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -143,6 +143,7 @@ cdef class _NaT(datetime):

def __add__(self, other):
if self is not c_NaT:
# TODO(cython3): remove this it moved to __radd__
# cython __radd__ semantics
self, other = other, self

Expand All @@ -169,6 +170,9 @@ cdef class _NaT(datetime):
# Includes Period, DateOffset going through here
return NotImplemented

def __radd__(self, other):
lithomas1 marked this conversation as resolved.
Show resolved Hide resolved
return self.__add__(other)

def __sub__(self, other):
# Duplicate some logic from _Timestamp.__sub__ to avoid needing
# to subclass; allows us to @final(_Timestamp.__sub__)
Expand All @@ -177,6 +181,7 @@ cdef class _NaT(datetime):

if self is not c_NaT:
# cython __rsub__ semantics
# TODO(cython3): remove __rsub__ logic from here
self, other = other, self
is_rsub = True

Expand All @@ -200,6 +205,8 @@ cdef class _NaT(datetime):
result.fill("NaT")
return result

# __rsub__ logic here
# TODO(cython3): remove this, move above code out of ``if not is_rsub`` block
# timedelta64 - NaT we have to treat NaT as timedelta64
# for this to be meaningful, and the result is timedelta64
result = np.empty(other.shape, dtype="timedelta64[ns]")
Expand All @@ -220,6 +227,24 @@ cdef class _NaT(datetime):
# Includes Period, DateOffset going through here
return NotImplemented

def __rsub__(self, other):
if util.is_array(other):
if other.dtype.kind == "m":
# timedelta64 - NaT we have to treat NaT as timedelta64
# for this to be meaningful, and the result is timedelta64
result = np.empty(other.shape, dtype="timedelta64[ns]")
result.fill("NaT")
return result

elif other.dtype.kind == "M":
lithomas1 marked this conversation as resolved.
Show resolved Hide resolved
# We treat NaT as a datetime, so regardless of whether this is
# NaT - other or other - NaT, the result is timedelta64
result = np.empty(other.shape, dtype="timedelta64[ns]")
result.fill("NaT")
return result
# other cases are same, swap operands is allowed even though we subtract because this is NaT
return self.__sub__(other)

def __pos__(self):
return NaT

Expand Down
24 changes: 24 additions & 0 deletions pandas/_libs/tslibs/offsets.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -450,6 +450,7 @@ cdef class BaseOffset:
def __add__(self, other):
if not isinstance(self, BaseOffset):
# cython semantics; this is __radd__
# TODO(cython3): remove this, this moved to __radd__
return other.__add__(self)

elif util.is_array(other) and other.dtype == object:
Expand All @@ -460,19 +461,26 @@ cdef class BaseOffset:
except ApplyTypeError:
return NotImplemented

def __radd__(self, other):
return self.__add__(other)

def __sub__(self, other):
if PyDateTime_Check(other):
raise TypeError('Cannot subtract datetime from offset.')
elif type(other) == type(self):
return type(self)(self.n - other.n, normalize=self.normalize,
**self.kwds)
elif not isinstance(self, BaseOffset):
# TODO(cython3): remove, this moved to __rsub__
# cython semantics, this is __rsub__
return (-other).__add__(self)
else:
# e.g. PeriodIndex
return NotImplemented

def __rsub__(self, other):
return (-self).__add__(other)

def __call__(self, other):
warnings.warn(
"DateOffset.__call__ is deprecated and will be removed in a future "
Expand All @@ -499,10 +507,14 @@ cdef class BaseOffset:
return type(self)(n=other * self.n, normalize=self.normalize,
**self.kwds)
elif not isinstance(self, BaseOffset):
# TODO(cython3): remove this, this moved to __rmul__
# cython semantics, this is __rmul__
return other.__mul__(self)
return NotImplemented

def __rmul__(self, other):
return self.__mul__(other)

def __neg__(self):
# Note: we are deferring directly to __mul__ instead of __rmul__, as
# that allows us to use methods that can go in a `cdef class`
Expand Down Expand Up @@ -887,6 +899,7 @@ cdef class Tick(SingleConstructorOffset):

def __mul__(self, other):
if not isinstance(self, Tick):
# TODO(cython3), remove this, this moved to __rmul__
# cython semantics, this is __rmul__
return other.__mul__(self)
if is_float_object(other):
Expand All @@ -900,6 +913,9 @@ cdef class Tick(SingleConstructorOffset):
return new_self * other
return BaseOffset.__mul__(self, other)

def __rmul__(self, other):
return self.__mul__(other)

def __truediv__(self, other):
if not isinstance(self, Tick):
# cython semantics mean the args are sometimes swapped
Expand All @@ -908,9 +924,14 @@ cdef class Tick(SingleConstructorOffset):
result = self.delta.__truediv__(other)
return _wrap_timedelta_result(result)

def __rtruediv__(self, other):
result = self.delta.__rtruediv__(other)
return _wrap_timedelta_result(result)

def __add__(self, other):
if not isinstance(self, Tick):
# cython semantics; this is __radd__
# TODO(cython3): remove this, this moved to __radd__
return other.__add__(self)

if isinstance(other, Tick):
Expand All @@ -928,6 +949,9 @@ cdef class Tick(SingleConstructorOffset):
f"the add operation between {self} and {other} will overflow"
) from err

def __radd__(self, other):
lithomas1 marked this conversation as resolved.
Show resolved Hide resolved
return self.__add__(other)

def _apply(self, other):
# Timestamp can handle tz and nano sec, thus no need to use apply_wraps
if isinstance(other, _Timestamp):
Expand Down
10 changes: 10 additions & 0 deletions pandas/_libs/tslibs/period.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -1710,6 +1710,7 @@ cdef class _Period(PeriodMixin):
def __add__(self, other):
if not is_period_object(self):
# cython semantics; this is analogous to a call to __radd__
# TODO(cython3): remove this
if self is NaT:
return NaT
return other.__add__(self)
Expand All @@ -1734,9 +1735,13 @@ cdef class _Period(PeriodMixin):

return NotImplemented

def __radd__(self, other):
return self.__add__(other)

def __sub__(self, other):
if not is_period_object(self):
# cython semantics; this is like a call to __rsub__
# TODO(cython3): remove this
if self is NaT:
return NaT
return NotImplemented
Expand All @@ -1760,6 +1765,11 @@ cdef class _Period(PeriodMixin):

return NotImplemented

def __rsub__(self, other):
if other is NaT:
return NaT
return NotImplemented

def asfreq(self, freq, how='E') -> "Period":
"""
Convert Period to desired frequency, at the start or end of the interval.
Expand Down
28 changes: 27 additions & 1 deletion pandas/_libs/tslibs/timestamps.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -280,13 +280,19 @@ cdef class _Timestamp(ABCTimestamp):
return other.tzinfo is not None
return other.tzinfo is None

@cython.overflowcheck(True)
def __add__(self, other):
cdef:
int64_t nanos = 0
lithomas1 marked this conversation as resolved.
Show resolved Hide resolved

if is_any_td_scalar(other):
nanos = delta_to_nanoseconds(other)
result = type(self)(self.value + nanos, tz=self.tzinfo)
try:
result = type(self)(self.value + nanos, tz=self.tzinfo)
except OverflowError:
# Use Python ints
lithomas1 marked this conversation as resolved.
Show resolved Hide resolved
# Hit in test_tdi_add_overflow
result = type(self)(int(self.value) + int(nanos), tz=self.tzinfo)
if result is not NaT:
result._set_freq(self._freq) # avoid warning in constructor
return result
Expand All @@ -307,9 +313,16 @@ cdef class _Timestamp(ABCTimestamp):

elif not isinstance(self, _Timestamp):
# cython semantics, args have been switched and this is __radd__
# TODO(cython3): remove this it moved to __radd__
return other.__add__(self)
return NotImplemented

def __radd__(self, other):
# Have to duplicate checks to avoid infinite recursion due to NotImplemented
if is_any_td_scalar(other) or is_integer_object(other) or is_array(other):
return self.__add__(other)
return NotImplemented

def __sub__(self, other):

if is_any_td_scalar(other) or is_integer_object(other):
Expand All @@ -336,6 +349,7 @@ cdef class _Timestamp(ABCTimestamp):
and (PyDateTime_Check(other) or is_datetime64_object(other))):
# both_timestamps is to determine whether Timedelta(self - other)
# should raise the OOB error, or fall back returning a timedelta.
# TODO(cython3): clean out the bits that moved to __rsub__
both_timestamps = (isinstance(other, _Timestamp) and
isinstance(self, _Timestamp))
if isinstance(self, _Timestamp):
Expand Down Expand Up @@ -366,10 +380,22 @@ cdef class _Timestamp(ABCTimestamp):
elif is_datetime64_object(self):
# GH#28286 cython semantics for __rsub__, `other` is actually
# the Timestamp
# TODO(cython3): remove this, this moved to __rsub__
return type(other)(self) - other

return NotImplemented

def __rsub__(self, other):
if PyDateTime_Check(other):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any problem with return -(self - other)? i guess thats the overflow in the stata tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually copy-pasted from the sub code block above(including the contents). This actually shows up in other tests in addition to the stata ones(which is why I needed to copy-paste it).

return -(self - other) might work, but I think the tests would still fail, as the expected behavior there is to return a datetime.timedelta.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Took me a few tries to find a relevant example:

pd.Timestamp(2261, 1, 1).to_pydatetime() - pd.Timestamp(1677, 9, 30)

returns a pytimedelta. There's a small hiccup (doesn't matter for this PR, but i want to write it down before I forget it) if the Timestamp has nanos:

pd.Timestamp(2261, 1, 1).to_pydatetime() - pd.Timestamp(1677, 9, 30, nanosecond=4)

We lose the nanoseconds. In principle we could call to_pydatetime() and get a warning if appropropriate.

try:
return type(self)(other) - self
except (OverflowError, OutOfBoundsDatetime) as err:
# We get here in stata tests, fall back to stdlib datetime
# method and return stdlib timedelta object
pass
elif is_datetime64_object(other):
return type(self)(other) - self
return NotImplemented
# -----------------------------------------------------------------

cdef int64_t _maybe_convert_value_to_local(self):
Expand Down