BUG: timedeltas.pyx.c varies between builds #60078

Open
3 tasks done
bmwiedemann opened this issue Oct 21, 2024 · 0 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

I extracted this reproducer:


cd ~/rpmbuild/BUILD/pandas-2.2.2 && for i in $(seq 30) ; do setarch -R cython -3 ./pandas/../pandas/../pandas/../pandas/_libs/tslibs/timedeltas.pyx -o pyx.c ; m=$(md5sum pyx.c|cut -c1-20); cp -a pyx.c pyx.c.$m ; echo $m; done | sort | uniq -c

Issue Description

While working on reproducible builds for openSUSE (sponsored by the NLnet NGI0 fund), I found that
the generated pandas/_libs/tslibs/timedeltas.pyx.c file varies between builds of our python-pandas
2.2.2 and 2.2.3 packages (also seen in pandas-2.0.2, possibly in earlier versions too).

The reproducer produces the same hash all 30 times only when cython runs under setarch -R, which disables ASLR.
Without setarch -R, each run randomly produces one of two different results.
The probability of each result also seems to depend on the length of the path string.

      5 7653942dcdd2ad6fd8a5
     25 af0a5fefad3a417a7420
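
For anyone who prefers to script the check, the shell loop above can be approximated in Python. This is only a sketch: the helper names `tally_digests` and `cythonize_once` are made up for illustration, and it assumes `cython` (and `setarch`, if ASLR is to be disabled) are on the PATH, invoked with the same arguments as in the reproducer.

```python
import hashlib
import subprocess
from collections import Counter
from typing import Callable


def tally_digests(generate: Callable[[], bytes], runs: int = 30) -> Counter:
    """Call generate() `runs` times and count each distinct (truncated) MD5 digest."""
    counts: Counter = Counter()
    for _ in range(runs):
        # Keep the first 20 hex characters, like `cut -c1-20` in the shell loop.
        counts[hashlib.md5(generate()).hexdigest()[:20]] += 1
    return counts


def cythonize_once(pyx_path: str, out_path: str = "pyx.c",
                   disable_aslr: bool = False) -> bytes:
    """Hypothetical helper: run cython once and return the generated C source."""
    cmd = ["cython", "-3", pyx_path, "-o", out_path]
    if disable_aslr:
        # Same wrapper as in the reproducer: setarch -R disables ASLR.
        cmd = ["setarch", "-R", *cmd]
    subprocess.run(cmd, check=True)
    with open(out_path, "rb") as fh:
        return fh.read()
```

Calling `tally_digests(lambda: cythonize_once("pandas/_libs/tslibs/timedeltas.pyx"))` from the source directory should give a counter with more than one key whenever the generated file is not deterministic.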

Using

diff -u9 pyx.c.[7a]*

the diff of those files is

--- pyx.c.7653942dcdd2ad6fd8a5 2024-10-21 03:51:25.407942993 +0000
+++ pyx.c.af0a5fefad3a417a7420 2024-10-21 03:52:02.744128704 +0000
@@ -45871,19 +45871,19 @@
 
 static PyObject *__pyx_pf_6pandas_5_libs_6tslibs_10timedeltas_9Timedelta___new__(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v_cls, PyObject *__pyx_v_value, PyObject *__pyx_v_unit, PyObject *__pyx_v_kwargs) {
   PyObject *__pyx_v_unsupported_kwargs = NULL;
   PyObject *__pyx_v_seconds = NULL;
   PyObject *__pyx_v_ns = NULL;
   PyObject *__pyx_v_us = NULL;
   PyObject *__pyx_v_ms = NULL;
   PyObject *__pyx_v_err = NULL;
   PyObject *__pyx_v_msg = NULL;
-  npy_timedelta __pyx_v_new_value;
+  __pyx_t_5numpy_int64_t __pyx_v_new_value;
   NPY_DATETIMEUNIT __pyx_v_reso;
   NPY_DATETIMEUNIT __pyx_v_new_reso;
   PyObject *__pyx_8genexpr2__pyx_v_key = NULL;
   PyObject *__pyx_r = NULL;
   __Pyx_RefNannyDeclarations
   PyObject *__pyx_t_1 = NULL;
   int __pyx_t_2;
   Py_ssize_t __pyx_t_3;
   PyObject *__pyx_t_4 = NULL;
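
To pick out only the lines that actually changed between two generated variants, something like the following could be used. It is a rough equivalent of `diff -u9` based on Python's `difflib`; the function name `changed_lines` is an illustrative assumption, not part of any existing tool.

```python
import difflib
from itertools import islice
from typing import List


def changed_lines(old: str, new: str, context: int = 9) -> List[str]:
    """Return only the added/removed lines of a unified diff between two texts."""
    diff = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="old", tofile="new", n=context, lineterm="",
    )
    # The first two yielded lines are the '---' / '+++' file headers; skip them.
    body = islice(diff, 2, None)
    return [line for line in body if line.startswith(("+", "-"))]
```

Applied to the two pyx.c variants above, this would be expected to report just the `__pyx_v_new_value` declaration, once with each type.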

Expected Behavior

Build results should be deterministic.

Installed Versions

openSUSE Tumbleweed 20241018
Cython-3.0.11
