simplify skiplist inclusion/cimport to be more cythonize-friendly #18420
Codecov Report
@@ Coverage Diff @@
## master #18420 +/- ##
=======================================
Coverage 91.33% 91.33%
=======================================
Files 163 163
Lines 49801 49801
=======================================
Hits 45487 45487
Misses 4314 4314
so rather than making this even more obscure. why don't you just make skiplist.pyx a separate cython module? and leave the pxd (move out of src is ok).
Works for me. We'll need to do the same for khash anyway. Any objection to lumping that into this PR?
no.
As I look at it more closely, this is going to take non-trivial refactoring to make _libs.skiplist and make the appropriate cimports available in the appropriate places. Given that it currently
skiplist should be a separate
separate module |
list next
list width
# cdef public:
#     double_t value
why are you commenting these out? just to show what the structure is? I would remove this
aren't you supposed to declare the member variables in the .pyx, and not the .pxd?
aren't you supposed to declare the member variables in the .pyx, and not the .pxd?
AFAICT the rule is that if the class is declared in the .pxd, the member variables need to be declared there (and only there)
why are you commenting these out? just to show what the structure is? I would remove this
I leave these in place so we don't have to go back and check the pxd
Py_ssize_t size, maxlevels
Node head
# cdef:
#     Py_ssize_t size, maxlevels
same
ok this looks fine. can you run some of the rolling benchmarks (quantile, median) which use this. should be no change.
Ran a few times; this looks consistent. I guess when things are exposed via pxd, the compiler can't be quite as aggressive as it could be if all uses are internal?
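For readers unfamiliar with why the rolling median and quantile benchmarks exercise the skiplist: the window's values are kept in sorted order so the median can be read off by position while values enter and leave. The following is only an illustrative sketch, not the pandas implementation. It swaps the skiplist for a plain sorted Python list maintained with `bisect` (O(n) insert/remove rather than the skiplist's O(log n)), it ignores NaN handling and minimum-period logic, and the function name `rolling_median` is hypothetical.

```python
from bisect import bisect_left, insort


def rolling_median(values, window):
    """Median of each full trailing window of `values` (no NaN handling)."""
    if window <= 0:
        raise ValueError("window must be positive")
    ordered = []  # values of the current window, kept sorted
    out = []
    for i, x in enumerate(values):
        insort(ordered, x)  # plays the role of skiplist_insert
        if i >= window:
            # Drop the value that just left the window
            # (plays the role of skiplist_remove).
            old = values[i - window]
            del ordered[bisect_left(ordered, old)]
        if i >= window - 1:
            mid = window // 2
            if window % 2:
                out.append(float(ordered[mid]))
            else:
                out.append((ordered[mid - 1] + ordered[mid]) / 2)
    return out


print(rolling_median([1, 5, 2, 8, 3], 3))
```

Because every step of the loop inserts and removes one value, any per-call overhead in the insert/remove routines shows up directly in these benchmarks.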
concur
[master vs. PR benchmark output not preserved]
pandas/_libs/window.pyx
skiplist_insert,
skiplist_remove)

cdef extern from "../src/headers/math.h":
i wonder if this makes any difference (the sqrt & log above)
It'd be pretty weird (isn't the path there incorrect anyway?) but no harm in reverting it just to see...
Since skiplist is only used in window, and used to be "include"d anyway, I think pasting it at the bottom of window is a decent fallback option.
Pending double-checking... reverting to getting sqrt from src/headers/math.h seems to make up the difference and then some. It looks like the sqrt exposed in libc.math is not declared with nogil; that could plausibly make a difference.
... or I could accidentally be profiling one version via SSH and another locally. Never mind.
OK, reverting the sqrt import does appear to have closed the gap.
taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b rolling -b quantile -b median
[...]
BENCHMARKS NOT SIGNIFICANTLY CHANGED.
taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b rolling -b quantile -b median
[...]
before after ratio
[2dbf2a6a] [4432b71c]
- 2.61s 2.35s 0.90 rolling.DataframeRolling.time_rolling_cov
- 2.71s 2.23s 0.82 rolling.DataframeRolling.time_rolling_corr
taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b rolling -b quantile -b median
[...]
BENCHMARKS NOT SIGNIFICANTLY CHANGED.
I'll push with this change and open an issue to check for other places where libc.sqrt is cimported instead of the nogil version.
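As a rough way to read the ratio column in the asv output above, the sketch below recomputes after/before for the two reported rows and applies the `-f 1.1` threshold in both directions. This is a simplified reading of the factor only: asv's actual comparison may apply additional statistical checks, and the helper name `classify` is hypothetical.

```python
def classify(before, after, factor=1.1):
    """Return (ratio, verdict) for a before/after timing pair."""
    ratio = after / before
    if ratio > factor:
        return ratio, "slower"
    if ratio < 1 / factor:
        return ratio, "faster"
    return ratio, "not significant"


# Timings in seconds, copied from the middle asv run quoted above.
results = {
    "rolling.DataframeRolling.time_rolling_cov": (2.61, 2.35),
    "rolling.DataframeRolling.time_rolling_corr": (2.71, 2.23),
}

for name, (before, after) in results.items():
    ratio, verdict = classify(before, after)
    print(f"{ratio:.2f}  {verdict}  {name}")
```

With a factor of 1.1, a ratio has to fall below roughly 0.909 to be flagged as an improvement, so the 0.90 row only just crosses the threshold.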
so the sqrt nogil was the culprit?
verified locally this looks good.
AFAICT yes. I added a note in #18125 to track down other places where we might be using the wrong one.
Can't be. From the point of view of Cython, both use "nogil", thus are entirely the same, and should result in the same C code. The only potential difference is the order in which C header files are included, but your code had better not depend on that...
Next up in the moving-towards-cythonize parade... _libs.window has both include "skiplist.pyx" and from skiplist cimport. The cimport refers to src/skiplist.pxd, which is just passing through declarations from src/skiplist.h. This PR removes skiplist.pxd and moves its contents into src/skiplist.pyx.
Background on motivation: when cythonize is used in setup.py it chokes on cimports from the src/ directory. After some troubleshooting to avoid this choking, I decided to take the alternate route of just avoiding cimporting from there. This is the first of four cimports to remove.
git diff upstream/master -u -- "*.py" | flake8 --diff