Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify skiplist inclusion/cimport to be more cythonize-friendly #18420

Merged
merged 6 commits into from
Nov 27, 2017

Conversation

jbrockmendel
Copy link
Member

Next up in the moving-towards-cythonize parade...

_libs.window has both include "skiplist.pyx" and from skiplist cimport. The cimport refers to src/skiplist.pxd, which is just passing through declarations from src/skiplist.h. This PR removes skiplist.pxd and moves its contents into src/skiplist.pyx.

Background on motivation: when cythonize is used in setup.py it chokes on cimports from the src/ directory. After some troubleshooting to avoid this choking, I decided to take the alternate route of just avoiding cimporting from there. This is the first of four cimports to remove.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@codecov
Copy link

codecov bot commented Nov 22, 2017

Codecov Report

Merging #18420 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #18420   +/-   ##
=======================================
  Coverage   91.33%   91.33%           
=======================================
  Files         163      163           
  Lines       49801    49801           
=======================================
  Hits        45487    45487           
  Misses       4314     4314
Flag Coverage Δ
#multiple 89.13% <ø> (ø) ⬆️
#single 40.77% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d101064...81b1762. Read the comment docs.

@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

so rather than making this even more obscure. why don't you just make skiplist.pyx a separate cython module? and leave the pxd (move out of src is ok).

@jreback jreback added the Clean label Nov 23, 2017
@jbrockmendel
Copy link
Member Author

so rather than making this even more obscure. why don't you just make skiplist.pyx a separate cython module? and leave the pxd (move out of src is ok).

Works for me. We'll need to do the same for khash anyway. Any objection to lumping that into this PR?

@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

Works for me. We'll need to do the same for khash anyway. Any objection to lumping that into this PR?

no.

@jbrockmendel
Copy link
Member Author

As I look at it more closely, this is going to take non-trivial refactoring to make make _libs.skiplist and make the appropriate cimports available in the appropriate places. Given that it currently included into window.pyx, why not just paste it into the bottom of that file? It's only 146 lines.

@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

skiplist should be a separate
khash is header only so this is different

@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

separate module

list next
list width
# cdef public:
# double_t value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you commenting these out? just to show what the structure is? I would remove this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't you supposed to declare the member variables in the .pyx, and not the .pxd?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't you supposed to declare the member variables in the .pyx, and not the .pxd?

AFAICT the rule is that if the class is declared in the .pxd, the member variables need to be declared there (and only there)

why are you commenting these out? just to show what the structure is? I would remove this

I leave these in place so we don't have to go back and check the pxd

Py_ssize_t size, maxlevels
Node head
# cdef:
# Py_ssize_t size, maxlevels
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@jreback
Copy link
Contributor

jreback commented Nov 25, 2017

ok this looks fine. can you run some of the rolling benchmarks (quantile, median) which use this. should be no change.

@jreback jreback added this to the 0.22.0 milestone Nov 25, 2017
@jbrockmendel
Copy link
Member Author

taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b rolling -b quantile -b median
[...]
     [2dbf2a6a]       [0fde0459]
+       124±0.9ms        140±0.8ms     1.13  rolling.SeriesRolling.time_rolling_quantile_median

Ran a few times; this looks consistent. I guess when things are exposed via pxd, the compiler cant be quite as aggressive as it could be if all uses are internal?

@jreback
Copy link
Contributor

jreback commented Nov 26, 2017

concur

master

In [1]: np.random.seed(1234); df = pd.DataFrame({'a': np.random.random(100000)})

In [2]: %timeit df.rolling(10).median()
59.6 ms ± 3.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit df.rolling(10).median()
58.4 ms ± 748 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit df.rolling(10).quantile(.1)
140 ms ± 4.31 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [5]: %timeit df.rolling(10).quantile(.1)
139 ms ± 4.74 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

PR

In [1]: np.random.seed(1234); df = pd.DataFrame({'a': np.random.random(100000)})

In [2]: %timeit df.rolling(10).median()
60.4 ms ± 3.98 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit df.rolling(10).median()
63.7 ms ± 4.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit df.rolling(10).quantile(.1)
162 ms ± 4.49 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [5]: %timeit df.rolling(10).quantile(.1)
159 ms ± 5.95 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

skiplist_insert,
skiplist_remove)

cdef extern from "../src/headers/math.h":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if this makes any difference (the sqrt & log above)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be pretty weird (isn't the path there incorrect anyway?) but no harm in reverting it just to see...

Since skiplist is only used in window, and used to be "include"d anyway, I think pasting it at the bottom of window is a decent fallback option.

Copy link
Member Author

@jbrockmendel jbrockmendel Nov 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending double-checking... reverting to getting sqrt from src/headers/math.h seems to make up the difference and then some. It looks like the sqrt exposed in libc.math is not declared with nogil; that could plausibly make a difference. ... or I could accidentally be profiling one version via SSH and another locally. Never mind.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, reverting the sqrt import does appear to have closed the gap.

taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b rolling -b quantile -b median
[...]
BENCHMARKS NOT SIGNIFICANTLY CHANGED.

taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b rolling -b quantile -b median
[...]
       before           after         ratio
     [2dbf2a6a]       [4432b71c]
-           2.61s            2.35s     0.90  rolling.DataframeRolling.time_rolling_cov
-           2.71s            2.23s     0.82  rolling.DataframeRolling.time_rolling_corr

taskset 2 asv continuous -f 1.1 -E virtualenv master HEAD -b rolling -b quantile -b median
[...]
BENCHMARKS NOT SIGNIFICANTLY CHANGED.

I'll push with this change and open an issue to check for other places where libc.sqrt is cimported instead of the nogil version.

@jreback
Copy link
Contributor

jreback commented Nov 27, 2017

so the sqrt nogil was the culprit?

@jreback jreback merged commit 49ddcd5 into pandas-dev:master Nov 27, 2017
@jreback
Copy link
Contributor

jreback commented Nov 27, 2017

verified locally this looks good.

@jbrockmendel
Copy link
Member Author

so the sqrt nogil was the culprit?

AFAICT yes. I added a note in #18125 to track down other places where we might be using the wrong one.

@scoder
Copy link

scoder commented Dec 2, 2017

so the sqrt nogil was the culprit?

Can't be. From the point of view of Cython, both use "nogil", thus are entirely the same, and should result in the same C code. The only potential difference is the order in which C header files are included, but your code would better not depend on that...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants