Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make khash its own extension #18472

Merged
merged 3 commits into from
Nov 29, 2017
Merged

Conversation

jbrockmendel
Copy link
Member

The path towards cythonize continues.

@jreback
Copy link
Contributor

jreback commented Nov 24, 2017

this is a header only, I don't think this is necessary. I was referring to skiplist becoming its own module, khash is fine as is.

@jbrockmendel
Copy link
Member Author

I don't think this is necessary. I was referring to skiplist becoming its own module, khash is fine as is.

As discussed, it is necessary for cythonize. In attempts to date, cythonize chokes on khash, skiplist, datetime, and util.

@jbrockmendel
Copy link
Member Author

BTW skiplist push is coming before long, but I'm testing locally before pushing.

@jreback
Copy link
Contributor

jreback commented Nov 25, 2017

well, then you should edit everything to remove the direct imports from khash.h and put the declarations in .pxd.

@jbrockmendel
Copy link
Member Author

well, then you should edit everything to remove the direct imports from khash.h and put the declarations in .pxd.

OK. I'm not clear on what you're asking for. The only things that have "direct imports from khash.h" (based on a quick grep) are "src/klib/khash_python.h" and "src/parser/tokenizer.h". The only place that has directly imports from khash_python.h is khash.pxd.

@codecov
Copy link

codecov bot commented Nov 25, 2017

Codecov Report

Merging #18472 into master will decrease coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18472      +/-   ##
==========================================
- Coverage   91.36%   91.32%   -0.04%     
==========================================
  Files         163      163              
  Lines       49700    49717      +17     
==========================================
- Hits        45407    45404       -3     
- Misses       4293     4313      +20
Flag Coverage Δ
#multiple 89.12% <ø> (-0.03%) ⬇️
#single 40.51% <ø> (+0.79%) ⬆️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/io/clipboard/__init__.py 56.86% <0%> (-19.33%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/io/clipboard/clipboards.py 32.18% <0%> (+8.13%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aaee541...6741d4c. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 25, 2017

Codecov Report

Merging #18472 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18472      +/-   ##
==========================================
- Coverage   91.36%   91.33%   -0.03%     
==========================================
  Files         163      164       +1     
  Lines       49700    49819     +119     
==========================================
+ Hits        45407    45503      +96     
- Misses       4293     4316      +23
Flag Coverage Δ
#multiple 89.13% <ø> (-0.01%) ⬇️
#single 40.8% <ø> (+1.07%) ⬆️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/io/clipboard/__init__.py 56.86% <0%> (-19.33%) ⬇️
pandas/core/dtypes/missing.py 90.24% <0%> (-0.5%) ⬇️
pandas/core/panel.py 96.85% <0%> (-0.29%) ⬇️
pandas/core/indexes/datetimelike.py 96.91% <0%> (-0.2%) ⬇️
pandas/core/frame.py 97.81% <0%> (-0.09%) ⬇️
pandas/core/series.py 94.76% <0%> (-0.08%) ⬇️
pandas/core/dtypes/cast.py 88.52% <0%> (-0.08%) ⬇️
pandas/tseries/offsets.py 96.94% <0%> (-0.07%) ⬇️
pandas/core/base.py 96.54% <0%> (-0.03%) ⬇️
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aaee541...b584985. Read the comment docs.

setup.py Outdated
@@ -335,6 +335,7 @@ class CheckSDist(sdist_class):
'pandas/_libs/index.pyx',
'pandas/_libs/algos.pyx',
'pandas/_libs/join.pyx',
'pandas/_libs/khash.pyx',
'pandas/_libs/indexing.pyx',
'pandas/_libs/interval.pyx',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hashtable is still coupled to khash_python.h

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm still unclear on the transitive-dependencies policy. A bunch of other things are coupled to hashtable, so are they also coupled to khash_python.h?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well this is exactly the issue
you now have an actual module rather than header only so you shouldn’t import from the c files and instead put the defa inside the .pxd

you may want to find an example how to do this
i don’t like the idea of a dummy module

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would getting rid of the .pyx file solve this?

@jreback jreback added the Clean label Nov 25, 2017
@jbrockmendel
Copy link
Member Author

One down, thee to go. Let’s knock this out this week.

@jreback
Copy link
Contributor

jreback commented Nov 27, 2017

i don't like the dummy file here.

@jreback
Copy link
Contributor

jreback commented Nov 28, 2017

does this help with cythonize?

@jbrockmendel
Copy link
Member Author

does this help with cythonize?

Good question. With the pyx file it was a "yes". I'll have to double-check without.

@jbrockmendel
Copy link
Member Author

Yes, this helps with cythonize. i.e. the setup that I got working with cythonize also works without the dummy .pyx file.

FYI, the steps to get cythonize working (not necessarily the only way, just the way that I found):

  • Move khash, skiplist, datetime, util up out of src to fix ImportErrors. (update tslibs cimports to relative cimports)
  • Fix issubclass problem generated in inference.pyx caused by mismatch between cython's numpy pxd files and src/numpy.pxd.(Numpy C-API issubdtype? numpy/numpy#10063 (comment))
  • Add cythonize in setup.py

@@ -10,7 +10,7 @@ np.import_array()

from util cimport is_string_object, get_nat

from khash cimport (
from pandas._libs.khash cimport (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do relative imports here?

from ..khash cimport ? not sure. but this is ok for now.

@jreback jreback added this to the 0.22.0 milestone Nov 29, 2017
@jreback jreback merged commit e459658 into pandas-dev:master Nov 29, 2017
@jbrockmendel jbrockmendel deleted the pre_cythonize2 branch December 8, 2017 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants