Very minor PR, mostly an excuse to get in touch!
I was looking at some KDTree implementations, especially one that works well with numba. I found your comment in this issue: numba/numba-scipy#36
... which obviously brought me here!
Some background:
I'm coming from a geospatial use case: I want to do some (2D for now) unstructured mesh regridding, which involves computing areal overlaps between overlapping cells. Most (vector-based) GIS stuff seems awfully slow, and some existing solutions come with massive binary dependencies, so I figured I'd have a look at it myself. A KDTree with a radius query seems like a decent starting point to create some sort of short list of possibly overlapping cells. I greatly prefer numba since it's basically seamless to Python, and it's much nicer to distribute than e.g. Cython; the JIT'ing also allows arbitrary area weighting functions at runtime, so it's great all around.
Anyway, numba-neighbors is looking pretty spiffy! I've been looking at these:
And for my ad hoc benchmark (but fairly realistic for the use case), numba-neighbors beats them by a fairly wide margin (> 30% on cKDTree, more so on the others). Might be interesting to redo this analysis at some point: https://jakevdp.github.io/blog/2013/04/29/benchmarking-nearest-neighbor-searches-in-python/
Anyway, my main issue with the other methods is how they return the indices. I don't really want to be stuck with a fixed number of neighbors, so sklearn's query_radius is what I want... except it returns an array of arrays, which probably isn't going to be great for further numba functions to work on. I could copy and edit some Cython stuff, but as mentioned numba is just much nicer to distribute. So with numba-neighbors it should be pretty easy to write a custom query function for my goals!
Wrapping up: numba-neighbors seems useful, I'd like to use it. I saw you did put license='MIT' in the setup.py -- but I had to look for it (if only briefly).
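To make the array-of-arrays concern with radius queries concrete: one common workaround is to flatten the ragged result into two flat arrays, a CSR-style `indices` / `indptr` pair, which compiled code can loop over easily. The sketch below is only an illustration under assumptions. The helper name `ragged_to_csr` is hypothetical and not part of numba-neighbors or sklearn, and the `ragged` input is a hand-written stand-in for what a radius query might return.

```python
import numpy as np


def ragged_to_csr(neighbors):
    """Flatten a ragged sequence of index arrays into (indices, indptr).

    Hypothetical helper for illustration; not an API of any library.
    """
    # Number of neighbors found for each query point.
    counts = np.array([len(n) for n in neighbors], dtype=np.int64)
    # indptr[i]:indptr[i + 1] delimits the neighbors of query point i.
    indptr = np.zeros(len(neighbors) + 1, dtype=np.int64)
    np.cumsum(counts, out=indptr[1:])
    if indptr[-1] == 0:
        # No neighbors at all: avoid concatenating an empty list.
        indices = np.empty(0, dtype=np.int64)
    else:
        indices = np.concatenate(
            [np.asarray(n, dtype=np.int64) for n in neighbors]
        )
    return indices, indptr


# Stand-in for a radius-query result: three query points with
# 2, 0 and 3 neighbors respectively (made-up indices).
ragged = [
    np.array([4, 7]),
    np.array([], dtype=np.int64),
    np.array([1, 2, 9]),
]
indices, indptr = ragged_to_csr(ragged)
```

The neighbors of query point `i` are then `indices[indptr[i]:indptr[i + 1]]`. Both arrays are plain contiguous integer arrays, so they can be passed into further jitted functions without any object arrays involved.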