Spatial bench #30
Force-pushed from f395a7a to 9f15d70
This now changes the default backend to kiddo, which roughly halves the total query time. Kiddo also has an "approximate" option, which is about a further 2.5x faster, but it is so approximate that it throws the results a long way off. There are also some ergonomic changes, including some renames and a usage example up front.
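For context on checking that "approximate" drift: one backend-agnostic way to quantify it is to compare a backend's answer against a brute-force exact lookup. This is just an illustrative sketch (not part of the PR, and it assumes nothing about kiddo's API; `nearest_exact` is a hypothetical helper name):

```rust
/// Brute-force exact nearest neighbour over a small cloud: a reference
/// answer against which an approximate backend's result can be compared.
/// (Illustrative helper, not part of the benchmark crate.)
fn nearest_exact(points: &[[f64; 3]], query: &[f64; 3]) -> (usize, f64) {
    points
        .iter()
        .enumerate()
        .map(|(i, p)| {
            // squared Euclidean distance to the query point
            let d2: f64 = p.iter().zip(query).map(|(a, b)| (a - b) * (a - b)).sum();
            (i, d2)
        })
        .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
        .expect("point cloud must be non-empty")
}

fn main() {
    let cloud = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]];
    let (idx, d2) = nearest_exact(&cloud, &[0.9, 0.1, 0.0]);
    // an approximate backend's (index, distance) pair can be checked
    // against this exact answer to measure how far off it is
    assert_eq!(idx, 1);
    assert!((d2 - 0.02).abs() < 1e-9);
    println!("nearest index: {idx}");
}
```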
Very cool, thanks. When I run it on my machine:

```
     Running benches/spatial.rs (/Users/philipps/Google Drive/Cloudbox/Github/nblast-rs/target/release/deps/spatial-c86081616a09345a)
Gnuplot not found, using plotters backend
construction/bosque     time:   [26.221 ms 26.567 ms 26.976 ms]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe
construction/kiddo      time:   [30.892 ms 31.108 ms 31.333 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
construction/nabo       time:   [38.613 ms 39.127 ms 39.723 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
Benchmarking construction/rstar: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 19.9s, or reduce sample count to 20.
construction/rstar      time:   [194.42 ms 195.60 ms 197.01 ms]
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
Benchmarking pairwise query/bosque: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 25.3s, or reduce sample count to 10.
pairwise query/bosque   time:   [253.83 ms 254.54 ms 255.40 ms]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  5 (5.00%) high severe
Benchmarking pairwise query/kiddo: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 23.2s, or reduce sample count to 20.
pairwise query/kiddo    time:   [234.47 ms 237.50 ms 241.81 ms]
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe
Benchmarking pairwise query/nabo: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 34.0s, or reduce sample count to 10.
pairwise query/nabo     time:   [339.43 ms 342.67 ms 347.12 ms]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe
Benchmarking pairwise query/rstar: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 60.1s, or reduce sample count to 10.
pairwise query/rstar    time:   [602.08 ms 609.06 ms 618.91 ms]
Found 16 outliers among 100 measurements (16.00%)
  7 (7.00%) high mild
  9 (9.00%) high severe
```

Does this look similar to your results?
From my work laptop:
Kiddo takes about 2/3 the time that bosque does for the queries. I saw something similar on my desktop at home. Are you on Apple silicon? Bosque optimises hard for cache locality AFAICT, so maybe different CPUs' cache sizes/strategies are a major determinant here. If they're about the same, I'd lean heavily towards using kiddo; bosque would require a major refactor and would force more of the logic into the arena type.
Interesting. I benched on x86 (2.2 GHz 6-core Intel Core i7). Not sure if that's even an issue, but my understanding is that …
There is some memory overhead with kiddo, but it's pretty minimal. Kiddo copies the coordinates on creation, but at present you can't iterate over the points inside it, so you need to keep the original copy around for doing point matches - I have a PR for this here: sdd/kiddo#135 (and the equivalent PR for nabo was merged earlier today: enlightware/nabo-rs#3). As memory scales with N and the slowness of all-to-all scales with N², I'm more inclined to seek speed boosts. For the memory issue, kiddo's zero-copy serialisation form is also useful (and it wouldn't be hard to do the same with bosque, of course).
It turns out implementing a bosque neuron wasn't as refactor-y as I thought it would be, so I've got a branch with that in, and you'll be able to select that backend if it's preferred! That will be the default for the wasm package, as kiddo still has some issues there.
Hey Chris. I've merged your iteration PR in and have just released it as part of Kiddo v4.1.0 - thanks again! It really puts a smile on my face to see Kiddo performing so well in your benchmarks here - things like this make all the effort worthwhile 😊 I'm gonna see if I can sort the WASM issues out for you now as well.
Thank you for taking an interest! We're not sure what the special sauce is, but our data is different to a lot of point clouds - samples in a branching tree structure with long near-linear regions, which I imagine doesn't fit very well into an R-tree's partitioning. Whatever's going on under the hood in kiddo, it seems to work for us :)
Using kiddo 4.1 to iterate through the tree's points directly means we save on having to store the points twice for tree-to-tree lookups, which I'd estimate cuts RAM usage by around 30%. We lose a little performance because the point iteration gets a little more complicated: there are some data copies and lookups into the tangent_alpha vec. In the raw spatial bench this costs up to 10%, but once you roll in the rest of the NBLAST algorithm it's only 1-5%, which I think is acceptable.
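The pattern described above can be sketched roughly like so. This is a hypothetical simplification, not the real nblast-rs types: the tree hands back an integer id per yielded point, and the per-point auxiliary data (tangent and alpha) lives once in a parallel vec indexed by that id, so the coordinates never need a second copy.

```rust
// Hypothetical sketch (not the real nblast-rs types): auxiliary data is
// stored once, indexed by the id the spatial tree yields for each point.
struct TangentAlpha {
    tangent: [f64; 3],
    alpha: f64,
}

struct Arena {
    // one entry per point, in the same order the points were added
    tangent_alpha: Vec<TangentAlpha>,
}

impl Arena {
    /// One extra indexed lookup per yielded point: the small per-query
    /// cost the comment above estimates at up to 10% in the raw bench.
    fn score_contribution(&self, point: &[f64; 3], id: usize) -> f64 {
        let ta = &self.tangent_alpha[id];
        // placeholder combination of geometry and auxiliary data
        let dot: f64 = point.iter().zip(&ta.tangent).map(|(a, b)| a * b).sum();
        dot.abs() * ta.alpha
    }
}

fn main() {
    let arena = Arena {
        tangent_alpha: vec![TangentAlpha { tangent: [1.0, 0.0, 0.0], alpha: 0.5 }],
    };
    assert_eq!(arena.score_contribution(&[2.0, 3.0, 4.0], 0), 1.0);
}
```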
That's great! Also, the WASM fix was trivial in the end and is now released as 4.1.1 :-) Just having a read through the NBLAST paper, out of curiosity. Looks like fascinating stuff.
Crate for benchmarking various spatial lookup crates for our purposes.
Kiddo comes out on top, but will be easier to use once sdd/kiddo#135 is merged. Note that there are some questions about using kiddo in wasm: sdd/kiddo#130
@schlegelp may find this useful! Just run `cargo bench` from the `spatial_bench` directory. For each of bosque (fastcore-rs default), kiddo, nabo, and rstar (nblast-rs default), it benchmarks building 1000 trees by augmenting the example data, and running just the spatial-query part of 1 000 000 neuron-pair lookups. These are all run in serial; I don't have a reason to suspect they'll parallelise differently.
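The bench's overall shape (build N clouds once, then time the serial all-to-all N² query loop) can be sketched with the standard library alone. The real crate uses Criterion and real tree backends; everything here, including the stand-in query body, is illustrative:

```rust
use std::time::Instant;

// Std-only sketch of the bench structure: the query is a stand-in
// (nearest squared distance from each cloud's first point), not a
// call into any real spatial backend.
fn run_pairwise(clouds: &[Vec<[f64; 3]>]) -> (usize, f64) {
    let mut pairs = 0;
    let mut checksum = 0.0;
    for q in clouds {
        for t in clouds {
            // stand-in spatial query: nearest point in t to q's first point
            checksum += t
                .iter()
                .map(|p| {
                    p.iter()
                        .zip(&q[0])
                        .map(|(a, b)| (a - b) * (a - b))
                        .sum::<f64>()
                })
                .fold(f64::INFINITY, f64::min);
            pairs += 1;
        }
    }
    (pairs, checksum)
}

fn main() {
    let n: usize = 50;
    // synthetic stand-in for the augmented example data
    let clouds: Vec<Vec<[f64; 3]>> = (0..n)
        .map(|i| (0..20).map(|j| [i as f64, j as f64, 0.0]).collect())
        .collect();
    let start = Instant::now();
    let (pairs, _checksum) = run_pairwise(&clouds);
    assert_eq!(pairs, n * n); // all-to-all: N^2 pair queries
    println!("{pairs} pair queries in {:?}", start.elapsed());
}
```

Criterion wraps a loop like this in warm-up, sampling, and outlier analysis, which is where the "Warming up" and "outliers" lines in the output above come from.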