Make BTree's internals safer and do more checks at compile time instead of run time #19782
Conversation
cc @gankro. Also, the docs on this change are coming, but not yet done.
```rust
    top: node::Handle<*mut Node<K, V>, KV, LeafOrInternal>,
}

struct LeafifiedSearchStack<'a, K: 'a, V: 'a> {
```
This struct probably needs a comment, and a better name. Does `LeafSearchStack` still describe what this does?
Yes, though `LeafOccupiedSearchStack` would be more accurate. However, maybe I should just make a single generic `FullSearchStack` with two extra parameters for the type of handle at the top.
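For reference, a minimal self-contained sketch of what that single generic stack could look like. Every name and field type below is an illustrative stand-in, not the actual code in this PR:

```rust
#![allow(dead_code)]
use std::marker::PhantomData;

// Marker types standing in for the PR's handle `Type` / `NodeType` parameters.
struct KV;
struct Edge;
struct Leaf;
struct LeafOrInternal;

struct Node<K, V> {
    keys: Vec<K>,
    vals: Vec<V>,
}

struct Handle<NodeRef, Type, NodeType> {
    node: NodeRef,
    index: usize,
    _marker: PhantomData<(Type, NodeType)>,
}

// One generic stack; `Top` is whatever kind of handle the search ended on.
struct SearchStack<'a, K: 'a, V: 'a, Top> {
    stack: Vec<Handle<*mut Node<K, V>, Edge, LeafOrInternal>>,
    top: Top,
    _map: PhantomData<&'a mut ()>,
}

// The two variants discussed above collapse into type aliases:
type OccupiedSearchStack<'a, K, V> =
    SearchStack<'a, K, V, Handle<*mut Node<K, V>, KV, LeafOrInternal>>;
type LeafOccupiedSearchStack<'a, K, V> =
    SearchStack<'a, K, V, Handle<*mut Node<K, V>, KV, Leaf>>;

fn main() {}
```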
Fixed @cgaebel's issues and added docs.

-32 net LoC AND it's faster? Welp, I'm sold.
```diff
@@ -428,8 +428,10 @@ impl<K: Clone, V: Clone> Clone for Node<K, V> {
     }
 }
 
-/// A reference to a key/value pair in the middle of a `Node`. Methods are provided for removing
-/// the pair and accessing the pair and the adjacent edges.
+/// A reference to a something in the middle of a `Node`. There are two `Type`s of `Handle`s,
```
A reference to a something?
Random comment on the benchmarks: It's really weird that […]. And since […]
```rust
PartialSearchStack {
    map: self.map,
    stack: self.stack,
    next: edge.edge_mut() as *mut _,
```
Are these by-value semantics necessary/useful anymore?
Yes, simply because `Pusher` is distinct from `PartialSearchStack`.
On that subject, why are the sequential runs slower than the random runs?

Maybe it's an artifact of btree splitting that's degrading into worst-case performance? Someone should probably break out some pencil and paper and figure out why that's happening. Hmm, maybe it's because all the node linear searches are always taking the longest case.

Sequential insertion isn't necessarily something the (academic) BTree is good at. You fill up a node, split it, then never touch the left child again. This creates a big pool of half-empty nodes, and more allocations. Whereas random insertion should produce "fuller" nodes. I am vaguely aware that Google's BTree implementation does some tricks to optimize for sequential insertions. A substantial refactor would also potentially consider this fancy kind of BTree that shares data between pairs of children. Each child must be ~2/3 full or something. Can't remember the name.

The iterator code is definitely not optimized. iirc if you toss in some […] I have a suspicion that BTree might be a tragic example of external double-ended iteration being inferior to internal iteration.
```rust
let right_edge_raw = right_edge.as_raw();
match right_edge.edge_mut() {
    None => {
        match top.leaf() {
```
You don't seem to be using the Result-ness at all. I'd much rather have a custom enum for this if that's the case. Possibly rename the method to `force`, which returns `Leaf(..)` or `Internal(..)`.
My one worry with creating a new enum with `Leaf` and `Internal` variants is the sheer amount of type/value namespace madness that would imply. There already is the type `Edge` separate from the enum variant `Edge`, though, so I guess it would be all right.
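A rough, self-contained sketch of that `force`-returning-an-enum shape (illustrative names only, not this PR's actual API). It also happens to show the variant-vs-marker-type overlap being worried about: the enum variants `Leaf`/`Internal` coexist with marker types of the same name because variants must be path-qualified.

```rust
use std::marker::PhantomData;

// Marker types for the kind of node a handle points into.
struct Leaf;
struct Internal;
struct LeafOrInternal;

// Stand-in handle; the real one also carries a reference to the node.
struct Handle<NodeType> {
    index: usize,
    _marker: PhantomData<NodeType>,
}

// The custom enum replacing `Result`: callers match on `Leaf`/`Internal`
// instead of the meaningless `Ok`/`Err`.
enum ForceResult<L, I> {
    Leaf(L),
    Internal(I),
}

impl Handle<LeafOrInternal> {
    /// Resolve a "leaf or internal" handle into one of the two concrete kinds.
    /// (Here the caller passes `node_is_leaf`; a real handle would ask its node.)
    fn force(self, node_is_leaf: bool) -> ForceResult<Handle<Leaf>, Handle<Internal>> {
        if node_is_leaf {
            ForceResult::Leaf(Handle { index: self.index, _marker: PhantomData })
        } else {
            ForceResult::Internal(Handle { index: self.index, _marker: PhantomData })
        }
    }
}

fn main() {
    let handle = Handle::<LeafOrInternal> { index: 0, _marker: PhantomData };
    match handle.force(true) {
        ForceResult::Leaf(h) => println!("leaf handle at index {}", h.index),
        ForceResult::Internal(h) => println!("internal handle at index {}", h.index),
    }
}
```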
Actually, looking at the benchmarking code, there seem to be many more differences between […]
```rust
        }
    };
}
```
Hmm... it occurs to me that the handle API loses some of the value of the SearchStack model. The index-based model meant the Stack could guarantee that it had a complete path, but with this design one can follow handles down a few nodes and then push that handle, violating the guarantee, right?
No - this was exactly the problem I spent most of my time on the handle API working on, and it is solved by my use of `IdRef` and `InvariantLifetime`. `Pusher::push` requires the handle being pushed to have an `IdRef`-based reference to the node with exactly the correct lifetime. However, if you tried to move down the node tree, then you would no longer have access to an `IdRef`, as those are only provided by `PartialSearchStack::with` - an attempt to push a node further down in the tree would fail statically with something like "expected `IdRef<'id, Node<K, V>>`, found `&'a mut Node<K, V>`". Even if you did manage to get another `IdRef` (e.g. through another call to `PartialSearchStack::with`), this would have the wrong lifetime, and the use of `InvariantLifetime` means that the lifetimes would have to match exactly.
I encourage you to try to break the API - it might become more clear how safety is ensured.
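To make the `IdRef`/`InvariantLifetime` argument easier to follow, here is a self-contained toy version of the branding trick. The names and field layouts are simplified stand-ins for this PR's types, and the invariant lifetime is emulated with `PhantomData` instead of the old `InvariantLifetime` marker:

```rust
use std::marker::PhantomData;

// A brand: invariant in 'id, so two different brands can never unify.
#[derive(Clone, Copy)]
struct Id<'id>(PhantomData<fn(&'id ()) -> &'id ()>);

// A reference carrying the brand of the scope that created it. In the real
// code, field privacy prevents forging one of these; in this single-file
// sketch that boundary exists only by convention.
struct IdRef<'id, T> {
    inner: &'id mut T,
    _brand: Id<'id>,
}

struct Node(u32);

struct PartialSearchStack {
    path: Vec<u32>, // stand-in for the real stack of raw edge handles
}

impl PartialSearchStack {
    // `with` hands the closure a branded reference to the next node plus a
    // Pusher carrying the same brand. Because 'id is invariant and chosen
    // fresh for this call, the only IdRef the closure can hand back to
    // `Pusher::push` is the one it was given.
    fn with<T, F>(self, node: &mut Node, f: F) -> T
    where
        F: for<'id> FnOnce(Pusher<'id>, IdRef<'id, Node>) -> T,
    {
        let brand = Id(PhantomData);
        let pusher = Pusher { stack: self, _brand: brand };
        let id_ref = IdRef { inner: node, _brand: brand };
        f(pusher, id_ref)
    }
}

struct Pusher<'id> {
    stack: PartialSearchStack,
    _brand: Id<'id>,
}

impl<'id> Pusher<'id> {
    // Only accepts a reference with *this* Pusher's brand.
    fn push(mut self, node: IdRef<'id, Node>) -> PartialSearchStack {
        self.stack.path.push(node.inner.0);
        self.stack
    }
}

fn main() {
    let mut node = Node(7);
    let stack = PartialSearchStack { path: vec![] };
    let stack = stack.with(&mut node, |pusher, node_ref| {
        // Pushing a plain `&mut Node` found deeper in the tree would not
        // compile: `push` wants an `IdRef<'id, Node>` with this exact
        // invariant lifetime, which only this call to `with` can mint.
        pusher.push(node_ref)
    });
    println!("{:?}", stack.path);
}
```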
Ah okay, awesome! I got confused about how exactly the `'id` manages to live on through the `search` call, but I went over the code again and see now how it works. The handle wraps exactly the node reference you give to search, and if you tried to actually "follow" the handle to the next child, the `'id` wouldn't come with it.

Pretty slick!
@gereeter I'm not sure it matters, since different benches aren't really intended to be compared to each other, but I suppose you could get the "all misses" behaviour by making a sequential Vec and shuffling it.
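As a concrete (hypothetical) version of that suggestion, assuming the `rand` crate's 0.8-style API:

```rust
use rand::seq::SliceRandom;

/// Build a sequential set of keys, then shuffle the order in which the
/// benchmark visits them: same key set as the `_seq` benches, random order.
fn shuffled_sequential_keys(n: usize) -> Vec<usize> {
    let mut keys: Vec<usize> = (0..n).collect();
    keys.shuffle(&mut rand::thread_rng());
    keys
}
```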
I just hacked up an intrusive version of btree::iter: […]

Use that information how you will. I think this is strong evidence for inclusion in the API. It composes poorly, but is 6 times faster at iterating over 20 elements, and 4 times faster at iterating over 100k elements. (Sorry, this PR just so happens to be a forum with everyone who cares about this sorta stuff. Don't mean to hijack the discussion!) I'll prepare a PR for discussion.
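The intrusive/internal iterator itself isn't reproduced above, but the general shape of internal iteration over a B-tree is easy to sketch (a toy illustration, not the code that produced those numbers). Roughly, the recursion keeps the position implicitly on the call stack, where the external double-ended iterator has to re-establish its position on every `next` call:

```rust
// Toy B-tree node: `edges` is empty for leaves, otherwise edges.len() == keys.len() + 1.
struct Node<K, V> {
    keys: Vec<K>,
    vals: Vec<V>,
    edges: Vec<Node<K, V>>,
}

impl<K, V> Node<K, V> {
    /// Internal iteration: recurse through the tree and hand each key/value
    /// pair to `f`. No explicit stack, no per-item bookkeeping.
    fn traverse<F: FnMut(&K, &V)>(&self, f: &mut F) {
        if self.edges.is_empty() {
            for (k, v) in self.keys.iter().zip(self.vals.iter()) {
                f(k, v);
            }
        } else {
            for i in 0..self.keys.len() {
                self.edges[i].traverse(f);
                f(&self.keys[i], &self.vals[i]);
            }
            self.edges[self.keys.len()].traverse(f);
        }
    }
}

fn main() {
    // Keys 1, 3 in the root; 0 / 2 / 4 in its three leaf children.
    let leaf = |k: i32| Node { keys: vec![k], vals: vec![k * 10], edges: vec![] };
    let root = Node {
        keys: vec![1, 3],
        vals: vec![10, 30],
        edges: vec![leaf(0), leaf(2), leaf(4)],
    };
    let mut out = Vec::new();
    root.traverse(&mut |k, v| out.push((*k, *v)));
    assert_eq!(out, vec![(0, 0), (1, 10), (2, 20), (3, 30), (4, 40)]);
    println!("{:?}", out);
}
```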
I'm currently skimming through their source for details, but here's at least a fragment of the sequential insertion optimization I was talking about before: https://code.google.com/p/cpp-btree/source/browse/btree.h#1552
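For readers who don't want to dig through btree.h, the shape of that trick is roughly the following. This is a hedged sketch of the idea, not cpp-btree's actual code, and a real implementation also has to move a separator key up into the parent:

```rust
/// Decide how many elements stay in the left node when a full node of `len`
/// elements splits because of an insert at `insert_index`.
fn biased_split_point(len: usize, insert_index: usize) -> usize {
    if insert_index == len {
        // Appending at the very end (the sequential-insert case): keep the
        // existing node completely full and start an empty node on the right,
        // instead of leaving a trail of half-empty left siblings.
        len
    } else if insert_index == 0 {
        // Mirror case for inserting at the very front.
        1
    } else {
        // Everywhere else: the ordinary middle split.
        len / 2
    }
}

fn main() {
    let full = 11; // pretend node capacity
    assert_eq!(biased_split_point(full, full), full); // sequential append
    assert_eq!(biased_split_point(full, 5), 5);       // random insert -> middle split
    println!("biased splits keep sequentially filled nodes full");
}
```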
They sure use a lot of unsafe code. That optimization doesn't look too hard to add. We should do that.

I'd like to leave the biasing idea out of this for now - while it probably is a good idea in certain cases, it will hurt some cases, and I don't think we have the benchmarks and usage data to say whether it truly is valuable. In contrast, I think that the change in this PR should help all use cases.

Oh sorry, I meant in the future. Definitely not in this PR. Human/IP is a shitty protocol.

Yes, agreed. I simply noted it for the sake of discussion.
d92c67c to a4e5422
I think I've fixed all the issues brought up so far.

LGTM.

…ome runtime checks in favor of newly gained static safety

r=me with squash
a4e5422 to 808eeff

Squashed.
Make BTree's internals safer and do more checks at compile time instead of run time
Reviewed-by: Gankro
Before:

```
test btree::map::bench::find_rand_100      ... bench: 12 ns/iter (+/- 0)
test btree::map::bench::find_rand_10_000   ... bench: 13 ns/iter (+/- 1)
test btree::map::bench::find_seq_100       ... bench: 11 ns/iter (+/- 0)
test btree::map::bench::find_seq_10_000    ... bench: 11 ns/iter (+/- 1)
test btree::map::bench::insert_rand_100    ... bench: 106 ns/iter (+/- 1)
test btree::map::bench::insert_rand_10_000 ... bench: 326 ns/iter (+/- 8)
test btree::map::bench::insert_seq_100     ... bench: 198 ns/iter (+/- 1)
test btree::map::bench::insert_seq_10_000  ... bench: 312 ns/iter (+/- 3)
test btree::map::bench::iter_1000          ... bench: 16563 ns/iter (+/- 173)
test btree::map::bench::iter_100000        ... bench: 1686508 ns/iter (+/- 108592)
test btree::map::bench::iter_20            ... bench: 365 ns/iter (+/- 25)
```

After:

```
test btree::map::bench::find_rand_100      ... bench: 12 ns/iter (+/- 0)
test btree::map::bench::find_rand_10_000   ... bench: 12 ns/iter (+/- 0)
test btree::map::bench::find_seq_100       ... bench: 11 ns/iter (+/- 0)
test btree::map::bench::find_seq_10_000    ... bench: 11 ns/iter (+/- 0)
test btree::map::bench::insert_rand_100    ... bench: 89 ns/iter (+/- 1)
test btree::map::bench::insert_rand_10_000 ... bench: 121 ns/iter (+/- 3)
test btree::map::bench::insert_seq_100     ... bench: 149 ns/iter (+/- 0)
test btree::map::bench::insert_seq_10_000  ... bench: 228 ns/iter (+/- 1)
test btree::map::bench::iter_1000          ... bench: 16965 ns/iter (+/- 220)
test btree::map::bench::iter_100000        ... bench: 1687836 ns/iter (+/- 18746)
test btree::map::bench::iter_20            ... bench: 366 ns/iter (+/- 21)
```