Switch to hashbrown's RawTable internally #131
Note that this does raise the MSRV to 1.32, but that's well more than a year old. Perhaps we should release 1.4.1 for other recent changes, then bump to 1.5.0 for this change. We can also upgrade to 2018 edition, but I didn't want to pollute this PR with that churn.
For future work, I've also been playing with parameterizing the index type, since

pub struct IndexMap<K, V, S = RandomState, Idx = usize> { ... }
pub struct IndexSet<T, S = RandomState, Idx = usize> { ... }

Then internally we'd switch to
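As a rough illustration of what a defaulted `Idx` parameter could look like, here is a minimal sketch. The `Positions` type and its methods are hypothetical and are not part of indexmap; the sketch only assumes that stored positions are converted to and from `usize` at the boundary, so a narrower type such as `u32` or `u8` could be chosen when the map is known to stay small.

```rust
use std::convert::TryFrom;

/// Hypothetical container of entry positions, generic over the index type.
/// `Idx` defaults to `usize`, mirroring the defaulted parameter in the signatures above.
struct Positions<Idx = usize> {
    slots: Vec<Idx>,
}

impl<Idx> Positions<Idx>
where
    Idx: Copy + TryFrom<usize>,
    usize: TryFrom<Idx>,
{
    fn new() -> Self {
        Positions { slots: Vec::new() }
    }

    /// Record an entry position; returns `None` if it does not fit in `Idx`.
    fn push(&mut self, entry_index: usize) -> Option<()> {
        let idx = Idx::try_from(entry_index).ok()?;
        self.slots.push(idx);
        Some(())
    }

    /// Read a stored position back as a `usize`.
    fn get(&self, slot: usize) -> Option<usize> {
        let idx = *self.slots.get(slot)?;
        usize::try_from(idx).ok()
    }
}
```

With `Positions::<u8>`, pushing position 300 is rejected, while the default `Positions` (that is, `Positions<usize>`) accepts it.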
Really cool work. So we lose the 32-bit index optimization and still see some performance improvements; does that mean there is more to gain? We are giving up being implemented in safe Rust, and can only justify that if we show performance improvements that are above the noise. Some of the lookup benchmark cases do that, and the insert cases barely do it. The improved lookup benches are really encouraging. I'll read RawTable and then come back to reviewing. Are you going to start experimenting with this version of indexmap in rustc before it gets merged?
Maybe, but that answer depends on whether I understand that optimization. 🙂 AIUI, the benefit of the 32-bit index is that we can stuff a short hash in the other 32-bits, which means we can use the hash without another memory access to
Generally, I prefer the simplicity without those
I do have local builds where I'm using this. So far it's a very slight improvement, less than 1%, but that's a good thing! For example, they reported a 21% slowdown in

Here are those branches:

(You may need to adjust the rust Cargo.toml patch if you want to compile this yourself.)
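To make the 32-bit index optimization discussed above concrete, here is a minimal sketch of packing a 32-bit entry index together with a 32-bit truncated hash in one 64-bit slot. The `PackedPos` type and its method names are hypothetical, chosen only for illustration; the exact layout used by the previous indexmap table is an assumption here.

```rust
/// Hypothetical 64-bit slot: high 32 bits hold a truncated hash,
/// low 32 bits hold the index into the ordered entries vector.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedPos(u64);

impl PackedPos {
    fn new(index: u32, full_hash: u64) -> Self {
        // Keep only the low 32 bits of the full hash as the "short hash".
        let short_hash = full_hash as u32;
        PackedPos((u64::from(short_hash) << 32) | u64::from(index))
    }

    /// Position of the entry in the ordered entries vector.
    fn index(self) -> u32 {
        self.0 as u32
    }

    /// Truncated hash stored alongside the index.
    fn short_hash(self) -> u32 {
        (self.0 >> 32) as u32
    }

    /// Cheap pre-check that needs no access to the entries vector:
    /// if the short hashes differ, the keys cannot be equal.
    fn may_match(self, full_hash: u64) -> bool {
        self.short_hash() == full_hash as u32
    }
}
```

A probe can call `may_match` first and only follow `index()` into the entries when the short hashes agree, which is the memory access the optimization tries to avoid.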
@bluss where do you stand on this? Just waiting for time to review it?
This weekend should have some time where I can review
There isn't much
Force-pushed from b918c88 to 15a69cb.
Thanks all for your input on the discussion, and helping me find some solid footing again with the raw pointer code. Thanks cuviper for working on this - I'll work on helping you wrap this up as soon as you want. There are two discomforting things left; I think it's something that's rather general to come up in this situation
This might be the first time we raise the MSRV in 1.x, but with a good reason. It looks like we might break serde_json, which builds CI with Rust 1.31? cc @dtolnay, when we raise ours to Rust 1.32 here. Other top rdeps - toml, http, petgraph, they look fine from MSRV standpoint.
I'll play a little with abstracting this. It would be nice if we could contain
Yeah, this is kind of like the tootsie-pop model, just like
If we entertain the idea of indexmap 2.0, are there other changes you would want?
I am reluctant to push MSRV changes on others, but at least that is only for their non-default
That looks fine, thanks.
I don't think it's wrong to carefully raise the MSRV, it's the plan we have documented and promised since 1.0. However we can discuss it with those that depend on us.
I tried an abstraction to wrap the raw bucket with an appropriate lifetime, by bundling to a map reference. We really only need that in

I settled instead on just encapsulating this in its own module. I moved
I agree
I can't really think of any breaking changes that are on the wishlist. I definitely don't want to pile on work here, but a 2.0 version could include the parameterization by index type. Then we can update the

The "experimental" things - Equivalent trait works well, the MutableKeys trait I don't know, so those things seem like they could stay as they are for 2.0. Which method to make the default
I opened #135 for 2.0 discussion.
I'd vote to merge this and go for indexmap 1.5 with this
Not sure if my vote counts, but I agree :p
These methods trust their caller to pass correct RawBucket values, so we mark them unsafe to use the common safe/unsafe distinction. I used allow(unused_unsafe) to write the functions in the (hopefully) future style of internal unsafe blocks in unsafe functions.
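The style described in this commit message can be sketched as follows. The function names are hypothetical stand-ins and not the actual indexmap methods; the point is only the shape: an `unsafe fn` whose caller must uphold an invariant, with the unsafe operation still wrapped in an explicit inner `unsafe` block.

```rust
/// Hypothetical stand-in for a raw-bucket accessor.
/// Safety contract: the caller must guarantee `index < entries.len()`.
#[allow(unused_unsafe)]
unsafe fn entry_unchecked<T>(entries: &[T], index: usize) -> &T {
    // The inner block marks exactly which operation relies on the contract,
    // even though the whole function is already `unsafe`. On older compilers
    // this nesting triggered the `unused_unsafe` lint, hence the `allow`.
    unsafe { entries.get_unchecked(index) }
}

/// Safe wrapper that establishes the invariant before calling the unsafe function.
fn entry_checked<T>(entries: &[T], index: usize) -> Option<&T> {
    if index < entries.len() {
        // SAFETY: `index` was bounds-checked just above.
        Some(unsafe { entry_unchecked(entries, index) })
    } else {
        None
    }
}
```

Keeping the unchecked function `unsafe` pushes the proof obligation to each call site, while the inner block keeps the truly unsafe operation easy to audit.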
It was an over-optimization to use `clear_no_drop`, which hurts the possibility of using griddle as an alternate table implementation.
Force-pushed from a7c83bc to 603c326.
I've added the version bump and some release notes to this PR.
Great! I've updated to use the new hashbrown methods.
Let's ship it! |
This switches `IndexMapCore` from its bespoke hash table to `hashbrown::raw::RawTable<usize>`, storing just the index into our ordered `entries: Vec<Bucket<K, V>>`. We lose the badge of having no `unsafe` code, but the overall implementation is much simpler, relying on the battle-hardened `hashbrown` for the tricky parts. I have also confirmed that the testsuite passes under `cargo miri`.

I'll post the benchmark comparison in a followup comment, but broadly speaking it appears faster for insertion and lookup, and slower for removal. I think that's a reasonable trade-off for this crate.

As a bonus, I also implemented a proper `reserve` and added `shrink_to_fit`, since they are now pretty straightforward.

cc @Amanieu -- thanks for `hashbrown` and for exposing `RawTable` to make this possible!
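The layout in this description (a hash table that stores only indices, next to an ordered vector of entries) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's implementation: `OrderedMapSketch` is a hypothetical name, `std::collections::HashMap<u64, Vec<usize>>` stands in for hashbrown's raw table of `usize`, and the `Bucket` here omits any cached hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// One entry in insertion order (simplified for illustration).
struct Bucket<K, V> {
    key: K,
    value: V,
}

/// `indices` maps a hash to candidate positions in `entries`;
/// `entries` alone defines iteration order.
struct OrderedMapSketch<K, V> {
    indices: HashMap<u64, Vec<usize>>,
    entries: Vec<Bucket<K, V>>,
}

fn hash_key<K: Hash>(key: &K) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

impl<K: Hash + Eq, V> OrderedMapSketch<K, V> {
    fn new() -> Self {
        OrderedMapSketch { indices: HashMap::new(), entries: Vec::new() }
    }

    /// Find the position of `key` by checking each candidate index against `entries`.
    fn find(&self, hash: u64, key: &K) -> Option<usize> {
        self.indices
            .get(&hash)?
            .iter()
            .copied()
            .find(|&i| self.entries[i].key == *key)
    }

    /// Insert or overwrite; a new key is appended, so order is insertion order.
    fn insert(&mut self, key: K, value: V) -> Option<V> {
        let hash = hash_key(&key);
        if let Some(i) = self.find(hash, &key) {
            return Some(std::mem::replace(&mut self.entries[i].value, value));
        }
        let i = self.entries.len();
        self.entries.push(Bucket { key, value });
        self.indices.entry(hash).or_default().push(i);
        None
    }

    fn get(&self, key: &K) -> Option<&V> {
        let i = self.find(hash_key(key), key)?;
        Some(&self.entries[i].value)
    }

    /// Keys in insertion order, read straight from the entries vector.
    fn keys(&self) -> impl Iterator<Item = &K> {
        self.entries.iter().map(|bucket| &bucket.key)
    }
}
```

Removal is deliberately left out of the sketch: removing from the middle of `entries` shifts or swaps positions, so the stored indices have to be fixed up, which is the kind of extra work on the removal path that the trade-off above refers to.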