compare compiler performance with hashbrown
#55514
This https://github.com/Amanieu/hashbrown crate looks nice!
We should do a performance comparison.
cc @Amanieu
Should be fairly easy -- well, except I think we no longer have the FxHash implementations in-tree, so that might be a bit harder... I think this is a great issue for someone to take on, though: if they want to swap out the FxHash implementation and post a PR, we can run perf on it!
Better yet... if it is a fully drop-in replacement for `HashMap`, perhaps we could replace the standard library implementation itself.
Depends on whether "drop-in" means "adheres to every expected requirement of `HashMap`" or just "API compatible".
It also lists some differences from the standard `HashMap` in its README.
It'd be very good to performance-test it on all the architectures we care about people running the compiler on. If we replace the standard library implementation, then expand that to every tier 1 platform.
@shepmaster "drop-in" means that I literally copied the `HashMap` API from the standard library.
@Amanieu are there any guarantees not covered in the signatures that hashbrown doesn't satisfy, other than the default hasher? (We would retain the standard library's default hasher...)
There is one small detail due to the way re-hashing works: if the `Hash` implementation panics during a rehash, the map is left in a valid but unspecified state.
@Amanieu ah; that seems like sane behavior (or at least I don't think it's weird behavior...) and very subtle / unlikely to happen. I'm having a hard time imagining a situation where the `Hash` impl would not panic on insertion but then panic on rehash... maybe for some type with interior mutability (where the hash can change between insertion and rehash).
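As an illustration of that corner case (a contrived sketch, not from the thread): a key with interior mutability can hash fine at insertion time and only start panicking later, e.g. once the table grows and rehashes it.

```rust
use std::cell::Cell;
use std::hash::{Hash, Hasher};

// Contrived key type: `poisoned` can be flipped after insertion, so
// hashing succeeds on insert but panics on a later rehash.
struct Flaky {
    id: u64,
    poisoned: Cell<bool>,
}

impl Hash for Flaky {
    fn hash<H: Hasher>(&self, state: &mut H) {
        if self.poisoned.get() {
            panic!("hash called after key was poisoned");
        }
        self.id.hash(state);
    }
}

impl PartialEq for Flaky {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}
impl Eq for Flaky {}
```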
Is there anything that needs to be accounted for in #54043 to ensure that such a replacement would be possible in the future?
@scottmcm There is no issue there; the raw entry API can easily be ported to hashbrown.
We'd also want a way to avoid double lookups when interning, like my PR did; the raw entry API should cover that. It would be a good idea to pick up the refactoring to use a dedicated interning type.
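A minimal sketch of that pattern, assuming a hashbrown version that exposes the raw entry API (the API has moved between versions; the `intern` helper and the id scheme here are made up for illustration). The key is hashed and located once, and the insertion reuses that slot instead of doing a second lookup:

```rust
use hashbrown::HashMap;

/// Interns `key`, assigning a fresh id on first sight, with a single
/// hash + probe instead of a `contains_key` followed by `insert`.
fn intern(map: &mut HashMap<String, u32>, next_id: &mut u32, key: &str) -> u32 {
    let (_k, id) = map.raw_entry_mut().from_key(key).or_insert_with(|| {
        // Only allocate the owned key (and a new id) on a miss.
        let id = *next_id;
        *next_id += 1;
        (key.to_owned(), id)
    });
    *id
}
```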
I'm having some trouble replacing the `HashMap` implementation in libstd.
You can replace it only for stage1+, using `#[cfg(not(stage0))]`.
Is there a way to specify a dependency only for stage1+ in Cargo.toml? I already have the code itself gated, but not the dependency.
I don't know of a way to do that. @gnzlbg Do you know of a workaround here? Or why the crate fails to build with the stage0 compiler?
@Amanieu do you have a way to manually enable the use of the SIMD code paths? Is there a way to use the crate on a stable toolchain? Looking at hashbrown, it looks a lot like the new Swiss Tables data structures in Abseil (https://abseil.io/blog/20180927-swisstables); it looks like a really cool project!
I ended up just making a separate branch for it: https://github.com/Amanieu/hashbrown/tree/stage0
@Amanieu If your library cannot compile without nightly features, could those be put behind a cargo feature?
No, it's just that it tries to use unstable features that the stage0 (beta) compiler doesn't support yet.
cc @gankro
@Amanieu are you pursuing this integration further? I think if you can get a branch that works, you just need to open a PR and we can run the initial perf benchmarks.
cc @rust-lang/wg-compiler-performance
Discussed during the compiler meeting. It would be nice to have a faster hashmap for the compiler, but it would also be interesting to see what the options are for including the algorithm in the standard library in the first place (the compiler would automatically reap the benefits without any extra effort). There were questions about what exactly makes these hashmaps faster/better than those in libstd, what the pros/cons are, etc.
I started work on a branch which replaces the `HashMap` implementation in libstd with hashbrown. It doesn't compile yet, and I am currently busy with other things so I can't work on this. Simply replacing the standard hash map in libstd as @nagisa suggests might be an option; however, there are several factors that we need to take into account.
Two years ago I described rustc's execution as "mostly malloc/free and hash tables". That is less true than it used to be, but hash tables are still used a lot. If hashbrown doesn't store full key hashes, it should use less memory, right?
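Rough arithmetic under assumed layouts (the per-bucket numbers are illustrative, not the exact internals of either table): the old Robin Hood design cached a full 8-byte hash per bucket, while a SwissTable-style design keeps one control byte.

```rust
// Back-of-envelope bucket sizes for a HashMap<u64, u64>.
const OLD_PER_BUCKET: usize = 8 + 16; // cached u64 hash + (key, value)
const NEW_PER_BUCKET: usize = 1 + 16; // control byte + (key, value)

fn main() {
    let buckets = 1 << 20; // ~1M buckets
    println!("old: ~{} MiB", buckets * OLD_PER_BUCKET >> 20); // ~24 MiB
    println!("new: ~{} MiB", buckets * NEW_PER_BUCKET >> 20); // ~17 MiB
}
```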
It might be interesting to see if we could start caching hashes, e.g. in the interned types.
I found a potential breaking change while porting the std `HashMap` API to hashbrown.
cc #56241
I notice that the benchmarks listed use maps with integers as keys. How does it do with medium-length strings (5-20 characters)? These are very common as hashmap keys, and hashing time would likely dominate, making re-hashing potentially expensive.
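A rough way to probe that locally (illustrative only; the key format and counts are arbitrary, and a real comparison should use a proper benchmark harness against both map implementations):

```rust
use std::collections::HashMap;
use std::time::Instant;

fn main() {
    // Medium-length string keys, in the 5-20 character range.
    let keys: Vec<String> = (0..100_000).map(|i| format!("key-{:012}", i)).collect();

    let start = Instant::now();
    let mut map = HashMap::new();
    for (i, k) in keys.iter().enumerate() {
        // Each growth step forces existing keys to be re-hashed.
        map.insert(k.clone(), i);
    }
    println!("insert of {} string keys: {:?}", map.len(), start.elapsed());
}
```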
Some time ago I started working on a hashmap benchmark and comparison project: https://github.com/Voultapher/hashmap-compare. It's still very WIP; among other things it's missing result analysis and visualization. The raw output of a run is at https://gist.github.com/Voultapher/126dda90923c4c5b24135a9ecff403f3.
@Voultapher Can we perhaps get the difference between the hash tables printed in the output? Unless I'm missing something, it looks like the new table is slower for small maps. @Amanieu Can we do anything to speed up small hash tables (~10 entries)?
What is happening here is that the benchmarks include the time taken to grow the table. Unlike the std hashmap, hashbrown starts at a size of 1 element and doubles in size as needed, whereas the standard hashmap immediately starts with 32 elements. Of course, this is something that can be tweaked; the intention was to avoid excessive memory usage when having many tables with few elements.
Good to hear this will be easy to fix then! We'll probably want faster growth for small hash tables without compromising too much on memory usage. Maybe something like 0 -> 4 -> 16 -> 32 -> 64 ...
@Voultapher If you want, try changing the growth policy in hashbrown and re-running your benchmarks.
Perhaps this behavior could be adaptive based on the size of the key/value pair.
Do I understand correctly that this means that creating an empty `HashMap` will now allocate memory up front?
@shepmaster No, neither the old nor the new HashMap allocates any memory when created. However, once an element is inserted, hashbrown will gradually grow the capacity (2, 4, 8, 16, ...) whereas the current HashMap will immediately start at 32 buckets.
FWIW I prefer @Amanieu's approach of not starting at 32 elements. If a user finds this to be a performance issue, the solution is easy: use `HashMap::with_capacity`. However, if a user needs hash maps that are smaller than 32 elements (e.g. due to memory consumption concerns), and the hash map always starts at 32 elements, the solution isn't really that easy anymore. This might worsen the performance of some applications for which a 32-element start was paying off, but AFAICT these applications might perform even better by reserving some other number of elements, and relying on a specific growth policy of the implementation is brittle anyway.
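For instance, with the std API (a small check of the behavior described above):

```rust
use std::collections::HashMap;

fn main() {
    // Neither the old nor the new map allocates up front...
    let empty: HashMap<u64, u64> = HashMap::new();
    assert_eq!(empty.capacity(), 0);

    // ...and callers who know their size can opt back into a
    // pre-sized table explicitly.
    let sized: HashMap<u64, u64> = HashMap::with_capacity(32);
    assert!(sized.capacity() >= 32);
}
```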
I agree to a point. But if this is super common, we're just creating a performance trap for most of our users to make the system more flexible for the minority. Also, we could just as easily write the map with the following logic (in fact I would expect it to have roughly this form, sketched below):
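```rust
// A sketch of that shape (illustrative only, not hashbrown's actual
// code): small maps jump straight to a modest minimum capacity, and
// larger maps double as usual.
fn next_capacity(current: usize) -> usize {
    match current {
        0 => 4,
        n if n < 32 => 32,
        n => n * 2,
    }
}
```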
And users who want slow growing can manage capacity themselves. That said, I think it's fine to land with either growth logic and refine it in follow-ups with real-world data.
We probably want to take a look at this sooner rather than later, especially considering that Rust encourages/defaults to Sip13 (SipHash-1-3). Most (all?) benchmarks so far are focusing on very fast hashers and integer keys.
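For example, the same benchmark can be instantiated with both the default SipHash state and a fast hasher; here using the external `rustc-hash` crate purely as one way to set that up:

```rust
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

use rustc_hash::FxHasher; // fast, non-DoS-resistant hasher

type FxHashMap<K, V> = HashMap<K, V, BuildHasherDefault<FxHasher>>;

fn main() {
    // Default std map: SipHash with random keys.
    let mut sip: HashMap<String, u32> = HashMap::new();
    sip.insert("medium-length-key".to_owned(), 1);

    // Same map shape, but with FxHash for comparison.
    let mut fx: FxHashMap<String, u32> = FxHashMap::default();
    fx.insert("medium-length-key".to_owned(), 1);
}
```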
Here are @Voultapher's results, grouped by benchmark: `cmp`, `cmp_reserve`, `cmp_string_key`, `cmp_string_value`, `cmp_string_key_string_value`, `cmp_string_pad_string_key`, `cmp_string_pad_string_value`, `cmp_string_pad_string_key_string_value`.
To be clear, I'm not saying it's going to be slower or anything like that. We just need to look into the worst case.
These results look pretty good! Is there a reason why the new hashmap impl can't cache hash values?
One of the major improvements of hashbrown over our current design is that it truncates the hash array to one-byte entries instead of 8-byte ones. This significantly improves cache locality of the hash search under high load factors, reduces the memory footprint of all our hashtables, and allows 4, 8, or 16 hashes to be compared at once (on 32-bit, 64-bit, and SSE2 platforms respectively). This comes at the cost of only storing 7 bits of the hash (you still need one bit for empty/full). So in theory we could avoid rehashing up to a capacity of 128, but past that we need to rehash to retrieve the higher bits.
The 7 bits stored in the table are taken from the top bits of the hash code, while the bits used to determine the initial search position in the array are taken from the bottom bits. This is important since it means that when we find a matching control byte in the table, we have actually matched both the top and bottom bits of the hash code, which significantly decreases the chance of false positives. However this means that these 7 bits are completely useless when rehashing, since they don't contain the bits used for determining the position in the array.
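Concretely, the split looks something like this (a sketch of the scheme described above; the `h1`/`h2` names follow SwissTable convention and are not necessarily hashbrown's exact code):

```rust
// Bottom bits pick the starting probe position in the bucket array
// (the array length is a power of two, so `bucket_mask = len - 1`).
fn h1(hash: u64, bucket_mask: usize) -> usize {
    (hash as usize) & bucket_mask
}

// Top 7 bits go into the control byte; the remaining bit of the byte
// distinguishes full slots from the empty/deleted sentinels.
fn h2(hash: u64) -> u8 {
    ((hash >> 57) & 0x7f) as u8
}
```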
Ah dang, well in my defense I haven't gotten to that part of the code review yet :)
Ah, interesting. Thanks for the info!
When will the new HashMap implementation (hashbrown) land on stable?
@mominul On July 4th, in 1.36, if I recall correctly.
Closing: performance was explored, it was excellent, and it is now the implementation of std::collections::HashMap (and HashSet).