[Merged by Bors] - Proper prehashing #3963

cart · 2022-02-16T23:28:47Z

For some keys, it is too expensive to hash them on every lookup. Historically in Bevy, we have regrettably done the "wrong" thing in these cases (pre-computing hashes, then re-hashing them) because Rust's built-in hashed collections don't give us the tools we need to do otherwise. Doing this is "wrong" because two different values can result in the same hash. Hashed collections generally get around this by falling back to equality checks on hash collisions. You can't do that if the key is the hash. Additionally, re-hashing a hash increases the odds of collision!
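To make the collision problem concrete, here is a minimal sketch. The `toy_hash` function is a deliberately weak, hypothetical hash (a byte sum) chosen only so that a collision is easy to reproduce; it is not anything Bevy uses. Keying a map by the hash alone loses an entry, while keying by the real value does not.

```rust
use std::collections::HashMap;

/// Deliberately weak toy hash (sum of bytes) so a collision is easy to produce.
/// Hypothetical helper for illustration only; not part of Bevy.
fn toy_hash(s: &str) -> u64 {
    s.bytes().map(u64::from).sum()
}

/// Inserts two distinct strings keyed only by their hash.
fn keyed_by_hash() -> HashMap<u64, &'static str> {
    let mut map = HashMap::new();
    map.insert(toy_hash("ab"), "value for ab");
    // "ba" has the same byte sum as "ab", so this overwrites the first entry:
    // the map has no original key left to compare against.
    map.insert(toy_hash("ba"), "value for ba");
    map
}

/// Same inserts, keyed by the real value: equality checks resolve the collision.
fn keyed_by_value() -> HashMap<&'static str, &'static str> {
    let mut map = HashMap::new();
    map.insert("ab", "value for ab");
    map.insert("ba", "value for ba");
    map
}
```

With a real hash function collisions are far rarer, but the failure mode is the same: once the key is the hash, the collection cannot tell colliding values apart.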

#3959 needs pre-hashing to be viable, so I decided to finally properly solve the problem. The solution involves two different changes:

1. A new generalized "pre-hashing" solution in bevy_utils: Hashed<T> types, which store a value alongside a pre-computed hash, and PreHashMap<K, V> (which uses Hashed<T> internally). PreHashMap is just an alias for a normal HashMap that uses Hashed<T> as the key and a new PassHash implementation as the Hasher.
2. Replacing the std::collections re-exports in bevy_utils with equivalent hashbrown impls. Avoiding re-hashes requires the raw_entry_mut api, which isn't stabilized in std yet (and may never be ... entry_ref has favor now, but also isn't available yet). If std's HashMap ever provides the tools we need, we can move back to that. The latest version of hashbrown adds support for the entry_ref api, so we can move to that in preparation for an std migration, if that's the direction they seem to be going in. Note that adding hashbrown doesn't increase our dependency count because it was already in our tree.
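The first change can be sketched roughly as follows. This is a simplified illustration and not the code from this PR: it uses std's HashMap and DefaultHasher instead of hashbrown, and the field names, the `PassHasher` body, and the `demo` function are assumptions made for the example. The type names Hashed, PassHash and PreHashMap follow the description above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};

/// A value stored next to its pre-computed hash.
#[derive(Debug, Clone)]
struct Hashed<V> {
    hash: u64,
    value: V,
}

impl<V: Hash> Hashed<V> {
    /// Hashes `value` once, up front.
    fn new(value: V) -> Self {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        Hashed { hash: hasher.finish(), value }
    }
}

// Feed only the stored hash to the hasher; the value is never re-hashed.
impl<V> Hash for Hashed<V> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.hash);
    }
}

// Equality still compares the real values, so hash collisions stay correct.
impl<V: PartialEq> PartialEq for Hashed<V> {
    fn eq(&self, other: &Self) -> bool {
        self.hash == other.hash && self.value == other.value
    }
}
impl<V: Eq> Eq for Hashed<V> {}

/// A hasher that passes a single pre-computed `u64` through unchanged.
#[derive(Default, Clone, Copy)]
struct PassHasher {
    hash: u64,
}

impl Hasher for PassHasher {
    fn write(&mut self, _bytes: &[u8]) {
        panic!("PassHasher only accepts a pre-computed u64 hash");
    }
    fn write_u64(&mut self, i: u64) {
        self.hash = i;
    }
    fn finish(&self) -> u64 {
        self.hash
    }
}

/// Builds `PassHasher`s for the map.
#[derive(Default, Clone, Copy)]
struct PassHash;

impl BuildHasher for PassHash {
    type Hasher = PassHasher;
    fn build_hasher(&self) -> PassHasher {
        PassHasher::default()
    }
}

/// An ordinary HashMap keyed by `Hashed<K>` with the pass-through hasher.
type PreHashMap<K, V> = HashMap<Hashed<K>, V, PassHash>;

/// Hypothetical usage: the key is hashed once in `Hashed::new`, and both the
/// insert and the lookup reuse that stored hash.
fn demo() -> (usize, Option<u32>) {
    let mut map: PreHashMap<String, u32> = PreHashMap::default();
    let key = Hashed::new("mesh_layout".to_string());
    map.insert(key.clone(), 7);
    (map.len(), map.get(&key).copied())
}
```

Because the key keeps the original value, the map can still fall back to an equality check when two hashes collide, which is what the old "hash as key" approach could not do.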

In addition to providing these core tools, I also ported the "table identity hashing" in bevy_ecs to raw_entry_mut, which was a particularly egregious case.

The biggest outstanding case is AssetPathId, which stores a pre-hash. We need AssetPathId to be cheaply clone-able (and ideally Copy), but Hashed<AssetPath> requires ownership of the AssetPath, which makes cloning ids way more expensive. We could consider doing Hashed<Arc<AssetPath>>, but cloning an Arc is still a non-trivial expense that needs to be considered. I would like to handle this in a separate PR. And given that we will be re-evaluating the Bevy Assets implementation in the very near future, I'd prefer to hold off until after that conversation is concluded.

crates/bevy_utils/src/lib.rs

Guvante

LGTM except splitting out fast_eq seems like unnecessary complexity

crates/bevy_utils/src/lib.rs

cart · 2022-02-18T03:25:46Z

bors r+


bors · 2022-02-18T03:53:23Z

Pull request successfully merged into main.

Build succeeded:

cart added 6 commits February 16, 2022 14:12

use hashbrown for bevy hashmaps / hashsets

866f758

proper prehashing

0f5d298

remove commented out code

9d4c30e

bevy_ecs: replace incorrect pre-hashing strategy

18a40ae

Add docs

cd7131f

clippy

3d56b37

github-actions bot added the S-Needs-Triage label Feb 16, 2022

cart added the A-Core label and removed the S-Needs-Triage label Feb 16, 2022

fix doc tests

6793774

james7132 reviewed Feb 17, 2022

crates/bevy_utils/src/lib.rs (outdated, resolved)

Guvante reviewed Feb 17, 2022

crates/bevy_utils/src/lib.rs (outdated, resolved)

cart added 2 commits February 17, 2022 19:12

resolve comments

5c4c601

tweak docs

3ca8841

bors bot changed the title from "Proper prehashing" to "[Merged by Bors] - Proper prehashing" Feb 18, 2022

bors bot closed this Feb 18, 2022

This was referenced Feb 18, 2022

[Merged by Bors] - Mesh vertex buffer layouts #3959

Closed

Update hashbrown requirement from 0.11 to 0.12 #4004

Closed

alice-i-cecile mentioned this pull request May 11, 2022

Update to hashbrown 0.12 #4722

Closed
