std: Cache HashMap keys in TLS #33318

alexcrichton · 2016-05-01T18:22:31Z

This is a rebase and extension of #31356 where we not only cache the keys in
thread local storage but we also bump each key every time a new HashMap is
created. This should give us a nice speed boost in creating hash maps along with
retaining the property that all maps have a nondeterministic iteration order.

Closes #27243
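
For illustration, the cache-and-bump scheme described above could look roughly like the sketch below. This is not the code from the PR: `os_random_u64` and `next_keys` are hypothetical names, and the helper only stands in for an OS RNG read so that the example builds with std alone.

```rust
use std::cell::Cell;
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical stand-in for reading 64 bits from the OS RNG.
/// It leans on std's own randomly seeded `RandomState`; it is not a
/// substitute for a real OS RNG call and is only here to keep the sketch std-only.
fn os_random_u64() -> u64 {
    RandomState::new().build_hasher().finish()
}

thread_local! {
    // Initialized lazily, once per thread, on first access.
    static KEYS: Cell<(u64, u64)> = Cell::new((os_random_u64(), os_random_u64()));
}

/// Returns the cached pair for this thread, then bumps the first key so
/// that the next map created on this thread gets a different pair.
fn next_keys() -> (u64, u64) {
    KEYS.with(|keys| {
        let (k0, k1) = keys.get();
        keys.set((k0.wrapping_add(1), k1));
        (k0, k1)
    })
}

fn main() {
    let a = next_keys();
    let b = next_keys();
    // Consecutive pairs differ by exactly one in the first key.
    assert_eq!(b.0, a.0.wrapping_add(1));
    assert_eq!(b.1, a.1);
    println!("{:?} then {:?}", a, b);
}
```

Only the first access on a thread pays for seeding; every later call is a thread-local read and an increment.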

rust-highfive · 2016-05-01T18:22:35Z

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

sfackler · 2016-05-10T00:07:52Z

ping?

briansmith · 2016-05-10T00:17:03Z

Why is it safe to key different HashMaps with keys that are known to each differ by only 1?

eternaleye · 2016-05-10T00:18:55Z

@briansmith: Because if it was not safe that would constitute a related-key attack, and none is known for SipHash?

briansmith · 2016-05-10T00:21:37Z

> @briansmith: Because if it was not safe that would constitute a related-key attack, and none is known for SipHash?

The hash function isn't necessarily SipHash. It's pluggable per-hashmap and also the default may be changed.
I don't find "we don't currently know of a related-key attack on SipHash" very encouraging. There are lots of ways to derive N secret keys from an initial secret key that are fast and don't depend on the related-key-security of the hashing function.

eternaleye · 2016-05-10T00:23:32Z

@briansmith: Fair enough on both points - Not being assured of SipHash means that it may well be an issue, and I agree that the margin of security even so is unnecessarily thin.

sfackler · 2016-05-10T00:24:09Z

This logic is only used by the SipHash hasher.

aturon · 2016-05-10T01:01:49Z

r=me on the code and perf front. But I don't feel qualified to judge the security/DDoS protection side of this.

briansmith · 2016-05-10T02:42:08Z

> But I don't feel qualified to judge the security/DDoS protection side of this.

Others have quoted this, but again, this is what the SipHash paper says:

> On startup a program reads a secret SipHash key from the operating system’s cryptographic random-number generator; the program then uses SipHash for all of its hash tables.

Thus, the easiest thing to do is what the SipHash authors recommend: Use std::sync::Once to generate one key using the OsRng, and then use that same key for every hash table.

Think about the threat model: We assume that the attacker can add or remove arbitrary (key, value) entries from any hash table used in the program. From this, it follows that we assume the attacker can change any hash table A into any other hash table B by removing all the items from A and then copying all the entries from B into A. Thus, it seems to not help if A or B have different keys, at least under this threat model. If you have a different threat model, it would be a good idea to document it.

More generally, crypto people never generate a secret key by adding a constant value to another secret key. See https://en.wikipedia.org/wiki/Related-key_attack for an introduction to why. tl;dr: Knowing the difference of two secrets can help an attacker find the value of one (usually both) secrets, even if they wouldn't be able to find the values any other way. Because no crypto people would do this, it is unlikely that somebody will seriously study the problems that may or may not occur when somebody does what is proposed in this PR because we generally assume it is a-priori wrong to do.

HTH.
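
As a rough sketch of the one-key-per-process suggestion above: the comment names `std::sync::Once`, while this example swaps in `std::sync::OnceLock`, a std type stabilized later that wraps the same run-once idea around a stored value. `os_random_u64` and `process_keys` are hypothetical names, and the helper is only a std-only stand-in for an OsRng read.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::sync::OnceLock;
use std::thread;

/// Hypothetical stand-in for reading 64 bits from the OS RNG
/// (std-only so the sketch builds without external crates).
fn os_random_u64() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Generates the key pair on first use and returns that same pair to
/// every later caller, on any thread.
fn process_keys() -> (u64, u64) {
    static KEYS: OnceLock<(u64, u64)> = OnceLock::new();
    *KEYS.get_or_init(|| (os_random_u64(), os_random_u64()))
}

fn main() {
    let here = process_keys();
    let there = thread::spawn(process_keys).join().unwrap();
    // Every hash table in the process would be keyed identically.
    assert_eq!(here, there);
    println!("process-wide keys: {:?}", here);
}
```

With this layout no two keys are ever derived from one another, so the related-key question does not come up; the cost is that iteration order becomes deterministic within a single process run.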

aturon · 2016-05-17T19:44:04Z

@briansmith Thanks for the comment! That does indeed help highlight some of the tradeoffs here.

AIUI, the motivation for using these distinct (but related) keys is just to prevent clients of the default hashmap from accidentally assuming that all instances share a common key -- a behavior we could conceivably want to change in the future. But it could easily be that this cure is worse than the disease, and we'd be better off just very clearly documenting that you cannot rely on the apparent determinism. We just risk de facto lock-in to that behavior, but that seems (to me) better than taking a step that could easily end up revealing hashmap keys.

@rust-lang/libs Thoughts here?

alexcrichton · 2016-05-18T00:42:39Z

Yeah I'm not too worried about switching to a per-process key with the risk of relying on a per-process deterministic iteration order. It's just a "nice to have" to make everything nondeterministic really I think.

huonw · 2016-05-18T01:00:54Z

It seems to me that we could have this per-thread without the adjustment, but maybe having inter-thread differences isn't worth the slightly higher complexity vs. just being uniform through a process.

aturon · 2016-05-19T23:16:47Z

Per-thread sounds like a good compromise across the board. @alexcrichton, want to update accordingly?

This is a rebase and extension of rust-lang#31356 where we cache the keys in thread local storage. This should give us a nice speed boost in creating hash maps along with mostly retaining the property that all maps have a nondeterministic iteration order. Closes rust-lang#27243

alexcrichton · 2016-05-20T00:00:00Z

Sounds like a plan to me, I've updated the PR, the comment, and I also tweaked to use OsRng directly. The thread_rng isn't necessary when we cache per-thread b/c we're gonna hit OsRng for the first time on each thread anyway.
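
A minimal sketch of the per-thread variant agreed on here, with no per-map adjustment, might look like this. It is an assumption-laden illustration rather than the merged code: `os_random_u64` and `thread_keys` are hypothetical names, and the helper stands in for the direct OsRng read mentioned above.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::thread;

/// Hypothetical stand-in for reading 64 bits from the OS RNG
/// (std-only so the sketch builds without external crates).
fn os_random_u64() -> u64 {
    RandomState::new().build_hasher().finish()
}

thread_local! {
    // Seeded once per thread on first access and never modified afterwards.
    static KEYS: (u64, u64) = (os_random_u64(), os_random_u64());
}

/// Returns this thread's cached key pair; every map created on the same
/// thread would share it.
fn thread_keys() -> (u64, u64) {
    KEYS.with(|&keys| keys)
}

fn main() {
    let a = thread_keys();
    let b = thread_keys();
    // Same thread, same keys.
    assert_eq!(a, b);

    // Another thread seeds its own pair independently.
    let other = thread::spawn(thread_keys).join().unwrap();
    println!("main: {:?}, other thread: {:?}", a, other);
}
```

Because each thread seeds independently, keys on different threads are unrelated random values, while maps on one thread share a key and so share an iteration order for identical contents.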

aturon · 2016-05-20T16:08:52Z

Thanks!

@bors: r+

bors · 2016-05-20T16:08:53Z

📌 Commit eaeef3d has been approved by aturon

bors · 2016-05-20T19:39:01Z

⌛ Testing commit eaeef3d with merge 179539f...

std: Cache HashMap keys in TLS

This is a rebase and extension of #31356 where we not only cache the keys in thread local storage but we also bump each key every time a new `HashMap` is created. This should give us a nice speed boost in creating hash maps along with retaining the property that all maps have a nondeterministic iteration order. Closes #27243

bors · 2016-05-20T23:36:59Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-32cross-opt, auto-linux-64-cargotest, auto-linux-64-cross-armhf, auto-linux-64-cross-armsf, auto-linux-64-cross-freebsd, auto-linux-64-cross-netbsd, auto-linux-64-debug-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-opt-mir, auto-linux-64-opt-rustbuild, auto-linux-64-x-android-t, auto-linux-cross-opt, auto-linux-musl-64-opt, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-mac-64-opt-rustbuild, auto-mac-cross-ios-opt, auto-win-gnu-32-nopt-t, auto-win-gnu-32-opt, auto-win-gnu-32-opt-rustbuild, auto-win-gnu-64-nopt-t, auto-win-gnu-64-opt, auto-win-msvc-32-cross-opt, auto-win-msvc-32-opt, auto-win-msvc-64-cargotest, auto-win-msvc-64-opt, auto-win-msvc-64-opt-mir, auto-win-msvc-64-opt-rustbuild

rust-highfive assigned aturon May 1, 2016

alexcrichton mentioned this pull request May 1, 2016

Seed HashMaps thread-locally, straight from the OS. #31356

Closed

alexcrichton added the T-libs-api label (Relevant to the library API team, which will review and decide on the PR/issue.) May 18, 2016

alexcrichton force-pushed the hashmap-seed branch from d7503b2 to eaeef3d Compare May 19, 2016 23:59

bors mentioned this pull request May 20, 2016

Book: small improvement to a table to make it clearer #33743

Merged

bors merged commit eaeef3d into rust-lang:master May 20, 2016

bluss added the relnotes label (Marks issues that should be documented in the release notes of the next release.) May 21, 2016

alexcrichton deleted the hashmap-seed branch May 25, 2016 00:22

Veedrac mentioned this pull request Sep 14, 2016

Exposure of HashMap iteration order allows for O(n²) blowup. #36481

Open
