Use getrandom crate for retrieving system entropy? #62079
Comments
Are there existing instances of std depending on another crate for cross-OS functionality? In particular what I'm wondering is: When a new platform is added, do we have a strategy for how to update both `std` and `getrandom`?
For unsupported platforms […]
I think this should only be done if […]. What concrete problem is this actually solving?
Note that even earlier this month there were serious bugs found in the […]
I'd personally like to see some sort of first-class API in Rust which solves this particular problem. That said, I think […]; I'm not sure of the proper interface boundary to make this work "first class": a lang item? To answer @briansmith's concerns: should […]? As I look at recent […], I think it probably should be. The overwhelming majority of CPUs and microcontrollers are dedicating silicon to this problem. Based on that, and the amount of time I spend pontificating on this particular problem with both hardware designers and cryptographers, I think the problem is a […]
@briansmith
Having a single code base for securely retrieving system entropy, not 3 or more. Considering the non-triviality of this problem on some targets, I believe it's a problem worth solving.
One of the reasons cited for why this is needed is that HashMap depends on randomness. In the course of implementing aHash I found a much faster way to do this. For its randomization aHash uses a combination of const-random and the address of the HashBuilder. This essentially eliminates any runtime overhead and makes creating small hashmaps on the stack much cheaper. I am hoping aHash becomes the default hasher, but even if it doesn't, this approach is worth considering.
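To make that idea concrete, here is a minimal sketch (my own illustration, not aHash's actual code) of a `BuildHasher` seeded from a compile-time constant combined with the address of the builder itself. The constant is hard-coded here for readability; aHash derives a per-binary value at build time via the `const-random` crate, which is exactly the reproducible-builds concern raised below.

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{BuildHasher, Hasher};

// Hypothetical per-binary key; aHash generates one at compile time
// (via `const-random`), so every build gets a different value.
const PER_BINARY_KEY: u64 = 0x243F_6A88_85A3_08D3;

struct AddressSeeded {
    per_binary_key: u64,
}

impl BuildHasher for AddressSeeded {
    type Hasher = DefaultHasher;

    fn build_hasher(&self) -> DefaultHasher {
        // Mix the (ASLR-dependent) address of this builder into the key,
        // giving a cheap per-instance seed with no system call at runtime.
        let seed = self.per_binary_key ^ (self as *const Self as u64);
        let mut hasher = DefaultHasher::new();
        hasher.write_u64(seed);
        hasher
    }
}

fn main() {
    let mut map: HashMap<&str, u32, AddressSeeded> =
        HashMap::with_hasher(AddressSeeded { per_binary_key: PER_BINARY_KEY });
    map.insert("answer", 42);
    assert_eq!(map["answer"], 42);
}
```

The point is only to show where the entropy comes from under this scheme; whether that is strong enough for a default is what the rest of the thread debates.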
So you combine a high-quality per-binary key and a low-quality (limited address space) per-run key with a hash function optimised for small input whose security relies on no one seeing the output? Certainly intriguing. I guess the important question is whether this is good enough for everyone's hashing needs (because if not, it doesn't remove the need for a system RNG in […]).
@tkaitchuck That approach seems thoroughly unsuitable as a default, as its security relies on both: […]
The current, default […]. Moreover, cryptography is littered with broken systems that were designed under assumptions that weren't precisely the ones that are guaranteed by the actual primitives used; having the default hash implement a random oracle (i.e. a PRF) is the strongest requirement which we can make of it, it matches the intuition of what a hash function is, and many other desirable properties (MAC, ...) follow from it. On the topic of aHash itself, it doesn't seem to make any well-defined security claims, let alone claims w.r.t. the standard models used in cryptography (being a PRF, a XOF, a MAC, ...); as such, it's unclear to me how anyone is supposed to use it appropriately (i.e. only relying on properties it claims to provide).
As @KellerFuchs noted this breaks reproducible builds, which have been a major goal of the Rust ecosystem and an area of concern for the Rust Secure Code WG. I personally consider anything which breaks reproducible builds to be highly undesirable.
I strongly hope anything selected as the default hasher has undergone a considerable amount of peer review, particularly by those involved in previous hashDoS attacks (e.g. Jean-Philippe Aumasson, Martin Boßlet, djb). I do not think aHash meets that bar. What aHash does seems not too far off from Golang's […]
That simply is not true. If an attacker can induce values to be hashed and see the resulting hashes, they can simply brute-force partial collisions. No hashing algorithm can protect against this.
@tkaitchuck Yes, that's what “doesn't provide a meaningful advantage” means: the adversary cannot distinguish it (up to some negligible advantage, here it's IIRC 2⁻⁶⁴) from a random oracle, and so cannot do better than a generic brute-force attack (i.e. they don't gain a meaningful advantage from us using the concrete SipHash function, compared to the idealised model). That's the essence of what security claims in cryptography are, and you might want to familiarise yourself with that field before pushing for the inclusion of your ego-project in the stdlib?
@KellerFuchs Consider a (bad) hash table that decides to rehash after N consecutive collisions. If an attacker can just observe the iteration order of the map, they can very rapidly exhaust the system's memory, because if the lower bits of a group of keys are the same, they will all go into the same hash bucket. So brute-forcing N bits can force the allocation of 2^N of RAM. If instead collisions are linked, a CPU-use attack becomes trivial. Instead, a HashMap must inline the collisions into the array, should keep additional bits (as HashBrown does), and should change the hash function upon resize. Attempting to use a "secure" hash function with a bad collision-mitigation mechanism in the HashMap is pure security theater.

Once all of those requirements on the HashMap are in place, the requirements on the hash function are actually weak but very specific: it must meet the strict avalanche criterion and the bit independence criterion for all the bits of the input and of the key. That's it. It doesn't have to be a strong PRF, be non-reversible, or be indistinguishable from random output. Asserting that these properties hold is in fact counterproductive, because it leads people to the incorrect belief that these are relevant or sufficient when they are not, which in turn leads to the very common mistake of installing a 'strong' hash function and assuming the problem is solved, leaving the system wide open to attack.
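As a toy illustration of the "change the hash function upon resize" point (a sketch under my own simplifying assumptions, not how std or hashbrown actually grow their tables), one can rebuild the map with a freshly drawn `RandomState`, and therefore fresh SipHash keys, whenever it is about to grow:

```rust
use std::collections::HashMap;
use std::collections::hash_map::RandomState;
use std::hash::Hash;

// Toy wrapper: whenever the table is about to grow, rebuild it with a
// freshly drawn RandomState (new SipHash keys) instead of keeping the
// original keys for the lifetime of the map.
struct RekeyOnGrow<K, V> {
    inner: HashMap<K, V, RandomState>,
}

impl<K: Eq + Hash, V> RekeyOnGrow<K, V> {
    fn new() -> Self {
        RekeyOnGrow { inner: HashMap::with_hasher(RandomState::new()) }
    }

    fn insert(&mut self, key: K, value: V) -> Option<V> {
        if self.inner.len() == self.inner.capacity() {
            // Growth is imminent: move all entries into a larger table
            // keyed with fresh randomness.
            let mut fresh = HashMap::with_capacity_and_hasher(
                self.inner.capacity().max(4) * 2,
                RandomState::new(),
            );
            fresh.extend(self.inner.drain());
            self.inner = fresh;
        }
        self.inner.insert(key, value)
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.inner.get(key)
    }
}
```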
@tkaitchuck this will be my last response on this issue as I feel like your hash function is a distraction from the real issue at hand, and for that reason I'll keep it brief. The core idea of hashDoS is applying traditional cryptanalysis techniques to non-cryptographic hash functions, and then leveraging those attacks to generate multicollisions. If we look at how the attacks on MurmurHash2 and CityHash (i.e. hashDoS Reloaded as presented at 29c3) actually worked, they were both based on differential cryptanalysis (see Martin Boßlet's excellent writeup on these attacks). Reduced-round AES is known to be vulnerable to differential cryptanalysis, including but not limited to practical impossible differential cryptanalysis up to 5 rounds, as well as practical key-recovery attacks up to 5 rounds. Constructions which purport to address hashDoS require rigorous security requirements and analysis, ideally published in the form of a peer-reviewed research paper. Your construction does not meet these requirements, and does quite the opposite in terms of using AES in ways which, for me at least, set off alarm bells.
Yes. This has gotten off topic. Please post any comments in terms of what you would like to see here: tkaitchuck/aHash#10 |
Yes, when rehashing a table, one should select a new random key (hence the term rehashing...). Nobody in this discussion claimed the opposite, or that […]
This is incorrect for several reasons: […]
I'm not sure why you are asserting that weakening the properties provided by a public API is good ergonomics and will lead to less insecure code being written; I'm not usually in the habit of repeating myself, but here you go: […]
In any case, this isn't something I will debate: providing misuse-resistant constructions is the best tool we have to help people avoid using them insecurely; this is a widely acknowledged fact, recognised for instance by cryptographer Phillip Rogaway (who was instrumental in developing the notion of misuse resistance in cryptography) and by @tarcieri (who is a Rust Secure Code WG member, created the cryptography WG, and the Miscreant misuse-resistant cryptography library)... This is the last comment I will answer, considering that I do not believe you are arguing in good faith: […]
Note that those are characteristic features of sealioning, a common trolling and harassment tactic, and that neither of those behaviours (harassment and trolling) is compatible with the Rust Code of Conduct.
The last comment is from 2019. Since that time, I think it's clear that the needs of libstd are different from what a typical […]
Right now `std` and `getrandom` essentially duplicate each other. And Rust already depends on `rand` (via `tempfile`), which in v0.7 will use `getrandom`. So I think it makes sense to use a single implementation, to keep work on correctly retrieving system entropy focused in one place.

Note that right now I do not propose to expose the `getrandom` API as part of `std` or to introduce a lang item; that is a separate discussion, see rust-random/getrandom#21. Also, it will probably be better to wait until rust-random/getrandom#13 is closed, which should happen relatively soon.

cc @dhardy @tarcieri
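For context, the consumer-facing surface of the crate is small. A minimal sketch of what a single shared entropy implementation looks like from the caller's side, assuming the `getrandom::getrandom(&mut buf)` function of the 0.1-era API (the seed size and function name below are my own illustration):

```rust
// Cargo.toml (assumed): getrandom = "0.1"
use getrandom::getrandom;

/// Fill a 16-byte seed (e.g. for a hasher's keys) from the platform
/// entropy source via the getrandom crate.
fn hash_seed() -> Result<[u8; 16], getrandom::Error> {
    let mut seed = [0u8; 16];
    getrandom(&mut seed)?;
    Ok(seed)
}

fn main() {
    match hash_seed() {
        Ok(seed) => println!("seed: {:02x?}", seed),
        Err(e) => eprintln!("entropy source unavailable: {}", e),
    }
}
```

Internally the crate dispatches to the appropriate per-target primitive (e.g. the Linux getrandom syscall); that per-target dispatch is the duplicated work the proposal wants to keep in one place.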