-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove last non-dev dependency on rand crate #1324
Conversation
This allows dropping _rand_ crate dep here, accept as a dev dependency for tests or benchmarks.
This is really great to see! Two questions, primarily out of curiosity:
|
Rand 0.7 is better than 0.6 was, but its still 9 crates total that can be avoided. To test this, I made a standalone project out of "tokio/examples/hello_world.rs" and patched in tokio master from local git tree, with all rand crates and other dependencies already downloaded: before (patched to current master, d0bb161)% time cargo build
real 0m28.327s
user 1m26.310s
sys 0m5.002s
% du -hs target/
253M target/
% cargo clean
% time cargo build --release
real 0m58.074s
user 3m27.050s
sys 0m4.468s
% du -hs target/
85M target/
% cargo clean after (patch tree updated to this PR, 1b0e41a)% cargo update
Removing autocfg v0.1.5
Removing c2-chacha v0.2.2
Removing getrandom v0.1.6
Removing ppv-lite86 v0.2.5
Removing rand v0.7.0
Removing rand_chacha v0.2.0
Removing rand_core v0.5.0
Removing rand_hc v0.2.0
Removing spin v0.5.0
% time cargo build
real 0m25.559s
user 1m15.476s
sys 0m4.558s
% du -hs target/
236M target/
% cargo clean
% time cargo build --release
real 0m54.466s
user 3m11.480s
sys 0m3.758s
% du -hs target/
77M target/ |
That's a really solid improvement! I can't speak to the code changes, but in principle, +1. |
It's probably really bad if crates here and there start inventing their own seeding and PRNG implementations. The reason we wanted to really get away from it in As Amanieu points out, everyone else should depend on |
If all you need is some entropy to seed something, maybe don't use |
I agree it's a problem that Again, I can't stress enough how one should not use |
@faern, your concern would seem to apply firstly to #466 which has been merged for while. That author found custom Xorshift PRNG significanty faster then using rand s In this PR, I'm basically just trying to complete that process and reap the associated dependency minimization benefits. Yes getrandom is a possible alternative for seeding, which adds back one dep. I don't know of any upside to that alternative though.
A fringe benefit then, IMO. |
A benefit of using 0x9b4e_6d25 // misc bits
I'm not familiar with that PR since I'm not really associated with tokio. I just randomly landed on this PR just now thanks to a github notification. But yes, on first sight that PR does not sound optimal to me. Would have been better to try
This makes it sound like you are only removing stuff that is unused. But tokio does currently use |
My personal opinion is that this project should try out some other low dependency PRNG using |
No, I friendly cc'd you because you were the author of the same Xorshift PRNG in parking_lot_core: Amanieu/parking_lot@4d78285cef7
Actually even with |
Yes. Even if still not optimal it would better signal "Just not zero" IMO. Some other constant, like the one here at least directly makes me think about why this number was chosen. In cryptography in general there are a lot of magic constants chosen for some unique proven property about them. So when reading this code my first reaction is that this value is somehow important in itself. I strongly believe that the best is to avoid magic numbers, and the second best is to use the least surprising ones if they are needed. And I do realize we did not do very well in I see the cc now. Missed that on the first glance, sorry about that. |
I think rand_pcg could be a good choice. It only depends on |
Thanks for feedback to date. I've used it to improve the seed function and its comments. Stepping back slightly, please consider the following prior art using this same 32-bit Xorshift PRNG: In the libstd, pattern-defeating quicksort, where the same 32-bit Xorshift PRNG is seeded with the sort slice buffer length (obviously predictable). In Amanieu/parking_lot#144 where the same 32-bit Xorshift PRNG is used, here seeded by a 1-based incremented atomic (±predictable). Here via #466, currently and as in this PR, with per-thread unpredictable seeds. All of these use cases desire a very fast PRNG with minimal dependencies and can easily afford significant statistical bias in the PRNG output. As far as seeding, we can see in the above that much less care was given to it in other places. I don't think I've "invented" anything in this PR, but rather just found some appropriate if novel use of public libstd interfaces. Would anyone CC'd suggest I also offer this better, unpredictable seeding function to parking_lot or libstd for that matter? Regarding PCGTo be clear, I'm not personally going to propose replacing Xorshift-32 with PCG-32, because I simply don't think its necessary for this use case. This is largely orthogonal to both the dependencies question/goal and the seeding question. If someone wanted to separately propose changing to PCG-32, I would recommend just swapping in the 4-lines of the PCG-32 with the 3-lines of Xorshift-32 here. Alternative: Use getrandom crate for seedingFirst, this would still need to be combined with some per-thread counter or other mechanism, so as to avoid the cost of system random bytes per-thread created. Also rand's getrandom crate, which was originally extracted from libstd, is returning to its origin, see rust-lang/rust#62079, rust-lang/rust#62082! So in a future rust stable release, the change of this PR will be using getrandom. Its better for users to not take on a second copy of getrandom as a dependency of tokio when it can be reasonable avoided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK w/ this strategy. The requirements for the random generator here are pretty minimal. Reducing the number of deps seems good 👍
// 128-bit state). We only need one of these, to make the seeds for all | ||
// process threads different via hashed IDs, collision resistant, and | ||
// unpredictable. | ||
lazy_static! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably wouldn't bother with a lazy_static
. Instead, why not create one when the pool is created and pass the seed in to each worker when that is initialized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually tried that first, but get compile errors because &self
is not constant (in fn rand_usize
) for the thread local PRNG. It is an appropriate compiler lint, I think, because if its thread local then it needs to be the same across all pools... I mean, to the extent that the same thread could even be a part of multiple pools (a detail the compiler apparently doesn't know).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, not critical for merging this PR. We can follow up if we care.
@faern I hear your concern, but |
Use std RandomState for XorShift seeding. This allows dropping _rand_ crate dep here, accept as a dev dependency for tests or benchmarks.
Use std RandomState for XorShift seeding. This allows dropping _rand_ crate dep here, accept as a dev dependency for tests or benchmarks.
Use std RandomState for XorShift seeding. This allows dropping _rand_ crate dep here, accept as a dev dependency for tests or benchmarks.
* upgrade to rand 0.7.0 (MSRV 1.32) * upgrade to parking_lot 0.9.0 * Remove last non-dev dependency on rand crate (#1324) Use std RandomState for XorShift seeding. This allows dropping _rand_ crate dep here, accept as a dev dependency for tests or benchmarks. * increase CI MSRV to 1.31.0 * increase nightly for CI TSAN tests * add TSAN suppressions for recent rand related updates * make latest TSAN suppression patterns more general * upgrade tempfile dev dep for common rand version But avoid tempfile 3.2 for now, since history demonstrates it bumps rand versions and MSRV in MINOR updates. * update (dev dep) env_logger to latest 0.6 * reactor, threadpool: bump PATCH versions, doc links, change logs [ci-release]
Motivation
Minimizing dependencies to reduce build size and time for typical tokio applications.
Background
In parking_lot 0.9.0 (upgraded here recently in #1298) the rand dependency was replaced entirely with its own private Xorshift PRNG, used for its fairness mechanism.
The tokio-threadpool also has its own Xorshift PRNG for "fair" work stealing, introduced in #466 as a performance improvement. However the rand dependency was kept, its use limited to producing the initial seeds for the thread local Xorshift PRNGs. This is currently the only existing non-dev dependency on rand, direct or transitive, in tokio. As mentioned, parking_lot no longer depends on rand, and tempfile (rand dep) is also only a dev dependency.
Solution
This PR changes the seed generation to piggyback on libstd's mechanism for obtaining system randomness for the default hasher (as used by HashMap for HashDoS resistance), and hashing on the current thread id. Based on some
eprintln
testing of seeds and Xorshift output, this strategy appears more than adequate for the use case of approximating fairness.cc: @carllerche @stjepang @Amanieu @faern