Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seed HashMaps thread-locally, straight from the OS. #31356

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 17 additions & 6 deletions src/libstd/collections/hash/map.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,10 @@ use borrow::Borrow;
use cmp::max;
use fmt::{self, Debug};
use hash::{Hash, SipHasher, BuildHasher};
use internal_rand;
use iter::{self, Map, FromIterator};
use mem::{self, replace};
use ops::{Deref, Index};
use rand::{self, Rng};

use super::table::{
self,
Expand Down Expand Up @@ -1648,9 +1648,10 @@ impl<'a, K, V, S> Extend<(&'a K, &'a V)> for HashMap<K, V, S>

/// `RandomState` is the default state for `HashMap` types.
///
/// A particular instance `RandomState` will create the same instances of
/// `Hasher`, but the hashers created by two different `RandomState`
/// instances are unlikely to produce the same result for the same values.
/// A particular instance `RandomState` will create the same instances
/// of `Hasher`, but there is no guarantee that hashers created by two
/// different `RandomState` instances will produce the same result for
/// the same values.
#[derive(Clone)]
#[stable(feature = "hashmap_build_hasher", since = "1.7.0")]
pub struct RandomState {
Expand All @@ -1664,8 +1665,18 @@ impl RandomState {
#[allow(deprecated)] // rand
#[stable(feature = "hashmap_build_hasher", since = "1.7.0")]
pub fn new() -> RandomState {
let mut r = rand::thread_rng();
RandomState { k0: r.gen(), k1: r.gen() }
thread_local!(static KEYS: (u64, u64) = {
// get some random bytes from the OS
let mut bytes = [0_u8; 16];
internal_rand::fill_bytes(&mut bytes);

let keys: (u64, u64) = unsafe { mem::transmute(bytes) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird transmute. Is this really ok? Seems like the alignment of the u8 array is probably less than the u64 tuple.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, should start with keys, and transmute keys to bytes before passing them to internal_rand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A transmute like this should work fine: it's reinterpreting the value, not the memory location. That is, let out = transmute::<A, B>(x) is/should be equivalent to (in C)

B out;
memcpy(&out, &x, sizeof(B))

It would definitely be problematic if it was &[u8; 16] to &(u64, u64).

(If the above is not the case, it seems to me like it is an unnecessary footgun.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Do we guarantee tuple layout?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but there's no way for that to matter here. If the transmute succeeds then you've loaded up 128 bits with randomness, and that's all that matters.

I am intrigued by the by-val transmute issue... I don't think I see a lot of transmuting of non-pointers... I have no idea how it should behave!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding padding will make the tuple larger, and hence cause the transmute to fail. Also, note that this is in the standard library, and hence can rely on current behaviours of the compiler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gankro You shouldn't see much transmutes of pointers either, since those are covered by regular casts (and &* to create a reference).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a more natural way to write this is to create a [u64; 2] and take a slice of it as &mut [u8] (its byte representation) and pass it to fill bytes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, let out = transmute::<A, B>(x) is/should be equivalent to (in C) ... (memcpy)

should be, but istm llvm could decide there are alignment-dependent instructions that would work better. just handwaving

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a more natural way to write this is to create a [u64; 2] and take a slice of it as &mut [u8](its byte representation) and pass it to fill bytes?

I did this originally, but it is significantly more fiddly.

llvm could decide there are alignment-dependent instructions that would work better. just handwaving

I think that would mean that LLVM is explicitly disobeying our instructions and hence would be a miscompilation. (I had a look at the IR, and it does compile down to a memcpy.)

keys
});

KEYS.with(|&(k0, k1)| {
RandomState { k0: k0, k1: k1 }
})
}
}

Expand Down
275 changes: 275 additions & 0 deletions src/libstd/internal_rand.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,275 @@
// Copyright 2013-2015 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! Interface to the operating system provided random number
//! generators:
//!
//! - Unix-like systems (Linux, Android, Mac OSX): read directly from
//! `/dev/urandom`, or from `getrandom(2)` system call if available.
//! - Windows: calls `CryptGenRandom`, using the default cryptographic
//! service provider with the `PROV_RSA_FULL` type.
//! - iOS: calls SecRandomCopyBytes as /dev/(u)random is sandboxed.
//! - OpenBSD: uses the `getentropy(2)` system call.


pub use self::imp::fill_bytes;

#[cfg(all(unix, not(target_os = "ios"), not(target_os = "openbsd")))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to deduplicate this with the existing randomness support? At least in terms of an internal implementation detail it'd be good to not have to keep this in sync in two places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mention above, I was/am going to remove the existing support from std since we only need what's in this file now. However, the rand crate will still need to exist for internal testing purposes; should it still be dedup'd with that? (It's definitely possible, it just means slightly more boilerplate in here.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry for letting this slip through the cracks, but yeah that's seems fine to me!

mod imp {
use fs::File;
use io::{self, Read};
use libc;
use sys::os::errno;

#[cfg(all(target_os = "linux",
any(target_arch = "x86_64",
target_arch = "x86",
target_arch = "arm",
target_arch = "aarch64",
target_arch = "powerpc",
target_arch = "powerpc64",
target_arch = "powerpc64le")))]
fn getrandom(buf: &mut [u8]) -> libc::c_long {
#[cfg(target_arch = "x86_64")]
const NR_GETRANDOM: libc::c_long = 318;
#[cfg(target_arch = "x86")]
const NR_GETRANDOM: libc::c_long = 355;
#[cfg(target_arch = "arm")]
const NR_GETRANDOM: libc::c_long = 384;
#[cfg(any(target_arch = "powerpc", target_arch = "powerpc64",
target_arch = "powerpc64le"))]
const NR_GETRANDOM: libc::c_long = 359;
#[cfg(target_arch = "aarch64")]
const NR_GETRANDOM: libc::c_long = 278;

unsafe {
libc::syscall(NR_GETRANDOM, buf.as_mut_ptr(), buf.len(), 0)
}
}

#[cfg(not(all(target_os = "linux",
any(target_arch = "x86_64",
target_arch = "x86",
target_arch = "arm",
target_arch = "aarch64",
target_arch = "powerpc",
target_arch = "powerpc64",
target_arch = "powerpc64le"))))]
fn getrandom(_buf: &mut [u8]) -> libc::c_long { -1 }

fn getrandom_fill_bytes(v: &mut [u8]) {
let mut read = 0;
while read < v.len() {
let result = getrandom(&mut v[read..]);
if result == -1 {
let err = errno() as libc::c_int;
if err == libc::EINTR {
continue;
} else {
panic!("unexpected getrandom error: {}", err);
}
} else {
read += result as usize;
}
}
}

#[cfg(all(target_os = "linux",
any(target_arch = "x86_64",
target_arch = "x86",
target_arch = "arm",
target_arch = "aarch64",
target_arch = "powerpc",
target_arch = "powerpc64",
target_arch = "powerpc64le")))]
fn is_getrandom_available() -> bool {
use sync::atomic::{AtomicBool, Ordering};
use sync::Once;

static CHECKER: Once = Once::new();
static AVAILABLE: AtomicBool = AtomicBool::new(false);

CHECKER.call_once(|| {
let mut buf: [u8; 0] = [];
let result = getrandom(&mut buf);
let available = if result == -1 {
let err = io::Error::last_os_error().raw_os_error();
err != Some(libc::ENOSYS)
} else {
true
};
AVAILABLE.store(available, Ordering::Relaxed);
});

AVAILABLE.load(Ordering::Relaxed)
}

#[cfg(not(all(target_os = "linux",
any(target_arch = "x86_64",
target_arch = "x86",
target_arch = "arm",
target_arch = "aarch64",
target_arch = "powerpc",
target_arch = "powerpc64",
target_arch = "powerpc64le"))))]
fn is_getrandom_available() -> bool { false }

pub fn fill_bytes(b: &mut [u8]) {
if is_getrandom_available() {
getrandom_fill_bytes(b)
} else {
let mut reader = File::open("/dev/urandom").expect("failed to open /dev/urandom");
reader.read_exact(b).expect("failed to read bytes from /dev/urandom");
}
}
}

#[cfg(target_os = "openbsd")]
mod imp {
use libc;
use sys::os::errno;

pub fn fill_bytes(v: &mut [u8]) {
// getentropy(2) permits a maximum buffer size of 256 bytes
for s in v.chunks_mut(256) {
let ret = unsafe {
libc::getentropy(s.as_mut_ptr() as *mut libc::c_void, s.len())
};
if ret == -1 {
panic!("unexpected getentropy error: {}", errno());
}
}
}
}

#[cfg(target_os = "ios")]
mod imp {
use io;
use ptr;
use libc::{c_int, size_t};

enum SecRandom {}

#[allow(non_upper_case_globals)]
const kSecRandomDefault: *const SecRandom = ptr::null();

#[link(name = "Security", kind = "framework")]
extern "C" {
fn SecRandomCopyBytes(rnd: *const SecRandom,
count: size_t, bytes: *mut u8) -> c_int;
}

pub fn fill_bytes(v: &mut [u8]) {
let ret = unsafe {
SecRandomCopyBytes(kSecRandomDefault, v.len() as size_t,
v.as_mut_ptr())
};
if ret == -1 {
panic!("couldn't generate random bytes: {}",
io::Error::last_os_error());
}
}
}

#[cfg(windows)]
mod imp {
use io;
use sys::c;

// SBRM struct to ensure the cryptography context gets cleaned up
// properly
struct OsRng {
hcryptprov: c::HCRYPTPROV
}

impl OsRng {
/// Create a new `OsRng`.
pub fn new() -> io::Result<OsRng> {
let mut hcp = 0;
let ret = unsafe {
c::CryptAcquireContextA(&mut hcp, 0 as c::LPCSTR, 0 as c::LPCSTR,
c::PROV_RSA_FULL,
c::CRYPT_VERIFYCONTEXT | c::CRYPT_SILENT)
};

if ret == 0 {
Err(io::Error::last_os_error())
} else {
Ok(OsRng { hcryptprov: hcp })
}
}
}
impl Drop for OsRng {
fn drop(&mut self) {
let ret = unsafe {
c::CryptReleaseContext(self.hcryptprov, 0)
};
if ret == 0 {
panic!("couldn't release context: {}",
io::Error::last_os_error());
}
}
}

pub fn fill_bytes(v: &mut [u8]) {
let os = OsRng::new().expect("failed to acquire crypt context for randomness");
let ret = unsafe {
c::CryptGenRandom(self.hcryptprov, v.len() as c::DWORD,
v.as_mut_ptr())
};
if ret == 0 {
panic!("couldn't generate random bytes: {}",
io::Error::last_os_error());
}
}

}

#[cfg(test)]
mod tests {
use sync::mpsc::channel;
use super::fill_bytes;
use thread;

#[test]
fn test_fill_bytes() {
let mut v = [0; 1000];
fill_bytes(&mut v);
}

#[test]
fn test_fill_bytes_tasks() {

let mut txs = vec!();
for _ in 0..20 {
let (tx, rx) = channel();
txs.push(tx);

thread::spawn(move|| {
// wait until all the threads are ready to go.
rx.recv().unwrap();

let mut v = [0; 1000];

for _ in 0..100 {
fill_bytes(&mut v);
// deschedule to attempt to interleave things as much
// as possible (XXX: is this a good test?)
thread::yield_now();
}
});
}

// start all the threads
for tx in &txs {
tx.send(()).unwrap();
}
}
}
1 change: 1 addition & 0 deletions src/libstd/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -436,6 +436,7 @@ mod memchr;
pub mod rt;
mod panicking;
mod rand;
mod internal_rand;

// Some external utilities of the standard library rely on randomness (aka
// rustc_back::TempDir and tests) and need a way to get at the OS rng we've got
Expand Down