std: Stabilize the std::hash module
This commit aims to prepare the `std::hash` module for alpha by formalizing its
current interface while holding off on adding `#[stable]` to the new APIs. The
current usage with the `HashMap` and `HashSet` types is also reconciled by
separating out composable parts of the design. The primary goal of this slight
redesign is to separate the concepts of a hasher's state from a hashing
algorithm itself.

The primary change of this commit is to separate the `Hasher` trait into a
`Hasher` and a `HashState` trait. Conceptually the old `Hasher` trait was
actually just a factory for various states, but hashing had very little control
over how these states were used. Additionally the old `Hasher` trait was
actually fairly unrelated to hashing.

This commit redesigns the existing `Hasher` trait to match what the notion of a
`Hasher` normally implies with the following definition:

    trait Hasher {
        type Output;
        fn reset(&mut self);
        fn finish(&self) -> Self::Output;
    }

This `Hasher` trait emphasizes that hashing algorithms may produce outputs other
than a `u64`, so the output type is made generic. Other than that, however, very
little is assumed about a particular hasher. It is left up to implementors to
provide specific methods or trait implementations to feed data into a hasher.
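
As a rough illustration of the paragraph above, here is a minimal sketch of a
hasher whose output is not a `u64`. It is written in present-day Rust syntax
against a local copy of the trait, and `Fnv32` with its inherent `write`
method is a hypothetical example type, not something this commit adds:

```rust
// Local mirror of the trait from the commit message.
trait Hasher {
    type Output;
    fn reset(&mut self);
    fn finish(&self) -> Self::Output;
}

const FNV_OFFSET: u32 = 0x811c_9dc5;
const FNV_PRIME: u32 = 0x0100_0193;

// Hypothetical 32-bit FNV-1a hasher, used only for illustration.
struct Fnv32 {
    state: u32,
}

impl Fnv32 {
    fn new() -> Fnv32 {
        Fnv32 { state: FNV_OFFSET }
    }

    // Feeding data is an inherent method here: the trait itself says
    // nothing about how bytes reach the hasher.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state ^= b as u32;
            self.state = self.state.wrapping_mul(FNV_PRIME);
        }
    }
}

impl Hasher for Fnv32 {
    type Output = u32; // deliberately not a u64

    fn reset(&mut self) {
        self.state = FNV_OFFSET;
    }

    fn finish(&self) -> u32 {
        self.state
    }
}

fn main() {
    let mut h = Fnv32::new();
    h.write(b"a");
    println!("{:#x}", h.finish());
    h.reset(); // back to the initial state, ready for reuse
    println!("{:#x}", h.finish());
}
```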

The corresponding `Hash` trait becomes:

    trait Hash<H: Hasher> {
        fn hash(&self, &mut H);
    }

The old default of `SipState` was removed from this trait as it's not something
that we're willing to stabilize until the end of time, but the type parameter is
always required to implement `Hasher`. Note that the type parameter `H` remains
on the trait to enable multidispatch for specialization of hashing for
particular hashers.
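
To make the multidispatch point concrete, the following sketch gives one type
a separate `Hash` impl per hasher. The traits are local copies in present-day
syntax, and `SumHasher`, `XorHasher` and `Id` are hypothetical names chosen
for the example:

```rust
trait Hasher {
    type Output;
    fn reset(&mut self);
    fn finish(&self) -> Self::Output;
}

// The hasher is a type parameter of the trait, so a single type may
// carry a distinct impl for each hasher it wants to specialize on.
trait Hash<H: Hasher> {
    fn hash(&self, state: &mut H);
}

// Two toy hashers with different output types.
struct SumHasher(u64);
struct XorHasher(u8);

impl Hasher for SumHasher {
    type Output = u64;
    fn reset(&mut self) { self.0 = 0; }
    fn finish(&self) -> u64 { self.0 }
}

impl Hasher for XorHasher {
    type Output = u8;
    fn reset(&mut self) { self.0 = 0; }
    fn finish(&self) -> u8 { self.0 }
}

struct Id(u32);

// Specialized for `SumHasher`: add the whole value at once.
impl Hash<SumHasher> for Id {
    fn hash(&self, state: &mut SumHasher) {
        state.0 = state.0.wrapping_add(self.0 as u64);
    }
}

// Specialized for `XorHasher`: fold the value in byte by byte.
impl Hash<XorHasher> for Id {
    fn hash(&self, state: &mut XorHasher) {
        for &b in self.0.to_le_bytes().iter() {
            state.0 ^= b;
        }
    }
}

fn main() {
    let id = Id(0x0102);

    let mut sum = SumHasher(0);
    id.hash(&mut sum); // picks `Hash<SumHasher>` from the argument type

    let mut xor = XorHasher(0);
    id.hash(&mut xor); // picks `Hash<XorHasher>`

    println!("{} {}", sum.finish(), xor.finish());
}
```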

Note that `Writer` is not mentioned in either `Hash` or `Hasher`; it is
simply used as part of `derive` and the implementations for all primitive types.
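
A sketch of how `Writer` could fit in, under the assumption that it is a
byte-oriented trait with a single `write` method (the commit message does not
spell out its signature). `Point` and `ByteCounter` are hypothetical, and the
hand-written impl for `Point` only approximates what `derive` might generate:

```rust
trait Hasher {
    type Output;
    fn reset(&mut self);
    fn finish(&self) -> Self::Output;
}

trait Hash<H: Hasher> {
    fn hash(&self, state: &mut H);
}

// Assumed shape of the byte-oriented `Writer` trait.
trait Writer {
    fn write(&mut self, bytes: &[u8]);
}

// Primitives ask for both capabilities, like the `S: Writer + Hasher`
// bounds that appear in the diff.
impl<S: Writer + Hasher> Hash<S> for u32 {
    fn hash(&self, state: &mut S) {
        state.write(&self.to_le_bytes());
    }
}

struct Point {
    x: u32,
    y: u32,
}

// Roughly what a derived impl would do: hash each field in order.
impl<S: Writer + Hasher> Hash<S> for Point {
    fn hash(&self, state: &mut S) {
        self.x.hash(state);
        self.y.hash(state);
    }
}

// Toy hasher that only counts the bytes it has been fed.
struct ByteCounter(usize);

impl Writer for ByteCounter {
    fn write(&mut self, bytes: &[u8]) {
        self.0 += bytes.len();
    }
}

impl Hasher for ByteCounter {
    type Output = usize;
    fn reset(&mut self) { self.0 = 0; }
    fn finish(&self) -> usize { self.0 }
}

fn main() {
    let mut counter = ByteCounter(0);
    Point { x: 1, y: 2 }.hash(&mut counter);
    println!("bytes written: {}", counter.finish());
}
```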

With these definitions, the old `Hasher` trait is realized as a new `HashState`
trait in the `collections::hash_state` module as an unstable addition for
now. The current definition looks like:

    trait HashState {
        type Hasher: Hasher;
        fn hasher(&self) -> Self::Hasher;
    }

The purpose of this trait is to emphasize that the one piece of functionality
for implementors is that new instances of `Hasher` can be created.  This
conceptually represents the two keys from which more instances of a
`SipHasher` can be created, and a `HashState` is what's stored in a
`HashMap`, not a `Hasher`.

Implementors of custom hash algorithms should implement the `Hasher` trait, and
only hash algorithms intended for use in hash maps need to implement or worry
about the `HashState` trait.
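
The split between state and hasher can be sketched as follows. The traits are
local copies in present-day syntax; `KeyedState`, `KeyedHasher` and
`bucket_for` are hypothetical, and the mixing function is an FNV-style toy
rather than SipHash:

```rust
trait Hasher {
    type Output;
    fn reset(&mut self);
    fn finish(&self) -> Self::Output;
}

trait HashState {
    type Hasher: Hasher;
    fn hasher(&self) -> Self::Hasher;
}

// Toy keyed hasher: FNV-style mixing that starts from a seed.
struct KeyedHasher {
    seed: u64,
    state: u64,
}

impl KeyedHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state = (self.state ^ b as u64).wrapping_mul(0x0000_0100_0000_01b3);
        }
    }
}

impl Hasher for KeyedHasher {
    type Output = u64;
    fn reset(&mut self) { self.state = self.seed; }
    fn finish(&self) -> u64 { self.state }
}

// The state holds only the two keys; this is what a map would store.
struct KeyedState {
    k0: u64,
    k1: u64,
}

impl HashState for KeyedState {
    type Hasher = KeyedHasher;

    // Every call hands out a fresh hasher derived from the same keys.
    fn hasher(&self) -> KeyedHasher {
        let seed = self.k0 ^ self.k1.rotate_left(32);
        KeyedHasher { seed, state: seed }
    }
}

// Map-like helper: one fresh hasher per lookup, all from one state.
fn bucket_for<S>(state: &S, key: &[u8], buckets: u64) -> u64
where
    S: HashState<Hasher = KeyedHasher>,
{
    let mut h = state.hasher();
    h.write(key);
    h.finish() % buckets
}

fn main() {
    let state = KeyedState { k0: 1, k1: 2 };
    println!("bucket: {}", bucket_for(&state, b"key", 16));
}
```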

The entire module and `HashState` infrastructure remains `#[unstable]` due to it
being recently redesigned, but some other stability decisions made for the
`std::hash` module are:

* The `Writer` trait remains `#[experimental]` as it's intended to be replaced
  with an `io::Writer` (more details soon).
* The top-level `hash` function is `#[unstable]` as it is intended to be generic
  over the hashing algorithm instead of hardwired to `SipHasher`.
* The inner `sip` module is now private as its one export, `SipHasher`, is
  reexported in the `hash` module.
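
For the second bullet, a top-level `hash` function that is generic over the
algorithm might look like the sketch below, which mirrors the
`hash::hash::<_, SipHasher>(&x)` calls in the updated tests. Constructing the
hasher through `Default` is an assumption made for this example, and
`SumHasher` is a hypothetical order-insensitive toy:

```rust
trait Hasher {
    type Output;
    fn reset(&mut self);
    fn finish(&self) -> Self::Output;
}

trait Hash<H: Hasher> {
    fn hash(&self, state: &mut H);
}

// Generic over the algorithm `H`; `Default` is an assumed way of
// obtaining a fresh hasher.
fn hash<T: Hash<H>, H: Hasher + Default>(value: &T) -> H::Output {
    let mut state = H::default();
    value.hash(&mut state);
    state.finish()
}

// Toy hasher that sums whatever it is given.
#[derive(Default)]
struct SumHasher(u64);

impl Hasher for SumHasher {
    type Output = u64;
    fn reset(&mut self) { self.0 = 0; }
    fn finish(&self) -> u64 { self.0 }
}

impl Hash<SumHasher> for Vec<u64> {
    fn hash(&self, state: &mut SumHasher) {
        for v in self {
            state.0 = state.0.wrapping_add(*v);
        }
    }
}

fn main() {
    let x = vec![1u64, 2, 3];
    let y = vec![3u64, 2, 1];
    // The caller names the algorithm explicitly, as the tests in this
    // commit do with `SipHasher`.
    println!("{}", hash::<_, SumHasher>(&x) == hash::<_, SumHasher>(&y));
}
```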

And finally, a few changes were made to the default parameters on `HashMap`.

* The `RandomSipHasher` default type parameter was renamed to `RandomState`.
  This renaming emphasizes that it is not a hasher, but rather just state to
  generate hashers. It also moves away from the name "sip" as it may not always
  be implemented as `SipHasher`. This type lives in the
  `std::collections::hash_map` module as `#[unstable]`.

* The associated `Hasher` type of `RandomState` is creatively called...
  `Hasher`! This concrete structure lives next to `RandomState` as an
  implementation of the "default hashing algorithm" used for a `HashMap`. Under
  the hood this is currently implemented as `SipHasher`, but it draws an
  explicit interface for now and allows us to modify the implementation over
  time if necessary.
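
The same idea can be tried out on a current toolchain. The sketch below uses
today's standard library rather than the API in this commit: `RandomState` is
still the default state, and the `BuildHasher` trait with its `build_hasher`
method plays the role described here for `HashState` (treat that mapping as
an informal reading, since the names postdate this commit):

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};

// Hash one value with a fresh hasher built from `state`.
fn hash_with<T: Hash>(state: &RandomState, value: &T) -> u64 {
    let mut hasher = state.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let state = RandomState::new();

    // Same state, same input: both hashers are created from the same
    // keys, so the results agree.
    assert_eq!(hash_with(&state, &"key"), hash_with(&state, &"key"));

    // The map stores the state, not a hasher.
    let mut map: HashMap<&str, i32, RandomState> = HashMap::with_hasher(state);
    map.insert("a", 1);
    println!("{:?}", map.get("a"));
}
```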

There are many breaking changes outlined above, and as a result this commit is
a:

[breaking-change]
alexcrichton committed Jan 7, 2015
1 parent 9e4e524 commit 511f0b8
Showing 50 changed files with 1,061 additions and 1,012 deletions.
2 changes: 1 addition & 1 deletion src/compiletest/compiletest.rs
@@ -9,7 +9,7 @@
// except according to those terms.

#![crate_type = "bin"]
#![feature(slicing_syntax, unboxed_closures)]
#![feature(slicing_syntax)]

#![deny(warnings)]

21 changes: 13 additions & 8 deletions src/liballoc/arc.rs
@@ -67,21 +67,20 @@
//! }
//! ```
use core::prelude::*;

use core::atomic;
use core::atomic::Ordering::{Relaxed, Release, Acquire, SeqCst};
use core::borrow::BorrowFrom;
use core::clone::Clone;
use core::fmt::{self, Show};
use core::cmp::{Eq, Ord, PartialEq, PartialOrd, Ordering};
use core::cmp::{Ordering};
use core::default::Default;
use core::marker::{Sync, Send};
use core::mem::{min_align_of, size_of, drop};
use core::mem::{min_align_of, size_of};
use core::mem;
use core::nonzero::NonZero;
use core::ops::{Drop, Deref};
use core::option::Option;
use core::option::Option::{Some, None};
use core::ptr::{self, PtrExt};
use core::ops::Deref;
use core::ptr;
use core::hash::{Hash, Hasher};
use heap::deallocate;

/// An atomically reference counted wrapper for shared state.
@@ -591,6 +590,12 @@ impl<T: Default + Sync + Send> Default for Arc<T> {
fn default() -> Arc<T> { Arc::new(Default::default()) }
}

impl<H: Hasher, T: Hash<H>> Hash<H> for Arc<T> {
fn hash(&self, state: &mut H) {
(**self).hash(state)
}
}

#[cfg(test)]
#[allow(experimental)]
mod tests {
8 changes: 8 additions & 0 deletions src/liballoc/boxed.rs
@@ -106,12 +106,20 @@ impl<T: ?Sized + Ord> Ord for Box<T> {
#[stable]}
impl<T: ?Sized + Eq> Eq for Box<T> {}

#[cfg(stage0)]
impl<S: hash::Writer, T: ?Sized + Hash<S>> Hash<S> for Box<T> {
#[inline]
fn hash(&self, state: &mut S) {
(**self).hash(state);
}
}
#[cfg(not(stage0))]
impl<S: hash::Hasher, T: ?Sized + Hash<S>> Hash<S> for Box<T> {
#[inline]
fn hash(&self, state: &mut S) {
(**self).hash(state);
}
}

/// Extension methods for an owning `Any` trait object.
#[unstable = "post-DST and coherence changes, this will not be a trait but \
35 changes: 23 additions & 12 deletions src/liballoc/rc.rs
@@ -10,23 +10,26 @@

//! Thread-local reference-counted boxes (the `Rc<T>` type).
//!
//! The `Rc<T>` type provides shared ownership of an immutable value. Destruction is deterministic,
//! and will occur as soon as the last owner is gone. It is marked as non-sendable because it
//! avoids the overhead of atomic reference counting.
//! The `Rc<T>` type provides shared ownership of an immutable value.
//! Destruction is deterministic, and will occur as soon as the last owner is
//! gone. It is marked as non-sendable because it avoids the overhead of atomic
//! reference counting.
//!
//! The `downgrade` method can be used to create a non-owning `Weak<T>` pointer to the box. A
//! `Weak<T>` pointer can be upgraded to an `Rc<T>` pointer, but will return `None` if the value
//! has already been dropped.
//! The `downgrade` method can be used to create a non-owning `Weak<T>` pointer
//! to the box. A `Weak<T>` pointer can be upgraded to an `Rc<T>` pointer, but
//! will return `None` if the value has already been dropped.
//!
//! For example, a tree with parent pointers can be represented by putting the nodes behind strong
//! `Rc<T>` pointers, and then storing the parent pointers as `Weak<T>` pointers.
//! For example, a tree with parent pointers can be represented by putting the
//! nodes behind strong `Rc<T>` pointers, and then storing the parent pointers
//! as `Weak<T>` pointers.
//!
//! # Examples
//!
//! Consider a scenario where a set of `Gadget`s are owned by a given `Owner`. We want to have our
//! `Gadget`s point to their `Owner`. We can't do this with unique ownership, because more than one
//! gadget may belong to the same `Owner`. `Rc<T>` allows us to share an `Owner` between multiple
//! `Gadget`s, and have the `Owner` remain allocated as long as any `Gadget` points at it.
//! Consider a scenario where a set of `Gadget`s are owned by a given `Owner`.
//! We want to have our `Gadget`s point to their `Owner`. We can't do this with
//! unique ownership, because more than one gadget may belong to the same
//! `Owner`. `Rc<T>` allows us to share an `Owner` between multiple `Gadget`s,
//! and have the `Owner` remain allocated as long as any `Gadget` points at it.
//!
//! ```rust
//! use std::rc::Rc;
@@ -597,12 +600,20 @@ impl<T: Ord> Ord for Rc<T> {
}

// FIXME (#18248) Make `T` `Sized?`
#[cfg(stage0)]
impl<S: hash::Writer, T: Hash<S>> Hash<S> for Rc<T> {
#[inline]
fn hash(&self, state: &mut S) {
(**self).hash(state);
}
}
#[cfg(not(stage0))]
impl<S: hash::Hasher, T: Hash<S>> Hash<S> for Rc<T> {
#[inline]
fn hash(&self, state: &mut S) {
(**self).hash(state);
}
}

#[experimental = "Show is experimental."]
impl<T: fmt::Show> fmt::Show for Rc<T> {
4 changes: 2 additions & 2 deletions src/libcollections/bit.rs
@@ -982,7 +982,7 @@ impl fmt::Show for Bitv {
}

#[stable]
impl<S: hash::Writer> hash::Hash<S> for Bitv {
impl<S: hash::Writer + hash::Hasher> hash::Hash<S> for Bitv {
fn hash(&self, state: &mut S) {
self.nbits.hash(state);
for elem in self.blocks() {
@@ -1742,7 +1742,7 @@ impl fmt::Show for BitvSet {
}
}

impl<S: hash::Writer> hash::Hash<S> for BitvSet {
impl<S: hash::Writer + hash::Hasher> hash::Hash<S> for BitvSet {
fn hash(&self, state: &mut S) {
for pos in self.iter() {
pos.hash(state);
14 changes: 13 additions & 1 deletion src/libcollections/btree/map.rs
@@ -23,7 +23,9 @@ use core::borrow::BorrowFrom;
use core::cmp::Ordering;
use core::default::Default;
use core::fmt::Show;
use core::hash::{Writer, Hash};
use core::hash::{Hash, Hasher};
#[cfg(stage0)]
use core::hash::Writer;
use core::iter::{Map, FromIterator};
use core::ops::{Index, IndexMut};
use core::{iter, fmt, mem};
@@ -820,13 +822,23 @@ impl<K: Ord, V> Extend<(K, V)> for BTreeMap<K, V> {
}

#[stable]
#[cfg(stage0)]
impl<S: Writer, K: Hash<S>, V: Hash<S>> Hash<S> for BTreeMap<K, V> {
fn hash(&self, state: &mut S) {
for elt in self.iter() {
elt.hash(state);
}
}
}
#[stable]
#[cfg(not(stage0))]
impl<S: Hasher, K: Hash<S>, V: Hash<S>> Hash<S> for BTreeMap<K, V> {
fn hash(&self, state: &mut S) {
for elt in self.iter() {
elt.hash(state);
}
}
}

#[stable]
impl<K: Ord, V> Default for BTreeMap<K, V> {
4 changes: 2 additions & 2 deletions src/libcollections/btree/set.rs
@@ -678,7 +678,7 @@ mod test {
use prelude::*;

use super::BTreeSet;
use std::hash;
use std::hash::{self, SipHasher};

#[test]
fn test_clone_eq() {
@@ -703,7 +703,7 @@ mod test {
y.insert(2);
y.insert(1);

assert!(hash::hash(&x) == hash::hash(&y));
assert!(hash::hash::<_, SipHasher>(&x) == hash::hash::<_, SipHasher>(&y));
}

struct Counter<'a, 'b> {
10 changes: 5 additions & 5 deletions src/libcollections/dlist.rs
@@ -27,7 +27,7 @@ use alloc::boxed::Box;
use core::cmp::Ordering;
use core::default::Default;
use core::fmt;
use core::hash::{Writer, Hash};
use core::hash::{Writer, Hasher, Hash};
use core::iter::{self, FromIterator};
use core::mem;
use core::ptr;
@@ -675,7 +675,7 @@ impl<A: fmt::Show> fmt::Show for DList<A> {
}

#[stable]
impl<S: Writer, A: Hash<S>> Hash<S> for DList<A> {
impl<S: Writer + Hasher, A: Hash<S>> Hash<S> for DList<A> {
fn hash(&self, state: &mut S) {
self.len().hash(state);
for elt in self.iter() {
@@ -688,7 +688,7 @@ impl<S: Writer, A: Hash<S>> Hash<S> for DList<A> {
mod tests {
use prelude::*;
use std::rand;
use std::hash;
use std::hash::{self, SipHasher};
use std::thread::Thread;
use test::Bencher;
use test;
@@ -951,7 +951,7 @@ mod tests {
let mut x = DList::new();
let mut y = DList::new();

assert!(hash::hash(&x) == hash::hash(&y));
assert!(hash::hash::<_, SipHasher>(&x) == hash::hash::<_, SipHasher>(&y));

x.push_back(1i);
x.push_back(2);
@@ -961,7 +961,7 @@ mod tests {
y.push_front(2);
y.push_front(1);

assert!(hash::hash(&x) == hash::hash(&y));
assert!(hash::hash::<_, SipHasher>(&x) == hash::hash::<_, SipHasher>(&y));
}

#[test]
2 changes: 1 addition & 1 deletion src/libcollections/lib.rs
@@ -23,8 +23,8 @@

#![allow(unknown_features)]
#![feature(unsafe_destructor, slicing_syntax)]
#![feature(old_impl_check)]
#![feature(unboxed_closures)]
#![feature(old_impl_check)]
#![no_std]

#[macro_use]
8 changes: 4 additions & 4 deletions src/libcollections/ring_buf.rs
@@ -27,7 +27,7 @@ use core::ops::{Index, IndexMut};
use core::ptr;
use core::raw::Slice as RawSlice;

use std::hash::{Writer, Hash};
use std::hash::{Writer, Hash, Hasher};
use std::cmp;

use alloc::heap;
@@ -1562,7 +1562,7 @@ impl<A: Ord> Ord for RingBuf<A> {
}

#[stable]
impl<S: Writer, A: Hash<S>> Hash<S> for RingBuf<A> {
impl<S: Writer + Hasher, A: Hash<S>> Hash<S> for RingBuf<A> {
fn hash(&self, state: &mut S) {
self.len().hash(state);
for elt in self.iter() {
@@ -1631,7 +1631,7 @@ mod tests {
use prelude::*;
use core::iter;
use std::fmt::Show;
use std::hash;
use std::hash::{self, SipHasher};
use test::Bencher;
use test;

@@ -2283,7 +2283,7 @@ mod tests {
y.push_back(2);
y.push_back(3);

assert!(hash::hash(&x) == hash::hash(&y));
assert!(hash::hash::<_, SipHasher>(&x) == hash::hash::<_, SipHasher>(&y));
}

#[test]
9 changes: 9 additions & 0 deletions src/libcollections/string.rs
@@ -820,12 +820,21 @@ impl fmt::Show for String {
}

#[experimental = "waiting on Hash stabilization"]
#[cfg(stage0)]
impl<H: hash::Writer> hash::Hash<H> for String {
#[inline]
fn hash(&self, hasher: &mut H) {
(**self).hash(hasher)
}
}
#[experimental = "waiting on Hash stabilization"]
#[cfg(not(stage0))]
impl<H: hash::Writer + hash::Hasher> hash::Hash<H> for String {
#[inline]
fn hash(&self, hasher: &mut H) {
(**self).hash(hasher)
}
}

#[unstable = "recent addition, needs more experience"]
impl<'a> Add<&'a str> for String {
8 changes: 8 additions & 0 deletions src/libcollections/vec.rs
@@ -1183,12 +1183,20 @@ impl<T:Clone> Clone for Vec<T> {
}
}

#[cfg(stage0)]
impl<S: hash::Writer, T: Hash<S>> Hash<S> for Vec<T> {
#[inline]
fn hash(&self, state: &mut S) {
self.as_slice().hash(state);
}
}
#[cfg(not(stage0))]
impl<S: hash::Writer + hash::Hasher, T: Hash<S>> Hash<S> for Vec<T> {
#[inline]
fn hash(&self, state: &mut S) {
self.as_slice().hash(state);
}
}

#[experimental = "waiting on Index stability"]
impl<T> Index<uint> for Vec<T> {
12 changes: 6 additions & 6 deletions src/libcollections/vec_map.rs
@@ -18,7 +18,7 @@ use core::prelude::*;
use core::cmp::Ordering;
use core::default::Default;
use core::fmt;
use core::hash::{Hash, Writer};
use core::hash::{Hash, Writer, Hasher};
use core::iter::{Enumerate, FilterMap, Map, FromIterator};
use core::iter;
use core::mem::replace;
@@ -85,7 +85,7 @@ impl<V:Clone> Clone for VecMap<V> {
}
}

impl<S: Writer, V: Hash<S>> Hash<S> for VecMap<V> {
impl<S: Writer + Hasher, V: Hash<S>> Hash<S> for VecMap<V> {
fn hash(&self, state: &mut S) {
// In order to not traverse the `VecMap` twice, count the elements
// during iteration.
@@ -712,7 +712,7 @@ impl<V> DoubleEndedIterator for IntoIter<V> {
#[cfg(test)]
mod test_map {
use prelude::*;
use core::hash::hash;
use core::hash::{hash, SipHasher};

use super::VecMap;

@@ -1004,7 +1004,7 @@ mod test_map {
let mut x = VecMap::new();
let mut y = VecMap::new();

assert!(hash(&x) == hash(&y));
assert!(hash::<_, SipHasher>(&x) == hash::<_, SipHasher>(&y));
x.insert(1, 'a');
x.insert(2, 'b');
x.insert(3, 'c');
@@ -1013,12 +1013,12 @@ mod test_map {
y.insert(2, 'b');
y.insert(1, 'a');

assert!(hash(&x) == hash(&y));
assert!(hash::<_, SipHasher>(&x) == hash::<_, SipHasher>(&y));

x.insert(1000, 'd');
x.remove(&1000);

assert!(hash(&x) == hash(&y));
assert!(hash::<_, SipHasher>(&x) == hash::<_, SipHasher>(&y));
}

#[test]

3 comments on commit 511f0b8

@alexcrichton

r=aturon

@aturon commented on 511f0b8 Jan 7, 2015

r+ p=1 (for alpha)

@alexcrichton

@bors: retry
