Auto merge of #1963 - cbeuw:weak-memory, r=RalfJung
Weak memory emulation using store buffers

This implements the second half of the [Lidbury & Donaldson paper](https://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2017/POPL.pdf): weak memory emulation using store buffers. A store buffer is created over a memory range on atomic access. Stores will push store elements into the buffer and loads will search through the buffer in reverse modification order, determine which store elements are valid for the current load, and pick one randomly.
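
To make the mechanism concrete, here is a minimal, self-contained sketch of the store-buffer idea. This is illustrative only and does not use Miri's actual types: `StoreElement`, `StoreBuffer`, the `may_read` validity predicate, and the `pick` randomness source are stand-ins for what the real implementation tracks per atomic location.

```rust
/// One store to this location, tagged with a logical timestamp.
struct StoreElement<T> {
    timestamp: u64,
    value: T,
}

/// Per-location buffer of past stores, kept in modification order (oldest first).
struct StoreBuffer<T> {
    stores: Vec<StoreElement<T>>,
}

impl<T: Copy> StoreBuffer<T> {
    fn new(initial: T) -> Self {
        Self { stores: vec![StoreElement { timestamp: 0, value: initial }] }
    }

    /// An atomic store appends a new element at the end of modification order.
    fn store(&mut self, timestamp: u64, value: T) {
        self.stores.push(StoreElement { timestamp, value });
    }

    /// An atomic load walks the buffer in reverse modification order, keeps the
    /// elements the loading thread is still allowed to read, and picks one of
    /// them using the supplied randomness source.
    fn load(
        &self,
        may_read: impl Fn(&StoreElement<T>) -> bool,
        pick: impl FnOnce(usize) -> usize,
    ) -> T {
        let candidates: Vec<&StoreElement<T>> =
            self.stores.iter().rev().filter(|&e| may_read(e)).collect();
        candidates[pick(candidates.len()) % candidates.len()].value
    }
}

fn main() {
    let mut buf = StoreBuffer::new(0u32);
    buf.store(1, 1);
    buf.store(2, 2);
    // A thread that has already observed the store with timestamp 1 may read
    // either timestamp 1 or 2; the "random" choice is stubbed to take the newest.
    let last_seen = 1u64;
    let v = buf.load(|e| e.timestamp >= last_seen, |_n| 0);
    println!("loaded {v}");
}
```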

This implementation will never generate weak memory behaviours forbidden by the C++11 model, but it is incapable of producing all possible weak behaviours allowed by the model. There are certain weak behaviours observable on real hardware but not while using this.
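
As an example of a weak behaviour this emulation *can* produce, consider the classic store-buffering litmus test (shown here for illustration, not necessarily verbatim from the added test suite): under the C++ model both relaxed loads are allowed to read 0, an outcome a naively sequentially consistent interpreter would never report.

```rust
use std::sync::atomic::{AtomicU32, Ordering::Relaxed};
use std::thread;

// Store-buffering litmus test: the outcome r1 == 0 && r2 == 0 is allowed for
// relaxed atomics by the C++ model and can be reported by the store-buffer
// emulation (depending on scheduling), but never by a naive sequentially
// consistent interpreter.
static X: AtomicU32 = AtomicU32::new(0);
static Y: AtomicU32 = AtomicU32::new(0);

fn main() {
    let t1 = thread::spawn(|| {
        X.store(1, Relaxed);
        Y.load(Relaxed) // r1
    });
    let t2 = thread::spawn(|| {
        Y.store(1, Relaxed);
        X.load(Relaxed) // r2
    });
    let (r1, r2) = (t1.join().unwrap(), t2.join().unwrap());
    println!("r1 = {r1}, r2 = {r2}"); // "r1 = 0, r2 = 0" is a legal weak outcome
}
```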

Note that this implementation does not take into account the revisions to SC accesses and fences that C++20 introduced via [P0668](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0668r5.html). It is therefore not fully correct under the revised C++20 model and may generate behaviours that C++20 disallows.

Rust follows the C++20 memory model (except for the Consume ordering and some operations not performable through C++'s std::atomic<T> API). It is therefore possible for this implementation to generate behaviours never observable when the same program is compiled and run natively. Unfortunately, no literature exists at the time of writing which proposes an implementable and C++20-compatible relaxed memory model that supports all atomic operations available in Rust. The closest one is [A Promising Semantics for Relaxed-Memory Concurrency](https://www.cs.tau.ac.il/~orilahav/papers/popl17.pdf) by Jeehoon Kang et al. However, this model lacks SC accesses and is therefore unusable by Miri (SC accesses are everywhere in library code).

Safe/sound Rust allows for more operations on atomic locations than the C++20 atomic API was intended to allow, such as non-atomically accessing a previously atomically accessed location, or accessing previously atomically accessed locations with a differently sized operation (such as accessing the top 16 bits of an `AtomicU32`). These scenarios are generally left undefined in formalisations of the C++ memory model, even though they [became possible](https://lists.isocpp.org/std-discussion/2022/05/1662.php) in C++20 with `std::atomic_ref<T>`. In Rust, these operations can only be done through a `&mut AtomicFoo` reference or one derived from it, so they can only happen after all previous accesses to the same location. This implementation is adapted to accommodate them.
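
For illustration only (this snippet is not part of this change), here is the `&mut`-gated case in safe Rust: the exclusive reference guarantees that every earlier atomic access happens-before the non-atomic one, which is the property the implementation relies on.

```rust
use std::sync::atomic::{AtomicU32, Ordering::SeqCst};
use std::thread;

fn main() {
    let mut a = AtomicU32::new(0);

    // Concurrent atomic accesses to the location...
    thread::scope(|s| {
        s.spawn(|| a.store(1, SeqCst));
        s.spawn(|| {
            let _ = a.load(SeqCst);
        });
    });

    // ...followed by a non-atomic access to the same location. This is safe
    // because `get_mut` takes `&mut self`: the exclusive reference proves that
    // every earlier atomic access already happened-before this point.
    *a.get_mut() += 1;
    assert_eq!(a.load(SeqCst), 2);
}
```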

----------
TODOs:

- [x] Add test cases that actually demonstrate weak memory behaviour (even if they are scheduler dependent)
- [x] Change `{mutex, rwlock, cond, srwlock}_get_or_create_id` functions under `src/shims` to use atomic RMWs instead of separate read -> check if need to create a new one -> write steps (see the sketch after this list)
- [x] Make sure Crossbeam tests still pass (crossbeam-rs/crossbeam#831)
- [x] Move as much weak-memory related code as possible into `weak_memory.rs`
- [x] Remove "weak memory effects are not emulated" warnings
- [x] Accommodate certain mixed size and mixed atomicity accesses Rust allows on top of the C++ model
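
The RMW change mentioned above can be sketched as follows; `get_or_create_id` and `allocate_fresh_id` are hypothetical stand-ins for the shims' lazy-initialisation pattern, not the actual shim code. A single `compare_exchange` closes the window in which two threads could both observe "no id yet" and both create one.

```rust
use std::sync::atomic::{
    AtomicU64,
    Ordering::{Acquire, Relaxed, Release},
};

// Hypothetical stand-in for registering a new mutex/rwlock/condvar and
// returning its id; 0 is reserved to mean "no id allocated yet".
fn allocate_fresh_id() -> u64 {
    42
}

/// Get-or-create with a single atomic RMW: install a fresh id only if the slot
/// is still 0, otherwise return whichever id another thread installed first.
/// A separate load -> check -> store sequence would let two threads both see 0
/// and both create an id; the compare_exchange closes that window.
fn get_or_create_id(slot: &AtomicU64) -> u64 {
    let fresh = allocate_fresh_id();
    match slot.compare_exchange(0, fresh, Release, Acquire) {
        Ok(_) => fresh,            // we won the race and installed `fresh`
        Err(existing) => existing, // someone else already created an id
    }
}

fn main() {
    let slot = AtomicU64::new(0);
    assert_eq!(get_or_create_id(&slot), 42);
    assert_eq!(slot.load(Relaxed), 42);
}
```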
bors committed Jun 6, 2022
2 parents 3361eab + 1b32d14 commit e6d3d98
Showing 93 changed files with 1,848 additions and 126 deletions.
15 changes: 10 additions & 5 deletions README.md
@@ -20,7 +20,8 @@ for example:
or an invalid enum discriminant)
* **Experimental**: Violations of the [Stacked Borrows] rules governing aliasing
for reference types
* **Experimental**: Data races (but no weak memory effects)
* **Experimental**: Data races
* **Experimental**: Emulation of weak memory effects (i.e., reads can return outdated values)

On top of that, Miri will also tell you about memory leaks: when there is memory
still allocated at the end of the execution, and that memory is not reachable
@@ -61,9 +62,11 @@ in your program, and cannot run all programs:
not support networking. System API support varies between targets; if you run
on Windows it is a good idea to use `--target x86_64-unknown-linux-gnu` to get
better support.
* Threading support is not finished yet. E.g., weak memory effects are not
emulated and spin loops (without syscalls) just loop forever. There is no
threading support on Windows.
* Threading support is not finished yet. E.g. spin loops (without syscalls) just
loop forever. There is no threading support on Windows.
* Weak memory emulation may produce weak behaviours unobservable by compiled
programs running on real hardware when `SeqCst` fences are used, and it cannot
produce all behaviours possibly observable on real hardware.

[rust]: https://www.rust-lang.org/
[mir]: https://github.com/rust-lang/rfcs/blob/master/text/1211-mir.md
@@ -317,7 +320,7 @@ to Miri failing to detect cases of undefined behavior in a program.
can focus on other failures, but it means Miri can miss bugs in your program.
Using this flag is **unsound**.
* `-Zmiri-disable-data-race-detector` disables checking for data races. Using
this flag is **unsound**.
this flag is **unsound**. This implies `-Zmiri-disable-weak-memory-emulation`.
* `-Zmiri-disable-stacked-borrows` disables checking the experimental
[Stacked Borrows] aliasing rules. This can make Miri run faster, but it also
means no aliasing violations will be detected. Using this flag is **unsound**
@@ -327,6 +330,8 @@ to Miri failing to detect cases of undefined behavior in a program.
as out-of-bounds accesses) first. Setting this flag means Miri can miss bugs
in your program. However, this can also help to make Miri run faster. Using
this flag is **unsound**.
* `-Zmiri-disable-weak-memory-emulation` disables the emulation of some C++11 weak
memory effects.
* `-Zmiri-measureme=<name>` enables `measureme` profiling for the interpreted program.
This can be used to find which parts of your program are executing slowly under Miri.
The profile is written out to a file with the prefix `<name>`, and can be processed
3 changes: 3 additions & 0 deletions src/bin/miri.rs
@@ -318,6 +318,7 @@ fn main() {
miri_config.stacked_borrows = false;
} else if arg == "-Zmiri-disable-data-race-detector" {
miri_config.data_race_detector = false;
miri_config.weak_memory_emulation = false;
} else if arg == "-Zmiri-disable-alignment-check" {
miri_config.check_alignment = miri::AlignmentCheck::None;
} else if arg == "-Zmiri-symbolic-alignment-check" {
@@ -340,6 +341,8 @@ fn main() {
isolation_enabled = Some(false);
}
miri_config.isolated_op = miri::IsolatedOp::Allow;
} else if arg == "-Zmiri-disable-weak-memory-emulation" {
miri_config.weak_memory_emulation = false;
} else if let Some(param) = arg.strip_prefix("-Zmiri-isolation-error=") {
if matches!(isolation_enabled, Some(false)) {
panic!("-Zmiri-isolation-error cannot be used along with -Zmiri-disable-isolation");
278 changes: 278 additions & 0 deletions src/concurrency/allocation_map.rs
@@ -0,0 +1,278 @@
//! Implements a map from allocation ranges to data.
//! This is somewhat similar to RangeMap, but the ranges
//! and data are discrete and non-splittable. An allocation in the
//! map will always have the same range until explicitly removed.

use rustc_target::abi::Size;
use std::ops::{Index, IndexMut, Range};

use rustc_const_eval::interpret::AllocRange;

#[derive(Clone, Debug)]
struct Elem<T> {
/// The range covered by this element; never empty.
range: AllocRange,
/// The data stored for this element.
data: T,
}

/// Index of an allocation within the map
type Position = usize;

#[derive(Clone, Debug)]
pub struct AllocationMap<T> {
v: Vec<Elem<T>>,
}

#[derive(Clone, Debug, PartialEq)]
pub enum AccessType {
/// The access perfectly overlaps (same offset and range) with the existing allocation
PerfectlyOverlapping(Position),
/// The access does not touch any existing allocation
Empty(Position),
/// The access overlaps with one or more existing allocations
ImperfectlyOverlapping(Range<Position>),
}

impl<T> AllocationMap<T> {
pub fn new() -> Self {
Self { v: Vec::new() }
}

/// Finds the position of the allocation containing the given offset. If the offset is not
/// in an existing allocation, then returns Err containing the position
/// where such an allocation should be inserted
fn find_offset(&self, offset: Size) -> Result<Position, Position> {
// We do a binary search.
let mut left = 0usize; // inclusive
let mut right = self.v.len(); // exclusive
loop {
if left == right {
// No element contains the given offset. But this is the
// position where such an element should be placed.
return Err(left);
}
let candidate = left.checked_add(right).unwrap() / 2;
let elem = &self.v[candidate];
if offset < elem.range.start {
// We are too far right (offset is further left).
debug_assert!(candidate < right); // we are making progress
right = candidate;
} else if offset >= elem.range.end() {
// We are too far left (offset is further right).
debug_assert!(candidate >= left); // we are making progress
left = candidate + 1;
} else {
// This is it!
return Ok(candidate);
}
}
}

/// Determines whether a given access on `range` overlaps with
/// an existing allocation
pub fn access_type(&self, range: AllocRange) -> AccessType {
match self.find_offset(range.start) {
Ok(pos) => {
// Start of the range belongs to an existing object, now let's check the overlapping situation
let elem = &self.v[pos];
// FIXME: derive Eq for AllocRange in rustc
if elem.range.start == range.start && elem.range.size == range.size {
// Happy case: perfectly overlapping access
AccessType::PerfectlyOverlapping(pos)
} else {
// FIXME: add a last() method to AllocRange that returns the last inclusive offset (end() is exclusive)
let end_pos = match self.find_offset(range.end() - Size::from_bytes(1)) {
// If the end lands in an existing object, add one to get the exclusive position
Ok(inclusive_pos) => inclusive_pos + 1,
Err(exclusive_pos) => exclusive_pos,
};

AccessType::ImperfectlyOverlapping(pos..end_pos)
}
}
Err(pos) => {
// Start of the range doesn't belong to an existing object
match self.find_offset(range.end() - Size::from_bytes(1)) {
// Neither does the end
Err(end_pos) =>
if pos == end_pos {
// There's nothing between the start and the end, so the range is empty
AccessType::Empty(pos)
} else {
// Otherwise we have entirely covered an existing object
AccessType::ImperfectlyOverlapping(pos..end_pos)
},
// Otherwise at least part of it overlaps with something else
Ok(end_pos) => AccessType::ImperfectlyOverlapping(pos..end_pos + 1),
}
}
}
}

/// Inserts an object and its occupied range at the given position
// The Position can be calculated from AllocRange, but the only user of AllocationMap
// always calls access_type before calling insert/index/index_mut, and we don't
// want to repeat the binary search each time, so we ask the caller to supply the Position
pub fn insert_at_pos(&mut self, pos: Position, range: AllocRange, data: T) {
self.v.insert(pos, Elem { range, data });
// If we aren't the first element, then our start must be no earlier than the previous element's end
if pos > 0 {
debug_assert!(self.v[pos - 1].range.end() <= range.start);
}
// If we aren't the last element, then our end must be no later than the next element's start
if pos < self.v.len() - 1 {
debug_assert!(range.end() <= self.v[pos + 1].range.start);
}
}

pub fn remove_pos_range(&mut self, pos_range: Range<Position>) {
self.v.drain(pos_range);
}

pub fn remove_from_pos(&mut self, pos: Position) {
self.v.remove(pos);
}
}

impl<T> Index<Position> for AllocationMap<T> {
type Output = T;

fn index(&self, pos: Position) -> &Self::Output {
&self.v[pos].data
}
}

impl<T> IndexMut<Position> for AllocationMap<T> {
fn index_mut(&mut self, pos: Position) -> &mut Self::Output {
&mut self.v[pos].data
}
}

#[cfg(test)]
mod tests {
use rustc_const_eval::interpret::alloc_range;

use super::*;

#[test]
fn empty_map() {
// FIXME: make Size::from_bytes const
let four = Size::from_bytes(4);
let map = AllocationMap::<()>::new();

// Correctly tells where we should insert the first element (at position 0)
assert_eq!(map.find_offset(Size::from_bytes(3)), Err(0));

// Correctly tells the access type along with the supposed position
assert_eq!(map.access_type(alloc_range(Size::ZERO, four)), AccessType::Empty(0));
}

#[test]
#[should_panic]
fn no_overlapping_inserts() {
let four = Size::from_bytes(4);

let mut map = AllocationMap::<&str>::new();

// |_|_|_|_|#|#|#|#|_|_|_|_|...
// 0 1 2 3 4 5 6 7 8 9 a b c d
map.insert_at_pos(0, alloc_range(four, four), "#");
// |_|_|_|_|#|#|#|#|_|_|_|_|...
// 0 ^ ^ ^ ^ 5 6 7 8 9 a b c d
map.insert_at_pos(0, alloc_range(Size::from_bytes(1), four), "@");
}

#[test]
fn boundaries() {
let four = Size::from_bytes(4);

let mut map = AllocationMap::<&str>::new();

// |#|#|#|#|_|_|...
// 0 1 2 3 4 5
map.insert_at_pos(0, alloc_range(Size::ZERO, four), "#");
// |#|#|#|#|_|_|...
// 0 1 2 3 ^ 5
assert_eq!(map.find_offset(four), Err(1));
// |#|#|#|#|_|_|_|_|_|...
// 0 1 2 3 ^ ^ ^ ^ 8
assert_eq!(map.access_type(alloc_range(four, four)), AccessType::Empty(1));

let eight = Size::from_bytes(8);
// |#|#|#|#|_|_|_|_|@|@|@|@|_|_|...
// 0 1 2 3 4 5 6 7 8 9 a b c d
map.insert_at_pos(1, alloc_range(eight, four), "@");
// |#|#|#|#|_|_|_|_|@|@|@|@|_|_|...
// 0 1 2 3 4 5 6 ^ 8 9 a b c d
assert_eq!(map.find_offset(Size::from_bytes(7)), Err(1));
// |#|#|#|#|_|_|_|_|@|@|@|@|_|_|...
// 0 1 2 3 ^ ^ ^ ^ 8 9 a b c d
assert_eq!(map.access_type(alloc_range(four, four)), AccessType::Empty(1));
}

#[test]
fn perfectly_overlapping() {
let four = Size::from_bytes(4);

let mut map = AllocationMap::<&str>::new();

// |#|#|#|#|_|_|...
// 0 1 2 3 4 5
map.insert_at_pos(0, alloc_range(Size::ZERO, four), "#");
// |#|#|#|#|_|_|...
// ^ ^ ^ ^ 4 5
assert_eq!(map.find_offset(Size::ZERO), Ok(0));
assert_eq!(
map.access_type(alloc_range(Size::ZERO, four)),
AccessType::PerfectlyOverlapping(0)
);

// |#|#|#|#|@|@|@|@|_|...
// 0 1 2 3 4 5 6 7 8
map.insert_at_pos(1, alloc_range(four, four), "@");
// |#|#|#|#|@|@|@|@|_|...
// 0 1 2 3 ^ ^ ^ ^ 8
assert_eq!(map.find_offset(four), Ok(1));
assert_eq!(map.access_type(alloc_range(four, four)), AccessType::PerfectlyOverlapping(1));
}

#[test]
fn straddling() {
let four = Size::from_bytes(4);

let mut map = AllocationMap::<&str>::new();

// |_|_|_|_|#|#|#|#|_|_|_|_|...
// 0 1 2 3 4 5 6 7 8 9 a b c d
map.insert_at_pos(0, alloc_range(four, four), "#");
// |_|_|_|_|#|#|#|#|_|_|_|_|...
// 0 1 ^ ^ ^ ^ 6 7 8 9 a b c d
assert_eq!(
map.access_type(alloc_range(Size::from_bytes(2), four)),
AccessType::ImperfectlyOverlapping(0..1)
);
// |_|_|_|_|#|#|#|#|_|_|_|_|...
// 0 1 2 3 4 5 ^ ^ ^ ^ a b c d
assert_eq!(
map.access_type(alloc_range(Size::from_bytes(6), four)),
AccessType::ImperfectlyOverlapping(0..1)
);
// |_|_|_|_|#|#|#|#|_|_|_|_|...
// 0 1 ^ ^ ^ ^ ^ ^ ^ ^ a b c d
assert_eq!(
map.access_type(alloc_range(Size::from_bytes(2), Size::from_bytes(8))),
AccessType::ImperfectlyOverlapping(0..1)
);

// |_|_|_|_|#|#|#|#|_|_|@|@|_|_|...
// 0 1 2 3 4 5 6 7 8 9 a b c d
map.insert_at_pos(1, alloc_range(Size::from_bytes(10), Size::from_bytes(2)), "@");
// |_|_|_|_|#|#|#|#|_|_|@|@|_|_|...
// 0 1 2 3 4 5 ^ ^ ^ ^ ^ ^ ^ ^
assert_eq!(
map.access_type(alloc_range(Size::from_bytes(6), Size::from_bytes(8))),
AccessType::ImperfectlyOverlapping(0..2)
);
}
}