Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make AllocId decoding thread-safe #50957

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
83 changes: 69 additions & 14 deletions src/librustc/mir/interpret/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,15 +17,17 @@ use mir;
use hir::def_id::DefId;
use ty::{self, TyCtxt, Instance};
use ty::layout::{self, Align, HasDataLayout, Size};
use ty::codec::{TyEncoder, TyDecoder};
use middle::region;
use std::iter;
use std::io;
use std::ops::{Deref, DerefMut};
use std::hash::Hash;
use syntax::ast::Mutability;
use rustc_serialize::{Encoder, Decoder, Decodable, Encodable};
use rustc_serialize::{Decodable, Encodable};
use rustc_data_structures::sorted_map::SortedMap;
use rustc_data_structures::fx::FxHashMap;
use rustc_data_structures::fx::{FxHashMap, FxHashSet};
use rustc_data_structures::sync::{HashMapExt, LockGuard};
use byteorder::{WriteBytesExt, ReadBytesExt, LittleEndian, BigEndian};

#[derive(Clone, Debug, PartialEq, RustcEncodable, RustcDecodable)]
Expand Down Expand Up @@ -162,25 +164,51 @@ impl ::rustc_serialize::UseSpecializedDecodable for AllocId {}
#[derive(RustcDecodable, RustcEncodable)]
enum AllocKind {
Alloc,
AllocAtPos,
Fn,
Static,
}

pub fn specialized_encode_alloc_id<
'a, 'tcx,
E: Encoder,
E: TyEncoder,
M,
>(
encoder: &mut E,
tcx: TyCtxt<'a, 'tcx, 'tcx>,
cache: M,
alloc_id: AllocId,
) -> Result<(), E::Error> {
) -> Result<(), E::Error>
where
M: Fn(&mut E) -> &mut FxHashMap<AllocId, usize>
{
let alloc_type: AllocType<'tcx, &'tcx Allocation> =
tcx.alloc_map.lock().get(alloc_id).expect("no value for AllocId");
match alloc_type {
AllocType::Memory(alloc) => {
if let Some(alloc_pos) = cache(encoder).get(&alloc_id).cloned() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need to be an atomic "get or insert" operation in order to prevent two threads that get here at the same time from both trying to encode alloc (I think this is the same as what @michaelwoerister mentioned below)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been addressed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This operation is effectively atomic since we have unique ownership of the encoder and the cache. This doesn't matter though as encoding isn't intended to be multithreaded.

AllocKind::AllocAtPos.encode(encoder)?;
return encoder.emit_usize(alloc_pos);
}
let pos = encoder.position();

This comment was marked as resolved.

This comment was marked as resolved.

assert!(cache(encoder).insert(alloc_id, pos).is_none());
trace!("encoding {:?} with {:#?}", alloc_id, alloc);
AllocKind::Alloc.encode(encoder)?;

// Write placeholder for size
let size_pos = encoder.position();
0usize.encode(encoder)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work because of variable-length integer encoding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rustc has some similar code elsewhere and works around this by using a 4 byte array that the size is encoded into. Somewhat space-wasteful though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be simpler to just remember the in the global_cache during decoding?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that doesn't work, because we need this value also when another thread hasn't finished decoding the allocation yet.


let start = encoder.position();
alloc.encode(encoder)?;
let end = encoder.position();

// Overwrite the placeholder with the real size
let size: usize = end - start;
encoder.set_position(size_pos);
size.encode(encoder)?;
encoder.set_position(end);

}
AllocType::Function(fn_instance) => {
trace!("encoding {:?} with {:#?}", alloc_id, fn_instance);
Expand All @@ -199,23 +227,48 @@ pub fn specialized_encode_alloc_id<

pub fn specialized_decode_alloc_id<
'a, 'tcx,
D: Decoder,
CACHE: FnOnce(&mut D, AllocId),
D: TyDecoder<'a, 'tcx>,
GlobalCache: FnMut(&mut D) -> LockGuard<'_, FxHashMap<usize, (AllocId, bool)>>,
LocalCache: FnOnce(&mut D) -> &mut FxHashSet<AllocId>,
>(
decoder: &mut D,
tcx: TyCtxt<'a, 'tcx, 'tcx>,
cache: CACHE,
mut global_cache: GlobalCache,
local_cache: LocalCache,
) -> Result<AllocId, D::Error> {
let pos = decoder.position();
match AllocKind::decode(decoder)? {
AllocKind::AllocAtPos => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use the trick here that I had originally where you read a usize, and that tag is either 0 for Alloc, 1 for Static, 2 for Fn or anything else is the real_pos. This is also used in Ty encoding I think.

let real_pos = decoder.read_usize()?;
decoder.with_position(real_pos, AllocId::decode)
},
AllocKind::Alloc => {
let alloc_id = tcx.alloc_map.lock().reserve();
trace!("creating alloc id {:?}", alloc_id);
// insert early to allow recursive allocs
cache(decoder, alloc_id);
// Read the size of the allocation.
// Used to skip ahead if we don't need to decode this.
let alloc_size = usize::decode(decoder)?;

let (alloc_id, fully_loaded) = *global_cache(decoder).entry(pos).or_insert_with(|| {
// Create an id which is not fully loaded
(tcx.alloc_map.lock().reserve(), false)
});
if fully_loaded || !local_cache(decoder).insert(alloc_id) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if one thread tstarts decoding, the next thread takes over the CPU, gets here for the same AllocId, skips over and tries to access the allocation? It'll ICE about uncached alloc or error with dangling pointer, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the fully_loaded case this isn't a problem since the AllocId has an Allocation assigned. For the !local_cache(decoder).insert(alloc_id) case, we know that some stack frame above us will assign an AllocId before the result will be used. Since local_cache is thread local another thread won't see the value inserted here. It may instead decode the same allocation in parallel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, neat. Please make this explanation a comment on that if statement

// We have started decoding this alloc id already, so just return it.
// Its content is already filled in or will be filled in by functions
// further up the stack.

// Skip the allocation
let pos = decoder.position();
decoder.set_position(pos + alloc_size);
return Ok(alloc_id)
}

let allocation = <&'tcx Allocation as Decodable>::decode(decoder)?;
trace!("decoded alloc {:?} {:#?}", alloc_id, allocation);
tcx.alloc_map.lock().set_id_memory(alloc_id, allocation);
// This may overwrite with the same allocation
tcx.alloc_map.lock().set_id_same_memory(alloc_id, allocation);

// Mark the alloc id as fully loaded
global_cache(decoder).insert(pos, (alloc_id, true));

Ok(alloc_id)
},
Expand All @@ -225,14 +278,12 @@ pub fn specialized_decode_alloc_id<
trace!("decoded fn alloc instance: {:?}", instance);
let id = tcx.alloc_map.lock().create_fn_alloc(instance);
trace!("created fn alloc id: {:?}", id);
cache(decoder, id);
Ok(id)
},
AllocKind::Static => {
trace!("creating extern static alloc id at");
let did = DefId::decode(decoder)?;
let alloc_id = tcx.alloc_map.lock().intern_static(did);
cache(decoder, alloc_id);
Ok(alloc_id)
},
}
Expand Down Expand Up @@ -333,6 +384,10 @@ impl<'tcx, M: fmt::Debug + Eq + Hash + Clone> AllocMap<'tcx, M> {
bug!("tried to set allocation id {}, but it was already existing as {:#?}", id, old);
}
}

fn set_id_same_memory(&mut self, id: AllocId, mem: M) {
self.id_to_type.insert_same(id, AllocType::Memory(mem));
}
}

#[derive(Clone, Debug, Eq, PartialEq, PartialOrd, Ord, Hash, RustcEncodable, RustcDecodable)]
Expand Down
7 changes: 7 additions & 0 deletions src/librustc/ty/codec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -52,13 +52,18 @@ impl<'tcx> EncodableWithShorthand for ty::Predicate<'tcx> {

pub trait TyEncoder: Encoder {
fn position(&self) -> usize;
fn set_position(&mut self, usize);
}

impl<'buf> TyEncoder for opaque::Encoder<'buf> {
#[inline]
fn position(&self) -> usize {
self.position()
}
#[inline]
fn set_position(&mut self, p: usize) {
self.cursor.set_position(p as u64)
}
}

/// Encode the given value or a previously cached shorthand.
Expand Down Expand Up @@ -123,6 +128,8 @@ pub trait TyDecoder<'a, 'tcx: 'a>: Decoder {

fn position(&self) -> usize;

fn set_position(&mut self, usize);

fn cached_ty_for_shorthand<F>(&mut self,
shorthand: usize,
or_insert_with: F)
Expand Down
Loading