Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(vm) Ability to use exports after the instance has been freed #1837

Merged
merged 30 commits into from
Dec 1, 2020

Conversation

Hywan
Copy link
Contributor

@Hywan Hywan commented Nov 23, 2020

Description

Sequel of #1822. Before #1822, we were able to use an export from an instance even if the instance were freed. That was a nice feature, but since #1822 is merged, it's now an undefined behavior. Indeed, this feature was working because of a memory leak. This PR is an attempt to safely restore this behavior!

At the same time, this patch fixes the way the Layout of Instance is calculated (it was wrong).

There is now an InstanceAllocator that can be viewed as a Arc<Instance> basically. The InstanceAllocator type is responsible to calculate the layout of an Instance, to allocate it, and to deallocate it. The InstanceHandle allocates it along with constructing the Instance. A clone of the InstanceAllocator is passed into exports when they are created. Consequently, every export owns a clone of the InstanceAllocator.

The InstanceAllocator frees the Instance only when all clones of itself are dropped, just like an Arc. InstanceAllocator is the only type that “owns” the pointer to the Instance for more safety. That way, we stop having copies of the same Instance pointer everywhere.

So by (re-)implementing the “use-export-after-freed-instance” feature, we also add more soundness, fix silent bugs (like incorrect Layout calculation), and add more safety in the code by having less raw pointers.

To do

  • Zero memory leak with valgrind
  • Zero errors for memory writes with valgrind
  • Our example works
  • Extensive documentation
  • Update InstanceHandle::from_vmctx (only used by the trap API) [note: the method has been removed]
  • All tests pass

Example we want to pass

This example is now a test in the lib/api/ crate.

The example we want to work
use wasmer::*;

fn main() {
    let store = Store::default();
    let module = Module::new(
        &store,
        "
    (module
      (type $sum_t (func (param i32 i32) (result i32)))
      (func $sum_f (type $sum_t) (param $x i32) (param $y i32) (result i32)
        local.get $x
        local.get $y
        i32.add)
      (export \"sum\" (func $sum_f)))
",
    )
    .unwrap();

    let import_object = ImportObject::new();
    let instance = Instance::new(&module, &import_object).unwrap();
    let instance2 = instance.clone();
    let instance3 = instance.clone();

    let sum = instance.exports.get_function("sum").unwrap().clone();

    drop(instance);
    drop(instance2);
    drop(instance3);

    dbg!(sum.call(&[Value::I32(1), Value::I32(2)]).unwrap());
}

Review

  • Add a short description of the the change to the CHANGELOG.md file

I hope this little change will clarify a little bit that the `Export`
passed to `Extern::from_vm_export` is not a `wasmer::Export` but a
`wasmer_vm::Export`.
Those methods have a `pub` or `pub(crate)` visibility but it's not
required. Let's reduce their visibility.
…ods.

This patch moves the `lookup`, `lookup_by_declaration`, and `exports`
methods from `Instance` to `InstanceHandle`. Basically, all methods
that creates `Export` instances.
Copy link
Contributor

@MarkMcCaskey MarkMcCaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work! This is very complicated and I have low confidence in the entire implementation being correct but I think it's a good start. The design idea seems like it should work as far as I can tell right now!

lib/vm/src/export.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
Originally, `Instance` was private and we were casting to
`NonNull<u8>` to avoid exposing the type publicly. Now we can just use
`Arc<InstanceAllocator>`.
@Hywan
Copy link
Contributor Author

Hywan commented Nov 24, 2020

Finally:

==92344== Memcheck, a memory error detector
==92344== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==92344== Using Valgrind-3.17.0.GIT and LibVEX; rerun with -h for copyright info
==92344== Command: target/debug/bug
==92344==
Dropping `wasmer::Instance`
Dropping `wasmer::Instance`
Dropping `wasmer::Instance`
Trying to drop `wasmer_vm::InstanceAllocator`...
... not now
Dropping `wasmer::Function`
Trying to drop `wasmer_vm::InstanceAllocator`...
... not now
[src/main.rs:30] sum.call(&[Value::I32(1), Value::I32(2)]).unwrap() = [
    I32(3),
]
Dropping `wasmer::Function`
Trying to drop `wasmer_vm::InstanceAllocator`...
Yep, dropping it.
Dropping `wasmer_vm::Instance`
--92344:0:schedule VG_(sema_down): read returned -4
--92344:0:schedule VG_(sema_down): read returned -4
--92344:0:schedule VG_(sema_down): read returned -4
--92344:0:schedule VG_(sema_down): read returned -4
--92344:0:schedule VG_(sema_down): read returned -4
--92344:0:schedule VG_(sema_down): read returned -4
--92344:0:schedule VG_(sema_down): read returned -4
--92344:0:schedule VG_(sema_down): read returned -4
==92344==
==92344== HEAP SUMMARY:
==92344==     in use at exit: 84,594 bytes in 341 blocks
==92344==   total heap usage: 762 allocs, 421 frees, 145,029 bytes allocated
==92344==
==92344== LEAK SUMMARY:
==92344==    definitely lost: 0 bytes in 0 blocks
==92344==    indirectly lost: 0 bytes in 0 blocks
==92344==      possibly lost: 0 bytes in 0 blocks
==92344==    still reachable: 68,961 bytes in 178 blocks
==92344==         suppressed: 15,633 bytes in 163 blocks
==92344== Reachable blocks (those to which a pointer was found) are not shown.
==92344== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==92344==
==92344== For lists of detected and suppressed errors, rerun with: -s
==92344== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 5 from 5)

Comment on lines -38 to -65
cfg_if::cfg_if! {
if #[cfg(unix)] {
pub type SignalHandler = dyn Fn(libc::c_int, *const libc::siginfo_t, *const libc::c_void) -> bool;

impl InstanceHandle {
/// Set a custom signal handler
pub fn set_signal_handler<H>(&self, handler: H)
where
H: 'static + Fn(libc::c_int, *const libc::siginfo_t, *const libc::c_void) -> bool,
{
self.instance().signal_handler.set(Some(Box::new(handler)));
}
}
} else if #[cfg(target_os = "windows")] {
pub type SignalHandler = dyn Fn(winapi::um::winnt::PEXCEPTION_POINTERS) -> bool;

impl InstanceHandle {
/// Set a custom signal handler
pub fn set_signal_handler<H>(&self, handler: H)
where
H: 'static + Fn(winapi::um::winnt::PEXCEPTION_POINTERS) -> bool,
{
self.instance().signal_handler.set(Some(Box::new(handler)));
}
}
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved below.

Comment on lines 285 to 396
/// Lookup an export with the given name.
pub fn lookup(&self, field: &str) -> Option<Export> {
let export = self.module.exports.get(field)?;

Some(self.lookup_by_declaration(&export))
}

/// Lookup an export with the given export declaration.
pub fn lookup_by_declaration(&self, export: &ExportIndex) -> Export {
match export {
ExportIndex::Function(index) => {
let sig_index = &self.module.functions[*index];
let (address, vmctx) = if let Some(def_index) = self.module.local_func_index(*index)
{
(
self.functions[def_index].0 as *const _,
VMFunctionEnvironment {
vmctx: self.vmctx_ptr(),
},
)
} else {
let import = self.imported_function(*index);
(import.body, import.environment)
};
let call_trampoline = Some(self.function_call_trampolines[*sig_index]);
let signature = self.module.signatures[*sig_index].clone();
ExportFunction {
address,
// Any function received is already static at this point as:
// 1. All locally defined functions in the Wasm have a static signature.
// 2. All the imported functions are already static (because
// they point to the trampolines rather than the dynamic addresses).
kind: VMFunctionKind::Static,
signature,
vmctx,
call_trampoline,
}
.into()
}
ExportIndex::Table(index) => {
let from = if let Some(def_index) = self.module.local_table_index(*index) {
self.tables[def_index].clone()
} else {
let import = self.imported_table(*index);
import.from.clone()
};
ExportTable { from }.into()
}
ExportIndex::Memory(index) => {
let from = if let Some(def_index) = self.module.local_memory_index(*index) {
self.memories[def_index].clone()
} else {
let import = self.imported_memory(*index);
import.from.clone()
};
ExportMemory { from }.into()
}
ExportIndex::Global(index) => {
let from = {
if let Some(def_index) = self.module.local_global_index(*index) {
self.globals[def_index].clone()
} else {
let import = self.imported_global(*index);
import.from.clone()
}
};
ExportGlobal { from }.into()
}
}
}

/// Return an iterator over the exports of this instance.
///
/// Specifically, it provides access to the key-value pairs, where the keys
/// are export names, and the values are export declarations which can be
/// resolved `lookup_by_declaration`.
pub fn exports(&self) -> indexmap::map::Iter<String, ExportIndex> {
self.module.exports.iter()
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved inside InstanceHandle, so that we can inject InstanceAllocator clones inside the exports.

Comment on lines -538 to -577
fn alloc_layout(offsets: &VMOffsets) -> Layout {
let size = mem::size_of::<Self>()
.checked_add(usize::try_from(offsets.size_of_vmctx()).unwrap())
.unwrap();
let align = mem::align_of::<Self>();
Layout::from_size_align(size, align).unwrap()
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved inside InstanceAllocator, it's the new place to be.

}
}

/// TODO: Review this super carefully.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

lib/vm/src/instance.rs Outdated Show resolved Hide resolved
Comment on lines 1079 to 1144
/// Get the locations of where the local `VMMemoryDefinition`s should be stored.
///
/// This function lets us create `Memory` objects on the host with backing
/// memory in the VM.
///
/// # Safety
/// - `instance_ptr` must point to enough memory that all of the offsets in
/// `offsets` point to valid locations in memory.
pub unsafe fn memory_definition_locations(
instance_ptr: NonNull<u8>,
offsets: &VMOffsets,
) -> Vec<NonNull<VMMemoryDefinition>> {
// `NonNull<u8>` here actually means `NonNull<Instance>`. See
// `Self::allocate_instance` to understand why.
let instance_ptr: NonNull<Instance> = instance_ptr.cast();

let num_memories = offsets.num_local_memories;
let num_memories = usize::try_from(num_memories).unwrap();
let mut out = Vec::with_capacity(num_memories);

let base_ptr = instance_ptr.as_ptr().add(mem::size_of::<Instance>());

for i in 0..num_memories {
let mem_offset = offsets.vmctx_vmmemory_definition(LocalMemoryIndex::new(i));
let mem_offset = usize::try_from(mem_offset).unwrap();

let new_ptr = NonNull::new_unchecked(base_ptr.add(mem_offset));

out.push(new_ptr.cast());
}

out
}

/// Get the locations of where the `VMTableDefinition`s should be stored.
///
/// This function lets us create `Table` objects on the host with backing
/// memory in the VM.
///
/// # Safety
/// - `instance_ptr` must point to enough memory that all of the offsets in
/// `offsets` point to valid locations in memory.
pub unsafe fn table_definition_locations(
instance_ptr: NonNull<u8>,
offsets: &VMOffsets,
) -> Vec<NonNull<VMTableDefinition>> {
// `NonNull<u8>` here actually means `NonNull<Instance>`. See
// `Self::allocate_instance` to understand why.
let instance_ptr: NonNull<Instance> = instance_ptr.cast();

let num_tables = offsets.num_local_tables;
let num_tables = usize::try_from(num_tables).unwrap();
let mut out = Vec::with_capacity(num_tables);

let base_ptr = instance_ptr.as_ptr().add(std::mem::size_of::<Instance>());

for i in 0..num_tables {
let table_offset = offsets.vmctx_vmtable_definition(LocalTableIndex::new(i));
let table_offset = usize::try_from(table_offset).unwrap();

let new_ptr = NonNull::new_unchecked(base_ptr.add(table_offset));

out.push(new_ptr.cast());
}
out
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from above, and changed a little bit (just code simplification, nothing fancy).

signature,
vmctx,
call_trampoline,
instance_allocator: Some(instance),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's 🆕.


ExportTable {
from,
instance_allocator: Some(instance),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's 🆕.


ExportMemory {
from,
instance_allocator: Some(instance),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's 🆕.


ExportGlobal {
from,
instance_allocator: Some(instance),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's 🆕.

@jubianchi
Copy link
Contributor

bors try

bors bot added a commit that referenced this pull request Nov 24, 2020
@bors
Copy link
Contributor

bors bot commented Nov 24, 2020

try

Build failed:

Copy link
Contributor

@MarkMcCaskey MarkMcCaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvements, lots of really tricky code though.

I'm thinking it might make the code a bit easier to read to put InstanceAllocator in a new file or something. Purely for the sake of encapsulating and hiding complex code that we shouldn't mess with most of the time.

lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Show resolved Hide resolved
lib/vm/src/instance.rs Show resolved Hide resolved
lib/vm/src/instance.rs Show resolved Hide resolved
lib/vm/src/instance.rs Show resolved Hide resolved
lib/vm/src/instance.rs Show resolved Hide resolved
@syrusakbary
Copy link
Member

Random question.
Let's say we have two instances:

  1. Instance 1. Which exports a memory created in the wasm (memory1)
  2. Instance 2, that takes a memory from it's imports, and reexports it at the same time (memory 2)

Let's say, we take the memory1 and instantiate the instance2 with this memory1, to then take the exported memory from instance 2 (memory2, which should be the same as memory1).

Which one will be the instance deallocator for memory2?

@MarkMcCaskey
Copy link
Contributor

MarkMcCaskey commented Nov 24, 2020

Random question.
Let's say we have two instances:

1. Instance 1. Which exports a memory created in the wasm (memory1)

2. Instance 2, that takes a memory from it's imports, and reexports it at the same time (memory 2)

Let's say, we take the memory1 and instantiate the instance2 with this memory1, to then take the exported memory from instance 2 (memory2, which should be the same as memory1).

Which one will be the instance deallocator for memory2?

Due to the memory layout of local vs non-local memories, that memory's data lives in instance 1 (where it was local), reexporting the memory from instance 2 should be fine. The only issue is that we'd need to be sure instance 1 can't be freed while instance 2 is using a memory from it!

So to actually answer your question, it should be instance 1, I believe.

@jubianchi jubianchi added 1.0 Wasmer at 1.0 🎉 enhancement New feature! bug Something isn't working 📦 lib-vm About wasmer-vm labels Nov 25, 2020
The `InstanceHandle::from_vmctx` method is only used by
`CallThreadState::any_instance`, where a given closure expect an
`&InstanceHandle`. However, the only closure that exists in the code
only reads an `&Instance` from the `InstanceHandle`.

So… this patch updates this only closure and `any_instance` to work on
an `&Instance` directly, which allows us to remove
`InstanceHandle::from_vmctx`. It results to less code and less pointer
arithmetic.
@Hywan
Copy link
Contributor Author

Hywan commented Nov 26, 2020

bors try

bors bot added a commit that referenced this pull request Nov 26, 2020
@bors
Copy link
Contributor

bors bot commented Nov 26, 2020

try

Build failed:

The old code is working because the alignment of `Instance` is 16, and
its size is 400. Even with a really large `VMContext`, it will still
fit in an alignment of 8. But the way it is calculated is wrong. The
size is correctly calculated, but the alignment misses `VMContext`.

Quoting our own documentation:

> It is important to remind that Instance is dynamically-sized based
> on `VMOffsets`: The `Instance.vmctx` field represents a
> dynamically-sized array that extends beyond the nominal end of the
> type.

So first, let's compute the layout of the `Instance.vmctx` “array”
with `Layout::array`. Second, let's compute the layout of `Instance`
itself with `Layout::new`. And third, let's compute the entire layout
by using `Layout::extend` + `Layout::pad_to_align`.
@Hywan
Copy link
Contributor Author

Hywan commented Nov 26, 2020

bors try

bors bot added a commit that referenced this pull request Nov 26, 2020
@bors
Copy link
Contributor

bors bot commented Nov 26, 2020

@Hywan
Copy link
Contributor Author

Hywan commented Nov 26, 2020

bors try

bors bot added a commit that referenced this pull request Nov 26, 2020
@bors
Copy link
Contributor

bors bot commented Nov 26, 2020

@Hywan
Copy link
Contributor Author

Hywan commented Nov 26, 2020

Maybe we should add tests specifically for that :-p?

@Hywan
Copy link
Contributor Author

Hywan commented Nov 27, 2020

bors try

bors bot added a commit that referenced this pull request Nov 27, 2020
@bors
Copy link
Contributor

bors bot commented Nov 27, 2020

try

Build succeeded:

Copy link
Contributor

@MarkMcCaskey MarkMcCaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, biggest thing I'd want to see changed before shipping is the unsoundness in on our internal InstanceAllocator API.

@Hywan
Copy link
Contributor Author

Hywan commented Dec 1, 2020

Everything has been addressed. Thanks @MarkMcCaskey for the deep review. I'm fixing the conflicts with master and I'll merge the PR.

@Hywan
Copy link
Contributor Author

Hywan commented Dec 1, 2020

bors try

bors bot added a commit that referenced this pull request Dec 1, 2020
@bors
Copy link
Contributor

bors bot commented Dec 1, 2020

@Hywan
Copy link
Contributor Author

Hywan commented Dec 1, 2020

bors r+

@bors
Copy link
Contributor

bors bot commented Dec 1, 2020

@bors bors bot merged commit 7a45d7d into wasmerio:master Dec 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.0 Wasmer at 1.0 bug Something isn't working 🎉 enhancement New feature! 📦 lib-vm About wasmer-vm
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants