Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(vm) Fix memory leak of InstanceHandle #1822

Merged
merged 6 commits into from
Nov 19, 2020
Merged

Conversation

Hywan
Copy link
Contributor

@Hywan Hywan commented Nov 17, 2020

Description

In wasmer_engine::Artifact::instantiate, the engine artifact
allocates a new instance by calling
wasmer_vm::InstanceHandle::allocate_instance. To make it short, this
function calculates the Layout of an wasmer_vm::Instance, and
allocates space for it by calling alloc. This last part is
important because it means we are responsible to deallocate it with
dealloc.

The pointer to this wasmer_vm::Instance is stored in the
wasmer_vm::InstanceHandle structure. That's the handle that is
returned by the engine artifact Artifact::instantiate method.

This instance handle is then stored in wasmer::Instance (through the
intervention of wasmer::Module).

How is it freed? It wasn't. That's the leak. dealloc was never
called.

How to free it? We must call dealloc. There is even a
wasmer_vm::InstanceHandle::deallocate helper to do that
properly. Neat!

When to free it? That's the tricky part. A wasmer::Instance can be
clonable. To do so, wasmer_vm::InstanceHandle must be clonable
too. There was a Clone implementation, that was constructing a new
wasmer_vm:InstanceHandle by using the same pointer to
wasmer_vm::Instance. That's annoying because it's the best way to
get many pointers that point to the same wasmer_vm::Instance in the
nature, and it's difficult to track them.

This patch changes the paradigm. There is only one and unique
wasmer_vm::InstanceHandle per wasmer::Instance, including its
clones. The handle is now stored inside a
Arc<Mutex<wasmer_vm::InstanceHandle>>. Consequently, when a
wasmer::Instance is cloned, it uses the same
wasmer_vm::InstanceHandle, not a clone of it.

Bonus: wasmer::Instance continues to be Send + Sync.

So. Let's back to our question. When to free
wasmer_vm::InstanceHandle? Response: When wasmer::Instance is
dropped. Right? There is a unique path from wasmer::Instance, to
wasmer_vm::InstanceHandle, to wasmer_vm::Instance now. So we just
need to call wasmer_vm::InstanceHandle::dealloc in a specific Drop
implementation for wasmer_vm::InstanceHandle, and the Rust borrow
checker does the rest.

Yes. … No. There is another use case: It is possible to create a
wasmer_vm::InstanceHandle with InstanceHandle::from_vmctx. Indeed,
a wasmer_vm::VMContext also stores a pointer to
wasmer_vm::Instance. In this, we consider wasmer_vm::VMContext
owns the instance pointer, somehow, and is responsible to free it
properly.

Consequently, we need another flag inside wasmer_vm::InstanceHandle
to know whether this structure owns the pointer to
wasmer_vm::Instance or not.

So. Let's back to our question. When to free
wasmer_vm::InstanceHandle? Response: Inside the Drop
implementation of wasmer_vm::InstanceHandle with its Self::dealloc
method if and only if the handle owns the pointer to
wasmer_vm::Instance.

Testing with Valgrind shows that the leak has been removed.

Review

  • Add a short description of the the change to the CHANGELOG.md file

…`.module_translation_state`.

The idea is to avoid confusion with th e`ModuleInfoTranslation` type itself.
In `wasmer_engine::Artifact::instantiate`, the engine artifact
allocates a new instance by calling
`wasmer_vm::InstanceHandle::allocate_instance`. To make it short, this
function calculates the [`Layout`] of an `wasmer_vm::Instance`, and
allocates space for it by calling [`alloc`]. This last part is
important because it means we are responsible to deallocate it with
[`dealloc`].

The pointer to this `wasmer_vm::Instance` is stored in the
`wasmer_vm::InstanceHandle` structure. That's the handle that is
returned by the engine artifact `Artifact::instantiate` method.

This instance handle is then stored in `wasmer::Instance` (through the
intervention of `wasmer::Module`).

How is it freed? It wasn't. That's the leak. [`dealloc`] was never
called.

How to free it? We must call [`dealloc`]. There is even a
`wasmer_vm::InstanceHandle::deallocate` helper to do that
properly. Neat!

When to free it? That's the tricky part. A `wasmer::Instance` can be
clonable. To do so, `wasmer_vm::InstanceHandle` must be clonable
too. There was a `Clone` implementation, that was constructing a new
`wasmer_vm:InstanceHandle` by using the same pointer to
`wasmer_vm::Instance`. That's annoying because it's the best way to
get many pointers that point to the same `wasmer_vm::Instance` in the
nature, and it's difficult to track them.

This patch changes the paradigm. There is only one and unique
`wasmer_vm::InstanceHandle` per `wasmer::Instance`, including its
clones. The handle is now stored inside a
`Arc<Mutex<wasmer_vm::InstanceHandle>>`. Consequently, when a
`wasmer::Instance` is cloned, it uses the same
`wasmer_vm::InstanceHandle`, not a clone of it.

Bonus: `wasmer::Instance` continues to be `Send` + `Sync`.

So. Let's back to our question. When to free
`wasmer_vm::InstanceHandle`? Response: When `wasmer::Instance` is
dropped. Right? There is a unique path from `wasmer::Instance`, to
`wasmer_vm::InstanceHandle`, to `wasmer_vm::Instance` now. So we just
need to call `wasmer_vm::InstanceHandle::dealloc` in a specific `Drop`
implementation for `wasmer_vm::InstanceHandle`, and the Rust borrow
checker does the rest.

Yes. … No. There is another use case: It is possible to create a
`wasmer_vm::InstanceHandle` with `InstanceHandle::from_vmctx`. Indeed,
a `wasmer_vm::VMContext` also stores a pointer to
`wasmer_vm::Instance`. In this, we consider `wasmer_vm::VMContext`
owns the instance pointer, somehow, and is responsible to free it
properly.

Consequently, we need another flag inside `wasmer_vm::InstanceHandle`
to know whether this structure owns the pointer to
`wasmer_vm::Instance` or not.

So. Let's back to our question. When to free
`wasmer_vm::InstanceHandle`? Response: Inside the `Drop`
implementation of `wasmer_vm::InstanceHandle` with its `Self::dealloc`
method if and only if the handle owns the pointer to
`wasmer_vm::Instance`.

Testing with Valgrind shows that the leak has been removed.

[`Layout`]: https://doc.rust-lang.org/std/alloc/struct.Layout.html
[`alloc`]: https://doc.rust-lang.org/std/alloc/fn.alloc.html
[`dealloc`]: https://doc.rust-lang.org/std/alloc/fn.dealloc.html
@Hywan Hywan added bug Something isn't working 📦 lib-api About wasmer 📦 lib-engine About wasmer-engine 📦 lib-vm About wasmer-vm labels Nov 17, 2020
@Hywan Hywan requested a review from losfair as a code owner November 17, 2020 15:00
@Hywan Hywan self-assigned this Nov 17, 2020
@Hywan
Copy link
Contributor Author

Hywan commented Nov 17, 2020

Note: The first 2 commits were about explorations of this leak. They are part of the PR but they may disturb the reviewer. Feel free to review 8016a34 alone.

lib/vm/src/instance.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@MarkMcCaskey MarkMcCaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks generally good to me, I have some concerns about use-after-free with this change: my understanding is that this shouldn't be that hard. I'll try to trigger it and post the code here.

lib/engine-jit/src/artifact.rs Outdated Show resolved Hide resolved
@MarkMcCaskey
Copy link
Contributor

error[E0609]: no field `module_translation` on type `wasmer_compiler::ModuleInfoTranslation<'_>`
   --> lib/engine-object-file/src/artifact.rs:131:25
    |
131 |             translation.module_translation,
    |                         ^^^^^^^^^^^^^^^^^^ help: a field with a similar name exists: `module_translation_state`

error: aborting due to previous error

Heads up I got an error while building locally, it's easy to fix though

@MarkMcCaskey
Copy link
Contributor

By the way here's some code that compiles and runs and I believe it's a memory leak:

use wasmer::{imports, wat2wasm, Instance, Module, Store, Value};
use wasmer_compiler_cranelift::Cranelift;
use wasmer_engine_jit::JIT;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let wasm_bytes = wat2wasm(
        r#"
(module
  (type $sum_t (func (param i32 i32) (result i32)))
  (func $sum_f (type $sum_t) (param $x i32) (param $y i32) (result i32)
    local.get $x
    local.get $y
    i32.add)
  (export "sum" (func $sum_f)))
"#
        .as_bytes(),
    )?;

    let sum = {
        let compiler = Cranelift::default();
        let store = Store::new(&JIT::new(&compiler).engine());
        let module = Module::new(&store, wasm_bytes)?;
        let import_object = imports! {};
        let instance = Instance::new(&module, &import_object)?;
        let sum = instance.exports.get_function("sum")?;
        sum.clone()
    };

    println!("Calling `sum` function...");
    let results = sum.call(&[Value::I32(1), Value::I32(2)])?;

    println!("Results: {:?}", results);
    assert_eq!(results.to_vec(), vec![Value::I32(3)]);

    Ok(())
}

The idea here is to just call a local function after the Instance has been freed. I've verified that the instance is actually being freed by adding a print statement to InstanceHandle::dealloc giving us:

[lib/vm/src/instance.rs:1149] "FREEING INSTANCE DATA" = "FREEING INSTANCE DATA"
Calling `sum` function...
Results: [I32(3)]

@Hywan
Copy link
Contributor Author

Hywan commented Nov 17, 2020

The behavior you're describing is also a “bug”: it was working of this bug.

Let's fix that. As discussed internally, we will keep this behavior.

@Hywan
Copy link
Contributor Author

Hywan commented Nov 19, 2020

error[E0609]: no field `module_translation` on type `wasmer_compiler::ModuleInfoTranslation<'_>`
   --> lib/engine-object-file/src/artifact.rs:131:25
    |
131 |             translation.module_translation,
    |                         ^^^^^^^^^^^^^^^^^^ help: a field with a similar name exists: `module_translation_state`

error: aborting due to previous error

Heads up I got an error while building locally, it's easy to fix though

Fixed. Thanks.

@syrusakbary
Copy link
Member

bors r+

@bors
Copy link
Contributor

bors bot commented Nov 19, 2020

@bors bors bot merged commit 43b3d4b into wasmerio:master Nov 19, 2020
@Hywan
Copy link
Contributor Author

Hywan commented Nov 19, 2020

Thanks for merging that. I've been working on another PR to fix the “use exports after freed instance” feature.

bors bot added a commit that referenced this pull request Dec 1, 2020
1837: feat(vm) Ability to use exports after the instance has been freed r=Hywan a=Hywan

# Description

Sequel of #1822. Before #1822, we were able to use an export from an instance even if the instance were freed. That was a nice feature, but since #1822 is merged, it's now an undefined behavior. Indeed, this feature was working because of a memory leak. This PR is an attempt to safely restore this behavior!

At the same time, this patch fixes the way the `Layout` of `Instance` is calculated (it was wrong).

There is now an `InstanceAllocator` that can be viewed as a `Arc<Instance>` basically. The `InstanceAllocator` type is responsible to calculate the layout of an `Instance`, to allocate it, and to deallocate it. The `InstanceHandle` allocates it along with constructing the `Instance`. A clone of the `InstanceAllocator` is passed into exports when they are created. Consequently, every export owns a clone of the `InstanceAllocator`.

The `InstanceAllocator` frees the `Instance` only when all clones of itself are dropped, just like an `Arc`. `InstanceAllocator` is the only type that “owns” the pointer to the `Instance` for more safety. That way, we stop having copies of the same `Instance` pointer everywhere.

So by (re-)implementing the “use-export-after-freed-instance” feature, we also add more soundness, fix silent bugs (like incorrect `Layout` calculation), and add more safety in the code by having less raw pointers.

# To do

- [x] Zero memory leak with `valgrind`
- [x] Zero errors for memory writes with `valgrind`
- [x] Our example works
- [x] Extensive documentation
- [x] Update `InstanceHandle::from_vmctx` (only used by the trap API) [note: the method has been removed]
- [x] All tests pass

# Example we want to pass

This example is now a test in the `lib/api/` crate.

<details>

<summary>The example we want to work</summary>

```rust
use wasmer::*;

fn main() {
    let store = Store::default();
    let module = Module::new(
        &store,
        "
    (module
      (type $sum_t (func (param i32 i32) (result i32)))
      (func $sum_f (type $sum_t) (param $x i32) (param $y i32) (result i32)
        local.get $x
        local.get $y
        i32.add)
      (export \"sum\" (func $sum_f)))
",
    )
    .unwrap();

    let import_object = ImportObject::new();
    let instance = Instance::new(&module, &import_object).unwrap();
    let instance2 = instance.clone();
    let instance3 = instance.clone();

    let sum = instance.exports.get_function("sum").unwrap().clone();

    drop(instance);
    drop(instance2);
    drop(instance3);

    dbg!(sum.call(&[Value::I32(1), Value::I32(2)]).unwrap());
}
```

</details>

# Review

- [x] Add a short description of the the change to the CHANGELOG.md file


Co-authored-by: Ivan Enderlin <ivan@mnt.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working 📦 lib-api About wasmer 📦 lib-engine About wasmer-engine 📦 lib-vm About wasmer-vm
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants