Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak in host function envs #1865

Merged
merged 15 commits into from
Dec 15, 2020
Merged

Conversation

MarkMcCaskey
Copy link
Contributor

@MarkMcCaskey MarkMcCaskey commented Dec 3, 2020

TODO: link to issue

This PR contains a number of changes:

  1. Make WasmerEnv: Clone
  2. Store a pointer to the clone function when creating a host function (Notably this is a feature that wouldn't work even if we could use a proper trait object because you can't have a Sized trait object and Clone: Sized).
  3. Store a pointer to the drop function when creating a host function.
  4. Clone the env via pointer every time an Instance is made. Therefore each Instance gets its own, unique Env per host function with Env.
  5. Add reference counting and drop logic to a sub-field of wasmer_export::ExportFunction which frees the original version of the Env (the thing that gets cloned each time an Instance is made) with the drop function pointer.
  6. Change some logic in vm::Instance from SoA (struct of arrays) to AoS (array of structs): this uses more memory but is a bit less error prone and can be easily changed later.
  7. Add logic on this new struct (vm::ImportEnv) that contains the function pointers for each import in Instance to drop (with the drop fn pointer) when the vm::Instance is being dropped. This fixes the original memory leak.
  8. Add wrapper functions inside the host function creation functions which makes the layout of the user supplied env-pointer the responsibility of each function. Thus, rather than drop being Env::drop, it's a function which frees all wrapper types, traverses indirections and frees the internal Env with Env::drop. This simplifies code at the cost of making the host_env pointer (vmctx) not consistent in terms of what it actually points to. This change fixes another memory leak related to the creation of host functions.

tl;dr: we're leaning into manually doing virtual method dispatch on WasmerEnvs and it actually works great! The biggest issue I have with the PR as-is is that the code isn't as clean/readable/robust as I'd ideally like it to be.

Edit (by @Hywan): This PR fixes #1584, #1714, #1865, #1667.

Review

  • Add a short description of the the change to the CHANGELOG.md file

Copy link
Member

@syrusakbary syrusakbary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A function can be exported twice. How do you verify that host_env_drop_fn is only called once?

@jubianchi jubianchi added 1.0 Wasmer at 1.0 🎉 enhancement New feature! labels Dec 4, 2020
@MarkMcCaskey
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Dec 5, 2020
@bors
Copy link
Contributor

bors bot commented Dec 5, 2020

try

Build failed:

Cargo.toml Outdated Show resolved Hide resolved
lib/api/src/env.rs Outdated Show resolved Hide resolved
lib/api/src/env.rs Outdated Show resolved Hide resolved
lib/api/src/externals/function.rs Outdated Show resolved Hide resolved
lib/api/src/externals/function.rs Show resolved Hide resolved
lib/engine/src/export.rs Outdated Show resolved Hide resolved
lib/engine/src/export.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@Hywan Hywan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the PR. In some places it simplifies code, in other, it's a little bit harder to understand. I think another refactoring round would be required in a close future, but let's keep the priority as is: fixing the memory leak.

There is a lot of TODO in this PR. Are you going to fix them all? I believe a majority should be fixed before merging this PR.

func: Box::new(func),
function_type: ty.clone(),
});
let dynamic_ctx: VMDynamicFunctionContext<DynamicFunctionWithoutEnv> =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet this type annotation isn't required (regardless, I prefer having type annotation in this kind of code, so it's perfectly fine to keep it, just wondered whether it changes something).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strictly no, I thought it was helpful for understanding what's happening with the unsafe block here though.

Actually all these wrapper functions can be created by a helper function, so I should probably just do that

Comment on lines -735 to -780
// This field _must_ come first in this struct.
env: Box<Env>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why env is no longer required to be at first position?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason before was due to a bad hack where we'd double deref a pointer ignoring the fact that it was in a struct.

We now have each function that is called on the env traverse things correctly, so for example we create a wrapper function where we walk through the types and find what we need. So if we need to operate on the Box<env> itself, we can find it in each function. I think this is a simpler solution because then all the functions always just work on the env, we had an issue where the clone and drop function needed to be called on the top level the wrapper dynamic func context thing but the initialize function needed to be called on the inner env. This fixes that by making the thing each function gets consistent.

@@ -76,7 +76,7 @@ impl ValFuncRef for Val {
let export = wasmer_engine::ExportFunction {
// TODO:
// figure out if we ever need a value here: need testing with complicated import patterns
import_init_function_ptr: None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the TODO above still valid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm not 100% confident in this code which is why that comment is still there, I can investigate it later... I'm not aware of a good way to test this yet, I can investigate that though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should investigate. I bet is fine as None, but it will be good to verify

lib/engine/src/export.rs Show resolved Hide resolved
///
/// See `wasmer_vm::export::VMExportFunction::vmctx` for the version of
/// this pointer that is used by the VM when creating an `Instance`.
pub host_env: *mut std::ffi::c_void,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea: Create a new constructor method rather than having all fields pub. I think it could be “dangerous”, it's better to hide everything here. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for all the fields in this type.

lib/engine/src/export.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
bors bot added a commit that referenced this pull request Dec 11, 2020
1902: Fix memory leak in `InstanceAllocator` on failure r=MarkMcCaskey a=MarkMcCaskey

This code needs to be cleaned up, just hacked it out really quickly.

The idea is to use a builder pattern to create a type-safe state machine for setting up things for the `InstanceAllocator`; this should allow us to remove `unsafe` in our "public" API because we can guarantee certain invariants because of the restricted nature of these types.

The names for the new types are long and don't really make sense but I didn't want to get blocked on that: suggestions welcome!

In combination with #1865 , this PR should fix the all the memory leaks seen by asan while running `make test-cranelift-jit` 🎉 🎉 🎉 

Co-authored-by: Mark McCaskey <mark@wasmer.io>
impl<'a> WasmerEnv for &'a ::std::sync::atomic::AtomicU32 {}
impl<'a> WasmerEnv for &'a ::std::sync::atomic::AtomicI64 {}
impl<'a> WasmerEnv for &'a ::std::sync::atomic::AtomicUsize {}
impl<'a> WasmerEnv for &'a ::std::sync::atomic::AtomicIsize {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need the <'a> ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<'a> is needed because of the &: trait impls on things with lifetimes can't omit them. And the reason we need & is to make these types shared somehow, otherwise the impl WasmerEnv for X can't be used here unless we have recursive definitions.

These impls on std types definitely aren't critical; these actually are only work with 'static lifetimes because of other constraints, so this only really works when putting an atomic in a static variable or otherwise doing unsafe things.

I'd be fine to remove them

lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/engine/src/artifact.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/instance.rs Outdated Show resolved Hide resolved
lib/vm/src/imports.rs Outdated Show resolved Hide resolved
@@ -76,7 +76,7 @@ impl ValFuncRef for Val {
let export = wasmer_engine::ExportFunction {
// TODO:
// figure out if we ever need a value here: need testing with complicated import patterns
import_init_function_ptr: None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should investigate. I bet is fine as None, but it will be good to verify

pub enum ImportFunctionEnv {
/// The `vmctx` pointer does not refer to a host env, there is no
/// metadata about it.
NoEnv,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think is better to name this: VMEnv.
And have the HostEnv to be an enum with all the different options:

  • NoEnv
  • Env

Also, the VM should not be aware of dynamic functions or not. So I believe DynamicHostEnvWithNoInnerEnv should be done via HostEnv

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thought on the VM prefix, I'm not sure it makes sense to use it on types that aren't related to the VM.

Yeah, the extra case here is just to make the multiple cases more explicit. Otherwise there's data that's sometimes there and sometimes not and it's not exactly clear why. This enum is more descriptive than prescriptive: it's describing the state of the world as it is, not what it should be. Perhaps we should move this type out of vm, there's no reason it has to be defined in VM.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the VM, there is not such thing as DynamicHostEnv. This is something that we do in the engine level, not VM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meaning: the vm only handles the functions with their end ABI. Is the engine the one that creates the trampolines and so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: there is some "dangling" code in the VM that is aware of dynamic functions. But thanks to the PR where we move most of the function stuff to the engine, it should be easy to move the rest as well.

@MarkMcCaskey
Copy link
Contributor Author

bors r+

bors bot added a commit that referenced this pull request Dec 15, 2020
1865: Fix memory leak in host function envs r=MarkMcCaskey a=MarkMcCaskey

TODO: link to issue

This PR contains a number of changes:

1. Make `WasmerEnv: Clone`
2. Store a pointer to the `clone` function when creating a host function (Notably this is a feature that wouldn't work even if we _could_ use a proper trait object because you can't have a `Sized` trait object and `Clone: Sized`).
3. Store a pointer to the `drop` function when creating a host function.
4. Clone the env via pointer every time an `Instance` is made. Therefore each `Instance` gets its own, unique `Env` per host function with `Env`.
5. Add reference counting and drop logic to a sub-field of `wasmer_export::ExportFunction` which frees the original version of the `Env` (the thing that gets cloned each time an `Instance` is made) with the `drop` function pointer.
6. Change some logic in `vm::Instance` from SoA (struct of arrays) to AoS (array of structs): this uses more memory but is a bit less error prone and can be easily changed later.
7. Add logic on this new struct (`vm::ImportEnv`) that contains the function pointers for each import in `Instance` to drop (with the `drop` fn pointer) when the `vm::Instance` is being dropped. This fixes the original memory leak.
8. Add wrapper functions inside the host function creation functions which makes the layout of the user supplied env-pointer the responsibility of each function.  Thus, rather than `drop` being `Env::drop`, it's a function which frees all wrapper types, traverses indirections and frees the internal `Env` with `Env::drop`.  This simplifies code at the cost of making the `host_env` pointer (`vmctx`) not consistent in terms of what it actually points to.  This change fixes another memory leak related to the creation of host functions.

tl;dr: we're leaning into manually doing virtual method dispatch on `WasmerEnv`s and it actually works great! The biggest issue I have with the PR as-is is that the code isn't as clean/readable/robust as I'd ideally like it to be.

Edit (by @Hywan): This PR fixes #1584, #1714, #1865, #1667.

# Review

- [ ] Add a short description of the the change to the CHANGELOG.md file


Co-authored-by: Mark McCaskey <mark@wasmer.io>
Co-authored-by: Mark McCaskey <5770194+MarkMcCaskey@users.noreply.github.com>
@bors
Copy link
Contributor

bors bot commented Dec 15, 2020

Build failed:

Comment on lines -75 to +74
counter: Arc<RefCell<i32>>,
counter: Arc<Mutex<i32>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to change the examples? The previous examples should work as is. Or am I missing something?

Copy link
Contributor Author

@MarkMcCaskey MarkMcCaskey Dec 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the previous examples were incorrect, we just didn't enforce it. WasmerEnv needs to be Send and Sync because Instance is Send and Sync and we don't synchronize access to functions. So it's possible in safe Rust to get a NativeFunc on multiple threads that points to the same underlying function. In order for the WasmerEnv to be able to handle being used this way, it must be both Send and Sync. This shouldn't affect much, if any, real code. For example, WASI was not affected at all by this change. Emscripten was only affected because it was doing some weird stuff with pointers.

@MarkMcCaskey
Copy link
Contributor Author

bors r+

@bors
Copy link
Contributor

bors bot commented Dec 15, 2020

@bors bors bot merged commit 22c7bc8 into master Dec 15, 2020
@bors bors bot deleted the fix/host-func-env-memory-leak branch December 15, 2020 23:12
@webmaster128
Copy link
Contributor

Edit (by @Hywan): This PR fixes #1584, #1714, #1865, #1667.

Note that GitHub only closes one ticket per keyword. In order to connect all of them, you need something like: fixes #1584, fixes #1714, fixes #1865, fixes #1667.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.0 Wasmer at 1.0 🎉 enhancement New feature!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

vmctx leaks?
6 participants