Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented multi-threading for both JS and SYS, plus other WASIX implementations #3034

Closed
wants to merge 11 commits into from

Conversation

john-sharratt
Copy link
Contributor

Description

Review

  • Add a short description of the change to the CHANGELOG.md file

@john-sharratt john-sharratt force-pushed the wasmer3-wasix branch 2 times, most recently from 232a1c4 to af5663b Compare July 22, 2022 07:52
Comment on lines 75 to 92
/// Returns the pointer to the raw bytes of the `Memory`.
//
// This used by wasmer-emscripten and wasmer-c-api, but should be treated
// as deprecated and not used in future code.
#[doc(hidden)]
pub fn data_ptr(&self, store: &impl AsStoreRef) -> *mut u8 {
self.buffer(store).base
}

/// Returns the size (in bytes) of the `Memory`.
pub fn data_size(&self, store: &impl AsStoreRef) -> u64 {
self.buffer(store).len.try_into().unwrap()
}

/// Retrieve a slice of the memory contents.
///
/// # Safety
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't modify the API of non-shared memory. We should either:

  • create a new SharedMemory that is able to lock
  • preserve the current memory API and depending if the memory is shared o no, to work on the MutexGuard or not

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SharedMemory would still require guards that abstract the access to the memory where none shared memory could be made to be a zero cost abstraction it will still mandate some changes in the way Wasmer API accesses memory. So the main difference becomes if they implement the lock inside or not. Also we would need to figure out a way to decide when to use SharedMemory and when not to use it. With WASI we don't know if its multithreaded or not upfront until its actually used (having import memory is not conclusive as one may wish to pass in memory but be singlethreaded). Hence in the WASI mode we would have to assume all programs are multithreaded and hence need SharedMemory. Beyond this there is no technical limitation to implementing what you say and can be made optional on the "enable_threads" argument

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drastically simplified the changes around memory much closer to the original code. I had made an incorrect assumption which means the multithreading risks are much lower than I thought thus requiring much less protection. Wasmer uses mmap which means it does not change the base pointer address during grow operations, further it never shrinks in size... with those two constraints together it means there is no need to protect against a grow operation in parallel with normal memory operation, we only need to protect against concurrent grow operations which is only in one small part of the code. A simple RwLock around mmap is thus all that's needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now implemented with VMSharedMemory instead and preserves the original API where possible

@@ -19,8 +23,10 @@ use std::fmt;
#[derive(Clone)]
pub struct Instance {
_handle: StoreHandle<WebAssembly::Instance>,
thread_seed: Arc<AtomicU32>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thread ID is abstracted so that real thread ID's are not leaked into the sandbox (security considerations) - this seed is used to generate incrementing threads - it could be moved into WasiEnv but then threading starts to become a WASI only feature while I think its likely other things will use threading (unless we just use WASI code for none WASI ABI's)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is now moved away from Instance and put into WasiStateThreading instead

Comment on lines +134 to +138
pub fn imports_for_module(
&self,
_store: &mut impl AsStoreMut,
module: &Module
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to nave a new _store param, if we are not using it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The store is needed here (for sys) - JS doesn't use it but I kept the interface consistent.

ret.push(Extern::Memory(Memory::new(store, mem_ty.clone())

It's used for the situation that imported memory is needed (multithreading) - it will create the memory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer needed and is removed from the new patch I made

Comment on lines 136 to 239
pub fn lock<'a>(&'a self, store: &'a impl AsStoreRef) -> MemoryGuard<'a> {
let view = self.handle.get(store.as_store_ref().objects()).vmmemory();
let def = unsafe { view.as_ref() };
MemoryGuard {
_view: view,
base: def.base,
len: def.current_length,
marker: PhantomData,
Copy link
Member

@syrusakbary syrusakbary Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to name this buffer and return a ArrayBuffer (which is similar as a memory view) in case memory is not shared, and return a SharedArrayBuffer in case it's shared

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure no worries - makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer needed as the changes were simplified

Comment on lines 84 to 99
impl VMTableProtected
{
/// Get the `VMTableDefinition`.
fn get_vm_table_definition(&self) -> NonNull<VMTableDefinition> {
self.vm_table_definition.as_ptr()
}

/// Returns the number of allocated elements.
pub fn size(&self) -> u32 {
unsafe {
let td_ptr = self.get_vm_table_definition();
let td = td_ptr.as_ref();
td.current_elements
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to protect the table at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are race conditions that need to be protected against because the table was converted to a COW (Copy-on-write) pattern. In normal single-threaded code this will be close to a zero cost abstraction but for multi-threaded code it will copy the VMTable before modifying it. For performance reasons I implemented the copying as COW to make it a shallow copy rather than a deep copy as I assume that a large number of the tables do not change between threads

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the VMTable does not need to be multithread safe anymore so this was removed - instead we follow closer the threads proposal and make a new instance instead

.downcast_mut::<T>()
.unwrap()
.map(|a| a.downcast_mut::<T>())
.flatten()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think .unwrap().unwrap() will be faster to execute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably compiles to the same assembly code - I can change it if you like though no worries

@@ -29,7 +29,7 @@ impl<T> FunctionEnv<T> {
}

/// Get the data as reference
pub fn as_ref<'a>(&self, store: &'a impl AsStoreMut) -> &'a T
pub fn as_ref<'a>(&self, store: &'a impl AsStoreRef) -> &'a T
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops, thanks for the fix!

Comment on lines 28 to 29
/// Reactors are used to wake up the program
pub reactors: Reactors,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the programs use SharedArrayBuffer to wake up? (I might be confusing things, so it would be good to have a bit of better details on for which cases Reactors are used)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make the scope of a SharedArrayBuffer functionality wider then...well shared buffers. We should try and follow the "separation of concerns" principle when we can. Do you think we need to create a superset object/class? and if so what would you name it?

The reactors are needed because WASIX provides reactors (code execution stops until work arrives to be done) - this is essential for building asynchronous programs (for instance with Tokio). This will also be important for low overhead density of web assembly applications running on the same physical server, if there are web assembly programs running that are just in a spin lock / futex waiting on syscall to finish (e.g. poll) then it will not be possible to push them into a warm state (loaded but no CPU cycles)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this away from the instance and put it in Wasi instead - its much cleaner now

Comment on lines 149 to 151
#[derive(Clone, Copy)]
#[derive(Clone)]
pub struct WasmSlice<'a, T: ValueType> {
buffer: MemoryBuffer<'a>,
view: MemoryGuard<'a>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super sure about this changes. Once we have two kinds of Buffer (non shared and shared), I think this might need to vary more

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the threading extension proposes a new Shared Memory object. We will need two kinds of buffer, shared and not shared.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now implemented it with Shared and Owned memory objects closer to the threading proposal

@john-sharratt john-sharratt force-pushed the wasmer3-wasix branch 3 times, most recently from 2ad2e02 to 98e3ead Compare July 28, 2022 23:25
@john-sharratt john-sharratt changed the title Added the ability to clone a Store and resolved the merge conflicts Implemented multi-threading for both JS and SYS, plus other WASIX implementations Jul 28, 2022
Comment on lines 1 to 8
use std::collections::VecDeque;
use std::sync::Mutex;
use std::sync::atomic::AtomicBool;
use std::sync::atomic::Ordering;
use std::sync::mpsc;
use std::sync::Arc;
use std::task::Waker;
use std::time::Duration;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to live inside of Wasmer. It can be a side utility/module inside of wasmer-wasi atm (since it's only used there)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have indeed now moved this to wasmer-wasi instead

Comment on lines 1 to 2
use std::{sync::{Arc, mpsc, Mutex}, time::Duration};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to live inside of Wasmer. It can be a side utility/module inside of wasmer-wasi atm (since it's only used there)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we can move it to WASI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now moved to WASI

Added a Reactor that drives WebAssembly react pattern
Added thread local storage
Implemented the remaining functionality for the BUS
Fixed an issue where the imports were missing important information about the memory size
Added a Reactor that drives WebAssembly react pattern
Added thread local storage
Implemented the remaining functionality for the BUS
Fixed an issue where the imports were missing important information about the memory size
Added unit tests for multithreading and WASIX
@john-sharratt john-sharratt force-pushed the wasmer3-wasix branch 3 times, most recently from 7b4b6af to 8612415 Compare August 16, 2022 13:52
Fixed an issue with the module caching that no longer supports clones - instead we do a serialize and deserialize

Added support for thread local storage

Added better support for realtime reactors

More updates to the reactors and added a unit test that crashes because of some regression issue (I believe)

Modified the reactors so that they use futex implementations are wake much quicker to events
@john-sharratt
Copy link
Contributor Author

This is now replaced by...
#3116

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants