-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allowing atomic instructions with unshared memory #144
Comments
Making the load/store/rmw/cmpxchg ops more lenient seems fairly unproblematic (though see below) and would make simple cases "just work".
Another concern with the simple operations is that code written for shared memory may use techniques that are not directly related to It is possible that we could allow the simple ops to verify and execute with non-shared memory but that To deal with the spinlocks problem, perhaps a constant flag in memory (placed there by the linker) would be adequate, until we know how much an issue this really is; or we add an instruction that evaluates to a constant to test for shared memory. After all, we already have |
I think this is actually practical. Even if an engine doesn't support threading, it's not so hard to support this proposal in a superficial way. There's been some discussion (see WebAssembly/wasm-c-api#107) about sharing functions between threads would add a shared flag to functions (and all other extern kinds), and would prohibit shared functions from calling non-shared functions or accessing non-shared memories/tables/globals. This implies that modules compiled for sharing between threads would not be interoperable with non-shared modules. Allowing the atomic instructions to operate on non-shared memories will only defer the interoperability problems. |
I think mutexes should mostly just work. The single thread cannot fail to acquire a mutex because there are no other threads that could be holding it. Mutexes can easily be implemented to not execute a wait instruction if they are acquired on the first try. Programs that wait on condition variables while some other thread does work would be more problematic, but presumably such applications have alternative code paths for single-threaded execution if they are meant to be run in single-threaded environments.
I don't believe this introspection is necessary if mutexes just work. The reason I suggested that
Like normal waiting locks, I would expect spinlocks to always be acquired on the first try and therefore not actually spin.
Good point! If I'm missing something and this introspection is necessary, then it can easily be solved by the linker.
True, but it seems hacky to have an engine that appears to support the threads proposal but doesn't really support actually having multiple threads. I would like to avoid such a muddled situation.
Perhaps, but the user problem will be solved for the time before the extensions you mention are implemented and standardized. If those extensions were standardized, it would ideally still be possible to have MVP-compatible multithreaded modules even if other capabilities like forking could not be MVP-compatible. |
Something "just works" for how the Rust WASM support in
We use spinlocks for parts of the memory allocator in Lumen and haven't had issue being single-threaded. |
@KronicDeth Good point! I am essentially proposing that we move the stripping of atomics from the tools to the engine, and the fact that everything works when we strip atomics in the tools suggests that everything would continue to work if we had the engines do the stripping instead. |
@tlively, there's a fair amount of "can be", "presumably", "seems", "may" in your earlier message. I'm inclined to be skeptical myself. We probably need some data / experience reports here from real applications. I see @KronicDeth has some; more would be welcome. (Unless we come to agreement quickly it would also be good not to hold up the threads MVP over this, but perhaps that's a separate discussion.) |
@lars-t-hansen We will experiment with the proposed semantics in V8 and real-world programs and follow up with our results. |
I have confirmed that few Emscripten-compiled program would be broken by this change. Emscripten's pthread implementation calls The only programs that could possibly break under the proposed semantics are those that call However, we could prevent even these applications from breaking on unshared memory with further changes to how |
They are not broken if they are using a wait with a timeout as a simple implementation of sleep(). That's what I'd do... |
Right, I should have been more precise. Only programs that would reach the Suspend step in the Atomics.wait semantics that are also running in a context that forbids suspending or using an unshared memory would throw an error, but those programs throw errors today. I am proposing that we loosen the restrictions on what programs can execute |
They don't, though. Today those programs require shared memory to validate, in which case they would execute just fine. You're proposing to allow them to validate & execute in a context where they would start throwing errors if they get to the Suspend state, and my argument is only that we should expect there to be some programs like that. (If a program is running in unshared memory today then the author would have used a different mechanism for sleeping.) |
I presented this proposal at the CG meeting today. There did not appear to be broad consensus on the best way forward, but allowing normal (i.e. non-wait) atomic operations to validate and execute on unshared memories was uncontroversial. The conditional sections proposal was also raised as a way of allowing individual objects to simultaneously contain and not contain wait instructions. I presented the two solutions raised earlier in this thread, namely 1) trapping when wait instructions are executed in unshared-memory environments by analogy to There was mixed support for option 2 due to the riskiness of changing wait semantics. It was also mentioned that pursuing option 1 would not preclude moving to option 2 at a later date if the need arose. Also, an optimal solution would probably use conditional sections to avoid executing waits in unshared environments, in which case option 1 would be fine. Given all that, are there any objections to moving forward with option 1, i.e. trapping when waits are executed in unshared environments? @lars-t-hansen did you have additional concerns about engines needing to do extra work to lower normal atomics? |
@tlively, no real concern about the work involved for the plain atomics, these can be lowered to alignment check + nonatomic access I think. |
As discussed in the CG on November 12, 2019 and in WebAssembly#144. Specifically, all atomic accesses are now allowed to validate and execute normally on unshared memories and wait operations trap when used with unshared memories. The PR updates Overview.md and makes a best-effort attempt at updating the spec, making no changes when there is already a TODO for updating spec text. It also adds new TODO comments to the reference interpreter where changes will have to be made.
As discussed in the CG on November 12, 2019 and in #144. Specifically, all atomic accesses are now allowed to validate and execute normally on unshared memories and wait operations trap when used with unshared memories. The PR updates Overview.md and makes a best-effort attempt at updating the spec, making no changes when there is already a TODO for updating spec text. It also adds new TODO comments to the reference interpreter where changes will have to be made.
We have decided on a path forward for this and updated the spec proposal accordingly, so I will close this issue. |
Summary: The WebAssembly backend automatically lowers atomic operations and TLS to nonatomic operations and non-TLS data when either are present and the atomics or bulk-memory features are not present, respectively. The resulting object is no longer thread-safe, so the linker has to be told not to allow it to be linked into a module with shared memory. This was previously done by disallowing the 'atomics' feature, which prevented any objct with its atomic operations or TLS removed from being linked with any object containing atomics or TLS, and therefore preventing it from being linked into a module with shared memory since shared memory requires atomics. However, as of WebAssembly/threads#144, the validation rules are relaxed to allow atomic operations to validate with unshared memories, which makes it perfectly safe to link an object with stripped atomics and TLS with another object that still contains TLS and atomics as long as the resulting module has an unshared memory. To allow this kind of link, this patch disallows a pseudo-feature 'shared-mem' rather than 'atomics' to communicate to the linker that the object is not thread-safe. This means that the 'atomics' feature is available to accurately reflect whether or not an object has atomics enabled. As a drive-by tweak, this change also requires that bulk-memory be enabled in addition to atomics in order to use shared memory. This is because initializing shared memories requires bulk-memory operations. Reviewers: aheejin, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79542
Summary: The WebAssembly backend automatically lowers atomic operations and TLS to nonatomic operations and non-TLS data when either are present and the atomics or bulk-memory features are not present, respectively. The resulting object is no longer thread-safe, so the linker has to be told not to allow it to be linked into a module with shared memory. This was previously done by disallowing the 'atomics' feature, which prevented any objct with its atomic operations or TLS removed from being linked with any object containing atomics or TLS, and therefore preventing it from being linked into a module with shared memory since shared memory requires atomics. However, as of WebAssembly/threads#144, the validation rules are relaxed to allow atomic operations to validate with unshared memories, which makes it perfectly safe to link an object with stripped atomics and TLS with another object that still contains TLS and atomics as long as the resulting module has an unshared memory. To allow this kind of link, this patch disallows a pseudo-feature 'shared-mem' rather than 'atomics' to communicate to the linker that the object is not thread-safe. This means that the 'atomics' feature is available to accurately reflect whether or not an object has atomics enabled. As a drive-by tweak, this change also requires that bulk-memory be enabled in addition to atomics in order to use shared memory. This is because initializing shared memories requires bulk-memory operations. Reviewers: aheejin, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79542
WebAssembly/threads#144 updated the WebAssembly threads proposal to make atomic operations on unshared memories valid. This change updates the feature checking in the linker accordingly. Production WebAssembly engines have recently been updated to allow this behvaior, but after this change users who accidentally use atomics with unshared memories on older versions of the engines will get validation errors at runtime rather than link errors. Differential Revision: https://reviews.llvm.org/D79530
WebAssembly/threads#144 updated the WebAssembly threads proposal to make atomic operations on unshared memories valid. This change updates the feature checking in the linker accordingly. Production WebAssembly engines have recently been updated to allow this behvaior, but after this change users who accidentally use atomics with unshared memories on older versions of the engines will get validation errors at runtime rather than link errors. Differential Revision: https://reviews.llvm.org/D79530
Expects the runtime will support atomic access, even in single-threaded mode. See WebAssembly/threads#144 for more.
Since #144, we've specified that `memory.atomic.notify` is allowed to be executed on unshared memories, but always returns 0 since it is impossible for there to be anything waiting on an unshared memory since all variations of the wait instructions in Wasm and JS trap or throw exceptions on unshared memories. However, I realized recently that there is no technical reason why [`Atomics.waitAsync`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics/waitAsync) shouldn't be able to wait on an unshared memory. Because `Atomics.waitAsync` returns a promise rather than blocking, it can in principle be used to synchronize between multiple asynchronous tasks in flight concurrently in the same JS context, and now that [JSPI](https://github.com/WebAssembly/js-promise-integration/blob/main/proposals/js-promise-integration/Overview.md) is becoming more real, this actually might be a useful thing to do. In particular `Atomics.waitAsync` could be used to build suspending versions of mutexes and condition variables to synchronize between threads running as different asynchronous tasks within a single JS context. Actually realizing this vision of asynchronous threading with JSPI on top of `Atomics.waitAsync` requires changes on the JS side to allow `Atomics.waitAsync` and `Atomics.notify` to be used on unshared ArrayBuffers, but in the meantime we can prepare for that change on the Wasm side by softening the language stating that `memory.atomic.notify` unconditionally returns 0 for unshared memories.
Since #144, we've specified that `memory.atomic.notify` is allowed to be executed on unshared memories, but always returns 0 since it is impossible for there to be anything waiting on an unshared memory since all variations of the wait instructions in Wasm and JS trap or throw exceptions on unshared memories. However, I realized recently that there is no technical reason why [`Atomics.waitAsync`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics/waitAsync) shouldn't be able to wait on an unshared memory. Because `Atomics.waitAsync` returns a promise rather than blocking, it can in principle be used to synchronize between multiple asynchronous tasks in flight concurrently in the same JS context, and now that [JSPI](https://github.com/WebAssembly/js-promise-integration/blob/main/proposals/js-promise-integration/Overview.md) is becoming more real, this actually might be a useful thing to do. In particular `Atomics.waitAsync` could be used to build suspending versions of mutexes and condition variables to synchronize between threads running as different asynchronous tasks within a single JS context. Actually realizing this vision of asynchronous threading with JSPI on top of `Atomics.waitAsync` requires changes on the JS side to allow `Atomics.waitAsync` and `Atomics.notify` to be used on unshared ArrayBuffers, but in the meantime we can prepare for that change on the Wasm side by softening the language stating that `memory.atomic.notify` unconditionally returns 0 for unshared memories.
Summary: The WebAssembly backend automatically lowers atomic operations and TLS to nonatomic operations and non-TLS data when either are present and the atomics or bulk-memory features are not present, respectively. The resulting object is no longer thread-safe, so the linker has to be told not to allow it to be linked into a module with shared memory. This was previously done by disallowing the 'atomics' feature, which prevented any objct with its atomic operations or TLS removed from being linked with any object containing atomics or TLS, and therefore preventing it from being linked into a module with shared memory since shared memory requires atomics. However, as of WebAssembly/threads#144, the validation rules are relaxed to allow atomic operations to validate with unshared memories, which makes it perfectly safe to link an object with stripped atomics and TLS with another object that still contains TLS and atomics as long as the resulting module has an unshared memory. To allow this kind of link, this patch disallows a pseudo-feature 'shared-mem' rather than 'atomics' to communicate to the linker that the object is not thread-safe. This means that the 'atomics' feature is available to accurately reflect whether or not an object has atomics enabled. As a drive-by tweak, this change also requires that bulk-memory be enabled in addition to atomics in order to use shared memory. This is because initializing shared memories requires bulk-memory operations. Reviewers: aheejin, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79542
WebAssembly/threads#144 updated the WebAssembly threads proposal to make atomic operations on unshared memories valid. This change updates the feature checking in the linker accordingly. Production WebAssembly engines have recently been updated to allow this behvaior, but after this change users who accidentally use atomics with unshared memories on older versions of the engines will get validation errors at runtime rather than link errors. Differential Revision: https://reviews.llvm.org/D79530
It is currently specified that attempting to use atomic access operators on non-shared linear memory is a validation error. This validation requirement causes a lot of friction for toolchain developers and users.
For example, consider a user wishing to distribute a WebAssembly binary for a C++ library. Say that library uses C++11 atomics internally to achieve thread-safety, but is otherwise agnostic to whether it is used in a single threaded environment or a multithreaded environment. On native platforms a single build of the library can be used in both environments, but due to the validation rules for WebAssembly atomics, the library must be built twice for WebAssembly; once with atomics enabled and once without. Beyond causing extra work for developers, having to produce these separate builds is also a large hole in WebAssembly's portability story.
One solution to this would be for all applications to use shared memories whether they are multithreaded or not. But this is not practical because not all engines support threads and not all engines are expected to support threads in the future.
A better solution would be to relax the validation rules to allow atomic operations to be used with non-shared memories. The semantics for atomic loads, stores, rmw operations, and cmpxch operations are straightforward to extend to unshared memories. There are more possibilities that might make sense for wait and notify operations, but I suggest that waiting on an unshared memory always trap and notify on an unshared memory always return 0.
The text was updated successfully, but these errors were encountered: