Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to instantiate WASM module without calling the start function #3062

Closed
wants to merge 6 commits into from

Conversation

daxpedda
Copy link
Collaborator

@daxpedda daxpedda commented Sep 3, 2022

Currently it's impossible to properly implement WASM multi-threading as a library without the user having to use some workaround to prevent main to be called.

I was thinking of just adding a third parameter to the init function, not sure if that's a breaking change or not (I don't believe so). Happy to try other approaches to solve this issue.

Fixes #2295.

@daxpedda daxpedda force-pushed the init-without-start branch 4 times, most recently from 9ff706f to 4556d91 Compare September 3, 2022 16:28
Copy link
Collaborator

@Liamolucko Liamolucko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__wbindgen_start doesn't just call main; it also does stuff like initializing the stack and thread-local storage, that need to be called on every thread. So, you'll have to split those runtime-initialization things that need to be called by every thread out into a different function.

This should be the relevant part to change:

fn add_start_function(&mut self, id: FunctionId) -> Result<(), Error> {
if self.start_found {
bail!("cannot specify two `start` functions");
}
self.start_found = true;
// Skip this when we're generating tests
if !self.support_start {
return Ok(());
}
let prev_start = match self.module.start {
Some(f) => f,
None => {
self.module.start = Some(id);
return Ok(());
}
};
// Note that we call the previous start function, if any, first. This is
// because the start function currently only shows up when it's injected
// through thread/externref transforms. These injected start functions
// need to happen before user code, so we always schedule them first.
let mut builder = walrus::FunctionBuilder::new(&mut self.module.types, &[], &[]);
builder.func_body().call(prev_start).call(id);
let new_start = builder.finish(Vec::new(), &mut self.module.funcs);
self.module.start = Some(new_start);
Ok(())
}

It's called for both main and #[wasm_bindgen(start)] functions.

@daxpedda
Copy link
Collaborator Author

daxpedda commented Sep 6, 2022

I didn't realize this, which caused me to encounter a lot of bugs that I couldn't really explain. Thanks for pointing that out!

I hope I got this right:
I changed add_start_function() to just add a new exported function called __wbindgen_main instead of modifying the start section. This leaves the initialization code untouched to be moved later to __wbindgen_start by unstart_start_function().

Tested and works great!

@daxpedda daxpedda force-pushed the init-without-start branch 2 times, most recently from 58d9777 to ed22c4c Compare September 6, 2022 12:53
@daxpedda daxpedda requested a review from Liamolucko September 6, 2022 13:42
@alexcrichton
Copy link
Contributor

Personally I would prefer to avoid tacking on additions to try to get a comprehensive multi-threading story when there's no actual comprehensive solution to multi-threading right now and I'm not sure that anyone's really "driving" the design. AFAIK the current support implemented by wasm-bindgen is a sort of "local maximum" which while it's pretty unusable that's somewhat reflective of multithreading on the web at large.

I didn't realize this, which caused me to encounter a lot of bugs that I couldn't really explain

I don't personally have time to really dig in and review this, but at least to me this worries me. Setting up stacks and TLS and such are pretty basic things with multi-threading and if this went wrong in the first iteration I suspect that this needs a pretty deep and thorough review to ensure correctness.

@daxpedda
Copy link
Collaborator Author

daxpedda commented Sep 6, 2022

It took me quiet some time to figure out what's going on, but I don't think I'm introducing a deep change by any stretch of the imagination.
Of course there could still be a bug that affects the major use-case of wasm-bindgen. So obviously a thorough review is necessary.

I would also like to mention that I'm working on a library like wasm_thread that should make things very usable, I'm using it right now and it's working completely abstracted without me having to write a ton of cfgs. I believe this PR and #3032 are the last missing pieces.


I think everybody in the Rust Community knows and appreciates the amount of work and all the responsibilities that you have. So I totally understand if you can't find time for this ❤️.

Is there somebody else that can be asked to review this?

@Liamolucko
Copy link
Collaborator

A potential way for this to work without adding an initWithoutStart function would be for wasm-bindgen to only call main on the first thread that calls init. That would boil down to setting a byte to 1 (atomically) when the first thread starts, which all the other threads will see and skip running main.

That does have the potential to break any code that relies on the current behaviour of calling main on every call to init, though.

@daxpedda
Copy link
Collaborator Author

daxpedda commented Sep 8, 2022

If the desire is to not add another function, I think a solution that would have much less breakage is to add a third default argument to init, I assume that fewer people use weird syntax to pass arguments into init then people relying on init to always call main. Not that any of these solutions would alleviate the concerns alexcrichton was outlining.

@Liamolucko I didn't realize that you have merge rights, is this PR mergeable? Can you review this? Is there some other solution to alleviate alexcrichton's concerns?

@ranile
Copy link
Collaborator

ranile commented Sep 15, 2022

I agree with Alex here.
There's no true support for multi threading in WASM, and we should be applying bandaids to fill the hole.

As long a workaround is available, I wouldn't be in favor of merging this. Also inclusion of this function increases the JS bundle size, which is something worth considering

@daxpedda
Copy link
Collaborator Author

There's no true support for multi threading in WASM, and we should be applying bandaids to fill the hole.

I'm assuming you meant "we should not"?

I will attempt to make a case for this:

As long a workaround is available, I wouldn't be in favor of merging this.

I am currently building a library very much like wasm_thread and currently there is no workaround that I can apply in the library itself. I have to tell users not to use a start function and to manually call their main function in their index.html to start their application. This is a very bad footgun.

I myself have hit that footgun when trying to use wasm_thread and I didn't understand why I'm getting some really bad infinite loops until I discovered this issue. It also prevents users from using any kind of tool that automatically starts up their application, like trunk, wasm-pack or wasm-server-runner.

So I don't really consider there to be a viable workaround, the only available one doesn't just introduce a footgun, it also locks you out from a lot of convenient tools that you can't use anymore.

Also inclusion of this function increases the JS bundle size, which is something worth considering

Is the JS bundle the JS shim? I just measured it and you are right, it's noticeable. In a completely minimal "Hello, World!" example it's 5% both not-minimized and minimized. I must say though that in any reasonably sized project it's much less then 1%. Also initSync should have introduced an even bigger overhead.


From alexcrichton:

AFAIK the current support implemented by wasm-bindgen is a sort of "local maximum" which while it's pretty unusable that's somewhat reflective of multithreading on the web at large.

I do know that currently multithreading in the browser through WASM has a lot of limitations and issues. But I have to point out, that it is currently underused because of two issues:

I can't really change anything about the nightly issue, but I can contribute to making it usable for now. wasm_thread is a good example of what's currently possible, but it simply has too many footguns to be usable for the average user. I was able to remove all "major" footguns except the one addressed by this PR and I do consider it the last thing really necessary to make things as seamless to the user as possible.

Copy link
Collaborator

@Liamolucko Liamolucko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of this looks good, I think it's more a question of the API.

I agree that initWithoutStart feels like a bandaid. The way I'd prefer is the way I suggested where it's handled automatically, provided that that doesn't create much breakage. Failing that, I think that adding a third boolean parameter is slightly less of a bandaid, but it's still not great.

@daxpedda
Copy link
Collaborator Author

The way I'd prefer is the way I suggested where it's handled automatically, provided that that doesn't create much breakage.

I will look into it, I really don't mind this solution, it fits my needs very well.

@daxpedda daxpedda mentioned this pull request Sep 22, 2022
@daxpedda
Copy link
Collaborator Author

Replaced by #3092.

@daxpedda daxpedda closed this Sep 28, 2022
This was referenced Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to avoid extra calls to main when creating web worker from binary?
4 participants