-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Fix dyncalls api #13289
RFC: Fix dyncalls api #13289
Conversation
I've not fully grokked the entire thing here but I think am generally supportive of this direction. Regarding (5) I don't think it was my intention when adding I would however like to see this work split up into its constituent parts if possible. For example, can we add (4) as a separate, and straight forward PR that optimizes the dynCall mechanism? I'd rather avoid adding a the WASM_BIGINT=2 option if possible. For users who don't want access to the dynCall_xx functions can we create new setting instead? something that is called something more meaningful like NO_GENERATE_DYNCALL_HELPERS (but maybe not that long :). Whether or not we generate these helper functions seems orthogonal to whether we have bigint support in the JS engine. Also, I find ternary options that take 0,1 or 2 much hard to reason about for some reason about I'd rather avoid adding new ones if we can. |
@@ -175,7 +175,7 @@ var EXPECT_MAIN = 1; | |||
// MODULARIZE, and returned from the factory function. | |||
var EXPORT_READY_PROMISE = 1; | |||
|
|||
var USE_LEGACY_DYNCALLS = 0; | |||
var USE_LEGACY_DYNCALLS = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intended this setting to mean "use the dyncalls that are generated by binaryen". Perhaps we don't need those add anymore after thise change? In nay case can you add a comment here so its clear what the new meaning is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that may be needed in some form or the other. Let me see what form the code takes after test suite passes.
@@ -401,7 +432,7 @@ def finalize_wasm(infile, outfile, memfile, DEBUG): | |||
args.append('-g') | |||
if shared.Settings.WASM_BIGINT: | |||
args.append('--bigint') | |||
if shared.Settings.USE_LEGACY_DYNCALLS: | |||
if True:##shared.Settings.USE_LEGACY_DYNCALLS: | |||
# we need to add all dyncalls to the wasm | |||
modify_wasm = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you want to remove this entire if/else, and we can also remove the all the dynCall generation from binaryen? Or at least that would be really cool if we could.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a user-facing option WASM_DYNCALLS, then this might be needed when that is enabled.
@@ -0,0 +1,381 @@ | |||
#include <emscripten.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In test_core.py this seems to be called test_dyncalls.c
? Not sure how this is working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a separate benchmark file. It is now out of date after some revision, I'll update this to latest when I get a chance, to make sure it tests the latest form of the API.
@@ -123,6 +123,9 @@ function loadWasmModuleToWorkers() { | |||
/*** ASM_MODULE_EXPORTS_DECLARES ***/ | |||
#endif | |||
|
|||
/*** STATIC_DYNCALL_SIG_FUNCTIONS ***/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR but other places where do late substitution I've been using <<< ... >>>
to avoid confusion with jsifier's {{{ ... }}}
. Should we switch these to use the same pattern?
Seems like it would be good to distinguish comments from substitution patterns, and also will correctly error out of we forget/fail to make the substitution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah maybe <<< >>> is good, I'll have that in a followup. Although now I wonder if I need this at all. I'll see after I get through the test suite.
src/library_dyncall.js
Outdated
$dynCall: function(sig, funcPtr, args) { | ||
// For int64 signatures, use the dynCall_sig dispatch mechanism. | ||
if (sig.includes('j')) { | ||
return {{{getDynCaller('sig')}}}.apply(null, [funcPtr].concat(args)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The convention in other library files seem to be to have white space inside the {{{ ... }}}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can use just implement this in terms if bindDynCall
above?
$dynCall: function(sig, funcPtr, args) { return bindDynCall(sig, funcPtr).apply(null, args); }
I imagine it would be a slight perf loss but a code size win, and I don't think the generic runtime dynCall like this should be on anyones critical path, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is reason to make this code suboptimal, it is just a few lines. But added a variant in -Os/-Oz to reuse the code to make it smaller.
src/library_dyncall.js
Outdated
@@ -0,0 +1,67 @@ | |||
mergeInto(LibraryManager.library, { | |||
{{{ (function() { global.wbind = function() { return SHRINK_LEVEL == 0 ? 'wbind' : 'wasmTable.get'; }; return null; })(); }}} | |||
{{{ (function() { global.getDynCaller = function(sig) { return MINIMAL_RUNTIME ? `dynCalls[${sig}]` : `Module["dynCall_${sig}]`; }; return null; })(); }}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty hard for me to grok.. we have a IIFE embedded in a {{{ .. }}}
. Maybe there is some simpler way to express this use parseTools.js
helpers and normal #if/#endif
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a way to get around the lack of #define
support in our preprocessing, since there was never consensus to migrate to using a real preprocessor. Basically this lets have one place where #if is performed, to avoid having to sprinkle the code with several #if #endifs.
Moved to src/parseTools.js.
@@ -37,14 +37,45 @@ | |||
WASM_INIT_FUNC = '__wasm_call_ctors' | |||
|
|||
|
|||
def make_wasm_table_static_dyncaller(func): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is only used in minimal runtime? Don't we want to always generate these helpers on the JS side (when the user asks for them or needs them)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the intent is not to be minimal runtime only, but I just started looking at this in minimal runtime first. This may probably not be needed after all, I'll delete it soon if not.
I think that for non-int64 signatures, the new table-based calling mechanism can already take over. That is assuming my understanding that the table-based calling mechanism has been supported by Wasm VMs since the MVP even? Or is there some Wasm VM version where the table-based calling mechanism would not work on older Wasm implementations? If there is a version support difference between dynCalls and wasmTable.get() on non-int64 signatures, then it can get trickier, I guess it would depend on when wasmTable -based machinery became available. But I believe that is not the case, and support was already in MVP(?)
I can certainly look at that - I wanted to first make sure I solve the whole problem before proposing partial PRs, to not run in a deadend half-way.
Something like I suppose at this point before I proceed further, the question is whether the "Emscripten wasm function pointer calling API" that I am proposing we'd put forward is good? You can see the different code usage snippets in https://github.com/emscripten-core/emscripten/pull/13289/files#diff-7edea7d5dbf970f71db122fcaaf14d6458c48594c1a6366a9ee37216b9b73304R5 In particular I am proposing the I would also like to put forward that we should probably deprecate |
else: | ||
for x in exports_that_are_not_initializers: | ||
if x.startswith('dynCall_'): | ||
static_dyncall_sig_functions += make_wasm_table_static_dyncaller(x) + '\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we are generating these dynCall_ function on demand (based on them being added explicitly to the export list by the user). If so that why would we ever want to add an option to disable them? Isn't there already no cost assuming you don't export any of them?
Can we land this part of the change now or get support for JS-side dynCall_xx
functions? Hopefully this can be independent of any new APIs or options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be ok with always generating them all, as long as optimizer DCEs the ones that are not invoked. In Unity we probably still want to export dynCall
to Module
, but the other dynCall_sig
items would not be exported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general the direction seems reasonable to me. I think it may be possible to split this into smaller PRs, though. I can take a closer look once tests pass, as you said in a comment.
exports_that_are_not_initializers += ['dynCalls = {}'] | ||
else: | ||
for x in exports_that_are_not_initializers: | ||
if x.startswith('dynCall_'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this assumes that dynCall_
functions are exported from the wasm? That isn't the case for WASM_BIGINT
or otherwise when we do not legalize, and is definitely not the case when wasm-emscripten-finalize
does not write out any changes at all to the wasm (then the wasm is identical to what LLVM emitted).
I think we need to make wasm-emscripten-finalize
print out the list of function pointer types in the metadata.
Closing this to bring the functionality in partial steps, like discussed. |
This PR works towards fixing several problems introduced with addition of WASM_BIGINT and #12059 (see also #12733):
dynCall_sig
functions for invoking a wasm function pointer, breaking JS library, EM_ASM, EM_JS and external JS code that used dynCalls.This PR restores the
dynCall_sig
(and in classic runtime,Module['dynCall_sig']
) functions. When using these functions, one always passes in parameters using a pair of i32s.but when one changes to build with WASM_BIGINT=1, one must rewrite all int64 dyncalling JS code to instead use
The problem with this change is that it is global/universal to the whole build, and it is bidirectionally breaking: there is no way to write backwards or forwards compatible code (in JS libraries one could use #if WASM_BIGINT, but that would not work in EM_ASM, EM_JS or external JS).
To explain the scenario in more detail: In Unity builds there exists JS code written by three different parties: us (Unity engine developers), game asset/plugin/library/middleware developers and game developers. Each of these may be using JS dynamic calls to "call back" to the Unity engine. Because this code is written and maintained by different authors at different times, there is no good way for us to migrate to WASM_BIGINT=1 as it is currently laid out, since flipping the switch would break all existing dynCalling JS code, and we cannot coordinate everyone to fix up their code all at once. How can we migrate?
Instead, this PR resolves the problem by offering a backwards compatible mechanism of invoking new wasmTable BigInt-based signatures that does not break existing users of the old dynCalls. That is, we shall not break the existing
dynCall()
ABI, but offer a new (shinier) one for the new table-based ABI.This PR fixes the issue by avoiding extra BigInt-related code being generated when WASM_BIGINT==0 is used.
wasmTable.get(funcPtr)
is slow, and @kripken came up with the idea of mirroring the wasm table on the JS side. This PR implements that mirror and adds a benchmark, where it is observed that the mirror does improve performance quite well.In order to fix 2&3, instead of breaking the existing "32-bit ABI" in
dynCall
that uses two i32 pairs calling convention for i64 params, new functionswbind
("wasm bind") andwbindArray
are introduced to use the "64-bit ABI" where i64 params are specified as BigInt. This way existing "legacy" code can keep usingdynCall()
s and keep working, and new code can usewbind()
to migrate to 64-bit ABI.A new build option WASM_BIGINT=2 is added, which is like WASM_BIGINT=1, but omits the support for the old
dynCall()
ABI. That is, the build modes are:dynCall()
anddynCall_sig()
to invoke wasm functions, any int64 params being passed using two i32 pairs. Can usewbind()
to invoke wasm functions, but only for functions that do not have a int64 in their signature.dynCall()
anddynCall_sig()
to invoke wasm functions, any int64 params being passed using two i32 pairs. Can usewbind()
to invoke wasm functions, any int64 params being passed using BigInts.dynCall()
anddynCall_sig()
are not emitted in the build. Must usewbind()
to invoke wasm functions, using BigInt or any int64s in the signature.With this scheme, users who want to keep building code that may be using
dynCall()
anddynCall_sig()
to invoke int64 wasm functions can build with WASM_BIGINT=0 and pay no code size for addition of WASM_BIGINT support; or they may build with WASM_BIGINT=1 to enable building with a transitional mechanism that enables using BigInt()s for int64.Users who want to only support BigInt, and do not want to pay code size for any of the legacy "32-bit ABI" or
dynCall()
s, can use WASM_BIGINT=2, and always usewbind()
to invoke wasm functions.