Reintroducing exnref #280
Comments
Also note: there is an item on the August 1st meeting agenda to discuss EH. https://github.com/WebAssembly/meetings/blob/main/main/2023/CG-08-01.md |
Slides from today's meeting: https://docs.google.com/presentation/d/15LAJ7VfE_68mgolMgeg9AEJbrrEGoNkEOXhG1J6_5W0/edit#slide=id.p |
Thank you for the presentation today! I hadn't been following this conversation. I implemented the exception-handling support in wasm2c. One of the things that made that easier is the (current) lexical scope for storing caught exceptions. Right now wasm2c doesn't even copy a caught exception onto the stack unless that particular caught exception is the target of a later rethrow, which is easy to identify once you get to the rethrow instructions. Option A and Option B seem to make life a bit harder for consumers like wasm2c, in that I think in practice they'd require that all caught exceptions will have to be stored somewhere with indefinite lifetime and dynamically known size. So I think there would now need to be a malloc at the start of every sequence of catches, and the runtime would need some sort of GC or a system of refcounting + an API for host functions to honor the refcounts when copying or storing these types. Which all feels less-than-ideal. How would you and others feel about an "Option C," which roughly speaking would look something like this?
I'm definitely not an expert in the formal semantics of exception handling, but if something like this could be made to work, it seems like the benefits would be things like:
Curious whether something like this is feasible. |
I think this specific concern is something that could be addressed by the variant of Option A that @titzer mentioned, where the existing "exnref-less" I think an instruction-set with an explicit deallocation instruction for |
Thinking about how we'd implement It would be very easy to implement some added instructions like:
It seems like these could be layered nicely on the current design without needing to deprecate anything. Re: last comment, I don't think I understand the value of |
Sometimes we'll want to capture a long-lived reference to an exception that was caught in a regular |
I was thinking in a similar vein. Without GC objects, exceptions can only contain primitive types, and I don't think this would be very different from handling multivalue types, in case wasm2c already supports them. As you said, this is not gonna support the
As @conrad-watt said, By the way I'm wondering, is the number of |
The current C++ implementation needs
  try
    ... running destructors ...
  catch_all
    rethrow
  end
The other case is in
  try {
    ...
  } catch (int) {
  }
becomes something like
  try
    ...
  catch $cpp_tag
    if the thrown value is of type `int`
      do stuff
    else
      rethrow
  end
So even if we have two different |
I got to talk today with @aheejin, @conrad-watt, and @titzer, about the implementability of exnref in wasm2c, and probably in other "offline" VMs that also lack GC. The conclusion seemed to be that we agree there's a workable outcome where (for purposes of the current proposal):
If the spec ends up this way, then I think implementations would be able to treat exnref as an "exception sum type," i.e., as a POD type big enough to contain any exception contents that gets copied when needed, and can avoid a dependency on GC. All of these things might then be lifted in a future proposal. I hope that summarizes accurately! (My personal preference would probably still be to maintain the status quo, both because it's already implemented and because I think it will perform marginally faster, but I think as far as wasm2c and perhaps other offline VMs are concerned, the above wouldn't be a major hardship.) |
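To make the "exception sum type" idea above concrete, here is a minimal sketch in C of a POD value that carries a tag plus an inline copy of the payload and is copied by plain assignment. Everything in it is an assumption for illustration: the names `exn_value`, `exn_make`, `exn_payload_i32`, and the bound `MAX_EXN_PAYLOAD` are hypothetical and are not taken from wasm2c or from the proposal.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound on the payload size of any tag in the module.
   An offline translator sees every tag type, so it could compute this. */
#define MAX_EXN_PAYLOAD 16

/* Hypothetical "exception sum type": a plain-old-data value that holds a
   tag identifier and an inline copy of the payload. There is no heap
   allocation and no reference count; copying the struct copies the
   exception. */
typedef struct {
  uint32_t tag;                     /* which exception tag was thrown */
  uint32_t size;                    /* payload bytes actually in use  */
  uint8_t payload[MAX_EXN_PAYLOAD]; /* inline copy of the payload     */
} exn_value;

/* Build an exception value by copying the payload into the struct. */
static exn_value exn_make(uint32_t tag, const void *data, uint32_t size) {
  exn_value e;
  assert(size <= MAX_EXN_PAYLOAD);
  memset(&e, 0, sizeof e);
  e.tag = tag;
  e.size = size;
  memcpy(e.payload, data, size);
  return e;
}

/* Read the payload back as a single i32 (valid only for tags whose
   payload starts with an i32). */
static int32_t exn_payload_i32(const exn_value *e) {
  int32_t v;
  assert(e->size >= sizeof v);
  memcpy(&v, e->payload, sizeof v);
  return v;
}
```

Because the value is self-contained, storing it in a local or passing it to a host function is an ordinary struct copy, which is what lets this approach avoid a GC dependency. The cost is that every copy moves `MAX_EXN_PAYLOAD` bytes even for small payloads.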
Thanks for doing this! I would recommend taking Option B a bit further: have two instructions |
@eqrion previously had a very similar idea, where he spelled it out like this:
I do like keeping the |
Works for me. Besides complicating parsing, my main concern with that was that the order of Does the Neither of these are big items for me; bringing the labels up front is the item I care more about. |
I was thinking more about the restrictions meant to avoid the need for reference counting in wasm2c that you were mentioning in #280 (comment). Looking forward a bit, whatever solution we end up with for stack switching will need reference counting anyway, and I don't think there will be any workarounds to avoid it. One option would be to plan to lift the restrictions in the stack switching proposal, but since we know we will need reference counting in wasm2c at some point anyway, would it make sense to bite the bullet and implement it along with exception handling? That would avoid the need for the restrictions in the first place. |
@tlively IIUC the floated restrictions to More general point, I think there's a feature threshold after which it just makes sense for a runtime to go ahead and implement a somewhat general GC, at which point there's no reason for that runtime to specially restrict EDIT: I should add that on the face of things, it seems like |
@tlively Unfortunately I haven't thought of a great way to do reference-counting in wasm2c (and similar offline VMs) without a lot of overhead. Currently our trap handling is just "longjmp to the trap handler," and our throw handling is "goto or longjmp to the nearest enclosing try block landing pad," both of which are pretty easy. The lexical rethrow in the current proposal is great for us. With refcounting, we'd need a way to unwind the stack and decrement the refcount of every stack variable, param, and local. I can't think of a zero-cost way to do that in C. It would be expensive to put a setjmp at the beginning of every scope. (And I think people would prefer to avoid switching to "wasm2c++" if possible even though that would give us destructors and zero-cost exceptions...) I'm not super-familiar with what's been happening with the stack-switching proposal, but my guess is that wasm2c will probably implement the GC proposal first and take a dependency on Boehm GC for that. But I don't want users to have to run Boehm GC just to support exception-handling. (For our own use cases, we really want deterministic memory consumption and also exceptions...) |
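For readers unfamiliar with the "longjmp to the nearest enclosing try block landing pad" approach described above, the following is a rough sketch in C, not wasm2c's actual generated code. The names `current_pad`, `throw_i32`, `callee`, and `try_callee` are hypothetical, and the sketch assumes a single i32 payload and always-caught exceptions.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdint.h>

/* Hypothetical runtime state: the innermost active landing pad and the
   exception currently in flight. */
static jmp_buf *current_pad = 0;
static uint32_t thrown_tag;
static int32_t thrown_value;

/* Throw: record the exception, then jump to the nearest landing pad. */
static void throw_i32(uint32_t tag, int32_t value) {
  assert(current_pad != 0); /* sketch only: uncaught throws not handled */
  thrown_tag = tag;
  thrown_value = value;
  longjmp(*current_pad, 1);
}

/* A function that throws tag 1 on negative input. */
static int32_t callee(int32_t x) {
  if (x < 0) throw_i32(1, x);
  return x * 2;
}

/* Rough equivalent of: try (call callee) catch $tag1 (return payload). */
static int32_t try_callee(int32_t x) {
  jmp_buf pad;
  jmp_buf *outer = current_pad; /* remember the enclosing try block */
  int32_t result;
  current_pad = &pad;
  if (setjmp(pad) == 0) {
    result = callee(x);  /* try body */
    current_pad = outer; /* leave the try scope normally */
    return result;
  }
  /* Landing pad: restore the enclosing pad, then run the handler. */
  current_pad = outer;
  return thrown_value;
}
```

Note that the unwinding here only restores the pad pointer. Nothing visits the locals of the frames that were skipped, which is why adding per-value refcount decrements to this scheme would need extra bookkeeping or a `setjmp` at every scope.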
I think our best bet to get @keithw Just gaming out the "mini GC" idea (which I am not suggesting you do right now, just brainstorming): you could keep a separate stack of |
@keithw A longer-term solution that would also help with supporting exceptions more efficiently would be to embed a description of the stack in the stack itself. That is, when translating an "interesting" wasm function to C, have a |
@RossTate Unfortunately we are in a constraint system that does not admit significant redesign of the exception handling mechanism, so we'll have to stay focused on a very minimal change to make |
I'm unsure what you're responding to. My previous comment was offering @keithw a suggestion on how to support reference counting and exception handling, just as you did. Though I forgot to tag him, so maybe that caused confusion. |
It might be helpful to have a broader discussion about the deprecation plan for "legacy" (v3) exceptions. My concern would be that the major browsers will never deprecate them (similar to what's happened with the legacy text instruction names, e.g. "get_local"), and implementations that conform to the spec alone will receive a lot of support requests/user complaints. Or that many consumers will continue supporting v3 exceptions from an outside-the-spec document/testsuite that's only semi-maintained. One option would be to really treat this as a first-class exercise in explicit deprecation, and fork the v3 proposal into something called (e.g.) "legacy-exceptions" that remains a live Phase 3 proposal, even as "exception-handling" is merged into the spec. And then there can be a subsequent explicit process to deprecate "legacy-exceptions" once the major browsers are willing to break compatibility, and all the consumers can do so at roughly the same time. At the minimum, it would be nice if tests for "v3" exceptions remained in the WebAssembly/testsuite repository and were subject to continuous improvements/contributions (e.g. adding tests for lexical rethrow and interactions with branches and exceptions) as long as consumers are expected to support them. |
There is a very real possibility that we will never be able to remove phase 3 exceptions from the web, at least in the near future. Even if we went to phase 4 today and all browsers were to implement it tomorrow, browsers whose release schedule is coupled with the OS take quite a bit longer to get major updates, and existing users generally want to target all browsers if they can. |
Sigh. :-/ I think if the browsers are going to keep supporting "legacy-exceptions" forever (and if some producers keep generating them so that their output works on legacy browsers) then I suspect people are going to keep wanting WABT to be able to parse, write, and validate these non-normative instructions. Which doesn't totally spark joy if we have to support this zombie proposal forever. (Maybe we can drop support in the interpreter and wasm2c at least.) I recognize that the train is heading in this direction, and I don't want to speak too loudly because we'll be fine no matter what (wasm2c can implement the "GC-less" profile of exnref as discussed above). But procedurally, it seems like it would be helpful if the participants who object to lexical rethrow could register those objections in a place they can be discussed. I don't think this has been captured publicly so far. From WABT's perspective, I was surprised that lexical rethrow is a sticking point; WABT implemented EH in the interpreter in 2021 (WebAssembly/wabt#1749), and in wasm2c in 2022 (WebAssembly/wabt#1930), and I think this proposal's reference interpreter has had an implementation for a while too, and obviously the browsers + Deno + nodejs did too. I hadn't thought any of these were considered to be a particularly heavy lift at the time. I'm very sympathetic to the desire to be able to asyncify and do other transformations, but if several implementations are really keeping "legacy-exceptions" forever, then it does seem cleaner to layer things on top (e.g. a future proposal that adds |
Hi Keith,
The original issue above had a list of problems that the lack of That said, I interpret (no pun intended :-)) that part of what you're asking above is "Other interpreters and engines implemented lexical rethrow. What is the issue?" And indeed, in-place interpreters like Wizard's can support lexical rethrow at some cost. I'll explain that here for posterity. Again, please weigh the explanation by the fact that I think all the other problems that lexical rethrow gives rise to are far more important. But for completeness' sake:
---> Lexical rethrow storage needs explicit modeling in in-place interpretation
Lexical rethrow introduces a new form of storage into Wasm's computational model of a function activation. In particular, in addition to the value stack and control stack, there is a new, implicit caught exception stack that stores the exception packages (implicit An in-place interpreter for Wasm has to model the caught exception stack one way or another and thus maintain additional state. There are basically only two ways of doing this: a completely separate data structure, i.e. an explicit stack of caught exceptions with a separate stack pointer, or pre-allocated space computed by verification, similar to the sidetable used for control flow.
The first option, a completely separate stack and stack pointer, actually impacts all regular control flow, because a branch (of any kind) that leaves a scope can now implicitly pop exceptions from the caught exception stack, and so needs to adjust this additional stack pointer (another entry and adjustment in the side table, if you will). But, also, quite unfortunately, the
---> Wizard does implement Phase 3 with a leaky trick
I implemented Phase 3 EH in Wizard, but I don't use the above strategy. Instead, I cheated a little and pre-allocate enough space between the locals and the operand stack to store what are effectively Again, please don't take the above as the tail that wags the dog. In-place interpretation is the least of our problems here, but is very indicative of something weird going on. When I had discussed this with @rossberg some months ago, he pointed out that basically the same kind of weirdness pops up in the formal spec; there's additional storage that gets introduced into the semantics (i.e. an additionally indexable environment). I've phrased it differently elsewhere, but the above amounts to: |
Why bother with wasm2c? Why bother with so-called "zero-cost" EH when it is never truly zero-cost? I can guarantee this EH proposal will finally die out if you keep following this path. |
@trcrsired if you have questions or concerns about the EH proposal in its current form, can you rephrase them so that people can more easily respond in good faith? (see also this comment) If you're venting about the repeated changes to the proposal, we're sorry that this has caused you disruption. We don't expect to make further semantic changes, and I hope the discussions above give some context for you to engage with. Again, if you're not happy with the current state of the proposal, you can discuss this here, so long as you follow the W3C's code of ethics and professional conduct. |
Firefox has now implemented the current version of the spec behind the pref |
FYI Firefox 131 ships support for Is there an explainer/examples of how this feature is supposed to be used? Based on what is in the Tables section it sounds like references (of any type) are stored in tables. Tables are visible in both Wasm and the shared environment (i.e. JavaScript), and provide a mechanism for efficiently sharing access to functions, exceptions, and other things? I.e. in this case a table allows an exception to easily/safely propagate to JavaScript (say). The docs don't really make it clear why this is needed: exceptions seem to propagate happily if tables aren't defined. It is clear that the docs are out of date in that they say that there are only funcref in tables, and exnref (and perhaps externref) now appear to be supported. Is there some expert who might be able to help me update/create developer-readable documentation for the API on MDN? |
In a typical use case for
See #158 (comment) for a discussion on the |
In the 2020-09-15 CG meeting (https://github.com/WebAssembly/meetings/blob/main/main/2020/CG-09-15.md), the EH proposal was changed to remove exception packages as first-class values. Since that time, this proposal was advanced to phase 3, implemented in the reference interpreter, in toolchains (LLVM and Binaryen), and in all 3 web engines. Web browsers have subsequently enabled the feature by default. Today, there are web properties and applications that use the feature, and thus there are binaries in the wild that use it for C++ exception handling.
Given those constraints, making any change to the proposal requires careful thought and consideration, as any change we do make could have a potential disruptive effect on producers, toolchains, and engines, and even deployed applications on the web. This proposal has also undergone a number of changes and long design discussions over the years that have taken a lot of time and energy to work through and resolve.
That said, offline discussions have led me to propose that we find a deft way to reintroduce exnref to this proposal, within a set of hard constraints. I am led to this because exnref solves a number of problems that weren’t fully anticipated when it was removed and were only encountered after-the-fact.
Problems
Several issues were identified that are either directly related to lack of exnref or related to the lexical rethrow construct introduced to avoid exnref, which adds a new form of storage to Wasm.
Opportunities
Reintroducing exnref addresses the above problems and allows us to further improve the EH proposal.
Constraints
We’re operating under a number of tight constraints and requirements.
Related issues/links
“Ambiguity/identity loss when (re)throwing external exceptions” #207
“Should wasm code be able to extract the externref from exceptions thrown from js?” #202
“Clarify how exception identity is tracked” #242
“Update JS API to better specify opaqueData exception identity” #250
“Issues discussed in J2CL” #158