Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EH] Add instructions for new proposal #6181

Merged
merged 15 commits into from
Dec 19, 2023
Merged

Conversation

aheejin
Copy link
Member

@aheejin aheejin commented Dec 14, 2023

This adds basic support for the new instructions in the new EH proposal passed at the Oct CG hybrid CG meeting:
https://github.com/WebAssembly/meetings/blob/main/main/2023/CG-10.md
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md

This mainly adds two instructions: try_table and throw_ref. This is the bare minimum required to read and write text and binary format, and does not include analyses or optimizations. (It includes some analysis required for validation of existing instructions.) Validation for the new instructions is not yet included.

try_table faces the same problem with the resume instruction in #6083 that without the module-level tag info, we are unable to know the 'sent types' of try_table. This solves it with a similar approach taken in #6083: this adds Module* parameter to finalize methods, which defaults to nullptr when not given. The Module* parameter is given when called from the binary and text parser, and we cache those tag types in sentTypes array within TryTable class. In later optimization passes, as long as they don't touch tags, it is fine to call finalize without the Module*. Refer to
#6083 (comment) and #6096 for related discussions when resume was added.

This adds basic support for the new instructions in the new EH proposal
passed at the Oct CG hybrid CG meeting:
https://github.com/WebAssembly/meetings/blob/main/main/2023/CG-10.md
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md

This mainly adds two instructions: `try_table` and `throw_ref`. This is
the bare minimum required to read and write text and binary format, and
does not include analyses or optimizations. (It includes some analysis
required for validation of existing instructions.) Validation for
the new instructions is not yet included.

`try_table` faces the same problem with the `resume` instruction in
 WebAssembly#6083 that without the module-level tag info, we are unable to know the
'sent types' of `try_table`. This solves it with a similar approach
taken in WebAssembly#6083: this adds `Module*` parameter to `finalize` methods,
which defaults to `nullptr` when not given. The `Module*` parameter is
given when called from the binary and text parser, and we cache those
tag types in `sentTypes` array within `TryTable` class. In later
optimization passes, as long as they don't touch tags, it is fine to
call `finalize` without the `Module*`. Refer to
WebAssembly#6083 (comment)
and WebAssembly#6096 for related discussions when `resume` was added.
@aheejin aheejin requested review from kripken and tlively December 14, 2023 01:02

bool hasCatchAll() const;

void finalize(Module* wasm = nullptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment here about the optional Module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

DELEGATE_FIELD_SCOPE_NAME_USE_VECTOR(TryTable, catchDests);
DELEGATE_FIELD_NAME_KIND_VECTOR(TryTable, catchTags, ModuleItemKind::Tag);
DELEGATE_FIELD_CHILD(TryTable, body);
DELEGATE_END(TryTable);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also need catchRefs here, which probably means we need to add a new DELEGATE_FIELD_INT_VECTOR. Without that, hashing etc. will not work well.

I think we need sentTypes as well, even though they are just cached metadata. Types can be updated in various passes, and we need to know where to find them (when we update a type, we replace it as they are immutable).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do after #6182 lands. Adding DELEGATE_FIELD_INT_VECTOR and DELEGATE_FIELD_TYPE_VECTOR for those required me to add empty definitions for them in branch-utils.h and some other places, which I think is because we didn't define DELEGATE_GET_FIELD in some places. I did some cleanup related to that in #6182.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -284,9 +284,10 @@ class SExpressionWasmBuilder {
Expression* makeTableFill(Element& s);
Expression* makeTableCopy(Element& s);
Expression* makeTry(Element& s);
Expression* makeTryOrCatchBody(Element& s, Type type, bool isTry);
Copy link
Member Author

@aheejin aheejin Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This declaration was not used

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is missing a change to module-utils.cpp to ensure the signature type is included in the type section.

Comment on lines +1489 to +1493
ArenaVector<Name> catchTags;
// catches' destination blocks
ArenaVector<Name> catchDests;
// true for catch_ref and catch_all_ref
ArenaVector<bool> catchRefs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be simpler to define a new type (maybe a std::variant, but not necessarily) that describes the different kinds of catches and have just a single ArenaVector<Catch>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but can I describe that in wasm-delegations-fields.def?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I guess that would be the downside. We would need a special entry in wasm-delegations-fields that would only be used for this instruction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then can we keep the current implementation? That way might be simpler overall..

} else if (auto* tt = expr->dynCast<TryTable>()) {
for (Index i = 0; i < tt->catchTags.size(); i++) {
auto dest = tt->catchDests[i];
if (dest == name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this comparing to name? The name should only be provided to func (which can modify it) IIUC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I think I was confused. Fixed it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I think the original code was correct... name given here is a block label, and we should run func only when the name is the label we are targeting. For example:

(block $l-catch-ref (result i32 exnref)
  ...
  (block $l-catch (result i32)
    ...
    (try_table (catch $e-i32 $l-catch) (catch_ref $e-i32 $l-catch-ref)
      ...
    )
  )
)

In this case, when analyzing TryTable, we shouldn't send both i32 and (i32 exnref) to a given name, which will mean the given name has two different types. We send the type only when the given name matches its sentType's label.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks. So this is called multiple times for a single TryTable, once per name, and we need to handle the right name each time.

Maybe worth handling TryTable outside of the operateOnScopeNameUses to avoid the potentially quadratic work? But if we think that will be incredibly rare maybe just a comment is enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though theoretically there can be any number of catches, in practice there is no reason to use more than two (one for one of catch/catch_ref and the other for one of catch_all/catch_all_ref), unless you are trying to catch multiple language tags. And in LLVM we don't even generate two catches: We only generate one catch or catch_all for a single try due to the structure of Windows EH IR, and I think this will be also the case for the new spec. Even though we assume a non-LLVM multi-language handling case I doubt this will go more than 3.

} else if (auto* tt = expr->dynCast<TryTable>()) {
// The values are supplied by throwing instructions, so we are unable to
// know what they will be here.
func(name, nullptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before, nullptr meant "no value" but now it can also be "there is a value, but it is not known statically", which worried me initially but after reading the uses in Heap2Local and possible-contents.cpp that looks ok. IIUC these branches are from throws which we can't identify statically (and might even be in another function). Please update the comment on line 92, though, to document this API change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -715,6 +720,11 @@ class EffectAnalyzer {
if (parent.tryDepth == 0) {
parent.throws_ = true;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to have truncated Rethrow's handling?

Copy link
Member Author

@aheejin aheejin Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that was intentional. So current rethrow doesn't have an argument that can be null, so we shouldn't have set implicitTrap. I think this code was added when we had the exnref and rethrow had an exnref as an argument (pre-Sep 2020), and then not deleted after we switched to the version of rethrow that takes an immediate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks! Makes sense.

@@ -1165,6 +1169,10 @@ struct InfoCollector
}
}
void visitRethrow(Rethrow* curr) {}
void visitThrowRef(ThrowRef* curr) {
// TODO: optimize when possible
addRoot(curr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is unreachable, like rethrow? So there is no value to add a root here for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed it, thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see addRoot here - maybe a change wasn't pushed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, sorry, I had to refresh...

const std::vector<Name>& catchTags,
const std::vector<Name>& catchDests,
const std::vector<bool>& catchRefs,
Type type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could unify pairs of functions like this using std::optional<Type> type = std::nullopt? Anyhow, that's separate from this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try that in a follow-up.

@@ -1953,19 +1954,40 @@ void BinaryInstWriter::visitTry(Try* curr) {
emitResultType(curr->type);
}

void BinaryInstWriter::visitTryTable(TryTable* curr) {
// the binary format requires this; we have a block if we need one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment relates to line 1975, I think? Perhaps move it down if so?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I copy-pasted it from visitIf and left it in a wrong place....


bool TryTable::hasCatchAll() const {
return std::any_of(
catchRefs.begin(), catchRefs.end(), [](bool v) { return v; });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, isn't an empty (not a full) catchTag (not catchRef) what indicates a catch_all? That's what I understood from wasm.h at least - did I get it wrong?

Copy link
Member Author

@aheejin aheejin Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, good catch... Don't know what I was thinking. (By the way it is not currently being used anywhere. I added the function mainly because Try has it too and it will be necessary in analyses in future.)

@@ -40,7 +100,8 @@
;; CHECK-BIN-NEXT: (local.get $exn)
;; CHECK-BIN-NEXT: )
;; CHECK-BIN-NEXT: )
(func $exnref-nullexnref-test (result exnref) (local $exn exnref) (local $null-exn nullexnref)
(func $exnref-nullexnref-test (result exnref)
(local $exn exnref) (local $null-exn nullexnref)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation looks odd?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to put that in the separate line because the previous line got too longer... but maybe this looks weirder? Put that back to the previous line. (which now exceeds 80 col, but we don't have policies on column length on tests anyway..)

@aheejin
Copy link
Member Author

aheejin commented Dec 16, 2023

@tlively

Looks like this is missing a change to module-utils.cpp to ensure the signature type is included in the type section.

Which type do you mean? try_table contains an inner block, like loop or if, so the try_table's type will be taken care of by the block. And tag types are taken care of by tags themselves. Do I need to add tuple types like (i32 exnref)? Currently exception-handling.wast prints types fine without changes to module-utils.cpp. Am I missing something?

@tlively
Copy link
Member

tlively commented Dec 17, 2023

It should be possible for the body to be a single non-block instruction (like unreachable), in which case there might not any other expression responsible for getting the signature into the module.

I forgot that we have a generic code path that works for all control flow structures, though, so I don’t think you need to change anything there after all.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comment on name in branch-utils

aheejin added a commit to aheejin/binaryen that referenced this pull request Dec 19, 2023
Using `std::optional<Type>`, this unifies two different versions of
`make***`, for  block-like structures (`block`, `if`, `loop`, `try`, and
`try_table`) with and without a type parameter.

This also allows unifying of `finalize` methods, with and without a
type. This also sets `breakability` argument of `Block::finalize` to
`Unknown` so we can only have one `Block::finalize` that handles all
cases.

Suggested in
WebAssembly#6181 (comment).
aheejin added a commit to aheejin/binaryen that referenced this pull request Dec 19, 2023
As suggested in
WebAssembly#6181 (comment),
using `std::optional<Type>`, this unifies two different versions of
`make***`, for  block-like structures (`block`, `if`, `loop`, `try`, and
`try_table`) with and without a type parameter.

This also allows unifying of `finalize` methods, with and without a
type. This also sets `breakability` argument of `Block::finalize` to
`Unknown` so we can only have one `Block::finalize` that handles all
cases.

This also adds an optional `std::optional<Type> type` parameter to
`blockify`, `blockifyWithName`, and `makeSequence` functions in
`wasm-builder.h`.
aheejin added a commit to aheejin/binaryen that referenced this pull request Dec 19, 2023
As suggested in
WebAssembly#6181 (comment),
using `std::optional<Type>`, this unifies two different versions of
`make***`, for  block-like structures (`block`, `if`, `loop`, `try`, and
`try_table`) with and without a type parameter.

This also allows unifying of `finalize` methods, with and without a
type. This also sets `breakability` argument of `Block::finalize` to
`Unknown` so we can only have one `Block::finalize` that handles all
cases.

This also adds an optional `std::optional<Type> type` parameter to
`blockify`, `blockifyWithName`, and `makeSequence` functions in
`wasm-builder.h`.
@aheejin
Copy link
Member Author

aheejin commented Dec 19, 2023

@tlively Do you have any more concerns?

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, LGTM

@aheejin aheejin merged commit 4c53361 into WebAssembly:main Dec 19, 2023
14 checks passed
@aheejin aheejin deleted the new_eh_inst branch December 19, 2023 19:10
aheejin added a commit to aheejin/binaryen that referenced this pull request Dec 19, 2023
As suggested in
WebAssembly#6181 (comment),
using `std::optional<Type>`, this unifies two different versions of
`make***`, for  block-like structures (`block`, `if`, `loop`, `try`, and
`try_table`) with and without a type parameter.

This also allows unifying of `finalize` methods, with and without a
type. This also sets `breakability` argument of `Block::finalize` to
`Unknown` so we can only have one `Block::finalize` that handles all
cases.

This also adds an optional `std::optional<Type> type` parameter to
`blockify`, `blockifyWithName`, and `makeSequence` functions in
`wasm-builder.h`.
aheejin added a commit to aheejin/binaryen that referenced this pull request Dec 19, 2023
As suggested in
WebAssembly#6181 (comment),
using `std::optional<Type>`, this unifies two different versions of
`make***`, for  block-like structures (`block`, `if`, `loop`, `try`, and
`try_table`) with and without a type parameter.

This also allows unifying of `finalize` methods, with and without a
type. This also sets `breakability` argument of `Block::finalize` to
`Unknown` so we can only have one `Block::finalize` that handles all
cases.

This also adds an optional `std::optional<Type> type` parameter to
`blockify`, `blockifyWithName`, and `makeSequence` functions in
`wasm-builder.h`.
aheejin added a commit that referenced this pull request Dec 21, 2023
As suggested in
#6181 (comment),
using `std::optional<Type>`, this unifies two different versions of
`make***`, for  block-like structures (`block`, `if`, `loop`, `try`, and
`try_table`) with and without a type parameter.

This also allows unifying of `finalize` methods, with and without a
type. This also sets `breakability` argument of `Block::finalize` to
`Unknown` so we can only have one `Block::finalize` that handles all
cases.

This also adds an optional `std::optional<Type> type` parameter to
`blockifyWithName`, and `makeSequence` functions in `wasm-builder.h`.
blockify was not included because it has a variadic parameter.
radekdoulik pushed a commit to dotnet/binaryen that referenced this pull request Jul 12, 2024
This adds basic support for the new instructions in the new EH proposal
passed at the Oct CG hybrid CG meeting:
https://github.com/WebAssembly/meetings/blob/main/main/2023/CG-10.md
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md

This mainly adds two instructions: `try_table` and `throw_ref`. This is
the bare minimum required to read and write text and binary format, and
does not include analyses or optimizations. (It includes some analysis
required for validation of existing instructions.) Validation for
the new instructions is not yet included.

`try_table` faces the same problem with the `resume` instruction in
 WebAssembly#6083 that without the module-level tag info, we are unable to know the
'sent types' of `try_table`. This solves it with a similar approach
taken in WebAssembly#6083: this adds `Module*` parameter to `finalize` methods,
which defaults to `nullptr` when not given. The `Module*` parameter is
given when called from the binary and text parser, and we cache those
tag types in `sentTypes` array within `TryTable` class. In later
optimization passes, as long as they don't touch tags, it is fine to
call `finalize` without the `Module*`. Refer to
WebAssembly#6083 (comment)
and WebAssembly#6096 for related discussions when `resume` was added.
radekdoulik pushed a commit to dotnet/binaryen that referenced this pull request Jul 12, 2024
As suggested in
WebAssembly#6181 (comment),
using `std::optional<Type>`, this unifies two different versions of
`make***`, for  block-like structures (`block`, `if`, `loop`, `try`, and
`try_table`) with and without a type parameter.

This also allows unifying of `finalize` methods, with and without a
type. This also sets `breakability` argument of `Block::finalize` to
`Unknown` so we can only have one `Block::finalize` that handles all
cases.

This also adds an optional `std::optional<Type> type` parameter to
`blockifyWithName`, and `makeSequence` functions in `wasm-builder.h`.
blockify was not included because it has a variadic parameter.
@gkdn gkdn mentioned this pull request Aug 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants