-
Notifications
You must be signed in to change notification settings - Fork 57
Guaranteeing additional soundness properties #128
Comments
This appears to signal a misunderstanding of the role of interface types. It is the function of the import & export adapters to match whatever the underlying function is/expects to the type of the IT function. There is inherently a very close connection between a core Wasm function export (say) and its IT export adapter: they will likely be generated by the same tool. |
I meant for that sentence to just say that a function's import signature in one module is the same as its export signature in another module. Can you elaborate on the problem you see or suggest a clearer wording? To illustrate the example concretely, consider this setup: (adapter_module $A
;; This is going to be imported from module B
(import "foo" (adapter_func $foo (result (expected i32))))
;; Lower (expected i32) to a pair of i32s, where the first signals the case
(adapter_module $ADAPTER
(import "foo" (adapter_func $originalFoo (result (expected i32))))
(adapter_func $lower_ok (param $x i32) (result i32 i32)
i32.const 0 ;; ok
rotate 1 ;; set up to return (0, $x)
)
(adapter_func $lower_err (result i32 i32)
i32.const 1 ;; error!
i32.const 0 ;; arbitrary value for the error case
)
(adapter_func (export "foo") (result i32 i32)
adapter_call $originalFoo
variant.lower (expected i32) $lower_ok $lower_err
)
)
(adapter_instance $adapter (instantiate $ADAPTER (adapter_func $foo)))
(module $CORE
(import "foo" (func $foo (result i32 i32)))
...
)
(instance $core (instantiate $CORE (adapter_func $adapter.$foo)))
...
)
(adapter_module $B
(module $CORE
(event $error (attr 0))
(func $foo (export "foo") (result i32) ;; might throw $error
...
)
)
(instance $core (instantiate $CORE))
;; Lifting functions are trivial
(adapter_func $lift_ok (param i32) (result i32))
(adapter_func $lift_err)
;; Converts $error exceptions or successful returns to (expected i32)
(adapter_func (export "foo") (result (expected i32))
try (result (expected i32))
call $core.$foo ;; returns an i32 or throws
variant.lift (expected i32) $ok $lift_ok
catch $error
variant.lift (expected i32) $error $lift_error
end
) Here the function in question is Today, the author of module A needs to trust the author of module B to have correctly caught and converted all possible exceptions into the error case of |
This is an interesting discussion, and I think I agree with the general thrust. But if I may, I have two nits about the way the issue is phrased, which make it somewhat ambiguous to me what exactly it means (esp wrt points 4 and 5). :) One is a terminology nit: soundness is not something a program or module or programmer can request or enforce. Soundness is a meta-level correctness property of a language semantics (more specifically, its type system). It's something that either holds or doesn't hold for a language as a whole. It is not some property of an individual program or module. That's a different universe. On the module/interface level, we rather have, as a generalisation of types, the notion of contracts that a program may want to be able to express. That is only related to soundness in so far that a language may have means to enforce certain contractual properties, and if it does, the program is indirectly relying on the soundness of that language (if stated in a suitable manner) for doing so correctly. For example, imagine interface types were to support exception annotations. Then a program can declare the expectation to not receive any exceptions -- that declares a form of contract (via its type), not a soundness property. And the adaptor semantics of interface types could/should enforce that, as you describe. Then soundness would be the meta-level property that it enforces it correctly, not just for this but for all possible declarations. And my second point is that this example would, in fact, be a type soundness property, because exception annotations are part of the type system in that case -- types are more than just descriptions of values! (Sorry for the pedantic tangent.) So if I understand correctly, the issue is suggesting both a more rigid semantics for interface types and for linking adapter modules (establishing IT's own soundness property that may be stronger than that of the expanded Core Wasm's), and possible extensions with more features (like exception annotations). And I interpret the opt-out capability as the ability pick appropriate interface types at a developer's discretion. Is that the idea? |
Sorry about the imprecise language :)
I could certainly have been more precise about this (and I would welcome help making it more precise), but my general idea was that IT should be able to help languages stay sound in more ways when linked with untrusted modules. If I were to state this totally formally in terms of source languages, I would probably borrow ideas from the literature on secure modular compilation. (this paper is the one I'm most familiar with). So by "optionally guaranteeing" a soundness property, I really mean that different source languages will have different soundness properties and IT should help them enforce those different properties in the face of untrusted modules without being overly restrictive.
Good point :)
Yes, it sounds like your understanding is spot on 👍 |
Thanks for the nice clarification on soundness vs. contracts, @rossberg! #99 starts with a useful question:
My takeaway from that discussion is that Interface Types is an interfacing mechanism. However, one of its main challenges was to be an interfacing mechanism that can maintain the existing isolation properties that WebAssembly already provides. That is, WebAssembly already guarantees the isolation property that if a module instance does not export its memory then no other module instance (even of the same module) can read from or write to that memory except indirectly through its exported functions, so Interface Types worked hard to figure out a way to enable copying from one module's memory to another's without either module having to directly expose its memory to the other. This was achieved by developing a language of adaptor functions that translate/from to a middle interface type in such a manner that the middle can be cut out and replaced with a direct copy. So the higher-level issue here is what to do about control isolation. At present (putting aside traps), WebAssembly ensures control isolation. But there are four proposals (two of which I have helped author), that would violate this property. I don't think that it's necessary that they violate this property—it was just something that everyone (include me) ran with. But we know a number of applications will need control isolation, and the ecosystem that Interface Types was trying to facilitate in general seems to expect control isolation. The suggestion in this issue seems to be to use dynamic mechanisms to make up for the lack of static isolation. So let's consider what that would look like. One thing to consider is that relying on dynamic mechanisms means that WebAssembly programs wanting isolation would need to be recompiled every time a new unisolated construct gets added. This is true even if Interface Types would automatically insert dynamic mechanisms, because people will want to precompute the result of linking two modules along an adapter bounder, and that precomputation will bake in the dynamic mechanisms available at the time. (And yes, even if those modules don't use the new control constructs, their other imports/exports might, which will make the difference visible.) In other words, if the default contract for an So let's consider what sharing a list via Interface Types looks like without control isolation. Remember that the adapter functions in the lift instruction are executed lazily; they wait until a matching lower instruction is executed. At that point, the destructor is dropped (since the list was used) and the engine synthesizes a loop that passes values back and forth between the lifter and lowerer. Note that, while this loop runs on the lowerer's stack, the lifter's adapters are not above the lowerer's adapters. So the lowerer can react to (or guard against) control interference by the lifter, but the lifter has no way to react to (or guard against) control interference by the lowerer. This means that, if the Interface Types ecosystem wants to have control isolation, then the Interface Types engine, rather than the application-provided adapters, must be the one to dynamically enforce control isolation. So what does that entail? To preserve module compositionality, it seems to mean that all adapter functions must dynamically enforce control isolation. The best way to do that seems to be to automatically insert the various dynamic guards around calls to On another end, the need to have dynamic guards is itself problematic for control features we might want to add. For example, many languages target C's setjmp/longjmp rather than C++'s exceptions because longjmp is a constant-time operation whereas C++ exceptions are linear time as they have to search the stack (and in a notoriously slow manner). That benefit of longjmp is lost if one has to search the stack just to see if there is a dynamic mechanism guarding against the longjmp somewhere on the stack. So this need to dynamically guard against control interference turns a constant-time operation into a slow linear-time operation. To summarize, WebAssembly so far has been designed with an isolation-by-default and pay-as-you-go design philosophy. The reasons are that it is hard to recover isolation once you give it away and making features pay-as-you-go facilitates extensibility. It seems to me that this same philosophy should apply to control as well for the same reasons. The above are examples of how these reasons have pragmatic significance for specifically Interface Types. I think it's worth considering how we can enable control isolation even in a heterogeneous environment, while also facilitating letting modules share control when they want, and based on some preliminary exploration I suspect it's easier to do than one might think. |
As you point out, some dynamic mechanisms like
This is true in general, but I expect that in practice static adapter fusion will be performed mostly by bundler-type tools that have access to the entire module graph of an application. In that case, the caveat that unknown modules in the application may use new control constructs does not apply because there are no unknown modules. If folks do want to apply static adapter fusion to collections of modules that still have additional imports and exports, they can apply static adapter fusion to all the internalized interfaces but keep an IT wrapper around the externalized interfaces so that the whole bundled package remains forward-compatible.
I agree with this, but I don't think it's a problem as long as |
First of all, thanks for the very thoughtful writeup @tlively! I pretty much agree with what you're saying. In particular, points 2 and 3 make sense to add to the Explainer.md right now, expanding on what we mean by saying that IT is "layered" on top of core wasm. (I can take an action item to add that.) For points 1, 4 and 5, the current plan is to write a new
Explicitly calling out design points 1, 4 and 5 in So a natural question is: if Interface Types says that you must convert your exceptions into variant returns, what does it say you do for effects? My current thinking is that the "analogous" conversion is to say that, if a component export wants to suspend its own stack in a way visible to callers, the component should return, as a plain return value, the continuation as an Interface-Typed reference to an abstract type (encapsulating the continuation/stack). To resume the effect, the component would also export adapter functions taking the abstract continuation references and the effect's params. Thus, to the component's client, this looks like a sort of iterator return value. I think this is appropriate for the same reason that variant returns are appropriate: it's a first-order concept that maps into languages with no notion of effects (i.e., most of them). Yes, this may induce extra stack-cutting at component boundaries, but:
There is one exception, though: I think it's important for a parent component (which adapts the imports and exports of a child component) to be able to "tunnel" through the child component (suspending in the import adapter and handling in the export adapter) as long as it doesn't reenter (via export) the unknowingly-suspended child. Why? This is critical for allowing a parent to, e.g., implement sync WASI APIs on top of underlying async (Web) APIs. So how do we distinguish tunneling from uncaught effects? With components, we can make the parent/child relationship semantically defined by who (Btw, here, I'm using "effect" terminology, but I think all these same concepts can map into stack switching as long as the stack switching either stays entirely inside a component or invisibly tunnels through a component. Stack switching that more-arbitrarily cuts through component boundaries seems problematic for the same reason as as arbitrary uncaught effects.) Lastly, I'll +1 @tlively's preceding comment: the result of adapter fusion should never go "stale" in a way that causes the problems @RossTate raised above: fusion should be an invisible implementation detail, as if there was no fusion, only lazy evaluation semantics. Thus, when core wasm semantics grows a new feature, the IT layer should be able to simultaneously grow a new contract regarding that new core feature at component boundaries. |
Thanks for the cool thoughts on components, @lukewagner! Interestingly, the exception you say you need isn't actually necessary for your use case. It's possible to design a static control-isolation system that can let the outer module run the inner module entirely on a separate stack while still guaranteeing to the inner module that its control invariants are respected. To ensure these control invariants, it's important that the inner module be run on one single stack, rather than have each call into it potentially be on a separate stack, and from preliminary investigations its not hard to make a rather simple static system that can ensure this. As you say, there's much more to say on the topic, but it seemed useful to mention the idea here while it's on topic. (That all said, the concept of ownership may have other applications, including pertaining to control, so it's useful to explore regardless of this observation.)
@tlively This seems to me to put the design in an odd space. The infrastructure necessary to isolate control for constructs with expensive dynamic mechanisms is the exact same infrastructure you would use to isolate control for constructs with inexpensive dynamic mechanisms. The constructs with expensive dynamic mechanisms seem to include |
@lukewagner I'll be interested to hear about why it is useful to draw a distinction between
Does the infrastructure you're thinking of here involve augmenting the core WebAssembly type system to statically enforce control isolation? If so, it's different from the analogous infrastructure in the EH proposal. I'm trying to find a consistent design position that uses the EH proposal as-is and also generalizes well to upcoming mechanisms. |
The only way to avoid inserting dynamic guards around calls into
4 of the 6 instructions in the EH proposal exist solely due to lack of control isolation. Of the two remaining instructions, one could be made much simpler if control isolation were guaranteed (and easier to generate and implement), and the other could be implemented much more efficiently if control isolation were guaranteed. Reading through the issues even long before I got involved, many of them are about dealing with lack of control isolation, including the switch from the first proposal to the second proposal. Plus, EH is extremely easy to statically enforce, and the infrastructure will need to be there anyways if we don't want performance hits due to adding |
@RossTate Great point; definitely interested to dig more into how that could work. |
After a few recent conversations with @RossTate and @fgmccabe where this came up, I wanted to share my thoughts about how I see the relationship between Interface Types and core WebAssembly and see if it is in line with how other folks see that relationship. If so, I hope that making these design principles explicit in the IT explainer will help guide the design of stack switching and other proposals that create new risks at coordination boundaries.
Optionally guaranteeing type soundness
An important feature of core WebAssembly is that its type system is sound. It guarantees that if a function expects to receive an i32 as an argument, it will never receive some other type. Most of this soundness is guaranteed by the static type system, but sometimes there need to be dynamic checks to ensure soundness. For example for indirect calls, soundness is ensured by runtime checks that trap if type soundness would be violated. However, there are many useful soundness properties that core WebAssembly does not and cannot enforce.
Consider a module that uses the pair of i32s (address, length) to represent strings in memory and another module that uses the pair of i32s (start address, end address) to represent strings in memory. When these modules call into each other, core WebAssembly cannot guarantee the soundness property that each module receives strings in the format it expects. Interface Types, however, is able to provide this guarantee. Note that IT does not necessarily provide this guarantee, though — the module authors could have chosen to continue exposing their string functions as taking pairs of i32s. Interface Types is therefore able to optionally guarantee this type soundness property for strings (and other IT types) at the module author's discretion.
Other soundness properties
Consider a language that exclusively uses return values to propagate error conditions. The IT interface of that language's modules will include in the return types of its imported functions the error conditions it handles. The exported functions of modules it links with must include those same error conditions in the return types of their exported functions. If those linked modules internally use the EH proposal to signal error conditions, their export adapters would be responsible for catching those exceptions and turning them into the error codes specified in the IT signature. Today there is no way for IT to guarantee that the export adapters correctly do so.
The specific soundness property in question that the first module wants to enforce is that an exception will never propagate into it. This property is desirable because all error conditions should be made explicit in the interface types signature of the module and none should be able to sneak over the boundary via other means. In this specific case, IT could enforce the soundness property dynamically by inserting a
catch_all
andtrap
as part of the adapter fusion algorithm.This soundness property is not a type soundness property per se, but IT still seems like the perfect place to enforce it. I propose that we expand IT's mandate to cover additional soundness properties beyond type soundness (starting with the example above) according to the following principles.
Design principles
To the extent possible, soundness properties involving toolchain coordination boundaries are enforced at the IT layer, not the core WebAssembly layer.
IT should always lower to core WebAssembly, so all the primitives IT needs to dynamically enforce its supported soundness properties should be specified in core WebAssembly.
IT lowering should not require rewriting arbitrary WebAssembly code in the participating modules.
Module authors should be able to control the soundness properties they are enforcing. (Otherwise IT would be too inflexible, as noted in Interface Types and Isolation? #99)
To the extent possible, all cross-module interactions should be opt-in (i.e. all supported soundness properties should be opt-out) at the IT layer so that module authors don't have to worry about unwanted interactions with future proposals.
The only exceptions to (1) should be when it is incompatible with WebAssembly's other goals, such as if the property could not be enforced solely at the IT layer without an unacceptable performance overhead.
The only exceptions to (5) should be to introduce support for a new soundness property without a corresponding new core WebAssembly feature, since making such properties opt-out would be a breaking change. This should be avoided by making the identification and standardization of relevant IT soundness properties part of the design and standardization of core WebAssembly proposals.
The text was updated successfully, but these errors were encountered: