rtt.fresh will not provide data abstraction #178
Comments
Yes, you are right! Thanks for the example. Though fixing this through nominal typing would only hack around the symptoms. The deeper reason for this problem actually is one that has always bothered me about the current design, namely that The only way to avoid the parametricity breach is to make rtt.canon compositional and require explicit RTT operands for each subcomponent type -- or at least for the ones that are not statically transparent. That would prevent your example, because you'd not be able to construct the RTT for the struct without explicitly providing the RTT for $A, and presumably there would be only one (or even none) available. |
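To make the "compositional" idea above concrete, here is a minimal Python sketch. It is a toy model and not Wasm semantics: `Rtt`, `rtt_abstract`, `rtt_struct` and the `TRANSPARENT` set are hypothetical names invented for illustration. The only point is that a struct RTT cannot be built without an explicit RTT operand for each non-transparent field type such as $A.

```python
# Toy model only: all names here are hypothetical, not part of any proposal.
TRANSPARENT = {"i32", "i64", "f32", "f64"}  # field types with no hidden structure


class Rtt:
    """Toy runtime type: two RTTs are equal when their keys are equal."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, Rtt) and self.key == other.key

    def __hash__(self):
        return hash(self.key)


def rtt_abstract(name):
    """RTT for an abstract type; the object() token makes every call distinct."""
    return Rtt(("abstract", name, object()))


def rtt_struct(field_types, operands):
    """Build a struct RTT, demanding an RTT operand for each non-transparent field."""
    key = []
    for ty in field_types:
        if ty in TRANSPARENT:
            key.append(ty)
        elif ty in operands:
            key.append(operands[ty].key)
        else:
            raise ValueError(f"no RTT operand supplied for {ty}")
    return Rtt(("struct", tuple(key)))


# The defining module holds the only RTT for $A.
rtt_a = rtt_abstract("$A")
owner = rtt_struct(["i32", "$A"], {"$A": rtt_a})
same = rtt_struct(["i32", "$A"], {"$A": rtt_a})

# Another module can neither omit the operand nor substitute its own.
forged = rtt_struct(["i32", "$A"], {"$A": rtt_abstract("$A")})
try:
    rtt_struct(["i32", "$A"], {})
    missing_rejected = False
except ValueError:
    missing_rejected = True
```

In this model a module that was never handed `rtt_a` can only produce a struct RTT that compares unequal to the owner's, so a cast against it would fail.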
I can employ the same technique using As for your fix to Note that (assuming the above fixes are made), |
@rossberg Can you indicate if you are planning to make the necessary changes to the relevant proposals to avoid the need for nominal types here? |
Somebody pointed out to me that there isn't necessarily an issue with this. The purpose of generative RTTs would be to enable compilers to piggy-back Wasm casts to implement certain language-level casts. It's not to achieve cross-language encapsulation. That would be asking for a full abstraction property, which is not something Wasm has ever supported in general -- certainly not when you're using linear memory, and it's not been a goal of the GC proposal to magically provide that either. Safe encapsulation is a separate feature, and is currently proposed to be provided by private types as in the type import/export proposal. Of course, there is an argument to be made that both these use cases could potentially be served by a single feature, which is why I have held back on adding generative RTTs so far. However, that would require significant complications to private types. That seems unwise before we have some practical experience with them. (I'm still concerned about the more general problem of lack of parametricity for type imports, though. We could restrict call_indirect to fix that, like I think you suggested at some point. But then we'd need RTTs to work around that restriction, which are currently nicely delimited to the GC part of the language.) |
Thanks for responding. But the response seems to mischaracterize the problem, the common practices, and the alternative you have provided. Runtime-cost-free data encapsulation is standard in (the memory-safe subsets of) typed languages and VMs. It is regularly used for security (keeping secrets and preventing forgery) and for preventing unwanted dependencies on implementation specifics (i.e. abstraction). It is also something that JavaScript does not make so easy/cheap, and so is a runtime-cost-free way to make WebAssembly a preferred target over JavaScript. It is a severe mischaracterization to suggest it requires any "magic". More accurately, it is the highly non-standard "magic" features that you have insisted WebAssembly support that create the problem in the first place, features such as As for "private types", that feature is also non-standard, as other languages/VMs have safe encapsulation by default. It is a patch over a problem you have created. (And, no, it is not at all like So how would you like to proceed? The options raised so far seem to be:
|
If one is allowed to carve out a safe language subset (I assume this is something like the JVM without reflection), would it be equally legitimate to say that Wasm (edit: with |
The parenthetical was carving out things that generally exit the language/VM through means that are unsafe and/or only permitted in trusted settings, i.e. backdoors in trusted settings. Examples are OCaml's |
In a hypothetical scheme where a nominally typed source language is implemented using As @rossberg said, there is still the question of whether |
Full abstraction might be overkill, but it has always been possible to ensure confidentiality and integrity of data in a WebAssembly module by choosing to only export a carefully designed interface. It would be a shame if the expressiveness of module interfaces designed to preserve confidentiality and integrity lagged behind the expressiveness of the full language, especially given how important these security properties are to the most ambitious visions of WebAssembly's future. |
As an example, with the features that @rossberg has laid out, an OO language using separate compilation seems to be unable to prevent modules from monkey-patching the method implementations in v-tables/i-tables. Even with reflection and all security restrictions turned off, the JVM and CLI ensure that modules cannot change method implementations of classes they do not define. The JS API prevents even untyped JS from accessing non-imported/exported memory and globals of module instances. It seems inconsistent to not prevent even typed wasm code from accessing non-imported/exported fields of GC references (where even the |
@RossTate I would have thought that the relevant v-table fields would be declared immutable. Is there a reason this doesn't work? EDIT: I suppose, depending on the implementation of interfaces, there may be an issue in that we currently lack immutable arrays? |
With the initialization plans @rossberg laid out in #189 (comment), you cannot have immutable v-table fields (or i-table arrays) in the presence of separate compilation. In separate compilation, when you create a v-table for a new class, you first pass that v-table to the initializer of your superclass. Since that initializer can be in another module (or since your own class might be extended by another module), v-table initialization will need to be done by Addendum: The fact that these details matter is illustrative of how weak the encapsulation properties of the (Post-)MVP are. There are other reasons why v-tables will want to be mutable (but only by the module that created the v-table), such as on-demand loading (in which case the v-table will be filled with stubs that get replaced after the dependent code is loaded) or JITing (wherein the v-table needs to be updated with the more optimized/specialized implementation). |
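The initialization order described above can be sketched roughly as follows. This is a Python toy in which a dict stands in for a v-table struct; the function and slot names are hypothetical. It only shows why the table must stay writable after allocation: the superclass initializer and the subclass each write into the same table.

```python
# Toy model: a dict stands in for a v-table; all names are hypothetical.

def init_base_vtable(vt):
    """Superclass initializer, possibly living in a different module."""
    vt["to_string"] = lambda self: "Base"
    vt["size"] = lambda self: 0


def init_derived_vtable(vt):
    """Subclass initializer: delegate to the superclass, then override."""
    init_base_vtable(vt)                       # fills in the inherited slots
    vt["to_string"] = lambda self: "Derived"   # overwrites one of them


# The new class allocates the table first, then lets the initializers fill it.
derived_vtable = {}
init_derived_vtable(derived_vtable)
```

Because the slots are written after the table exists, and by code from more than one module, they cannot be declared immutable in this scheme.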
But that is still the case. Exporting a non-private reference to an untrusted party is not safe, just like exporting your memory isn't. If you want to maintain confidentiality, don't do either. To pass a reference to an untrusted party while maintaining confidentiality, you would use a private type.
Care to give an example of a cost-free encapsulation mechanism in a mainstream VM? I'm not aware of any. You have to wrap the data into either an object or a closure, and both are a long shot from being free. And in the JVM, this isn't even safe, since it can be circumvented by reflection. The CIL has trust levels to control that, but it's a global-ish mechanism that doesn't provide per-object encapsulation properly. Private types are equivalent to wrapping the data into an object, but without the loopholes. |
The JS API for WebAssembly ensures that JS cannot access non-exported fields (e.g. globals, memories, tables) of wasm modules even though JS is untyped, has a direct reference to the instance object, and is the higher-privileged (e.g. can catch traps) embedding language for wasm. OCaml has cost-free encapsulation except for the one feature that also lets you convert an arbitrary integer into an address. SML and Haskell have cost-free encapsulation. The CLI and the JVM have cost-free encapsulation with the appropriate security settings and/or declarations of attributes.
I don't find pointing to weaknesses in other systems to be a particularly compelling argument, especially when those weaknesses exist solely to support other features that we do not provide (such as reflection-based meta-programming libraries). Furthermore, as mentioned above, even with reflection you cannot change the method implementations in v-tables in the JVM/CLI, but you can in the MVP. So you're essentially justifying weaknesses of the MVP by pointing out bad properties of other systems, then throwing away the advantages that were why those bad properties were present in those systems to begin with, discarding the complementary threat-mitigating features for untrusted settings, and making the weaknesses of those bad properties even worse.
As discussed in the OP, this is not true. Private types do not respect subtyping. To be clear, it is completely possible to design the MVP such that Java/C#/Kotlin compiled-to-wasm modules can share direct references to their objects with other wasm modules and even with JS without those other modules being able to access private fields or mutate v-tables without any run-time cost, in a way that respects subtyping, and in a way that supports separate compilation and class inheritance. The compiled-to-wasm modules can even control what is accessible/mutable through reflection mechanisms in the compiled-to-wasm language runtime. All it takes to have this is to restrict the overpowered encapsulation-breaking features of the MVP so that they are no longer encapsulation-breaking. We know from other systems that these restrictions are still sufficiently expressive. Can you explain why you are opposed to this? |
Let's separate user-facing languages from VMs here. For the latter, e.g. CLI and JVM, I have no idea what mechanism you are referring to. Objects certainly aren't free. For the former, I'm not sure what your point is. Cost-free encapsulation in languages like SML, OCaml and Haskell relies on their use of a uniform representation. There are also no casts in these languages (ignoring unchecked Obj.magic and friends), so there is no tangible relation to RTTs. If your concern is compilation of these languages to Wasm, then that can be mapped to GC types just fine. That does not require full abstraction, no more than compiling C to linear memory does. If your concern is cross-language interop, then we are effectively talking about a form of FFI (from the perspective of a single language) at that point. That typically uses different mechanisms and representations, and private types serve that.
You made a claim about systems like it, and this refutes it, so shrug?
Yes, but a completely different use case. What cross-language(!), FFI-level uses of subtyping do you envision? I can imagine some, but nothing desperately needed in the MVP. It's not clear whether relying on subtyping would be a good idea in a language-agnostic interface anyway. |
Sorry for being late to the discussion, but we seem to be going backwards. I gave a presentation some months back on Jawa, my prototype of running Java with separate compilation and late binding on top of WebAssembly. One of the clear findings of that line of research is that late binding of lowered code just doesn't work. I think this issue has veered off topic with mutable v-tables and monkey-patching across module boundaries. Even with data abstraction across modules with proper RTTs, this is still going to be a problem because you've fundamentally exposed implementation details that should not be exposed. It's inescapable that you need to defer lowering until link time. |
I'm closing this issue for now because it does not seem actionable at this time. If we do discuss bringing back generative RTTs, we should be mindful of relevant past discussions such as this one, though. |
Many programs want to be able to share references without exposing all the fields contained in those references. Typically this is done by exporting the type of those references abstractly or "opaquely". But with `rtt.canon`, another module can guess the type of those references (say by looking at the module implementation) and downcast them, thereby gaining access to all their fields. Things like `rtt.fresh` and the `data` restriction were added to prevent this, but I've figured out how to dodge those measures.

Suppose you have a reference of some abstract imported type, i.e. `ref $A`. You want to get its contents and have reason to believe `$A` is `t`. Here is what you do:

(Note that you can do the reverse to forge references as well.)
What will it take to fix this so that WebAssembly can provide the sort of data abstraction that is standard in other VMs? Unfortunately the notion of `private` types that was added to the Type Imports proposal does not interact well with casting. That is, when you export a class hierarchy "C extends B extends A" from an OO module, you want others to be able to cast between these nominal types; you just don't want others to be able to cast them to their hidden structural types. As such, you can't simply wrap a struct with C's `private` type as that won't be castable to B or A's private types. So we'd have to go further and allow modules to define hierarchies of `private` types, as well as extend the hierarchies of other modules with additional `private` types.

This seems to suggest that WebAssembly needs a nominal static type system regardless of whether it has a structural static type system.
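As a rough illustration of the behaviour asked for here, the following Python toy contrasts a structural ("canonical") runtime type with a nominal one that carries a parent chain. It models neither the Wasm type system nor the dodge mentioned earlier; `CanonRtt`, `FreshRtt`, `Obj` and `cast` are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonRtt:
    """Structural runtime type: equal whenever the field layout is equal."""
    fields: tuple


class FreshRtt:
    """Nominal runtime type: compared by identity, with an optional parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


@dataclass
class Obj:
    rtt: object
    values: list


def cast(obj, target):
    """Return obj if its runtime type, or a nominal ancestor, matches target."""
    rtt = obj.rtt
    while rtt is not None:
        if rtt == target:
            return obj
        rtt = getattr(rtt, "parent", None)  # structural types have no parent
    raise TypeError("cast failed")


# Structural case: a module that guesses the layout can downcast.
secret = Obj(CanonRtt(("i32", "i32")), [1, 2])
guess = CanonRtt(("i32", "i32"))
leaked = cast(secret, guess).values

# Nominal case: "C extends B extends A" is castable along the hierarchy only.
A = FreshRtt("A")
B = FreshRtt("B", A)
C = FreshRtt("C", B)
c = Obj(C, [1, 2])
upcast_ok = cast(c, A) is c
try:
    cast(c, guess)
    structural_cast_blocked = False
except TypeError:
    structural_cast_blocked = True
```

In this model a guessed layout is enough to open a structurally typed object, while the nominally typed object can be cast to A, B or C but never to its hidden structure.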