call_indirect versus abstraction #1343

Open
RossTate opened this issue May 7, 2020 · 36 comments

@RossTate

RossTate commented May 7, 2020

call_indirect has been a very useful feature for WebAssembly. However, the efficiency and good behavior of the instruction has implicitly relied on the simplicity of wasm's type system. In particular, every wasm value has exactly one (static) type it belongs to. This property conveniently avoids a number of known problems with untyped function calls in typed languages. But now that wasm is extending beyond numeric types, it has gotten to the point where we need to understand these problems and keep them in mind.

call_indirect fundamentally works by comparing the caller's expected signature against the callee's defined signature. With just numeric types, WebAssembly had the property that these signatures were equal if and only if a direct call to the function referenced by the corresponding funcref would have type-checked. But there are two reasons that will soon not be true:

  1. With subtyping, a direct call would work so long as the actual function's defined signature is a "sub-signature" of the expected signature, meaning all the input types are subtypes of the function's parameter types and all the output types are supertypes of the function's result types. This means that an equality check between an indirect call's expected signature and the function's defined signature would trap in a number of perfectly safe situations, which might be problematic for supporting languages with heavy use of subtyping and indirect calls (as was raised during the discussion on deferring subtyping). It also means that, if a module intentionally exports a function with a weaker signature than the function was defined with, then call_indirect can be used to access the function with its private defined signature rather than just its weaker public signature (an issue that was just discovered and so has not yet been discussed).
  2. With type imports, a module can export a type without exporting the definition of that type, providing abstraction that systems like WASI plan to heavily rely upon. That abstraction prevents other modules from depending on its particular definition at compile time. But at run time the abstract exported type is simply replaced with its definition. This is important, for example, for enabling call_indirect to work properly on exported functions whose exported signatures reference that exported type. However, if a malicious module knows what the definition of that exported type is, they can use call_indirect to convert back and forth between the exported type and its intended-to-be-secret definition because call_indirect only compares signatures at run time, when the two types are indeed the same. Thus a malicious module can use call_indirect to access secrets meant to be abstracted by the exported type, and can use call_indirect to forge values of the exported type that may violate security-critical invariants not captured in the definition of the type itself.

In both of the above situations, call_indirect can be used to bypass the abstraction of a module's exported signature. As I mentioned, so far this hasn't been a concern because wasm only had numeric types. And originally I thought that, by deferring subtyping, all concerns regarding call_indirect had also effectively been deferred. But what I recently realized is that, by removing subtyping, the "new" type (named externref in WebAssembly/reference-types#87) is effectively a stand-in for an abstract type import. If that's what people would like it to actually be, then unfortunately we need to take into consideration the above interaction between call_indirect and type imports.

Now there are many potential ways to address the above issues with call_indirect, but each has its tradeoffs, and it is simply much too large a design space to be able to come to a decision on quickly. So I am not suggesting that we solve this problem here and now. Rather, the decision to be made at the moment is whether to buy time to solve the problem properly with respect to externref. In particular, if we for now restrict call_indirect and ref.func to only type-check when the associated signature is entirely numeric, then we serve all the core-wasm use cases of indirect calls and at the same time leave room for all the potential solutions to the above issues. However, I do not know if this restriction is practical, both in terms of implementation effort and in terms of whether it obstructs the applications of externref that people are waiting for. The alternative is to leave call_indirect and ref.func as is. It is just possible that this means that, depending on the solution we arrive at, externref might not be instantiable like a true type import would be, and/or that externref might (ironically) not be able to have any supertypes (e.g. might not be able to be a subtype of anyref if we do eventually decide to add anyref).

I, speaking for just myself, consider both options manageable. While I do have a preference, I am not strongly pushing the decision to go one way or the other, and I believe y'all have better access to the information necessary to come to a well-informed decision. I just wanted y'all to know that there is a decision to be made, and at the same time I wanted to establish awareness of the overarching issue with call_indirect. If you would like a more thorough explanation of that issue than what the summary above provides, please read the following.

call_indirect versus Abstraction, in Detail

I'll use the notation call_indirect[ti*->to*](func, args), where [ti*] -> [to*] is the expected signature of the function, func is simply a funcref (rather than a funcref table and an index), and args are the ti* values to pass to the function. Similarly, I'll use call($foo, args) for a direct call of the function with index $foo passing arguments args.

Now suppose $foo is the index of a function with declared input types ti* and output types to*. You might expect that call_indirect[ti*->to*](ref.func($foo), args) is equivalent to call($foo, args). Indeed, that is the case right now. But it's not clear that we can maintain that behavior.
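
To make the equivalence above concrete, here is a minimal Python sketch of an equality-based call_indirect. The names FuncRef, Trap, call_direct and call_indirect are made up for this illustration and do not correspond to any engine API; signatures are modeled simply as pairs of tuples of type names.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# A signature is (ti*, to*), each a tuple of type names.
Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]


class Trap(Exception):
    """Models a wasm trap (assumption: traps are modeled as exceptions)."""


@dataclass(frozen=True)
class FuncRef:
    defined_sig: Signature        # signature the function was defined with
    body: Callable[..., tuple]    # returns a tuple of results


def call_direct(func: FuncRef, *args):
    # A direct call is type-checked statically; nothing is compared at run time.
    return func.body(*args)


def call_indirect(expected_sig: Signature, func: FuncRef, *args):
    # Run-time check: trap unless expected and defined signatures are equal.
    if expected_sig != func.defined_sig:
        raise Trap("indirect call signature mismatch")
    return func.body(*args)


# $foo : [i32 i32] -> [i32]
foo = FuncRef((("i32", "i32"), ("i32",)), lambda a, b: (a + b,))
```

With only numeric types, the equality check succeeds exactly when the direct call would have type-checked, which is the property the rest of this post argues is about to be lost.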

call_indirect and Subtyping

One example potential problem came up in the discussion of subtyping. Suppose the following:

  • tsub is a subtype of tsuper
  • module instance IA exports a function $fsub that was defined with type [] -> [tsub]
  • module MB imports a function $fsuper with type [] -> [tsuper]
  • module instance IB is module MB instantiated with IA's $fsub as $fsuper (which is sound to do—even if it's not possible now, this issue is about potential upcoming problems)

Now consider what should happen if IB executes call_indirect[ -> tsuper](ref.func($fsuper)). Here are the two outcomes that seem most plausible:

  1. The call succeeds because the expected signature and defined signature are compatible.
  2. The call traps because the two signatures are distinct.

If we were to choose outcome 1, realize that we would likely need to employ one of two techniques to make this possible:

  1. For imported functions, have call_indirect compare with the import signature rather than the definition signature.
  2. Do an at-least-linear-time run-time check for subtype-compatibility of the expected signature and the definition signature.

If you prefer technique 1, realize that it won't work once we add Typed Function References (with variant subtyping). That is, ref.func($fsub) will be a ref ([] -> [tsub]) and also a ref ([] -> [tsuper]), and yet technique 1 will not be sufficient to keep call_indirect[ -> tsuper](ref.func($fsub)) from trapping. This means outcome 1 likely requires technique 2, which has concerning performance implications.

So let's consider outcome 2 a bit more. The implementation technique here is to check if the expected signature of the call_indirect in IB is equal to the signature of the definition of $fsub in IA. At first the major downside of this technique might seem to be that it traps on a number of calls that are safe to execute. However, another downside is that it potentially introduces a security leak for IA.

To see how, let's switch up our example a bit and suppose that, although instance IA internally defines $fsub to have type [] -> [tsub], instance IA only exports it with type [] -> [tsuper]. Using the technique for outcome 2, instance IB can (maliciously) execute call_indirect[ -> tsub]($fsuper) and the call will succeed. That is, IB can use call_indirect to circumvent the narrowing IA did to its function's signature. At best, that means IB is dependent on an aspect of IA that is not guaranteed by IA's signature. At worst, IB can use this to access internal state that IA might have intentionally been concealing.
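
The two checks discussed in this section can be compared in a small Python sketch. Everything here is hypothetical: the PARENT table encodes only the assumed relation tsub <: tsuper, and is_sub_signature is one straightforward way to write the linear-time "sub-signature" test (contravariant inputs, covariant outputs), not a proposed implementation.

```python
from typing import Dict, Optional, Tuple

Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]

# Hypothetical hierarchy for the running example: tsub <: tsuper.
PARENT: Dict[str, Optional[str]] = {"tsub": "tsuper", "tsuper": None}


def is_subtype(t: str, u: str) -> bool:
    """Walk up the parent chain from t looking for u."""
    cur: Optional[str] = t
    while cur is not None:
        if cur == u:
            return True
        cur = PARENT.get(cur)
    return False


def is_sub_signature(defined: Signature, expected: Signature) -> bool:
    """True if a function defined with `defined` is safe to call at `expected`."""
    (d_in, d_out), (e_in, e_out) = defined, expected
    return (
        len(d_in) == len(e_in)
        and len(d_out) == len(e_out)
        # inputs: each expected input must be a subtype of the parameter type
        and all(is_subtype(e, d) for e, d in zip(e_in, d_in))
        # outputs: each defined result must be a subtype of the expected result
        and all(is_subtype(d, e) for d, e in zip(d_out, e_out))
    )


def equality_check(defined: Signature, expected: Signature) -> bool:
    """The check call_indirect performs today."""
    return defined == expected


fsub_defined = ((), ("tsub",))     # how IA defines $fsub
fsuper_public = ((), ("tsuper",))  # how IB sees it as $fsuper
```

In this model, is_sub_signature(fsub_defined, fsuper_public) holds while equality_check(fsub_defined, fsuper_public) does not, so the safe call traps under outcome 2. Conversely, equality_check(fsub_defined, ((), ("tsub",))) holds, which is the leak: IB can name the private signature and the call goes through.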

call_indirect and Type Imports

Now let's put subtyping aside and consider type imports. For convenience, I am going to talk about type imports, rather than just reference-type imports, but that detail is inconsequential. For the running example here, suppose the following:

  • module instance IC defines a type capability and exports the type but not its definition as $handle

  • module instance IC exports a function $do_stuff that was defined with type [capability] -> [] but exported with type [$handle] -> []

  • module MD imports a type $extern and a function $run with type [$extern] -> []

  • module instance ID is module MD instantiated with IC's exported $handle as $extern and with IC's exported $do_stuff as $run

What this example sets up is two modules where one module does stuff with the other module's values without knowing or being allowed to know what those values are. For example, this pattern is the planned basis for interacting with WASI.

Now let's suppose instance ID has managed to get a value e of type $extern and executes call_indirect[$extern -> ](ref.func($run), e). Here are the two outcomes that seem most plausible:

  1. The call succeeds because the expected signature and defined signature are compatible.
  2. The call traps because the two signatures are distinct.

Outcome 2 makes call_indirect pretty much useless with imported types. So for outcome 1, realize that the input type $extern is not the defined input type of $do_stuff (which instead is capability), so we would likely need to use one of two techniques to bridge this gap:

  1. For imported functions, have call_indirect compare with the import signature rather than the definition signature.
  2. Recognize that at run time the type $extern in instance ID represents capability.

If you prefer technique 1, realize that it once again won't work once we add Typed Function References. (The fundamental reason is the same as with subtyping, but it'd take even more text to illustrate the analog here.)

That leaves us with technique 2. Unfortunately, once again this presents a potential security issue. To see why, suppose ID is malicious and wants to get at the contents of $handle that IC had kept secret. Suppose further that ID has a good guess as to what $handle really represents, namely capability. ID can define the identity function $id_capability of type [capability] -> [capability]. Given a value e of type $extern, ID can then execute call_indirect[$extern -> capability](ref.func($id_capability), e). Using technique 2, this indirect call will succeed because $extern represents capability at run time, and ID will get the raw capability that e represents back. Similarly, given a value c of type capability, ID can execute call_indirect[capability -> $extern](ref.func($id_capability), c) to forge c into an $extern.
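
The forging scenario can be replayed in a few lines of Python. This is only a model under stated assumptions: resolve, ID_BINDINGS and the dictionary standing in for a capability value are invented for the sketch, and "technique 2" is modeled as substituting each imported type name with what it represents before the usual equality check.

```python
from typing import Dict, Tuple

Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]


class Trap(Exception):
    """Models a wasm trap."""


def resolve(sig: Signature, bindings: Dict[str, str]) -> Signature:
    """Replace imported type names with what they stand for at run time."""
    ins, outs = sig
    return (
        tuple(bindings.get(t, t) for t in ins),
        tuple(bindings.get(t, t) for t in outs),
    )


def call_indirect(expected, bindings, defined, body, *args):
    # Technique 2: compare signatures after resolving abstract types.
    if resolve(expected, bindings) != defined:
        raise Trap("signature mismatch")
    return body(*args)


# Instance ID was linked with $extern := capability (IC's $handle).
ID_BINDINGS = {"$extern": "capability"}

# ID's own function $id_capability : [capability] -> [capability]
ID_CAPABILITY_SIG = (("capability",), ("capability",))
identity = lambda x: (x,)

secret = {"fd": 3}  # stands in for a value e of type $extern

# Expected [$extern] -> [capability]: ID gets the raw capability back.
(raw,) = call_indirect(
    (("$extern",), ("capability",)), ID_BINDINGS, ID_CAPABILITY_SIG, identity, secret
)

# Expected [capability] -> [$extern]: ID forges an $extern from its own value.
(forged,) = call_indirect(
    (("capability",), ("$extern",)), ID_BINDINGS, ID_CAPABILITY_SIG, identity, {"fd": 0}
)
```

Both calls pass the check because, after resolution, the expected signature is literally [capability] -> [capability]; nothing in the run-time comparison remembers that ID was only supposed to know the type as $extern.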

Conclusion

Hopefully I've made it clear that call_indirect has a number of significant upcoming performance, semantic, and/or security/abstraction issues, ones that WebAssembly has been fortunate to have avoided so far. Unfortunately, due to call_indirect being part of core WebAssembly, these issues crosscut a number of proposals in progress. At the moment, I think it would be best to focus on the most pressing such proposal, Reference Types, where we need to decide whether or not to restrict call_indirect and ref.func to only numeric types for now, a restriction we might be able to relax depending on how we eventually end up solving the overarching issues with call_indirect.

(Sorry for the long post. I tried my best to explain complex interactions of cross-module compile-time-typing-meets-run-time-typing features and demonstrate the importance of those interactions as concisely as possible.)

@tlively
Member

tlively commented May 8, 2020

Thanks for this detailed writeup, Ross! I have a small question: in the "call_indirect and Type Imports" section you write,

If you prefer technique 1, realize that it once again won't work once we add Typed Function References.

Is this also subject to the caveat from the previous section that the problem is only present once we add variant subtyping to the typed function references?

@RossTate
Author

RossTate commented May 8, 2020

It is not. All the issues in the subtyping section are independent of type imports, and all the issues in the type imports section are independent of subtyping. With respect to the particular issue you're asking about, consider that a value of type ref ([] -> [capability]) can be returned by an exported function as a value of type ref ([] -> [$handle]), which then can be turned into a funcref and indirectly called to. Unlike with the exported function, this change-in-perspective of the value happens at run time rather than link time, so we cannot resolve it by comparing against the import signature since the function reference was never itself imported.

@ngzhian
Member

ngzhian commented May 8, 2020

module instance IC defines a type capability and exports the type but not its definition as $handle
How will this work? There needs to be something that connects capability and $handle so that IC will know how to deal with it?
Also based on https://github.com/WebAssembly/proposal-type-imports/blob/master/proposals/type-imports/Overview.md#exports, imported types are completely abstract. So even if $capability is exported, it is abstract. Perhaps I am misunderstanding something.

Similar question for the export of $do_stuff: "module instance IC exports a function $do_stuff that was defined with type [capability] -> [] but exported with type [$handle] -> []".

I can imagine some sort of subtyping relation used for this, e.g. if $capability <: $handle, then we can export $capability as $handle. But the start of this section it was mentioned to put subtyping aside, so I'm putting that aside... But I also thought a bit more about it:
If: $capability <: $handle, we can export $capability as $handle, but export ([$capability] -> []) as ([$handle] -> []) should "fail" because functions are contravariant in the argument.

@RossTate
Author

RossTate commented May 8, 2020

With type exports, a module specifies a signature, like type $handle; func $do_stuff_export : [$handle] -> [], and then instantiates the signature, like type $handle := capability; func $do_stuff_export := $do_stuff. (Ignore the specific syntax entirely.) The type-checker then checks "given that $handle represents capability in this module, is the export func $do_stuff_export := $do_stuff valid in this module?". Since the type of $do_stuff is [capability] -> [], its signature lines up exactly with that of $do_stuff_export after instantiating $handle with capability, so the check succeeds. (There's no subtyping involved here, just variable substitution.)
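
The substitution check just described fits in a few lines. A minimal Python sketch, assuming signatures are pairs of tuples of type names and that the helper names substitute and export_is_valid are invented for this illustration:

```python
from typing import Dict, Tuple

Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]


def substitute(sig: Signature, instantiation: Dict[str, str]) -> Signature:
    """Replace each abstract type name with the type it is instantiated with."""
    ins, outs = sig
    return (
        tuple(instantiation.get(t, t) for t in ins),
        tuple(instantiation.get(t, t) for t in outs),
    )


def export_is_valid(
    export_sig: Signature, defined_sig: Signature, instantiation: Dict[str, str]
) -> bool:
    # No subtyping here: after substitution the signatures must line up exactly.
    return substitute(export_sig, instantiation) == defined_sig


# type $handle := capability; func $do_stuff_export := $do_stuff
do_stuff_defined = (("capability",), ())
do_stuff_export = (("$handle",), ())
ok = export_is_valid(do_stuff_export, do_stuff_defined, {"$handle": "capability"})
```

Here ok is True inside the exporting module, where $handle is known to be capability; a checker without that instantiation has no way to relate the two signatures.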

Note, though, that the signature itself says nothing about $handle. This means that everyone else is supposed to treat $handle as an abstract type. That is, the signature intentionally abstracts details of the module's implementation, and everyone else is supposed to respect that abstraction. The purpose of this issue is to illustrate that call_indirect can be used to circumvent that abstraction.

Hopefully that clarifies the issue a bit!

@ngzhian
Member

ngzhian commented May 8, 2020

Thanks, that clarifies things. I'll have a question about the subtyping section (sorry to jump around):

I'm following the scenario where we want IB executing call_indirect[ -> tsuper](ref.func($fsuper)) to succeed, by having call_indirect "compare with the import signature rather than the definition signature."

And you added that (due to the typed function references) we also need

  1. Do an at-least-linear-time run-time check for subtype-compatibility of the expected signature and the definition signature.

Should this be compatibility between "expected signature and the import signature"? Since we are assuming we have made call_indirect compare import signature with expected signature.

If the compatibility is checked between expected and import, then later, call_indirect[ -> tsub]($fsuper) should fail.

@RossTate
Author

RossTate commented May 9, 2020

Techniques 1 and 2 are given as two orthogonal ways to get that indirect call to work. Unfortunately, technique 1 is incompatible with typed function references, and technique 2 is likely too expensive. So neither of these seems likely to work out. Thus the rest of the section considers what happens if we use neither of these and just stick with simple equality-comparison between the expected and defined signature. Sorry for the confusion; not having a planned semantics means I have to discuss three potential semantics.

@rossberg
Member

Be careful not to jump to too many conclusions. ;)

My assumption is that call_indirect should remain as fast as today and therefore only ever require a type equivalence test, no matter how much subtyping we add to the language. At the same time, the runtime check needs to be coherent with the static type system, i.e., it has to respect the subtyping relation.

Now, these seemingly contradictory requirements can actually be reconciled fairly easily, as long as we make sure that the types usable with call_indirect are always at the leaves of the subtype hierarchy.

One established way of enforcing that is to introduce the notion of exact types into the type system. An exact type has no subtypes, only supertypes, and we'd have (exact T) <: T.

With that, we can require the target type at call_indirect to be an exact type. Furthermore, the type of functions themselves naturally is that function's exact type already.

A module could also require exact types on function imports, if it wanted to make sure that it can only be instantiated with functions that succeed an intended runtime check.

That's all that's needed to ensure that the current implementation technique of a simple pointer comparison on canonicalised function types remains valid. It's independent of what other subtyping there is, or how fancy we make function subtyping. (FWIW, I discussed this with Luke a while ago, and planned to create a PR, but it was blocked on pending changes to the subtyping story, and which proposal that now moves to.)
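
For readers unfamiliar with the technique mentioned here, "pointer comparison on canonicalised function types" can be sketched in Python as interning plus an identity test. This is an illustration only: FuncType, canonical and signatures_match are hypothetical names, and real engines will differ in how and when they canonicalise.

```python
from typing import Dict, Tuple


class FuncType:
    """A function type; instances are only created through canonical()."""

    __slots__ = ("params", "results")

    def __init__(self, params: Tuple[str, ...], results: Tuple[str, ...]):
        self.params = params
        self.results = results


# Intern table: structurally equal types map to one shared object.
_CANONICAL: Dict[Tuple[Tuple[str, ...], Tuple[str, ...]], FuncType] = {}


def canonical(params, results) -> FuncType:
    key = (tuple(params), tuple(results))
    ft = _CANONICAL.get(key)
    if ft is None:
        ft = _CANONICAL[key] = FuncType(*key)
    return ft


def signatures_match(expected: FuncType, defined: FuncType) -> bool:
    # Constant-time identity ("pointer") comparison, independent of arity.
    return expected is defined
```

Under this model, canonical([], ["anyref"]) and canonical([], ["ref $T"]) are distinct objects, so a call expecting one traps on a function defined with the other regardless of any subtyping between the result types.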

(One downside is that refining a function definition to a subtype is no longer a backwards-compatible change in general, at least not if its exact type has been used anywhere. But that drawback is unavoidable under our constraints, regardless of how exactly we enforce them.)

A couple of asides:

The alternative is to leave call_indirect and ref.func as is.

AFAICS, it's not feasible to disallow ref.func on functions that involve reference types. That would severely cripple many use cases, i.e., everything involving first-class functions operating on externref (callbacks, hooks, etc.).

It is just possible that this means that, depending on the solution we arrive at, externref might not be instantiable like a true type import would be, and/or that externref might (ironically) not be able to have any supertypes (e.g. might not be able to be a subtype of anyref if we do eventually decide to add anyref).

Can you elaborate? I don't see the connection.

@RossTate
Author

Be careful not to jump to too many conclusions. ;)

I'm not sure what conclusion you're referring to. My stated conclusion is that there are a number of issues with call_indirect that we need to be aware of and should start planning for. You seem to be suggesting that these issues are inconsequential because you have a solution in mind. But that solution has not been reviewed or accepted by the CG, and we should not plan for it until it has. I specifically requested to not discuss solutions because it will take a while to evaluate and compare them and there are decisions we need to make before we have time to do those evaluations and comparisons properly. But, in order to prevent people forming the perception that this problem is solved and consequently avoid the pressing decision, I'll take a sec to quickly discuss your solution.

One established way of enforcing that is to introduce the notion of exact types into the type system.

Exact types are hardly an established solution. If anything, exact types have established problems that its proponents are still working to address. Interestingly, here is a thread where the TypeScript team originally saw how exact types of the form you're proposing could solve some problems, but then they eventually realized (https://github.com/microsoft/TypeScript/issues/12936#issuecomment-284590083) that exact types introduced more problems than they solved. (Note for context: that discussion was prompted by Flow's exact object types, which are not actually a form of exact type (in the theoretical sense) but instead simply disallow the object-analog of prefix subtyping.) I could imagine us replaying that thread here.

As an example of how these sorts of problems could play out for WebAssembly, suppose we had not deferred subtyping. The type of ref.null would be exact nullref using exact types. But exact nullref would not be a subtype of exact anyref. In fact, according to the usual semantics of exact types, likely no values would belong to exact anyref because likely no value's run-time type is exactly anyref. This would make call_indirect completely unusable for anyrefs.

Now maybe you have some different version of exact types in mind, but it would take a while to check that this different version somehow addresses the many open problems with exact types. So my point here is not to throw out this solution, but to recognize that it's not obvious that this is the solution and to not make decisions with that expectation.

Can you elaborate? I don't see the connection.

You're referencing a long sentence. Which part of it would you like me to elaborate on? One guess is that you might be missing the overall issue with call_indirect and type imports. Your exact-types suggestion only addresses problems with subtyping, but we established above that call_indirect has problems even without any subtyping.

That would severely cripple many use cases, i.e., everything involving first-class functions operating on externref (callbacks, hooks, etc.).

Yeah, so this is something I was hoping to get more information on. My understanding is that the primary use case of call_indirect is to support C/C++ function pointers and C++ virtual methods. My understanding is also that this use case is currently restricted to numeric signatures. I know of more potential uses of call_indirect, but as I mentioned I was suggesting a temporary restriction, so what matters is what are the current uses of call_indirect. Given that call_indirect still requires a table and index rather than simply a funcref, it doesn't seem particularly well designed for supporting callbacks. I didn't know if that was because currently it's not being used for this purpose.

Y'all know the code bases targeting this feature far better than I, so if y'all know of some real programs needing this functionality now, it'd be very helpful to provide a few examples of the usage patterns in need here. Besides being useful for figuring out if we need to support this functionality right now, if the functionality is needed now then these examples would be helpful in informing how best to quickly provide it while addressing the issues above.

@rossberg
Member

@RossTate:

If anything, exact types have established problems that its proponents are still working to address. Interestingly, here is a thread where the TypeScript team originally saw how exact types of the form you're proposing could solve some problems, but then they eventually realized that exact types introduced more problems than they solved. (Note for context: that discussion was prompted by Flow's exact object types, which are not actually a form of exact type (in the theoretical sense) but instead simply disallow the object-analog of prefix subtyping.) I could imagine us replaying that thread here.

The parentheses are key here. I'm not sure what exactly they have in mind in that thread, but it doesn't seem to be the same thing. Otherwise, statements like "it's assumed that a type T & U is always assignable to T, but this fails if T is an exact type" would make no sense (this doesn't fail, because T & U would be invalid or bottom). The other questions are mainly about pragmatics, i.e., where would a programmer want to use them (for objects), which don't apply in our case.

For low-level type systems, weren't exact types a crucial ingredient even in some of your own papers?

As an example of how these sorts of problems could play out for WebAssembly, suppose we had not deferred subtyping. The type of ref.null would be exact nullref using exact types. But exact nullref would not be a subtype of exact anyref.

No disagreement here. Not having subtypes is the purpose of exact types.

In fact, according to the usual semantics of exact types, likely no values would belong to exact anyref because likely no value's run-time type is exactly anyref.

Right, the combination (exact anyref) is not a useful type, given that the only purpose of anyref is to be a supertype. But why is that a problem?

This would make call_indirect completely unusable for anyrefs.

Are you sure that you aren't confusing levels now? A function type (exact (func ... -> anyref)) is perfectly useful. It's just not compatible with a type, say, (func ... -> (ref $T)). That is, exact prevents non-trivial subtyping on function types. But that's the whole point!

Maybe you are mixing up (exact (func ... -> anyref)) with (func ... -> exact anyref)? These are unrelated types.

Your exact-types suggestion only addresses problems with subtyping, but we established above that call_indirect has problems even without any subtyping.

You are somehow assuming that you'll be able to export a type without its definition as a means to define an abstract data type. Clearly, that approach doesn't work in the presence of dynamic type casts (call_indirect or otherwise). That's why I keep saying that we'll need newtype-style type abstraction, not ML-style type abstraction.
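
One possible reading of the distinction drawn here, sketched in Python: under ML-style abstraction the exported type is its definition at run time, while under newtype-style abstraction the export mints a fresh run-time identity. The function names and the "abstract#N" encoding are assumptions made for this sketch, not anything from a proposal.

```python
import itertools
from typing import Dict, Tuple

Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]
_fresh_ids = itertools.count()


def ml_style_export(definition: str) -> str:
    # At run time the abstract type simply is its definition.
    return definition


def newtype_style_export(definition: str) -> str:
    # At run time the abstract type has its own fresh identity.
    return f"abstract#{next(_fresh_ids)}"


def resolve(sig: Signature, bindings: Dict[str, str]) -> Signature:
    ins, outs = sig
    return (
        tuple(bindings.get(t, t) for t in ins),
        tuple(bindings.get(t, t) for t in outs),
    )


def signatures_match(expected: Signature, bindings: Dict[str, str], defined: Signature) -> bool:
    return resolve(expected, bindings) == defined


ID_CAPABILITY_SIG = (("capability",), ("capability",))
FORGE_SIG = (("capability",), ("$extern",))  # the forging call from the original post

ml_bindings = {"$extern": ml_style_export("capability")}
newtype_bindings = {"$extern": newtype_style_export("capability")}
```

In this model the forging signature matches under ml_bindings but not under newtype_bindings, while an honest call at [$extern] -> [] still matches a function whose run-time signature mentions the fresh identity.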

My understanding is that the primary use case of call_indirect is to support C/C++ function pointers

Yes, but that's not the sole use case of ref.func, which I was referring to, because you included it in your suggested restriction (maybe unnecessarily so?). In particular, there will be call_ref, which does not involve type checks.

@RossTate
Author

RossTate commented May 12, 2020

You are somehow assuming that you'll be able to export a type without its definition as a means to define an abstract data type. Clearly, that approach doesn't work in the presence of dynamic type casts (call_indirect or otherwise). That's why I keep saying that we'll need newtype-style type abstraction, not ML-style type abstraction.

Okay, so you seem to be agreeing that exact types do nothing to address the issue with call_indirect and type import. But you are also saying that there's no point in addressing that problem because it will be a problem anyways due to run-time casts. There is an easy way to prevent that problem: do not allow people to perform run-time casts on abstract types (unless the abstract type explicitly says it's castable). After all, it's an opaque type, so we should not be able to assume it exhibits the structure necessary to perform a cast. So even if there is a possibility that exact types address the subtyping problem, it is premature to disregard the other half of the problem.

As I said, every solution has tradeoffs. You seem to be presuming that your solution has only the tradeoffs you yourself have identified, and you seem to be presuming that the CG would prefer your solution over others. I, too, have a potential solution to this problem. It guarantees constant-time checks, is based on a technology already used in virtual machines, addresses all the problems here (I believe), doesn't require adding any new types, and actually adds additional functionality to WebAssembly with known applications. However, I am not presuming that it works as I expect and that I have not overlooked some shortcoming because you and others have not had a chance to look it over. I am also not presuming that the CG would prefer its tradeoffs over those of alternative options. Instead, I am trying to figure out what we can do to give us time to analyze the options so that the CG, rather than just me, can be the one who makes an informed decision on this cross-cutting topic.

In particular, there will be call_ref, which does not involve type checks.

The key word in your sentence is will. I am fully aware that there are applications of call_indirect with non-numeric types that people will want to have supported. And I expect that we will come up with a design that supports that functionality and addresses the issues above. But, as I said, ideally we can have some time to develop that design so that we're not quickly shipping a feature with cross-cutting implications before we have had a chance to investigate those implications. So my question is whether there are major programs that need that functionality now. If there are, there's no need to hypothesize; just point to some and illustrate how they currently rely on this functionality.

@tlively
Member

tlively commented May 12, 2020

You are somehow assuming that you'll be able to export a type without its definition as a means to define an abstract data type. Clearly, that approach doesn't work in the presence of dynamic type casts (call_indirect or otherwise). That's why I keep saying that we'll need newtype-style type abstraction, not ML-style type abstraction.

This seems like a fundamental issue to me. Is enabling the confidentiality of the definitions of exported types a goal of the type imports proposal? I gather from this thread that @RossTate thinks it should be a goal and @rossberg thinks it is not currently a goal. Let's discuss and agree on this question before discussing solutions so that we can all work from the same set of assumptions.

@rossberg
Member

@RossTate:

Okay, so you seem to be agreeing that exact types do nothing to address the issue with call_indirect and type import.

Yes, if by that you mean the question of how to add a feature for defining abstract data types. There are a number of ways how type abstraction can work consistently, but such a feature is further down the road.

The key word in your sentence is will. I am fully aware that there are applications of call_indirect with non-numeric types that people will want to have supported.

The call_ref instruction is in the function ref proposal, so fairly close, in any case before any potential abstract data type mechanism. Are you suggesting that we put it on hold until then?

@tlively:

Is enabling the confidentiality of the definitions of exported types a goal of the type imports proposal? I gather from this thread that @RossTate thinks it should be a goal and @rossberg thinks it is not currently a goal.

It is a goal, but an abstract data type mechanism is a separate feature. And such a mechanism must be designed such that it does not affect the design of imports. If it did, then we would be doing it very wrong -- abstraction has to be ensured at the definition site, not the use site. Fortunately, though, this is not rocket science, and the design space is fairly well-explored.

@tlively
Member

tlively commented May 12, 2020

Thanks, @rossberg, that makes sense. Adding abstraction primitives in a follow-up proposal after type imports and exports sounds fine to me, but it would be great if we could write down the details of how we plan to do that somewhere soon. The design of type imports and exports constrains and informs the design of abstract type imports and exports, so it is important that we have a good idea of how the abstraction will work down the road before we finalize the initial design.

@RossTate
Author

In addition to detailing that plan, since this issue with call_indirect is demonstrating that it affects pressing decisions, can you explain why you seem to be rejecting my suggestion that abstract types should not be castable (unless explicitly constrained to be castable)? They are opaque, so that suggestion seems to be in line with common practices of abstract types.

@rossberg
Member

@tlively, yes, agreed. Plus various other things I meant to write up for a while. Will do once I've worked through all the fallout from #69. ;)

@RossTate, because that would make abstract data types incompatible with casts. Just because I want to prevent others from seeing through an abstract type, I don't necessarily want to prevent them (or myself) from casting to an abstract type. Creating such a false dichotomy would break central use cases of casts. For example, of course I want to be able to pass a value of abstract type to a polymorphic function.

@RossTate
Author

@rossberg Can you clarify what this central use case you have in mind is? My best guess at interpreting your example is trivially solvable, but maybe you mean something else.

@rossberg
Member

@RossTate, consider polymorphic functions. Short of Wasm generics, when they are compiled using up/down casts from anyref, it should be possible to use them with values of abstract type like any other, without extra wrapping into other objects. You generally want to be able to treat values of abstract type like any other.

@RossTate
Author

RossTate commented May 13, 2020

Okay, let's consider polymorphic functions, and let's suppose the imported type is Handle:

  1. Java has polymorphic functions. Its polymorphic functions expect all (Java reference) values to be objects. In particular, they must have a v-table. A Java module using Handle will likely specify a Java class CHandle possibly implementing interfaces. The instances of this class will have a (wasm-level) member of type Handle and a v-table that provides function pointers to the implementations of various class and interface methods. When given to a surface-level polymorphic function, which at the wasm level is just a function on objects, the module can use the same mechanism it uses for casting to other classes to cast to CHandle.
  2. OCaml has polymorphic functions. Its polymorphic functions expect all OCaml values to support physical equality. Because wasm can't reason about OCaml's type safety, its polymorphic functions will also likely need to make heavy use of casts. A specialized casting structure would likely make this more efficient. For either of these reasons, an OCaml module would likely specify an algebraic data type or record type THandle that fits in with these norms and has a (wasm-level) member of type Handle. Its polymorphic functions would then cast OCaml values to THandle just like they would to any other algebraic data type or record type.

In other words, because modules rely on norms about how surface-level values are represented in order to implement things like polymorphic functions, and abstract imported types like Handle do not satisfy these norms, wrapping of values is inevitable. This is the same reason one of the original applications for anyref was replaced by Interface Types. And we developed case studies demonstrating that anyref is not necessary, nor even well-suited, for supporting polymorphic functions.
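The wrapping pattern described in the Java case can be sketched roughly as follows. This is only a toy Python model, not wasm and not any real toolchain's output: `Handle`, `JavaObject`, `CHandle`, `describe`, and `as_chandle` are hypothetical names chosen for illustration, and the v-table is modeled as a plain dictionary.

```python
# Toy model (assumption: all names here are illustrative, not a real API).

class Handle:
    """Stands in for an abstract imported type: the module sees no structure."""
    __slots__ = ()


class JavaObject:
    """Surface-level norm: every Java reference value carries a v-table."""
    def __init__(self, vtable):
        self.vtable = vtable


class CHandle(JavaObject):
    """Language-level wrapper holding a (wasm-level) member of type Handle."""
    def __init__(self, handle):
        super().__init__(vtable={"describe": lambda self_: "CHandle"})
        self.handle = handle


def describe(obj):
    """A surface-level polymorphic function: it relies only on the object norm."""
    if not isinstance(obj, JavaObject):
        raise TypeError("value does not satisfy the object norm")
    return obj.vtable["describe"](obj)


def as_chandle(obj):
    """The ordinary class cast the module already uses; Handle is never cast."""
    if isinstance(obj, CHandle):
        return obj
    raise TypeError("not a CHandle")


raw = Handle()
wrapped = CHandle(raw)
result = describe(wrapped)               # works: the wrapper has a v-table
recovered = as_chandle(wrapped).handle   # the raw handle, recovered via a class cast
```

In this model a bare `Handle` cannot be passed to `describe` at all, which is the sense in which wrapping is forced by the language's own representation norms rather than by the abstraction mechanism.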

On the other hand, you've demonstrated that castable anyref can be used to circumvent static abstraction mechanisms. The abstraction-mechanism plan you alluded to is an attempt to patch this problem through dynamic abstraction mechanisms. But there are a number of problems with dynamic abstraction mechanisms. For example, one cannot export your i31ref type as one's abstract Handle type without the risk of other modules using anyref and casting to forge handles (e.g. to capabilities). Instead one has to jump through additional hoops and overheads that would be unnecessary if we instead just ensured standard static abstraction.
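The forging risk can be illustrated with a small Python model. It assumes, purely for illustration, that a downcast from the top type is checked against the run-time representation only; `I31Ref`, `open_secret`, `read`, and `cast_to_handle` are hypothetical names and do not correspond to any proposal text.

```python
# Toy model (assumption: casts inspect only the run-time representation).

class I31Ref:
    """Hypothetical stand-in for a small unboxed integer reference."""
    def __init__(self, value):
        self.value = value


# --- Exporting module: Handle is an I31Ref at run time, exported abstractly ---
_capabilities = {7: "secret-file"}


def open_secret():
    """Statically typed as returning Handle from the client's point of view."""
    return I31Ref(7)


def read(handle):
    """Trusts that every Handle it receives was created by this module."""
    return _capabilities[handle.value]


# --- Client module: was never given a handle ---
def cast_to_handle(any_value):
    """Downcast from the top type to Handle, checked by representation only."""
    if isinstance(any_value, I31Ref):
        return any_value
    raise TypeError("cast failed")


forged = cast_to_handle(I31Ref(7))  # built from scratch, never issued by the exporter
leaked = read(forged)               # the exporter cannot tell the difference
```

Under these assumptions the static abstraction of `Handle` gives no protection, because the client can reach the type through the cast instead of through the exporter's functions.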

Also, now that (I think) I better understand how you intend to use exact types, I realize that your intent addresses neither of the two major problems I called attention to with call_indirect and subtyping:

  1. It does not help make call_indirect respect subtyping (which I think you already said explicitly)
  2. It does not prevent call_indirect from being used to use an exported function with its defined signature rather than its exported signature.

So this is not a trivial problem to solve. That is why, given the time constraints, I would prefer to focus on evaluating how to give us time to solve it properly. I don't think it should be necessary to first have a discussion about whether anyref is worth throwing out static abstraction for. That's the sort of big discussion I was hoping to avoid in order to not delay things further.

@rossberg
Member

On the other hand, you've demonstrated that castable anyref can be used to circumvent static abstraction mechanisms.

Static type abstraction is insufficient in a language with dynamic type casts. Because static abstraction relies on parametricity, and casts break that. There is nothing new about that, papers have been written about it. Other abstraction mechanisms are needed in such a context.

Trying to work around that by restricting the use of abstract types defeats their purpose. Consider the WASI use case. It should not matter whether a WASI module and any type it exports is implemented by the host or in Wasm. If you arbitrarily restrict user-defined abstract types, then a Wasm implementation would no longer be interchangeable with a host implementation in general.

  1. It does not help make call_indirect respect subtyping (which I think you already said explicitly)

Huh? It is part of the subtyping rules, so it does by definition.

  2. It does not prevent call_indirect from being used to use an exported function with its defined signature rather than its exported signature.

I didn't say it did. I said that this one is not a problem with call_indirect itself, but a question of picking a suitable type abstraction mechanism for a language with casts.

As an aside, there is no compelling reason why compiling OCaml (or any similar language) should require the introduction of variant types. Even if that could be slightly faster in theory (which I doubt would be the case in current-gen engines, more likely the contrary), variant types are a significant complication that should not be necessary for the MVP. I don't quite share your appetite for premature complexity. ;)

Re equality on functions: there are languages, such as Haskell or SML, that do not support that, so might benefit directly from func refs. OCaml throws for structural equality and explicitly has implementation-defined behaviour for the physical one. It is left open whether that allows always returning false or throwing for functions, but either might well be sufficient in practice, and worth exploring before committing to costly extra wrapping.

[As a meta comment, I would really appreciate if you toned down your lecturing and perhaps considered the idea that this is a world where, maybe, the set of competent people is not singleton and that traces of brains have occasionally been applied before.]

@RossTate
Author

As a meta comment, I would really appreciate if you toned down your lecturing

Heard.

and perhaps considered the idea that this is a world where, maybe, the set of competent people is not singleton and that traces of brains have occasionally been applied before.

My advice here is based on consultation with multiple experts.

Static type abstraction is insufficient in a language with dynamic type casts. Because static abstraction relies on parametricity, and casts break that. There is nothing new about that, papers have been written about it. Other abstraction mechanisms are needed in such a context.

These experts I have consulted with include authors of some of said papers.

Now, as an attempt to check that I have correctly synthesized their advice, I just e-mailed another author of some of said papers, one with whom I have not discussed this topic before. Here is what I asked:

Suppose I have a polymorphic function f(...). My typed language has (subsumptive) subtyping and explicit casting. However, a cast from t1 to t2 only type checks if t2 is a subtype of t1. Suppose type variables like X by default have no subtypes or supertypes (besides themselves of course). Would you expect f to be relationally parametric with respect to X?

Here was their response:

Yes, I think this would be parametric since the only ability this gives you is to write casts on X that are equivalent to an identity function, which is already relationally parametric.

This is in line with my advice. Now, of course, this is a simplification of the problem at hand, but we have made an effort to investigate the problem more specifically to WebAssembly, and so far our exploration has suggested that this expectation continues to hold even at the scale of WebAssembly except for call_indirect, hence this issue.

Note that the theorems you are referring to apply to languages in which all values are castable. This observation is where we got the idea to restrict castability.
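The restricted cast rule from the e-mailed question can be written down as a tiny checker. This is a sketch under stated assumptions: the type names and the `DIRECT_SUPERTYPES` table are invented for illustration, and `X` plays the role of a type variable with no subtypes or supertypes besides itself.

```python
# Toy type checker for the cast rule "a cast from t1 to t2 type-checks
# only if t2 is a subtype of t1". All type names are hypothetical.

DIRECT_SUPERTYPES = {
    "Object": set(),
    "Animal": {"Object"},
    "Cat": {"Animal"},
    "X": set(),  # a type variable: related to nothing but itself
}


def is_subtype(sub, sup):
    """Reflexive, transitive subtyping over the declarations above."""
    if sub == sup:
        return True
    return any(is_subtype(parent, sup) for parent in DIRECT_SUPERTYPES[sub])


def cast_type_checks(source, target):
    """Accept a cast from `source` to `target` only if target <: source."""
    return is_subtype(target, source)


# Enumerate every well-typed cast that mentions the type variable X.
allowed_casts_on_x = [
    (source, target)
    for source in DIRECT_SUPERTYPES
    for target in DIRECT_SUPERTYPES
    if "X" in (source, target) and cast_type_checks(source, target)
]
```

In this model the only well-typed cast involving `X` is the one from `X` to `X`, which matches the quoted response that such casts are equivalent to an identity function.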

Consider the WASI use case.

I do not understand the claims you are making. We have considered the WASI use case. By we, I am including multiple experts in security and even specifically capability-based security.

As a meta comment, I would really appreciate not needing to appeal to authority or to the CG to have my suggestions heard. I suggested that restricting casts would enable ensuring static parametricity even in the presence of casts. You immediately disregarded that suggestion, appealing to prior papers to justify that dismissal. Yet when I offered this same suggestion to an author of those papers, they immediately came to the same conclusion as I did and as you could have. Before that, I suggested that evaluating potential solutions would be a long process. You disregarded that suggestion, insisting that you (all on your own) had solved the problem, pulling us both into this long conversation. It is extremely difficult to make progress, and to keep from getting frustrated, when one's suggestions are repeatedly dismissed so casually. (I should clarify that I am not trying to dismiss your suggestion as a possible solution here; I am trying to demonstrate that it is not the only solution and so should be evaluated alongside various others.)

@lukewagner
Member

lukewagner commented May 15, 2020

I think having a detailed design nailed down and scrutinized that addresses the concerns raised in this issue is important and timely: I don't actually think abstract types should be considered a farther-away feature; WASI needs them now-ish.

I also have hopes that exact+newtype can address the concerns, but I agree that we can't simply bet the farm on this hunch at this point in time by prematurely committing to a design when we (shortly) ship reference types. We need time to have a proper discussion about it.

That being said, I don't see the hazard with allowing externref in call_indirect signatures in the reference-types proposal. Yes, if a module exports an externref value (as a const global or returning it from a function ...), we haven't nailed down whether we can downcast that externref. But call_indirect isn't downcasting an externref; it's downcasting a funcref, and externref is in no different of a role than i32 w.r.t the funcref-type-equality check. Thus, in the absence of type imports, type exports, and subtyping at play in call_indirect, I don't see how we're committing to a new design choice that we haven't already committed to in the MVP.
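The point that externref sits in the same role as i32 in the signature check can be modeled as below. This is a simplified Python sketch of an equality-based check, not an engine implementation: `Func`, `Trap`, the `table` list, and the string encoding of value types are all assumptions made for illustration.

```python
# Toy model of call_indirect as a signature *equality* check
# (assumption: value types are compared as opaque names).

class Trap(Exception):
    """Models a wasm trap."""


class Func:
    def __init__(self, params, results, body):
        self.signature = (tuple(params), tuple(results))
        self.body = body


def call_indirect(table, index, expected_params, expected_results, *args):
    """Look up a function and call it only if its signature matches exactly."""
    func = table[index]
    expected = (tuple(expected_params), tuple(expected_results))
    if func.signature != expected:
        raise Trap("indirect call type mismatch")
    return func.body(*args)


table = [
    Func(["i32"], ["i32"], lambda x: x + 1),
    Func(["externref"], ["externref"], lambda ref: ref),
]

host_object = object()
incremented = call_indirect(table, 0, ["i32"], ["i32"], 41)
passed_through = call_indirect(table, 1, ["externref"], ["externref"], host_object)
```

The check never looks inside the externref value itself; it only compares the name `externref` in the two signatures, exactly as it compares `i32`.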

If there isn't a hazard, perhaps we could dial down this intense discussion to a less-intense discussion in the Type Imports proposal (where I still think we should include proper abstract type support)?

@RossTate
Author

Sure. I think it's a good idea to examine if there's a hazard or not.

Regarding WASI, the design is still very much in flux, but one option that still seems viable is to use something like i31ref for its "handles", say because it does not require dynamic memory allocation. WASI may decide upon other options, but the point is no one knows at present, and it would be nice for decisions made now not to affect such decisions down the line.

Currently, externref is the only abstract type available, and so a WASI-based host would instantiate externref with i31ref (or whatever WASI "handles" are). But my understanding is that WASI wants to move its implementation into WebAssembly as much as possible in order to reduce host-dependent code. To facilitate this, at some point WASI systems might want to treat externref just like any other type import and instantiate it with WASI's abstract exported Handle type. But if Handle is i31ref, then the above implementation of call_indirect needed to enable it to work across module boundaries can also be used to let people forge handles via externref.

So one of my questions, that now I'm noticing was not stated clearly in my original post, is do people want externref to be instantiable just like other abstract type imports will be?

@tlively
Member

tlively commented May 15, 2020

So one of my questions, that now I'm noticing was not stated clearly in my original post, is do people want externref to be instantiable just like other abstract type imports will be?

Thanks for explicitly raising this question. FWIW, I have never understood externref to be instantiatable from inside a WebAssembly module. That implies host participation in virtualization if WASI wants to use externref as handles, but that seems ok to me, or at least seems a separable discussion.

@RossTate
Author

Hmm, let me see if I can clarify. I suspect you're already on board with a bunch of what follows, but it's easier for me to just start from scratch.

From the perspective of a wasm module, externref does not mean host reference. It is just an opaque type that the module knows nothing about. Rather, it is the conventions around externref that interpret it as a host reference. For example, the conventions of a module using externref to interact with the DOM would be apparent in the functions involving externref that the module imports, like parentNode : [externref] -> [externref] and childNode : [externref, i32] -> [externref]. The environment of the module, such as the host itself, is what actually gives the interpretation of externref as host references, and it provides implementations of the imported methods that corroborate that interpretation.

However, the environment of the module does not have to be the host and externref does not have to be host references. The environment could be another module that provides functionality for some type that looks like host references exhibiting the expected conventions. Let's suppose module E is the environment of module M, and that module M imports parentNode and childNode as above. Let's say E wants to use module M but wants to restrict M's access to the DOM, say because E has limited trust of M or because E wants to bound any bugs M might have and knows M's needs should not exceed these restrictions. What E could do is instantiate M with "MonitoredRef" as M's externref. Let's say that, in particular, E wants to give M DOM nodes but ensure that M doesn't walk higher up the DOM tree. Then E's MonitoredRef could be specifically ref (struct externref externref), where the second externref (from E's perspective) is the DOM node that M is operating on, but the first externref is an ancestor of that node that M is not allowed to walk up past. E could then instantiate M's parentNode such that it errs if these two references are the same. E itself would import its own parentNode and childNode functions, making E effectively a run-time monitor of DOM interactions.
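The run-time monitor that E provides can be sketched in Python as follows. This is a toy model only: `Node` stands in for a host DOM node, `host_parent_node` and `host_child_node` stand in for the functions E itself imports, and `MonitorError` is a hypothetical error type. Only `MonitoredRef`, `parentNode`, and `childNode` correspond to names in the description above.

```python
# Toy model of module E monitoring module M's DOM access.

class Node:
    """Hypothetical stand-in for a host DOM node."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


# Functions E imports from the real host (modeled directly here).
def host_parent_node(node):
    return node.parent


def host_child_node(node, index):
    return node.children[index]


class MonitorError(Exception):
    """Raised when M tries to walk above its permitted ancestor."""


class MonitoredRef:
    """What M sees as externref: (ancestor limit, current node)."""
    def __init__(self, limit, node):
        self.limit = limit
        self.node = node


# Functions E supplies to M in place of the host's parentNode/childNode.
def parent_node(ref):
    if ref.node is ref.limit:
        raise MonitorError("attempt to walk above the permitted ancestor")
    return MonitoredRef(ref.limit, host_parent_node(ref.node))


def child_node(ref, index):
    return MonitoredRef(ref.limit, host_child_node(ref.node, index))


root = Node("root")
section = Node("section", parent=root)
item = Node("item", parent=section)

start = MonitoredRef(limit=section, node=item)
at_section = parent_node(start)      # allowed: still within the subtree
back_down = child_node(at_section, 0)
```

From M's side nothing changes: it still calls functions of type `[externref] -> [externref]`, but a second call to `parent_node` on `at_section` fails instead of reaching `root`.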

Hopefully that was concrete enough to paint the right picture, while not too concrete to get lost in details. There are obviously a number of patterns like this. So I guess another way to phrase the question is, do we want externref to only ever represent exactly host references?

@tlively
Member

tlively commented May 15, 2020

The only part that sounds questionable to me is "what E could do is instantiate M with "MonitoredRef" as M's externref." I am not under the impression that there are plans to allow abstracting things to appear as externref in other modules. My understanding is that externref is not an abstraction tool at all.

@RossTate
Author

I don't know of any such plans either; I just also don't know if anyone had considered the option. That is, should externref be a "primitive" type, e.g. like i32, or an "instantiable" type, e.g. like imported types?

In my original post, I indicated that either way is manageable. The tradeoff of going for the "primitive" interpretation is that externref is substantially less useful/composable than imported types, since the latter will support the use cases of externref as well as the patterns above. As such, "primitive" externref seems likely to become vestigial—only existing for backwards compatibility. But that seems unlikely to be particularly problematic, just a nuisance. The biggest problem I can see is that, just like the well-behavedness of call_indirect on numeric types works out because they have no supertypes, the well-behavedness of call_indirect may end up depending on externref having no supertypes as well.

@lukewagner
Member

Ah hah, yes, this explains the difference in understanding: I agree with @tlively that externref is not abstract at all and there is no notion of "instantiating externref with a type", and I think we can feel pretty confident about this going forward. (Since externref is a primitive type, as opposed to an explicitly-declared type parameter, it's not clear how one could even attempt to instantiate it on a per-module basis.)

In the absence of downcasts, this fact makes wasm near-useless for implementing/virtualizing WASI APIs which is why the plan for WASI has been to transition from i32 handles directly to Type Imports (and why I filed type-imports/#6, b/c we need even a lil bit more).

@RossTate
Author

Since externref is a primitive type, as opposed to an explicitly-declared type parameter, it's not clear how one could even attempt to instantiate it on a per-module basis.

When we add type imports, we can treat modules without type imports but with externref as having import type externref at the top. Everything would type-check just the same because, unlike other primitive types, externref has no associated primitive operations (beyond having a default value). But with that implicit import, now we can do things like virtualization, sandboxing, and run-time monitoring.

But before going back and forth on that, I think it would help to gauge if we're all on the same page about something. Let me know if you agree or disagree with the following statement and why: "Once type imports are available, modules have no reason to use externref and are more repurposable/composable if they use a type import instead."

@tlively
Member

tlively commented May 17, 2020

Let me know if you agree or disagree with the following statement and why: "Once type imports are available, modules have no reason to use externref and are more repurposable/composable if they use a type import instead."

I agree with this statement in the abstract. In practice I think externref will remain common in web contexts for referring to external JS objects because it requires no additional configuration at instantiation time. But that’s just a prediction and I wouldn’t mind if I turn out to be wrong and everyone switches to using type imports after all. The value of externref is that we can have it sooner than we can have richer mechanisms like type imports. I would rather keep externref simple and see it fall out of use than awkwardly shoehorn it into being something more powerful later on when there are more elegant alternatives.

@rossberg
Member

@tlively,

FWIW, I have never understood externref to be instantiatable from inside a WebAssembly module.

Right, the idea is that externref is the "primitive" type of foreign pointers. To abstract over the implementation details of a reference type, you'll need something else: something like anyref or a type import.

@lukewagner, I'd be on board with widening the scope of the type imports proposal, if that's preferable. But the trade-off is that the proposal would take longer. I was under the impression that type imports could be useful without in-language abstract data types, and hence desirable sooner rather than later. It would mean that you could already express and use WASI interfaces, but not yet implement them in Wasm. But either way is fine with me.

@rossberg
Member

@RossTate:

These experts I have consulted with include authors of some of said papers.

Excellent. Then I assume you have noticed that yours truly is the author of a couple of these papers himself, in case you're looking for more authority. :)

Here is what I asked:

Suppose I have a polymorphic function f(...). My typed language has (subsumptive) subtyping and explicit casting. However, a cast from t1 to t2 only type checks if t2 is a subtype of t1. Suppose type variables like X by default have no subtypes or supertypes (besides themselves of course). Would you expect f to be relationally parametric with respect to X?

Sigh. I'd give the same reply to that specific question. But this question embodies several specific assumptions, e.g. about the nature of casts, and about a rather unusual distinction between bounded and unbounded quantification that rarely exists in any programming language. And I'd suppose that's for a reason.

When I said "static type abstraction is insufficient" then I didn't mean that it isn't technically possible (of course it is), but that it isn't practically adequate. In practice, you don't want a bifurcation between type abstraction and subtyping/castability (or between parametric and non-parametric types), because that would artificially break composition based on casts.

I do not understand the claims you are making.

If you receive a value of abstract type then you might still want to forget its exact type, e.g. to put it into some kind of union, and later recover it through a downcast. You might want to do that for the same reason you may want that for any other reference type. Type abstraction shouldn't get in the way of certain usage patterns that are valid with regular types of the same sort.

Your answer seems to be: so what, wrap everything into auxiliary types at respective use sites, e.g. into variants. But that could imply substantial wrapping/unwrapping overhead, it requires more complex type system features, and it is more complicated to use.

I think this is what several of our disagreements come down to: whether the MVP should support unions of reference types, or whether it should require the introduction of and encoding with explicit variant types. For better or worse, unions are a natural match for the heap interface of typical engines, and they are easy and cheap to support today. Variants not so much, they are a much more researchy approach that likely would induce extra overhead and less predictable performance, at least in existing engines. And I'm saying that as a type systems person that much prefers variants over unions in other circumstances, such as user-facing languages. ;)

As a meta comment, I would really appreciate not needing to appeal to authority or to the CG to have my suggestions heard.

May I kindly suggest that conversations about various proposals might work better if started by asking respective champions about things that aren't clear, e.g. specific rationales or future plans (which aren't always obvious or written up yet), before assuming the absence of an answer and making broad assertions and suggestions based on these assumptions?

@RossTate
Author

Excellent. Then I assume you have noticed that yours truly is the author of a couple of these papers himself, in case you're looking for more authority. :)

Yes, which makes it extremely problematic when you suggest that there are papers claiming my suggestion does not work, even though you know that my suggestion specifically addresses the conditions those claims were made under.

In practice, you don't want a bifurcation between type abstraction and subtyping/castability (or between parametric and non-parametric types), because that would artificially break composition based on casts.

This is an opinion, not a fact (making it something perfectly reasonable for us to disagree on). I would say that there are no language-agnostic industry typed assembly languages for multi-language systems, and so it is impossible to make claims about practice. This is something that deserves a thorough (separate) discussion. For that discussion, it would help for you to first provide some detailed case studies so that the CG can compare the tradeoffs.

May I kindly suggest that conversations about various proposals might work better if started by asking respective champions about things that aren't clear, e.g. specific rationales or future plans (which aren't always obvious or written up yet), before assuming the absence of an answer and making broad assertions and suggestions based on these assumptions?

WebAssembly/proposal-type-imports#4, WebAssembly/proposal-type-imports#6, and WebAssembly/proposal-type-imports#7 each essentially asked for more specifics on this plan. The last of these punts the issue to GC, but WebAssembly/gc#86 points out that the current GC proposal does not in fact support dynamic abstraction mechanisms.

On the meta level, we were asked to put this discussion aside and focus on the topic at hand. I found @tlively's response to my question very helpful. I am actually quite interested in getting specifically your thoughts on that question.

@rossberg
Member

@RossTate:

I found @tlively's response to my question very helpful. I am actually quite interested in getting specifically your thoughts on that question.

Hm, I thought I already commented on it above. Or do you mean something else?

@RossTate
Author

Nope. I thought that comment might have implied agreement with his response, but I wanted to confirm first. Thanks!

@lukewagner, what are your thoughts?

@lukewagner
Member

I agree with the above that externref will forever be a primitive type and not retroactively reinterpreted as a type parameter. I think, given that, reference-types is good-to-go as-is.

I'd like to take @rossberg up on the offer to expand the scope of the Type Imports proposal so that it covers the ability for wasm to implement abstract types. Once we can nail that down, I think it'll unblock further discussion around function references and subtyping.

@RossTate
Author

Awesome. Then we're all on the same page (and I, too, think @tlively provides a nice summary of the tradeoffs involved and the rationale for a decision).

So externref will not be instantiable, and modules looking for that extra flexibility of type imports will be expected to convert to type imports when the feature is released. It occurs to me that, to make that transition smooth, we'll likely need to make it so that (some?) type imports are instantiated by externref by default if no instantiation is provided.

And I also would like to take up the offer to expand the scope of Type Imports. Many of the major applications of type imports need abstraction, so it seems natural to me for abstraction to be part of that proposal.

In the meantime, although we've addressed the pressing question about externref, what to do more generally about call_indirect is still unresolved, albeit with some useful discussion on how it might be resolved, so I'll still leave the issue open.

Thanks!

5 participants