Abstract Types extension to WebAssembly's reference types proposal #4

Open
wants to merge 4 commits into
base: proposal-reference-types-master


@awendland awendland commented Jan 13, 2021

I don't expect this to be particularly useful to anyone, especially with the exciting work going on with the GC proposal and Type Imports proposal, but I wanted to share this abstract types language extension I implemented in case it is relevant to someone.

You can launch a Jupyter notebook with several examples and a modified Wasm interpreter implementing the following features by clicking the "launch Binder" badge.

Overview

I've implemented a small extension to the WebAssembly language that introduces abstract types (also known as "existential types" or "abstract data types" or "opaque references" or "nominal types"). I built this in the context of my undergraduate thesis which explored using WebAssembly as a multi-language platform (by enabling the various features needed for secure compilation).

This implementation of abstract types is based on OCaml's abstract types and loosely conforms with the ideas expressed by rossberg and RossTate in WebAssembly/proposal-type-imports#7:

  • Abstract types are owned by modules.
  • The module that creates the abstract types can manipulate them directly as their underlying types.
  • Abstract types become "sealed" once exported from the module, meaning that importing modules can't manipulate their values directly.
  • There is no sub-typing.
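
As a rough illustration of the ownership and sealing rules above, here is a small Python model. It is not part of the interpreter, and every name in it (Module, Sealed, seal, unseal, SealError) is hypothetical. It also models sealing with runtime wrappers purely for readability, whereas the description here suggests the extension enforces sealing during validation and linking.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Sealed:
    """Opaque handle handed to importing modules (hypothetical model)."""
    owner: str       # name of the module that defined the abstract type
    type_name: str   # exported name of the abstract type
    _payload: Any    # underlying value; only the owner is meant to read it


class SealError(Exception):
    """Raised when a non-owning module tries to open a sealed value."""


class Module:
    def __init__(self, name: str) -> None:
        self.name = name

    def seal(self, type_name: str, value: Any) -> Sealed:
        # The owning module wraps its underlying value when exporting it.
        return Sealed(self.name, type_name, value)

    def unseal(self, type_name: str, handle: Sealed) -> Any:
        # Only the owner may treat a sealed value as its underlying type.
        if handle.owner != self.name or handle.type_name != type_name:
            raise SealError(
                f"{self.name} cannot open {handle.owner}.{handle.type_name}"
            )
        return handle._payload


wasi = Module("wasi")      # defines and owns the abstract type
guest = Module("guest")    # only imports it
fd = wasi.seal("FileHandle", 3)
```

In this model, wasi.unseal("FileHandle", fd) gives back 3, while guest.unseal("FileHandle", fd) raises SealError, which corresponds to the unforgeable file handle use case listed below.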

These features allow WebAssembly to enforce higher-level abstractions such as:

  • Unforgeable file handles (e.g. in WASI)
  • Object types (i.e. allowing functions to only operate on a given Object)
  • Object references (i.e. unforgeable addresses, e.g. referring to a in let a = new Date(); a.getYear())

Usage

To work with these new abstract types, I've introduced 4 operators to the language. The syntax is verbose in order to ensure clarity in the operations being performed. It's likely that a more ergonomic syntax would be adopted, such as merging the abstype_new and abstype_sealed namespaces and referring to them with a single operator, as well as overloading the existing type instruction to support abstract types. For now, the syntax is as follows:

| Operator | Syntax | Description |
| --- | --- | --- |
| abstype_new | abstype_new [IDENTIFIER] value_type | Create a new abstract type around a given value_type (which can be another abstract type via abstype_sealed_ref) |
| abstype_sealed | abstype_sealed [IDENTIFIER] | Import a foreign abstract type. Always used within an import instruction, i.e. (import "mod" "id" (abstype_sealed [IDENTIFIER])) |
| abstype_new_ref | abstype_new_ref IDENTIFIER | Reference a local abstract type (i.e. one locally declared using abstype_new) |
| abstype_sealed_ref | abstype_sealed_ref IDENTIFIER | Reference an imported foreign abstract type (i.e. one imported via abstype_sealed) |

Abstract types manifest in two ways:

  • Local - Local abstract types are at play when abstype_new* instructions are used within a given module. These abstract types are "unwrapped" within the module, and are treated as their underlying value_types. In this way, local abstract types are more like type aliases. This allows abstract types to be constructed, and only take on their abstract nature when used in a separate module.
  • Foreign / Sealed - Foreign, or sealed, abstract types are present when abstype_sealed* instructions are being used. These abstract types are treated as opaque identifiers referencing the source module instance and the export statement. They are only treated as their underlying values upon program execution (i.e. after validation). Additionally, they do not have default values, so trying to immediately use a local with a sealed abstract type will fail. Instead, the local must be populated with a value provided by the sealed abstract type's source module.
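
The two manifestations can be sketched as a type "view" function. This is a minimal Python model under stated assumptions, not the interpreter's code: the class names Num, NewAbs and SealedAbs and the helpers view and has_default are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    name: str                  # a plain value type, e.g. "i32"


@dataclass(frozen=True)
class NewAbs:
    type_id: int               # identity of a local abstype_new definition
    underlying: "ValueType"    # the value_type it was created around


@dataclass(frozen=True)
class SealedAbs:
    type_id: int               # identity of an imported abstype_sealed


ValueType = Union[Num, NewAbs, SealedAbs]


def view(ty: ValueType) -> ValueType:
    """Type as seen by code inside the module that mentions `ty`."""
    # Local abstract types unwrap to their underlying type, like aliases.
    while isinstance(ty, NewAbs):
        ty = ty.underlying
    # Sealed types stay opaque; plain value types are returned unchanged.
    return ty


def has_default(ty: ValueType) -> bool:
    """In this model, only sealed types lack a default value."""
    return not isinstance(view(ty), SealedAbs)


token_local = NewAbs(0, Num("i32"))   # inside the defining module
token_foreign = SealedAbs(0)          # inside an importing module
```

Here view(token_local) is Num("i32"), so the defining module can use ordinary i32 instructions on it, while view(token_foreign) stays SealedAbs(0) and has_default(token_foreign) is False.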

Examples

You can see simple syntax/usage examples in the test file for this feature at test/core/abstract-types.wast.

For something more interesting, I've configured my awendland/2020-thesis repository to be runnable via Binder so that you can jump right into a web-based Jupyter notebook with this webassembly-spec-abstypes interpreter already available and the code in samples/ all runnable. Try it out with the "launch Binder" badge.

Implementation Details

I've created a PR (#4) showcasing the implementation on top of the reference interpreter's Reference Types Proposal branch.

The core changes for stack type checking were in interpreter/syntax/types.ml:

  • adding SealedAbsType of int32 to value_type
  • introducing a new layer on top of value_type called wrapped_value_types that could be a NewAbsType of value_type * int32 in addition to a straight value_type.

In both cases, the int32 value is a unique identifier for the abstract type. The types are "nominal": even if two abstract types have the same underlying type, they are distinct if they come from different definitions (i.e. different abstype_new operations). The test suite demonstrates this distinction.
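
The nominal behaviour can be illustrated with a short Python sketch. It assumes a simple counter as the source of identifiers; the names AbsType, abstype_new and same_type are hypothetical and only mirror the idea of the int32 tag, not the actual OCaml definitions.

```python
import itertools
from dataclasses import dataclass

# Hypothetical source of fresh identifiers, one per definition.
_fresh_id = itertools.count()


@dataclass(frozen=True)
class AbsType:
    type_id: int       # unique per definition (int32 in the interpreter)
    underlying: str    # underlying value type, e.g. "i32"


def abstype_new(underlying: str) -> AbsType:
    """Each call models one abstype_new definition with its own identity."""
    return AbsType(next(_fresh_id), underlying)


def same_type(a: AbsType, b: AbsType) -> bool:
    # Nominal equality: compare identities, never the underlying types.
    return a.type_id == b.type_id


meters = abstype_new("i32")
seconds = abstype_new("i32")
```

Both types wrap i32, yet same_type(meters, seconds) is False because they come from different definitions, while same_type(meters, meters) is True.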

The core changes for type checking module linking were implemented in a new module called interpreter/runtime/extern_types.ml that:

  • replaces extern_type_of in runtime/instance.ml
  • handles comparing abstract types, which may either be "resolved" or "unresolved" since, fundamentally, abstract types are based on Module (or Host) instances, not module definitions.
    • There are additional constructors (NameModuleRef, LocalModuleRef) which exist to assist with representing abstract types in contexts where modules haven't been instantiated yet, but these representations of abstract types have to be resolved before they can be properly compared.
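
The resolve-then-compare idea can be sketched in Python as follows. Only the constructor names NameModuleRef and LocalModuleRef come from the description above; their exact meaning, the InstanceRef class, the registry dictionary and the resolve and matches helpers are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class NameModuleRef:
    name: str            # module referred to by its registered name


@dataclass(frozen=True)
class LocalModuleRef:
    pass                 # assumed meaning: the module currently being defined


@dataclass(frozen=True)
class InstanceRef:
    instance_id: int     # hypothetical stable id of a module instance


ModuleRef = Union[NameModuleRef, LocalModuleRef, InstanceRef]


@dataclass(frozen=True)
class AbsTypeRef:
    module: ModuleRef
    export_name: str


def resolve(ref: AbsTypeRef, registry: Dict[str, int], current: int) -> AbsTypeRef:
    """Replace an unresolved module reference with a concrete instance."""
    module = ref.module
    if isinstance(module, NameModuleRef):
        module = InstanceRef(registry[module.name])
    elif isinstance(module, LocalModuleRef):
        module = InstanceRef(current)
    return AbsTypeRef(module, ref.export_name)


def matches(a: AbsTypeRef, b: AbsTypeRef) -> bool:
    """Compare two abstract types; both must already be resolved."""
    if not (isinstance(a.module, InstanceRef) and isinstance(b.module, InstanceRef)):
        raise ValueError("abstract types must be resolved before comparison")
    return a == b


registry = {"mod0": 7}                              # registered name -> instance id
exported = AbsTypeRef(LocalModuleRef(), "Token")    # as seen inside mod0
imported = AbsTypeRef(NameModuleRef("mod0"), "Token")
```

Calling matches on the unresolved references raises ValueError. After resolve(exported, registry, 7) and resolve(imported, registry, 9), both point at instance 7 and compare equal. A second instance of the same module definition would get a different id and would not match, which is the sense in which abstract types are tied to instances and not definitions.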

All uses of value_type (should) have been expanded to support abstract types.

Work in Progress

Several aspects of this implementation are unfinished (and likely to remain so given the exciting development going on with the GC Proposal and my interests moving elsewhere). I documented the issues I was aware of in the test suite; they are:

  • Enforcement of type sealing when a sealed/foreign abstract type is used as the value for another new abstract type. Third-party modules are able to use these double-sealed abstract types as if they were the original sealed abstract type and vice versa.
  • Enforcement of abstract types for globals. You can import a global as its underlying type and the module linking process will succeed just fine (it should have thrown a linking error saying "incompatible import type").

Additionally, I only implemented abstract types for the textual representation of WebAssembly, not the binary variant or the JS API. The binary variant should be a straightforward addition. The JS API will require various decisions to be made around how to enforce the abstraction across the JS boundary; hard problems that are getting compelling answers in the GC Proposal.

Related Discussions

This small language extension is related to much more exciting discussions going on elsewhere, such as:

It's also loosely related to much grander proposals that are in the works: GC, Type Imports, Interface Types.

Thanks for such great work on WebAssembly! I'm excited to see where the language goes.

Add several new instructions:
  * abstype_new: Construct a new abstract type
  * abstype_sealed: Import a foreign abstract type
  * abstype_new_ref: Reference a local abstract type (created by
      abstype_new)
  * abstype_sealed_ref: Reference a foreign abstract type (imported and
      named by abstype_sealed)

FIX module local usage of abstract types by "unwrapping" the abstract
    type to its underlying value type when used within its defining
    module.

FIX failure on func imports w/ abstypes b/c module_inst refs aren't
    stable identifiers by adding unique identifiers to
    module_inst (for equality checking)

TODO when dealing with either binary or JS variants, abstype_*
     operations will throw an error and crash execution.
…efaults

When a local/other value is created with a foreign abstype (i.e. using
abstype_sealed_ref), ensure that the real value is kept sealed, instead
of unwrapping it and using the underlying type's default. This ensures
that the abstype can't be forged by constructing a default value for the
underlying type. Instead, the user must set the local/other value using
an appropriate function provided by the abstype's module.

For example:

```
(module $mod0
  (abstype_new $Token i32)
  (export "Token" $Token)
  (func (export "useToken") (param (abstype_new_ref $Token))
    (; do something ;))
)
(register "mod0" $mod0)

(module $mod1
  (import "mod0" "Token" (abstype_sealed $Token))
  (import "mod0" "useToken"
    (func $useToken (param (abstype_sealed_ref $Token))))
  (func $f
    (local (abstype_sealed_ref $Token))
    (call $useToken (local.get 0)))
)
```

These modules will pass validation. However, the `call` instruction in
`$f` in `$mod1` will throw an error at runtime: `useToken` in `$mod0`
expects an `i32` value (since the `$Token` abstype is unwrapped there),
while `local 0` in `$f` holds the value `sealed{0}` because it hasn't
been initialized.
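
The failure described above can be modelled in a few lines of Python. This is a sketch under assumptions, not the interpreter's behaviour verbatim: the names I32, SealedPlaceholder, Trap, default_local and use_token are hypothetical, and reading the 0 in `sealed{0}` as a type identifier is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class I32:
    value: int


@dataclass(frozen=True)
class SealedPlaceholder:
    type_id: int   # assumption: the number shown inside sealed{...}

    def __str__(self) -> str:
        return f"sealed{{{self.type_id}}}"


class Trap(Exception):
    """Models the runtime error thrown by the call."""


def default_local(ty):
    """Initial value of a local; `ty` is "i32" or ("sealed", type_id)."""
    if ty == "i32":
        return I32(0)
    kind, type_id = ty
    if kind != "sealed":
        raise ValueError(f"unsupported type: {ty!r}")
    # Never fall back to the underlying type's default for sealed types.
    return SealedPlaceholder(type_id)


def use_token(arg):
    # Inside $mod0 the $Token abstype is unwrapped, so a real i32 is required.
    if not isinstance(arg, I32):
        raise Trap(f"expected i32, got {arg}")
    return arg.value


local0 = default_local(("sealed", 0))   # the uninitialized local in $f
```

In this model str(local0) is "sealed{0}" and use_token(local0) raises Trap, whereas a value that really came from the defining module, such as I32(5), is accepted.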
@awendland awendland added the Showcase PR exists just to showcase some code label Jan 13, 2021