Support for WebAssembly externref in non-web environment #103516

Open
kwerner8 opened this issue Oct 25, 2022 · 11 comments
Labels
A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@kwerner8

My goal is to compile a function written in Rust to WebAssembly that takes an externref as input and returns an externref as output.
I then want to call this function as an export in Wasmtime.

The WebAssembly Code in the wat format should look similar to this:

(func (export "func") (param externref) (result externref)
... WebAssembly Code ... )

I have seen that there exists wasm-bindgen for browser hosts.

Is there a way to produce externref as an input and output for exports of functions written in Rust that can be called in non-web environments like Wasmtime?

@CryZe
Contributor

CryZe commented Oct 25, 2022

I believe this is essentially blocked by LLVM support: https://reviews.llvm.org/D122215

@kwerner8
Author

Okay. Thank you very much!
Are there plans to enable this feature in the future?

@misalcedo

@CryZe now that the changes you linked are merged, are there any plans to add this to Rust?

@jyn514 jyn514 added the O-wasm Target: WASM (WebAssembly), http://webassembly.org/ label Apr 9, 2023
@jyn514
Member

jyn514 commented Apr 9, 2023

I am not familiar with WASM or externref. What does this feature do, and what support is needed from Rust to enable it? Will it need language changes or a new extern "foo" ABI?

@CryZe
Contributor

CryZe commented Apr 9, 2023

An externref is an opaque object that the WebAssembly engine can pass into wasm functions from the outside. So in the case of a browser those are actual JavaScript objects. WebAssembly can then pass them around and store them in a dedicated table for them and later retrieve them from the table to then call for example an external function that takes those as arguments (so that could be a DOM function that operates on JavaScript objects). Importantly externrefs must stay fully opaque, so it's impossible to write them into linear memory as otherwise you could observe their bytes and possibly modify those. Also you wouldn't know how big they are anyway. One would have to check how LLVM treats them and design a language concept around that.
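The "store them in a table and pass them back to an external function" pattern described above can be modeled in plain Rust. This is only a sketch under stated assumptions: `HostObject`, `Handle`, `ExternTable` and `host_describe` are hypothetical names invented for illustration, and an ordinary `Vec` stands in for the engine-managed wasm table. Real externref values would never be visible to guest code like this.

```rust
/// Hypothetical stand-in for a host object (for example a JavaScript value).
#[derive(Debug, PartialEq)]
pub struct HostObject(pub String);

/// Opaque handle: guest code only sees this index, never the object's bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Handle(usize);

/// Models the dedicated table that holds the references.
#[derive(Default)]
pub struct ExternTable {
    slots: Vec<Option<HostObject>>,
}

impl ExternTable {
    /// Stores an object and returns a handle to it, reusing a free slot if one exists.
    pub fn insert(&mut self, obj: HostObject) -> Handle {
        if let Some(i) = self.slots.iter().position(|s| s.is_none()) {
            self.slots[i] = Some(obj);
            Handle(i)
        } else {
            self.slots.push(Some(obj));
            Handle(self.slots.len() - 1)
        }
    }

    /// Looks up the object behind a handle, if the slot is still occupied.
    pub fn get(&self, h: Handle) -> Option<&HostObject> {
        self.slots.get(h.0).and_then(|s| s.as_ref())
    }

    /// Frees the slot and hands the object back to the caller.
    pub fn remove(&mut self, h: Handle) -> Option<HostObject> {
        self.slots.get_mut(h.0).and_then(|s| s.take())
    }
}

/// Hypothetical "external function" that takes a reference as its argument.
pub fn host_describe(table: &ExternTable, h: Handle) -> Option<String> {
    table.get(h).map(|o| format!("host object: {}", o.0))
}
```

The point of the model is that the guest side can copy, store and pass `Handle` values around, but only the table owner can turn one back into the object.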

@TethysSvensson
Contributor

Yes, this will probably require changing the ABI, but I don't think it will require adding a new one.

A bit of context

The reference types proposal was accepted into the wasm spec in February 2021.

LLVM support was added later in 2021. Clang also seems to mostly have support at this point.

The reference types proposal adds a few new things. The most relevant for this discussion are:

  • Two new value types which can be stored on the stack, in globals and in tables: funcref and externref.
  • Support for having multiple tables.

An externref is a completely opaque pointer, which is managed by the host. It cannot be used for anything except being passed around or given to a host function.

A funcref is a pointer to a function created either by the host or by wasm itself. It can be called from wasm.

See this Clang RFC for a more in-depth description of how LLVM implements this proposal.

Rust implementation

Before this proposal, there was only a single global table, which contained all of the function pointers. It was not possible to extract a function pointer from the table into a stack or global value.

This had a direct effect on the ABI. Because you could not represent function pointers directly, and because there was only a single table, function pointers were represented as indices into this table.

As an example this code

fn add(left: usize, right: usize) -> usize {
    left + right
}

#[no_mangle]
pub fn add_ptr() -> fn(usize, usize) -> usize {
    add
}

currently compiles to this wasm module:

(module
  (type (;0;) (func (param i32 i32) (result i32)))
  (type (;1;) (func (result i32)))
  (func $_ZN3foo3add17hc001cc2609bca236E (type 0) (param i32 i32) (result i32)
    local.get 1
    local.get 0
    i32.add)
  (func $add_ptr (type 1) (result i32)
    i32.const 1)
  (table (;0;) 2 2 funcref)
  (memory (;0;) 16)
  (global $__stack_pointer (mut i32) (i32.const 1048576))
  (global (;1;) i32 (i32.const 1048576))
  (global (;2;) i32 (i32.const 1048576))
  (export "memory" (memory 0))
  (export "add_ptr" (func $add_ptr))
  (export "__data_end" (global 1))
  (export "__heap_base" (global 2))
  (elem (;0;) (i32.const 1) func $_ZN3foo3add17hc001cc2609bca236E))

Specifically, the pointer to the add function is placed at index 1 in the table, and add_ptr returns this index.
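The index-based representation can be mimicked in ordinary Rust to make the mechanics concrete. This is a hedged model of the module shown above, not what rustc emits: `TABLE`, `BinOp` and `call_indirect` are hypothetical names, and a lookup failure is modeled as `None` where a real wasm engine would trap.

```rust
fn add(left: usize, right: usize) -> usize {
    left + right
}

/// Signature shared by every entry in this model table.
pub type BinOp = fn(usize, usize) -> usize;

/// Models `(table 2 2 funcref)` with `(elem (i32.const 1) func $add)`:
/// slot 0 is left empty, slot 1 holds `add`.
pub const TABLE: [Option<BinOp>; 2] = [None, Some(add)];

/// Models the compiled `add_ptr`, which returns the table index (`i32.const 1`)
/// rather than a first-class function reference.
pub fn add_ptr() -> u32 {
    1
}

/// Models an indirect call: look the index up in the table, then call the entry.
/// An empty or out-of-range slot yields `None` in this sketch.
pub fn call_indirect(index: u32, a: usize, b: usize) -> Option<usize> {
    let f = TABLE.get(index as usize).copied().flatten()?;
    Some(f(a, b))
}
```

Calling `call_indirect(add_ptr(), 2, 3)` goes through the table in the same way a caller of the exported `add_ptr` would have to. With a funcref return value, the index and the table lookup would not be part of the function's signature at all.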

In my opinion, this should ideally be changed. With the reference types proposal, it would be more appropriate to instead return a funcref value.

@jyn514
Member

jyn514 commented Apr 9, 2023

Ok. It sounds like this is purely an implementation concern - we can change the IR we send to LLVM and it will result in a speedup, without changes necessary to user code?

@jyn514 jyn514 added I-slow Issue: Problems and improvements with respect to performance of generated code. C-enhancement Category: An issue proposing an enhancement or a PR with one. A-codegen Area: Code generation labels Apr 9, 2023
@CryZe
Contributor

CryZe commented Apr 9, 2023

without changes necessary to user code?

No, it definitely is a language concern as an externref is a totally new kind of type with all sorts of corner cases that need to be resolved:

void foo(__externref_t x) {
    &x;
    struct { __externref_t y; } z = { .y = x };
}

<source>:2:5: error: cannot take address of WebAssembly reference
    &x;
    ^~
<source>:3:28: error: field has sizeless type '__externref_t'
    struct { __externref_t y; } z = { .y = x };
                           ^

Compiler Explorer

So it's a !Sized + !Ref type (the latter concept doesn't even exist, although not even sure if !Sized is the correct concept, as you can store it in variables directly and assign it just fine)

@jyn514 jyn514 added T-lang Relevant to the language team, which will review and decide on the PR/issue. and removed I-slow Issue: Problems and improvements with respect to performance of generated code. labels Apr 9, 2023
@jyn514
Member

jyn514 commented Apr 9, 2023

This probably needs some sort of RFC in that case.

@TimNN
Contributor

TimNN commented Oct 28, 2023

I have written a pre-RFC for WebAssembly Heap Type Support in Rust: https://internals.rust-lang.org/t/pre-rfc-webassembly-heap-types/19774

@adetaylor
Contributor

Hi everyone, we just proposed an RFC for arbitrary self types. The primary motivation is because for C++, Python and other interop we need to have smart pointer types which do not obey normal Rust semantics.

This doesn't help with the fundamental LLVM-based opaque thingy that you need for __externref_t but it might well help you wrap such things up into first-class smart pointer objects. Or, it might provide a nicer interim solution for wasm-bindgen:

pub struct WasmExternRef<T> {
   index_into_wasm_bindgens_big_table: usize,
   _phantom: std::marker::PhantomData<T>
}

impl<T> std::ops::Receiver for WasmExternRef<T> {   // the new bit as enabled by the RFC linked above
   type Target = T;
}

// Generated by wasm-bindgen
struct JSApi;
impl JSApi {
  fn js_method(self: WasmExternRef<Self>) {   // note the weird self type here
    // wasm-bindgen calls to javascript by accessing the index from the big table
  }
}

fn main() {
  let my_js_ref: WasmExternRef<JSApi> = todo!(); // obtain somehow
  my_js_ref.js_method();  // new bit enabled by RFC
}

I'm not sure how much of this is already possible with wasm-bindgen as I've never used it. And it's probably not as good as first-class __externref_t LLVM support. But maybe it enables wasm-bindgen-generated bindings to be more ergonomic.
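For comparison, here is a minimal sketch of what stable Rust already allows without the RFC, assuming the wrapper type and the bindings live in the same crate. The names `from_index` and `index_into_table` are hypothetical, and the method body only returns the stored index where a real binding would call into the host.

```rust
use std::marker::PhantomData;

/// Index-based stand-in for a host reference, as in the comment above.
pub struct WasmExternRef<T> {
    index_into_table: usize,
    _phantom: PhantomData<T>,
}

impl<T> WasmExternRef<T> {
    /// Hypothetical constructor; a real binding layer would hand these out.
    pub fn from_index(index: usize) -> Self {
        WasmExternRef {
            index_into_table: index,
            _phantom: PhantomData,
        }
    }
}

/// Marker type naming the foreign API.
pub struct JSApi;

/// Inherent impl on one concrete instantiation of the wrapper.
/// This compiles on stable, but only in the crate that defines `WasmExternRef`.
impl WasmExternRef<JSApi> {
    pub fn js_method(&self) -> usize {
        // A real binding would use this index to look up the host object.
        self.index_into_table
    }
}
```

The restriction to the defining crate is the part the arbitrary self types RFC would relax, since it lets the method be declared on `JSApi` itself while still being called on the wrapper.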

Thanks to pachi for spotting the possible link.

bors added a commit to rust-lang-ci/rust that referenced this issue Apr 9, 2024
…=wesleywiser

Stabilize Wasm target features that are in phase 4 and 5

This stabilizes the Wasm target features that are known to be working and in [phase 4 and 5](https://github.com/WebAssembly/proposals/tree/04fa8c810e1dc99ab399e41052a6e427ee988180).

Feature stabilized:
- [Non-trapping float-to-int conversions](https://github.com/WebAssembly/nontrapping-float-to-int-conversions)
- [Import/Export of Mutable Globals](https://github.com/WebAssembly/mutable-global)
- [Sign-extension operators](https://github.com/WebAssembly/sign-extension-ops)
- [Bulk memory operations](https://github.com/WebAssembly/bulk-memory-operations)
- [Extended Constant Expressions](https://github.com/WebAssembly/extended-const)

Features not stabilized:
- [Multi-value](https://github.com/WebAssembly/multi-value): requires rebuilding `std` rust-lang#73755.
- [Reference Types](https://github.com/WebAssembly/reference-types): no point stabilizing without rust-lang#103516.
- [Threads](https://github.com/webassembly/threads): requires rebuilding `std` rust-lang#77839.
- [Relaxed SIMD](https://github.com/WebAssembly/relaxed-simd): separate PR rust-lang#117468.
- [Multi Memory](https://github.com/WebAssembly/multi-memory): not implemented.

See rust-lang#117457 (comment) for more context.

Documentation: rust-lang/reference#1420
Tracking issue: rust-lang#44839
bors added a commit to rust-lang-ci/rust that referenced this issue Apr 21, 2024