Support for WebAssembly/WASM #418

Open
newtack opened this issue Dec 22, 2017 · 19 comments

@newtack

newtack commented Dec 22, 2017

I want to use gluon in WebAssembly. Is this possible now (doubtful since I saw that threads are used and WASM doesn't support threads yet) and if not what would need to change to make it support WASM?

@Marwes
Member

Marwes commented Dec 22, 2017

That's going to be a hard thing to do :) .

Yep, gluon uses threads, though these are gluon threads (not to be confused with OS threads; gluon's are more like coroutines, so perhaps I should rename them). Even if you omit threads, there are still a lot of other problems.

  • Large parts of gluon's standard library are just Rust code called through gluon's FFI https://github.com/gluon-lang/gluon/blob/master/vm/src/primitives.rs . In theory you ought to be able to compile these Rust functions to WASM as well and simply call them from the gluon-generated WASM, which would be a really cool solution to that problem.

  • Gluon has a GC but WASM does not. I guess this means that a GC needs to be generated in WebAssembly itself.

  • Gluon's type system may be a bit too advanced for WASM. Since WASM has no concept of generics, all generic code needs to be monomorphized. The problem here is that it is possible to define functions such as this:

let test f b c: (forall a . a -> a) -> b -> c  -> () =
    f b
    f c
    f ()

Here f is applied to values of type b, c and () (and we could pass any other type to f as well), so it is not possible to monomorphize f to work on only a single type. It is not impossible to generate WASM for the test function, but you have to muck around with tagged values, which is going to be a bit awkward to implement in WASM.
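To make the monomorphization problem concrete, here is a minimal Rust sketch. The `Value` enum, `id_value` and the Rust version of `test` are hypothetical and far simpler than anything in gluon; unlike the gluon `test`, this one returns the results so they can be inspected.

```rust
/// Hypothetical, simplified uniform value: every runtime value has the same shape.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Unit,
    Int(i64),
    Str(String),
}

/// A rank-1 generic function: Rust compiles one copy per concrete `T`
/// that it is used with (monomorphization).
fn id<T>(x: T) -> T {
    x
}

/// Rough analogue of gluon's `test`: `f` is called on three different types,
/// so no single monomorphic copy of `f` fits all three calls. With a uniform
/// `Value` representation, `f` has exactly one type, `Value -> Value`.
fn test(f: fn(Value) -> Value, b: Value, c: Value) -> Vec<Value> {
    vec![f(b), f(c), f(Value::Unit)]
}

/// The identity function over the uniform representation.
fn id_value(v: Value) -> Value {
    v
}
```

Calling `test(id_value, Value::Int(1), Value::Str("x".into()))` applies the same compiled `id_value` to an integer, a string and unit, which is the behaviour the tagged representation buys.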

There are probably more hard problems to solve than the points I outlined above. If I think of any I will update the list.

Given all the hurdles and the fact that most of the problems to be solved do not benefit gluon as an embeddable language, I am not looking to implement a WASM (or any other assembly-like) backend. It would certainly be an interesting project though, so I am open to helping out if anyone is interested in attempting this (or in making an LLVM backend for that matter, which might be even more useful). While I have only been listing problems so far, there is one thing that should help a lot when writing a backend: the compiler provides an extremely small Intermediate Representation (IR) as output after all typechecking is done, which should be very easy to translate into WASM, LLVM-IR, assembly etc.

gluon/vm/src/core/mod.rs

Lines 57 to 101 in 483dfef

#[derive(Clone, Debug, PartialEq)]
pub struct Closure<'a> {
    pub pos: BytePos,
    pub name: TypedIdent<Symbol>,
    pub args: Vec<TypedIdent<Symbol>>,
    pub expr: &'a Expr<'a>,
}

#[derive(Clone, Debug, PartialEq)]
pub enum Named<'a> {
    Recursive(Vec<Closure<'a>>),
    Expr(&'a Expr<'a>),
}

#[derive(Clone, Debug, PartialEq)]
pub struct LetBinding<'a> {
    pub name: TypedIdent<Symbol>,
    pub expr: Named<'a>,
    pub span_start: BytePos,
}

#[derive(Clone, Debug, PartialEq)]
pub enum Pattern {
    Constructor(TypedIdent<Symbol>, Vec<TypedIdent<Symbol>>),
    Record(Vec<(TypedIdent<Symbol>, Option<Symbol>)>),
    Ident(TypedIdent<Symbol>),
}

#[derive(Clone, Debug, PartialEq)]
pub struct Alternative<'a> {
    pub pattern: Pattern,
    pub expr: &'a Expr<'a>,
}

pub type CExpr<'a> = &'a Expr<'a>;

#[derive(Clone, Debug, PartialEq)]
pub enum Expr<'a> {
    Const(Literal, Span<BytePos>),
    Ident(TypedIdent<Symbol>, Span<BytePos>),
    Call(&'a Expr<'a>, &'a [Expr<'a>]),
    Data(TypedIdent<Symbol>, &'a [Expr<'a>], BytePos, ExpansionId),
    Let(LetBinding<'a>, &'a Expr<'a>),
    Match(&'a Expr<'a>, &'a [Alternative<'a>]),
}
So it is just the runtime that is problematic 🙄
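As a rough illustration of why a small IR is easy to translate, the sketch below lowers a heavily simplified, hypothetical cousin of the core `Expr` into WebAssembly text-format instructions. It is not gluon's IR: types, `Match`, `Data`, closures and local declarations are all left out, callees are restricted to plain names, and every value is assumed to be an `i64`.

```rust
/// Hypothetical, heavily simplified stand-in for the core IR.
enum Expr {
    Const(i64),
    Ident(String),
    /// Callee is restricted to a known function name in this sketch.
    Call(String, Vec<Expr>),
    /// `let name = value in body`
    Let(String, Box<Expr>, Box<Expr>),
}

/// Appends stack-machine instructions for `expr` to `out`.
/// Each expression leaves exactly one value on the stack.
fn emit(expr: &Expr, out: &mut Vec<String>) {
    match expr {
        Expr::Const(n) => out.push(format!("i64.const {}", n)),
        Expr::Ident(name) => out.push(format!("local.get ${}", name)),
        Expr::Call(callee, args) => {
            // Arguments are pushed left to right, then the call consumes them.
            for arg in args {
                emit(arg, out);
            }
            out.push(format!("call ${}", callee));
        }
        Expr::Let(name, value, body) => {
            emit(value, out);
            out.push(format!("local.set ${}", name));
            emit(body, out);
        }
    }
}

/// Convenience wrapper returning the emitted instruction list.
fn compile(expr: &Expr) -> Vec<String> {
    let mut out = Vec::new();
    emit(expr, &mut out);
    out
}
```

The tree walk maps almost one-to-one onto a stack machine, which is the sense in which the code generation is the easy part; the GC, FFI and value representation are not touched by this sketch at all.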

@newtack
Author

newtack commented Dec 22, 2017

Thanks for your fast response.

@Storyyeller

Regarding monomorphization, I don't understand what problems WASM presents that wouldn't already be an issue with Rust. When compiling to native Rust binaries, you already have to monomorphize everything. How is WASM different?

@Marwes
Member

Marwes commented Dec 30, 2017

@Storyyeller A function like this

let test f b c: (forall a . a -> a) -> b -> c  -> () =
    f b
    f c
    f ()

is impossible to write in Rust but you can do it in gluon. If you think about it, the argument f is a function that needs to be able to take ANY type. The only general way to handle this is to make sure that all values have the same representation which would basically make all gluon values be a tagged union (and this is indeed the same representation that the current interpreter uses).

gluon/vm/src/value.rs

Lines 296 to 335 in 0909139

pub enum Value {
    Byte(u8),
    Int(VmInt),
    Float(f64),
    String(#[cfg_attr(feature = "serde_derive", serde(deserialize_state))] GcStr),
    Tag(VmTag),
    Data(
        #[cfg_attr(feature = "serde_derive",
                   serde(deserialize_state_with = "::serialization::gc::deserialize_data"))]
        #[cfg_attr(feature = "serde_derive", serde(serialize_state))]
        GcPtr<DataStruct>,
    ),
    Array(
        #[cfg_attr(feature = "serde_derive",
                   serde(deserialize_state_with = "::serialization::gc::deserialize_array"))]
        #[cfg_attr(feature = "serde_derive", serde(serialize_state))]
        GcPtr<ValueArray>,
    ),
    Function(#[cfg_attr(feature = "serde_derive", serde(state))] GcPtr<ExternFunction>),
    Closure(
        #[cfg_attr(feature = "serde_derive", serde(state_with = "::serialization::closure"))]
        GcPtr<ClosureData>,
    ),
    PartialApplication(
        #[cfg_attr(feature = "serde_derive",
                   serde(deserialize_state_with = "::serialization::deserialize_application"))]
        #[cfg_attr(feature = "serde_derive", serde(serialize_state))]
        GcPtr<PartialApplicationData>,
    ),
    // TODO Implement serializing of userdata
    #[cfg_attr(feature = "serde_derive", serde(skip_deserializing))]
    Userdata(
        #[cfg_attr(feature = "serde_derive",
                   serde(serialize_with = "::serialization::serialize_userdata"))]
        GcPtr<Box<Userdata>>,
    ),
    #[cfg_attr(feature = "serde_derive", serde(skip_deserializing))]
    #[cfg_attr(feature = "serde_derive", serde(skip_serializing))]
    Thread(#[cfg_attr(feature = "serde_derive", serde(deserialize_state))] GcPtr<Thread>),
}

This uniform representation means some extra work when generating WASM/LLVM-IR/etc but it shouldn't be excessively bad. Having this extra tag might also make it possible to skip out on generating stack-maps for the garbage collector since this means that all values know if they are heap allocated or not.
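The point about stack maps can be sketched as follows. This is a toy model, not gluon's collector: the three-variant `Value`, the heap modelled as a `Vec` of field lists, and the `mark` function are all assumptions made for illustration.

```rust
/// Hypothetical tagged value: the variant itself says whether it is a heap reference.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    Int(i64),
    Float(f64),
    /// Index of an object in the toy heap below.
    HeapRef(usize),
}

/// Mark phase of a toy mark-and-sweep collector.
/// `heap[i]` holds the fields of object `i`. Roots are found by scanning the
/// stack directly: every slot is self-describing, so no stack map is needed.
/// Panics if a `HeapRef` index is out of bounds.
fn mark(stack: &[Value], heap: &[Vec<Value>]) -> Vec<bool> {
    let mut marked = vec![false; heap.len()];
    let mut worklist: Vec<usize> = Vec::new();

    // Root scan: only slots tagged as heap references are followed.
    for slot in stack {
        if let Value::HeapRef(index) = slot {
            worklist.push(*index);
        }
    }

    // Transitive closure over object fields.
    while let Some(index) = worklist.pop() {
        if marked[index] {
            continue;
        }
        marked[index] = true;
        for field in &heap[index] {
            if let Value::HeapRef(child) = field {
                worklist.push(*child);
            }
        }
    }
    marked
}
```

With untagged values the collector could not tell an integer slot from a pointer slot, which is exactly the information a stack map would otherwise have to supply.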

That said, even though tagged values make this possible, there is a tradeoff. To tag each value we either need an extra integer per value to store the tag (which costs memory and some speed), or we need to pack the tag into the value itself, sacrificing (fast) full-width i64/u64 (no cost in memory but maybe a slightly larger speed loss).
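The two options can be compared in a small sketch. `Separate` and `Word` are hypothetical names; the packed variant uses the lowest bit as the tag (1 for an immediate integer, 0 for a pointer), similar in spirit to how OCaml's runtime represents integers, and it assumes pointers are at least 2-byte aligned.

```rust
/// Option 1: a separate tag next to the payload. Simple, but larger than one word.
#[allow(dead_code)]
enum Separate {
    Int(i64),
    Float(f64),
}

/// Option 2: one 64-bit word with the tag packed into the lowest bit.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Word(u64);

impl Word {
    /// Stores `n` as `2n + 1`. Only 63 bits survive; the top bit of `n` is lost.
    fn from_int(n: i64) -> Word {
        Word(((n as u64) << 1) | 1)
    }

    /// Stores an address unchanged; its low bit must already be 0.
    fn from_ptr(addr: u64) -> Word {
        assert!(addr & 1 == 0, "pointer must be at least 2-byte aligned");
        Word(addr)
    }

    /// True if the word holds an immediate integer.
    fn is_int(self) -> bool {
        self.0 & 1 == 1
    }

    /// Recovers the integer with an arithmetic shift, or `None` for pointers.
    fn as_int(self) -> Option<i64> {
        if self.is_int() {
            Some((self.0 as i64) >> 1)
        } else {
            None
        }
    }
}
```

The extra shift and mask on every integer operation is the speed cost of packing, and the lost bit is why full-width i64/u64 are no longer available: `Word::from_int(i64::MAX)` does not round-trip.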

@Marwes
Member

Marwes commented Dec 30, 2017

Thinking about it, these kinds of functions that can't be monomorphized should be pretty uncommon. It might be possible to just return an error if a function that can't be monomorphized is encountered (to start with) and still be able to compile most real-world gluon code.

If/when it becomes necessary to compile these functions one could then generate extra code to tag and untag any values passed to and from these functions. It makes the compiler more complex but it should be doable.
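A minimal sketch of that tag/untag boundary, assuming a hypothetical two-variant `Value` and hand-written helpers standing in for what a compiler would generate:

```rust
/// Hypothetical uniform representation used only at polymorphic boundaries.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Float(f64),
}

/// Wraps a raw machine integer before it crosses into polymorphic code.
fn tag_int(n: i64) -> Value {
    Value::Int(n)
}

/// Unwraps the result on the way back. A type-checked program should never
/// hit the panic branch; it is here only to keep the sketch total.
fn untag_int(v: Value) -> i64 {
    match v {
        Value::Int(n) => n,
        other => panic!("expected Int, got {:?}", other),
    }
}

/// A function that cannot be monomorphized: it only sees tagged values.
fn apply_poly(f: fn(Value) -> Value, x: Value) -> Value {
    f(x)
}

/// Identity over the uniform representation.
fn identity(v: Value) -> Value {
    v
}

/// Monomorphized code works on raw `i64` and pays for tagging only at the call.
fn monomorphic_caller(n: i64) -> i64 {
    let tagged = tag_int(n);
    let result = apply_poly(identity, tagged);
    untag_int(result) + 1
}
```

Everything outside `apply_poly` keeps its unboxed representation, so the cost is confined to the uncommon functions that need it.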

@Storyyeller

Storyyeller commented Dec 30, 2017

I understand that higher ranked types can't be efficiently compiled to native code. My question is why this is more of a problem for WASM than it is for Rust. It seems to me like the issues should be the same either way.

@Marwes
Member

Marwes commented Dec 30, 2017

The problem here is the same for WASM/Rust/LLVM-IR/assembly. The only reason it is not a problem for Rust is that Rust's type system does not allow these kinds of functions to be written. If Rust's type system were extended to support it then it would have the same problem.

(Rust RFC for Higher-ranked-types rust-lang/rfcs#1481 )

@Storyyeller

I know that. I meant why is it not a problem when you're running the Gluon VM in Rust? Surely running Gluon in Rust and running it in WASM should be equivalent?

@Marwes
Member

Marwes commented Dec 30, 2017

@Storyyeller Are you talking about compiling the gluon interpreter that is now written in Rust into WASM using rustc's WASM support or are you talking about compiling gluon code into WASM?

If it is the former then I don't think there is anything preventing that. The only platform specific code that is needed is for the REPL (or possibly in one of gluon's dependencies such as tokio).

What I wrote above has only been about the latter, i.e. adding another backend which is capable of emitting WASM (or LLVM-IR etc) from the gluon compiler itself instead of the custom bytecode that the current interpreter uses.

gluon/vm/src/types.rs

Lines 17 to 117 in 0909139

pub enum Instruction {
    /// Push an integer to the stack
    PushInt(isize),
    /// Push a byte to the stack
    PushByte(u8),
    /// Push a float to the stack
    PushFloat(f64),
    /// Push a string to the stack by loading the string at `index` in the currently executing
    /// function
    PushString(VmIndex),
    /// Push a variable to the stack by loading the upvariable at `index` from the currently
    /// executing function
    PushUpVar(VmIndex),
    /// Push the value at `index`
    Push(VmIndex),
    /// Call a function by passing it `args` number of arguments. The function is at the index in
    /// the stack just before the arguments. After the call is all arguments are removed and the
    /// function is replaced by the result of the call.
    Call(VmIndex),
    /// Tailcalls a function, removing the current stack frame before calling it.
    /// See `Call`.
    TailCall(VmIndex),
    /// Constructs a data value tagged by `tag` by taking the top `args` values of the stack.
    Construct {
        /// The tag of the data
        tag: VmIndex,
        /// How many arguments that is taken from the stack to construct the data.
        args: VmIndex,
    },
    ConstructRecord {
        /// Index to the specification describing which fields this record contains
        record: VmIndex,
        /// How many arguments that is taken from the stack to construct the data.
        args: VmIndex,
    },
    /// Constructs an array containing `args` values.
    ConstructArray(VmIndex),
    /// Retrieves the field at `offset` of an object at the top of the stack. The result of the
    /// field access replaces the object on the stack.
    GetOffset(VmIndex),
    /// Retrieves the field of a polymorphic record by retrieving the string constant at `index`
    /// and using that to retrieve lookup the field. The result of the
    /// field access replaces the object on the stack.
    GetField(VmIndex),
    /// Splits a object, pushing all contained values to the stack.
    Split,
    /// Tests if the value at the top of the stack is tagged with `tag`. Pushes `True` if the tag
    /// matches, otherwise `False`
    TestTag(VmTag),
    /// Jumps to the instruction at `index` in the currently executing function.
    Jump(VmIndex),
    /// Jumps to the instruction at `index` in the currently executing function if `True` is at the
    /// top of the stack and pops that value.
    CJump(VmIndex),
    /// Pops the top `n` values from the stack.
    Pop(VmIndex),
    /// Pops the top value from the stack, then pops `n` more values, finally the first value is
    /// pushed back to the stack.
    Slide(VmIndex),
    /// Creates a closure with the function at `function_index` of the currently executing function
    /// and `upvars` upvariables popped from the top of the stack.
    MakeClosure {
        /// The index in the currently executing function which the function data is located at
        function_index: VmIndex,
        /// How many upvariables the closure contains
        upvars: VmIndex,
    },
    /// Creates a closure with the function at `function_index` of the currently executing
    /// function. The closure has room for `upvars` upvariables but these are not filled until the
    /// matching call to `ClosureClosure` is executed.
    NewClosure {
        /// The index in the currently executing function which the function data is located at
        function_index: VmIndex,
        /// How many upvariables the closure contains
        upvars: VmIndex,
    },
    /// Fills the previously allocated closure with `n` upvariables.
    CloseClosure(VmIndex),
    AddInt,
    SubtractInt,
    MultiplyInt,
    DivideInt,
    IntLT,
    IntEQ,
    AddByte,
    SubtractByte,
    MultiplyByte,
    DivideByte,
    ByteLT,
    ByteEQ,
    AddFloat,
    SubtractFloat,
    MultiplyFloat,
    DivideFloat,
    FloatLT,
    FloatEQ,
}
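To give a feel for how such a stack bytecode executes, here is a toy interpreter for a handful of the instructions. It is a sketch under strong assumptions: the stack holds plain `isize` values instead of gluon's `Value`, indices are `usize` rather than `VmIndex`, and the `run` function is hypothetical rather than taken from gluon's VM.

```rust
/// Hypothetical subset of the instruction set, operating on plain integers only.
#[derive(Clone, Copy, Debug)]
enum Instruction {
    PushInt(isize),
    AddInt,
    SubtractInt,
    MultiplyInt,
    Pop(usize),
    Slide(usize),
}

/// Runs `code` on an empty stack and returns the final stack.
/// Panics on stack underflow; a real VM would report an error instead.
fn run(code: &[Instruction]) -> Vec<isize> {
    let mut stack: Vec<isize> = Vec::new();
    for instr in code {
        match *instr {
            Instruction::PushInt(n) => stack.push(n),
            Instruction::AddInt | Instruction::SubtractInt | Instruction::MultiplyInt => {
                // Binary operators pop the right operand first, then the left.
                let rhs = stack.pop().expect("stack underflow");
                let lhs = stack.pop().expect("stack underflow");
                stack.push(match *instr {
                    Instruction::AddInt => lhs + rhs,
                    Instruction::SubtractInt => lhs - rhs,
                    _ => lhs * rhs,
                });
            }
            Instruction::Pop(n) => {
                let new_len = stack.len().checked_sub(n).expect("stack underflow");
                stack.truncate(new_len);
            }
            Instruction::Slide(n) => {
                // Keep the top value, drop the `n` values beneath it.
                let top = stack.pop().expect("stack underflow");
                let new_len = stack.len().checked_sub(n).expect("stack underflow");
                stack.truncate(new_len);
                stack.push(top);
            }
        }
    }
    stack
}
```

A WASM backend would replace this dispatch loop with generated code, but the stack discipline it has to reproduce is the same.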

@Storyyeller

Oh, I thought you were talking about the first one. Sorry for the confusion.

@Marwes
Member

Marwes commented Dec 31, 2017

@Storyyeller No worries 😆 .

Out of curiosity I tried compiling gluon to WASM. Compilation currently stops because https://crates.io/crates/iovec, which tokio-core needs, has no implementation for WASM. Even if that is fixed, there are probably more platform-specific things in tokio-core (mio for instance), so the best way to let gluon compile to WASM (using rustc/cargo) would be to make tokio-core optional. That should be possible at the cost of not being able to run all async code (only the futures crate may be used).

@Marwes
Member

Marwes commented Feb 12, 2018

@Storyyeller I made tokio_core optional, so gluon now actually compiles to WASM (about 3.5 MB). It still needs some boilerplate to deal with the pointers used in the exported C API, but in theory everything should work.
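For readers unfamiliar with how platform-specific code is kept out of a WASM build, the sketch below shows the general conditional-compilation pattern in Rust. It gates on the target architecture to stay self-contained; gluon gates on an optional dependency instead, and the function name `event_loop_available` is hypothetical.

```rust
/// Compiled only for wasm32 targets: no OS-backed event loop is assumed here.
#[cfg(target_arch = "wasm32")]
fn event_loop_available() -> bool {
    false
}

/// Compiled for every other target.
#[cfg(not(target_arch = "wasm32"))]
fn event_loop_available() -> bool {
    true
}

/// Callers see a single function and never mention the platform themselves.
fn describe_async_support() -> &'static str {
    if event_loop_available() {
        "full async support"
    } else {
        "futures only"
    }
}
```

Only one of the two `event_loop_available` definitions exists in any given build, so code that cannot compile for WASM never reaches the compiler on that target.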

https://travis-ci.org/gluon-lang/gluon/builds/340155165

@Zireael07

Any news?

@Marwes
Member

Marwes commented Feb 27, 2020

Not really. You can still compile the Rust code in the gluon crate to WASM https://travis-ci.org/gluon-lang/gluon/jobs/340155173 and run the interpreter in WASM.

For compiling gluon code directly to WASM, I did a small investigation into using https://github.com/bytecodealliance/cranelift to JIT compile, but I didn't take it further than compiling functions that operate directly on integers/floats (no closures, records, indirect function calls etc).

@Boscop

Boscop commented Jan 31, 2022

@Marwes It's not working, I'm getting this runtime error when using gluon in wasm. Any idea why? :)

[image: screenshot of the runtime error]

gluon = { version = "0.18", default-features = false, features = ["random"] }

nightly-2022-01-31
Using trunk to build.

EDIT: I get the same error when not using the random feature.

@Boscop

Boscop commented Feb 1, 2022

Also, another aspect of Wasm support: if a gluon script prints to stdout/stderr, it won't work. Is there a way to get all printed output from the script through a std channel or something, so that the host can display it in an appropriate way (e.g. as part of the wasm UI, log it to the browser console, or send it to the backend and print it to stdout/stderr there)?
How to do this? :)

@Zireael07

Zireael07 commented Feb 1, 2022

What I do in my own (non-gluon) project is this:
// A macro to provide println!(..)-style syntax for console.log logging.
#[macro_export]
macro_rules! log {
    ( $( $t:tt )* ) => {
        web_sys::console::log_1(&format!( $( $t )* ).into())
    }
}
This means every log! call prints to the web console.

@Boscop

Boscop commented Feb 1, 2022

@Zireael07 Sure, but it can't be used to redirect the output of gluon scripts.
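The general shape of what is being asked for can be sketched independently of gluon. Whether gluon exposes such a hook is not established in this thread; `script_print` and `run_script_capturing` are hypothetical stand-ins, and the sketch only shows the pattern of writing to a host-supplied sink instead of stdout.

```rust
use std::io::Write;

/// Hypothetical stand-in for a script's print primitive: it writes to whatever
/// sink the host hands in rather than to the process's stdout.
fn script_print(out: &mut dyn Write, text: &str) -> std::io::Result<()> {
    writeln!(out, "{}", text)
}

/// Host side: capture everything the "script" prints into an in-memory buffer.
/// The host can then show it in a UI, log it, or forward it elsewhere.
fn run_script_capturing() -> String {
    let mut buffer: Vec<u8> = Vec::new();
    script_print(&mut buffer, "hello from script").expect("writing to a Vec cannot fail");
    String::from_utf8(buffer).expect("script output was valid UTF-8")
}
```

In a browser build the captured string could then be passed to something like the `log!` macro above, which keeps the decision about where output goes entirely on the host side.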

@lilac

lilac commented Oct 31, 2024

(Quoting @Marwes's Dec 30, 2017 comment above, from "@Storyyeller A function like this" through the paragraph on the tagging tradeoff.)

OCaml's runtime also has a uniform value representation, so it seems gluon programs could be compiled to OCaml's IR. If so, we could easily build a gluon compiler and interpreter by using an IR like Malfunction, which is a thin abstraction over the OCaml compiler's Lambda IR. What do you think?
