Implement emscripten libc environment #163
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,114 @@ | ||
package emlibc | ||
Should this be internal? I imagine some embedded users would want to call

I was trying to decide how to handle that, so I put it in there as more of an afterthought. I may be overthinking it, but should we be laying the groundwork for supporting multiple 'environments' (EM, WASI, the other 15 competing specifications that are sure to come)? Should they be built in, like I've started, or should they themselves be external wasm files that use a common libc-like API that we develop internally? At the moment I'm implementing it as a ResolveFunc. I was thinking about making the ReadModule signature variadic, ReadModule(r io.Reader, resolvePath ...ResolveFunc) (*Module, error), so that someone could call wasm.ReadModule(buf,EMLibc,WASI,FileImporter,WAPM) and we go through each in turn to try to resolve the import, but I'm not sure that's the correct way to go about it. Some of that can be figured out later; I was mainly questioning whether I was hooking in at the right place.

Hmmmmm, good point ... We should have a think about this. I think the 'resolver' method is probably the right way to go. I'm not a fan of inventing our own internal API, as that's another abstraction layer to maintain and it might affect performance depending on the implementation.
||
|
||
import ( | ||
"errors" | ||
"fmt" | ||
"reflect" | ||
|
||
"github.com/go-interpreter/wagon/exec" | ||
|
||
"github.com/go-interpreter/wagon/wasm" | ||
) | ||
|
||
func ResolveEnv(name string) (*wasm.Module, error) { | ||
if name == "env" { | ||
return GetEnv(), nil | ||
} | ||
fmt.Println("tried resolve", name) | ||
return nil, errors.New("Not Found") | ||
} | ||
func clen(n []byte) int { | ||
for i := 0; i < len(n); i++ { | ||
if n[i] == 0 { | ||
return i | ||
} | ||
} | ||
return len(n) | ||
} | ||
func GetEnv() *wasm.Module { | ||
|
||
m := wasm.NewModule() | ||
print := func(proc *exec.Process, v int32) int32 { | ||
fmt.Printf("result = %v\n", v) | ||
return 0 | ||
} | ||
puts := func(proc *exec.Process, v int32) int32 { | ||
|
||
buf := []byte{} | ||
temp := make([]byte, 1) | ||
Maybe try

Yeah, I like that. I couldn't decide whether it would be better to have more access to the underlying []byte. At some point during this process we may need to implement a couple of other methods on exec.Process for memory management; that may be where we can implement things like alloc/free. We could also have a method that returns a reader, so we could use bufio.Readers for some of this.

Hmmmm. How do things like Rust/Go handle this? Does everyone who compiles wasm ship their own malloc/free implementation?

That's been one of the most confusing things about this as I have been learning WASM. From what I can gather, it currently ships inside the glue code produced by the compiler, so it's actually provided by the compiler 'runtime', like emscripten or LLVM.

var _free = Module["_free"] = function() {
  return Module["asm"]["_free"].apply(null, arguments)
};
var _main = Module["_main"] = function() {
  return Module["asm"]["_main"].apply(null, arguments)
};
var _malloc = Module["_malloc"] = function() {
  return Module["asm"]["_malloc"].apply(null, arguments)
};
var _memcpy = Module["_memcpy"] = function() {
  return Module["asm"]["_memcpy"].apply(null, arguments)
};
var _memset = Module["_memset"] = function() {
  return Module["asm"]["_memset"].apply(null, arguments)
};

and then it's called from the wasm:

call $_printf
i32.const 4
call $_malloc
local.set 4
local.get 1
local.get 4
i32.store
local.get 1

It really seems like this should have been part of the MVP spec to me, since it seems pretty basic, but it appears that is how it's done. It's an area of WASM I'm still trying to learn.

I've learned the most about wasm from playing around with https://github.com/intel/wasm-micro-runtime. To support emscripten (and LLVM) they have a libc wrapper, and it is the most concise place I've found for figuring out which calls are needed to support the compiler runtimes. In the following code you can see their "env" implementation. It covers the basics for libc calls from emscripten or LLVM, and most of the code I've tried against it works without issue. You can clearly see malloc and free operating inside the memory buffer. So that's what I'm basing my info on.

FWICT, wasm is intended just to implement a very minimal 'CPU', and design decisions like memory allocation are to be handled by the calling code. The only memory management features in the wasm specification are these two opcodes:

In all the wasm I've seen, malloc/free are implemented in wasm shipped by the application. Do any other wasm interpreters provide a 'libc' layer like this, or is the libc layer always shipped with the application code?

I agree with what you say about the spec. Yes, wasmer appears to provide an 'emscripten compatibility layer', and wasm-micro-runtime does as well. I think it is kind of a grey area at the moment, but the problem I see is that since the linear memory buffer is used by both the host and the wasm module, someone has to be authoritative. For example, if I call into a wasm module from the host and want to pass in a string, I allocate it in the buffer and send a pointer. If the module replies with another string, it basically does the same. So do we assume the linear memory is stateless, so that in each iteration the current 'owner' has full use of the buffer? If not, someone has to manage the memory. Since GC is planned post-MVP, I think the responsibility for memory management is best handled in the runtime host. Again, these are my very, very unqualified opinions.

One thing to point out: if you build for a browser, then the libc code is in the .js glue file generated by emscripten, so the browser does not directly provide it. In the case of non-browser runtimes, it appears to me they have provided their own glue code natively to support the emscripten compiler. That is why I mentioned doing the implementations as a library of .wasm files that could be imported rather than writing them natively in Go, but we would need to provide at least a minimal API to do that.

Finally found the wasmer code. I remembered seeing it, but it took me a minute.

Let's do that (as in, let's do what wasmer and the other native runtimes are doing, and provide an identical API).
||
for i := int(v); i < proc.MemSize(); i++ { | ||
_, err := proc.ReadAt(temp, int64(i)) | ||
if err != nil { | ||
fmt.Println(err) | ||
} | ||
if temp[0] == 0 { | ||
break | ||
} | ||
buf = append(buf, temp[0]) | ||
} | ||
fmt.Println(string(buf)) | ||
return 0 | ||
} | ||
m.Types = &wasm.SectionTypes{ | ||
Entries: []wasm.FunctionSig{ | ||
{ | ||
Form: 0, // value for the 'func' type constructor | ||
ParamTypes: []wasm.ValueType{wasm.ValueTypeI32}, | ||
ReturnTypes: []wasm.ValueType{wasm.ValueTypeI32}, | ||
}, | ||
}, | ||
} | ||
m.GlobalIndexSpace = []wasm.GlobalEntry{ | ||
{ | ||
Type: wasm.GlobalVar{ | ||
Type: wasm.ValueTypeI32, | ||
}, | ||
Init: []byte{65, 0, 11}, // 0x41 i32.const, 0 (LEB128 immediate), 0x0b end | ||
}, | ||
} | ||
// m.LinearMemoryIndexSpace = [][]byte{make([]byte, 256)} | ||
m.Memory = &wasm.SectionMemories{ | ||
Entries: []wasm.Memory{ | ||
{ | ||
Limits: wasm.ResizableLimits{Initial: 1}, | ||
}, | ||
}, | ||
} | ||
m.FunctionIndexSpace = []wasm.Function{ | ||
{ | ||
Sig: &m.Types.Entries[0], | ||
Host: reflect.ValueOf(print), | ||
Body: &wasm.FunctionBody{}, // create a dummy wasm body (the actual value will be taken from Host.) | ||
}, | ||
{ | ||
Sig: &m.Types.Entries[0], | ||
Host: reflect.ValueOf(puts), | ||
Body: &wasm.FunctionBody{}, // create a dummy wasm body (the actual value will be taken from Host.) | ||
}, | ||
} | ||
m.Export = &wasm.SectionExports{ | ||
Entries: map[string]wasm.ExportEntry{ | ||
"print": { | ||
FieldStr: "print", | ||
Kind: wasm.ExternalFunction, | ||
Index: 0, | ||
}, | ||
"_puts": { | ||
FieldStr: "_puts", | ||
Kind: wasm.ExternalFunction, | ||
Index: 1, | ||
}, | ||
"__memory_base": { | ||
FieldStr: "__memory_base", | ||
Kind: wasm.ExternalGlobal, | ||
Index: 0, | ||
}, | ||
"memory": { | ||
FieldStr: "memory", | ||
Kind: wasm.ExternalMemory, | ||
Index: 0, | ||
}, | ||
}, | ||
} | ||
return m | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
//go:generate emcc -Os src/puts.c -s SIDE_MODULE=1 -o puts.wasm -s TOTAL_MEMORY=65536 -s TOTAL_STACK=4096 | ||
|
||
package test |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
(module | ||
(type $FUNCSIG$ii (func (param i32) (result i32))) | ||
(data (global.get $__memory_base) "Hello") | ||
(import "env" "__memory_base" (global $__memory_base i32)) | ||
(import "env" "_puts" (func $_puts (param i32) (result i32))) | ||
(memory $memory 1) | ||
(global $STACKTOP (mut i32) (i32.const 0)) | ||
(global $STACK_MAX (mut i32) (i32.const 0)) | ||
(export "__post_instantiate" (func $__post_instantiate)) | ||
(export "_main" (func $_main)) | ||
(func $_main (; 1 ;) (; has Stack IR ;) (result i32) | ||
;;@ src/puts.c:5:0 | ||
(drop | ||
(call $_puts | ||
(global.get $__memory_base) | ||
) | ||
) | ||
;;@ src/puts.c:6:0 | ||
(i32.const 0) | ||
) | ||
(func $__post_instantiate (; 2 ;) (; has Stack IR ;) | ||
(global.set $STACKTOP | ||
(i32.add | ||
(global.get $__memory_base) | ||
(i32.const 16) | ||
) | ||
) | ||
(global.set $STACK_MAX | ||
(i32.add | ||
(global.get $STACKTOP) | ||
(i32.const 5242880) | ||
) | ||
) | ||
) | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#include <stdio.h> | ||
|
||
|
||
int main(){ | ||
puts("Hello"); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -144,7 +144,8 @@ func (m *Module) ExecInitExpr(expr []byte) (interface{}, error) { | |
if globalVar == nil { | ||
return nil, InvalidGlobalIndexError(index) | ||
} | ||
lastVal = globalVar.Type.Type | ||
return m.ExecInitExpr(globalVar.Init) | ||
Could this give us an infinite loop?

Yes, it could. It happened to me while playing with it. We could try to detect it. Do we need this here? I couldn't get it to work before, and if you just return nil, nil without an error, it panics in Module.populateLinearMemory(). If we really do want to return nil (as it did before), we can either specify an error to return if the stack is empty or deal with a possible nil in the error building. The problem comes from reflect.TypeOf(val).Kind() when val is nil.

We could modify But this is an NP problem, so we could probably leave it unaddressed.
||
// lastVal = globalVar.Type.Type | ||
case end: | ||
break | ||
default: | ||
|
@@ -155,7 +156,6 @@ func (m *Module) ExecInitExpr(expr []byte) (interface{}, error) { | |
if len(stack) == 0 { | ||
return nil, nil | ||
} | ||
|
||
v := stack[len(stack)-1] | ||
switch lastVal { | ||
case ValueTypeI32: | ||
|
Is the 'env' import reserved?
Could someone legitimately create a wasm file named 'env' and we break them?

I've not found it in the WASM spec. It looks to me like it's just a convention used internally by emscripten, and possibly adopted by LLVM in their code (I'll research more).
That is why it has to be optional.
The above was temporary, for the POC.

Ah gotcha :) plz move it to a flag or something before we merge.

Of course. Sorry, to be clear: this pull request was meant to start the conversation, not to be syntactically correct. I'll work on a better implementation now that we have nailed down a few things.

Ahhhh, my bad, I didn't realise :O
In that case LGTM.