How to initialize STACKTOP? #5771

Closed
pepyakin opened this issue Nov 12, 2017 · 28 comments

Comments

@pepyakin

I'm working in a non-JS environment, so before executing the wasm module I need to provide an initial STACKTOP.

The problem is that STACKTOP depends on STATICTOP, i.e. it should sit just after the static data (plus alignment). AFAIK, this value is only available in the JS runtime.

How should I initialize STACKTOP?

@pepyakin pepyakin changed the title from "How to calculate STATICTOP?" to "How to initialize STACKTOP?" Nov 12, 2017
@kripken
Member

kripken commented Nov 12, 2017

Is this a wasm side module? If so, then it's a dynamic library, and has a dylink section. You can see code for reading it and creating room for the heap and stack here. Basically, the section tells you how much room you need for memory (and table). So the JS loading code should reserve that memory and the stack, and pass those in.
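
As a rough illustration of reading that section outside the Emscripten loader, here is a minimal sketch that walks a wasm binary, finds a custom section by name, and decodes the first varuint32 of its payload. It assumes the section is named "dylink" and that its payload begins with the memory size; the helper names readVarUint, findCustomSection and readDylinkMemorySize are made up for this example, and the remaining fields should be checked against the convention your toolchain actually emits.

```javascript
// Decode an unsigned LEB128 integer at `pos`; returns [value, nextPos].
function readVarUint(bytes, pos) {
  let result = 0;
  let shift = 0;
  for (;;) {
    const byte = bytes[pos++];
    result |= (byte & 0x7f) << shift;
    if ((byte & 0x80) === 0) break;
    shift += 7;
  }
  return [result >>> 0, pos];
}

// Return the payload (after the name) of the first custom section called
// `name`, or null if the binary has no such section.
function findCustomSection(bytes, name) {
  let pos = 8; // skip the "\0asm" magic and the 4-byte version
  while (pos < bytes.length) {
    const id = bytes[pos++];
    const [size, bodyStart] = readVarUint(bytes, pos);
    const end = bodyStart + size;
    if (id === 0) { // section id 0 means "custom section"
      const [nameLen, nameStart] = readVarUint(bytes, bodyStart);
      const nameBytes = bytes.subarray(nameStart, nameStart + nameLen);
      if (String.fromCharCode(...nameBytes) === name) {
        return bytes.subarray(nameStart + nameLen, end);
      }
    }
    pos = end;
  }
  return null;
}

// Assumption: the "dylink" payload starts with the memory size (varuint32).
function readDylinkMemorySize(bytes) {
  const payload = findCustomSection(bytes, "dylink");
  return payload === null ? null : readVarUint(payload, 0)[0];
}
```

A host would call readDylinkMemorySize on the module bytes (a Uint8Array) and reserve at least that much memory before instantiating; a null result means the module carries no such section.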

@kripken
Member

kripken commented Nov 12, 2017

Actually I don't see stack allocation there, so the loader assumes the stack is handled internally by the module. But you can do it either way, as long as you coordinate.

@pepyakin
Author

That's helpful, thank you! I will try to assess this approach and report back.

@pepyakin
Author

pepyakin commented Nov 14, 2017

I realized that SIDE_MODULE=1 isn't the best choice for me. I need small binaries that execute in a minimal number of steps, but in SIDE_MODULE=1 mode every access to a global is offset by memoryBase. On the other hand, I don't need my binaries to be relocatable.

Is there a way to find out the staticbump in SIDE_MODULE=0 mode?

@kripken
Member

kripken commented Nov 15, 2017

Oh, yes, if you don't need it to be relocatable then maybe not SIDE_MODULE.

In normal mode, you can see it in the generated JS, in the form

STATICTOP = STATIC_BASE + X;

where X is a number. It's emitted from emscripten.py, which receives it as metadata from the backend.
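
For a host that cannot run the generated JS, one way to recover X is to scan the file for that assignment. The sketch below assumes the assignment appears literally in the form shown above; extractStaticBump is a hypothetical helper, not part of Emscripten, and a minified build or a different Emscripten version may emit something this pattern does not match.

```javascript
// Pull the static bump (the X in `STATICTOP = STATIC_BASE + X;`) out of
// Emscripten-generated JS. Returns null when the pattern is not found.
function extractStaticBump(jsSource) {
  const match = /STATICTOP\s*=\s*STATIC_BASE\s*\+\s*(\d+)\s*;/.exec(jsSource);
  return match === null ? null : Number(match[1]);
}

// Example with a made-up snippet of generated code.
const sample = "STATIC_BASE = GLOBAL_BASE;\nSTATICTOP = STATIC_BASE + 5936;\n";
console.log(extractStaticBump(sample)); // 5936
```

Returning null instead of throwing leaves it to the caller to decide whether a missing value is fatal or whether a default can be used.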

@pepyakin
Author

My experiments and research led me to conclude that I need something like the --allocate-stack option of s2wasm and the new backend. It looks like there is nothing like this in Emscripten with fastcomp...

@tjpalmer

Related, I'm trying to build a common js environment (with future interest in non-JS as well) for any emcc-generated executable. One of the issues I've hit is that "+ X". Is there some way to get it from the wasm module itself?

@kripken
Member

kripken commented Nov 27, 2017

In a wasm module with a dynamic linking section, there is info for that; see https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md (I believe it is also in the object file linking docs, https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md ). But a normal wasm module will not contain it; a toolchain needs to create a convention for it.

@tjpalmer

tjpalmer commented Nov 30, 2017

Thanks for the info. I want it in normal wasm. Could emscripten make a convention, such as a standard exported symbol?

@kripken
Member

kripken commented Nov 30, 2017

The wasm tool conventions docs mentioned above decided to add new wasm custom sections, for object and library files respectively. Emscripten supports the library one already, and will support object files soon I think (with the llvm wasm backend).

@tjpalmer

tjpalmer commented Dec 1, 2017

Would it be wiser for me then to wait on the upcoming features before continuing my project?

@kripken
Member

kripken commented Dec 1, 2017

I don't fully understand your project. Can you explain more in detail what you're doing, what the goals are, etc.?

@tjpalmer

tjpalmer commented Dec 2, 2017

Here's a project with braindump and some exported code from emsdk:

https://github.com/tjpalmer/dae

I don't know for sure what I can and can't accomplish with my time constraints, but even separate from any lofty goals, it seems like the ability to run wasm executables from a preset environment has value. (And a different use case from building a specially tailored page/site/app, which is where emsdk is focused now, but still a valuable use case, in my opinion.)

@tjpalmer

tjpalmer commented Dec 2, 2017

Actually, it occurs to me that I have a workaround. I probably should have an app descriptor for my purposes, anyway, and I can throw metadata in there, including any atinit function names that I know have been exported from the module.

But again, separate from anything I'm doing, I think it would be good for both wasm and emscripten to support self-contained executables, given some predefined environment.

@tjpalmer

tjpalmer commented Dec 2, 2017

Sorry for the spam, but if you follow my link above, I now have proof-of-concept demos that use a small descriptor, parsed from the emsdk-compiled JS, which holds the static bump and other key values.

I'll go with this workaround for my own needs for now, but it would be great not to have to parse the js file. Some exported metadata file from emsdk might be one option until standards exist.

@kripken
Member

kripken commented Dec 4, 2017

Interesting project. I don't think I fully understand it all, but I do get that you want a constrained/secure environment - and I guess knowing the app details is necessary for that.

Yeah, parsing the JS makes sense for now. We could also consider adding an option to export that metadata - static bump, list of global ctors, maybe other stuff I'm forgetting. Maybe like emcc --export-metadata=X which would write it to the specified file.

@tjpalmer

tjpalmer commented Dec 5, 2017

That would be great if you get a chance. If I make progress on my project, my current short list might get longer based on the info I find I need to extract. Thanks for all the feedback.

@pepyakin
Author

pepyakin commented Dec 5, 2017

Just my 2 cents on why I'm interested in this.

I'm working on a project integrating WebAssembly into a blockchain.
In a nutshell:
A blockchain can be viewed as a distributed database into which one can deploy a contract. A contract contains code and storage. The code has a single entrypoint function that takes a byte array as input and returns another byte array as output. The storage is accessed through special imported functions that read or write a word. One can send a message to a contract; the message may carry data, and the contract may return some data. Linear memory is transient and doesn't persist across invocations of the contract's code. The number of instructions a contract may execute is limited by a value passed in the message.
If you're interested, you can take a look at this example. This contract just tracks how many "tokens" each user (represented by some "address") holds.

Before execution of a contract starts, we need to initialize the execution environment, including setting up the stack for the contract. At the moment, we just hardcode STACK_TOP to an arbitrary high-enough address. But I believe that contracts should be self-contained and initialize the environment for themselves (which is why I think --allocate-stack might be a good solution).

In a real blockchain, users who want to execute some code actually pay for every instruction executed by a contract and for every resource it touches (i.e. memory or storage).
So it is highly desirable that wasm binaries are very compact and that functions execute in a minimal number of steps. (This is one reason why we don't want relocatable binaries.)

@rianhunter
Contributor

rianhunter commented Sep 14, 2018

@tjpalmer @kripken I've also recently run into this issue when analyzing emscripten-generated WASM modules. Is it possible to make STATIC_BUMP an exported property of the WASM file itself? Parsing the JS is not a good option for me. I'm happy to contribute the code if you are willing to accommodate this.

@kripken
Member

kripken commented Sep 15, 2018

@rianhunter Sounds good, I think it makes sense to add an option to export STATIC_BUMP (and maybe also a few others, like GLOBAL_BASE, which is where globals are allocated - in practice, the start of static memory). Let me know if you run into any issues implementing it.

One question is where to do it - implementing it in asm2wasm versus the llvm wasm backend would be a little different (in asm2wasm it would be in emscripten.py, in the llvm wasm backend probably in wasm-emscripten-finalize in binaryen). Probably not much work in either, though.

@rianhunter
Contributor

@kripken Probably best to implement it in both, right? The impression I get is that the llvm wasm backend is the future yet still not the default.

When it comes to modeling the protocol between the runtime and the module, I see STATIC_BUMP as similar to ___errno_location. The module depends on the runtime for ___setErrNo, yet the runtime depends on ___errno_location being defined by the module. Similarly, the module depends on the runtime providing STACKTOP, yet the runtime defines STACKTOP based on the STATIC_BUMP defined by the module. Does that make sense to you? Am I missing something?

@kripken
Member

kripken commented Sep 17, 2018

Probably best to implement it in both, right? The impression I get is that the llvm wasm backend is the future yet still not the default.

Yeah, exactly. Doing it in both is best, but if you can only do one, I'd do the llvm wasm backend.

The module depends on the runtime for ___setErrNo, yet the runtime depends on ___errno_location being defined by the module.

I think that's right - we have setErrNo in JS so that the JS filesystem code can call it. And that updates errno in the compiled C code using errno_location so that C code just looking at errno sees the proper value.

Similarly, the module depends on the runtime providing STACKTOP, yet the runtime defines STACKTOP based on the STATIC_BUMP defined by the module. Does that make sense to you?

I think that's right too. At runtime in JS we calculate the location for the stack, and then provide that to the module.
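
The calculation described here can be sketched as follows. This is a simplified model, not Emscripten's actual runtime code: it assumes the stack starts at the first aligned address after the static data and has a fixed size. The function name computeStackLayout, the 16-byte alignment, and the example values for the static base and stack size are all assumptions for illustration. The real runtime may reserve extra static slots before placing the stack, so its numbers can differ.

```javascript
// Round `addr` up to the next multiple of `alignment` (a power of two).
function alignUp(addr, alignment) {
  return (addr + alignment - 1) & ~(alignment - 1);
}

// Simplified memory layout: [static data][stack][dynamic memory...].
function computeStackLayout({ staticBase, staticBump, totalStack, alignment = 16 }) {
  const staticTop = staticBase + staticBump;       // end of static data
  const stackTop = alignUp(staticTop, alignment);  // initial STACKTOP
  const stackMax = stackTop + totalStack;          // end of the stack region
  const dynamicBase = alignUp(stackMax, alignment);
  return { staticTop, stackTop, stackMax, dynamicBase };
}

const layout = computeStackLayout({ staticBase: 1024, staticBump: 5930, totalStack: 65536 });
console.log(layout.stackTop); // 6960

// Hypothetical import object: the names env.STACKTOP and env.STACK_MAX are
// assumptions about what the module imports; check the module's import list.
const imports = { env: { STACKTOP: layout.stackTop, STACK_MAX: layout.stackMax } };
```

Keeping the layout computation separate from instantiation means a non-JS host can port the same arithmetic and supply the two values however its embedding API expects.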

It might be nice eventually to have an option for fixing the location of the stack at compile time. There was some discussion about it on the mailing list, regarding the ctor evaller.

@rianhunter
Contributor

Okay, I hacked together a workaround in my project for now and am furiously working there, but I will return to this!

@eira-fransham

eira-fransham commented Nov 22, 2018

This is a major issue in runwasm. I would much prefer to get the STACK_BUMP value from the wasm file's metadata; it prevents the program from unexpectedly segfaulting, etc. It would also be an acceptable solution to store the STACK_BUMP value in the wasm metadata but not migrate emscripten to actually read it (or write it to an asm.js module) until later.

@kripken
Member

kripken commented Nov 23, 2018

See #6457 (comment) for the bigger picture here. But that should not stop someone from adding metadata for STACK_BUMP etc. in the meantime.

@rianhunter
Contributor

See #7815 for pending fix

@rianhunter
Contributor

Fixed in 0d83546; pass -s EMIT_EMSCRIPTEN_METADATA to emcc to use it. I can't close but @pepyakin or @kripken can :)
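
For anyone consuming this from JS, a minimal sketch of reading the emitted values might look like the following. It assumes the flag writes a custom section named "emscripten_metadata" whose payload is a sequence of unsigned LEB128 integers; the function name readMetadataValues is made up for this example, and the meaning and order of the fields are not shown here because they depend on the Emscripten version, so check the source for the release you use.

```javascript
// Read every unsigned LEB128 value from the named custom section.
// Returns null when the module has no such section.
function readMetadataValues(wasmBytes, sectionName = "emscripten_metadata") {
  // Synchronous compilation is fine in Node; browsers may restrict it on
  // the main thread for larger modules.
  const module = new WebAssembly.Module(wasmBytes);
  const sections = WebAssembly.Module.customSections(module, sectionName);
  if (sections.length === 0) return null;

  const payload = new Uint8Array(sections[0]);
  const values = [];
  let pos = 0;
  while (pos < payload.length) {
    let value = 0;
    let shift = 0;
    let byte;
    do {
      byte = payload[pos++];
      value |= (byte & 0x7f) << shift;
      shift += 7;
    } while (byte & 0x80);
    values.push(value >>> 0);
  }
  return values;
}
```

WebAssembly.Module.customSections is part of the standard JS API, so this needs no binary parsing beyond the LEB128 decoding; a non-JS host would have to walk the section headers itself.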

@kripken
Member

kripken commented Jan 9, 2019

Thanks @rianhunter, closing.

@kripken kripken closed this as completed Jan 9, 2019