Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppport executing wasm workloads with wasmer #548

Merged
merged 8 commits into from
Jan 24, 2022

Conversation

Furisto
Copy link
Collaborator

@Furisto Furisto commented Dec 19, 2021

See docs

@codecov-commenter
Copy link

codecov-commenter commented Dec 19, 2021

Codecov Report

Merging #548 (6fdc296) into main (de6385f) will increase coverage by 8.58%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main     #548      +/-   ##
==========================================
+ Coverage   61.18%   69.77%   +8.58%     
==========================================
  Files          85       84       -1     
  Lines       12553    10972    -1581     
==========================================
- Hits         7681     7656      -25     
+ Misses       4872     3316    -1556     

@tsturzl
Copy link
Collaborator

tsturzl commented Dec 21, 2021

This is really interesting. I'm wondering how you plan to distinguish between a container and a wasm workload, and how a high level runtime might integrate with this. This could be a really interesting alternative to something like Krustlet which could allow you to run wasm and container workloads on the same kubelet, whereas krustlet only supports wasm workloads on the hosts running it.

@Furisto
Copy link
Collaborator Author

Furisto commented Dec 22, 2021

@tsturzl This is based on the compat variant of the Wasm Image Specification. The config.json needs to have an annotation of either run.oci.handler or module.wasm.image/variant in order for us to detect that a wasm workload has be run. The linked spec also describes how to build an image that is compliant with the Wasm Image spec. Here is an example of how you can use it with CRI-O.

@Furisto Furisto force-pushed the wasm-wasmer branch 2 times, most recently from d4ca2cf to c99757a Compare January 1, 2022 22:25
@Furisto Furisto marked this pull request as ready for review January 4, 2022 15:30
@Furisto Furisto requested a review from utam0k January 4, 2022 15:30
@Furisto Furisto changed the title Draft: Suppport executing wasm workloads with wasmer Suppport executing wasm workloads with wasmer Jan 4, 2022
@utam0k
Copy link
Member

utam0k commented Jan 5, 2022

Wow! This is a very wonderful feature. I'm sorry but as It is hard for me to understand it because I am not familiar with wasm, I need some time to review this PR. Please give me some time to review 🙇

@utam0k utam0k added the enhancement New feature or request label Jan 5, 2022
@tsturzl
Copy link
Collaborator

tsturzl commented Jan 5, 2022

I will also review in addition to @utam0k. This is a pretty big change so it's probably good to have a few eyes on it. This is awesome work though.

tsturzl
tsturzl previously requested changes Jan 5, 2022
Copy link
Collaborator

@tsturzl tsturzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all this is really cool stuff, and it's awesome that you took the time and initiative to do this work. So thank you for that. I'm not entirely on board with using dynamic dispatch here. I think perhaps I'd consider refactoring to get rid of the CompositeExecutor and opt for a ExecutorManager that doesn't need to satisfy the Executor trait. This ExecutorManager would instead select hard coded Executors with a match block, and therefore only instantiate and allocate the Executor that is determined is actually needed. You can probably still use can_handle just don't have can_handle require a reference to self, in fact you can probably completely avoid the need to instantiate Executors and have all of the trait methods be static (no ref to self) just like we currently do with cgroups controllers, self is never used anyway (other than CompositeExecutor) and there's no point in instantiating these empty structs.

Let me know what you think of this. It's possible there's an argument for dynamic dispatch that I'm missing, or maybe others prefer that approach also. Even if we were to decide dynamic dispatch is best I'd say the CompositeExecutor should probably just be an ExecutorManager that doesn't satisfy the Executor trait and all Executors should just be static structs that never need to be instantiated.

It also needs to specifiy a valid .wasm (webassembly binary) or .wat (webassembly test) module as entrypoint for the container. If a wat module is specified it will be compiled to a wasm module by youki before it is executed. The module also needs to be available in the root filesystem of the container obviously.


```json
Copy link
Collaborator

@tsturzl tsturzl Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this tutorial be more in line with the crun example, or the example from wasm/spec-compat? They both seem to use buildah as a way to create a WASM image to run through a high level runtime. Maybe we could have both the youki only example and the example of how to build an image (or at least link to one)? I think the interest here will probably be running this through a high level runtime, whereas podman seems like a good option for the example. It would also be good to verify that this is working with some kind of high level runtime, and then give instructions on how to use this feature through that highlevel runtime.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I'd say it's probably worth providing a couple of links to helpful references here. Like the links I posted in my comment, and some of the references you linked in the PR comments.

fn name(&self) -> &str;
}
pub struct CompositeExecutor {
executors: Vec<Box<dyn Executor>>,
Copy link
Collaborator

@tsturzl tsturzl Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably don't need a Vector since it's statically sized, and you know the size. You could use an Array and set the size based on whether the wasm feature flag is present.

Also I'm not totally sure I agree with dynamic dispatch here. How many Executors are we actually going to add here? It might be worth saving the Box allocation. It might even be better to not have to load Executors you don't actually need.


impl Executor for CompositeExecutor {
fn exec(&self, spec: &Spec) -> Result<()> {
for executor in &self.executors {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think iterators would be more idiomatic and readable here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind you this loop may go away if we agree against dynamic dispatch.

fn exec(&self, spec: &Spec) -> Result<()> {
for executor in &self.executors {
if executor
.can_handle(spec)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'm not sure I'm totally on board with dynamic dispatch here. It seems weird to loop through all the executors and settle on the first one that satisfies the spec, what if more than one Executor were to match at some point in the future? It also seems like if a number of parameters or spec properties go into selecting the Executor then you'd be better served by static control flow like a match block, and moving the can_run logic into the top level, and instead of having a CompositeExecutor, you'd have a ExecutorManager that probably doesn't even need to satisfy the Executor trait. You're also always allocating a copy of every Executor Youki was compiled with into heap. If you figured out what Executor you need before instantiating an Executor things would be more efficient.

Consider this for example: You have 3 regular container executors, and 3 WASM executors. In the dynamic dispatch approach your worst case is you check all 6. If you had a match block that could narrow down between knowing if the user wanted a "regular" or "wasm" executor, then you could narrow down to 4 checks as your worst case. Checking if it's wasm or regular, then checking the 3 for either. I suppose with dynamic dispatch you could handle that by having a regular and wasm composite Executor, but then you're still instantiating and allocating Executors you don't actually need.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And of course, I'm not demanding the change, but perhaps it's worth the conversation. Maybe there's things I'm not considering or overlooking here, or maybe others find the dynamic dispatch approach to be nicer also.

executors: Vec<Box<dyn Executor>>,
}

impl Executor for CompositeExecutor {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit for having a CompositeExecutor? It seems the major benefit for an Executor is that is creates a common interface for Executors, could this maybe just be a ExecutorManager that doesn't need to satisfy Executor? I don't imagine we'll ever need the name function here, and that's just dead code.

Ok(())
}

fn can_handle(&self, spec: &Spec) -> Result<bool> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all the instances of can_handle Result isn't ever really needed, and always returns Ok. This seems like unnecessary indirection.


pub trait Executor {
/// Executes the workload
fn exec(&self, spec: &Spec) -> Result<()>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do any of the functions for Executor actually need a reference to self? The CompositeExecutor is the only one that uses it, and it probably doesn't even need to satisfy this trait. If none of the Executors needed a &self, you'd also avoid the need to ever instantiate Executors, which aside from the CompositeExecutor are all empty structs.

@tsturzl
Copy link
Collaborator

tsturzl commented Jan 5, 2022

@utam0k Could you also look through my review and provide your input?

@utam0k
Copy link
Member

utam0k commented Jan 8, 2022

@Furisto
As this feature is very interesting, I think that this should be written in README.md. Can I ask you to write?

@utam0k
Copy link
Member

utam0k commented Jan 8, 2022

First of all this is really cool stuff, and it's awesome that you took the time and initiative to do this work. So thank you for that. I'm not entirely on board with using dynamic dispatch here. I think perhaps I'd consider refactoring to get rid of the CompositeExecutor and opt for a ExecutorManager that doesn't need to satisfy the Executor trait. This ExecutorManager would instead select hard coded Executors with a match block, and therefore only instantiate and allocate the Executor that is determined is actually needed. You can probably still use can_handle just don't have can_handle require a reference to self, in fact you can probably completely avoid the need to instantiate Executors and have all of the trait methods be static (no ref to self) just like we currently do with cgroups controllers, self is never used anyway (other than CompositeExecutor) and there's no point in instantiating these empty structs.

Let me know what you think of this. It's possible there's an argument for dynamic dispatch that I'm missing, or maybe others prefer that approach also. Even if we were to decide dynamic dispatch is best I'd say the CompositeExecutor should probably just be an ExecutorManager that doesn't satisfy the Executor trait and all Executors should just be static structs that never need to be instantiated.

@Furisto
First of all, now this design is not bad at all. But I agree with @tsturzl's option because perhaps executor will only increase the number of types that can be adequately covered by hard coding. It would be better to have an ExecutorManger instead of CompositeExecutor. If you have a reason for choosing the current design, I'd like to hear it.

@utam0k
Copy link
Member

utam0k commented Jan 8, 2022

If so, may it be difficult to separate wasm-wasmer feature?

@Furisto
Copy link
Collaborator Author

Furisto commented Jan 8, 2022

@tsturzl @utam0k Instead of responding to individual comments I think it is better for me to explain what my current thinking around this feature is and where I want to go with this.

Quite a few webassembly runtimes exist nowadays with the most prominent ones being wasmer, wasmtime and wasmedge. These runtimes offer different features and make different tradeoffs (wasmedge for example currently does not support compiling .wat files while wasmer does). Therefore I plan to support different webassembly runtimes to accommodate different user requirements.

It is likely that the current method of activating a handler using a feature flag will go away as I would like to avoid that everyone has to build youki with the appropriate configuration for their specific use case. Statically linking support for all webassembly runtimes would be problematic as this would bloat the size of the youki binary considerably (even with just wasmer the size of the binary roughly doubles) and would not be of use to anyone who is not wanting to run webassembly workloads (which is still the majority for the foreseeable future). Therefore dependencies will probably linked dynamically in the future.

I am also thinking about allowing users to to specify the preferred order in which webassembly runtimes are asked to handle the workload in the config.json. It might even be possible that a configuration file located at a well known location can be provided that allows enables/disables certain handlers globally. This might be overkill though and we would want to check if other runtimes also plan to support such a feature.

Lastly one intended use case for this will be running hooks in a webassembly sandbox, so hooks will have to be adapted as well in the future.

What I want to say with this is: I would not focus to much on how exactly an executor is called because this is likely to change anyway. I would like to get this feature in first and then iterate on it with the knowledge gained from implementing support for further webassembly runtimes.

I welcome comments on the general direction of this feature.

@utam0k
Copy link
Member

utam0k commented Jan 10, 2022

@tsturzl @utam0k Instead of responding to individual comments I think it is better for me to explain what my current thinking around this feature is and where I want to go with this.

Quite a few webassembly runtimes exist nowadays with the most prominent ones being wasmer, wasmtime and wasmedge. These runtimes offer different features and make different tradeoffs (wasmedge for example currently does not support compiling .wat files while wasmer does). Therefore I plan to support different webassembly runtimes to accommodate different user requirements.

It is likely that the current method of activating a handler using a feature flag will go away as I would like to avoid that everyone has to build youki with the appropriate configuration for their specific use case. Statically linking support for all webassembly runtimes would be problematic as this would bloat the size of the youki binary considerably (even with just wasmer the size of the binary roughly doubles) and would not be of use to anyone who is not wanting to run webassembly workloads (which is still the majority for the foreseeable future). Therefore dependencies will probably linked dynamically in the future.

I am also thinking about allowing users to to specify the preferred order in which webassembly runtimes are asked to handle the workload in the config.json. It might even be possible that a configuration file located at a well known location can be provided that allows enables/disables certain handlers globally. This might be overkill though and we would want to check if other runtimes also plan to support such a feature.

Lastly one intended use case for this will be running hooks in a webassembly sandbox, so hooks will have to be adapted as well in the future.

What I want to say with this is: I would not focus to much on how exactly an executor is called because this is likely to change anyway. I would like to get this feature in first and then iterate on it with the knowledge gained from implementing support for further webassembly runtimes.

I welcome comments on the general direction of this feature.

Great thoughts 💯
I'm sorry if I'm understanding this wrong, but I think @tsturzl is imagining this kind of code. I guess this should keep the byte size the same.

#[cfg(not(feature = "wasm-wasmer"))]
fn choose_executor() -> Result<Box<dyn Executor>> {
    Ok(Box::new(DefaultExecutor {}))
}

#[cfg(feature = "wasm-wasmer")]
fn choose_executor() -> Result<Box<dyn Executor>> {
    if true {
        Ok(Box::new(WasmerExecutor {}))
    } else {
        Ok(Box::new(DefaultExecutor {}))
    }
}

@Furisto
Copy link
Collaborator Author

Furisto commented Jan 19, 2022

@utam0k @tsturzl PTAL

Copy link
Member

@utam0k utam0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thank you for accepting our suggestion!

@Furisto
Copy link
Collaborator Author

Furisto commented Jan 24, 2022

@tsturzl seems to be busy, so I will go ahead and merge this now. But don't worry, there will be a new opportunity in the next PR for this.

@Furisto Furisto merged commit 1b810d4 into youki-dev:main Jan 24, 2022
@tsturzl
Copy link
Collaborator

tsturzl commented Jan 28, 2022

Apologies for my absence. I've been on-boarding new hires at work. I did take a brief look through, and thought this looked fine. I'm excited to see the feature get merged in. I may hack on it a bit on my free time to familiarize myself with the code base.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants