Handle args for modules without extra first argument in prep for OCI artifacts #146

Closed
wants to merge 4 commits into from

@jsturtevant jsturtevant commented Jun 15, 2023

When working on adding OCI artifact support, I found two bugs:

  1. The args passed to the wasm program included the module.wasm artifact. This means the demo app was parsing the second argument even though it was really the first.
  2. The shim would hang (via an exception) if the module wasn't passed as the entrypoint or cmd. Adding an additional check lets the shim fail properly in scenarios where someone built the container without the correct entry point.

I've included the checks for the OCI artifacts in the tests. I can drop those and add them back when I add OCI artifact support in the next PR, but I thought I'd include them initially.

work towards #108

Signed-off-by: James Sturtevant <jstur@microsoft.com>
@jsturtevant
Contributor Author

I think the failure is because I need to update wasmedge as well

Signed-off-by: James Sturtevant <jstur@microsoft.com>
@jsturtevant
Contributor Author

jsturtevant commented Jun 15, 2023

Hmm, this was passing locally, but only because of a timing issue. It did eventually fail locally.

 make test/k3s
rustup target add wasm32-wasi
....
sudo bin/k3s kubectl wait deployment wasi-demo --for condition=Available=True --timeout=90s && \
sudo bin/k3s kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP          NODE       NOMINATED NODE   READINESS GATES
wasi-demo-5d847998-cz9vm   1/1     Running   0          8s    10.42.0.3   jjs15338   <none>           <none>
wasi-demo-5d847998-v4bt2   1/1     Running   0          8s    10.42.0.5   jjs15338   <none>           <none>
wasi-demo-5d847998-2nn99   1/1     Running   0          8s    10.42.0.4   jjs15338   <none>           <none>

@jsturtevant
Contributor Author

OK, I sorted it out. It seems to be a bug in wasmedge: if you pass None for the arguments, it passes a blank string arg on to the call. I added some logging to the demo app and got:

runwasi/wasi-demo-app:latest oci                               
args: [""]                                                                 
updating cmd                                                               
unknown command:     

This is because it is passing an array: https://github.com/WasmEdge/wasmedge-rust-sdk/blob/441dfe167c33119c4a12b0ba4034743bfde9a30d/crates/wasmedge-sys/src/instance/module.rs#L558-L563

Signed-off-by: James Sturtevant <jstur@microsoft.com>
@jsturtevant jsturtevant changed the title from "Handle args to wasmtime in prep for OCI artifacts" to "Handle args for modules without extra first argument in prep for OCI artifacts" Jun 15, 2023
@jsturtevant
Contributor Author

So wasmedge is passing properly in 20.04:

sudo bin/k3s kubectl wait deployment wasi-demo --for condition=Available=True --timeout=90s && \
sudo bin/k3s kubectl get pods -o wide
deployment.apps/wasi-demo condition met
NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE          NOMINATED NODE   READINESS GATES
wasi-demo-5d847998-v47gp   1/1     Running   0          4s    10.42.0.11   fv-az41-276   <none>           <none>
wasi-demo-5d847998-ctwqf   1/1     Running   0          4s    10.42.0.13   fv-az41-276   <none>           <none>
wasi-demo-5d847998-rch5n   1/1     Running   0          4s    10.42.0.12   fv-az41-276   <none>           <none>
sudo bin/k3s kubectl logs deployments/wasi-demo   
Found 3 pods, using pod/wasi-demo-5d847998-v47gp
This is a song that never ends.
Yes, it goes on and on my friends.
Some people started singing it not knowing what it was,
So they'll continue singing it forever just because...

Looks like it is failing in 22.04 for a different reason which I am having trouble getting logs for (I run 20.04 locally)

Signed-off-by: James Sturtevant <jstur@microsoft.com>
@jsturtevant
Contributor Author

well 22.04 passed on that run:

Found 3 pods, using pod/wasi-demo-5d847998-dp4nm
This is a song that never ends.
Yes, it goes on and on my friends.
Some people started singing it not knowing what it was,
So they'll continue singing it forever just because...

This is a song that never ends.
Yes, it goes on and on my friends.
Some people started singing it not knowing what it was,
So they'll continue singing it forever just because...

This is a song that never ends.
Yes, it goes on and on my friends.
Some people started singing it not knowing what it was,
So they'll continue singing it forever just because...

This is a song that never ends.
Yes, it goes on and on my friends.
Some people started singing it not knowing what it was,
So they'll continue singing it forever just because...

NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE           NOMINATED NODE   READINESS GATES
wasi-demo-5d847998-dp4nm   1/1     Running   0          5s    10.42.0.11   fv-az627-401   <none>           <none>
wasi-demo-5d847998-pwvtb   1/1     Running   0          5s    10.42.0.12   fv-az627-401   <none>           <none>
wasi-demo-5d847998-5g4ww   1/1     Running   0          5s    10.42.0.1

@jsturtevant
Contributor Author

/assign @Mossaka

Contributor

@ipuustin ipuustin left a comment

I think this PR needs to be linked to documentation somehow, because it's all about reading an annotation and having special args handling based on that. I think the documentation can just be a good comment in the code explaining the difference between the different ways of running the module, and why the arguments have to be processed differently.

Also, the wasmtime instance.rs will see big changes due to #142. There should be some coordination about which PR goes in first.

args
}
_ => {
if args.len() > 1 {
Contributor

Hmm, is this a workaround for the Wasmedge issue? Should we wait for them to resolve it?

Contributor Author

No, this logic didn't change. This is the standard processing when the wasm module is shipped in the container layers. The first argument will always be the module. The rest of the args are the args to the module.
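
The processing described here can be sketched roughly as follows. This is a minimal illustration, not the shim's actual code: `split_module_args` is a hypothetical helper, and the error string is made up. It assumes the container's process args arrive as a slice of strings, treats the first one as the module, hands the rest to the module, and returns an error (rather than hanging) when no module was given.

```rust
/// Hypothetical helper (not part of runwasi): splits the container's process
/// args into the module path and the arguments meant for the module.
fn split_module_args(args: &[String]) -> Result<(&str, Vec<&str>), String> {
    match args.split_first() {
        // The first entry names the module; everything after it belongs to the module.
        Some((module, rest)) if !module.is_empty() => {
            Ok((module.as_str(), rest.iter().map(String::as_str).collect()))
        }
        // No entrypoint or cmd was provided: fail explicitly instead of hanging.
        _ => Err("no wasm module given in entrypoint or cmd".to_string()),
    }
}
```

With args `["/module.wasm", "echo", "hi"]` this yields the module `/module.wasm` and module args `["echo", "hi"]`; with an empty slice it returns the error.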

@@ -5,3 +5,4 @@ test/out/img.tar
!crates/wasmtime/src/bin/
test/k8s/_out
release/
.vscode/settings.json
Contributor

Should we just ignore the full .vscode directory while at it? Or is it possible to have some project-level settings there?

Contributor Author

open to either... settings.json changes often for me as I switch the rust analyzer settings around as I go.

I do have a couple debug configurations that might be worth sharing if folks are interested in them

@@ -109,4 +109,7 @@ jobs:
make test/k3s
- name: cleanup
if: always()
run: make test/k3s/clean
run: |
sudo bin/k3s kubectl logs deployments/wasi-demo
Contributor

Is it possible that this will cause a long timeout if the cluster is in a broken state? I'm not opposing this but would like the cleanup to be quick and safe.

Contributor Author

good call, I can wrap it in a timeout? I was having trouble getting the logs on failure and this was the only way I could get it to work.

module_args = Some(args.iter().map(|s| s as &str).collect())
}

debug!("module args: {:?}", module_args);
Contributor

At some point we need to go through the codebase and look at the logging, because I'm worried that someone might pass secrets as environment variables or module arguments and we'll accidentally log them. It's not a problem yet, just something to keep in mind.

Contributor Author

Yes, we might want to take a more structured approach to logging as well.

I can drop some of these debug statements. It was helpful when figuring this out but your concern is valid.
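
One possible direction for the logging concern, sketched under assumptions: instead of printing argument and environment values, a debug line could report only how many args there are and which environment variable names are set. `redacted_summary` is a hypothetical helper, not something in the codebase, and the message format is invented for illustration.

```rust
/// Hypothetical helper: builds a debug-log line that reveals the number of
/// module args and the names of environment variables, but none of the values.
fn redacted_summary(args: &[String], env: &[(String, String)]) -> String {
    // Keep only the variable names; the values may contain secrets.
    let env_names: Vec<&str> = env.iter().map(|(name, _)| name.as_str()).collect();
    format!(
        "module args: {} (values redacted), env: {:?}",
        args.len(),
        env_names
    )
}
```

The result could then be passed to the existing `debug!` call in place of the raw `module_args` value.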

@jsturtevant
Contributor Author

> I think this PR needs to be linked to documentation somehow, because it's all about reading an annotation and having special args handling based on that. I think the documentation can just be a good comment in the code explaining the difference between the different ways of running the module, and why the arguments have to be processed differently.

100% agree need docs. I was going to add them in #147. I was on the fence for including the annotations at all on this one. I wanted to give enough context on how this would change in an upcoming PR but I think it might not be appropriate here. This is really about fixing the argument handling as it is today.

> Also, the wasmtime instance.rs will see big changes due to #142. There should be some coordination about which PR goes in first.

👍 These changes will end up pretty similar to what they are now; they will just live in the new wasmtimeExecutor that @Mossaka is working on.

if !args.is_empty() {
// temporary work around for wasmedge bug
// https://github.com/WasmEdge/wasmedge-rust-sdk/issues/10
if !(args.len() == 1 && args[0].is_empty()) {
Contributor Author

@ipuustin this is the workaround for the wasmedge bug.

I should also call out that this change set is potentially a breaking change. Any module that took arguments was throwing away the first argument, as it was the module name. We now pass arguments as a program might expect.
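
The check in the diff above can be read as a small normalization step. The following is a sketch of that idea only, with a hypothetical function name (`effective_args`) and under the assumption that a lone empty string should be treated the same as having no args at all, so that nothing is forwarded to the runtime in that case.

```rust
/// Hypothetical helper mirroring the workaround's condition: returns `None`
/// when there are no args, or when the only arg is a single empty string.
fn effective_args(args: &[String]) -> Option<Vec<&str>> {
    let only_blank = args.len() == 1 && args[0].is_empty();
    if args.is_empty() || only_blank {
        None
    } else {
        Some(args.iter().map(String::as_str).collect())
    }
}
```

Note that an empty string appearing alongside other args is left untouched; only the single-blank case is filtered.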

@jsturtevant
Contributor Author

So this PR isn't valid. I found it strange that the arguments didn't work in wasmtime the way I expected, so I did some more research and asked in the wasmtime channel (https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Arguments.20to.20wasm.20modules.20in.20wasmtime). It turns out I had a misunderstanding here.

I will fix this up, though the second bug where it hangs is still valid. It does bring up the question of what we pass as the name for the module, since all we have is the hash of the module package.
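
To illustrate why the module name matters to the program being run, here is a small sketch from the guest's point of view. It assumes the usual convention that position 0 of the argument vector holds the program (module) name, which Rust's `std::env::args` documents as traditional but not guaranteed, and that the real arguments start at position 1. `first_real_arg` is a hypothetical helper, not code from the demo app.

```rust
/// Hypothetical helper for a demo app: given the full argument vector
/// (program name first, by convention), returns the first real argument.
fn first_real_arg<I: IntoIterator<Item = String>>(argv: I) -> Option<String> {
    // Index 0 is the program/module name by convention, so the first
    // user-supplied argument is at index 1.
    argv.into_iter().nth(1)
}
```

A program would typically call this as `first_real_arg(std::env::args())`. If the runtime omitted the name entry, the same code would silently skip the first real argument, which is why the value passed in position 0 needs to be decided even when only a hash is available.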

@jsturtevant jsturtevant marked this pull request as draft June 16, 2023 21:50
@jsturtevant
Contributor Author

I've decided most of the changes that are actually needed here really relate to the OCI work in #147 so I will close this and fix that up
