Kata container pod cannot start on v1.7.6 #449

Closed
0dragosh opened this issue Aug 9, 2024 · 4 comments

Comments

@0dragosh

0dragosh commented Aug 9, 2024

On the latest Talos release (v1.7.6) with the latest version of the kata-containers extension installed, applying the example Pod and RuntimeClass leaves the container failing with the status: Error: failed to create containerd task: failed to create shim task: File exists (os error 17): unknown.

Relevant excerpts seem to be:

Could not add /dev/mshv to the devices cgroup
Could not add /dev/vfio/vfio to the devices cgroup
"clh.VmmPingGet API call failed\" error=\"Get \\\"http://localhost/api/v1/vmm.ping\\\": dial unix /run/vc/vm/5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81/clh-api.sock: connect: no such file or directory

Full output of talosctl logs syslogd on the host:

10.250.2.204: {"content":"time="2024-08-09T15:25:02.32590216Z" level=warning msg="Could not add /dev/mshv to the devices cgroup" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=cgroups\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:02Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:02.325979899Z" level=warning msg="Could not add /dev/vfio/vfio to the devices cgroup" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=cgroups\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:02Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:02.373257074Z" level=warning msg="clh.VmmPingGet API call failed" error="Get \"http://localhost/api/v1/vmm.ping\\\": dial unix /run/vc/vm/5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81/clh-api.sock: connect: no such file or directory" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers/hypervisor subsystem=cloudHypervisor\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:02Z"}
10.250.2.204: {"content":"Waiting for vhost-user socket connection...","facility":1,"hostname":"nuc4","priority":14,"severity":6,"tag":"virtiofsd","timestamp":"2024-08-09T15:25:02Z"}
10.250.2.204: {"content":"Client connected, servicing requests","facility":1,"hostname":"nuc4","priority":14,"severity":6,"tag":"virtiofsd","timestamp":"2024-08-09T15:25:02Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:05.205557823Z" level=error msg="createContainer failed" error="rpc error: code = Internal desc = File exists (os error 17)" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers subsystem=kata_agent\n","facility":0,"hostname":"localhost","priority":3,"severity":3,"tag":"kata","timestamp":"2024-08-09T15:25:05Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:05.205784452Z" level=error msg="container create failed" container=7bcd1bfd8b04de0c003fd9a87ad3f1f0ebba5b49bb19b7176e408fb8b1a49d90 error="rpc error: code = Internal desc = File exists (os error 17)" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers subsystem=container\n","facility":0,"hostname":"localhost","priority":3,"severity":3,"tag":"kata","timestamp":"2024-08-09T15:25:05Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:05.205811322Z" level=warning msg="Could not umount" container=7bcd1bfd8b04de0c003fd9a87ad3f1f0ebba5b49bb19b7176e408fb8b1a49d90 error="no such file or directory" host-path=/run/kata-containers/shared/sandboxes/5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81/mounts/7bcd1bfd8b04de0c003fd9a87ad3f1f0ebba5b49bb19b7176e408fb8b1a49d90-e482759432e0442f-localtime name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers subsystem=container\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:05Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:05.205855233Z" level=error msg="rollback failed unmountHostMounts()" container=7bcd1bfd8b04de0c003fd9a87ad3f1f0ebba5b49bb19b7176e408fb8b1a49d90 error="no such file or directory" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers subsystem=container\n","facility":0,"hostname":"localhost","priority":3,"severity":3,"tag":"kata","timestamp":"2024-08-09T15:25:05Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:05.205875631Z" level=warning error="no such file or directory" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 share-dir=/run/kata-containers/shared/sandboxes/5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81/mounts/7bcd1bfd8b04de0c003fd9a87ad3f1f0ebba5b49bb19b7176e408fb8b1a49d90/rootfs source=virtcontainers subsystem=mount\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:05Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:05.205892727Z" level=warning msg="Could not remove container share dir" error="no such file or directory" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 share-dir=/run/kata-containers/shared/sandboxes/5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81/mounts/7bcd1bfd8b04de0c003fd9a87ad3f1f0ebba5b49bb19b7176e408fb8b1a49d90 source=virtcontainers subsystem=fs_share\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:05Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:06.225329102Z" level=error msg="createContainer failed" error="rpc error: code = Internal desc = File exists (os error 17)" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers subsystem=kata_agent\n","facility":0,"hostname":"localhost","priority":3,"severity":3,"tag":"kata","timestamp":"2024-08-09T15:25:06Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:06.225562294Z" level=error msg="container create failed" container=762520e309dab3057f41b873a8e7931a7a6684572e441ce87685db3d58ce4353 error="rpc error: code = Internal desc = File exists (os error 17)" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers subsystem=container\n","facility":0,"hostname":"localhost","priority":3,"severity":3,"tag":"kata","timestamp":"2024-08-09T15:25:06Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:06.225585724Z" level=warning msg="Could not umount" container=762520e309dab3057f41b873a8e7931a7a6684572e441ce87685db3d58ce4353 error="no such file or directory" host-path=/run/kata-containers/shared/sandboxes/5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81/mounts/762520e309dab3057f41b873a8e7931a7a6684572e441ce87685db3d58ce4353-5c3797f8da42b1c5-localtime name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers subsystem=container\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:06Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:06.225605375Z" level=error msg="rollback failed unmountHostMounts()" container=762520e309dab3057f41b873a8e7931a7a6684572e441ce87685db3d58ce4353 error="no such file or directory" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 source=virtcontainers subsystem=container\n","facility":0,"hostname":"localhost","priority":3,"severity":3,"tag":"kata","timestamp":"2024-08-09T15:25:06Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:06.2256247Z" level=warning error="no such file or directory" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 share-dir=/run/kata-containers/shared/sandboxes/5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81/mounts/762520e309dab3057f41b873a8e7931a7a6684572e441ce87685db3d58ce4353/rootfs source=virtcontainers subsystem=mount\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:06Z"}
10.250.2.204: {"content":"time="2024-08-09T15:25:06.225639538Z" level=warning msg="Could not remove container share dir" error="no such file or directory" name=containerd-shim-v2 pid=71414 sandbox=5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81 share-dir=/run/kata-containers/shared/sandboxes/5b9a4aa29a332f96b374aad4793290e0fa10a6e412b52d190345817763205a81/mounts/762520e309dab3057f41b873a8e7931a7a6684572e441ce87685db3d58ce4353 source=virtcontainers subsystem=fs_share\n","facility":0,"hostname":"localhost","priority":4,"severity":4,"tag":"kata","timestamp":"2024-08-09T15:25:06Z"}
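For anyone triaging a similar dump, here is a minimal sketch that counts the distinct level/message pairs in the shim output. It assumes the lines keep the `level=... msg="..."` layout seen above; the `summarize` helper and the shortened `sample` lines are illustrative and not part of any Kata or Talos tooling. A regex is used instead of a JSON parser because the pasted lines contain unescaped quotes.

```python
import re
from collections import Counter

# Patterns for the logfmt-style fields inside each "content" string.
LEVEL_RE = re.compile(r'level=(\w+)')
MSG_RE = re.compile(r'msg="([^"]*)"')


def summarize(lines):
    """Count (level, msg) pairs across kata shim log lines."""
    counts = Counter()
    for line in lines:
        level = LEVEL_RE.search(line)
        if level is None:
            continue  # no level field, e.g. the virtiofsd lines
        msg = MSG_RE.search(line)
        # Some lines carry only an error= field and no msg= field.
        counts[(level.group(1), msg.group(1) if msg else "")] += 1
    return counts


# Shortened excerpts of the lines above (illustrative only).
sample = [
    'level=warning msg="Could not add /dev/mshv to the devices cgroup" source=cgroups',
    'level=error msg="createContainer failed" error="rpc error: code = Internal desc = File exists (os error 17)"',
    'level=error msg="createContainer failed" error="rpc error: code = Internal desc = File exists (os error 17)"',
    'Waiting for vhost-user socket connection...',
]

for (level, msg), n in summarize(sample).most_common():
    print(f"{n:3d}  {level:8s} {msg}")
```

Grouping this way makes it easier to see that the repeated "createContainer failed" error is the primary failure and that the umount and share-dir warnings follow from the rollback.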

I'm using the default configuration for Kata. Any tips for fixing this?

Thanks!

@smira
Member

smira commented Aug 9, 2024

You probably need to ask in the Kata Containers project first (linking this issue).

@0dragosh
Author

0dragosh commented Aug 9, 2024

Created kata-containers issue #10143

@zzachattack2

I was having this issue as well. Do you have any mutating webhooks running? In my case, I was running k8tz, which injects an init container before the pod starts. The issue stopped when I configured the pod to bypass the webhook. As best I can tell, the mutated state of the pod itself was not the issue (I manually defined the pod exactly as k8tz would have mutated it, and it worked fine), but perhaps something here doesn't work with the dynamic nature of mutating webhooks?
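A minimal sketch of what bypassing the webhook for one pod might look like, written as a small Python generator so the manifest can be piped to kubectl. The comment above does not say how the bypass was configured, so two things here are assumptions: the `k8tz.io/inject: "false"` annotation (taken from k8tz's documented opt-out; check the k8tz README for your version) and the RuntimeClass name `kata`. The `kata_pod` helper and the pod name are hypothetical.

```python
import json


def kata_pod(name, image, skip_k8tz=True):
    """Build a Pod manifest dict that requests a Kata runtime class."""
    annotations = {}
    if skip_k8tz:
        # Assumption: k8tz's per-pod opt-out annotation.
        annotations["k8tz.io/inject"] = "false"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "annotations": annotations},
        "spec": {
            # Assumption: the RuntimeClass is named "kata".
            "runtimeClassName": "kata",
            "containers": [{"name": name, "image": image}],
        },
    }


# kubectl accepts JSON manifests as well as YAML.
print(json.dumps(kata_pod("nginx-kata", "nginx"), indent=2))
```

The output can be applied with `kubectl apply -f -` on standard input. Comparing a pod created this way against one admitted through the webhook is one way to confirm whether the mutation path is what triggers the failure.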

@0dragosh
Author

0dragosh commented Oct 9, 2024

> I was having this issue as well. Do you have any mutating webhooks running? In my case, I was running k8tz, which injects an init container before the pod starts. The issue stopped when I configured the pod to bypass the webhook. As best I can tell, the mutated state of the pod itself was not the issue (I manually defined the pod exactly as k8tz would have mutated it, and it worked fine), but perhaps something here doesn't work with the dynamic nature of mutating webhooks?

Dang, I have k8tz as well and didn't think to disable any validating/mutating webhooks.

Thank you so much @zzachattack2 !

@0dragosh 0dragosh closed this as completed Oct 9, 2024