Hey @sjm-ho, shutdown signals usually happen when a node does not have enough resources and kills the container. To avoid this scenario, please set resource requests and limits on the runner pods. I cannot speak to the actual reason for termination in your case, but in my experience that is the usual cause. It is unlikely that the shutdown signal is being sent without any legitimate reason. I would suggest digging into the kubelet log to figure out what is happening.
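As a rough starting point before going through the kubelet log, the pod object itself can be checked for a recorded termination reason and for missing requests or limits. The snippet below is only a sketch: `diagnose_pod` and `sample_pod` are hypothetical names made up for illustration, the sample data is invented, and it assumes the runner pod object still exists and was exported with `kubectl get pod <pod> -n <namespace> -o json`.

```python
def diagnose_pod(pod: dict) -> list[str]:
    """Collect findings from one pod object (the JSON from `kubectl get pod -o json`)."""
    findings = []
    status = pod.get("status", {})

    # A pod-level reason such as "Evicted" is set when the kubelet evicts the pod.
    if status.get("reason"):
        findings.append(f"pod reason: {status['reason']}")

    # Container-level termination reasons, e.g. "OOMKilled" with exit code 137.
    for cs in status.get("containerStatuses", []):
        for key in ("state", "lastState"):
            term = cs.get(key, {}).get("terminated")
            if term:
                reason = term.get("reason", "unknown")
                code = term.get("exitCode")
                findings.append(
                    f"{cs['name']}: terminated with {reason} (exit code {code})"
                )

    # Containers that have no requests or limits configured.
    for container in pod.get("spec", {}).get("containers", []):
        resources = container.get("resources", {})
        for field in ("requests", "limits"):
            if not resources.get(field):
                findings.append(f"{container['name']}: no resource {field} set")

    return findings


# Invented sample data, for illustration only (not taken from the failing job).
sample_pod = {
    "spec": {"containers": [{"name": "runner", "resources": {}}]},
    "status": {
        "containerStatuses": [
            {
                "name": "runner",
                "state": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    },
}

for finding in diagnose_pod(sample_pod):
    print(finding)
```

To try it against a real pod, save the `kubectl` JSON output to a file, load it with `json.load`, and pass the resulting dict to `diagnose_pod`. If nothing is reported, the kubelet log on the node is still the place to look.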
-
Hey ARC community, I am running GitHub Actions builds on ephemeral runner sets in my Kubernetes clusters. The clusters are integrated with CAST AI, so node provisioning is handled by CAST AI itself.
We were originally using spot instances for the runner pods and have since switched to on-demand nodes, but we are still facing issues like
[publish_image (test, false)](https://github.com/headout/magellan/actions/runs/9241155685/job/25422290263) The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
AND
The operation was canceled.
These occur without any apparent legitimate reason. We would like a workaround or fix to stabilise the ARC runners, since this is making our builds unreliable and causing them to fail very frequently, as mentioned above.