Workers healthcheck/uptime #10
andresbarcenas started this conversation in General
Replies: 2 comments
-
It is hard to judge from the information provided. In my experience, the OOM killer is the most common reason for unexpected kills, but that depends heavily on the environment.
-
OOM killer events are logged in one of the kernel/OS log files. We won't guess at what may be happening; the very first step of this investigation should be capturing the Kicks worker process logs. How to do that, and what may be affecting, redirecting, or collecting those logs, is something only you know; it is not up to the community to figure out.
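As a rough illustration of checking the kernel log for OOM kills, here is a minimal Ruby sketch. It assumes the kernel messages contain the usual `Killed process <pid> (<command>)` text. The helper name `oom_kills` is hypothetical, and the log location (for example `/var/log/kern.log`, `/var/log/messages`, or the output of `journalctl -k`) varies by distribution, so treat the path as an assumption about your environment.

```ruby
# Hypothetical helper: find OOM-killer entries in kernel log lines.
# Assumes the kernel's "Killed process <pid> (<command>)" wording.
OOM_KILL = /Killed process (\d+) \(([^)]+)\)/

def oom_kills(lines)
  lines.filter_map do |line|
    match = OOM_KILL.match(line)
    # Keep the PID, the command name, and the raw line for context.
    { pid: match[1].to_i, command: match[2], line: line.chomp } if match
  end
end

# Possible usage (the path is an assumption, adjust for your system):
#   oom_kills(File.foreach("/var/log/kern.log")).each { |kill| p kill }
```

If the PIDs reported there match worker processes that disappeared, memory pressure is the likely cause; if nothing shows up, the worker process logs are the next place to look.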
-
Hello,
We have more than 100 workers and we are seeing random workers crash, but we have no logs and cannot tell why it is happening. Is there a way to log a heartbeat or something similar to tell whether a worker is not responding? Our usual reaction is to restart the entire sneakers process.
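To make the heartbeat idea concrete, here is a minimal sketch in plain Ruby. The `Heartbeat` class is hypothetical and is not part of Sneakers or Kicks; it only uses Ruby's standard `Logger` and a background thread, and the 30-second interval is an arbitrary assumption.

```ruby
require "logger"

# Hypothetical helper, not a Sneakers/Kicks API: logs a line at a fixed
# interval so a missing heartbeat points at a dead or frozen process.
class Heartbeat
  def initialize(logger:, interval: 30, name: "worker")
    @logger = logger
    @interval = interval
    @name = name
    @thread = nil
  end

  # Start the background thread (idempotent).
  def start
    @thread ||= Thread.new do
      loop do
        beat
        sleep @interval
      end
    end
    self
  end

  # Write a single heartbeat line with the process ID and a UTC timestamp.
  def beat
    stamp = Time.now.utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    @logger.info("heartbeat name=#{@name} pid=#{Process.pid} at=#{stamp}")
  end

  # Stop the background thread.
  def stop
    @thread&.kill
    @thread&.join
    @thread = nil
  end
end
```

A process could call `Heartbeat.new(logger: Logger.new($stdout)).start` at boot. Note that a background thread only shows the process is alive; calling `beat` from inside the message-handling code would be a stronger signal that a worker is actually processing.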