LogWatch sometimes silently fails #4741
Comments
#4637 should be applicable here - if your issue is due to the relevant pods not being ready, that will help wait until they are before making the log request. Also, ideally the built-in retry / exponential backoff logic should apply here, but it currently does not due to where that logic is located.
I did some digging. I don't think it's related to the pod's health state. In my case, the pod is spawned by a Job and is in a failed state, and it is also not coming up again (which is totally normal). Retrying the operation after 20 seconds does not result in an error.
500 is returned by the API server when the pod is not yet ready to return logs - a 400 would be more like a bad request. The fabric8 logic checks for either Ready or Succeeded to know when a log request should work. If you have a situation where the Pod ultimately ends up in the Failed state but is still able to provide logs, then I don't think this check will work. A quick check of the kubectl client suggests that it doesn't wait on a particular state, but instead just relies on the built-in retry that comes with a 500 response.
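The retry-with-exponential-backoff idea mentioned above can be sketched generically in plain Java. This is not the fabric8 client's internal implementation - the class and method names here (`RetrySketch`, `retryWithBackoff`) are hypothetical - but it shows the shape of what wrapping the log request would look like: retry on failure, double the delay each time, and ultimately surface the exception instead of swallowing it.

```java
import java.util.concurrent.Callable;

// Generic retry-with-exponential-backoff sketch (hypothetical helper,
// not the fabric8 client's built-in retry logic).
public class RetrySketch {

    /** Retries the action up to maxAttempts times, doubling the delay between attempts. */
    public static <T> T retryWithBackoff(Callable<T> action, int maxAttempts,
                                         long initialDelayMillis) throws Exception {
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // exponential backoff
                }
            }
        }
        throw last; // surface the failure instead of silently returning
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulate a log endpoint that returns 500 twice before succeeding.
        String result = retryWithBackoff(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("HTTP 500");
            return "log line";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The key design point for this issue: if all attempts fail, the caller gets an exception rather than a silently terminated stream.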
With #4825 the fabric8 client can retry all 500s, even those on websocket requests.
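For reference, the retry behavior is driven by client configuration. A minimal sketch, assuming a fabric8 6.x `ConfigBuilder` with the `requestRetryBackoffLimit` / `requestRetryBackoffInterval` options (the values shown are illustrative, not recommendations):

```java
import io.fabric8.kubernetes.client.Config;
import io.fabric8.kubernetes.client.ConfigBuilder;

public class RetryConfigSketch {
    public static void main(String[] args) {
        // Retry failed requests (e.g. 500s from a not-yet-ready API server)
        // up to 5 times, starting from a 1000 ms backoff interval.
        Config config = new ConfigBuilder()
                .withRequestRetryBackoffLimit(5)
                .withRequestRetryBackoffInterval(1000)
                .build();
        System.out.println("retry limit: " + config.getRequestRetryBackoffLimit());
    }
}
```

The resulting `Config` would then be passed to the client via `new KubernetesClientBuilder().withConfig(config).build()`.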
Another thing of note - requests made after a pod is created, but before its containers are initialized, will return a 400.
This issue has been automatically marked as stale because it has not had any activity in 90 days. It will be closed if no further activity occurs within 7 days. Thank you for your contributions!
Describe the bug
When I create a new LogWatch and start consuming its output right after the Kubernetes cluster has started, I sometimes don't get back the actual log lines but rather the following output as a single line:
Afterwards, the LogWatch terminates immediately.
It seems the Kubernetes API server is not yet ready to serve requests. This only happens occasionally. It would be really nice to get an Exception in that case instead of having the error silently swallowed.
Fabric8 Kubernetes Client version
6.3.1
Steps to reproduce
Expected behavior
I think the LogWatch should either wait until the cluster can serve logs or throw an Exception; either behavior would be fine.
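In the meantime, one way to approximate the "wait or throw" behavior from user code is to wait for the pod explicitly before opening the LogWatch, so that a timeout surfaces as an exception. A sketch assuming fabric8 6.x (`default` and `my-pod` are placeholders); note that, per the discussion above, a pod that ends up in the Failed state never becomes Ready, so this only helps for pods that eventually run:

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.dsl.LogWatch;

import java.util.concurrent.TimeUnit;

public class LogWatchSketch {
    public static void main(String[] args) throws InterruptedException {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            // Throws on timeout if the pod never becomes ready, instead of
            // the LogWatch silently emitting an error body and terminating.
            client.pods().inNamespace("default").withName("my-pod")
                  .waitUntilReady(60, TimeUnit.SECONDS);

            try (LogWatch watch = client.pods().inNamespace("default")
                                        .withName("my-pod")
                                        .watchLog(System.out)) {
                // ... consume logs for as long as needed ...
                TimeUnit.SECONDS.sleep(5);
            }
        }
    }
}
```

This requires a reachable cluster, so it is a usage sketch rather than a self-verifying example.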
Runtime
other (please specify in additional context)
Kubernetes API Server version
1.24
Environment
Linux
Fabric8 Kubernetes Client Logs
No response
Additional context
I use k3s as the cluster distribution.