LogWatch sometimes silently fails #4741
Comments
#4637 should be applicable here - if your issue is due to the relevant pods not being ready, that will help wait until they are before making the log request. Also, ideally the built-in retry / exponential backoff logic should apply here, but it currently does not due to where that logic is located.
I did some digging. I don't think it's related to the pod's health state. In my case, the pod is spawned by a Job and is in a failed state, and it is also not coming up again (which is totally normal). Retrying the operation after 20 seconds does not result in an error.
500 is returned by the API server when the pod is not yet ready to return logs - a 400 would be more like a bad request. The fabric8 logic checks for either Ready or Succeeded to know when a log request should work. If you have a situation where the Pod ultimately ends up in the Failed state but is still able to provide logs, then I don't think this check will work. A quick check of the kubectl client suggests that it doesn't wait on a particular state, but instead just relies on the built-in retry that comes with a 500 response.
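The retry-with-exponential-backoff idea mentioned above can be sketched generically in plain Java. This is not the fabric8 client's internal implementation - the class and method names here (`RetrySketch`, `retryWithBackoff`) are hypothetical - but it shows the shape of what wrapping the log request would look like: retry on failure, double the delay each time, and ultimately surface the exception instead of swallowing it.

```java
import java.util.concurrent.Callable;

// Generic retry-with-exponential-backoff sketch (hypothetical helper,
// not the fabric8 client's built-in retry logic).
public class RetrySketch {

    /** Retries the action up to maxAttempts times, doubling the delay between attempts. */
    public static <T> T retryWithBackoff(Callable<T> action, int maxAttempts,
                                         long initialDelayMillis) throws Exception {
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // exponential backoff
                }
            }
        }
        throw last; // surface the failure instead of silently returning
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulate a log endpoint that returns 500 twice before succeeding.
        String result = retryWithBackoff(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("HTTP 500");
            return "log line";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The key design point for this issue: if all attempts fail, the caller gets an exception rather than a silently terminated stream.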
With #4825 the fabric8 client can retry all 500s, even those on websocket requests.
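For reference, the retry behavior is driven by client configuration. A minimal sketch, assuming a fabric8 6.x `ConfigBuilder` with the `requestRetryBackoffLimit` / `requestRetryBackoffInterval` options (the values shown are illustrative, not recommendations):

```java
import io.fabric8.kubernetes.client.Config;
import io.fabric8.kubernetes.client.ConfigBuilder;

public class RetryConfigSketch {
    public static void main(String[] args) {
        // Retry failed requests (e.g. 500s from a not-yet-ready API server)
        // up to 5 times, starting from a 1000 ms backoff interval.
        Config config = new ConfigBuilder()
                .withRequestRetryBackoffLimit(5)
                .withRequestRetryBackoffInterval(1000)
                .build();
        System.out.println("retry limit: " + config.getRequestRetryBackoffLimit());
    }
}
```

The resulting `Config` would then be passed to the client via `new KubernetesClientBuilder().withConfig(config).build()`.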
Another thing of note - requests made after a pod is created, but before its containers are initialized, will return a 400.
This issue has been automatically marked as stale because it has not had any activity in 90 days. It will be closed if no further activity occurs within 7 days. Thank you for your contributions!
Describe the bug
When I create a new LogWatch and start consuming its output right after the Kubernetes cluster has started, I sometimes don't get back the actual log lines but rather the following output as a single line:
Afterwards, the LogWatch terminates immediately.
It seems the Kubernetes API server is not yet ready to serve requests. This only happens occasionally. It would be really nice to get an Exception in that case instead of having the error silently swallowed.
Fabric8 Kubernetes Client version
6.3.1
Steps to reproduce
Expected behavior
I think the LogWatch should either wait until the cluster can serve logs or throw an Exception; either behavior would be fine.
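In the meantime, one way to approximate the "wait or throw" behavior from user code is to wait for the pod explicitly before opening the LogWatch, so that a timeout surfaces as an exception. A sketch assuming fabric8 6.x (`default` and `my-pod` are placeholders); note that, per the discussion above, a pod that ends up in the Failed state never becomes Ready, so this only helps for pods that eventually run:

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.dsl.LogWatch;

import java.util.concurrent.TimeUnit;

public class LogWatchSketch {
    public static void main(String[] args) throws InterruptedException {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            // Throws on timeout if the pod never becomes ready, instead of
            // the LogWatch silently emitting an error body and terminating.
            client.pods().inNamespace("default").withName("my-pod")
                  .waitUntilReady(60, TimeUnit.SECONDS);

            try (LogWatch watch = client.pods().inNamespace("default")
                                        .withName("my-pod")
                                        .watchLog(System.out)) {
                // ... consume logs for as long as needed ...
                TimeUnit.SECONDS.sleep(5);
            }
        }
    }
}
```

This requires a reachable cluster, so it is a usage sketch rather than a self-verifying example.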
Runtime
other (please specify in additional context)
Kubernetes API Server version
1.24
Environment
Linux
Fabric8 Kubernetes Client Logs
No response
Additional context
I use k3s as the cluster distribution.