java.net.SocketTimeoutException for logWatch() #5202

Closed
cmdjulian opened this issue Jun 4, 2023 · 7 comments · Fixed by #5206
cmdjulian commented Jun 4, 2023

Describe the bug

In v6.6.2 the following code used to work fine:

import io.fabric8.kubernetes.client.ConfigBuilder
import io.fabric8.kubernetes.client.KubernetesClientBuilder
import java.util.UUID

private val client = KubernetesClientBuilder()
    .withConfig(ConfigBuilder().withNamespace("default").build())
    .build()

override fun logReader(jobId: UUID): JobLogReader {
    val jobPod = jobPod(jobId)

    // when pod is ready, construct log reader, pod-name=job-name-{some-hash}
    val inputStream = client.pods().withName(jobPod.metadata.name).watchLog().output
    return object : JobLogReader {
        val reader = inputStream.bufferedReader()

        override fun readLine(): String? = reader.readLine()

        override fun close() {
            reader.close()
        }
    }
}

logReader(job.jobId).use { reader ->
    generateSequence(reader::readLine).forEach(::println)
}

Since version 6.7.0 I get an exception. The exception always occurs quickly, even when the underlying pod has not terminated yet (I used a sleep 3600 to simulate a long-running pod). The pod streams logs continuously, and I can see them being consumed.

Fabric8 Kubernetes Client version

6.7.0

Steps to reproduce

  1. Create a logWatch()
  2. Feed its output into a BufferedReader
  3. See the error after a few seconds, regardless of whether the pod is done (a minimal reproducer sketch follows)
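
For reference, a minimal self-contained sketch of these steps, assuming a pod named my-pod in the default namespace that keeps streaming logs for longer than ~10 seconds:

    import io.fabric8.kubernetes.client.KubernetesClientBuilder

    fun main() {
        KubernetesClientBuilder().build().use { client ->
            // step 1: create the log watch; step 2: wrap its output in a BufferedReader
            client.pods().inNamespace("default").withName("my-pod")
                .watchLog().output.bufferedReader().use { reader ->
                    // step 3: on 6.7.0 with okhttp this loop fails after ~10 seconds with
                    // java.io.IOException: channel already closed with exception
                    generateSequence(reader::readLine).forEach(::println)
                }
        }
    }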

Expected behavior

The old behavior: upon pod termination, the stream should be closed gracefully.

Runtime

other (please specify in additional context)

Kubernetes API Server version

1.25.3@latest

Environment

Linux

Fabric8 Kubernetes Client Logs

java.io.IOException: channel already closed with exception
	at io.fabric8.kubernetes.client.http.HttpClientReadableByteChannel.read(HttpClientReadableByteChannel.java:112) ~[kubernetes-client-api-6.7.0.jar:na]
	at java.base/sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65) ~[na:na]
	at java.base/sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:107) ~[na:na]
	at java.base/sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:101) ~[na:na]
	at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:270) ~[na:na]
	at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:313) ~[na:na]
	at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:188) ~[na:na]
	at java.base/java.io.InputStreamReader.read(InputStreamReader.java:177) ~[na:na]
	at java.base/java.io.BufferedReader.fill(BufferedReader.java:162) ~[na:na]
	at java.base/java.io.BufferedReader.readLine(BufferedReader.java:329) ~[na:na]
	at java.base/java.io.BufferedReader.readLine(BufferedReader.java:396) ~[na:na]
	at de.etalytics.scripts.kubernetes.job.JobServiceOnKubernetesImpl$logReader$1.readLine(JobServiceOnKubernetesImpl.kt:63) ~[main/:na]
	at de.etalytics.scripts.engine.logs.JobLogStreamingService$streamLogs$1$1.invoke(JobLogStreamingService.kt:28) ~[main/:na]
	at de.etalytics.scripts.engine.logs.JobLogStreamingService$streamLogs$1$1.invoke(JobLogStreamingService.kt:28) ~[main/:na]
	at kotlin.sequences.SequencesKt__SequencesKt$generateSequence$1.invoke(Sequences.kt:641) ~[kotlin-stdlib-1.8.21.jar:1.8.21-release-380(1.8.21)]
	at kotlin.sequences.GeneratorSequence$iterator$1.calcNext(Sequences.kt:591) ~[kotlin-stdlib-1.8.21.jar:1.8.21-release-380(1.8.21)]
	at kotlin.sequences.GeneratorSequence$iterator$1.hasNext(Sequences.kt:609) ~[kotlin-stdlib-1.8.21.jar:1.8.21-release-380(1.8.21)]
	at kotlin.sequences.TransformingSequence$iterator$1.hasNext(Sequences.kt:214) ~[kotlin-stdlib-1.8.21.jar:1.8.21-release-380(1.8.21)]
	at de.etalytics.scripts.engine.logs.JobLogStreamingService.streamLogs(JobLogStreamingService.kt:48) ~[main/:na]
	at de.etalytics.scripts.engine.job.JobRunner.streamLogs$lambda$0(JobRunner.kt:43) ~[main/:na]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[na:na]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[na:na]
	at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]
Caused by: java.net.SocketTimeoutException: timeout
	at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.kt:675) ~[okhttp-4.10.0.jar:na]
	at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.kt:684) ~[okhttp-4.10.0.jar:na]
	at okhttp3.internal.http2.Http2Stream$FramingSource.read(Http2Stream.kt:379) ~[okhttp-4.10.0.jar:na]
	at okhttp3.internal.connection.Exchange$ResponseBodySource.read(Exchange.kt:276) ~[okhttp-4.10.0.jar:na]
	at okio.RealBufferedSource.exhausted(RealBufferedSource.kt:197) ~[okio-jvm-3.0.0.jar:na]
	at io.fabric8.kubernetes.client.okhttp.OkHttpClientImpl$OkHttpAsyncBody.doConsume(OkHttpClientImpl.java:131) ~[kubernetes-httpclient-okhttp-6.7.0.jar:na]
	... 3 common frames omitted

Additional context

I'm using k3d

cmdjulian (Author)

I also tried setting the requestTimeout to 0 in the client builder but this didn't solve my problem either.
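
In code, the attempt looked roughly like this (a sketch assuming the same client setup as above; withRequestTimeout(0) means no request timeout):

    // attempted workaround: request timeout of 0 (= unlimited). This did not
    // help, because okhttp's own read timeout still applies (see below).
    private val client = KubernetesClientBuilder()
        .withConfig(
            ConfigBuilder()
                .withNamespace("default")
                .withRequestTimeout(0)
                .build()
        )
        .build()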

shawkins (Contributor) commented Jun 4, 2023

> I also tried setting the requestTimeout to 0 in the client builder but this didn't solve my problem either.

I think the issue is that all of the other clients default their timeouts to 0/unlimited, but okhttp does not, so it's still enforcing its default of 10 seconds. We'll need to change the OkHttpClientFactory to account for this.

The workaround in the meantime is to do it on your own:

    import io.fabric8.kubernetes.client.KubernetesClient;
    import io.fabric8.kubernetes.client.KubernetesClientBuilder;
    import io.fabric8.kubernetes.client.okhttp.OkHttpClientFactory;
    import okhttp3.OkHttpClient.Builder;
    import java.time.Duration;

    KubernetesClient otherClient = new KubernetesClientBuilder().withHttpClientFactory(new OkHttpClientFactory() {
      @Override
      protected void additionalConfig(Builder builder) {
        builder.readTimeout(Duration.ZERO); // disable okhttp's default 10s read timeout
      }
    }).build();

For okhttp this affects HTTP watches as well. #4949 also removed the code where the caller was setting the readTimeout to 0 for websocket requests, but that does not actually seem to matter to okhttp, as the read/write timeouts seem to affect only regular HTTP requests.

cmdjulian (Author)

Thanks for the fast reply. As the OkHttp dependency is not included with implementation scope, I first had to add it with implementation("io.fabric8:kubernetes-httpclient-okhttp").

But this seems to have worked! :)
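
For anyone adding the dependency, a minimal Gradle (Kotlin DSL) sketch; the 6.7.0 version is an assumption and should match your kubernetes-client version:

    // build.gradle.kts: pull in the okhttp-based HttpClient implementation
    dependencies {
        implementation("io.fabric8:kubernetes-client:6.7.0")
        implementation("io.fabric8:kubernetes-httpclient-okhttp:6.7.0")
    }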

shawkins added a commit to shawkins/kubernetes-client that referenced this issue Jun 5, 2023
shawkins self-assigned this Jun 5, 2023
shawkins linked a pull request (#5206) that will close this issue Jun 5, 2023
manusa added the bug label Jun 5, 2023
manusa added this to the 6.7.1 milestone Jun 5, 2023
shawkins added a commit to shawkins/kubernetes-client that referenced this issue Jun 5, 2023
manusa (Member) commented Jun 5, 2023

The new behavior (#4911) is not working out very well.

I'd really like a way to ensure that long-running operations such as builds, tail logs, and so on have an ensured timeout of 0.

The fixing PR seems to be failing the e2e tests; probably the no-timeout assumption is having unwanted side effects.

shawkins (Contributor) commented Jun 5, 2023

> The new behavior is not working out very well.

That's a bit premature.

> I'd really like a way to ensure that long-running operations such as builds, tail logs, and so on have an ensured timeout of 0.

You can also entertain setting it explicitly to 0 for HTTP watches and log watches, but after this fix those cases are covered as well. As mentioned on the issue, that was unnecessary for the websocket operations.

> The fixing PR seems to be failing the e2e tests; probably the no-timeout assumption is having unwanted side effects.

That was related to the thread pool changes in the tests, not the timeout.

manusa (Member) commented Jun 5, 2023

> That was related to the thread pool changes in the tests, not the timeout.

OK, good to know. I hadn't checked the PR changes and was worried that just changing the timeouts had made everything else go south.

> You can also entertain setting it explicitly to 0 for HTTP watches and log watches,

We probably want to do this too, just to make sure that these requests enforce no-timeout behavior.

> As mentioned on the issue, that was unnecessary for the websocket operations.

Log watch is HTTP. I was actually dealing with this same issue downstream and was about to report it when I saw this one 😅

manusa (Member) commented Jun 5, 2023

We'll need to release a 6.7.1 to fix these problems, though.

manusa pushed a commit to shawkins/kubernetes-client that referenced this issue Jun 6, 2023
manusa pushed a commit to shawkins/kubernetes-client that referenced this issue Jun 6, 2023