default-timeout is ignored #3184

Closed
JohanNordlinder opened this issue Feb 23, 2024 · 3 comments

Comments

@JohanNordlinder

JohanNordlinder commented Feb 23, 2024

Spring Boot Admin Server information

  • Version:
    3.2.2

  • Spring Boot version:
    3.2.3

  • Configured Security:
    Default/Off

  • Webflux or Servlet application:
    Servlet application

Client information

  • Spring Boot versions:
    3.2.3

  • Used discovery mechanism:
    Self registration

  • Webflux or Servlet application:
    Servlet application

Description

We're setting spring.boot.admin.monitor.default-timeout: 61000, but calls to the client time out after 9000ms. It looks like the default timeout value of 10000ms minus a 1000ms margin is used instead.

Error:

"message":"Couldn't retrieve status for Instance(id=c759772f591b........"
"logger_name":"de.codecentric.boot.admin.server.services.StatusUpdater"
"stack_trace":"<#ac00dbe2> java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 9000ms in 'log' (and no fallback has been configured)\n\tat reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.handleTimeout(FluxTimeout.java:296)\n\tat reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.doTimeout(FluxTimeout.java:281)\n\tat reactor.core.publisher.FluxTimeout$TimeoutTimeoutSubscriber.onNext(FluxTimeout.java:420)\n\tat reactor.core.publisher.FluxOnErrorReturn$ReturnSubscriber.onNext(FluxOnErrorReturn.java:162)\n\tat reactor.core.publisher.MonoDelay$MonoDelayRunnable.propagateDelay(MonoDelay.java:270)\n\tat reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:285)\n\tat reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)\n\tat reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\n\tat java.lang.Thread.run(Thread.java:840)\n"}

The margin of 1000ms is introduced here: https://github.com/codecentric/spring-boot-admin/blob/master/spring-boot-admin-server/src/main/java/de/codecentric/boot/admin/server/services/StatusUpdater.java#L85.

But I suspect the issue might be this line: https://github.com/codecentric/spring-boot-admin/blob/master/spring-boot-admin-server/src/main/java/de/codecentric/boot/admin/server/services/StatusUpdateTrigger.java#L57.

To me it seems strange that the interval is used as the timeout.

If I change spring.boot.admin.monitor.status-interval to a value other than the default of 10000ms, the error message reflects that value instead, which confirms that this interval is actually being used as the timeout.
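To make the observed arithmetic concrete, here is a minimal sketch of the behaviour described above. `ObservedStatusTimeout` is a hypothetical helper, not Spring Boot Admin code, and it assumes the 1000ms margin is subtracted from whatever status-interval is configured (the error message above only shows this directly for the default interval).

```java
import java.time.Duration;

// Hypothetical helper, not part of Spring Boot Admin. It models the behaviour
// reported in this issue: the status check times out after the status interval
// minus a margin, regardless of the configured default-timeout.
final class ObservedStatusTimeout {

    // Margin inferred from the report: 10000ms interval, 9000ms observed timeout.
    static final Duration MARGIN = Duration.ofMillis(1000);

    // defaultTimeout is accepted but deliberately unused, mirroring the
    // observation that it has no effect on status checks.
    static Duration of(Duration statusInterval, Duration defaultTimeout) {
        return statusInterval.minus(MARGIN);
    }

    public static void main(String[] args) {
        Duration timeout = of(Duration.ofMillis(10000), Duration.ofMillis(61000));
        System.out.println(timeout.toMillis() + "ms"); // prints "9000ms"
    }
}
```

With default-timeout at 61000 and status-interval left at 10000, the sketch yields the same 9000ms that shows up in the stack trace.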

@erikpetzold
Member

At first sight there are two things to investigate:

  • spring.boot.admin.monitor.default-timeout is used by timeoutInstanceExchangeFilter. So if there is a filter, why do we set the timeout directly in StatusUpdater? Maybe because of the margin: as far as I understand, it should make sure that responses do not overrun each other exactly at the point where the status is changing.
  • Can we simply pass the value of the property from StatusUpdateTrigger to StatusUpdater? Can there be problems if the timeout is larger than the interval?
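One way to reason about the second question is a rough sketch of what could happen if the timeout were allowed to exceed the interval. It assumes a new status check is started every interval whether or not earlier ones have finished, which is an assumption about the scheduling and not something confirmed in this thread. `PendingChecks` is a hypothetical name.

```java
import java.time.Duration;

// Hypothetical illustration, not Spring Boot Admin code.
final class PendingChecks {

    // Upper bound on simultaneously pending checks for one instance, assuming
    // a new check starts every `interval` and each may wait up to `timeout`.
    static long maxPending(Duration interval, Duration timeout) {
        long i = interval.toMillis();
        long t = timeout.toMillis();
        if (i <= 0 || t <= 0) {
            throw new IllegalArgumentException("durations must be positive");
        }
        return (t + i - 1) / i; // ceiling of t / i
    }

    public static void main(String[] args) {
        // Values from the issue: 10000ms interval, 61000ms requested timeout.
        System.out.println(maxPending(Duration.ofMillis(10000), Duration.ofMillis(61000))); // prints 7
        // Timeout below the interval: at most one check is pending.
        System.out.println(maxPending(Duration.ofMillis(10000), Duration.ofMillis(9000))); // prints 1
    }
}
```

Under that assumption, a timeout below the interval keeps at most one check in flight, while the values from the issue would allow several slow responses to overlap.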

@erikpetzold
Member

Hi, we had a look at this again and agreed that the implementation is correct. It is a bit unintuitive, though.

The default-timeout applies to all requests from the admin server to a monitored instance.

However, for interval-based tasks like statusUpdate there are some more limitations. The timeout cannot be longer than the interval, so the interval is the upper bound for the timeout. We will update the docs to clarify this and maybe also log a warning when the timeout is larger than the interval.
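The rule described here (the interval caps the timeout, plus a warning when the configured timeout is larger) could be sketched roughly as follows. `IntervalBoundTimeout` and its method names are hypothetical, the warning text is made up for illustration, and the sketch leaves out the 1000ms margin mentioned in the issue description. It is not the project's implementation.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical sketch of "the interval is the upper bound for the timeout".
final class IntervalBoundTimeout {

    // Returns the smaller of the configured timeout and the task interval.
    static Duration effective(Duration defaultTimeout, Duration interval) {
        return defaultTimeout.compareTo(interval) > 0 ? interval : defaultTimeout;
    }

    // Returns a warning text when the configured timeout exceeds the interval,
    // or an empty Optional when the configuration is consistent.
    static Optional<String> warning(Duration defaultTimeout, Duration interval) {
        if (defaultTimeout.compareTo(interval) > 0) {
            return Optional.of("default-timeout (" + defaultTimeout.toMillis()
                    + "ms) is larger than the interval (" + interval.toMillis()
                    + "ms); the interval is used as the upper bound");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofMillis(61000);  // value from the issue
        Duration interval = Duration.ofMillis(10000); // default status-interval
        System.out.println(effective(timeout, interval).toMillis() + "ms"); // prints "10000ms"
        warning(timeout, interval).ifPresent(System.out::println);
    }
}
```

With the values from this issue, the configured 61000ms is capped at the 10000ms interval and a warning is produced; a timeout below the interval passes through unchanged.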

SteKoe added a commit that referenced this issue Apr 26, 2024
…t and interval (#3333)

* chore(#3184): add warning log when default timeout is larger than info timeout

* chore(#3184): add info to documentation
@SteKoe
Contributor

SteKoe commented Apr 28, 2024

We have updated the log and added a warn message. It will be shipped in the next release.

SteKoe closed this as completed Apr 28, 2024