'acquired' channels not decremented in some timeout scenarios #9448

Closed
dhofftgt opened this issue Jun 14, 2023 · 5 comments
@dhofftgt
Contributor

Expected Behavior

Acquired count should reflect channels in use in all cases.

Actual Behaviour

In scenarios where the client has a fixed channel pool and the pool is backed up, we see that the 'acquired' channel tracking can 'leak': the count is incremented but never comes back down, causing all subsequent requests to time out while waiting for a channel.

Steps To Reproduce

See example application:

The linked project reproduces an issue where the 'acquired' channel count maintained by the client's FixedChannelPool is not decremented when the Netty client threads are unresponsive (this seems to correlate with io.micronaut.http.client.exceptions.ReadTimeoutException being thrown).

The MicronautAcquireLeakTest reproduces the issue under the following conditions:

  • max connections is set for the client
  • the @Client returns a future with a POJO deserialized from JSON (a sketch of such a client follows this list)
  • the Netty client thread is unresponsive or backed up, causing io.micronaut.http.client.exceptions.ReadTimeoutException to be thrown by DefaultHttpClient (I blocked the thread in the thenApply on a future returned by the @Client; note it is odd that this runs on the Netty thread rather than the I/O executor)
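A minimal sketch of the kind of declarative client these conditions describe; the interface, path, and POJO names are hypothetical placeholders, and the real code lives in the linked example project:

```java
import io.micronaut.core.annotation.Introspected;
import io.micronaut.http.annotation.Get;
import io.micronaut.http.client.annotation.Client;
import java.util.concurrent.CompletableFuture;

// Hypothetical POJO deserialized from the JSON response body.
@Introspected
class DemoResponse {
    public String message;
}

// Declarative client of the shape described above: it returns a future that
// completes with a POJO deserialized from JSON. The connection limit and read
// timeout would be capped via configuration, e.g.
// micronaut.http.client.pool.max-connections and micronaut.http.client.read-timeout.
@Client("/demo")
interface DemoClient {

    @Get("/item")
    CompletableFuture<DemoResponse> fetch();
}
```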

The test executes like this:

  • The test fires off a batch of HTTP client requests (see the test sketch after this list)
  • a callback on each response CompletableFuture blocks on a latch, causing the Netty client threads to hang and become unresponsive
  • on subsequent requests DefaultHttpClient gives up 1 second after the Netty read timeout and throws io.micronaut.http.client.exceptions.ReadTimeoutException
  • we wait 5 seconds to make sure everything has timed out, then release the latch so the client becomes responsive again
  • we wait for the client to be idle by checking that pendingTasks() is 0
  • acquiredChannels should be 0, but it is not
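
A rough sketch of the shape of such a test, reusing the hypothetical DemoClient from the sketch above; the request count and timings are illustrative only, and the actual test is MicronautAcquireLeakTest in the linked project:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

class AcquireLeakReproSketch {

    void reproduceAcquireLeak(DemoClient demoClient) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        List<CompletableFuture<DemoResponse>> requests = new ArrayList<>();

        for (int i = 0; i < 20; i++) {
            requests.add(demoClient.fetch().thenApply(pojo -> {
                try {
                    // Deliberately blocks the Netty event-loop thread so that later
                    // requests hit the client read timeout (ReadTimeoutException).
                    latch.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return pojo;
            }));
        }

        Thread.sleep(5_000); // let the later requests fail with ReadTimeoutException
        latch.countDown();   // release the latch so the event loop becomes responsive again
        // ...then wait until pendingTasks() is 0 and assert that acquiredChannels is 0
    }
}
```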

Environment Information

Mac M1, macOS 13.4
Micronaut version 3.9.3
openjdk 11.0.18 2023-01-17 LTS
OpenJDK Runtime Environment Zulu11.62+17-CA (build 11.0.18+10-LTS)
OpenJDK 64-Bit Server VM Zulu11.62+17-CA (build 11.0.18+10-LTS, mixed mode)

Example Application

https://github.com/dhofftgt/micronaut-acquire-leak

Version

3.9.3

yawkat added a commit that referenced this issue Jun 15, 2023
When there was a read timeout (i.e. timeout handled by DefaultHttpClient) while the Mono<PoolHandle> from ConnectionManager was not yet complete, the PoolHandle would be dropped silently.

This patch handles cancellation of the Mono properly, releasing the pool handle.

This is already fixed by the ConnectionManager rework in 4.0.

Fixes #9448
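
The mechanism the commit message describes boils down to: if the subscriber cancels the acquisition Mono before it completes, the pool handle that eventually arrives must still be released. Below is a minimal, self-contained Reactor sketch of that general pattern, not the actual Micronaut patch; PooledConnection stands in for Micronaut's internal PoolHandle, and the timings are made up for illustration.

```java
import reactor.core.Disposable;
import reactor.core.publisher.Mono;

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelReleaseSketch {

    static class PooledConnection {
        void release() {
            System.out.println("released back to the pool");
        }
    }

    public static void main(String[] args) throws Exception {
        // Pretend the pool hands out a connection two seconds from now.
        CompletableFuture<PooledConnection> slowAcquire =
                CompletableFuture.supplyAsync(PooledConnection::new,
                        CompletableFuture.delayedExecutor(2, TimeUnit.SECONDS));

        Mono<PooledConnection> acquire = Mono.create(sink -> {
            AtomicBoolean cancelled = new AtomicBoolean();
            // Remember that the subscriber gave up (e.g. because a read timeout fired).
            sink.onCancel(() -> cancelled.set(true));
            slowAcquire.whenComplete((conn, error) -> {
                if (error != null) {
                    sink.error(error);
                } else if (cancelled.get()) {
                    // Nobody is waiting any more: release the connection instead of
                    // dropping it silently, otherwise the pool's 'acquired' counter
                    // never comes back down.
                    conn.release();
                } else {
                    sink.success(conn);
                }
            });
        });

        Disposable subscription = acquire.subscribe(conn -> System.out.println("got a connection"));

        Thread.sleep(500);      // the client-side read timeout fires first...
        subscription.dispose(); // ...and the caller cancels the acquisition
        Thread.sleep(3000);     // "released back to the pool" is printed
    }
}
```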
@yawkat
Member

yawkat commented Jun 15, 2023

thanks for the report, i've made a pr to fix it.

please don't block the event loop like in your example though, netty does not like it :)

@dhofftgt
Contributor Author

Awesome! Thanks! Haha, yes. Blocking purely to create the conditions that reproduce the problem.

yawkat added a commit that referenced this issue Jun 16, 2023
When there was a read timeout (i.e. timeout handled by DefaultHttpClient) while the Mono<PoolHandle> from ConnectionManager was not yet complete, the PoolHandle would be dropped silently.

This patch handles cancellation of the Mono properly, releasing the pool handle.

This is already fixed by the ConnectionManager rework in 4.0.

Fixes #9448

Co-authored-by: Sergio del Amo <sergio.delamo@softamo.com>
@yawkat yawkat closed this as completed Jun 16, 2023
@stepanv
Contributor

stepanv commented Jun 16, 2023

That's perfect, I just wanted to report this problem.
@yawkat when can we expect this fix to be released? Thx!

@dhofftgt
Contributor Author

dhofftgt commented Jul 5, 2023

@yawkat I am also wondering when this will be in a release. We are currently holding off upgrading until this fix is available.

@yawkat
Member

yawkat commented Jul 10, 2023

I'm not sure; 4.0 will probably be released before then, and 4.0 never had this issue.
