Is your feature request related to a problem? Please describe.
We have a latency-sensitive system and we're seeing regular spikes in query latency. After some digging, I believe there are two interconnected issues:
1. Establishing a new database connection is slow (which I'm trying to improve here and here)
2. The connection pool does not reap old connections well, leading to a wave of new connections every max_lifetime
A somewhat related issue is #2854 - similar symptoms. All connections are replaced after max_lifetime AND this happens during acquire, so we observe spikes in query latency due to (1).
The issue with (2) is that the maintenance task only looks at (size - min_size) connections. This means that if size = min_size, it does nothing. We run with relatively high min_size to ensure we always have a good number of available connections, but after learning this, I suspect it may actually be counterproductive.
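To make the effect concrete, here is a minimal sketch of the behaviour as I understand it. This is a toy model, not sqlx's actual code: `reaped_by_maintenance` is a hypothetical helper, and the only thing it takes from the real pool is the assumption that a maintenance pass examines at most (size - min_size) connections.

```rust
use std::time::Duration;

/// Toy model (not sqlx code): returns the indices of the connections a
/// maintenance pass would close, assuming it only examines the surplus
/// over `min_size`, i.e. at most `size - min_size` connections.
fn reaped_by_maintenance(
    ages: &[Duration],
    min_size: usize,
    max_lifetime: Duration,
) -> Vec<usize> {
    // When size == min_size this is 0, so nothing is ever examined.
    let examinable = ages.len().saturating_sub(min_size);
    ages.iter()
        .enumerate()
        .take(examinable)
        .filter(|(_, age)| **age >= max_lifetime)
        .map(|(i, _)| i)
        .collect()
}
```

With ten connections that are all past max_lifetime and min_size = 10, this returns an empty list: every expired connection is left in place until something acquires it.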
Describe the solution you'd like
The pool should be replacing connections in the background. It should be extremely rare that a connection needs to be opened at query time. Even if opening the connection takes 20ms - we're writing Rust so my expectations are high ;-)
@abonander had a related comment in #2848:
The logic of acquire() should be changed to race between acquiring a connection from the idle queue and opening a new connection.
💯 . This would be amazing. But I don't think it'd fully solve my issue - the problem is that if all connections in the pool were opened around the same time, the next connection we acquire from the idle queue is likely to be above max lifetime as well.
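For reference, the race itself could look roughly like the sketch below. It uses plain threads and a channel rather than the pool's async machinery, and `Source` and `race_acquire` are hypothetical names for illustration only. The two sleeps stand in for "waiting on the idle queue" and "opening a connection".

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Source {
    Idle,
    Fresh,
}

/// Toy model (not sqlx code): races "wait for an idle connection" against
/// "open a new connection" and reports whichever finishes first.
fn race_acquire(idle_wait: Duration, connect_time: Duration) -> Source {
    let (tx, rx) = mpsc::channel();
    let tx_fresh = tx.clone();

    // Stand-in for waiting on the idle queue.
    thread::spawn(move || {
        thread::sleep(idle_wait);
        let _ = tx.send(Source::Idle);
    });

    // Stand-in for opening a brand-new connection.
    thread::spawn(move || {
        thread::sleep(connect_time);
        let _ = tx_fresh.send(Source::Fresh);
    });

    // The first message wins. The loser's send fails once `rx` is dropped;
    // a real pool would instead return the losing connection to the idle queue.
    rx.recv().expect("at least one sender reports back")
}
```

Note that the race only helps when the idle queue is empty or slow. It does nothing about an idle connection that is handed out immediately but is already past max_lifetime.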
My preferred combination of changes would be:
1. Don't check max_lifetime on acquire. Instead, check it on release in return_to_pool. If users want to check this on acquire, they are free to do so manually in before_acquire
2. Improve the replacement logic: go through the full pool one by one; if a connection is old or idle, close it and run min_connections_maintenance immediately
3. The suggestion above: acquire() should be changed to race between acquiring a connection from the idle queue and opening a new connection
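A rough sketch of the first change, checking max_lifetime on release instead of on acquire, might look like this. It is a toy model, not sqlx's implementation: `Pool`, `Conn`, and their methods are hypothetical, time is passed in explicitly as a `Duration` since pool start, and the replacement is opened synchronously where a real pool would do it in a background task.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Toy connection: `opened_at` is the time since pool start.
struct Conn {
    id: u32,
    opened_at: Duration,
}

/// Toy pool (not sqlx code) that checks max_lifetime on release only.
struct Pool {
    idle: VecDeque<Conn>,
    max_lifetime: Duration,
    opened: u32, // total connections opened so far; also the next id
}

impl Pool {
    fn new(size: u32, max_lifetime: Duration) -> Self {
        let mut pool = Pool { idle: VecDeque::new(), max_lifetime, opened: 0 };
        for _ in 0..size {
            let conn = pool.open(Duration::ZERO);
            pool.idle.push_back(conn);
        }
        pool
    }

    fn open(&mut self, now: Duration) -> Conn {
        let conn = Conn { id: self.opened, opened_at: now };
        self.opened += 1;
        conn
    }

    /// No lifetime check here: an idle connection is handed out as-is,
    /// so the caller never waits for a reconnect on the query path.
    fn acquire(&mut self, now: Duration) -> Conn {
        match self.idle.pop_front() {
            Some(conn) => conn,
            None => self.open(now),
        }
    }

    /// The lifetime check happens after the query is done: an expired
    /// connection is dropped and a replacement is queued in its place.
    fn release(&mut self, conn: Conn, now: Duration) {
        let expired = now.saturating_sub(conn.opened_at) >= self.max_lifetime;
        let keep = if expired { self.open(now) } else { conn };
        self.idle.push_back(keep);
    }
}
```

The point of the design is that the cost of reconnecting moves off the acquire path: a query may run on a connection slightly older than max_lifetime, but it does not pay for opening the replacement.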
I'm open to other solutions; happy to contribute changes if we agree on the path forward
We don't get the benefit of background max lifetime checking, but we run enough queries for this to be checked often enough. This chart shows the p99 latency of different queries after rolling out this change:
Basically: synchronous reconnections at query time became very rare.
I still think this change should be made in sqlx itself; it seems to be a much better default.