Pools/connections not being closed in tests #1824
#1799 was designed specifically to fix this situation, so I don't understand how it could have made it worse unless it introduced an […]. Since you're creating a number of different […]. Since the pools are short-lived, you can also turn off the idle reaper by setting […]. Though, it might make sense to have a […].
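For reference, a minimal sketch of what that suggestion could look like in code. The specific options are an assumption (the inline code spans above were lost), but `PoolOptions` accepts `None` for both `max_lifetime` and `idle_timeout`, and with neither set the idle reaper should have nothing to do:

```rust
use std::time::Duration;
use sqlx::postgres::PgPoolOptions;

// Sketch: a short-lived test pool configured so the background idle reaper has
// nothing to run on. Choosing `max_lifetime(None)` / `idle_timeout(None)` is an
// assumption about which knobs the comment above meant.
async fn short_lived_pool(db_url: &str) -> sqlx::Result<sqlx::PgPool> {
    PgPoolOptions::new()
        .max_connections(5)
        .connect_timeout(Duration::from_secs(5)) // fail fast instead of retrying forever
        .max_lifetime(None) // never recycle connections by age
        .idle_timeout(None) // never reap idle connections
        .connect(db_url)
        .await
}
```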
That does sound pretty nifty but I was hoping to not have to make a pool for my connection pools 😆 This may be the way though!
We actually do have […].

Example Local Output: […]
The tests start taking over 60s because the code that checks out the pool retries in a loop when it fails to connect to the DB, so the tests just spin for a long time; but if I turn on all the logging, I see the above-mentioned error.
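To make the 60s figure concrete, the checkout code is shaped roughly like the hypothetical retry loop below (not the real function, just the pattern): the connect error is swallowed and retried until a deadline, so a failed connection shows up as a long hang rather than an immediate test failure.

```rust
use std::time::{Duration, Instant};

// Hypothetical sketch of a "retry until a deadline" pool checkout. The real
// function differs, but this is why a connect failure turns into a ~60s hang
// instead of an immediate error, and why the error only shows up with logging on.
async fn get_pool_with_retries(db_url: &str) -> Result<sqlx::PgPool, sqlx::Error> {
    let deadline = Instant::now() + Duration::from_secs(60);
    loop {
        match sqlx::PgPool::connect(db_url).await {
            Ok(pool) => return Ok(pool),
            Err(err) if Instant::now() < deadline => {
                // Swallow the error and retry until the deadline passes.
                eprintln!("connect failed, retrying: {}", err);
                tokio::time::sleep(Duration::from_millis(500)).await;
            }
            Err(err) => return Err(err),
        }
    }
}
```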
Ah, that's great - I didn't know if […].
That also sounds really cool and would help our integration tests that connect to the various DB clones. Our integration tests use the same pool connection function that our actual application uses, so it maybe can't benefit from this (or the other suggestions above - I'll see!).
I'll spend some more time today trying to get an actual minimal reproduction, but I'm having trouble understanding the original reproducer - is it semi-obvious to you that, with 100ms sleeps (also tried with 1s sleeps), the pools wouldn't close? It doesn't look like […].
Alrighty, I have a consistent reproducer on my local at least. I made 3 changes:

Dependencies

I needed to lock in:

```toml
[dev-dependencies]
sqlx = { version = "=0.5.11", default-features = false, features = ["runtime-actix-rustls", "postgres"] }
sqlx-core = "=0.5.11"
tokio = { version = "=1.17.0", default-features = false, features = ["macros", "rt"] }
```

Pool Acquisition

I added some code that's more similar to how our actual function sets up a pool. I removed some looping and error handling, but one important thing is that we test the connection with a query before returning it (not sure if that's necessary - the issue is still open, but if I stop my DB server it seems to correctly return an error without that query - maybe because we have a timeout?).

```rust
async fn get_pool() -> sqlx::PgPool {
let db_url = "your_connection_string_here";
let pool = sqlx::PgPool::connect(db_url).await.unwrap();
// NOTE: Commenting out this query allows the tests to pass ... freaky
// sqlx will not fail if it cannot connect to the database at all.
// Until [this](https://github.com/launchbadge/sqlx/issues/744) is fixed,
// we need to make a dummy query.
sqlx::query("SELECT 1").execute(&pool).await.unwrap();
pool
}
async fn handwavy_test_setup() {
let pool = get_pool().await;
let tx = pool.begin().await.unwrap();
// Do actual test setup stuff ...
tx.commit().await.unwrap();
// This "fixes" the "problem"
// pool.close().await;
}
```

Test Arrangement

I went away from the weird loop to the "lots of actual tests" approach.

```rust
// Generate a test so we can fit this all on one screen.
macro_rules! t {
($test_name:tt) => {
#[tokio::test]
async fn $test_name() {
handwavy_test_setup().await;
// Do test stuff
tokio::time::sleep(std::time::Duration::from_millis(10)).await;
}
};
}
#[rustfmt::skip]
mod actual_tests {
use super::*;
t!(a0); t!(a1); t!(a2); t!(a3); t!(a4); t!(a5); t!(a6); t!(a7); t!(a8); t!(a9);
t!(b0); t!(b1); t!(b2); t!(b3); t!(b4); t!(b5); t!(b6); t!(b7); t!(b8); t!(b9);
t!(c0); t!(c1); t!(c2); t!(c3); t!(c4); t!(c5); t!(c6); t!(c7); t!(c8); t!(c9);
t!(d0); t!(d1); t!(d2); t!(d3); t!(d4); t!(d5); t!(d6); t!(d7); t!(d8); t!(d9);
t!(e0); t!(e1); t!(e2); t!(e3); t!(e4); t!(e5); t!(e6); t!(e7); t!(e8); t!(e9);
t!(f0); t!(f1); t!(f2); t!(f3); t!(f4); t!(f5); t!(f6); t!(f7); t!(f8); t!(f9);
t!(g0); t!(g1); t!(g2); t!(g3); t!(g4); t!(g5); t!(g6); t!(g7); t!(g8); t!(g9);
t!(h0); t!(h1); t!(h2); t!(h3); t!(h4); t!(h5); t!(h6); t!(h7); t!(h8); t!(h9);
t!(i0); t!(i1); t!(i2); t!(i3); t!(i4); t!(i5); t!(i6); t!(i7); t!(i8); t!(i9);
t!(j0); t!(j1); t!(j2); t!(j3); t!(j4); t!(j5); t!(j6); t!(j7); t!(j8); t!(j9);
t!(k0); t!(k1); t!(k2); t!(k3); t!(k4); t!(k5); t!(k6); t!(k7); t!(k8); t!(k9);
t!(l0); t!(l1); t!(l2); t!(l3); t!(l4); t!(l5); t!(l6); t!(l7); t!(l8); t!(l9);
}
```

Result

With this set up I get consistent failures with […] (both […]).

Full Code

```rust
#[cfg(test)]
mod tests {
async fn get_pool() -> sqlx::PgPool {
let db_url = "your_connection_string_here";
let pool = sqlx::PgPool::connect(db_url).await.unwrap();
// NOTE: Commenting out this query allows the tests to pass ... freaky
// sqlx will not fail if it cannot connect to the database at all.
// Until [this](https://github.com/launchbadge/sqlx/issues/744) is fixed,
// we need to make a dummy query.
sqlx::query("SELECT 1").execute(&pool).await.unwrap();
pool
}
async fn handwavy_test_setup() {
let pool = get_pool().await;
let tx = pool.begin().await.unwrap();
// Do actual test setup stuff ...
tx.commit().await.unwrap();
// This "fixes" the "problem"
// pool.close().await;
}
// Generate a test so we can fit this all on one screen.
macro_rules! t {
($test_name:tt) => {
#[tokio::test]
async fn $test_name() {
handwavy_test_setup().await;
// Do test stuff
tokio::time::sleep(std::time::Duration::from_millis(10)).await;
}
};
}
#[rustfmt::skip]
mod actual_tests {
use super::*;
t!(a0); t!(a1); t!(a2); t!(a3); t!(a4); t!(a5); t!(a6); t!(a7); t!(a8); t!(a9);
t!(b0); t!(b1); t!(b2); t!(b3); t!(b4); t!(b5); t!(b6); t!(b7); t!(b8); t!(b9);
t!(c0); t!(c1); t!(c2); t!(c3); t!(c4); t!(c5); t!(c6); t!(c7); t!(c8); t!(c9);
t!(d0); t!(d1); t!(d2); t!(d3); t!(d4); t!(d5); t!(d6); t!(d7); t!(d8); t!(d9);
t!(e0); t!(e1); t!(e2); t!(e3); t!(e4); t!(e5); t!(e6); t!(e7); t!(e8); t!(e9);
t!(f0); t!(f1); t!(f2); t!(f3); t!(f4); t!(f5); t!(f6); t!(f7); t!(f8); t!(f9);
t!(g0); t!(g1); t!(g2); t!(g3); t!(g4); t!(g5); t!(g6); t!(g7); t!(g8); t!(g9);
t!(h0); t!(h1); t!(h2); t!(h3); t!(h4); t!(h5); t!(h6); t!(h7); t!(h8); t!(h9);
t!(i0); t!(i1); t!(i2); t!(i3); t!(i4); t!(i5); t!(i6); t!(i7); t!(i8); t!(i9);
t!(j0); t!(j1); t!(j2); t!(j3); t!(j4); t!(j5); t!(j6); t!(j7); t!(j8); t!(j9);
t!(k0); t!(k1); t!(k2); t!(k3); t!(k4); t!(k5); t!(k6); t!(k7); t!(k8); t!(k9);
t!(l0); t!(l1); t!(l2); t!(l3); t!(l4); t!(l5); t!(l6); t!(l7); t!(l8); t!(l9);
}
}
```
After trying the repro locally, it does seem to be leaking connections for some reason, but I can't fathom why. #1799 shouldn't have introduced it, although the extra cloning of […].
Okay, it took me an embarrassingly long time to find the problem, but I think I've figured it out. In hindsight, it should have been painfully obvious from the diff of #1799, but GitHub collapsed the lines by default because they weren't changed: https://github.com/launchbadge/sqlx/pull/1799/files#diff-782a6e0cb2edbd43e45dfef48fae8534c6a8f46cddebca66256e6f88819f7ef0R397-R399

That […]. That means that the […].
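For anyone skimming: as I read the lines linked above, this is the classic failure mode where a detached background task (here, presumably the pool's idle reaper) captures a strong `Arc` to the shared state, so the connections stay alive even after every handle the tests hold has been dropped. A simplified sketch of that pattern, not the actual sqlx internals:

```rust
use std::sync::{Arc, Weak};
use std::time::Duration;

// Stand-in for the pool's shared state; dropping it is the analogue of the
// connections finally being closed.
struct Shared;

impl Drop for Shared {
    fn drop(&mut self) {
        println!("shared state dropped, connections closed");
    }
}

// Problematic shape: the spawned task owns a strong Arc, so the shared state
// (and its connections) lives for as long as the task keeps looping.
fn spawn_reaper_strong(shared: Arc<Shared>) {
    tokio::spawn(async move {
        loop {
            tokio::time::sleep(Duration::from_secs(1)).await;
            let _still_alive = &shared; // keeps the strong count above zero forever
        }
    });
}

// Fix-shaped alternative: the task holds only a Weak handle and exits once the
// last strong handle (e.g. the Pool a test held) is dropped, letting Drop run.
fn spawn_reaper_weak(shared: &Arc<Shared>) {
    let weak: Weak<Shared> = Arc::downgrade(shared);
    tokio::spawn(async move {
        loop {
            tokio::time::sleep(Duration::from_secs(1)).await;
            match weak.upgrade() {
                Some(_shared) => { /* do reaper work with the upgraded handle */ }
                None => break, // every user handle is gone; stop reaping
            }
        }
    });
}
```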
That seems incredibly sneaky, and I find this not at all an embarrassingly long time for debugging 😆 And, for me, that link doesn't obviously go anywhere (maybe because the lines are collapsed by default, so GH can't jump to the hidden line or something?).
It's here: sqlx-core/src/pool/inner.rs, lines 397 to 399 (at fa5c436).
Would it be sufficient to call […]?
Crazy High Level

After upgrading from 0.5.11 to 0.5.12 (and trying 0.5.13), our test suite fails because of this error: […]

I cloned sqlx and git bisected it and got to #1799.
Details And Debugging

Every test in this particular test suite runs with #[tokio::test], so that seems awfully related to the discussion of runtimes in #1396 (comment). Every test in this test suite calls a function which creates its own PgPool to run some queries. This is, of course, not what pools are for, but if I add a pool.close().await to the end of that function, the tests all pass.

Originally I thought I'd just do that, problem solved. However, we also have a bunch of tests that, I think, can't share a pool. Every unit test clones a template database and then connects a pool to its specific database. This ensures no tests collide, which is nice, but, I think, prevents us from sharing one pool among them.
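For context, the per-test setup looks roughly like the hypothetical sketch below; the database names and URLs are invented for illustration, not our real configuration, but it shows why there is no single pool to share:

```rust
// Hypothetical sketch of the "clone a template database per test" setup; the
// names (`template_db`, the connection URLs) are made up for illustration.
async fn per_test_pool(test_db: &str) -> sqlx::PgPool {
    // Connect to the maintenance database just long enough to clone the template.
    let admin = sqlx::PgPool::connect("postgres://postgres@localhost/postgres")
        .await
        .unwrap();
    sqlx::query(&format!("CREATE DATABASE {} TEMPLATE template_db", test_db))
        .execute(&admin)
        .await
        .unwrap();
    admin.close().await;

    // Each test then gets its own pool pointed at its own clone, so tests never
    // collide, but it also means every test would have to close its pool itself.
    sqlx::PgPool::connect(&format!("postgres://postgres@localhost/{}", test_db))
        .await
        .unwrap()
}
```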
Minimal Reproduction (sort of)

This reproducer shows the "bug" in both 0.5.11 and 0.5.13, so maybe this is something we were bound to run into eventually, but our test suite happens to consistently pass on 0.5.11 and consistently fail otherwise. This also, obviously, isn't how our tests work - we don't create the pools in a loop; each test creates its own. But this is fewer lines of code :-).

Cargo.toml: […]
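The reproducer code itself isn't captured above. Based on the description (pools created in a loop, with short sleeps, never explicitly closed), it was presumably shaped something like this hypothetical sketch; the URL, iteration count, and sleep duration are guesses:

```rust
// Hypothetical reconstruction of the "pools in a loop" reproducer described
// above, not the original code.
#[tokio::test]
async fn pools_in_a_loop() {
    let db_url = "your_connection_string_here";
    for _ in 0..200 {
        let pool = sqlx::PgPool::connect(db_url).await.unwrap();
        sqlx::query("SELECT 1").execute(&pool).await.unwrap();
        // The pool is dropped here but never explicitly closed, so its
        // connections can linger if something keeps the pool internals alive.
        tokio::time::sleep(std::time::Duration::from_millis(100)).await;
    }
}
```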
Versions And Things

- Our test suite passes on sqlx 0.5.11, did not on 0.5.12, does not on 0.5.13
- The reproducer fails on sqlx 0.5.11 and 0.5.13 (at least)
- Database: postgresql