Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: sqlx::macro db cleanup race condition by adding a margin to current timestamp #2640

Merged

Conversation

fhsgoncalves
Copy link
Contributor

@fhsgoncalves fhsgoncalves commented Jul 21, 2023

Issue

As described in #2631, the #[sqlx::test] macro is occasionally failing to connect to the created database.

Distinct from the Instant type, this time measurement is not monotonic. This means that you can save a file to the file system, then save another file to the file system, and the second file has a SystemTime measurement earlier than the first. In other words, an operation that happens after another operation in real time may have an earlier SystemTime!

Solution

  • The easiest way to fix this issue was to add a little margin (2 seconds) when comparing the timestamps for cleaning up the old databases.

WDYT of this approach? I'm testing here and it seems to have fixed the issue! The margin doesn't guarantee that the error won't occur again, but it decreases a lot the probability of it happening.

@abonander
Copy link
Collaborator

The real issue is with the multiprocess model of cargo-nextest, which this was not designed for, which has been a known issue: #2123

The existing code assumes all testing is happening in the same process, which is why it uses a static AtomicBool to decide whether to clean up or not.

You'd need some sort of multi-process lock to fix this properly. I had a brief look around for crates implementing something like that but didn't find anything I particularly liked:

The POSIX specification has named semaphores but I can't find any crate wrapping it that's even worth considering. named_semaphore hasn't been updated in 2 years and its API is completely undocumented. sema also hasn't been updated in 6 years, and specifically doesn't do named semaphores.

@abonander
Copy link
Collaborator

Alternatively, we could cooperate with the nextest authors to add support for lifecycle hooks so we don't have to call the cleanup function within the tests themselves.

@fhsgoncalves
Copy link
Contributor Author

fhsgoncalves commented Aug 2, 2023

That's interesting.

So using cargo test should always work? Because I'm experiencing the same issue when running with cargo test.
Even forcing just one job (with --jobs 1) I got the error frequently.

@gengjun
Copy link

gengjun commented Oct 9, 2023

That's interesting.

So using cargo test should always work? Because I'm experiencing the same issue when running with cargo test. Even forcing just one job (with --jobs 1) I got the error frequently.

yes, i can confirm even with just cargo test, two separate sqlx::test will fail or not fail randomly. My case is different, it's due to multiple tests sharing one global static PgPool.

@abonander
Copy link
Collaborator

I'll accept this as a marginal improvement but note that I would prefer some cooperation from nextest here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants