Change random test shuffling technique #1908
Conversation
Force-pushed from c7cd0d5 to 293ad88
After reading some of the commit history in this area (specifically 535da5c), I realise that this is not entirely platform independent yet.
Codecov Report

```diff
@@            Coverage Diff             @@
##           master    #1908      +/-   ##
==========================================
+ Coverage   86.49%   86.82%   +0.33%
==========================================
  Files         131      131
  Lines        3900     3922      +22
==========================================
+ Hits         3373     3405      +32
+ Misses        527      517      -10
```
I'd be happy to add a test for this, but after a brief glance at your testing infrastructure it wasn't clear to me where a test of a command-line parameter like this belongs.
I like the general idea, but I will have to think about the approach a bit. I have two misgivings right now: the cost of the sort recomputing each test's hash on every comparison (O(n log n) hash computations in total), and whether the resulting ordering is actually random enough.

The first one should be easy enough to fix -- we can compute each hash once, sort those, and then project the order back to the test cases -- or we can benchmark, say, 100k test cases and see what it does. The second one is obviously a bit harder and I will have to think about it first.

As to tests, generally everything possible should be tested via CTest. I do not see how to easily test this specific feature using just the facilities provided by CTest, so it will likely require a Python script that performs the actual test and reports the result back. The way I see it, the test should work something like
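The compute-once-then-sort idea above is the classic decorate-sort-undecorate pattern. A minimal sketch, not Catch2's actual code (`hash_fn` here is a stand-in for whatever hash ends up being used):

```python
def hash_once_sort(tests, hash_fn):
    # Compute each test's hash exactly once (n hash computations),
    # instead of recomputing it inside every comparison of the sort.
    decorated = [(hash_fn(t), t) for t in tests]
    decorated.sort(key=lambda pair: pair[0])  # comparisons reuse the cached hashes
    # Project the sorted order back onto the test cases.
    return [t for _, t in decorated]

# Toy "hash" for illustration: sum of character codes.
print(hash_once_sort(["b", "a", "c"], hash_fn=lambda t: sum(map(ord, t))))
# ['a', 'b', 'c']
```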
The script will also need some sanity checks to avoid false negatives. The ones I can think of off the top of my head are that the number of tests returned in step 1 should be greater than in step 2, step 2 should return more tests than step 3, and there should always be at least 2 returned tests.
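Those sanity checks reduce to a small predicate. A sketch under the assumption that each step yields a list of selected test names (the step definitions themselves are not visible in the quoted comment, so `step1`/`step2`/`step3` are hypothetical placeholders):

```python
def counts_are_sane(step1, step2, step3):
    # Each successive step must select strictly fewer tests than the
    # previous one, and every step must return at least 2 tests;
    # otherwise the ordering comparison could pass vacuously.
    return len(step1) > len(step2) > len(step3) >= 2

# A passing configuration ...
assert counts_are_sane(list("abcdef"), list("abcd"), list("ab"))
# ... and a failing one (step 2 did not shrink the selection).
assert not counts_are_sane(list("abcd"), list("abcd"), list("ab"))
```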
Force-pushed from 293ad88 to 2e7eb0b
I've added a test and rewritten the sorting code to only compute O(n) hashes rather than O(n log n). Regarding your other concern: I see no reason to believe there will be any less randomness than there was with `std::shuffle`. Should I move
Force-pushed from 08515cc to 4991e0e
My misgiving with this approach is whether the resulting ordering is really random enough. That is why I want to play with it a bit first -- generate some random seeds and check if there are suspicious results.

On a related note, to ensure stability properly, the code needs to handle hash collisions. I suggest just using the test case name as the fallback comparison if the hashes happen to be the same.
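The suggested collision handling amounts to a composite sort key: compare by hash first, and fall back to the test name when two hashes collide. An illustrative Python sketch, not the actual Catch2 implementation:

```python
def order_tests(names, hash_fn):
    # Primary key: the hash; tie-break: the test case name itself.
    # The name fallback keeps the ordering deterministic even when
    # two tests happen to hash to the same value.
    return sorted(names, key=lambda name: (hash_fn(name), name))

# With a degenerate hash that maps everything to 0, only the
# name-based fallback decides the order:
print(order_tests(["zeta", "alpha", "mid"], hash_fn=lambda n: 0))
# ['alpha', 'mid', 'zeta']
```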
Force-pushed from 4991e0e to 4dd47c4
Previously a random test ordering was obtained by applying std::shuffle to the tests in declaration order. This has two problems:

- It depends on the declaration order, so the order in which the tests will be run will be platform-specific.
- When trying to debug accidental inter-test dependencies, it is helpful to be able to find a minimal subset of tests which exhibits the issue. However, any change to the set of tests being run will completely change the test ordering, making it difficult or impossible to reduce the set of tests being run in any reasonably efficient manner.

Therefore, change the randomization approach to resolve both these issues.

Generate a random value based on the user-provided RNG seed. Convert every test case to an integer by hashing a combination of that value with the test name. Sort the test cases by this integer. The test names and RNG are platform-independent, so this should be consistent across platforms. Also, removing one test does not change the integer value associated with the remaining tests, so they remain in the same order.

To hash, use the FNV-1a hash, except with the basis being our randomly selected value rather than the fixed basis set in the algorithm. Cannot use std::hash, because it is important that the result be platform-independent.
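The scheme in the commit message can be modelled in a few lines. This is an illustrative Python sketch, not the C++ code from the PR: `derive_basis` stands in for however Catch2 turns the user seed into the random basis (one SplitMix64 round here, which is an assumption), and a 64-bit FNV-1a is seeded with that basis instead of the algorithm's fixed offset basis:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
FNV_PRIME_64 = 0x100000001B3

def fnv1a_64(data: bytes, basis: int) -> int:
    # FNV-1a over the bytes, but starting from a caller-chosen basis
    # instead of the algorithm's fixed offset basis.
    h = basis
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK64
    return h

def derive_basis(seed: int) -> int:
    # Hypothetical seed->basis step (one SplitMix64 round). Catch2's
    # actual RNG differs, but any platform-independent mixer works here.
    z = (seed + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)

def randomized_order(test_names, seed):
    basis = derive_basis(seed)
    # Sort by the per-test hash; the name tie-break guards collisions.
    return sorted(test_names, key=lambda n: (fnv1a_64(n.encode(), basis), n))

tests = ["parsing", "matchers", "generators", "benchmarks"]
full = randomized_order(tests, seed=42)
# Removing one test leaves the relative order of the rest unchanged:
reduced = randomized_order([t for t in tests if t != full[0]], seed=42)
assert reduced == [t for t in full if t != full[0]]
```

The subset-stability property holds because each test's sort key depends only on the seed and the test's own name, never on which other tests are present in the run.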
Force-pushed from 4dd47c4 to f5d0139
FWIW, I have added a test which verifies that the seeds from 1 to 100 generate different orderings. Obviously that's not proof of much, but it does at least demonstrate that the distribution isn't utterly awful.
Done.
As I said before, thanks for this -- the idea is quite useful. I am going to do a little further refactoring later, but it is nothing that stops this from being merged.
--- Improvements ---

* Running tests in random order (`--order rand`) has been reworked significantly (#1908)
  * Given the same seed, all platforms now produce the same order
  * Given the same seed, the relative order of tests does not change if you select only a subset of them
* Vector matchers support custom allocators (#1909)
* `|` and `&` (bitwise or and bitwise and) are now supported in `CHECK` and `REQUIRE`
  * The resulting type must be convertible to `bool`

--- Fixes ---

* Fixed computation of benchmarking column widths in ConsoleReporter (#1885, #1886)
* Suppressed clang-tidy's `cppcoreguidelines-pro-type-vararg` in assertions (#1901)
  * It was a false positive triggered by the new warning support workaround
* Fixed bug in test specification parser handling of OR'd patterns using escaping (#1905)

--- Miscellaneous ---

* Worked around IBM XL's codegen bug (#1907)
  * It would emit code for _destructors_ of temporaries in an unevaluated context
* Improved detection of stdlib's support for `std::uncaught_exceptions` (#1911)
This workaround was added by #27165. The need for it was removed by catchorg/Catch2#1908 and the import of the new code.
Description
Previously a random test ordering was obtained by applying `std::shuffle` to the tests in declaration order. This has two problems:

- It depends on the declaration order, so the order in which the tests will be run will be platform-specific.
- When trying to debug accidental inter-test dependencies, it is helpful to be able to find a minimal subset of tests which exhibits the issue. However, any change to the set of tests being run will completely change the test ordering, making it difficult or impossible to reduce the set of tests being run in any reasonably efficient manner.

Therefore, change the randomization approach to resolve both these issues.
Generate a random value based on the user-provided RNG seed. Convert every test case to an integer by hashing a combination of that value with the test name. Sort the test cases by this integer.
The test names and RNG are platform-independent, so this should be consistent across platforms. Also, removing one test does not change the integer value associated with the remaining tests, so they remain in the same order.
To hash, use the FNV-1a hash, except with the basis being our randomly selected value rather than the fixed basis set in the algorithm. Cannot use `std::hash`, because it is important that the result be platform-independent.
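The platform-independence point is worth making concrete: the result of `std::hash` is unspecified and differs between standard library implementations, whereas FNV-1a is defined purely by byte-level arithmetic, so it yields identical values everywhere. A sketch of the standard 64-bit variant with its fixed basis (the PR replaces that basis with the seeded random value):

```python
FNV_BASIS_64 = 0xCBF29CE484222325  # the algorithm's fixed offset basis
FNV_PRIME_64 = 0x100000001B3

def fnv1a_64(data: bytes, basis: int = FNV_BASIS_64) -> int:
    # XOR the byte in first, then multiply: that operation order is
    # what makes this FNV-1a rather than FNV-1.
    h = basis
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & 0xFFFFFFFFFFFFFFFF
    return h

# Well-known FNV-1a 64-bit test vector -- the same on every platform:
assert fnv1a_64(b"a") == 0xAF63DC4C8601EC8C
```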