Intermittent coverage failures (some test runs not counted) #91092
Comments
@rustbot label A-code-coverage
I see you are using the
Gave it a try; still exhibits the same behavior.
Just learned that the
Not closing this issue, though. Rust is supposed to be "fearless concurrency" so this should work correctly even if the test runner is using multiple threads.
184: Limit the number of threads to work around rust-lang/rust#91092 r=taiki-e a=taiki-e Co-authored-by: Taiki Endo <te316e89@gmail.com>
184: Limit the number of test threads to work around rust-lang/rust#91092 r=taiki-e a=taiki-e Co-authored-by: Taiki Endo <te316e89@gmail.com>
According to https://github.com/taiki-e/cargo-llvm-cov#known-limitations, it only defaults to 1 thread because of rustc issue rust-lang/rust#91092. But the issue seems to be that, relatively infrequently, some tests fail to be reported, which is fine with me if it makes the CI faster. And they are talking about thousands of tests, while we probably have <100.
Still reproducible with
I used a script like this to run against https://github.com/scole66/rust-e262/tree/reduction-for-bugreport ,
at step 854, coverage
This is in cargo-binutils.
I filed a bug against upstream: llvm/llvm-project#62558
Looks like this is not an LLVM bug. There is a bool flag
void InstrProfiling::lowerIncrement(InstrProfIncrementInst *Inc) {
  auto *Addr = getCounterAddress(Inc);
  IRBuilder<> Builder(Inc);
  if (Options.Atomic || AtomicCounterUpdateAll ||
      (Inc->getIndex()->isZeroValue() && AtomicFirstCounter)) {
    Builder.CreateAtomicRMW(AtomicRMWInst::Add, Addr, Inc->getStep(),
                            MaybeAlign(), AtomicOrdering::Monotonic);
  } else {
    Value *IncStep = Inc->getStep();
    Value *Load = Builder.CreateLoad(IncStep->getType(), Addr, "pgocount");
    auto *Count = Builder.CreateAdd(Load, Inc->getStep());
    auto *Store = Builder.CreateStore(Count, Addr);
    if (isCounterPromotionEnabled())
      PromotionCandidates.emplace_back(cast<Instruction>(Load), Store);
  }
  Inc->eraseFromParent();
}
So if it is set to true via
So is there any way to set
$ rustc -C llvm-args='--help-list-hidden' | rg 'atomic-counter'
--atomic-counter-update-promoted - Do counter update using atomic fetch add for promoted counters only
--gcov-atomic-counter - Make counter updates atomic
--instrprof-atomic-counter-update-all - Make all profile counter updates atomic (for testing only)
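To make the `else` branch of `lowerIncrement` quoted above more concrete: it emits a separate load, add, and store, so two threads can both read the same old value and one increment is lost. The sketch below is only an illustration, not code from rustc or LLVM. `Counter`, `load`, `store`, `racy_interleaving`, and `serialized` are hypothetical names, and the "threads" are simulated in a single thread so the bad interleaving is deterministic.

```rust
/// Hypothetical stand-in for one coverage counter slot (not a real rustc/LLVM type).
struct Counter {
    value: u64,
}

/// First half of a non-atomic increment: read the current count.
fn load(c: &Counter) -> u64 {
    c.value
}

/// Second half of a non-atomic increment: write the new count back.
fn store(c: &mut Counter, v: u64) {
    c.value = v;
}

/// Two simulated threads interleave their load/store halves.
/// Both read 0, both write 1, so one increment is lost.
fn racy_interleaving() -> u64 {
    let mut c = Counter { value: 0 };
    let a = load(&c); // "thread A" reads 0
    let b = load(&c); // "thread B" reads 0 before A has stored
    store(&mut c, a + 1); // A writes 1
    store(&mut c, b + 1); // B also writes 1, overwriting A's update
    c.value
}

/// The same two increments with no interleaving: each load sees the previous store.
fn serialized() -> u64 {
    let mut c = Counter { value: 0 };
    let a = load(&c);
    store(&mut c, a + 1);
    let b = load(&c);
    store(&mut c, b + 1);
    c.value
}

fn main() {
    println!("interleaved: {}", racy_interleaving());
    println!("serialized:  {}", serialized());
}
```

If the counter that loses an update is the only record that a region was executed, that region would show up as uncovered, which is consistent with the intermittent misses described in this issue.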
I cannot see how any of these options can reach
So I created a PR to set
Looks like
where
…r=wesleywiser
Fix data race in llvm source code coverage
Fixes rust-lang#91092.
Before this patch, increment of counters for code coverage looks like this:
```
movq .L__profc__RNvCsd6wgJFC5r19_3lib6bugaga+8(%rip), %rax
addq $1, %rax
movq %rax, .L__profc__RNvCsd6wgJFC5r19_3lib6bugaga+8(%rip)
```
after this patch:
```
lock incq .L__profc__RNvCs3JgIB2SjHh2_3lib6bugaga+8(%rip)
```
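The `lock incq` form is an atomic read-modify-write. A rough Rust-level analogue, offered as a sketch rather than as what rustc generates, is `AtomicU64::fetch_add` with `Ordering::Relaxed` (Rust's counterpart to the `Monotonic` ordering in the LLVM snippet earlier in the thread). `count_hits` and its parameters are hypothetical names chosen for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: spawn `threads` workers that each bump a shared
/// counter `per_thread` times using an atomic read-modify-write.
fn count_hits(threads: u64, per_thread: u64) -> u64 {
    let counter = AtomicU64::new(0);
    // Scoped threads may borrow `counter`; all of them are joined
    // before `thread::scope` returns.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    // Atomic increment: no update can be lost, even under contention.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    counter.load(Ordering::Relaxed)
}

fn main() {
    let total = count_hits(8, 10_000);
    println!("total hits: {total}");
}
```

Relaxed ordering is enough here because only the final total matters, not the order in which the increments become visible to other threads.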
I have a large-ish project with ~6000 testcases, and for a long time now, code coverage has been hit-or-miss. The issue is that some of the test cases don't seem to get their data included in the profraw files, and so don't show up as having an effect on coverage. The unrecorded tests seem to be essentially random, but with thousands of tests, single-digit-percentage failures are noticed on every test run.
It's been annoying. So I finally sat down to try and reduce to a simplest error, but it still takes multiple files, so is difficult to include inline in a GitHub issue.
The tree:
When I run a test via:
which shows 2 tests run:
Then report on that particular function via:
Most of the time, I see the correct result:
But sometimes (about 1 in 170 times), I see this:
It seems to be related to having multiple source files; I could not get similar behavior with only a main.rs. I've seen the problem appear on both macOS (Mojave) and on Windows (in Windows Subsystem for Linux 2).
The two-source-file tree mentioned up above (including the script I run to repeat the test until a failure happens), is on a bug-report branch here: https://github.com/scole66/rust-e262/tree/reduction-for-bugreport
This really feels like whichever thread is controlling writes to the profraw file is missing messages. Queue overrun maybe? (I haven't looked.)
Meta
rustc --version --verbose: