Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Known limitations of branch coverage instrumentation #124118

Open
4 tasks
Zalathar opened this issue Apr 18, 2024 · 9 comments
Open
4 tasks

Known limitations of branch coverage instrumentation #124118

Zalathar opened this issue Apr 18, 2024 · 9 comments
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Zalathar
Copy link
Contributor

Zalathar commented Apr 18, 2024

Branch coverage is known to support these branching language constructs:

  • if and while
    • Includes if let and let-chains
    • Treats nested || and && as separate branches
    • while desugars to if before MIR building, so we support it as a consequence of supporting if
  • Standalone || and && expressions
  • let-else statements
  • Match arm guards
    • These also desugar to if expressions before MIR building
      • This also means that we naturally support if let guards
    • But their presence makes it more complicated to instrument match arms, as described below

Branch coverage is known to not yet support these branching language constructs:

  • Individual arms and or-patterns in match expressions
    • We need to be sure to support (or gracefully not support) all four combinations of:
      • Arms with/without guards
        • The guards themselves are already supported (as indicated above), but their presence also complicates how we instrument the enclosing match arm, and other match arms in the same match
        • Note that the combination of guards and or-patterns results in particularly complex control-flow, because if a guard fails then we keep searching for other or-pattern matches in the same arm
      • Arms with/without or-patterns, including arbitrary nesting
    • coverage: Treat each match arm as a "branch" for branch coverage #124154
      • This draft does not support or-patterns, and might not be able to support them without substantial changes (requiring a different approach)
      • This draft uses complex counter expressions to work around not being able to instrument all the appropriate points in the control-flow graph directly; this ties back into concerns about or-patterns
  • The try ? operator (which conditionally returns)
    • This is desugared to a match expression, so once we support those, supporting ? should mostly be a matter of expansion-span bookkeeping
  • .await expressions (which implicitly “return” when the enclosing future is dropped)
    • This can probably be excluded from any MVP of branch coverage, unless it turns out to be unexpectedly easy after supporting ?
    • But some users will presumably want this in the long run
  • Any of the above branching constructs that are introduced by macro expansion
    • The current implementation discards any branch span that isn't “directly visible” in the function body

Relevant PR links:

(Rewritten on 2024-07-11; see edit history for the old text.)

@rustbot rustbot added needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) labels Apr 18, 2024
@Zalathar
Copy link
Contributor Author

Based on my investigations so far:

  • If-let (and friends) and let-else are relatively easy to support. I have implementations of them, though I haven't filed PRs yet because I'm wary of making any changes that will accidentally make the other branching constructs harder to support.
  • Match arms are tricky to support at the moment. Conceptually, it makes sense to treat each arm (except the last) as a branch, with the true arm leading to the guard check (if any) followed by the corresponding body, and the false arm leading to the test for the next arm. However, that control-flow structure doesn't necessarily exist in the lowering from THIR to MIR, so we don't always have a suitable place to insert block markers.
    • I have a sketch of a plan for how to deal with this, by inserting block markers in all the true arms, and then using coverage expressions to create synthetic counts for the false arms. However, this will require some non-trivial reshuffling of how the InstrumentCoverage MIR pass currently deals with branch coverage information.
  • The try operator desugars to a match expression, so if we support those, then we should be close to supporting try as well.
    • The caveat here is that we currently ignore all branches that come from expansion of any kind, so once we support match arms, we'll need to deal with that limitation as well.
  • Await seems complicated enough that I'm inclined to put it in the too-hard basket for now.

@Zalathar
Copy link
Contributor Author

Zalathar commented Apr 18, 2024

There's also some discussion at #79649 (comment) about whether the right-hand-side of a lazy boolean expression (&&/||) should be treated as a “branch” condition, even when it's not part of an if condition.

My view is that “branch coverage” should not be in the business of instrumenting things that clearly aren't branches at the language level, at least not by default. There are valid use-cases for that behaviour, but if it exists then it should require some additional level of opt-in beyond just enabling branch coverage.

(If people want to discuss this point further, I would suggest creating a separate issue for that discussion.)

@Lambdaris
Copy link

Is there a draft folk for if-let pattern?

I have tried a bit and found a possibly feasible way to insert block markers, see a naive draft.

Because if let would be taken as match guard this method may solve them together (not sure, I just check it could find the true/false blocks for code like if let (En::A(Some(a)) | En::B(a), En::C(b)) = _.

By this then BranchSpan might need to change its true_marker and false_marker to Vec<BlockMarkerId>.

@Zalathar
Copy link
Contributor Author

I managed to get my match-arm idea working, so I'll try to tie up the loose ends and post it as a PR.

@Zalathar
Copy link
Contributor Author

#124154 shows my current approach to instrumenting match arms for branch coverage.

@Zalathar
Copy link
Contributor Author

I now have draft PRs for if-let, let-else, and match arms (without or-patterns).

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Apr 22, 2024
coverage: Prepare for improved branch coverage

When trying to rebase my new branch coverage work (including rust-lang#124154) on top of the introduction of MC/DC coverage (rust-lang#123409), I found it a lot harder than anticipated. With the benefit of hindsight, the branch coverage code and MC/DC code have become more interdependent than I'm happy with.

This PR therefore disentangles them a bit, so that it will be easier for both areas of code to evolve independently without interference.

---

This PR also includes a few extra branch coverage tests that I had sitting around from my current branch coverage work. They mostly just demonstrate that certain language constructs listed in rust-lang#124118 currently don't have branch coverage support.

`@rustbot` label +A-code-coverage
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Apr 22, 2024
coverage: Prepare for improved branch coverage

When trying to rebase my new branch coverage work (including rust-lang#124154) on top of the introduction of MC/DC coverage (rust-lang#123409), I found it a lot harder than anticipated. With the benefit of hindsight, the branch coverage code and MC/DC code have become more interdependent than I'm happy with.

This PR therefore disentangles them a bit, so that it will be easier for both areas of code to evolve independently without interference.

---

This PR also includes a few extra branch coverage tests that I had sitting around from my current branch coverage work. They mostly just demonstrate that certain language constructs listed in rust-lang#124118 currently don't have branch coverage support.

``@rustbot`` label +A-code-coverage
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Apr 23, 2024
Rollup merge of rust-lang#124217 - Zalathar:pre-branch, r=oli-obk

coverage: Prepare for improved branch coverage

When trying to rebase my new branch coverage work (including rust-lang#124154) on top of the introduction of MC/DC coverage (rust-lang#123409), I found it a lot harder than anticipated. With the benefit of hindsight, the branch coverage code and MC/DC code have become more interdependent than I'm happy with.

This PR therefore disentangles them a bit, so that it will be easier for both areas of code to evolve independently without interference.

---

This PR also includes a few extra branch coverage tests that I had sitting around from my current branch coverage work. They mostly just demonstrate that certain language constructs listed in rust-lang#124118 currently don't have branch coverage support.

``@rustbot`` label +A-code-coverage
@ZhuUx
Copy link

ZhuUx commented Apr 23, 2024

Noticed the match arms pay attention to occasion like comment at first line of make_expression in counters.rs, I think it may be better put a note here rather reviews of the current two prs.

Many redundant expressions like this may be introduced by match arms.
The root cause is rustc may generate several test blocks for certain pattern. For example, code (A | B, C | D) will generate CFG like code

if match A {
    if match C {
        // Assume here is BcbCounter 1
    }  else if match D {
        // ...
    }
} else if match B {
    if match C {
        // Assume here is BcbCounter 2
    }  else if match D {
        // ...
    }
}

The true_term of C apparently should be BcbCounter 1 + BcbCounter 2. Pattern (A | B, C | D, E | F) even make terms of E more sophisticated. Actually they are just composition of several BcbCounter::Counter thus there probably many of intermediate expressions can be optimized out.

However, given that Expressions does not have runtime overhead the main side effect of these redundant expressions might only make people who read mir directly distressed. Further if we wanted to minimize the number of expressions, probably do it in llvm is better.

@ZhuUx
Copy link

ZhuUx commented Apr 23, 2024

The true_term of C apparently should be BcbCounter 1 + BcbCounter 2. Pattern (A | B, C | D, E | F) even make terms of E more sophisticated. Actually they are just composition of several BcbCounter::Counter thus there probably many of intermediate expressions can be optimized out.

This might not be "apparent". We also can view C as several branches sharing one span. By this view most work could be simplified but users would be confused with many branches in llvm-cov reports.

@saethlin saethlin added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Apr 28, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue May 7, 2024
…er-errors

coverage: Branch coverage support for let-else and if-let

This PR adds branch coverage instrumentation for let-else and if-let, including let-chains.

This lifts two of the limitations listed at rust-lang#124118.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue May 7, 2024
…er-errors

coverage: Branch coverage support for let-else and if-let

This PR adds branch coverage instrumentation for let-else and if-let, including let-chains.

This lifts two of the limitations listed at rust-lang#124118.
bors added a commit to rust-lang-ci/rust that referenced this issue May 7, 2024
…-errors

coverage: Branch coverage support for let-else and if-let

This PR adds branch coverage instrumentation for let-else and if-let, including let-chains.

This lifts two of the limitations listed at rust-lang#124118.
@Zalathar
Copy link
Contributor Author

I've rewritten the issue description to give a more up-to-date summary of the status quo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants