Line coverage reports #2609

adpaco-aws · 2023-07-19T15:26:38Z

Description of changes:

Adds a coverage report feature to Kani that does not require cbmc-viewer.

In particular, it introduces the unstable --coverage verification option to kani-driver (only usable with the -Z line-coverage unstable feature), which in turn enables the --coverage-checks option in kani-compiler. This results in the injection of coverage checks (similar to cover statements, but with a fixed condition) before each Rust MIR statement and terminator in a basic block. The post-processing of coverage checks allows us to report coverage information at the line level.

This contribution also includes a new compiletest suite developed by @jaisnan for end-to-end testing of the emitted coverage information for different test cases.

The command is:

kani file.rs --coverage -Z line-coverage

For instance, for the assume_false.rs example

#[kani::proof]
fn wrong_coverage_2() {
    let a: u8 = kani::any();
    kani::assume(false);
    assert!(a < 5);
}

we obtain the verification results:

RESULTS:
Check 1: wrong_coverage_2.assertion.1
         - Status: UNREACHABLE
         - Description: "assertion failed: a < 5"
         - Location: assume_false.rs:5:5 in function wrong_coverage_2

and the following coverage results:

Coverage Results:
/home/ubuntu/kani/library/kani/src/lib.rs, 133, FULL
/home/ubuntu/kani/library/kani/src/lib.rs, 134, FULL
/home/ubuntu/kani/library/kani/src/lib.rs, 176, FULL
/home/ubuntu/kani/library/kani/src/lib.rs, 177, FULL

/home/ubuntu/kani/assume_false.rs, 3, FULL
/home/ubuntu/kani/assume_false.rs, 4, FULL
/home/ubuntu/kani/assume_false.rs, 5, NONE
/home/ubuntu/kani/assume_false.rs, 6, NONE

/home/ubuntu/kani/library/kani/src/arbitrary.rs, 55, FULL

Resolved issues:

Resolves #2585
Resolves #2608
Resolves #2610

Related RFC:

#2612

Call-outs:

Testing:

How is this change tested? Existing tests + new coverage suite.
Is this a refactor change? No.

Checklist

Each commit message has a non-empty body, explaining why the change was made
Methods or procedures are documented
Regression or unit tests are included, or existing tests cover the modified code
My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

…verage-tests-2

…co-aws/rmc into coverage-tests-2

…to coverage-tests-2

celinval

I think we need to polish the coverage output to make it more intuitive, but this is a good start.

Please create issues to:

Improve the tests
Create a user friendly output
Export results to a well-supported output
Investigate if we can make some results more intuitive (maybe compare against Rust code coverage results).
All other open questions in the RFC

kani-driver/src/cbmc_output_parser.rs

kani-driver/src/cbmc_property_renderer.rs

tests/coverage/reachable/assert-false/expected

tests/coverage/reachable/assert/reachable_pass/test.rs

tests/coverage/reachable/bounds/reachable_fail/test.rs

tools/compiletest/src/common.rs

zhassan-aws

There are a few redundant tests remaining. Otherwise, looks good.

tests/coverage/unreachable/while-loop-break/main.rs

tests/coverage/unreachable/vectors/main.rs

tests/coverage/unreachable/variant/main.rs

tests/coverage/unreachable/turned-off/test.rs

tests/coverage/unreachable/rem-zero/test.rs

tests/coverage/unreachable/overflow/test.rs

tests/coverage/unreachable/overflow-neg/test.rs

tests/coverage/unreachable/div-zero/test.rs

tests/coverage/unreachable/debug-assert-ne/test.rs

adpaco-aws · 2023-07-28T20:41:32Z

Created the following issues:

which I'm hoping take care of the remaining work here and in the RFC (#2618)

There are a few redundant tests remaining

Indeed, there were many tests that were redundant or didn't have a clear purpose. I removed a bunch of them and have #2635 for another pass.

celinval · 2023-07-29T02:02:01Z

@adpaco-aws should we create a milestone for those?

adpaco-aws · 2023-07-31T17:14:10Z

Yes, just created a Coverage reports milestone to keep track of all of these. I create a new one instead of using the Source-based coverage report generation milestone because that one contains issues with the visualize feature, and it's not clear to me that we should mix both.

This PR replaces the line-based coverage instrumentation we introduced in #2609 with the standard source-based code coverage instrumentation performed by the Rust compiler. As a result, we now insert code coverage checks in the `StatementKind::Coverage(..)` statements produced by the Rust compiler during compilation. These checks include coverage-relevant information[^note-internal] such as the coverage counter/expression they represent [^note-instrument]. Both the coverage metadata (`kanimap`) and coverage results (`kaniraw`) are saved into files after the verification stage. Unfortunately, we currently have a chicken-egg problem with this PR and #3121, where we introduce a tool named `kani-cov` to postprocess coverage results. As explained in #3143, `kani-cov` is expected to be an alias for the `cov` subcommand and provide most of the postprocessing features for coverage-related purposes. But, the tool will likely be introduced after this change. Therefore, we propose to temporarily print a list of the regions in each function with their associated coverage status (i.e., `COVERED` or `UNCOVERED`). ### Source-based code coverage: An example The main advantage of source-based coverage results is their precision with respect to the source code. The [Source-based Code Coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html) documentation explains more details about the LLVM coverage workflow and its different options. For example, let's take this Rust code: ```rust 1 fn _other_function() { 2 println!("Hello, world!"); 3 } 4 5 fn test_cov(val: u32) -> bool { 6 if val < 3 || val == 42 { 7 true 8 } else { 9 false 10 } 11 } 12 13 #[cfg_attr(kani, kani::proof)] 14 fn main() { 15 let test1 = test_cov(1); 16 let test2 = test_cov(2); 17 assert!(test1); 18 assert!(test2); 19 } ``` Compiling and running the program with `rustc` and the `-C instrument-coverage` flag, and using the LLVM tools can get us the following coverage result: ![Image](https://github.com/model-checking/kani/assets/73246657/9070e390-6e0b-4add-828d-d9f9caacad07) In contrast, the `cargo kani --coverage -Zsource-coverage` command currently generates: ``` src/main.rs (main) * 14:1 - 19:2 COVERED src/main.rs (test_cov) * 5:1 - 6:15 COVERED * 6:19 - 6:28 UNCOVERED * 7:9 - 7:13 COVERED * 9:9 - 9:14 UNCOVERED * 11:1 - 11:2 COVERED ``` which is a verification-based coverage result almost equivalent to the runtime coverage results. ### Benchmarking We have evaluated the performance impact of the instrumentation using the `kani-perf.sh` suite (14 benchmarks). For each test, we compare the average time to run standard verification against the average time to run verification with the source-based code coverage feature enabled[^note-line-evaluation]. The evaluation has been performed on an EC2 `m5a.4xlarge` instance running Ubuntu 22.04. The experimental data has been obtained by running the `kani-perf.sh` script 10 times for each version (`only verification` and `verification + coverage`), computing the average and standard deviation. We've split this data into `small` (tests taking 60s or less) and `large` (tests taking more than 60s) and drawn the two graphs below. #### Performance comparison - `small` benchmarks ![performance_comparison_small](https://github.com/user-attachments/assets/679cf412-0193-4b0c-a78c-2d0fb702706f) #### Performance comparison - `large` benchmarks ![performance_comparison_large](https://github.com/user-attachments/assets/4bb5a895-7f57-49e0-86b5-5fea67fad939) #### Comments on performance Looking at the small tests, the performance impact seems negligible in such cases. The difference is more noticeable in the large tests, where the time to run verification and coverage can take 2x or even more. It wouldn't be surprising that, as programs become larger, the complexity of the coverage checking grows exponentially as well. However, since most verification jobs don't take longer than 30min (1800s), it's OK to say that coverage checking represents a 100-200% slowdown in the worst case w.r.t. standard verification. It's also worth noting a few other things: * The standard deviation remains similar in most cases, meaning that the coverage feature doesn't have an impact on their stability. * We haven't tried any SAT solvers other than the ones used by default for each benchmark. It's possible that other solvers perform better/worse with the coverage feature enabled. ### Call-outs * The soundness issue documented in #3441. * The issue with saving coverage mappings for non-reachable functions documented in #3445. * I've modified the test cases in `tests/coverage/` to test this feature. Since this technique is simpler, we don't need that many test cases. However, it's possible I've left some test cases which don't contribute much. Please let me know if you want to add/remove a test case. [^note-internal]: The coverage mappings can't be accessed through the StableMIR interface so we retrieve them through the internal API. [^note-instrument]: The instrumentation replaces certain counters with expressions based on other counters when possible to avoid a part of the runtime overhead. More details can be found [here](https://github.com/rust-lang/rustc-dev-guide/blob/master/src/llvm-coverage-instrumentation.md#mir-pass-instrumentcoverage). Unfortunately, we can't avoid instrumenting expressions at the moment. [^note-line-evaluation]: We have not compared performance against the line-based code coverage feature because it doesn't seem worth it. The line-based coverage feature is guaranteed to include more coverage checks than the source-based one for any function. In addition, source-based results are more precise than line-based ones. So this change represents both a quantitative and qualitative improvement. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

jaisnan and others added 30 commits July 13, 2023 18:22

Add coverage and coverage test script

22751fd

Add coverage to basic blocks (both stmts and terms)

b3a679e

Add descriptions to printed lines

487e4bb

Add cover statements to the 1st statement of each basic block (NO TERMS)

4f42a32

Add cover stmts to the front of each basic block (including terminators)

789cc00

Add cover before every statement/terminator

c9ef05c

Add new coverage prop. class and method to codegen it

77da3b4

Adjustments to post-processing for coverage checks

3466b7f

Merge branch 'main' of https://github.com/model-checking/kani into co…

37d9e5f

…verage-tests-2

Flatten coverage directory

9543241

Add coverage test suite

c1d5569

Remove python script

8206ff0

Add coverage-related flags to driver and compiler

0638f8b

Format

9b0b866

Codegen assert as assume if not for coverage

ba72fc5

Add python script for coverage postprocessing

d94b53e

Merge branch 'proto-cover-every-stmt-term' of https://github.com/adpa…

b2bc2ca

…co-aws/rmc into coverage-tests-2

Merge branch 'postcov-script' of https://github.com/adpaco-aws/rmc in…

8602593

…to coverage-tests-2

Remove explicit kani::cover!() calls

b0a4d25

Add fixed expected files

cf31e98

Add changes to compiletest

54cf004

Add copyright string

265158c

Remove redundant tests and expected lines

7945cc1

add copyright string to script

252c776

Format python script

d4842fb

fix expected

c344f80

change test collection

8dad837

Use existing output checking function

4f346ed

add coverage output format

3d009b2

remove os specific strings from expected

36cdaf6

jaisnan and others added 10 commits July 28, 2023 09:13

Address test related comments

d91b354

Change description of code coverage property class

235b872

Update status to COVERED, UNCOVERED

aece576

add comment

f99284e

Update comments for CheckStatus

d974cf3

Update format_result_coverage

16a5170

Update comment on update_results...coverage_checks

3d3d391

Add comments to some tests

e200b52

Run formatting

cd47c9f

More comments in two tests

a4a7f22

celinval approved these changes Jul 28, 2023

View reviewed changes

adpaco-aws added 4 commits July 28, 2023 18:30

Comments in one test

dd54c09

Make UNCOVERED red

ee3290f

Rename CoverStatus to CoverageStatus

4235ba6

Rewrite condition for PARTIAL status

eea1d25

zhassan-aws reviewed Jul 28, 2023

View reviewed changes

adpaco-aws added 5 commits July 28, 2023 18:43

Rewrite status updates for coverage into match

88a7e2e

Change test to use kani::any_where

c518bf2

Change comment

9a279b5

Remove a bunch of tests

1f552de

Add descriptions to three tests and update results

c3afce2

adpaco-aws mentioned this pull request Jul 28, 2023

Avoid confusing results in coverage reports #2639

Open

adpaco-aws merged commit db42ee9 into model-checking:main Jul 28, 2023
12 checks passed

adpaco-aws assigned adpaco-aws and jaisnan Jul 31, 2023

adpaco-aws mentioned this pull request Apr 2, 2024

Adopt Rust's source-based code coverage instrumentation #3119

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Line coverage reports #2609

Line coverage reports #2609

adpaco-aws commented Jul 19, 2023 •

edited by jaisnan

Loading

celinval left a comment

zhassan-aws left a comment

adpaco-aws commented Jul 28, 2023

celinval commented Jul 29, 2023

adpaco-aws commented Jul 31, 2023

Line coverage reports #2609

Line coverage reports #2609

Conversation

adpaco-aws commented Jul 19, 2023 • edited by jaisnan Loading

Description of changes:

Resolved issues:

Related RFC:

Call-outs:

Testing:

Checklist

celinval left a comment

Choose a reason for hiding this comment

zhassan-aws left a comment

Choose a reason for hiding this comment

adpaco-aws commented Jul 28, 2023

celinval commented Jul 29, 2023

adpaco-aws commented Jul 31, 2023

adpaco-aws commented Jul 19, 2023 •

edited by jaisnan

Loading