Generate Multiple playback harnesses when multiple crashes exist in a single harness. #2496

YoshikiTakashima · 2023-05-31T19:36:26Z

Description of changes:

Allow multiple harnesses to be generated.

Resolved issues:

Resolves #2461

Call-outs:

Note: We should also update the warning, since concrete playback now supports different property type.

Not sure what is requested here.

Testing:

How is this change tested? tests/ui/concrete-playback/single-harness-multi-trace
Is this a refactor change? No

Checklist

Each commit message has a non-empty body, explaining why the change was made
Methods or procedures are documented
Regression or unit tests are included, or existing tests cover the modified code
My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

jaisnan

A few questions regarding the change:

1. Do all of the tests have unique identifiers as suffixes? I'm assuming yes, but can you make that more clear in the regression tests that you've added?
2. Would it make sense to hide this behavior behind a flag?

I'm asking because the vscode extension runs the playback command and in cases where there's more than one unit test being generated, I'm wondering how the behavior might change and if there's something we need to change within the extension as well, to accommodate these cases.

YoshikiTakashima · 2023-05-31T20:34:55Z

@jaisnan

1. I think they are unique, but I don't think they are deterministic. See tests/ui/concrete-playback/mult-harnesses/expected because those are also unique, but the concrete playback expected is also identical.
2. I don't immediately have an answer to this one. I would prefer to show all of them, but I understand it may not be ideal if 9000 test cases come out and hammer the customer dev laptop without warning.
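To make the suffix question concrete, here is a minimal sketch that assumes the test name is the harness name plus a hash of the counterexample's concrete values. The helper `playback_test_name` and the exact name format are illustrative assumptions, not code from this PR. Under that assumption, identical inputs yield the same name and different counterexamples yield different suffixes. The algorithm behind `DefaultHasher` is left unspecified by the standard library and may change between Rust releases, so a suffix built this way should not be treated as stable across toolchains.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper (not taken from the PR): derive a test name from the
/// harness name and the concrete byte values of one counterexample.
fn playback_test_name(harness: &str, concrete_vals: &[Vec<u8>]) -> String {
    // Hash only the concrete values, so two traces with identical inputs
    // map to the same suffix.
    let mut hasher = DefaultHasher::new();
    concrete_vals.hash(&mut hasher);
    format!("kani_concrete_playback_{}_{}", harness, hasher.finish())
}
```

Calling the helper twice with the same values returns the same name, which is also what would make duplicate injections detectable by name.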

YoshikiTakashima · 2023-05-31T20:35:56Z

I'll mark this as draft since it is not immediately clear what ought to be done, so fixing the test failures at this point does not make sense.

celinval

Thanks for doing this. I have 2 requests:

1. Can you please add a test for inplace where Kani will add more than one test for more than one harness at the same time? I'm a bit worried about updating the file multiple times in a row since we use original line numbers to calculate the test position.
2. The current implementation is generating a modified version of the file N times. Can we add all the tests in one pass instead?
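The line-number concern above can be illustrated with a small sketch of a single-pass injection. It assumes each generated test is paired with the 1-based line of the original file after which it should appear; the function `inject_tests` and that pairing are hypothetical, not the PR's implementation. Because every position refers to the original file and the file is walked only once, earlier insertions cannot shift the targets of later ones.

```rust
/// Hypothetical sketch: insert each test after its 1-based line number in
/// `source`, walking the original lines exactly once.
fn inject_tests(source: &str, mut injections: Vec<(usize, String)>) -> String {
    // Stable sort by original line, so tests aimed at the same line keep
    // the order in which they were supplied.
    injections.sort_by_key(|(line, _)| *line);
    let mut pending = injections.into_iter().peekable();
    let mut out = String::new();
    for (idx, line) in source.lines().enumerate() {
        out.push_str(line);
        out.push('\n');
        let line_no = idx + 1;
        // Emit every test whose target is the line just written.
        while let Some((_, text)) = pending.next_if(|(at, _)| *at == line_no) {
            out.push_str(&text);
            out.push('\n');
        }
    }
    // Anything aimed past the end of the file (or at line 0) is appended.
    for (_, text) in pending {
        out.push_str(&text);
        out.push('\n');
    }
    out
}
```

The alternative of rewriting the file once per test would need every remaining line number to be re-derived after each write, which is the hazard raised in the first request.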

kani-driver/src/concrete_playback/test_generator.rs

tests/ui/concrete-playback/single-harness-multi-trace/expected

YoshikiTakashima · 2023-06-02T01:36:57Z

@celinval I think I resolved your comments with the exception of the inplace testing, which I think is a matter of placing the right files in the right directories. I will flip the PR over to "ready for review".

jaisnan · 2023-06-02T21:04:11Z

I'm a bit worried about a potential proof having a lot of crashes as @YoshikiTakashima has mentioned, and the user clicking the concrete playback button and discovering that they have a lot of test cases pasted into their source. It might be annoying to deal with. Does it make sense to be worried about this corner case? Do we have any plans to deal with that when that happens?

YoshikiTakashima · 2023-06-04T19:06:51Z

@jaisnan I think it's plausible if the harness is large. One possibility is that we inject first N harnesses where N is a user-set parameter in the extension. Not sure if there is a good way to prioritize them.

As for dealing with it when it happens, I think most IDEs have a history beyond last save, so ctrl-z should still work. Not too familiar except for Emacs and VS Code though.

celinval · 2023-06-05T20:06:04Z

> @jaisnan I think it's plausible if the harness is large. One possibility is that we inject first N harnesses where N is a user-set parameter in the extension. Not sure if there is a good way to prioritize them.
>
> As for dealing with it when it happens, I think most IDEs have a history beyond last save, so ctrl-z should still work. Not too familiar except for Emacs and VS Code though.

We could add a configurable cap, and even set its default to 1 to keep the same behavior as today.

Regarding priorities, we could have different modes, like user assertions, cover statements and UB checks (the last doesn't work yet with concrete playback). I was wondering if by default we should only generate tests for assertion failures.

Another possibility is to allow users to specify which property they want to generate the testcase for. We don't have any stable way of doing so though.
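As a rough illustration of the cap and mode ideas, the sketch below filters failed properties by class and keeps at most `cap` of them. Every name here (`PropertyClass`, `FailedProperty`, `select_for_playback`) is hypothetical and does not correspond to an existing Kani type or option; it only shows how a default cap of 1 combined with an assertion-only mode would reproduce today's single-test behavior.

```rust
/// Hypothetical property classes, mirroring the modes discussed above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PropertyClass {
    Assertion,
    Cover,
    UndefinedBehavior,
}

/// Hypothetical record of one failed property in a harness.
struct FailedProperty {
    class: PropertyClass,
    description: String,
}

/// Keep only failures whose class is enabled, in their original order,
/// and stop once `cap` tests have been selected.
fn select_for_playback<'a>(
    failures: &'a [FailedProperty],
    modes: &[PropertyClass],
    cap: usize,
) -> Vec<&'a FailedProperty> {
    failures
        .iter()
        .filter(|f| modes.contains(&f.class))
        .take(cap)
        .collect()
}
```

Passing `&[PropertyClass::Assertion]` with a cap of 1 selects only the first assertion failure; raising the cap or adding `PropertyClass::Cover` widens the selection without changing the rest of the pipeline.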

@adpaco-aws, any thoughts?

celinval

Awesome! Thanks @YoshikiTakashima

kani-driver/src/concrete_playback/test_generator.rs

tests/script-based-pre/playback_already_existing/playback_opts.sh

tests/script-based-pre/playback_multi_harness_multi_inject/playback_opts.sh

blocking changes have been addressed

adpaco-aws · 2023-06-05T22:06:02Z

> We could add a configurable cap, and even set its default to 1 to keep the same behavior as today.

This is a good starting point: we can remove the technical limitation first and decide on what properties to target later.

> Regarding priorities, we could have different modes, like user assertions, cover statements and UB checks (the last doesn't work yet with concrete playback). I was wondering if by default we should only generate tests for assertion failures.

Still, tests generated for cover statements could be useful for proof debugging. I don't think test cases for UB checks would be that useful though.

celinval · 2023-06-06T04:25:54Z

> > We could add a configurable cap, and even set its default to 1 to keep the same behavior as today.
>
> This is a good starting point: we can remove the technical limitation first and decide on what properties to target later.
>
> > Regarding priorities, we could have different modes, like user assertions, cover statements and UB checks (the last doesn't work yet with concrete playback). I was wondering if by default we should only generate tests for assertion failures.
>
> Still, tests generated for cover statements could be useful for proof debugging. I don't think test cases for UB checks would be that useful though.

I think UB tests would be useful. For example, memory checks could be debugged using valgrind, and other UB checks could be debugged using MIRI.

adpaco-aws · 2023-06-06T15:25:53Z

> I think UB tests would be useful. For example, memory checks could be debugged using valgrind, and other UB checks could be debugged using MIRI.

Right, I hadn't thought of using them with other tools.

celinval

Just some minor comments. Thanks!

kani-driver/src/concrete_playback/test_generator.rs

celinval

Just please prune the info. Thanks!

Multi-harness.

438a911

YoshikiTakashima requested a review from a team as a code owner May 31, 2023 19:36

YoshikiTakashima changed the title ~~Generate Multiple harnesses when multiple crashes exist in a single harness.~~ Generate Multiple playback harnesses when multiple crashes exist in a single harness. May 31, 2023

Adjust test case to new type signature.

bcee075

jaisnan reviewed May 31, 2023

YoshikiTakashima marked this pull request as draft May 31, 2023 20:36

celinval previously requested changes May 31, 2023

kani-driver/src/concrete_playback/test_generator.rs (outdated, resolved)

kani-driver/src/concrete_playback/test_generator.rs (outdated, resolved)

tests/ui/concrete-playback/single-harness-multi-trace/expected (outdated, resolved)

YoshikiTakashima added 2 commits May 31, 2023 17:54

Comments and messages

e3e29bf

Stabilize value.

0cae000

YoshikiTakashima force-pushed the yoshi-2461-multi-test branch from fb41650 to 0cae000 Compare May 31, 2023 21:58

YoshikiTakashima added 3 commits June 1, 2023 21:24

Add second case.

8814032

Moved loop down into harness injection part, reduce fopen.

fd9fd4b

Clippy.

3eb13db

YoshikiTakashima marked this pull request as ready for review June 2, 2023 01:36

YoshikiTakashima added 6 commits June 2, 2023 11:50

This test produces 2 injected tests now.

0ff6f4e

Fixed test comment.

8a8d7d7

test: Multi-harness, each with multi-inject.

ecd093b

Merge branch 'main' into yoshi-2461-multi-test

37a299a

Fixed existing harness issue.

f447048

injecting into already existing tests.

0de324c

YoshikiTakashima added 2 commits June 2, 2023 17:13

Delete injected test that has different suffix on Mac/Ubuntu.

75218ca

Merge branch 'main' into yoshi-2461-multi-test

ff22b95

Merge branch 'main' into yoshi-2461-multi-test

4e20656

celinval reviewed Jun 5, 2023

YoshikiTakashima added 4 commits June 6, 2023 18:46

Fix comments.

930efaf

Prune test script: just check for double insert.

5979461

typo: , -> .

db007ea

Update expected test to new message.

38cd57c

celinval reviewed Jun 7, 2023

kani-driver/src/concrete_playback/test_generator.rs (outdated, resolved)

kani-driver/src/concrete_playback/test_generator.rs (outdated, resolved)

kani-driver/src/concrete_playback/test_generator.rs (outdated, resolved)

YoshikiTakashima added 3 commits June 7, 2023 10:05

Name and comment fix.

b4b3336

Newline for each test.

919ef57

Merge branch 'main' into yoshi-2461-multi-test

a7020b1

celinval reviewed Jun 8, 2023

kani-driver/src/concrete_playback/test_generator.rs (outdated, resolved)

celinval approved these changes Jun 8, 2023

Fix message listing type.

0cff780

YoshikiTakashima enabled auto-merge (squash) June 8, 2023 00:43

YoshikiTakashima merged commit 95e7161 into model-checking:main Jun 8, 2023

YoshikiTakashima mentioned this pull request Jun 8, 2023

2 Zero-size types result in a harness with the same name #2509

Closed

YoshikiTakashima deleted the yoshi-2461-multi-test branch June 8, 2023 15:06

YoshikiTakashima mentioned this pull request Jun 9, 2023

De-duplicate same input injections for the same harness. #2513

Merged

4 tasks
