Drop Support for Expected Results and Scores without Task-Definition Files? #439
Comments
table-generator reads task-definition files, but this is not tested yet. Furthermore, after #439 this test will be the only one where table-generator generates the full set of statistics (per category).
Question: For replicating old experiments, I could just use an old release of BenchExec without trouble, correct?
Several possibilities:
I am in favor of dropping support for expected results in file names.
This was already decided a few months ago :-)
But not documented!
... and the issue was not closed.
Because the implementation is not finished yet. We will close it when this is done, by labeling the commit as usual.
ok
…nerator Part of #439. For tasks that are not defined with yaml files, table-generator so far parses the file name to detect the expected verdict. This commit removes this. The result is that for tables with such tasks, there are no numbers of true/false tasks, and no statistics for correct/wrong results.
Part of #439. This means that all expected verdicts that are encoded in file names (e.g., "_true-unreach-call") are now ignored.
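Currently, BenchExec has two modes for checking expected results and computing (SV-COMP) scores:
1. Expected verdicts are declared in task-definition files (YAML).
2. Expected verdicts are encoded in the file names of the tasks (e.g., "_true-unreach-call").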
The second mode is historical and no longer recommended. The code for it in BenchExec complicates the result handling and it would simplify things if we could remove it.
Replicating old experiments (e.g., old SV-COMP instances) would get a little more difficult, but would still be possible (one could simply generate task-definition files for tasks in the old format). Furthermore, SV-COMP switched to task-definition files in 2019, and if we remove the legacy mode in 2020, two instances of SV-COMP with the new format will already have taken place.
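As a rough illustration of how such a migration could look, here is a minimal sketch that derives a task-definition file from a legacy file name. It is not part of BenchExec: the function name `task_definition_for`, the `property_dir` default, and the one-to-one mapping from the name fragment to a `.prp` file are assumptions made for this example. The YAML keys follow the task-definition format (version 1.0) as I understand it, and legacy names that encode sub-properties would need extra handling.

```python
import re
from pathlib import Path

# Matches legacy verdict markers such as "_true-unreach-call" in a file name.
LEGACY_VERDICT = re.compile(r"_(true|false)-([a-z]+(?:-[a-z]+)*)")


def task_definition_for(file_name, property_dir="../properties"):
    """Return YAML text for a task definition, or None if the name
    encodes no expected verdict.

    Hypothetical helper: the property-file location is an assumption
    and has to be adapted to the actual benchmark layout.
    """
    base_name = Path(file_name).name
    matches = LEGACY_VERDICT.findall(base_name)
    if not matches:
        return None
    lines = [
        "format_version: '1.0'",
        f"input_files: '{base_name}'",
        "properties:",
    ]
    # A single file name may carry several verdicts, one per property.
    for verdict, prop in matches:
        lines.append(f"  - property_file: {property_dir}/{prop}.prp")
        lines.append(f"    expected_verdict: {verdict}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(task_definition_for("example_true-unreach-call.c"))
```

The result would be written next to the input file with a `.yml` extension, and the benchmark definition would then reference the `.yml` files instead of the source files.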
To clarify: Tasks could still be defined without task-definition files, BenchExec would just no longer check the correctness of the result nor compute scores.
Please comment here if you still have a use case for the historical mode and it would create problems for you to migrate.