Slow verification with: well-formedness checks within a big disjunction #791

bobismijnnaam · 2023-12-21T16:41:06Z

Dear Viper team,

I have another question about slowness that I'd love for you to look at. Please leave this one for after the holidays if you're already busy! Also, I could not get is as clean as the previous issue, but I hope maybe you might still be able to shed some light on it for me.

In my problematic VerCors input, I have the following monstrous postcondition:

  ensures \old(goOn) == true && ((board[move.x-2][move.y] == move.t && board[move.x-1][move.y] == move.t) ||
                                 (board[move.x-1][move.y] == move.t && board[move.x+1][move.y] == move.t) ||
                                 (board[move.x+1][move.y] == move.t && board[move.x+2][move.y] == move.t) ||
                                 (board[move.x][move.y-2] == move.t && board[move.x][move.y-1] == move.t) ||
                                 (board[move.x][move.y-1] == move.t && board[move.x][move.y+1] == move.t) ||
                                 (board[move.x][move.y+1] == move.t && board[move.x][move.y+2] == move.t) ||
                                 (board[move.x-2][move.y-2] == move.t && board[move.x-1][move.y-1] == move.t) ||
                                 (board[move.x-1][move.y-1] == move.t && board[move.x+1][move.y+1] == move.t) ||
                                 (board[move.x+1][move.y+1] == move.t && board[move.x+2][move.y+2] == move.t) ||
                                 (board[move.x+2][move.y-2] == move.t && board[move.x+1][move.y-1] == move.t) ||
                                 (board[move.x+1][move.y-1] == move.t && board[move.x-1][move.y+1] == move.t) ||
                                 (board[move.x-1][move.y+1] == move.t && board[move.x-2][move.y+2] == move.t)) == !goOn;
  void play();

(It is very straightforwardly translated into Viper, but in the PVL form it is more readable)

At the place where play is called, in turn1, a lot of time is spent inhaling the postcondition of play. Most of the time is spent in the big disjunction, as can be seen in the following flame graph (profile-full-proper-triggers-empty-turn1.pprof.gz):

(You can view this using pprof -http=:1234 profile....gz, and then flame graph, with sample = wall.)

As far as we understand, the structure that the flame graph kind of shows is:

Prove well-formedness for accessing some board[x][y],
Assume the negation, because of short-circuiting behaviour of or,
Prove well-formedness of the remaining part of the disjunction. This builds the pyramid that tapers off to the left.

This is also the behaviour that we have seen in the code in Evaluator.scala, in def evalSeqShortCircuit. Everything to the right of the big pyramid is the precondition being exhaled, and some small other parts of the postcondition being inhaled.

This whole process takes 40s for the entire postcondition of the second call to play, of which 38s is spent in the big disjunction. The deepest assert (which is really a decider/prover call) visible in the flame graph takes 3.64s. That on its own is not so long, but the fact that there are many is what makes the situation problematic. The first call to play (two lines above the second call) is also slow to verify, but still a lot faster than the second play: 5.35s. I understand why the well-formedness checks are there, and why there are so many. I'm primarily wondering what makes the well-formed assertions so slow. There don't seem to be complicated data structures/permissions, quantifiers, or arithmetic constraints present. So I'd expect them to be (spoilt as I am as a user far removed from the reality of z3) on the millisecond scale.

For completeness, this is the smt content of the last prover call:

Assert (Z) < (PermLookup(int, pm@555@10, aloc((_, _), optGet1(_, Lookup(arrInt, sm@552@10, aloc((_, _), optGet1(_, First:(Second:(Second:(Second:(Second:($t@421@10)))))), First:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:($t@421@10))))))))))))))) - 2))), First:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:(Second:($t@421@10)))))))))))))))) + 2)))

This is the almost minimal input file: ttt-full-proper-triggers-empty-turn1.vpr.txt. The big disjunction is on line 360, and play is called twice at the end.

IMHO, the file should be fairly safe trigger wise. There are a few quantifiers in there that say something like xs[i] != None, but I don't think those should cause problems as the triggers there are pretty clear. So I'm not sure if quantifiers are the cause of making the well-formedness checks slow. Another possible culprit could be that bounds checking is somehow expensive. However, one of the assumptions in scope is that 2 <= move.x < width - 2, and same for y. width and height are directly related to the length of the board array and the arrays contained in board.

Thanks again for any insight you can offer me!

The text was updated successfully, but these errors were encountered:

marcoeilers · 2023-12-22T13:40:28Z

There seem to be multiple issues:

Silicon seems to spend the vast majority of its time proving that it has permission to the arr_int and int fields read in that big disjunction, and it does that separately for every field read, even if you read the same field twice.
I tried rewriting the disjunction using let-expressions to pull out the arr_int reads s.t. the same field read is only evaluated once, that got the time down from ca. 55 to 30 seconds. I guess one could do the same again for the int field reads and save some more time.
There seems to be an issue where snapshot map caching isn't working properly. That's an easy fix and gets us from 30 to 20 seconds. I'll make a PR for that at some point.

Not sure if that's everything; the terms that are involved are all rather big and Silicon seems to be performing a lot of joins, but I can't tell if those are problematic and/or if there's anything we can do about it.

marcoeilers · 2023-12-22T14:05:51Z

The snapshot map cache fix by itself seems to cut verification time roughly in half.

bobismijnnaam · 2023-12-22T15:23:19Z

Thanks for looking at it so quickly! I have factored in your feedback on the duplicated fields. With the map cache fix verification indeed seems faster. My original big input is unstable, so it's hard to be exact, but from a few test runs it feels faster. With these improvements I can benchmark again, and see if I can find some bottlenecks that are specific enough for you to look at.

marcoeilers mentioned this issue Dec 22, 2023

Properly use snapshot map cache for QP field reads #792

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow verification with: well-formedness checks within a big disjunction #791

Slow verification with: well-formedness checks within a big disjunction #791

bobismijnnaam commented Dec 21, 2023

marcoeilers commented Dec 22, 2023

marcoeilers commented Dec 22, 2023

bobismijnnaam commented Dec 22, 2023

Slow verification with: well-formedness checks within a big disjunction #791

Slow verification with: well-formedness checks within a big disjunction #791

Comments

bobismijnnaam commented Dec 21, 2023

marcoeilers commented Dec 22, 2023

marcoeilers commented Dec 22, 2023

bobismijnnaam commented Dec 22, 2023