
[analyzer] Suppress out of bounds reports after weak loop assumptions #109804

Closed
wants to merge 7 commits

Conversation

NagyDonat
Contributor

The checker alpha.security.ArrayBoundV2 produced lots of false positives in situations where the engine's loop modeling fed it unfounded assumptions.

This commit introduces a heuristic that discards ArrayBoundV2 reports when the execution path introduces an assumption that is questionable. More precisely, two kinds of assumptions are categorized as "weak":

  1. When the analyzer assumes that the first evaluation of the loop condition returns false and the loop body is completely skipped.
  2. When the analyzer assumes that the loop condition is true in a situation where it already executed (at least) two iterations.

For examples and more explanation, see the new tests.
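For instance (condensed from the new test file clang/test/Analysis/out-of-bounds.c added by this patch):

int GlobalArray[100];

int loop_suppress_after_zero_iterations(unsigned len) {
  for (unsigned i = 0; i < len; i++)
    if (GlobalArray[i] > 0)
      return GlobalArray[i];
  // Reached only through the weak assumption that the loop was skipped
  // (len == 0), so the report is discarded.
  return GlobalArray[len - 1]; // no-warning
}

void loop_suppress_in_third_iteration(int len) {
  int buf[2] = {0};
  for (int i = 0; i < len; i++)
    buf[i] = 1; // no-warning: reaching the overflow would require a third
                // iteration, which is a weak assumption
}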

The actual implementation uses some approximations (it uses the BlockCount instead of the iteration count) because that seems to be "good enough" for this heuristic suppression.

Note that I used minor state updates instead of bug reporter visitors because the number of finished iterations is not visible in the visitor which "walks backwards in time".

As a very minor unrelated change, this commit removes the "Bin" part from the method name "evalEagerlyAssumeBinOpBifurcation" because this method is also used for the unary logical not operator.

@llvmbot llvmbot added the clang and clang:static analyzer labels Sep 24, 2024
@llvmbot
Member

llvmbot commented Sep 24, 2024

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Donát Nagy (NagyDonat)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/109804.diff

6 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h (+31-11)
  • (modified) clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp (+5)
  • (modified) clang/lib/StaticAnalyzer/Core/CoreEngine.cpp (+24-1)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+73-15)
  • (modified) clang/test/Analysis/loop-unrolling.cpp (+3-3)
  • (modified) clang/test/Analysis/out-of-bounds.c (+101)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
index 04eacd1df7ffe2..0361ce5515a868 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
@@ -121,6 +121,25 @@ struct EvalCallOptions {
   EvalCallOptions() {}
 };
 
+/// Simple control flow statements like `if` only produce a single state split,
+/// so the fact that they are included in the source code implies that both
+/// branches are possible (at least under some conditions) and the analyzer can
+/// freely assume either of them. (This is not entirely true, because there may
+/// be unmarked logical correlations between `if` statements, but is a good
+/// enough heuristic and the analyzer strongly relies on it.)
+/// On the other hand, in a loop the state may be split repeatedly at each
+/// evaluation of the loop condition, and this can lead to following "weak"
+/// assumptions even though the code does not imply that they're valid and the
+/// programmer intended to cover them.
+/// This function is called to mark the `State` when the engine makes an
+/// assumption which is weak. Checkers may use this heuristic mark to discard
+/// results and reduce the number of false positives.
+ProgramStateRef recordWeakLoopAssumption(ProgramStateRef State);
+
+/// Returns true if `recordWeakLoopAssumption()` was called on the execution
+/// path which produced `State`.
+bool seenWeakLoopAssumption(ProgramStateRef State);
+
 class ExprEngine {
   void anchor();
 
@@ -323,12 +342,13 @@ class ExprEngine {
 
   /// ProcessBranch - Called by CoreEngine.  Used to generate successor
   ///  nodes by processing the 'effects' of a branch condition.
-  void processBranch(const Stmt *Condition,
-                     NodeBuilderContext& BuilderCtx,
-                     ExplodedNode *Pred,
-                     ExplodedNodeSet &Dst,
-                     const CFGBlock *DstT,
-                     const CFGBlock *DstF);
+  /// If the branch condition is a loop condition, IterationsFinishedInLoop is
+  /// the number of already finished iterations (0, 1, 2...); otherwise it's
+  /// std::nullopt.
+  void processBranch(const Stmt *Condition, NodeBuilderContext &BuilderCtx,
+                     ExplodedNode *Pred, ExplodedNodeSet &Dst,
+                     const CFGBlock *DstT, const CFGBlock *DstF,
+                     std::optional<unsigned> IterationsFinishedInLoop);
 
   /// Called by CoreEngine.
   /// Used to generate successor nodes for temporary destructors depending
@@ -583,11 +603,11 @@ class ExprEngine {
                                 ExplodedNode *Pred,
                                 ExplodedNodeSet &Dst);
 
-  /// evalEagerlyAssumeBinOpBifurcation - Given the nodes in 'Src', eagerly assume symbolic
-  ///  expressions of the form 'x != 0' and generate new nodes (stored in Dst)
-  ///  with those assumptions.
-  void evalEagerlyAssumeBinOpBifurcation(ExplodedNodeSet &Dst, ExplodedNodeSet &Src,
-                         const Expr *Ex);
+  /// evalEagerlyAssumeOpBifurcation - Given the nodes in 'Src', eagerly assume
+  /// symbolic expressions of the form 'x != 0' or '!x' and generate new nodes
+  /// (stored in Dst) with those assumptions.
+  void evalEagerlyAssumeOpBifurcation(ExplodedNodeSet &Dst,
+                                      ExplodedNodeSet &Src, const Expr *Ex);
 
   static std::pair<const ProgramPointTag *, const ProgramPointTag *>
     geteagerlyAssumeBinOpBifurcationTags();
diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index 3f837564cf47c4..da9ae1c749a3db 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -697,6 +697,11 @@ void ArrayBoundCheckerV2::reportOOB(CheckerContext &C,
                                     ProgramStateRef ErrorState, Messages Msgs,
                                     NonLoc Offset, std::optional<NonLoc> Extent,
                                     bool IsTaintBug /*=false*/) const {
+  // Suppress results found through execution paths where in some loop the
+  // analyzer arbitrarily assumed either that the loop is skipped (0 iterations)
+  // or that 3 or more iterations are executed.
+  if (seenWeakLoopAssumption(ErrorState))
+    return;
 
   ExplodedNode *ErrorNode = C.generateErrorNode(ErrorState);
   if (!ErrorNode)
diff --git a/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp b/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp
index 8605fa149e4f52..b5e6b3c1bcb471 100644
--- a/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp
@@ -441,10 +441,33 @@ void CoreEngine::HandleCallEnter(const CallEnter &CE, ExplodedNode *Pred) {
 void CoreEngine::HandleBranch(const Stmt *Cond, const Stmt *Term,
                                 const CFGBlock * B, ExplodedNode *Pred) {
   assert(B->succ_size() == 2);
+
+  const LocationContext *LC = Pred->getLocationContext();
+  BlockCounter Counter = WList->getBlockCounter();
+  unsigned BlockCount =
+      Counter.getNumVisited(LC->getStackFrame(), B->getBlockID());
+  std::optional<unsigned> IterationsFinishedInLoop = std::nullopt;
+  if (isa<ForStmt, WhileStmt, CXXForRangeStmt>(Term)) {
+    // FIXME: This code approximates the number of finished iterations based on
+    // the block count, i.e. the number of evaluations of the terminator block
+    // on the current execution path (which includes the current evaluation, so
+    // is always at least 1). This is probably acceptable for the
+    // checker-specific false positive suppression that currently uses this
+    // value, but it would be better to calculate an accurate count of
+    // iterations.
+    assert(BlockCount >= 1);
+    IterationsFinishedInLoop = BlockCount - 1;
+  } else if (isa<DoStmt>(Term)) {
+    // FIXME: The fixme note in the previous branch also applies here.
+    // In a do-while loop one iteration happens before the first evaluation of
+    // the loop condition, so we don't subtract one from the block count.
+    IterationsFinishedInLoop = BlockCount;
+  }
+
   NodeBuilderContext Ctx(*this, B, Pred);
   ExplodedNodeSet Dst;
   ExprEng.processBranch(Cond, Ctx, Pred, Dst, *(B->succ_begin()),
-                       *(B->succ_begin() + 1));
+                        *(B->succ_begin() + 1), IterationsFinishedInLoop);
   // Enqueue the new frontier onto the worklist.
   enqueue(Dst);
 }
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index fdabba46992b08..94a772de7f466a 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -212,6 +212,25 @@ typedef llvm::ImmutableMap<const LocationContext *, unsigned>
 REGISTER_TRAIT_WITH_PROGRAMSTATE(PendingArrayDestruction,
                                  PendingArrayDestructionMap)
 
+// This trait is used to heuristically filter out results produced from
+// execution paths that took "weak" assumptions within a loop.
+REGISTER_TRAIT_WITH_PROGRAMSTATE(SeenWeakLoopAssumption, bool)
+
+ProgramStateRef clang::ento::recordWeakLoopAssumption(ProgramStateRef State) {
+  return State->set<SeenWeakLoopAssumption>(true);
+}
+
+bool clang::ento::seenWeakLoopAssumption(ProgramStateRef State) {
+  return State->get<SeenWeakLoopAssumption>();
+}
+
+// This trait points to the last expression (logical operator) where an eager
+// assumption introduced a state split (i.e. both cases were feasible). This is
+// used by the WeakLoopAssumption heuristic to find situations where an
+// eager assumption introduces a state split within the evaluation of a loop
+// condition.
+REGISTER_TRAIT_WITH_PROGRAMSTATE(LastEagerlyAssumeAssumptionAt, const Expr *)
+
 //===----------------------------------------------------------------------===//
 // Engine construction and deletion.
 //===----------------------------------------------------------------------===//
@@ -2128,7 +2147,7 @@ void ExprEngine::Visit(const Stmt *S, ExplodedNode *Pred,
           (B->isRelationalOp() || B->isEqualityOp())) {
         ExplodedNodeSet Tmp;
         VisitBinaryOperator(cast<BinaryOperator>(S), Pred, Tmp);
-        evalEagerlyAssumeBinOpBifurcation(Dst, Tmp, cast<Expr>(S));
+        evalEagerlyAssumeOpBifurcation(Dst, Tmp, cast<Expr>(S));
       }
       else
         VisitBinaryOperator(cast<BinaryOperator>(S), Pred, Dst);
@@ -2401,7 +2420,7 @@ void ExprEngine::Visit(const Stmt *S, ExplodedNode *Pred,
       if (AMgr.options.ShouldEagerlyAssume && (U->getOpcode() == UO_LNot)) {
         ExplodedNodeSet Tmp;
         VisitUnaryOperator(U, Pred, Tmp);
-        evalEagerlyAssumeBinOpBifurcation(Dst, Tmp, U);
+        evalEagerlyAssumeOpBifurcation(Dst, Tmp, U);
       }
       else
         VisitUnaryOperator(U, Pred, Dst);
@@ -2761,12 +2780,10 @@ assumeCondition(const Stmt *Condition, ExplodedNode *N) {
   return State->assume(V);
 }
 
-void ExprEngine::processBranch(const Stmt *Condition,
-                               NodeBuilderContext& BldCtx,
-                               ExplodedNode *Pred,
-                               ExplodedNodeSet &Dst,
-                               const CFGBlock *DstT,
-                               const CFGBlock *DstF) {
+void ExprEngine::processBranch(
+    const Stmt *Condition, NodeBuilderContext &BldCtx, ExplodedNode *Pred,
+    ExplodedNodeSet &Dst, const CFGBlock *DstT, const CFGBlock *DstF,
+    std::optional<unsigned> IterationsFinishedInLoop) {
   assert((!Condition || !isa<CXXBindTemporaryExpr>(Condition)) &&
          "CXXBindTemporaryExprs are handled by processBindTemporary.");
   const LocationContext *LCtx = Pred->getLocationContext();
@@ -2808,6 +2825,9 @@ void ExprEngine::processBranch(const Stmt *Condition,
       std::tie(StTrue, StFalse) = *KnownCondValueAssumption;
     else {
       assert(!isa<ObjCForCollectionStmt>(Condition));
+      // TODO: instead of this shortcut perhaps it would be better to "rejoin"
+      // the common execution path with
+      // StTrue = StFalse = PrevState;
       builder.generateNode(PrevState, true, PredN);
       builder.generateNode(PrevState, false, PredN);
       continue;
@@ -2815,20 +2835,53 @@ void ExprEngine::processBranch(const Stmt *Condition,
     if (StTrue && StFalse)
       assert(!isa<ObjCForCollectionStmt>(Condition));
 
+    const Expr *EagerlyAssumeExpr =
+        PrevState->get<LastEagerlyAssumeAssumptionAt>();
+    const Expr *ConditionExpr = dyn_cast<Expr>(Condition);
+    if (ConditionExpr)
+      ConditionExpr = ConditionExpr->IgnoreParenCasts();
+    bool DidEagerlyAssume = EagerlyAssumeExpr == ConditionExpr;
+    bool BothFeasible = (DidEagerlyAssume || (StTrue && StFalse)) &&
+                        builder.isFeasible(true) && builder.isFeasible(false);
+
     // Process the true branch.
     if (builder.isFeasible(true)) {
-      if (StTrue)
+      if (StTrue) {
+        if (BothFeasible && IterationsFinishedInLoop &&
+            *IterationsFinishedInLoop >= 2) {
+          // When programmers write a loop, they imply that at least two
+          // iterations are possible (otherwise they would just write an `if`),
+          // but the third iteration is not implied: there are situations where
+          // the programmer knows that there won't be a third iteration (e.g.
+          // they iterate over a structure that has <= 2 elements) but this is
+          // not marked in the source code.
+          // Checkers may use this heuristic mark to discard results found on
+          // branches that contain this "weak" assumption.
+          StTrue = recordWeakLoopAssumption(StTrue);
+        }
         builder.generateNode(StTrue, true, PredN);
-      else
+      } else {
         builder.markInfeasible(true);
+      }
     }
 
     // Process the false branch.
     if (builder.isFeasible(false)) {
-      if (StFalse)
+      if (StFalse) {
+        if (BothFeasible && IterationsFinishedInLoop &&
+            *IterationsFinishedInLoop == 0) {
+          // There are many situations where the programmers know that there
+          // will be at least one iteration in a loop (e.g. a structure is not
+          // empty) but the analyzer cannot deduce this and reports false
+          // positives after skipping the loop.
+          // Checkers may use this heuristic mark to discard results found on
+          // branches that contain this "weak" assumption.
+          StFalse = recordWeakLoopAssumption(StFalse);
+        }
         builder.generateNode(StFalse, false, PredN);
-      else
+      } else {
         builder.markInfeasible(false);
+      }
     }
   }
   currBldrCtx = nullptr;
@@ -3752,9 +3805,9 @@ ExprEngine::geteagerlyAssumeBinOpBifurcationTags() {
                         &eagerlyAssumeBinOpBifurcationFalse);
 }
 
-void ExprEngine::evalEagerlyAssumeBinOpBifurcation(ExplodedNodeSet &Dst,
-                                                   ExplodedNodeSet &Src,
-                                                   const Expr *Ex) {
+void ExprEngine::evalEagerlyAssumeOpBifurcation(ExplodedNodeSet &Dst,
+                                                ExplodedNodeSet &Src,
+                                                const Expr *Ex) {
   StmtNodeBuilder Bldr(Src, Dst, *currBldrCtx);
 
   for (const auto Pred : Src) {
@@ -3776,6 +3829,11 @@ void ExprEngine::evalEagerlyAssumeBinOpBifurcation(ExplodedNodeSet &Dst,
       ProgramStateRef StateTrue, StateFalse;
       std::tie(StateTrue, StateFalse) = state->assume(*SEV);
 
+      if (StateTrue && StateFalse) {
+        StateTrue = StateTrue->set<LastEagerlyAssumeAssumptionAt>(Ex);
+        StateFalse = StateFalse->set<LastEagerlyAssumeAssumptionAt>(Ex);
+      }
+
       // First assume that the condition is true.
       if (StateTrue) {
         SVal Val = svalBuilder.makeIntVal(1U, Ex->getType());
diff --git a/clang/test/Analysis/loop-unrolling.cpp b/clang/test/Analysis/loop-unrolling.cpp
index 66a828abfb5133..1d58ba171c0856 100644
--- a/clang/test/Analysis/loop-unrolling.cpp
+++ b/clang/test/Analysis/loop-unrolling.cpp
@@ -349,7 +349,7 @@ int simple_unknown_bound_loop() {
 #ifdef DFS
     clang_analyzer_numTimesReached(); // expected-warning {{16}}
 #else
-    clang_analyzer_numTimesReached(); // expected-warning {{8}}
+    clang_analyzer_numTimesReached(); // expected-warning {{10}}
 #endif
   }
   return 0;
@@ -369,9 +369,9 @@ int nested_inlined_no_unroll1() {
   int k;
   for (int i = 0; i < 9; i++) {
 #ifdef DFS
-    clang_analyzer_numTimesReached(); // expected-warning {{18}}
+    clang_analyzer_numTimesReached(); // expected-warning {{20}}
 #else
-    clang_analyzer_numTimesReached(); // expected-warning {{14}}
+    clang_analyzer_numTimesReached(); // expected-warning {{18}}
 #endif
     k = simple_unknown_bound_loop();  // reevaluation without inlining, splits the state as well
   }
diff --git a/clang/test/Analysis/out-of-bounds.c b/clang/test/Analysis/out-of-bounds.c
index 1f771c2b3bd138..6380e72543bb0c 100644
--- a/clang/test/Analysis/out-of-bounds.c
+++ b/clang/test/Analysis/out-of-bounds.c
@@ -1,4 +1,9 @@
 // RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-checker=core,alpha.security.ArrayBoundV2,debug.ExprInspection -verify %s
+// RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-checker=core,alpha.security.ArrayBoundV2,debug.ExprInspection -analyzer-config eagerly-assume=false -verify %s
+
+// Note that eagerly-assume=false is tested separately because the
+// WeakLoopAssumption suppression heuristic uses different code paths to
+// achieve the same result with and without eagerly-assume.
 
 void clang_analyzer_eval(int);
 
@@ -194,3 +199,99 @@ char test_comparison_with_extent_symbol(struct incomplete *p) {
   return ((char *)p)[-1]; // no-warning
 }
 
+// WeakLoopAssumption suppression
+///////////////////////////////////////////////////////////////////////
+
+int GlobalArray[100];
+int loop_suppress_after_zero_iterations(unsigned len) {
+  for (unsigned i = 0; i < len; i++)
+    if (GlobalArray[i] > 0)
+      return GlobalArray[i];
+  // Previously this would have produced an overflow warning because splitting
+  // the state on the loop condition introduced an execution path where the
+  // analyzer thinks that len == 0.
+  // There are very many situations where the programmer knows that an argument
+  // is positive, but this is not indicated in the source code, so we must
+  // avoid reporting errors (especially out of bounds errors) on these
+  // branches, because otherwise we'd get prohibitively many false positives.
+  return GlobalArray[len - 1]; // no-warning
+}
+
+void loop_report_in_second_iteration(int len) {
+  int buf[1] = {0};
+  for (int i = 0; i < len; i++) {
+    // When a programmer writes a loop, we may assume that they intended at
+    // least two iterations.
+    buf[i] = 1; // expected-warning{{Out of bound access to memory}}
+  }
+}
+
+void loop_suppress_in_third_iteration(int len) {
+  int buf[2] = {0};
+  for (int i = 0; i < len; i++) {
+    // We should suppress array bounds errors on the third and later iterations
+    // of loops, because sometimes programmers write a loop in situations
+    // where they know that there will be at most two iterations.
+    buf[i] = 1; // no-warning
+  }
+}
+
+void loop_suppress_in_third_iteration_cast(int len) {
+  int buf[2] = {0};
+  for (int i = 0; (unsigned)(i < len); i++) {
+    // Check that a (somewhat arbitrary) cast does not hinder the recognition
+    // of the condition expression.
+    buf[i] = 1; // no-warning
+  }
+}
+
+void loop_suppress_in_third_iteration_logical_and(int len, int flag) {
+  int buf[2] = {0};
+  for (int i = 0; i < len && flag; i++) {
+    // FIXME: In this case the checker should suppress the warning the same way
+    // as it's suppressed in loop_suppress_in_third_iteration, but the
+    // suppression is not activated because the terminator statement associated
+    // with the loop is just the expression 'flag', while 'i < len' is a
+    // separate terminator statement that's associated with the
+    // short-circuiting operator '&&'.
+    // I have seen a real-world FP that looks like this, but it is much rarer
+    // than the basic setup.
+    buf[i] = 1; // expected-warning{{Out of bound access to memory}}
+  }
+}
+
+void loop_suppress_in_third_iteration_logical_and_2(int len, int flag) {
+  int buf[2] = {0};
+  for (int i = 0; flag && i < len; i++) {
+    // If the two operands of '&&' are flipped, the suppression works.
+    buf[i] = 1; // no-warning
+  }
+}
+
+int coinflip(void);
+int do_while_report_after_one_iteration(void) {
+  int i = 0;
+  do {
+    i++;
+  } while (coinflip());
+  // Unlike `loop_suppress_after_zero_iterations`, running just one iteration
+  // in a do-while is not a corner case that would produce too many false
+  // positives, so don't suppress bounds errors in these situations.
+  return GlobalArray[i-2]; // expected-warning{{Out of bound access to memory}}
+}
+
+void do_while_report_in_second_iteration(int len) {
+  int buf[1] = {0};
+  int i = 0;
+  do {
+    buf[i] = 1; // expected-warning{{Out of bound access to memory}}
+  } while (i++ < len);
+}
+
+void do_while_suppress_in_third_iteration(int len) {
+  int buf[2] = {0};
+  int i = 0;
+  do {
+    buf[i] = 1; // no-warning
+  } while (i++ < len);
+}

@NagyDonat
Contributor Author

NagyDonat commented Sep 24, 2024

Instead of the loop widening plans that I discussed earlier (e.g. on Discourse), I ended up implementing this suppression heuristic, which is currently in a "minimal stable product" state: it is working and ready to be merged IMO, but I'm also open to suggestions about technical improvements and generalizations.

This change significantly reduces the number of ArrayBoundV2 false positives; for example, on FFMPEG (the project where I've seen the most results) the number of ArrayBoundV2 results is reduced from 316 to 80.

I hope that after this change ArrayBoundV2 can be brought out of the alpha state, because the new result count is comparable to that of stable checkers: e.g. on ffmpeg, where ArrayBoundV2 produces 80 results, there are 166 core.NullDereference results, 135 core.UndefinedBinaryOperatorResult results and 120 core.uninitialized.Assign results.

My first impression is that the remaining ArrayBoundV2 results are still mostly false positives (which is not surprising because these are stable open source projects), but I didn't find any "typical issue" among the first 20 results (on ffmpeg) that I investigated.

The following diff shows the effect of enabling ArrayBoundV2 (in addition to the core, cplusplus, nullability, unix and valist checkers, which are enabled on both sides of the diff).

Project     Stable checkers + ArrayBoundV2 with this patch   Stable checkers only
memcached   2 new reports                                    0 resolved reports
tmux        1 new reports                                    0 resolved reports
curl        4 new reports                                    1 resolved reports
twin        12 new reports                                   1 resolved reports
vim         38 new reports                                   1 resolved reports
openssl     18 new reports                                   0 resolved reports
sqlite      5 new reports                                    1 resolved reports
ffmpeg      80 new reports                                   16 resolved reports
postgres    49 new reports                                   6 resolved reports
tinyxml2    1 new reports                                    0 resolved reports
libwebm     15 new reports                                   1 resolved reports
xerces      2 new reports                                    1 resolved reports
bitcoin     7 new reports                                    0 resolved reports
protobuf    8 new reports                                    2 resolved reports
qtbase      63 new reports                                   0 resolved reports
contour     1 new reports                                    0 resolved reports

Note that an almost-final version of this patch is enabled on both sides of this diff; the difference is that ArrayBoundV2 is only enabled on the left side. I'm currently running another evaluation which compares ArrayBoundV2 without and with this patch; I'll also upload those results when they're ready.

Member

@isuckatcs isuckatcs left a comment


If I understand it correctly, we want to use (very likely) incorrect assumptions to suppress false positives produced by an alpha checker, which otherwise receives wrong assumptions.

I feel like this is a situation where someone makes 2 mistakes in a maths test that by chance cancel each other out, so they end up getting the correct result. However, in that case the solution is still not accepted, because the calculation was wrong.

I think the correct solution to this problem is to investigate why the specific checker receives wrong existing assumptions and make those assumptions correct.

Comment on lines 346 to 347
/// the number of already finished iterations (0, 1, 2...); otherwise it's
/// std::nullopt.
Member


Suggested change
/// the number of already finished iterations (0, 1, 2...); otherwise it's
/// std::nullopt.
/// the number of already finished iterations (0, 1, 2, ...); otherwise it's
/// std::nullopt.

Can't IterationsFinishedInLoop be a state trait? We already have state traits for the element index inside arrays when elements are constructed and destructed; I suspect this would work similarly.

Contributor Author


Can't IterationsFinishedInLoop be a state trait?

We could introduce it, but we'd need to maintain a nested stack of loops (1 iteration in that while, then 2 iterations in the for, then 1 iteration in this other for in the function called there...) to correctly monitor the number of iterations.

This would make the heuristic introduced in this commit slightly more accurate, but I think that for practical purposes the current code is already good enough, so I didn't invest time into implementing this loop iteration count stack. (However it would be a nice follow-up commit if I have time.)

Member


We could introduce it, but we'd need to maintain a nested stack of loops (1 iteration in that while, then 2 iterations in the for, then 1 iteration in this other for in the function called there...) to correctly monitor the number of iterations.

You can bind the current iteration number to the AST node of the loop, as it happens with nested ArrayInitLoopExpr IIRC. I'm sure we already have something similar implemented.

I'd prefer using state traits here instead of modifying the engine API.

Contributor Author


I would not say that processBranch() is part of the "engine API": it's a method that is called from a single location in another closely related component of the engine, and I cannot imagine a valid use case for calling it from any other context. It's unfortunate that I'm extending this already long parameter list, but this kind of complexity cannot be entirely avoided in this sort of internal worker method.

I looked at the ArrayInitLoopExpr logic and it is not analogous to what I need. The main difference is that

  • IndexOfElementToConstruct is used to drive the iterations of the initialization loop, i.e. count the number of iterations and stop the loop when this reaches the Size of the array that's being constructed;
  • my code only observes a small part of the body of larger loops, which are driven by unrelated, external checks.

This difference means that in an array init loop the code needs to manage an iteration count, while here I think it's much better to access the canonical BlockCount (which is, roughly speaking, the iteration count).

However, unfortunately the BlockCount is not exactly the right sort of iteration count for the heuristic that I want, because it increases monotonically with each evaluation of the loop condition, while I want to restart counting the number of iterations each time we "reach the loop again" (due to a new iteration in an outer loop, or a new function call, or a goto that jumps to a place before the loop etc.).

Notice that ArrayInitLoopExpr does something similar when it adds a LocationContext to the key to distinguish initloops that happen within different function calls -- however, in my case something like this would not be sufficient to restart iteration counting when we enter a new iteration of an outer loop.

This is the reason why I spoke about a "stack" in my previous comment -- to properly reset the iteration count each time we "reach the loop again", we'd need:

  • a stack of the loops that we're inside of;
  • tagging these loops with LocationContexts to distinguish executing the same code from different function calls;
  • preferably some additional logic to handle goto, break, return and similar stuff.

In theory it's perfectly possible to implement this exact iteration counting, but I feel that in practice it's not worth the effort, because the BlockCount is a good enough approximation for this heuristic and it will make this (already complex) code significantly easier to maintain.
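As a hypothetical illustration of the kind of imprecision I mean (the functions below are made up, they're not taken from the patch or its tests):

int unknown(void);

void inner(int n) {
  int buf[2] = {0};
  for (int i = 0; i < n; i++) // the BlockCount of this condition block keeps
    buf[i] = 1;               // growing along the whole execution path
}

void outer(void) {
  for (int k = 0; k < 3; k++)
    inner(unknown()); // each call re-enters the same loop, but the counter is
                      // not restarted, so "BlockCount - 1" can overestimate
                      // how many iterations finished in the current entry
}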

Member


Isn't binding the iteration count to the loop terminator sufficient?

Let's say we pass the terminator to processBranch() instead of the iteration count.
If there is no trait set for the terminator, we set it on the true branch, when the loop is entered.
If the terminator is already set, we know that we finished one iteration already and increment it.
If the condition is false and the loop is not entered again, the trait is removed.

I'm not sure how well this would work though, since I didn't test it.
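A rough sketch of what I mean (untested, all of the names below are invented; it assumes the usual ProgramState/ProgramStateTrait helpers):

#include "clang/AST/Stmt.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"

using namespace clang;
using namespace ento;

// Iterations finished so far in each loop, keyed by the loop's terminator.
REGISTER_MAP_WITH_PROGRAMSTATE(IterationsFinished, const Stmt *, unsigned)

// True branch (the loop body is entered): record 0 finished iterations on the
// first entry; afterwards every new entry means one more iteration finished.
static ProgramStateRef takeTrueBranch(ProgramStateRef State, const Stmt *Term) {
  const unsigned *Count = State->get<IterationsFinished>(Term);
  return State->set<IterationsFinished>(Term, Count ? *Count + 1 : 0);
}

// False branch (the loop is exited): drop the entry so that a later re-entry
// of the same loop starts counting from scratch.
static ProgramStateRef takeFalseBranch(ProgramStateRef State,
                                       const Stmt *Term) {
  return State->remove<IterationsFinished>(Term);
}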

Comment on lines +219 to +225
ProgramStateRef clang::ento::recordWeakLoopAssumption(ProgramStateRef State) {
return State->set<SeenWeakLoopAssumption>(true);
}

bool clang::ento::seenWeakLoopAssumption(ProgramStateRef State) {
return State->get<SeenWeakLoopAssumption>();
}
Member


Every registered trait should be removed when it is no longer needed, to stop polluting the states. Also, aren't the traits part of the exploded graph? If they are and they are not cleaned up, they pollute the egraph too.

Contributor Author


This bit doesn't have any "no longer needed" point -- if we made some shaky assumption for a loop condition, I want to suppress all the ArrayBoundV2 reports that are reached through a path which takes that particular branch.

(By the way, each ExplodedNode has an associated State, which is essentially a big heap of traits that are introduced by various parts of the engine and various checkers; so yes, traits are contained in the egraph as well.)

Member

@isuckatcs isuckatcs Oct 10, 2024


This bit doesn't have any "no longer needed" point

Is it still worth keeping them once we have finished processing the loop? The loop condition only gives us information while we are inside the loop the condition belongs to, right? Once the loop is exited, the trait could be safely removed, I think.

I only see the pattern that every state trait that we have now comes with a method that sets it and another one that removes it, so I assume we remove them for a reason.

Contributor


There is nothing to be removed here as this trait is a bool.

Contributor Author


Is it still worth keeping them once we have finished processing the loop? The loop condition only gives us information while we are inside the loop the condition belongs to, right? Once the loop is exited, the trait could be safely removed, I think.

No, there is a common false positive pattern where an unfounded assumption within the loop is used after the loop and produces an unwanted result at that point. See e.g. the testcase loop_suppress_after_zero_iterations where the assumption in the loop introduces len == 0 as a separate branch, while there is no justified reason to handle this separately.

Member


Yes, but with this left turned on, the coverage drop is huge even in cases that are not affected by the assumption.

void foo(int x, int y) {
  for (unsigned i = 0; i < x; i++) ; // split the state and set SeenWeakLoopAssumption to 'true'
  if (x != 0) return;                // drop the 'true' branch

  // no warnings are reported from this point on

  int buf[1] = {0};
  for (int i = 0; i < y; i++)
    buf[i] = 1;                      // SeenWeakLoopAssumption is 'true', so the warning is suppressed
}

This goes on through multiple function calls too.

void a() {}

void b() { a(); }

void c() { b(); }

void d() { c(); }

void top_level(int x) {
  for (unsigned i = 0; i < x; i++)  ;
  if (x != 0) return;

  // no warnings are reported from this point on

  d();
}

If a warning is found inside any of a(), b(), c() or d(), it is suppressed because the trait is set at the top of the execution path.

Since we generate a sink it is just 1 false negative though, while relying on an unfounded assumption might trigger a false positive in one of the nested functions, so I guess we can live with this.

Contributor Author


The example

void foo(int x, int y) {
  for (unsigned i = 0; i < x; i++) ; // split the state and set SeenWeakLoopAssumption to 'true'
  if (x != 0) return;                // drop the 'true' branch

  // no warnings are reported from this point on
}

is a very good point and I'll probably add it to the tests to highlight this limitation of the heuristic.

However, I've seen ArrayBoundV2 reports where lots of stuff happens between the point where we assume that a loop can have 0 iterations (i.e. some length/size variable is equal to 0) and the point where this triggers an unwanted report; so I don't think that we can have a natural cutoff where the "SeenWeakLoopAssumption" bit may be safely cleared.

I don't see a way to avoid these kinds of false negatives without a completely different loop handling approach, so I think we should accept them in the foreseeable future. (There are already lots of precedents for losing coverage after loops.)

@NagyDonat
Contributor Author

NagyDonat commented Sep 25, 2024

If I understand it correctly, we want to use (very likely) incorrect assumptions to suppress false positives produced by an alpha checker, which otherwise receives wrong assumptions. [...] I think the correct solution to this problem is to investigate why the specific checker receives wrong existing assumptions and make those assumptions correct.

No, the situation is significantly more complex.

The "wrong existing assumptions" are caused by a fundamental limitation of the path-sensitive analysis approach, namely that we need to follow one concrete execution path, which in a loop means modeling one particular iteration count. (See below for more detailed explanation.)

To "make those assumptions correct" we would need to implement a complex invalidation/widening procedure -- I wasted several months on it and didn't manage to make it work. If you write an accurate loop widening logic for the analyzer, then after that I'd happily remove the heuristic workaround added by this commit, but I don't want to wait years with bringing this checker out of the alpha state.

Justified and wrong assumptions in the analyzer:

The analyzer makes lots of assumptions: at each conditional expression, short-circuiting operator, loop etc. it must pick one of the multiple branches (by the way, this is implemented in ExprEngine::processBranch() and the related functions).

If the condition of an if statement is ambiguous (it can be both true and false in the current State), then both choices are justified because we can say that "the programmer wrote an if here, so they think that both cases are possible".

<offtopic> This "justification" is not entirely correct because there are paranoid defensive checks (and we try to filter them out with heuristics) and there are many know false positives that look like

void do_something(/* some arguments */, bool flag1, bool flag2) {
  if (flag1) {
    // BLOCK1
  }
  // ...
  if (flag2) {
    // BLOCK2
  }
}

where it's semantically very clear for the programmer that flag1 and flag2 won't be used together, and the nonsense combination of BLOCK1 + BLOCK2 triggers some error. (Here the presence of "if (flag2)" implies that the programmer thinks that flag2 can be true in some situations, but the analyzer is wrong when it assumes that flag2 can be true on the branch where BLOCK1 was executed.) However, this "correlated if" problem is a separate, unrelated, mostly unsolvable limitation of the analyzer, so we shouldn't discuss it here. </offtopic>

If the condition of a loop is ambiguous (and e.g. remains ambiguous each time it's evaluated), then that's a more complex issue: when programmers write a loop, it's justified to assume that "in some situations there can be at least two iterations" (because otherwise the loop is overkill and they should've just written an if) -- but more concrete assumptions are all unjustified, because the presence of the loop (with the ambiguous condition) doesn't imply anything like "there won't be any iterations", "there will be 1 iteration" or "there will be at least 3 iterations" etc.

Note: your inline comments suggest that the analyzer should assume and analyze e.g. the "there won't be any iterations" case unless something (e.g. an assertion) rules it out explicitly, but I strongly disagree with this philosophy because:

  • There are situations where there is no natural place for the assert(): e.g. if the loop looks like while (we_have_time() && !algorithm_finished(foo, bar, baz)) {...} where would you put the assertion?
  • The analyzer must aim for accepting all normal code, because it's not reasonable to demand that when a project starts to use the static analyzer, they immediately introduce hundreds of assertions to fulfill the demands of our tyrannical tool.
    Requiring assertions in situations where (e.g.) the zero-iteration case is impossible may be a useful design rule, but it's far from being universally accepted, so we shouldn't enforce it in a general-purpose checker. (If you want to implement it as an off-by-default flag, then that would be OK, but I'm not interested in covering it.)

So, to summarize these, when we encounter a loop and the evaluations of the condition are ambiguous:

  • The analyzer architecture can only follow concrete code paths like "skip the loop", "perform 1 iteration then leave", "perform 2 iterations then leave", ..., "perform 6457 iterations then leave" etc.
  • It's (mostly - see the offtopic block near the beginning) justified to stay in the loop for 2 iterations, because if the programmer expected just one iteration, then they would've written an if instead of a loop.
  • It's not justified to stay in the loop for more than two iterations, because the normal way to implement "loop that can execute $\le 2$ iterations" is the same as "loop that can execute many iterations" (and no, you cannot demand that short loops must be marked with assertions, that's just one particular coding style).
  • Theoretically speaking, it's never completely justified to assume a loop exit: for example I can imagine graph algorithms that gradually flood a certain array, where it's clear for the programmer that there will be at least array_size steps (but perhaps more).
    • In practice, it's unjustified to assume that the loop is completely skipped -- those branches produce many really ugly false positives.
    • In practice, it's usually OK to assume a loop exit when at least one iteration is completed -- in some cases that can still lead to false positives (e.g. here is a false positive caused by assuming an exit after one iteration), but they are significantly rarer.

Based on this, my patch aims to use the code paths that spend one or two iterations within the loops that they encounter. (This is currently limited to ArrayBoundV2, because that's where I've seen the overwhelming amount of false positives, but I think it'd probably be useful to generalize this kind of suppression to all the checkers.)
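To spell out these rules on a tiny made-up example (coinflip() stands for any condition that stays ambiguous, like in the new tests):

int coinflip(void);

void which_assumptions_are_weak(void) {
  int n = 0;
  while (coinflip()) { // 1st evaluation (0 iterations done): assuming "false",
                       //   i.e. skipping the loop, is weak; "true" is fine
    n++;               // 2nd evaluation (1 iteration done): neither branch is weak
  }                    // 3rd evaluation (2 iterations done): assuming "true",
                       //   i.e. a 3rd iteration, is weak; leaving the loop is fine
}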

@pskrgag
Contributor

pskrgag commented Sep 27, 2024

I didn't have a chance to look into the details, but I'd like to share that the FP rate is MUCH lower with this patch applied.

On a real (proprietary) code base I see about 95% fewer ArrayBoundV2 reports and, I hope, we will be able to turn it on in CI (finally).

Contributor

@steakhal steakhal left a comment


Looks convincing. Suppression like this is probably not the best we can achieve, but if it reduces the FPs by 2-3x, then I'm all in!
I'm putting this PR to the test downstream to see the overall feel we would get.
I'll report back once I have some time to eval the diffs :D

This commit cleans up some typos that were reported by the reviewers and
tries to provide better explanations for some parts of the patch that
turned out to be confusing.

github-actions bot commented Sep 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@steakhal
Contributor

steakhal commented Oct 1, 2024

I ran this PR, and we can see an overall reduction of 32.7% in OOBv2 reports.
I've sampled about 10 disappearing issues and they were either difficult to follow or outright bad by assuming an unjustified iteration count. I'll continue the sampling, but I think I can already say that this would improve the status quo.


Btw just to clarify explicitly, we are not using bugreport visitors for this because it may follow a wrong predecessor path, and on a different predecessor path we might end up crossing the terminator Stmt a different number of times? Is this correct?

To me, it's not clear why we couldn't achieve the same using bug report visitors. Let me recap how diagnostic construction works.
To the best of my knowledge, a bug report equivalence class groups bug reports that are issued against the same "location" - basically line and issue message. Then we create a reduced exploded graph by keeping only the nodes that lead from the root to any bug in the bug eqclass (including all the edges, splits and joins). Then we try each bug report, with its shortest path from the root, with different bug report visitors to see if they accept the path (aka the report). If accepted, then we have a valid bug report, thus we are done with the bug eqclass and later on we will convert this into a diagnostic. If rejected, then we try the next bug report in the eqclass and walk the shortest path of that one; and repeat until all are refuted or one is accepted.

Why can't we implement this report invalidation by a bug report visitor?

@NagyDonat
Contributor Author

NagyDonat commented Oct 1, 2024

I ran this PR, and we can see an overall reduction of 32.7% in OOBv2 reports. I've sampled about 10 disappearing issues and they were either difficult to follow or outright bad by assuming an unjustified iteration count. I'll continue the sampling, but I think I can already say that this would improve the status quo.

I'm happy to hear that, thanks for the testing!

Btw just to clarify explicitly, we are not using bugreport visitors for this because it may follow a wrong predecessor path, and on a different predecessor path we might end up crossing the terminator Stmt a different number of times? Is this correct?
[...]
Why can't we implement this report invalidation by a bug report visitor?

No, I didn't think about the predecessor issues, but I have several other reasons.

In fact, I originally tried to implement this heuristic as a bug reporter visitor (and I still have the mostly complete code in a local throwaway branch) but I ran into the issue that as the visitor is walking backwards along the execution path, I don't see the number of iterations already performed in the loop until that decision point (the time direction is flipped here 🙃 ).

To solve this issue, I thought about adding some logic that stores the iteration count (or the BlockCount which approximates it) in the ExplodedNodes, but then I realized that I might as well include the final "did we see a weak loop assumption along the path" bit in the State instead of the raw iteration count which would require additional post-processing by the bug reporter visitor.

This has the added benefit that here in processBranch it's easier to check whether the engine did a state split. (I know that this information is also accessible within a bug reporter visitor, but I felt that traversing the exploded graph is more complex than directly observing the state split.)

In general, I feel that BugReporterVisitor is a complicated piece of artillery which is needed for some difficult targets, but not a good choice for a little sparrow. This backwards-walking logic is sometimes needed to select the important things that end up being relevant in the bug report, but in this particular problem the forward propagation is completely sufficient.

Moreover, if this suppression works well, then perhaps we could generalize it and add a config option which enables it for all checkers. If we want to do this, it will be nice to have this logic in processBranch() where we can immediately discard the "weak" transitions (instead of just marking them).

Finally, I also admit that I personally dislike BugReporterVisitor, and while I can work with it when it's necessary, I prefer to avoid it when it's optional 😅

@steakhal
Contributor

steakhal commented Oct 1, 2024

[...]
Why can't we implement this report invalidation by a bug report visitor?

No, I didn't think about the predecessor issues, but I have several other reasons.

In fact, I originally tried to implement this heuristic as a bug reporter visitor (and I still have the mostly complete code in a local throwaway branch) but I ran into the issue that as the visitor is walking backwards along the execution path, I don't see the number of iterations already performed in the loop until that decision point (the time direction is flipped here 🙃 ).

We can always have a map, mapping ForStmts to finished iterations. Basically, from the ErrorNode, we could walk backwards (as usual in a visitor like that), and check if the current ProgramPoint is a PostStmt. We would then grab the wrapped Stmt to see if it's the terminator Stmt of a For-like statement. If so, then bump the counter in the map of that loop.
Once we have finished the visitation, in the finalizeVisitor we could see if any of the loops had a wrong counter, and mark the report invalid.

If we want to see if we did a state-split when we left the loop we would have 2 options:

  1. Take the State and the condition expression and query by using assume as usual. We should get the same answer as during the egraph construction.
  2. Have a look at the original exploded graph and check if we only have a single successor node.
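An untested sketch of what I have in mind (every name below is invented and the counting is only approximate: it ignores do-while loops and loops that are entered several times on the same path; I also match BlockEdge program points here instead of PostStmt, because that makes finding the loop terminator easier):

#include "clang/AST/Stmt.h"
#include "clang/AST/StmtCXX.h"
#include "clang/Analysis/CFG.h"
#include "clang/Analysis/ProgramPoint.h"
#include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
#include "clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
#include "llvm/ADT/DenseMap.h"
#include <optional>

using namespace clang;
using namespace ento;

class WeakLoopAssumptionVisitor final : public BugReporterVisitor {
  // How many times the bug path crossed the condition block of each loop.
  llvm::DenseMap<const Stmt *, unsigned> CondEvaluations;

public:
  void Profile(llvm::FoldingSetNodeID &ID) const override {
    static int Tag = 0;
    ID.AddPointer(&Tag);
  }

  PathDiagnosticPieceRef VisitNode(const ExplodedNode *N, BugReporterContext &,
                                   PathSensitiveBugReport &) override {
    // The bug path still contains block-edge nodes, so count how many times we
    // left a block whose terminator is a for/while-like loop statement.
    if (std::optional<BlockEdge> BE = N->getLocationAs<BlockEdge>()) {
      const Stmt *Term = BE->getSrc()->getTerminatorStmt();
      if (Term && isa<ForStmt, WhileStmt, CXXForRangeStmt>(Term))
        ++CondEvaluations[Term];
    }
    return nullptr; // This visitor adds no note pieces, it only counts.
  }

  void finalizeVisitor(BugReporterContext &, const ExplodedNode *,
                       PathSensitiveBugReport &BR) override {
    // A for/while loop that exits after N iterations evaluates its condition
    // N+1 times, so 1 evaluation means "the loop was skipped" and 4 or more
    // evaluations mean "at least 3 iterations" -- both are weak assumptions.
    for (const auto &Entry : CondEvaluations)
      if (Entry.second == 1 || Entry.second >= 4)
        BR.markInvalid(/*Tag=*/"WeakLoopAssumption", /*Data=*/Entry.first);
  }
};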

This has the added benefit that here in processBranch it's easier to check whether the engine did a state split. (I know that this information is also accessible within a bug reporter visitor, but I felt that traversing the exploded graph is more complex than directly observing the state split.)

Note that the exploded graph that the VisitNode traverses is stripped, thus it's a single path of nodes, without ambiguities of successors or predecessor nodes.

In general, I feel that BugReporterVisitor is a complicated piece of artillery which is needed for some difficult targets, but not a good choice for a little sparrow. This backwards-walking logic is sometimes needed to select the important things that end up being relevant in the bug report, but in this particular problem the forward propagation is completely sufficient.

Yes, but I also wish we could achieve this without making the internals more complicated. Having a separate component doing this has a lot of value.

Moreover, if this suppression works well, then perhaps we could generalize it and add a config option which enables it for all checkers. If we want to do this, it will be nice to have this logic in processBranch() where we can immediately discard the "weak" transitions (instead of just marking them).

A BugReportVisitor is just as generic to me.

Finally, I also admit that I personally dislike BugReporterVisitor, and while I can work with it when it's necessary, I prefer to avoid it when it's optional 😅

If you prefer, I could also play with this concept. I'll probably have maybe 1 day this week where I could work on this.

@NagyDonat
Contributor Author

NagyDonat commented Oct 1, 2024

We can always have a map, mapping ForStmts to finished iterations. Basically, from the ErrorNode, we could walk backwards (as usual in a visitor like that), and check if the current ProgramPoint is a PostStmt. We would then grab the wrapped Stmt to see if it's the terminator Stmt of a For-like statement. If so, then bump the counter in the map of that loop. Once we have finished the visitation, in the finalizeVisitor we could see if any of the loops had a wrong counter, and mark the report invalid.

If we want to see if we did a state-split when we left the loop we would have 2 options:

1. Take the State and the condition expression and query by using `assume` as usual. We should get the same answer as during the egraph construction.

  2. Have a look at the **original** exploded graph and check if we only have a single successor node.

That's a clever solution, but I'm still not convinced that overall it'd be easier to understand than what I wrote.

Note that in addition to checking for a state split when the analyzer leaves the loop, the heuristic also needs to check for a state split when the analyzer remains in the loop, so for each loop we would need to manage a list like "left the loop with[out] assumption, before that remained in the loop with[out] assumption, before that remained in the loop with[out] assumption, ..."

This already awkward code would become even more complex if we determined that we need accurate iteration counts in cases when a loop is reached several times (from different function calls or from different iterations of an outer loop). Implementing this would already be difficult in my eager model (see the last few paragraphs of this inline comment: #109804 (comment) ), and the backward-iteration logic would make this even more complicated.

Note that the exploded graph that the VisitNode traverses is stripped, thus it's a single path of nodes, without ambiguities of successors or predecessor nodes.

Thanks for alerting me 😅 I haven't had the opportunity to run into this trap (yet), and now I'll try to remember it.

In general, I feel that BugReporterVisitor is a complicated piece of artillery [...]

Yes, but I also wish we could achieve this without making the internals more complicated. Having a separated component doing this has a lot of value.

Moreover, if this suppression works well, then perhaps we could generalize it and add a config option which enables it for all checkers. If we want to do this, it will be nice to have this logic in processBranch() where we can immediately discard the "weak" transitions (instead of just marking them).

A BugReportVisitor is just as generic to me.

I see that there is value in keeping the internals as simple as possible, but I also feel that hiding pieces of the engine logic in scattered visitors makes the code harder to understand. (Especially since the visitor activates at a very different stage of the analysis, so its influence is completely invisible in the engine logic where it "belongs".)

I think this is a "simple is better than complex; complex is better than complicated" case (to quote the Zen of Python): we should try to keep the engine as simple as possible, but we should not hide parts of the necessary complexity behind complicated tricks.

If we end up generalizing this and the "weak" transitions are recognized by many checkers (or we discard them altogether), then it would be important to have this logic in its "natural" place in the engine (instead of being hidden away in a visitor).

On the other hand, if this heuristic remains limited to ArrayBoundV2, then I agree that architecturally it would be better to place this logic in a visitor in ArrayBoundCheckerV2.cpp -- but even in this case I feel that this concern is counterbalanced by the fact that the visitor code would be significantly more awkward and complex than the direct eager implementation.

Moreover, if we end up enabling this heuristic for all the checkers, then we can outright discard the weak transitions, which could provide a significant performance improvement compared to both the visitor-based alternative implementation and even the status quo.

If we eagerly prune the exploded graph by not following the weak loop assumptions, then we could avoid visiting lots of execution paths, while

  • the visitor-based approach would just discard results from these paths after wasting time on analyzing them;
  • the status quo displays results found on these paths, but I suspect that the results coming after assuming 0 or 3+ loop iterations are redundant and very similar to the results after 1 or 2 (with a few differences where I'd prefer to see the findings from 1 or 2 iterations).

On complex real-world code I think it's fair to assume that the average bug path contains at least one loop where the analyzer can't evaluate the condition (currently these are displayed as skipped, because that's the shortest -- but sometimes visibly unrealistic -- path). This would imply that (very roughly) half of the execution paths are affected by weak loop assumptions, and I don't think that we should discard so many paths.

If you prefer, I could also play with this concept. I'll probably have maybe 1 day this week where I could work on this.

Thanks for the offer, but right now I think I would prefer to keep the direct/eager approach. However, even the visitor-based approach would be better than nothing, so if you say that the engine modifications are unacceptable, then we could switch to a visitor. (However, I can probably write the visitor on my own following the approach that you suggested. I'm not incapable of writing bug reporter visitors, I just think that they're ugly.)

I also thought about finding a third way with a check::BranchCondition callback, but unfortunately that callback appears to be a terminal stage victim of code decay. (I'll probably write a github ticket about its sorry state in the upcoming days.)

Simplify the code by removing a special case that tried to ensure
identical behavior between analysis with and without eagerly-assume
in a weird theoretical situation.
@NagyDonat
Contributor Author

NagyDonat commented Oct 2, 2024

@isuckatcs If I am not mistaken, I reacted to every comment in your first review.

I'm sorry for the salty initial reactions -- after thinking about this topic for months, I ended up feeling that everything is so obvious, while in fact it's a really complicated topic and perhaps my original explanations weren't clear enough.

I understand your point of view that it would be nice if the programmers marked their knowledge with assertions (and we could reason based on the lack of assertions), and e.g. half a year ago I might've agreed with you, but I don't think that it's the way forward. This is an application where the user is always right and we must avoid producing annoying/unwanted reports even if they are technically correct.

@NagyDonat NagyDonat requested a review from isuckatcs October 2, 2024 18:30
steakhal added a commit to steakhal/llvm-project that referenced this pull request Oct 5, 2024
@steakhal
Contributor

steakhal commented Oct 5, 2024

I implemented my idea of doing this suppression solely within a BugReport visitor in PR #111258. Let me know what you think.

Comment on lines +146 to +150
ProgramStateRef recordWeakLoopAssumption(ProgramStateRef State);

/// Returns true if `recordWeakLoopAssumption()` was called on the execution
/// path which produced `State`.
bool seenWeakLoopAssumption(ProgramStateRef State);
Member

While browsing through the codebase, I've seen that the analogous methods for other traits are defined inside class ExprEngine as static methods, with setters and deleters being private and getters being public.

IIUC, they are in the global scope so that they can be accessed from ArrayBoundCheckerV2. Tbh, I would make recordWeakLoopAssumption() a private static method of the ExprEngine class: only this class is able to assume anything about the condition, so it should be the only one that can set any state traits related to those assumptions. Meanwhile, seenWeakLoopAssumption() could be a public static member function of ExprEngine to follow the pattern.

Contributor Author

Personally I prefer plain functions instead of static methods that are only vaguely connected to a class, but I can move them to ExprEngine to follow the precedents.

Making recordWeakLoopAssumption() private is also a good point -- I'll either do so or just inline its short one-line definition at the few locations where it's called. (I think the transparent State->set<> would be clearer than just calling some random method. If we need more complex logic later, we can reintroduce a method like it.)

Member

methods that are only vaguely connected to a class

I think these methods are strongly connected to this class, as they only make sense in the context of ExprEngine, which is also indicated by them being in ExprEngine.h. Also ExprEngine is the only class that can set this trait, as no one else has access to the required information to do so.

Comment on lines +219 to +225
ProgramStateRef clang::ento::recordWeakLoopAssumption(ProgramStateRef State) {
return State->set<SeenWeakLoopAssumption>(true);
}

bool clang::ento::seenWeakLoopAssumption(ProgramStateRef State) {
return State->get<SeenWeakLoopAssumption>();
}
Member

Could you please move these definitions to the place where the similar definitions for the other state traits are found, so that they stay grouped together?

// Used by the WeakLoopAssumption heuristic to find situations where an
// eager assumption introduces a state split within the evaluation of a loop
// condition.
REGISTER_TRAIT_WITH_PROGRAMSTATE(LastEagerlyAssumeAssumptionAt, const Expr *)
Member

I suggested a renaming only because I'm not sure that EagerlyAssumeAssumption is a correct English phrase.

Comment on lines +219 to +225
ProgramStateRef clang::ento::recordWeakLoopAssumption(ProgramStateRef State) {
return State->set<SeenWeakLoopAssumption>(true);
}

bool clang::ento::seenWeakLoopAssumption(ProgramStateRef State) {
return State->get<SeenWeakLoopAssumption>();
}
Member

Yes, but with this trait left turned on, the coverage drop is huge even in cases that are not affected by the assumption.

void foo(int x, int y) {
  for (unsigned i = 0; i < x; i++) ; // split the state and set SeenWeakLoopAssumption to 'true'
  if (x != 0) return;                // drop the 'true' branch

  // no warnings are reported from this point on

  int buf[1] = {0};
  for (int i = 0; i < y; i++)
    buf[i] = 1;                      // SeenWeakLoopAssumption is 'true', so the warning is suppressed
}

This goes on through multiple function calls too.

void a() {}

void b() { a(); }

void c() { b(); }

void d() { c(); }

void main(int x) {
  for (unsigned i = 0; i < x; i++)  ;
  if (x != 0) return;

  // no warnings are reported from this point on

  d();
}

If a warning is found inside any of a(), b(), c() or d(), it is suppressed because the trait is set at the top of the execution path.

Since we generate a sink, it is just one false negative, while relying on an unfounded assumption might trigger a false positive in one of the nested functions, so I guess we can live with this.

void evalEagerlyAssumeBinOpBifurcation(ExplodedNodeSet &Dst, ExplodedNodeSet &Src,
                                       const Expr *Ex);
/// evalEagerlyAssumeOpBifurcation - Given the nodes in 'Src', eagerly assume
/// symbolic expressions of the form 'x != 0' or '!x' and generate new nodes
Member

My point was that some other methods called by this method also have the same misleading and incorrect name, like geteagerlyAssumeBinOpBifurcationTags(). I think if we rename this method, every other similar method should also be renamed.

This is why I suggested a separate patch; or, if you want to do it in this patch, just keep the changes in two different commits so they are separated when the PR is merged.

@Szelethus
Contributor

Szelethus commented Oct 14, 2024

Damn, look at this project, thriving as strong as ever 18 years after its conception! Great discussions!

Let me throw in my two cents. Overall, I think we should go with this patch.

I agree with a great many things previously said; this one summarizes the argument for the visitor implementation quite well:

Yes, but I also wish we could achieve this without making the internals more complicated. Having a separate component doing this has a lot of value.

The question is whether the other benefits of this approach outweigh this drawback. What I see as a particular strength here is the possibility of discarding bogus paths of execution. BugReporterVisitors are retrospective tools, and can't affect the analysis as it's happening. Suppressing reports later than never is still incredibly valuable, and necessary if we can't decide during analysis whether subsequent reports will be bogus. A good example here is MacroNullReturnSuppressionVisitor, which suppresses reports that were directly caused by a zero/null value that originated from a macro. If we were to prematurely sink all execution paths where a null value formed in a macro, we would flush perfectly good execution paths down the drain, leading to lower code coverage.

The case is a little different here. Since we are talking about limiting how many loop iterations we want to analyze, it's more a question of how many times we want to analyze a piece of code, not whether we want to analyze it at all. I mean, I can construct made-up examples where discarding 0 or >2 iterations directly leads to code coverage loss on code that should've been analyzed, but I predict this to be much more tolerable. Bugs that we detect now with little to no connection to loops in the program should still be detected, but maybe on a slightly different path of execution. Of course, this remains to be proven by actual data.

What I'm getting at with this is that if we ever decide to sink the 0 or >2 iteration paths, it would be a relatively trivial change starting out from this patch.

On a smaller note: I took a look at #111258 and, as someone who has written his fair share of visitors, I find the code much more readable. Compared to the core, a visitor is more digestible and it is much easier to understand how it will affect the output of the analyzer. On the other hand, even though this patch complicates the core, this is an issue stemming from the core, so the placement of the changes seems more appropriate there.

To summarize, for me, this is a deciding factor: if we only want to suppress reports for selected checkers after analysis, a visitor is the better, more readable and maintainable solution. If we want to axe some paths for good, and I think we should, this patch should remain.

@NagyDonat
Contributor Author

NagyDonat commented Oct 15, 2024

@steakhal I'm sorry that I disappeared for more than a week when you dropped your alternative implementation. I thought a lot about the advantages and limitations of my implementation, your implementation and even "third way" alternatives (but unfortunately check::BranchCondition doesn't seem to be useful for this), but finally (after a discussion with @Szelethus ) I decided that I would prefer to keep my approach.

I was tempted to switch to the visitor-based implementation because its code quality is simply better, but overall I feel that this direct approach would be even better if I clean it up properly.

My main considerations are:

  • I think it's likely (although not guaranteed) that this heuristic would be helpful for other checkers as well, and if we want to activate it for all checkers, then it must be done in an eager way (because eagerly sinking lots of paths is significantly better performance-wise than lazily discarding all those results).
  • I agree with @Szelethus that this heuristic is handling an engine issue, so its logic should be in the engine itself.
  • The OrigCursor trickery is not unacceptable by itself, but the added complexity is one disadvantage of the visitor approach.
  • The "enumerate all subexpressions of the loop condition and check the number of times they were assumed to be true/false" approach can handle more of my testcases than my original implementation, but it can produce very counter-intuitive results:
    • If we have a loop like while (!seen_errors(output, logging_mode() < LOG_DEBUG)) with eager assumptions and logging_mode() is defined in a different TU, then the "count true/false evaluations of subexpressions" logic says that this loop makes an assumption at each iteration -- even if seen_errors is inlined and e.g. the analyzer knows that it always returns false (see the sketch after this list).
    • There are also situations where not all evaluations of the loop condition evaluate a particular subexpression, which would also produce strange results.
    • Logical negation operations could swap the "real" meaning of "number of times it's assumed to be true" vs "number of times it's assumed to be false" counters.
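To make the first sub-point concrete, here is a hypothetical example (all names are made up; logging_mode() is assumed to be defined in a different TU):

enum LogLevel { LOG_ERROR, LOG_WARNING, LOG_DEBUG };
LogLevel logging_mode();                     // defined in a different TU
bool seen_errors(int *output, bool verbose); // may be inlined and fully known

void drain(int *output) {
  // With eager assumptions, `logging_mode() < LOG_DEBUG` is bifurcated in
  // every iteration, so counting the true/false evaluations of the
  // subexpressions suggests that the loop makes an assumption in each
  // iteration -- even when the analyzer knows that seen_errors() always
  // returns false and therefore never needs to assume anything about the
  // loop condition itself.
  while (!seen_errors(output, logging_mode() < LOG_DEBUG)) {
  }
}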

@steakhal @ anybody else
Is my eager implementation acceptable for you (in theory, assuming that it's cleaned up sufficiently)? If yes, I'll start to work on implementing the suggestions of @isuckatcs and any other observations.

@NagyDonat
Contributor Author

By the way, my plan for the "how do we get an accurate iteration count?" and "how do we handle complex loop conditions that contain short-circuit operators?" issues is that in this first patch I want to go with the rough approximation that's easiest to implement, and I'll probably refine it in follow-up patches.

@steakhal
Contributor

  • I think it's likely (although not guaranteed) that this heuristic would be helpful for other checkers as well, and if we want to activate it for all checkers, then it must be done in an eager way (because eagerly sinking lots of paths is significantly better performance-wise than lazily discarding all those results).

Alright. So, if I understand you correctly, the goal is not only to fix the OOBv2 checker but also to later expand the scope to the whole engine - and skip exploring paths that were explored before. I think that deserves a deeper discussion, but I also understand that we need implementation experience and experimentation to back the RFC with data.
This wasn't clear to me initially. The PR title and summary referred to suppression, which didn't motivate this change well enough to touch the engine. Last time I checked the loop handling RFC Discourse post, I thought you had finally decided to take @haoNoQ's suggestion - and suppress issues instead of changing exploration strategies (if I understood that correctly).

That said, if this patch only wants to suppress reports, then a bug report visitor is the right choice to me. If we shoot for something bigger (and this is just the first step), then it starts to make sense. However, achieving something like that is going to be really, really tough, so the chances are slim. I wonder if you have plans for doing it incrementally and with options for disabling it while developing and preparing this new version.

  • I agree with @Szelethus that this heuristic is handling an engine issue, so its logic should be in the engine itself.

This point partially resonates with me: this is both a checker and an engine issue, thus a hard and important problem to solve.

  • The OrigCursor trickery is not unacceptable by itself, but the added complexity is one disadvantage of the visitor approach.

I agree that that approach is highly unorthodox and surprising. I have two observations about this.
1st, if we always had an ExplodedNode tag for taking an eager decision (and some similar tag when eager assumptions are disabled), then I wouldn't need that hack. You can see in my commit history there that this was my initial idea - but I realized that we may have good reasons for disabling eager assumptions, and in that case I wouldn't have those tags anymore :s
2nd, it's a convenient lie to have a stripped single-path "exploded graph" from a BugReport visitor, because then it's unambiguous what parent nodes it chooses, thus developers can focus on what matters the most. However, like in this case, it would be nice if we could do arbitrary things. If we had a better API for doing this, it wouldn't feel odd anymore.
So there are ways to improve this.

  • The "enumerate all subexpressions of the loop condition and check the number of times they were assumed to be true/false" approach can handle more of my testcases than my original implementation, but it can produce very counter-intuitive results: [...]

Yes, I agree that this "try all subexpressions" (that I implemented) isn't bulletproof. However, I only did it in slightly more than a day. I'd say, if this is a concern, I can give it a bit of a polishing, and put it to the bench to see how it works in life.
If you have concerns about specifics, let's discuss them under PR #111258, so that we all have the code in front of us. I think we can get fairly far once we settle on some concrete examples where the current implementation would be surprising, and refine it from there.

@steakhal @ anybody else Is my eager implementation acceptable for you (in theory, assuming that it's cleaned up sufficiently)? If yes, I'll start to work on implementing the suggestions of @isuckatcs and any other observations.

I don't think we reached consensus yet. I'm less concerned about the details, because I know you can do it.
This is why I didn't look into the details of this PR or comment on nits so far.

@NagyDonat
Contributor Author

NagyDonat commented Oct 16, 2024

  • I think it's likely (although not guaranteed) that this heuristic would be helpful for other checkers as well, and if we want to activate it for all checkers, then it must be done in an eager way (because eagerly sinking lots of paths is significantly better performance-wise than lazily discarding all those results).

Alright. So, if I understand you correctly, the goal is not only to fix the OOBv2 checker but also to later expand the scope to the whole engine - and skip exploring paths that were explored before. I think that deserves a deeper discussion, but I also understand that we need implementation experience and experimentation to back the RFC with data. This wasn't clear to me initially. The PR title and summary referred to suppression, which didn't motivate this change well enough to touch the engine. Last time I checked the loop handling RFC Discourse post, I thought you had finally decided to take @haoNoQ's suggestion - and suppress issues instead of changing exploration strategies (if I understood that correctly).

Currently, my only goal is to stabilize ArrayBoundV2 ASAP and then continue with cleaning up the other bounds-checking checkers. However, I think that generalizing this (currently checker-specific) suppression to outright skipping those execution paths is a plausible direction of future development, and I think it would be nice to implement the current goals in a way that supports this direction.

(As I said earlier, I'm strongly opposed to a solution where the visitor-based suppression is activated by each checker [or many checkers] because then we'd be discarding a significant percentage (perhaps even the majority) of work done by the analyzer.)

Also, when I discussed this patch with @Szelethus, he said that he vaguely recalls from old test result reviews that these "weak loop assumptions" also cause some false positives in certain stable core checkers, which implies that we should implement this suppression in a way that can be generalized to other checkers.

Edit: the very recent bug report #112527 is another example where applying this heuristic to core.UndefinedBinaryOperatorResult would've probably prevented the false positive.

That said, if this patch only wants to suppress reports, then a bug report visitor is the right choice to me. If we shoot for something bigger (and this is just the first step), then it starts to make sense. However, achieving something like that is going to be really, really tough, so the chances are slim.

I don't think that discarding the execution paths with these "weak" assumptions would be a really big change:

  • It only discards paths, so it cannot introduce any false positives (unlike the loop widening plans, which performs invalidation).
  • I don't think that there is a systemic class of bugs that are only found on these "weak assumption execution path" branches and can be distinguished algorithmically from the highly unwanted results.

I wonder if you have plans for doing it incrementally [...]

My roadmap roughly looks like:

  • clean up this PR and merge this when it's good enough to be useful;
  • refine this heuristic with follow-up commits that provide more accurate iteration counts and cover loop conditions that contain short-circuiting operators;
  • evaluate a change that replaces the ArrayBoundV2-specific result suppression with completely discarding the execution paths with "weak" assumptions
  • merge this if it turns out to be an improvement.
    (I didn't think too much about this yet.)

[...] and with options for disabling it while developing and preparing this new version.

Obviously I can put the freshly developed stuff behind analyzer options if there's demand for it. In the case of this ArrayBoundV2 change I did not do so, because ArrayBoundV2 is an alpha checker and it's very noisy without this patch, but if I extend this to stable checkers I'll probably introduce an option.

[RE OrigCursor trickery] 1st, if we always had an ExplodedNode tag for taking an eager decision (and some similar tag when eager assumptions are disabled), then I wouldn't need that hack.

I think "did we make a new assumption here?" is canonically answered by "does this node have two (or more) children?" and this information shouldn't be duplicated elsewhere.

Obviously, it's fine to have a tag which e.g. marks that "here we make an eager assumption" (or more precisely, those tags currently mean that "if we made an assumption here, then it's an eager assumption", because they're added even if there isn't an actual bifurcation) -- but we should not aim to record all assumptions as tags.
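For illustration, this canonical check can be read directly off the original (unstripped) graph -- assuming we have the corresponding node of the original graph at hand, which is what the OrigCursor trickery provides. A minimal sketch (ExplodedNode::succ_size() is an existing accessor):

#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"

// A node where the engine made an assumption was bifurcated, so it has more
// than one successor in the original graph.
static bool madeAssumptionAt(const clang::ento::ExplodedNode *OrigNode) {
  return OrigNode->succ_size() > 1;
}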

[RE OrigCursor trickery] 2nd, it's a convenient lie to have a stripped single-path "exploded graph" from a BugReport visitor, because then it's unambiguous what parent nodes it chooses, thus developers can focus on what matters the most. However, like in this case, it would be nice if we could do arbitrary things. If we had a better API for doing this, it wouldn't feel odd anymore. So there are ways to improve this.

I think the natural way to represent an execution path would be just an array of ExplodedNodes, and the checkers could just use its (completely ordinary) iterators to follow the bug path. This mutilation of the exploded graph seems to be a really stupid idea (which probably means that I'm missing some important goal -- e.g. perhaps they want to release the memory used to store the other nodes?).
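Something like this hypothetical representation (not an existing API) is what I have in mind:

#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
#include "llvm/ADT/SmallVector.h"

// The bug path as a plain sequence, ordered from the entry point to the error
// node; visitors would simply iterate it instead of walking a mutated graph.
using BugPath = llvm::SmallVector<const clang::ento::ExplodedNode *, 32>;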

Yes, I agree that this "try all subexpressions" (that I implemented) isn't bulletproof. However, I only did it in slightly more than a day. I'd say, if this is a concern, I can give it a bit of a polishing, and put it to the bench to see how it works in life.

I don't think that this is a significant concern; your visitor seems to be good enough for practical use. I only highlighted this part to show that the two implementations are roughly equal in this area (they are both good enough in practice, but both of them have inelegant corner cases).

@steakhal @ anybody else Is my eager implementation acceptable for you (in theory, assuming that it's cleaned up sufficiently)? If yes, I'll start to work on implementing the suggestions of @isuckatcs and any other observations.

I don't think we reached consensus yet.

I see. What can I do to approach consensus? Is there some aspect of either of the implementations that we should discuss in more detail? Or should we ask other contributors?

I'm less concerned about the details, because I know you can do it. This is why I didn't look into the details of this PR or comment on nits so far.

Understandable; I'm also delaying detail-oriented work (on both alternative implementations) until we decide on the way forward.

@steakhal
Contributor

Thank you for sharing your roadmap.
I don't remember there being consensus on the direction of dropping execution paths in the related RFC.
Nevertheless, the direction seems interesting, but without consensus I'm not sure we should pursue it in upstream LLVM.

(As I said earlier, I'm strongly opposed to a solution where the visitor-based suppression is activated by each checker [or many checkers] because then we'd be discarding a significant percentage (perhaps even the majority) of work done by the analyzer.)

My problem with this approach is still that it's intrusive. We can only cut those branches once all checkers agree on suppressing all the reports coming from those branches. And it's not only about upstream checkers: there could be downstream checkers that, for example, do much more with the ExplodedGraph than just reporting issues.

One particular (slightly broken) checker that crosses my mind in this case is UnreachableCodeChecker, which checks Eng.hasWorkRemaining() in the checkEndAnalysis callback to see whether we explored all possible branches in the top-level function we just analyzed. If so, it checks which BasicBlocks in the CFG were left unvisited to diagnose unreachable code. I'm not implying that this particular checker is of high quality or used in production (AFAIK); all I'm saying is that cases like this make me skeptical about not exploring these branches. This makes the future-proofing argument weaker to me.
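Roughly the pattern in question (paraphrased from memory, not a verbatim quote of UnreachableCodeChecker.cpp):

#include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"

using namespace clang;
using namespace ento;

// Sketch of the relevant part of the check::EndAnalysis callback.
static void checkEndAnalysisSketch(ExplodedGraph &G, BugReporter &B,
                                   ExprEngine &Eng) {
  // The checker only trusts its coverage data if the engine finished
  // exploring every enqueued path of the top-level function.
  if (Eng.hasWorkRemaining())
    return;
  // ... otherwise it collects the CFG blocks visited by the ExplodedNodes in
  // G and reports the ones that were never reached as unreachable code.
}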

Edit: the very recent bug report #112527 is another example where applying this heuristic to core.UndefinedBinaryOperatorResult would've probably prevented the false positive.

Another way of looking at that issue is that if we had inlined std::array::size(), we would have known that the given branch is infeasible, like in this older case. In any case, I wouldn't draw far-reaching conclusions.

I don't think that discarding the execution paths with these "weak" assumptions would be a really big change:

  • It only discards paths, so it cannot introduce any false positives (unlike the loop widening plans, which performs invalidation).
  • I don't think that there is a systemic class of bugs that are only found on these "weak assumption execution path" branches and can be distinguished algorithmically from the highly unwanted results.

I think different tool vendors may want to target different FP/TP rates. This argument only holds if we assume it's fine to drop TPs in favor of dropping FPs. To me, it always depends. As a general guideline, I agree with your sentiment though. With mass changes like this, I'm not sure anymore without empirical evidence.

[...] and with options for disabling it while developing and preparing this new version.

Obviously I can put the freshly developed stuff behind analyzer options if there's demand for it. In the case of this ArrayBoundV2 change I did not do so, because ArrayBoundV2 is an alpha checker and it's very noisy without this patch, but if I extend this to stable checkers I'll probably introduce an option.

The ArrayBoundV2 changes make sense, exactly because it's an alpha checker like you said. I'm not challenging that.

@steakhal @ anybody else Is my eager implementation acceptable for you (in theory, assuming that it's cleaned up sufficiently)? If yes, I'll start to work on implementing the suggestions of @isuckatcs and any other observations.

I don't think we reached consensus yet.

I see. What can I do to approach consensus? Is there some aspect of either of the implementations that we should discuss in more detail? Or should we ask other contributors?

I had a slight discomfort about implementing something in the engine that could be done at a higher level, like in a BR visitor.
What you said about dropping execution paths makes sense and would potentially bring a huge improvement in multiple aspects, such as the TP/FP rate and even analysis time. Research and experimentation tackling these issues are really important. Perfect is the enemy of good, so I'm now fine with doing this in the engine.

However, I still have major discomfort about extending this beyond OOBv2 and skipping the analysis of execution paths that we do explore today - like you shared in your plans. To convince me, I'd need evidence of the net benefit without any major blocking issues (in other words, losing truly valuable TPs, for example, or finding out that some internal components couldn't be upgraded to the new loop modeling). This is a lot more difficult, and may not even be practical to do incrementally.
The LLVM policies discourage feature branches, but they don't formulate opinions on forks.
Maybe you would have more flexibility to reach a stable version in a fork than by developing this directly upstream.
That way, once the prototype is complete, or reaches a maturity where we could gather data supporting the claim that this is how loops should be handled, the maintainers could discuss whether they agree with these claims and then accept the patches.

I think the more experienced Static Analyzer devs participate in the discussion, the better.
I'd definitely expect the code owners to express their opinion, and to reach unanimous consensus before accepting patches that change how loops are modeled in general.

Given the title, summary and content of this PR, I'm convinced that this isn't the right forum, though.
This should be done on the LLVM Discourse.

@Xazax-hun
Collaborator

I am really glad that there is a consensus on the behavior we want, and that the only contention is how we get there.

It would be great to summarize the pros and cons of the two approaches, but before someone does that, I wanted to make it clear that I think some of the arguments might not hold much water (but let me know if you disagree). Specifically, the fact that other checkers might also benefit from these new suppression heuristics does not imply that either of the approaches is more desirable than the other. There are visitors shared across the checks, so I believe both approaches could be easily adopted by other checks.

Moreover, I don't think the future discussions about changing path exploration heuristics should get too much weight either. I think if we start to cut some execution paths, we will no longer need the state trait to check whether we saw any assumptions, and we will no longer need the bug visitor either. I think changing the path exploration heuristics in the engine is a relatively straightforward change; the challenge is to validate that the user experience and the results actually get better, rather than making the code change itself.

My point is, I think in this case we should evaluate the approaches on their own merits, and future plans or adoptability to other checks should not play a big role in the decision because I do not see a clear advantage there for either of the solutions. So let's focus the discussion on comparing the correctness, complexity, separation of concerns, and performance of the proposed solutions without the diversions.

@Xazax-hun
Collaborator

Xazax-hun commented Oct 21, 2024

Some random, unorganized questions:

  • Any reason we cannot make block counts available to bug reporter visitors and get the best of both worlds (a relatively simple implementation and separation of concerns)?
  • Does either of the approaches have trouble dealing with short-circuiting operators in the loop condition?
  • Would it make sense to have a combined approach? E.g., the engine could add a ProgramPointTag where weak assumptions happened, and a bug reporter visitor could look for those tags (a small sketch follows this list). I wonder if that approach would make reading exploded graphs easier when investigating false positives.
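A small sketch of the combined idea (SimpleProgramPointTag and ProgramPoint::getTag() are existing APIs; the tag text and the helper function are made up):

#include "clang/Analysis/ProgramPoint.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"

using namespace clang;
using namespace ento;

// Engine side: pass &WeakLoopAssumptionTag when generating the successor node
// of a questionable loop branch, so the decision is recorded in the graph
// (and shows up in exploded graph dumps).
static const SimpleProgramPointTag WeakLoopAssumptionTag("ExprEngine",
                                                         "Weak loop assumption");

// Visitor side: a predicate that a suppression visitor (or a human reading an
// exploded graph dump) could use to spot the marked nodes.
static bool isWeakLoopAssumptionNode(const ExplodedNode *N) {
  return N->getLocation().getTag() == &WeakLoopAssumptionTag;
}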

@NagyDonat
Contributor Author

I'm abandoning this PR because I decided to return to writing general commits which improve the overall logic of the engine instead of just suppressing results from one particular (heavily affected) checker. (Moreover, this PR would've needed significant rebasing and code quality improvements.)

I already uploaded my first general improvement PR, #119388 "Don't assume third iteration in loops", which incorporates many fragments from this commit (but it has somewhat different goals and I hope that I was able to improve the code quality a bit).

@NagyDonat NagyDonat closed this Dec 10, 2024