-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[analyzer] Widen loops: Augment assignment note when values invalidated at the start of a loop #122398
base: main
Are you sure you want to change the base?
[analyzer] Widen loops: Augment assignment note when values invalidated at the start of a loop #122398
Conversation
…ed at the start of a loop This patch augments the assignment note to inform the user when a value is invalidated and reassigned a conjured symbol as part of loop widening.
@llvm/pr-subscribers-clang-static-analyzer-1 @llvm/pr-subscribers-clang Author: Douglas (dgg5503) ChangesThis PR attempts to improve the value assignment diagnostic specifically for the case where The motivation behind this diagnostic is the fact that the current diagnostic is a bit confusing at first glance. For example: class A {
public:
void m_fn1();
};
void fn1() {
A a;
A *b = &a;
for (;;) {
(void)!b;
b->m_fn1();
}
}
The message In such cases, I'd like to make the diagnostic a bit more verbose. I propose something like:
This indicates to the user that The implementation I present in this PR minimally passes all Clang static analysis LIT tests, however, I am not confident the approach I've taken is idiomatic with respect to the design of Clang's static analysis. I have what I think is a slightly more general solution here where I am open to any and all suggestions as my knowledge in the Clang static analyzer is limited. Full diff: https://github.com/llvm/llvm-project/pull/122398.diff 5 Files Affected:
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
index 20c446e33ef9a5..6017b7e0ea8f81 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
@@ -290,6 +290,10 @@ class ExprEngine {
/// This tag applies to a node created after removeDead.
static const ProgramPointTag *cleanupNodeTag();
+ /// A tag to track when loop widening occurs.
+ /// This tag applies to a node created after getWidenedLoopState.
+ static const ProgramPointTag *loopWideningNodeTag();
+
/// processCFGElement - Called by CoreEngine. Used to generate new successor
/// nodes by processing the 'effects' of a CFG element.
void processCFGElement(const CFGElement E, ExplodedNode *Pred,
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
index a9b4dbb39b5bd6..53494b5cf9cca8 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
@@ -1353,6 +1353,49 @@ static void showBRDefaultDiagnostics(llvm::raw_svector_ostream &OS,
OS << " to ";
SI.Dest->printPretty(OS);
}
+
+ // If the value was part of a loop widening node and its value was
+ // invalidated (i.e. replaced with a conjured symbol) then let the user know
+ // that it was due to loop widening.
+ if (SI.StoreSite && SI.Dest &&
+ SI.StoreSite->getLocation().getTag() ==
+ ExprEngine::loopWideningNodeTag()) {
+
+ // TODO: Is it always the case that there's only one predecessor?
+ assert(SI.StoreSite->hasSinglePred() &&
+ "more than one predecessor found, this needs to be handled...");
+ if (const ExplodedNode *previous = SI.StoreSite->getFirstPred()) {
+ // Was the associated memory region for this variable changed from
+ // non-Symbolic to Symbolic because of invalidation?
+ // TODO: Better yet, if there was simply a way to know if a given
+ // SVal / MemRegion was invalidated as part of the current state,
+ // then that should be more robust than what's going on here.
+ // Could we somehow make use of "RegionChanges" /
+ // "runCheckersForRegionChanges" and the ExplicitRegions parameter.
+ // Still need to somehow know when a particular Global
+ // Variable is invalidated. I have not found this to be possible with
+ // "RegionChanges" unless I'm missing something...
+ const ProgramState *lastState = previous->getState().get();
+ const SVal &lastSVal = lastState->getSVal(SI.Dest);
+ const SymbolRef valueSymbol = SI.Value.getAsSymbol(true);
+ const SymbolRef lastSymbol = lastSVal.getAsSymbol(true);
+
+ bool isNowSymbolic = false;
+ if (valueSymbol) {
+ if (!lastSymbol) {
+ // Last state did not have a symbol for SI.Value in current state.
+ // Probably (?) invalidated.
+ isNowSymbolic = true;
+ } else {
+ // If the symbol types differ, then there was likely invalidation
+ isNowSymbolic = valueSymbol->getKind() != lastSymbol->getKind();
+ }
+ }
+
+ if (isNowSymbolic)
+ OS << " (invalidation as part of widen-loops made this symbolic here)";
+ }
+ }
}
static bool isTrivialCopyOrMoveCtor(const CXXConstructExpr *CE) {
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index ff8bdcea9a2201..4660edbf3dde78 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -1109,6 +1109,11 @@ const ProgramPointTag *ExprEngine::cleanupNodeTag() {
return &cleanupTag;
}
+const ProgramPointTag *ExprEngine::loopWideningNodeTag() {
+ static SimpleProgramPointTag loopWideningTag(TagProviderName, "Widen Loop");
+ return &loopWideningTag;
+}
+
void ExprEngine::ProcessStmt(const Stmt *currStmt, ExplodedNode *Pred) {
// Reclaim any unnecessary nodes in the ExplodedGraph.
G.reclaimRecentlyAllocatedNodes();
@@ -2546,7 +2551,7 @@ void ExprEngine::processCFGBlockEntrance(const BlockEdge &L,
const LocationContext *LCtx = Pred->getLocationContext();
ProgramStateRef WidenedState =
getWidenedLoopState(Pred->getState(), LCtx, BlockCount, Term);
- nodeBuilder.generateNode(WidenedState, Pred);
+ nodeBuilder.generateNode(WidenedState, Pred, loopWideningNodeTag());
return;
}
diff --git a/clang/test/Analysis/loop-widening-notes.cpp b/clang/test/Analysis/loop-widening-notes.cpp
index a3f030dfe98826..ac98d81c23475e 100644
--- a/clang/test/Analysis/loop-widening-notes.cpp
+++ b/clang/test/Analysis/loop-widening-notes.cpp
@@ -9,7 +9,7 @@ int test_for_bug_25609() {
bar();
for (int i = 0; // expected-note {{Loop condition is true. Entering loop body}}
// expected-note@-1 {{Loop condition is false. Execution continues on line 16}}
- ++i, // expected-note {{Value assigned to 'p_a'}}
+ ++i, // expected-note {{Value assigned to 'p_a' (invalidation as part of widen-loops made this symbolic here)}}
i < flag_a;
++i) {}
@@ -23,7 +23,7 @@ int while_analyzer_output() {
flag_b = 100;
int num = 10;
while (flag_b-- > 0) { // expected-note {{Loop condition is true. Entering loop body}}
- // expected-note@-1 {{Value assigned to 'num'}}
+ // expected-note@-1 {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
// expected-note@-2 {{Loop condition is false. Execution continues on line 30}}
num = flag_b;
}
@@ -45,7 +45,7 @@ int do_while_analyzer_output() {
do { // expected-note {{Loop condition is true. Execution continues on line 47}}
// expected-note@-1 {{Loop condition is false. Exiting loop}}
num--;
- } while (flag_c-- > 0); //expected-note {{Value assigned to 'num'}}
+ } while (flag_c-- > 0); //expected-note {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
int local = 0;
if (num == 0) // expected-note {{Assuming 'num' is equal to 0}}
// expected-note@-1 {{Taking true branch}}
@@ -59,7 +59,7 @@ int test_for_loop() {
int num = 10;
for (int i = 0; // expected-note {{Loop condition is true. Entering loop body}}
// expected-note@-1 {{Loop condition is false. Execution continues on line 67}}
- new int(10), // expected-note {{Value assigned to 'num'}}
+ new int(10), // expected-note {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
i < flag_d;
++i) {
++num;
@@ -73,7 +73,7 @@ int test_for_loop() {
int test_for_range_loop() {
int arr[10] = {0};
- for(auto x : arr) { // expected-note {{Assigning value}}
+ for(auto x : arr) { // expected-note {{Assigning value (invalidation as part of widen-loops made this symbolic here)}}
++x;
}
if (arr[0] == 0) // expected-note {{Assuming the condition is true}}
diff --git a/clang/test/Analysis/loop-widening.cpp b/clang/test/Analysis/loop-widening.cpp
index fbcb72dee160ae..0b05267c463fc3 100644
--- a/clang/test/Analysis/loop-widening.cpp
+++ b/clang/test/Analysis/loop-widening.cpp
@@ -16,7 +16,7 @@ void fn1() {
for (;;) { // expected-note{{Loop condition is true. Entering loop body}}
// expected-note@-1{{Loop condition is true. Entering loop body}}
- // expected-note@-2{{Value assigned to 'b'}}
+ // expected-note@-2{{Value assigned to 'b' (invalidation as part of widen-loops made this symbolic here)}}
// no crash during bug report construction
g = !b; // expected-note{{Assuming 'b' is null}}
|
// TODO: Better yet, if there was simply a way to know if a given | ||
// SVal / MemRegion was invalidated as part of the current state, | ||
// then that should be more robust than what's going on here. | ||
// Could we somehow make use of "RegionChanges" / | ||
// "runCheckersForRegionChanges" and the ExplicitRegions parameter. | ||
// Still need to somehow know when a particular Global | ||
// Variable is invalidated. I have not found this to be possible with | ||
// "RegionChanges" unless I'm missing something... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to implement this in main...dgg5503:llvm-project:dgg5503/main/invalidation-diagnostic-b but all MemRegion
s from the ProgramState
and before are tracked. I'm wondering if there's a way to clear InvalidatedMemoryRegionSet
at the start of each ProgramState
, or, if there's a different approach I should be taking...
I agree that it's important to augment the bug reports with information about the source of the symbols. Especially for conjured and derived symbols that are produced widely because of invalidations. Read this post of mine: https://discourse.llvm.org/t/memory-region-invalidation-tracking-improvements/62432 |
This PR attempts to improve the value assignment diagnostic specifically for the case where
widen-loops
is enabled during Clang static analysis.The motivation behind this diagnostic is the fact that the current diagnostic is a bit confusing at first glance. For example:
The message
loop-widening-notes-best.cpp:8:3: note: Value assigned to 'b'
appearing where it is makes technical sense if you understand whatwiden-loops
does behind the scenes. In short,widen-loops
invalidates nearly all values in the current state before the start of the loop. At least in this example, the variableb
is invalidated at the start of the for loop and, as part of the process of invalidation, is reassigned a conjured symbol henceValue assigned to 'b'
. Without that knowledge, it is not intuitive from the diagnostic why the assignment message appears where it does.In such cases, I'd like to make the diagnostic a bit more verbose. I propose something like:
This indicates to the user that
b
has been invalidated and turned into a symbolic representation because of actions taken bywiden-loops
.The implementation I present in this PR minimally passes all Clang static analysis LIT tests, however, I am not confident the approach I've taken is idiomatic with respect to the design of Clang's static analysis. I have what I think is a slightly more general solution here where
MemRegion
invalidations are tracked for eachProgramState
, but it also has its own pitfalls (see TODO comments in the diff). I'd be curious to know if there's a preference in either approach, or, if a different approach should be taken.I am open to any and all suggestions as my knowledge in the Clang static analyzer is limited.