Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[STAL-2713] Implement phi nodes #512

Merged
merged 3 commits into from
Oct 2, 2024
Merged

[STAL-2713] Implement phi nodes #512

merged 3 commits into from
Oct 2, 2024

Conversation

jasonforal
Copy link
Collaborator

What problem are you trying to solve?

Our taint analysis implementation in #493 in omitted the handling of conditional (and mutually exclusive) control flow. This leads to incorrect analysis:
For example:

String y = alt0;
if (condition) {
    y = alt1;
} else {
    y = alt2;
}
y;

The above program is currently treated as:

String y = alt0;
{
    y = alt1;
} {
    y = alt2;
}
y;

This leads to a flow graph that incorrectly believes the y at line 7 can never be alt1 because it always gets re-assigned to alt2:

image

What is your solution?

The technique chosen takes inspiration from static single-assignment (SSA) form and how it handles merge points in a control-flow graph (CFG).

Specifically, we introduce phi nodes to our flow graph. Phi nodes are able to encapsulate control flow constructs that have mutually exclusive branches (e.g if/else). Upon entering a control flow construct with branches, we record all assignments and treat them as possible. Then, when exiting the control flow construct, we reconcile/merge the possible assignments, and then redefine the variable as the phi node.

Our graph then correctly becomes:

image

The above program represents an exhaustively-assigned variable. In the case where a variable isn't exhaustively assigned, the phi node references the definition of the variable before entering the control flow construct. For example:

String y = alt0;
if (condition) {
    y = alt1;
} else if (otherCondition) {
    y = alt2;
}
y;
image

Phi operands can be other phi nodes in the case of nested control flow:

String y = alt0;
if (condition) {
    y = alt1;
    if (otherCondition) {
        y = alt2;
    }
}
y;
image

Technical Details

  • A Digraph node can now be either a CST node or a phi node. This is implemented as an additional bit shift for the performance reasons outlined in [STAL-2195] Initial implementation of intra-method taint analysis in Java #493 and is especially important now that graph edges need to encode whether the node is a CST or a phi node.
  • A traditional way to determine if a variable is exhaustively assigned would be to construct a CFG and then construct a dominance frontier based on the CFG. Instead, we use an unconventional (but radically simpler) implementation where we just manually annotate if branch represents an consequent or alternative, and track which branches assignments occur in. (This is a good solution given we never construct an explicit CFG).

Limitations

  • Path reachability is not performed. if (false) { y = alt1 } will be treated as a possible execution path.
  • The "scope"s implemented are control flow scopes. Variable name scopes are yet to be implemented.

Alternatives considered

What the reviewer should know

  • A traversal context is added just to accommodate future extensions.
  • "ScopeBlock" and "ConditionalBlock" are CFG concepts -- while somewhat confusing, they should not be conflated with a CST block (defined by the code syntax "{ ... }")

@jasonforal jasonforal requested a review from juli1 September 20, 2024 17:12
@jasonforal jasonforal requested a review from a team as a code owner September 20, 2024 17:12
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 20, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 24, 2024
@DataDog DataDog deleted a comment from datadog-datadog-prod-us1 bot Sep 24, 2024
@jasonforal jasonforal merged commit 0b5e150 into main Oct 2, 2024
70 checks passed
@jasonforal jasonforal deleted the jf/STAL-2713 branch October 2, 2024 20:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants