Should be able to respect `Op{Selection,Loop}Merge`s in `cfg::Structurizer`. #11

eddyb · 2022-11-24T08:11:48Z

While Rust-GPU might not need it, it would help testing a lot to have restructurization respect existing merges.
_{(e.g. SPIR-V -> SPIR-T -> SPIR-V -> SPIR-T -> restructurize -> SPIR-V should produces the same SPIR-V from the middle lifting as in the final one, and it should be possible to feed large amounts of existing shader sources to this, after passing them through whatever compilers they need)}

There are two parts to this:

1. preserving `Op{Selection,Loop}Merge`s in `cfg::ControlFlowGraph`

the structurizer (see below) only needs the merge point (i.e. the exit of the SPIR-V region), but if restructurization isn't performed, spv::lift should have access to "loop continue" points too
{Selection,Loop}Control can be kept in Attr::SpvBitflagsOperand (in cfg::ControlInst's attrs)

2. enforcing merge points in the CFG structurizer

"loop continue" target likely irrelevant here, since it's "just" some point that is postdominated by the back-edge block, and AFAIK it has no impact on recovergence
(see also spv/lift: use loop body entry (instead of exit), as the "continue" target. #10)
the fundamental ambiguity here (wrt reconvergence), that explicit merges solve, looks like this:
```
loop {
    // ...
    if cond {  // the "then" edge *leaves the loop*
        A();   // !!! it might be important that this STAYS INSIDE the loop!
        break; // this is just a linear A -> B edge in the CFG (outside the loop's cycle)
    }
}
B(); // !!! it might be important that this STAYS OUTSIDE the loop!
```
because the A() -> B() edge just looks like a redundant fusable edge (isomorphic to a single A(); B(); basic block), structurizers will very likely either produce:
- loop { ... if cond { break; } } A(); B(); (moving A() out of the loop)
  - A() no longer called for "early finishers" (losing a desired side-effect)
- loop { ... if cond { A(); B(); break; } } (moving B() into the loop)
  - B() now being called for "early finishers" (gaining an undesired side-effect)
and they're both bad because a non-uniform break (i.e. an "early finisher") will block on subgroup neighbors (semantically, at least, in hardware the now-inactive lanes will likely stick around while loop instructions continue being executed for remaining lanes, until they all eventually break)
as SPIR-T's cfg::Structurizer is "greedy", we only have to worry about avoiding pulling the merge into the loop
- (i.e. anything dominated by the loop entry will always be inside the loop, but the merge is dominated too)
- we can "simply" take out the merge before structurizing the targets of the loop header
  - easiest way to do this would be to set its state to StructurizeRegionState::InProgress, which will make it look like a cycle (i.e. like a backedge to an outer loop)
  - this will reliably cause deferral of breaking edges (from inside the loop body, to the merge)
- we then put it back in afterwards, and either:
  - trigger an incorporation attempt - would need to factor out that logic from structurize_region_from though
    (could even assert that the loop did contain all the "incoming edge counts" for the merge block?)
  - or just leave it for some outer structured construct (but that may produce more confusing/worse quality structured control-flow? would need testing either way)

Addendum (forcing structured loop merges from Rust code)

(EDIT: I don't want to hide the text below, because it is interesting, but I made it tiny text, and you'll want to go to EmbarkStudios/rust-gpu#944 for that part of the discussion)

_{^{The above example is sadly made lossy in Rust-GPU by rustc's MIR not having much AST information, and being a CFG already (though to some extent it could perhaps be recovered from debuginfo).}}
_{^{However, not all hope is lost, as anything that can still reach a backedge must be in the loop (by definition):}}

_{^{loop {
// ...
if cond { // the "then" edge CANNOT leave the loop yet - backedge still reachable
A(); // !!! definitely INSIDE the loop

// `if true` *except* the constant is obscured, until *after* structurization,
// introducing a control dependency *only* during structurization, forcing
// `A()` to remain *inside* the loop
// (MIR calls such things "false edges", but "false" feels ambiguous here)
if control_dep_opaque_true() {
break;
}
}
}
B(); // !!! still a problem if we greedily pull it into the loop}}

_{^{At a glance, not sure B() can be kept out of the loop with such shenanigans, without switching the behavior from "greedy" to "lazy" - but it's not obvious how to do that without computing the SCC (via Tarjan) of the loop, and treating any edges leaving the SCC as break edges (which could move too much out of the loop).}}
_{^{A better approach might be to "just" make up a SPIR-T instruction that acts as "loop merge barrier", i.e.:}}

_{^{loop {
// ...
if cond {
A(); // !!! definitely INSIDE the loop
break; // just an edge to the `LoopMerge`, in the CFG
}
}
asm!("spirt.ControlInst.LoopMerge");
B(); // !!! definitely OUTSIDE the loop}}

_{^{Each LoopMerge would be consumed by one loop structurization - if you have nested loops, you'd need to be careful not to forget to add them to each loop.}}
_{^{But wait, even if we make this ergonomic with macros... we'd still want to check that you don't use any instructions that "care" about e.g. subgroups, right? That could be e.g. a MIR check pass, right? Which we can inject... just like...}}
_{^{Yes, that's right, rustc_codegen_spirv has enough query overriding power that it could override thir_body, which MIR construction uses as the source of truth, to introduce additional nodes around loops, allowing Rust to be on par with GLSL wrt structured control-flow guarantees.}}

The text was updated successfully, but these errors were encountered:

…aximal"). (#48) Quoting from the comment I added to the code: > This analysis was added to because of two observations wrt "reconvergence": > 1. syntactic loops (from some high-level language), when truly structured > (i.e. only using `while`/`do`-`while` exit conditions, not `break` etc.), > *always* map to "minimal loops" on a CFG, as the only loop exit edge is > built-in, and no part of the syntactic "loop body" can be its successor > 2. more pragmatically, compiling shader languages to SPIR-V seems to (almost?) > always *either* fully preserve syntactic loops (via SPIR-V `OpLoopMerge`), > *or* structurize CFGs in a way that produces "minimal loops", which can > be misleading with explicit `break`s (moving user code from just before > the `break` to after the loop), but is less impactful than "maximal loops" See also: * #11 (of which this only fixes the loop side, and doesn't even keep the bitflags)

eddyb mentioned this issue Nov 24, 2022

We should override thir_body to inject loop merge points for structurization. EmbarkStudios/rust-gpu#944

Open

eddyb self-assigned this Mar 31, 2023

eddyb mentioned this issue Sep 28, 2023

cfg: use Tarjan's SCC algorithm to get "minimal loops" (instead of "maximal"). #48

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should be able to respect `Op{Selection,Loop}Merge`s in `cfg::Structurizer`. #11

Should be able to respect `Op{Selection,Loop}Merge`s in `cfg::Structurizer`. #11

eddyb commented Nov 24, 2022 •

edited

Loading

Should be able to respect Op{Selection,Loop}Merges in cfg::Structurizer. #11

Should be able to respect Op{Selection,Loop}Merges in cfg::Structurizer. #11

Comments

eddyb commented Nov 24, 2022 • edited Loading

1. preserving Op{Selection,Loop}Merges in cfg::ControlFlowGraph

2. enforcing merge points in the CFG structurizer

Addendum (forcing structured loop merges from Rust code)

(EDIT: I don't want to hide the text below, because it is interesting, but I made it tiny text, and you'll want to go to EmbarkStudios/rust-gpu#944 for that part of the discussion)

Should be able to respect `Op{Selection,Loop}Merge`s in `cfg::Structurizer`. #11

Should be able to respect `Op{Selection,Loop}Merge`s in `cfg::Structurizer`. #11

eddyb commented Nov 24, 2022 •

edited

Loading

1. preserving `Op{Selection,Loop}Merge`s in `cfg::ControlFlowGraph`