You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While Rust-GPU might not need it, it would help testing a lot to have restructurization respect existing merges. (e.g. SPIR-V -> SPIR-T -> SPIR-V -> SPIR-T -> restructurize -> SPIR-V should produces the same SPIR-V from the middle lifting as in the final one, and it should be possible to feed large amounts of existing shader sources to this, after passing them through whatever compilers they need)
There are two parts to this:
1. preserving Op{Selection,Loop}Merges in cfg::ControlFlowGraph
the structurizer (see below) only needs the merge point (i.e. the exit of the SPIR-V region), but if restructurization isn't performed, spv::lift should have access to "loop continue" points too
{Selection,Loop}Control can be kept in Attr::SpvBitflagsOperand (in cfg::ControlInst's attrs)
the fundamental ambiguity here (wrt reconvergence), that explicit merges solve, looks like this:
loop{// ...if cond {// the "then" edge *leaves the loop*A();// !!! it might be important that this STAYS INSIDE the loop!break;// this is just a linear A -> B edge in the CFG (outside the loop's cycle)}}B();// !!! it might be important that this STAYS OUTSIDE the loop!
because the A() -> B() edge just looks like a redundant fusable edge (isomorphic to a single A(); B(); basic block), structurizers will very likely either produce:
loop { ... if cond { break; } } A(); B(); (moving A()out of the loop)
A() no longer called for "early finishers" (losing a desired side-effect)
loop { ... if cond { A(); B(); break; } } (moving B()into the loop)
B() now being called for "early finishers" (gaining an undesired side-effect)
and they're both bad because a non-uniform break (i.e. an "early finisher") will block on subgroup neighbors (semantically, at least, in hardware the now-inactive lanes will likely stick around while loop instructions continue being executed for remaining lanes, until they all eventually break)
as SPIR-T's cfg::Structurizer is "greedy", we only have to worry about avoiding pulling the merge into the loop
(i.e. anything dominated by the loop entry will always be inside the loop, but the merge is dominated too)
we can "simply" take out the merge before structurizing the targets of the loop header
easiest way to do this would be to set its state to StructurizeRegionState::InProgress, which will make it look like a cycle (i.e. like a backedge to an outer loop)
this will reliably cause deferral of breaking edges (from inside the loop body, to the merge)
we then put it back in afterwards, and either:
trigger an incorporation attempt - would need to factor out that logic from structurize_region_from though
(could even assert that the loop did contain all the "incoming edge counts" for the merge block?)
or just leave it for some outer structured construct (but that may produce more confusing/worse quality structured control-flow? would need testing either way)
Addendum (forcing structured loop merges from Rust code)
(EDIT: I don't want to hide the text below, because it is interesting, but I made it tiny text, and you'll want to go to EmbarkStudios/rust-gpu#944 for that part of the discussion)
The above example is sadly made lossy in Rust-GPU by rustc's MIR not having much AST information, and being a CFG already (though to some extent it could perhaps be recovered from debuginfo). However, not all hope is lost, as anything that can still reach a backedge must be in the loop (by definition):
loop{// ...if cond {// the "then" edge CANNOT leave the loop yet - backedge still reachableA();// !!! definitely INSIDE the loop// `if true` *except* the constant is obscured, until *after* structurization,// introducing a control dependency *only* during structurization, forcing// `A()` to remain *inside* the loop// (MIR calls such things "false edges", but "false" feels ambiguous here)ifcontrol_dep_opaque_true(){break;}}}B();// !!! still a problem if we greedily pull it into the loop
At a glance, not sure B() can be kept out of the loop with such shenanigans, without switching the behavior from "greedy" to "lazy" - but it's not obvious how to do that without computing the SCC (via Tarjan) of the loop, and treating any edges leaving the SCC as break edges (which could move too much out of the loop). A better approach might be to "just" make up a SPIR-T instruction that acts as "loop merge barrier", i.e.:
loop{// ...if cond {A();// !!! definitely INSIDE the loopbreak;// just an edge to the `LoopMerge`, in the CFG}}asm!("spirt.ControlInst.LoopMerge");B();// !!! definitely OUTSIDE the loop
Each LoopMerge would be consumed by one loop structurization - if you have nested loops, you'd need to be careful not to forget to add them to each loop. But wait, even if we make this ergonomic with macros... we'd still want to check that you don't use any instructions that "care" about e.g. subgroups, right? That could be e.g. a MIR check pass, right? Which we can inject... just like... Yes, that's right, rustc_codegen_spirv has enough query overriding power that it could override thir_body, which MIR construction uses as the source of truth, to introduce additional nodes around loops, allowing Rust to be on par with GLSL wrt structured control-flow guarantees.
The text was updated successfully, but these errors were encountered:
…aximal"). (#48)
Quoting from the comment I added to the code:
> This analysis was added to because of two observations wrt
"reconvergence":
> 1. syntactic loops (from some high-level language), when truly
structured
> (i.e. only using `while`/`do`-`while` exit conditions, not `break`
etc.),
> *always* map to "minimal loops" on a CFG, as the only loop exit edge
is
> built-in, and no part of the syntactic "loop body" can be its
successor
> 2. more pragmatically, compiling shader languages to SPIR-V seems to
(almost?)
> always *either* fully preserve syntactic loops (via SPIR-V
`OpLoopMerge`),
> *or* structurize CFGs in a way that produces "minimal loops", which
can
> be misleading with explicit `break`s (moving user code from just
before
> the `break` to after the loop), but is less impactful than "maximal
loops"
See also:
* #11
(of which this only fixes the loop side, and doesn't even keep the
bitflags)
While Rust-GPU might not need it, it would help testing a lot to have restructurization respect existing merges.
(e.g.
SPIR-V -> SPIR-T -> SPIR-V -> SPIR-T -> restructurize -> SPIR-V
should produces the same SPIR-V from the middle lifting as in the final one, and it should be possible to feed large amounts of existing shader sources to this, after passing them through whatever compilers they need)There are two parts to this:
1. preserving
Op{Selection,Loop}Merge
s incfg::ControlFlowGraph
spv::lift
should have access to "loop continue" points too{Selection,Loop}Control
can be kept inAttr::SpvBitflagsOperand
(incfg::ControlInst
'sattrs
)2. enforcing merge points in the CFG structurizer
"loop continue" target likely irrelevant here, since it's "just" some point that is postdominated by the back-edge block, and AFAIK it has no impact on recovergence
(see also spv/lift: use loop body entry (instead of exit), as the "continue" target. #10)
the fundamental ambiguity here (wrt reconvergence), that explicit merges solve, looks like this:
because the
A()
->B()
edge just looks like a redundant fusable edge (isomorphic to a singleA(); B();
basic block), structurizers will very likely either produce:loop { ... if cond { break; } } A(); B();
(movingA()
out of theloop
)A()
no longer called for "early finishers" (losing a desired side-effect)loop { ... if cond { A(); B(); break; } }
(movingB()
into theloop
)B()
now being called for "early finishers" (gaining an undesired side-effect)and they're both bad because a non-uniform
break
(i.e. an "early finisher") will block on subgroup neighbors (semantically, at least, in hardware the now-inactive lanes will likely stick around while loop instructions continue being executed for remaining lanes, until they all eventuallybreak
)as SPIR-T's
cfg::Structurizer
is "greedy", we only have to worry about avoiding pulling the merge into the loopStructurizeRegionState::InProgress
, which will make it look like a cycle (i.e. like a backedge to an outer loop)break
ing edges (from inside the loop body, to the merge)structurize_region_from
though(could even assert that the loop did contain all the "incoming edge counts" for the merge block?)
Addendum (forcing structured loop merges from Rust code)
(EDIT: I don't want to hide the text below, because it is interesting, but I made it tiny text, and you'll want to go to EmbarkStudios/rust-gpu#944 for that part of the discussion)
The above example is sadly made lossy in Rust-GPU by
rustc
's MIR not having much AST information, and being a CFG already (though to some extent it could perhaps be recovered from debuginfo).However, not all hope is lost, as anything that can still reach a backedge must be in the loop (by definition):
At a glance, not sure
B()
can be kept out of the loop with such shenanigans, without switching the behavior from "greedy" to "lazy" - but it's not obvious how to do that without computing the SCC (via Tarjan) of the loop, and treating any edges leaving the SCC asbreak
edges (which could move too much out of theloop
).A better approach might be to "just" make up a SPIR-T instruction that acts as "loop merge barrier", i.e.:
Each
LoopMerge
would be consumed by one loop structurization - if you have nested loops, you'd need to be careful not to forget to add them to each loop.But wait, even if we make this ergonomic with macros... we'd still want to check that you don't use any instructions that "care" about e.g. subgroups, right? That could be e.g. a MIR check pass, right? Which we can inject... just like...
Yes, that's right,
rustc_codegen_spirv
has enough query overriding power that it could overridethir_body
, which MIR construction uses as the source of truth, to introduce additional nodes around loops, allowing Rust to be on par with GLSL wrt structured control-flow guarantees.The text was updated successfully, but these errors were encountered: