One big limitation in Rust-GPU is that inlining of all eligible callsites in a SPIR-V module is performed as a separate monolithic step. This is very inefficient, because the legalizations/simplifications unlocked once individual callsites are inlined can greatly reduce the amount of IR, compared to what is produced when everything is inlined first.
In fact, it seems exponentially bad (at least for functions with multiple callsites):
rustc_codegen_spirv taking a long time processing my (large) shader (rust-gpu#851)
A more efficient design would interleave inlining and local transformations (i.e. simplify a function as much as possible before ever inlining it in turn). That requires some kind of framework for "apply these function-local passes repeatedly until they stop wanting to change anything", which can then be wrapped in an "IPO" (inter-procedural optimization) pass manager that interleaves it with inlining (and any other "global" changes).
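To make the shape of such a framework concrete, here is a minimal sketch on a toy IR. Everything in it is hypothetical and for illustration only: `Inst`, `LocalPass`, `fold_adds`, `remove_zero_adds`, `run_to_fixpoint`, `inline_calls` and `optimize_interleaved` are not SPIR-T or Rust-GPU APIs. It also assumes functions are stored callees-first (a callee always has a smaller index than its caller) and that there is no recursion.

```rust
// Toy IR (hypothetical, not SPIR-T): a function body is a flat list of instructions.
#[derive(Clone, Debug, PartialEq)]
enum Inst {
    Add(i32),    // add a constant to an implicit accumulator
    Call(usize), // call the function with this index
}

// A function-local pass reports whether it changed the body.
type LocalPass = fn(&mut Vec<Inst>) -> bool;

// Local pass: merge runs of adjacent `Add`s into a single `Add`.
fn fold_adds(body: &mut Vec<Inst>) -> bool {
    let mut changed = false;
    let mut out: Vec<Inst> = Vec::with_capacity(body.len());
    for inst in std::mem::take(body) {
        if let (Some(Inst::Add(a)), Inst::Add(b)) = (out.last_mut(), &inst) {
            *a += *b;
            changed = true;
            continue;
        }
        out.push(inst);
    }
    *body = out;
    changed
}

// Local pass: drop `Add(0)`, which has no effect.
fn remove_zero_adds(body: &mut Vec<Inst>) -> bool {
    let before = body.len();
    body.retain(|inst| *inst != Inst::Add(0));
    body.len() != before
}

// "Apply these function-local passes repeatedly until they stop wanting to change anything."
fn run_to_fixpoint(body: &mut Vec<Inst>, passes: &[LocalPass]) {
    loop {
        let mut changed = false;
        for pass in passes {
            changed |= pass(body);
        }
        if !changed {
            break;
        }
    }
}

// Inline every call in `funcs[caller]` whose callee has a smaller index
// (assumption: such callees were already simplified).
fn inline_calls(caller: usize, funcs: &mut [Vec<Inst>]) -> bool {
    let (callees, rest) = funcs.split_at_mut(caller);
    let body = &mut rest[0];
    let mut changed = false;
    let mut out = Vec::new();
    for inst in std::mem::take(body) {
        match inst {
            Inst::Call(callee) if callee < caller => {
                out.extend(callees[callee].iter().cloned());
                changed = true;
            }
            other => out.push(other),
        }
    }
    *body = out;
    changed
}

// The "IPO" driver: simplify each function, inline already-simplified callees
// into it, then simplify again before any of its own callers get to inline it.
fn optimize_interleaved(funcs: &mut [Vec<Inst>], passes: &[LocalPass]) {
    for caller in 0..funcs.len() {
        run_to_fixpoint(&mut funcs[caller], passes);
        if inline_calls(caller, funcs) {
            run_to_fixpoint(&mut funcs[caller], passes);
        }
    }
}

// Usage: f2 calls f1 twice, and f1 calls f0 twice.
fn example() -> Vec<Vec<Inst>> {
    let mut funcs = vec![
        vec![Inst::Add(1), Inst::Add(-1)],
        vec![Inst::Call(0), Inst::Add(2), Inst::Call(0)],
        vec![Inst::Call(1), Inst::Call(1), Inst::Add(-4)],
    ];
    let passes: [LocalPass; 2] = [fold_adds, remove_zero_adds];
    optimize_interleaved(&mut funcs, &passes);
    funcs
}
```

In this sketch each callee is copied into its callers only in its already-simplified form, so the duplicated IR stays small even when a function has multiple callsites. A real pass manager would also have to handle recursion, arbitrary call graph ordering, and "global" changes other than inlining, none of which this toy attempts.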
This is not in itself a blocker for the initial Rust-GPU integration, but it is high-priority, because it's the bigger win SPIR-T can offer in terms of reducing Rust-GPU compile times.
Longer-term, there are more interesting alternatives like "eqsat" (equality saturation), but for now "rerun until fixpoint" seems fine.