-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
inference: refactor the core loops to use less memory (#45276)
Currently inference uses `O(<number of statements>*<number of slots>)` state in the core inference loop. This is usually fine, because users don't tend to write functions that are particularly long. However, MTK does generate functions that are excessively long and we've observed MTK models that spend 99% of their inference time just allocating and copying this state. It is possible to get away with significantly smaller state, and this PR is a first step in that direction, reducing the state to `O(<number of basic blocks>*<number of slots>)`. Further improvements are possible by making use of slot liveness information and only storing those slots that are live across a particular basic block. The core change here is to keep a full set of `slottypes` only at basic block boundaries rather than at each statement. For statements in between, the full variable state can be fully recovered by linearly scanning throughout the basic block, taking note of slot assignments (together with the SSA type) and NewVarNodes. Co-Authored-By: Keno Fisher <keno@juliacomputing.com>
- Loading branch information
Showing
11 changed files
with
518 additions
and
362 deletions.
There are no files selected for viewing
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.