
Very WIP: Refactor core inference loops to use less memory #43999

Closed
wants to merge 1 commit into master from kf/inferencerefactor

Conversation

Keno
Member

@Keno Keno commented Feb 1, 2022

Currently inference uses `O(<number of statements>*<number of slots>)` state
in the core inference loop. This is usually fine, because users don't tend
to write functions that are particularly long. However, MTK does generate
functions that are excessively long and we've observed MTK models that spend
99% of their inference time just allocating and copying this state.
It is possible to get away with significantly smaller state, and this PR is
a first step in that direction, reducing the state to `O(<number of basic blocks>*<number of slots>)`.
Further improvements are possible by making use of slot liveness information
and only storing those slots that are live across a particular basic block.
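To make the asymptotics concrete, here is a rough back-of-the-envelope comparison (hypothetical sizes, not measured MTK numbers; long generated functions tend to have far fewer basic blocks than statements):

```python
# Rough illustration of the state-size reduction (hypothetical numbers,
# not measured from an MTK model). Inference keeps one type entry per
# slot, stored per statement (before this PR) vs. per basic block (after).

def state_entries(nstmts: int, nblocks: int, nslots: int) -> tuple[int, int]:
    old = nstmts * nslots   # O(<statements> * <slots>)
    new = nblocks * nslots  # O(<basic blocks> * <slots>)
    return old, new

# A long, mostly straight-line generated function: many statements,
# comparatively few basic blocks.
old, new = state_entries(nstmts=50_000, nblocks=500, nslots=200)
print(old, new)  # 10000000 100000 -- a 100x reduction in state entries
```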

The core change here is to keep a full set of `slottypes` only at
basic block boundaries rather than at each statement. For statements
in between, the full variable state can be recovered by
linearly scanning through the basic block, taking note of
slot assignments (together with the SSA type) and `NewVarNode`s.
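The recovery described above can be sketched roughly like this (a simplified Python model; `SlotAssign` and `NewVarNode` here are stand-in types, not the actual `Core.Compiler` data structures):

```python
# Simplified sketch of recovering the per-statement slot state from a
# basic-block-boundary snapshot (stand-in types; not Julia's real
# compiler data structures).
from dataclasses import dataclass

@dataclass
class SlotAssign:     # hypothetical: slot is assigned a value of this type
    slot: int
    ssa_type: str

@dataclass
class NewVarNode:     # hypothetical: slot becomes undefined at this point
    slot: int

def state_at(block_entry_state, stmts, idx):
    """Recover the variable state just before stmts[idx] by linearly
    scanning the basic block forward from its entry snapshot."""
    state = dict(block_entry_state)        # copy the boundary snapshot
    for stmt in stmts[:idx]:
        if isinstance(stmt, SlotAssign):
            state[stmt.slot] = stmt.ssa_type
        elif isinstance(stmt, NewVarNode):
            state.pop(stmt.slot, None)     # slot is (re)declared: undefined
    return state

entry = {1: "Int64"}
stmts = [SlotAssign(2, "Float64"), NewVarNode(1), SlotAssign(1, "String")]
print(state_at(entry, stmts, 2))  # {2: 'Float64'}  (slot 1 undefined here)
```

Only the boundary snapshots are stored, so the per-statement cost moves from memory to a short rescan of the enclosing block.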

The current status of this branch is that the changes appear correct
(no known functional regressions) and significantly improve the MTK
test cases in question (no exact benchmarks here for now, since
the branch still needs a number of fixes before final numbers make
sense), but somewhat regress optimizer quality (which is expected
and just a missing TODO) and bootstrap time (which is not expected
and something I need to dig into).

@aviatesk aviatesk self-assigned this Apr 21, 2022
@aviatesk aviatesk force-pushed the kf/inferencerefactor branch from 9ae2a51 to 36e3abe Compare April 21, 2022 10:09
@aviatesk aviatesk changed the base branch from kf/effects2 to master April 21, 2022 10:10
@aviatesk
Member

Rebased against the latest master and dropped the changes from #43899 (I believe these two PRs are orthogonal and can be merged separately). Still, there seem to be some problems in this PR; I will try to dig into them.

@aviatesk aviatesk force-pushed the kf/inferencerefactor branch 4 times, most recently from 2ec47da to 60a82a6 Compare April 25, 2022 13:31
@aviatesk aviatesk force-pushed the kf/inferencerefactor branch from 60a82a6 to dfeb2e9 Compare April 26, 2022 07:43
Comment on lines +1816 to +1818

```julia
if s.undef
    sv.src.slotflags[sn] |= SLOT_USEDUNDEF | SLOT_STATICUNDEF
end
```
After digging into the regressions in the optimizer, I found that this change can be problematic; in particular, it causes some regressions in the SROA pass.

I think these changes are orthogonal to the main purpose of this PR, so I am trying to fix the regressions by separating the related changes into another branch.

That said, there seems to be another cause of the regressions as well.

aviatesk added a commit that referenced this pull request Apr 26, 2022
aviatesk added a commit that referenced this pull request May 4, 2022
aviatesk added a commit that referenced this pull request May 4, 2022
aviatesk added a commit that referenced this pull request May 4, 2022
aviatesk added a commit that referenced this pull request May 4, 2022
aviatesk added a commit to aviatesk/julia that referenced this pull request May 10, 2022
@aviatesk
Member

Replaced by #45276.

@aviatesk aviatesk closed this May 30, 2022
@aviatesk aviatesk deleted the kf/inferencerefactor branch May 30, 2022 05:05