SSAIR: improve inlining performance with in-place IR-inflation #45404
Conversation
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
This seems reasonable to me. When I profile inference I usually use Cthulhu, which doesn't go through the compression/decompression cycle, so I don't have a good sense of the usual performance impact, but the nanosoldier results look favorable, so I'm in favor. My only concern would be ending up with corrupted CodeInfos in some of the generated-function cases; I've run into hard-to-debug bugs of that kind before. But certainly for the compression case it's a no-brainer to do the inflation destructively. Eventually we may even want to skip the decompression step and inline directly from the compressed representation as we go along, but that's obviously future work.
For the uncompressed case, this PR makes a copy of that `CodeInfo` (see `julia/base/compiler/ssair/inlining.jl`, lines 916 to 926 at f34c577).
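For context, a minimal sketch of that pattern: keep the safe, non-destructive `inflate_ir` as a thin wrapper that copies and then delegates to the in-place `inflate_ir!`. This is illustrative only, not the exact code in the PR, and these `Core.Compiler` internals vary across Julia versions.

```julia
# Hedged sketch: the copying entry point wraps the destructive worker.
# Callers that own a throwaway CodeInfo (e.g. one freshly produced by
# decompression) can call inflate_ir! directly and skip the copy.
inflate_ir(ci::Core.CodeInfo, mi::Core.MethodInstance) =
    inflate_ir!(copy(ci), mi)   # copy first, then mutate only the copy
```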
@nanosoldier
This commit improves the performance of a huge hot-spot within `inflate_ir` by using its in-place version (`inflate_ir!`) and avoiding some unnecessary allocations. For `NativeInterpreter`, the `CodeInfo`-IR passed to `inflate_ir` can come from two sources:
1. the global cache: uncompressed from the compressed format
2. the local cache: the inferred `CodeInfo` as-is, managed by `InferenceResult`

In case 1, the uncompressed `CodeInfo` is already a newly-allocated object, so we can use the in-place version safely, and it turns out this lets us avoid many unnecessary allocations. The original non-destructive `inflate_ir` remains available for testing and interactive purposes.
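To make the two paths concrete, here is a hedged sketch. The function names `ir_for_inlining_from_global_cache` and `ir_for_inlining_from_local_cache` are hypothetical stand-ins, not the actual functions in `base/compiler`, and the `Core.Compiler.inflate_ir!`/`inflate_ir` signatures are version-dependent internals.

```julia
# 1. Global cache: `src` was just produced by decompressing the cached blob,
#    so nothing else references it and it can be inflated destructively.
function ir_for_inlining_from_global_cache(mi::Core.MethodInstance, src::Core.CodeInfo)
    return Core.Compiler.inflate_ir!(src, mi)   # in-place, no extra copy
end

# 2. Local cache: `src` is the inferred CodeInfo still owned by an
#    InferenceResult and may be reused later, so it must not be mutated here.
function ir_for_inlining_from_local_cache(mi::Core.MethodInstance, src::Core.CodeInfo)
    return Core.Compiler.inflate_ir(src, mi)    # non-destructive (copies internally)
end
```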
Your package evaluation job has completed - possible new issues were detected. A full report can be found here.
@nanosoldier
Your package evaluation job has completed - possible new issues were detected. A full report can be found here.
@nanosoldier
Your package evaluation job has completed - possible new issues were detected. A full report can be found here.
Just confirmed the JSONSchema test suite runs successfully on my local machine on this branch. Going to merge.
@nanosoldier `runbenchmarks("inference", vs=":master")`