introduce `@nospecializeinfer` macro to tell the compiler to avoid excess inference #41931
Conversation
The use case was JuliaGPU/GPUCompiler.jl#227, where I tried to avoid re-compilation of the GPU compiler when loading CUDA.jl (despite there being no invalidations). For example, methods like …
Various folks at RelationalAI have been dreaming of this or similar functionality for a while. Thanks for working on it, Shuhei! :)
You probably know this already, but it's worth mentioning that you can get your 4th strategy now by using `Base.inferencebarrier`:

```julia
julia> function invokef(f, itr)
           local r = 0
           r += f(itr[1])
           r += f(itr[2])
           r += f(itr[3])
           r
       end;

julia> _isa = isa
isa (built-in function)

julia> f(a) = _isa(a, Function);

julia> g(@nospecialize a) = _isa(a, Function);

julia> gbarrier(@nospecialize a) = g(Base.inferencebarrier(a))
gbarrier (generic function with 1 method)

julia> using MethodAnalysis

julia> dispatchonly = Any[sin, muladd, nothing];

julia> invokef(gbarrier, dispatchonly)
2

julia> methodinstances(g)
1-element Vector{Core.MethodInstance}:
 MethodInstance for g(::Any)

julia> methodinstances(gbarrier)
1-element Vector{Core.MethodInstance}:
 MethodInstance for gbarrier(::Any)

julia> withinfernce = tuple(sin, muladd, "foo");

julia> invokef(gbarrier, withinfernce)
2

julia> methodinstances(g)
1-element Vector{Core.MethodInstance}:
 MethodInstance for g(::Any)

julia> methodinstances(gbarrier)
4-element Vector{Core.MethodInstance}:
 MethodInstance for gbarrier(::Any)
 MethodInstance for gbarrier(::typeof(sin))
 MethodInstance for gbarrier(::typeof(muladd))
 MethodInstance for gbarrier(::String)
```

But I think it's quite likely that we'd also want to be able to (easily) enforce this in the callee too. I.e., both are valuable, much as you have done recently for inline annotations.
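On Julia versions that export `Base.specializations` (used later in this thread's commit message; on older versions MethodAnalysis.jl's `methodinstances` serves the same purpose), the same barrier trick can be checked with Base alone. A minimal sketch:

```julia
# Sketch of the caller-side inference-barrier trick using only Base APIs.
# Assumes `Base.specializations` is available; MethodAnalysis.methodinstances
# is the equivalent on older Julia versions.
g(@nospecialize a) = isa(a, Function)
gbarrier(@nospecialize a) = g(Base.inferencebarrier(a))

for x in Any[sin, muladd, nothing]
    gbarrier(x)
end

# The barrier hides the concrete types from inference, so only the
# `g(::Any)` instance should exist.
n = length(collect(Base.specializations(only(methods(g)))))
```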
yeah, after #41328 it would be much easier to add nicer support of callsite annotations like:

```julia
let t = maybe_complex_type
    r = @noinfer nospecf(t)
    return r
end
```

(same for …) We can reuse …
It also just seems to make more sense to handle this by compiler annotations than by runtime manipulations that (presumably, I've not checked) come with a small performance cost. That would be fixable, of course, and the main reason to use …
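For context on that presumed cost: `Base.inferencebarrier` returns its argument unchanged but makes inference treat the result as `Any`, so the downstream call becomes a runtime dispatch. A small behavioral sketch (no timing claims; the names `h`, `barriered`, and `direct` are invented here):

```julia
h(@nospecialize a) = isa(a, Function)

# `h` is reached through a runtime dispatch because inference only sees `Any`.
barriered(x) = h(Base.inferencebarrier(x))

# Here inference can see the concrete type of `x` and resolve the call.
direct(x) = h(x)

# Results are identical either way; only dispatch/compile-time behavior differs.
@assert barriered(sin) == direct(sin) == true
@assert barriered("s") == direct("s") == false
```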
LGTM, though others (including yourself!) may have more insight than I about whether you've changed all the necessary places.
In a couple of my suggestions I used a closing ```` ```julia ```` fence because otherwise GitHub got confused about the end of a ```` ```suggestion ```` block.
Thanks @timholy for your suggestions! I've tried some experiments to find a good rule to automatically turn on this "no-inference" logic, but so far I couldn't succeed:

```diff
diff --git a/base/compiler/abstractinterpretation.jl b/base/compiler/abstractinterpretation.jl
index 2045b14d97..b6a091345d 100644
--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -350,7 +350,7 @@ function abstract_call_method(interp::AbstractInterpreter, method::Method, @nosp
         add_remark!(interp, sv, "Refusing to infer into `depwarn`")
         return MethodCallResult(Any, false, false, nothing)
     end
-    if is_noinfer(method)
+    if should_not_infer(code_cache(interp), inlining_policy(interp), method, sig, sparams)
         sig = get_nospecialize_sig(method, sig, sparams)
     end
     topmost = nothing
@@ -587,7 +587,8 @@ function maybe_get_const_prop_profitable(interp::AbstractInterpreter, result::Me
         end
     end
     force |= allconst
-    if is_noinfer(method)
+    if should_not_infer(code_cache(interp), inlining_policy(interp), method, match.spec_types, match.sparams)
+        return nothing
         mi = specialize_method_noinfer(match; preexisting=!force)
     else
         mi = specialize_method(match; preexisting=!force)
diff --git a/base/compiler/ssair/inlining.jl b/base/compiler/ssair/inlining.jl
index cf36358791..24d0d90cd4 100644
--- a/base/compiler/ssair/inlining.jl
+++ b/base/compiler/ssair/inlining.jl
@@ -808,7 +808,8 @@ function analyze_method!(match::MethodMatch, atypes::Vector{Any},
     end

     # See if there exists a specialization for this method signature
-    if is_noinfer(method)
+    # if is_noinfer(method)
+    if should_not_infer(state.mi_cache, state.policy, method, match.spec_types, match.sparams)
         mi = specialize_method_noinfer(match; preexisting=true)
     else
         mi = specialize_method(match; preexisting=true)
diff --git a/base/compiler/utilities.jl b/base/compiler/utilities.jl
index 8eb9e26d10..6eb03df533 100644
--- a/base/compiler/utilities.jl
+++ b/base/compiler/utilities.jl
@@ -98,6 +98,37 @@ end

 is_nospecialized(method::Method) = method.nospecialize ≠ 0

+# check if this `method` is `@nospecialize`d,
+# and if so further check if inference would be profitable for inlining
+# by peeking at the previously inferred method bodies
+# XXX this check is really not idempotent ...
+function should_not_infer(
+    cache::WorldView, inlining_policy::F,
+    method::Method, @nospecialize(sig), sparams::SimpleVector) where F
+    if is_nospecialized(method)
+        # TODO check if `method` is declared as `@noinline`
+        isdefined(method, :source) && ccall(:jl_ir_flag_inlineable, Bool, (Any,), method.source) && return false
+
+        local cacheexist = false
+        for mi in method.specializations
+            if isa(mi, MethodInstance)
+                code = get(cache, mi, nothing)
+                if isdefined(code, :inferred)
+                    cache_inf = code.inferred
+                    if !(cache_inf === nothing)
+                        if inlining_policy(cache_inf) !== nothing
+                            return false
+                        end
+                    end
+                end
+                cacheexist |= true
+            end
+        end
+        return cacheexist
+    end
+    return false
+end
+
 is_noinfer(method::Method) = method.noinfer && is_nospecialized(method)
 # is_noinfer(method::Method) = is_nospecialized(method) && is_declared_noinline(method)
```
Yeah, I don't think you can rely on history; that's a recipe for a lot of head-scratching. Since currently the … In SnoopCompile, I have an experimental …
Yeah, I'm definitely interested in it. Could you push it? (It's okay not to make a PR if you prefer.)
```diff
@@ -584,7 +587,11 @@ function maybe_get_const_prop_profitable(interp::AbstractInterpreter, result::Me
         end
     end
     force |= allconst
```
Do we need to add a new condition here (to `has_nontrivial_const_info` for `const_prop_argument_heuristic`) to check for the case where:

- inlining is estimated or declared to be beneficial, AND
- `noinfer` is set on the method, AND
- `!(mi.specTypes <: argtypes)` (e.g. `match.spec_types` was widened by `specialize_method_noinfer`)

So that we will compute an optimized version of the method here, in preparation for faster inlining of that code later?
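The third bullet can be illustrated with a plain subtype check (a hypothetical sketch; the concrete types below are invented here, mirroring how `specialize_method_noinfer` stores the widened declared signature while the call site has narrower `argtypes`):

```julia
# Widened signature as stored for a no-infer specialization (the
# @nospecialize'd argument is kept at its declared type, `Any`).
spec_types = Tuple{typeof(sin), Any}

# Concrete argument types observed at a particular call site.
argtypes = Tuple{typeof(sin), Int}

# The condition from the comment above: the call site is strictly narrower
# than the stored specialization, i.e. the signature was widened.
was_widened = argtypes <: spec_types && !(spec_types <: argtypes)
```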
Alright, I guess it is reasonable to separate this from …
Is this still being actively developed? I think it still would be useful.
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
Okay, I think this PR is ready to go. 3efba74 adds `@nospecializeinfer` on various `Core.Compiler` functions, and it seems there are no regressions in the benchmark results.
LGTM! Thanks so much for finishing this.
src/julia.h (outdated):

```diff
@@ -286,6 +286,7 @@ typedef struct _jl_code_info_t {
     uint8_t inferred;
     uint8_t propagate_inbounds;
     uint8_t has_fcall;
+    uint8_t noinfer;
```
So is the intent here that we might eventually be able to have both inferred and noinferred compilations of the same method, and thus support call-site `@noinfer` annotations? If so, 👍
Currently we represent boolean properties as `uint8_t` fields, including `noinfer`, but yes, it's possible to implement call-site `@noinfer`.
```diff
@@ -544,6 +544,10 @@ function abstract_call_method(interp::AbstractInterpreter,
     sigtuple = unwrap_unionall(sig)
     sigtuple isa DataType || return MethodCallResult(Any, false, false, nothing, Effects())

+    if is_noinfer(method)
```
We should make this controllable by the absint. GPUCompiler may choose to ignore this and force full inference.
cc: @maleadt
I'm fine with implementing a configuration here, but maybe we can delay it until it turns out that GPUCompiler (or another external compilation pipeline) really needs it? Since `@nospecializeinfer` does not prohibit constant prop' and such, I'm still wondering if it can cause unwanted inference regression.
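To make the constant-propagation point concrete, a hedged sketch (it assumes a Julia build that includes this PR, where the annotation is available as `Base.@nospecializeinfer`): runtime semantics are unchanged; the annotation only limits the signature inference uses.

```julia
# Assumes Base.@nospecializeinfer exists (it is introduced by this PR).
Base.@nospecializeinfer function describe(@nospecialize f)
    # inferred with `f::Any`, but dispatch and behavior at runtime are normal;
    # constant propagation at inlined call sites remains permitted
    f isa Function ? "function" : "other"
end

@assert describe(sin) == "function"
@assert describe(1) == "other"
```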
SGTM. I think the name might be confusing though, as …
Both …
changed the title from "`@noinfer` macro to tell the compiler to avoid excess inference" to "`@nospecializeinfer` macro to tell the compiler to avoid excess inference"
I'm good with either name, they are both sensible.
introduce `@nospecializeinfer` macro to tell the compiler to avoid excess inference

This commit introduces a new compiler annotation called `@nospecializeinfer`, which allows us to request the compiler to avoid excessive inference.

## `@nospecialize` mechanism

To discuss `@nospecializeinfer`, let's first understand the behavior of `@nospecialize`. Its docstring says that

> This is only a hint for the compiler to avoid excess code generation.

and it works by suppressing dispatches with complex runtime occurrences of the annotated arguments. This can be understood with the example below:

```julia
julia> function call_func_itr(func, itr)
           local r = 0
           r += func(itr[1])
           r += func(itr[2])
           r += func(itr[3])
           r
       end;

julia> _isa = isa; # just for the sake of explanation, global variable to prevent inlining

julia> func_specialize(a) = _isa(a, Function);

julia> func_nospecialize(@nospecialize a) = _isa(a, Function);

julia> dispatchonly = Any[sin, muladd, nothing]; # untyped container can cause excessive runtime dispatch

julia> @code_typed call_func_itr(func_specialize, dispatchonly)
CodeInfo(
1 ─ %1  = π (0, Int64)
│   %2  = Base.arrayref(true, itr, 1)::Any
│   %3  = (func)(%2)::Any
│   %4  = (%1 + %3)::Any
│   %5  = Base.arrayref(true, itr, 2)::Any
│   %6  = (func)(%5)::Any
│   %7  = (%4 + %6)::Any
│   %8  = Base.arrayref(true, itr, 3)::Any
│   %9  = (func)(%8)::Any
│   %10 = (%7 + %9)::Any
└──       return %10
) => Any

julia> @code_typed call_func_itr(func_nospecialize, dispatchonly)
CodeInfo(
1 ─ %1  = π (0, Int64)
│   %2  = Base.arrayref(true, itr, 1)::Any
│   %3  = invoke func(%2::Any)::Any
│   %4  = (%1 + %3)::Any
│   %5  = Base.arrayref(true, itr, 2)::Any
│   %6  = invoke func(%5::Any)::Any
│   %7  = (%4 + %6)::Any
│   %8  = Base.arrayref(true, itr, 3)::Any
│   %9  = invoke func(%8::Any)::Any
│   %10 = (%7 + %9)::Any
└──       return %10
) => Any
```

The calls of `func_specialize` remain `:call` expressions (so that they are dispatched and compiled at runtime) while the calls of `func_nospecialize` are resolved as `:invoke` expressions. This is because `@nospecialize` requests the compiler to give up compiling `func_nospecialize` with runtime argument types and use the declared argument types instead, allowing `call_func_itr(func_nospecialize, dispatchonly)` to avoid runtime dispatches and accompanying JIT compilations (i.e. "excess code generation").

The difference is evident when checking `specializations`:

```julia
julia> call_func_itr(func_specialize, dispatchonly)
2

julia> length(Base.specializations(only(methods(func_specialize))))
3 # w/ runtime dispatch, multiple specializations

julia> call_func_itr(func_nospecialize, dispatchonly)
2

julia> length(Base.specializations(only(methods(func_nospecialize))))
1 # w/o runtime dispatch, the single specialization
```

The problem here is that it influences dispatch only, and does not intervene into inference in any way. So there is still a possibility of "excess inference" when the compiler sees a considerable complexity of argument types during inference:

```julia
julia> func_specialize(a) = _isa(a, Function); # redefine func to clear the specializations

julia> @assert length(Base.specializations(only(methods(func_specialize)))) == 0;

julia> func_nospecialize(@nospecialize a) = _isa(a, Function); # redefine func to clear the specializations

julia> @assert length(Base.specializations(only(methods(func_nospecialize)))) == 0;

julia> withinfernce = tuple(sin, muladd, "foo"); # typed container can cause excessive inference

julia> @time @code_typed call_func_itr(func_specialize, withinfernce);
  0.000812 seconds (3.77 k allocations: 217.938 KiB, 94.34% compilation time)

julia> length(Base.specializations(only(methods(func_specialize))))
4 # multiple method instances inferred

julia> @time @code_typed call_func_itr(func_nospecialize, withinfernce);
  0.000753 seconds (3.77 k allocations: 218.047 KiB, 92.42% compilation time)

julia> length(Base.specializations(only(methods(func_nospecialize))))
4 # multiple method instances inferred
```

The purpose of this PR is to implement a mechanism that allows us to avoid excessive inference to reduce the compilation latency when inference sees a considerable complexity of argument types.

## Design

Here are some ideas to implement the functionality:

1. make `@nospecialize` block inference
2. add nospecializeinfer effect when `@nospecialize`d method is annotated as `@noinline`
3. implement as `@pure`-like boolean annotation to request nospecializeinfer effect on top of `@nospecialize`
4. implement as annotation that is orthogonal to `@nospecialize`

After trying 1 ~ 3., I decided to submit 3.

### 1. make `@nospecialize` block inference

This is almost the same as what Jameson has done at <vtjnash@8ab7b6b>. It turned out that this approach performs very badly because some `@nospecialize`'d arguments still need inference to perform reasonably. For example, it's obvious that the following definition of `getindex(@nospecialize(t::Tuple), i::Int)` would perform very badly if `@nospecialize` blocks inference, because of a lack of useful type information for succeeding optimizations: <https://github.com/JuliaLang/julia/blob/12d364e8249a07097a233ce7ea2886002459cc50/base/tuple.jl#L29-L30>

### 2. add nospecializeinfer effect when `@nospecialize`d method is annotated as `@noinline`

The important observation is that we often use `@nospecialize` even when we expect inference to forward type and constant information. Conversely, we may be able to exploit the fact that we usually don't expect inference to forward information to a callee when we annotate it with `@noinline` (i.e. when adding `@noinline`, we're usually fine with disabling inter-procedural optimizations other than resolving dispatch). So the idea is to enable the inference suppression when a `@nospecialize`'d method is annotated as `@noinline` too. It's a reasonable choice and can be efficiently implemented with #41922.

But it sounds a bit weird to me to associate the no-infer effect with `@noinline`, and I also think there may be some cases where we want to inline a method while partly avoiding inference, e.g.:

```julia
# the compiler will always infer with `f::Any`
@noinline function twof(@nospecialize(f), n)
    # this method body is very simple and should be eligible for inlining
    if occursin('+', string(typeof(f).name.name::Symbol))
        2 + n
    elseif occursin('*', string(typeof(f).name.name::Symbol))
        2n
    else
        zero(n)
    end
end
```

### 3. implement as `@pure`-like boolean annotation to request nospecializeinfer effect on top of `@nospecialize`

This is what this commit implements. It basically replaces the previous `@noinline` flag with a newly-introduced annotation named `@nospecializeinfer`. It is still associated with `@nospecialize` and it only has an effect when used together with `@nospecialize`, but now it is not associated with `@noinline`, and it would help us reason about the behavior of `@nospecializeinfer` and experiment with its effect more safely:

```julia
# the compiler will always infer with `f::Any`
Base.@nospecializeinfer function twof(@nospecialize(f), n)
    # the compiler may or may not inline this method
    if occursin('+', string(typeof(f).name.name::Symbol))
        2 + n
    elseif occursin('*', string(typeof(f).name.name::Symbol))
        2n
    else
        zero(n)
    end
end
```

### 4. implement as annotation that is orthogonal to `@nospecialize`

Actually, we can have `@nospecialize` and `@nospecializeinfer` separately, and it would allow us to configure compilation strategies in a more fine-grained way.

```julia
function noinfspec(Base.@nospecializeinfer(f), @nospecialize(g))
    ...
end
```

I'm fine with this approach, but at the same time I'm afraid of having too many related annotations of this sort (I expect we will annotate both `@nospecializeinfer` and `@nospecialize` in this scheme).

Co-authored-by: Mosè Giordano <giordano@users.noreply.github.com>
Co-authored-by: Tim Holy <tim.holy@gmail.com>
This commit adds the `@nospecializeinfer` macro on various `Core.Compiler` functions and achieves the following sysimage size reduction:

|                                   | this commit | master      | %       |
| --------------------------------- | ----------- | ----------- | ------- |
| `Core.Compiler` compilation (sec) | `66.4551`   | `71.0846`   | `0.935` |
| `corecompiler.jl` (KB)            | `17638080`  | `18407248`  | `0.958` |
| `sys.jl` (KB)                     | `88736432`  | `89361280`  | `0.993` |
| `sys-o.a` (KB)                    | `189484400` | `189907096` | `0.998` |
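As a quick sanity check on the table, the `%` column is just the this-commit/master ratio, e.g. for the `Core.Compiler` compilation time and the `corecompiler.jl` size:

```julia
# Ratios recomputed from the raw numbers reported above.
@assert round(66.4551 / 71.0846, digits=3) == 0.935
@assert round(17638080 / 18407248, digits=3) == 0.958
```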
Woot! Thanks so much!
Friendly pings: … `@nospecializeinfer` could now be used for experiments; I'm happy to experiment with any of what you have in your mind :)