invoked calls: record invoke signature in backedges (#46010)
This fixes a long-standing issue with how we've handled `invoke` calls
with respect to method invalidation.  When we load a package, we need
to ask whether a given MethodInstance would be compiled in the same
way now (aka, in the user's running session) as when the package was
precompiled; in practice, the way we do that is to test whether the
dispatches would be to the same methods in the current
world-age. `invoke` presents special challenges because it allows the
coder to deliberately select a different method than the one that
would be chosen by ordinary dispatch; if there is no record of how
this choice was made, it can look like it resolves to the wrong method
and this can trigger invalidation.
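
As a minimal illustration of the `invoke` behavior described above (the function `describe` is a hypothetical example and not part of this commit), the following sketch shows ordinary dispatch and `invoke` resolving the same call to different methods:

```julia
# Two methods of a hypothetical function; `Int` is more specific than `Integer`.
describe(x::Integer) = "generic Integer method"
describe(x::Int) = "specialized Int method"

# Ordinary dispatch picks the most specific applicable method.
by_dispatch = describe(1)

# `invoke` deliberately selects the method matching `Tuple{Integer}`,
# even though the runtime argument is an `Int`.
by_invoke = invoke(describe, Tuple{Integer}, 1)

println(by_dispatch)
println(by_invoke)
```

Without a record of the `Tuple{Integer}` signature, a later check that re-runs ordinary dispatch for `describe(::Int)` would find the `Int` method and conclude that the call target had changed.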

This commit allows a MethodInstance to store dispatch tuples as well as other
MethodInstances among its backedges.

Additionally:

- provide backedge-iterators for both C and Julia that abstract
  the specific storage mechanism.

- fix a bug in the CodeInstance `relocatability` field, where methods
  that only return a constant (and hence store `nothing` for
  `inferred`) were deemed non-relocatable.

- fix a bug in which #43990 should have checked that the method had
  not been deleted. Tests passed formerly simply because we weren't
  caching external CodeInstances that inferred down to a `Const`;
  fixing that exposed the bug.  This bug has been exposed since
  merging #43990 for non-`Const` inference, and would affect Revise
  etc.

Co-authored-by: Jameson Nash <vtjnash@gmail.com>
timholy and vtjnash authored Aug 24, 2022
1 parent 3b1c54d commit dd375e1
Showing 16 changed files with 585 additions and 150 deletions.
2 changes: 2 additions & 0 deletions NEWS.md
@@ -35,6 +35,8 @@ Compiler/Runtime improvements
   `@nospecialize`-d call sites and avoiding excessive compilation. ([#44512])
 * All the previous usages of `@pure`-macro in `Base` has been replaced with the preferred
   `Base.@assume_effects`-based annotations. ([#44776])
+* `invoke(f, invokesig, args...)` calls to a less-specific method than would normally be chosen
+  for `f(args...)` are no longer spuriously invalidated when loading package precompile files. ([#46010])

 Command-line option changes
 ---------------------------
7 changes: 6 additions & 1 deletion base/compiler/abstractinterpretation.jl
@@ -801,6 +801,11 @@ function collect_const_args(argtypes::Vector{Any})
                 end for i = 2:length(argtypes) ]
 end

+function invoke_signature(invokesig::Vector{Any})
+    ft, argtyps = widenconst(invokesig[2]), instanceof_tfunc(widenconst(invokesig[3]))[1]
+    return rewrap_unionall(Tuple{ft, unwrap_unionall(argtyps).parameters...}, argtyps)
+end
+
 function concrete_eval_call(interp::AbstractInterpreter,
     @nospecialize(f), result::MethodCallResult, arginfo::ArgInfo, sv::InferenceState)
     concrete_eval_eligible(interp, f, result, arginfo, sv) || return nothing
@@ -1631,7 +1636,7 @@ function abstract_invoke(interp::AbstractInterpreter, (; fargs, argtypes)::ArgIn
     ti = tienv[1]; env = tienv[2]::SimpleVector
     result = abstract_call_method(interp, method, ti, env, false, sv)
     (; rt, edge, effects) = result
-    edge !== nothing && add_backedge!(edge::MethodInstance, sv)
+    edge !== nothing && add_backedge!(edge::MethodInstance, sv, types)
     match = MethodMatch(ti, env, method, argtype <: method.sig)
     res = nothing
     sig = match.spec_types
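
To make the shape of the recorded signature concrete, here is a rough sketch of what `invoke_signature` builds, written against plain types rather than inference lattice elements. The name `invoke_sig_sketch` and the function `h` are hypothetical; `Base.unwrap_unionall` and `Base.rewrap_unionall` are unexported Base helpers, so treat their use here as an assumption about internals rather than a stable API.

```julia
# Hypothetical stand-in for `invoke_signature`: prepend the function type to the
# parameters of the `invoke` argument-type tuple, preserving any `where` clauses.
function invoke_sig_sketch(@nospecialize(ft), @nospecialize(argtyps))
    params = Base.unwrap_unionall(argtyps).parameters
    return Base.rewrap_unionall(Tuple{ft, params...}, argtyps)
end

h(x) = x  # hypothetical function, used only for its type

# A plain tuple type: Tuple{Integer} becomes Tuple{typeof(h), Integer}.
sig1 = invoke_sig_sketch(typeof(h), Tuple{Integer})

# A UnionAll tuple type keeps its type variable around the new tuple.
sig2 = invoke_sig_sketch(typeof(h), (Tuple{T, T} where T<:Real))

println(sig1)
println(sig2)
```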
5 changes: 4 additions & 1 deletion base/compiler/inferencestate.jl
@@ -479,12 +479,15 @@ function add_cycle_backedge!(frame::InferenceState, caller::InferenceState, curr
 end

 # temporarily accumulate our edges to later add as backedges in the callee
-function add_backedge!(li::MethodInstance, caller::InferenceState)
+function add_backedge!(li::MethodInstance, caller::InferenceState, invokesig::Union{Nothing,Type}=nothing)
     isa(caller.linfo.def, Method) || return # don't add backedges to toplevel exprs
     edges = caller.stmt_edges[caller.currpc]
     if edges === nothing
         edges = caller.stmt_edges[caller.currpc] = []
     end
+    if invokesig !== nothing
+        push!(edges, invokesig)
+    end
     push!(edges, li)
     return nothing
 end
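
The storage convention introduced by `add_backedge!` can be sketched outside the compiler as follows. `Callee` and `record_edge!` are hypothetical names standing in for `MethodInstance` and `add_backedge!`; the sketch only assumes what the diff shows, namely that an `invoke` signature, when present, is pushed immediately before the callee it belongs to.

```julia
# Hypothetical stand-in for a MethodInstance.
struct Callee
    name::Symbol
end

# Append an edge to a flat list; an `invoke` signature (if any) precedes its callee.
function record_edge!(edges::Vector{Any}, callee::Callee,
                      invokesig::Union{Nothing,Type}=nothing)
    invokesig !== nothing && push!(edges, invokesig)
    push!(edges, callee)
    return edges
end

edges = Any[]
record_edge!(edges, Callee(:plain))                                         # ordinary dispatch
record_edge!(edges, Callee(:invoked), Tuple{typeof(+), Integer, Integer})   # `invoke` call

foreach(println, edges)
```

The list is therefore heterogeneous: an ordinary edge occupies one slot, while an `invoke` edge occupies two.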
4 changes: 4 additions & 0 deletions base/compiler/optimize.jl
@@ -62,6 +62,10 @@ intersect!(et::EdgeTracker, range::WorldRange) =
     et.valid_worlds[] = intersect(et.valid_worlds[], range)

 push!(et::EdgeTracker, mi::MethodInstance) = push!(et.edges, mi)
+function add_edge!(et::EdgeTracker, @nospecialize(invokesig), mi::MethodInstance)
+    invokesig === nothing && return push!(et.edges, mi)
+    push!(et.edges, invokesig, mi)
+end
 function push!(et::EdgeTracker, ci::CodeInstance)
     intersect!(et, WorldRange(min_world(li), max_world(li)))
     push!(et, ci.def)
31 changes: 17 additions & 14 deletions base/compiler/ssair/inlining.jl
@@ -29,19 +29,21 @@ pass to apply its own inlining policy decisions.
 struct DelayedInliningSpec
     match::Union{MethodMatch, InferenceResult}
     argtypes::Vector{Any}
+    invokesig # either nothing or a signature (signature is for an `invoke` call)
 end
+DelayedInliningSpec(match, argtypes) = DelayedInliningSpec(match, argtypes, nothing)

 struct InliningTodo
     # The MethodInstance to be inlined
     mi::MethodInstance
     spec::Union{ResolvedInliningSpec, DelayedInliningSpec}
 end

-InliningTodo(mi::MethodInstance, match::MethodMatch, argtypes::Vector{Any}) =
-    InliningTodo(mi, DelayedInliningSpec(match, argtypes))
+InliningTodo(mi::MethodInstance, match::MethodMatch, argtypes::Vector{Any}, invokesig=nothing) =
+    InliningTodo(mi, DelayedInliningSpec(match, argtypes, invokesig))

-InliningTodo(result::InferenceResult, argtypes::Vector{Any}) =
-    InliningTodo(result.linfo, DelayedInliningSpec(result, argtypes))
+InliningTodo(result::InferenceResult, argtypes::Vector{Any}, invokesig=nothing) =
+    InliningTodo(result.linfo, DelayedInliningSpec(result, argtypes, invokesig))

 struct ConstantCase
     val::Any
@@ -810,15 +812,15 @@ end

 function resolve_todo(todo::InliningTodo, state::InliningState, flag::UInt8)
     mi = todo.mi
-    (; match, argtypes) = todo.spec::DelayedInliningSpec
+    (; match, argtypes, invokesig) = todo.spec::DelayedInliningSpec
     et = state.et

     #XXX: update_valid_age!(min_valid[1], max_valid[1], sv)
     if isa(match, InferenceResult)
         inferred_src = match.src
         if isa(inferred_src, ConstAPI)
             # use constant calling convention
-            et !== nothing && push!(et, mi)
+            et !== nothing && add_edge!(et, invokesig, mi)
             return ConstantCase(quoted(inferred_src.val))
         else
             src = inferred_src # ::Union{Nothing,CodeInfo} for NativeInterpreter
@@ -829,7 +831,7 @@ function resolve_todo(todo::InliningTodo, state::InliningState, flag::UInt8)
         if code isa CodeInstance
             if use_const_api(code)
                 # in this case function can be inlined to a constant
-                et !== nothing && push!(et, mi)
+                et !== nothing && add_edge!(et, invokesig, mi)
                 return ConstantCase(quoted(code.rettype_const))
             else
                 src = @atomic :monotonic code.inferred
@@ -851,7 +853,7 @@ function resolve_todo(todo::InliningTodo, state::InliningState, flag::UInt8)

     src === nothing && return compileable_specialization(et, match, effects)

-    et !== nothing && push!(et, mi)
+    et !== nothing && add_edge!(et, invokesig, mi)
     return InliningTodo(mi, retrieve_ir_for_inlining(mi, src), effects)
 end

@@ -873,7 +875,7 @@ function validate_sparams(sparams::SimpleVector)
     return true
 end

-function analyze_method!(match::MethodMatch, argtypes::Vector{Any},
+function analyze_method!(match::MethodMatch, argtypes::Vector{Any}, invokesig,
                          flag::UInt8, state::InliningState)
     method = match.method
     spec_types = match.spec_types
@@ -905,7 +907,7 @@ function analyze_method!(match::MethodMatch, argtypes::Vector{Any},
     mi = specialize_method(match; preexisting=true) # Union{Nothing, MethodInstance}
     isa(mi, MethodInstance) || return compileable_specialization(et, match, Effects())

-    todo = InliningTodo(mi, match, argtypes)
+    todo = InliningTodo(mi, match, argtypes, invokesig)
     # If we don't have caches here, delay resolving this MethodInstance
     # until the batch inlining step (or an external post-processing pass)
     state.mi_cache === nothing && return todo
@@ -1100,17 +1102,18 @@ function inline_invoke!(
     if isa(result, ConcreteResult)
         item = concrete_result_item(result, state)
     else
+        invokesig = invoke_signature(sig.argtypes)
         argtypes = invoke_rewrite(sig.argtypes)
         if isa(result, ConstPropResult)
-            (; mi) = item = InliningTodo(result.result, argtypes)
+            (; mi) = item = InliningTodo(result.result, argtypes, invokesig)
             validate_sparams(mi.sparam_vals) || return nothing
             if argtypes_to_type(argtypes) <: mi.def.sig
                 state.mi_cache !== nothing && (item = resolve_todo(item, state, flag))
                 handle_single_case!(ir, idx, stmt, item, todo, state.params, true)
                 return nothing
             end
         end
-        item = analyze_method!(match, argtypes, flag, state)
+        item = analyze_method!(match, argtypes, invokesig, flag, state)
     end
     handle_single_case!(ir, idx, stmt, item, todo, state.params, true)
     return nothing
@@ -1328,7 +1331,7 @@ function handle_match!(
     # during abstract interpretation: for the purpose of inlining, we can just skip
     # processing this dispatch candidate
     _any(case->case.sig === spec_types, cases) && return true
-    item = analyze_method!(match, argtypes, flag, state)
+    item = analyze_method!(match, argtypes, nothing, flag, state)
     item === nothing && return false
     push!(cases, InliningCase(spec_types, item))
     return true
@@ -1475,7 +1478,7 @@ function assemble_inline_todo!(ir::IRCode, state::InliningState)
         if isa(result, ConcreteResult)
             item = concrete_result_item(result, state)
         else
-            item = analyze_method!(info.match, sig.argtypes, flag, state)
+            item = analyze_method!(info.match, sig.argtypes, nothing, flag, state)
         end
         handle_single_case!(ir, idx, stmt, item, todo, state.params)
     end
13 changes: 5 additions & 8 deletions base/compiler/typeinfer.jl
@@ -312,7 +312,9 @@ function CodeInstance(
             const_flags = 0x00
         end
     end
-    relocatability = isa(inferred_result, Vector{UInt8}) ? inferred_result[end] : UInt8(0)
+    relocatability = isa(inferred_result, Vector{UInt8}) ? inferred_result[end] :
+        inferred_result === nothing ? UInt8(1) : UInt8(0)
+    # relocatability = isa(inferred_result, Vector{UInt8}) ? inferred_result[end] : UInt8(0)
     return CodeInstance(result.linfo,
         widenconst(result_type), rettype_const, inferred_result,
         const_flags, first(valid_worlds), last(valid_worlds),
@@ -561,17 +563,12 @@ function store_backedges(frame::InferenceResult, edges::Vector{Any})
 end

 function store_backedges(caller::MethodInstance, edges::Vector{Any})
-    i = 1
-    while i <= length(edges)
-        to = edges[i]
+    for (typ, to) in BackedgeIterator(edges)
         if isa(to, MethodInstance)
-            ccall(:jl_method_instance_add_backedge, Cvoid, (Any, Any), to, caller)
-            i += 1
+            ccall(:jl_method_instance_add_backedge, Cvoid, (Any, Any, Any), to, typ, caller)
         else
             typeassert(to, Core.MethodTable)
-            typ = edges[i + 1]
             ccall(:jl_method_table_add_backedge, Cvoid, (Any, Any, Any), to, typ, caller)
-            i += 2
         end
     end
 end
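
The `relocatability` change in the first hunk of this file can be read as a three-way decision. The sketch below restates it as a standalone function; `relocatability_flag` is a hypothetical name, and the interpretation of the last byte of the compressed source as the flag is taken directly from the diff, not from any documented API.

```julia
# Hypothetical restatement of the new `relocatability` expression.
function relocatability_flag(inferred_result)
    # Compressed source: the flag is stored in the last byte.
    inferred_result isa Vector{UInt8} && return inferred_result[end]
    # Constant-returning methods store `nothing`; after this commit they are relocatable.
    inferred_result === nothing && return UInt8(1)
    # Anything else is still treated as non-relocatable.
    return UInt8(0)
end

println(relocatability_flag(UInt8[0x10, 0x20, 0x01]))
println(relocatability_flag(nothing))
println(relocatability_flag(:uncompressed))
```

Before this commit, the `nothing` case fell through to `UInt8(0)`, which is the bug described in the commit message.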
53 changes: 53 additions & 0 deletions base/compiler/utilities.jl
@@ -223,6 +223,59 @@ Check if `method` is declared as `Base.@constprop :none`.
 """
 is_no_constprop(method::Union{Method,CodeInfo}) = method.constprop == 0x02

+#############
+# backedges #
+#############
+
+"""
+    BackedgeIterator(backedges::Vector{Any})
+
+Return an iterator over a list of backedges. Iteration returns `(sig, caller)` elements,
+which will be one of the following:
+
+- `(nothing, caller::MethodInstance)`: a call made by ordinary inferrable dispatch
+- `(invokesig, caller::MethodInstance)`: a call made by `invoke(f, invokesig, args...)`
+- `(specsig, mt::MethodTable)`: an abstract call
+
+# Examples
+
+```julia
+julia> callme(x) = x+1
+callme (generic function with 1 method)
+
+julia> callyou(x) = callme(x)
+callyou (generic function with 1 method)
+
+julia> callyou(2.0)
+3.0
+
+julia> mi = first(which(callme, (Any,)).specializations)
+MethodInstance for callme(::Float64)
+
+julia> @eval Core.Compiler for (sig, caller) in BackedgeIterator(Main.mi.backedges)
+           println(sig)
+           println(caller)
+       end
+nothing
+callyou(Float64) from callyou(Any)
+```
+"""
+struct BackedgeIterator
+    backedges::Vector{Any}
+end
+
+const empty_backedge_iter = BackedgeIterator(Any[])
+
+
+function iterate(iter::BackedgeIterator, i::Int=1)
+    backedges = iter.backedges
+    i > length(backedges) && return nothing
+    item = backedges[i]
+    isa(item, MethodInstance) && return (nothing, item), i+1 # regular dispatch
+    isa(item, Core.MethodTable) && return (backedges[i+1], item), i+2 # abstract dispatch
+    return (item, backedges[i+1]::MethodInstance), i+2 # `invoke` calls
+end
+
 #########
 # types #
 #########
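
Because `Core.Compiler` internals differ between Julia versions, the decoding logic of `BackedgeIterator` is easier to experiment with on mock types. The following is a minimal sketch under that assumption: `SketchMI`, `SketchMT`, and `SketchBackedges` are hypothetical stand-ins for `MethodInstance`, `Core.MethodTable`, and `BackedgeIterator`, and only the stride logic mirrors the diff.

```julia
# Hypothetical stand-ins for MethodInstance and Core.MethodTable.
struct SketchMI
    name::Symbol
end
struct SketchMT
    name::Symbol
end

# Hypothetical stand-in for BackedgeIterator: wraps a flat, heterogeneous list.
struct SketchBackedges
    backedges::Vector{Any}
end

# The length is not known without decoding, so tell `collect` not to ask for it.
Base.IteratorSize(::Type{SketchBackedges}) = Base.SizeUnknown()

function Base.iterate(iter::SketchBackedges, i::Int=1)
    backedges = iter.backedges
    i > length(backedges) && return nothing
    item = backedges[i]
    item isa SketchMI && return (nothing, item), i + 1           # regular dispatch: 1 slot
    item isa SketchMT && return (backedges[i + 1], item), i + 2   # abstract dispatch: 2 slots
    return (item, backedges[i + 1]::SketchMI), i + 2              # `invoke` call: 2 slots
end

plain, invoked, table = SketchMI(:plain), SketchMI(:invoked), SketchMT(:mt)
invokesig = Tuple{typeof(+), Integer, Integer}
specsig = Tuple{typeof(+), Any, Any}

flat = Any[plain, invokesig, invoked, table, specsig]
decoded = collect(SketchBackedges(flat))

for (sig, caller) in decoded
    println(sig, " => ", caller)
end
```

Five stored entries decode to three `(sig, caller)` pairs, which is why callers such as `store_backedges` no longer need to track the index stride themselves.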
29 changes: 25 additions & 4 deletions base/loading.jl
@@ -2262,12 +2262,12 @@ macro __DIR__()
 end

 """
-    precompile(f, args::Tuple{Vararg{Any}})
+    precompile(f, argtypes::Tuple{Vararg{Any}})

-Compile the given function `f` for the argument tuple (of types) `args`, but do not execute it.
+Compile the given function `f` for the argument tuple (of types) `argtypes`, but do not execute it.
 """
-function precompile(@nospecialize(f), @nospecialize(args::Tuple))
-    precompile(Tuple{Core.Typeof(f), args...})
+function precompile(@nospecialize(f), @nospecialize(argtypes::Tuple))
+    precompile(Tuple{Core.Typeof(f), argtypes...})
 end

 const ENABLE_PRECOMPILE_WARNINGS = Ref(false)
@@ -2279,6 +2279,27 @@ function precompile(@nospecialize(argt::Type))
     return ret
 end

+# Variants that work for `invoke`d calls for which the signature may not be sufficient
+precompile(mi::Core.MethodInstance, world::UInt=get_world_counter()) =
+    (ccall(:jl_compile_method_instance, Cvoid, (Any, Any, UInt), mi, C_NULL, world); return true)
+
+"""
+    precompile(f, argtypes::Tuple{Vararg{Any}}, m::Method)
+
+Precompile a specific method for the given argument types. This may be used to precompile
+a different method than the one that would ordinarily be chosen by dispatch, thus
+mimicking `invoke`.
+"""
+function precompile(@nospecialize(f), @nospecialize(argtypes::Tuple), m::Method)
+    precompile(Tuple{Core.Typeof(f), argtypes...}, m)
+end
+
+function precompile(@nospecialize(argt::Type), m::Method)
+    atype, sparams = ccall(:jl_type_intersection_with_env, Any, (Any, Any), argt, m.sig)::SimpleVector
+    mi = Core.Compiler.specialize_method(m, atype, sparams)
+    return precompile(mi)
+end
+
 precompile(include_package_for_output, (PkgId, String, Vector{String}, Vector{String}, Vector{String}, typeof(_concrete_dependencies), Nothing))
 precompile(include_package_for_output, (PkgId, String, Vector{String}, Vector{String}, Vector{String}, typeof(_concrete_dependencies), String))
 precompile(create_expr_cache, (PkgId, String, String, typeof(_concrete_dependencies), IO, IO))
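
A possible use of the new three-argument `precompile` is sketched below. It assumes a Julia build that includes this commit; the `hasmethod` guard is there so the snippet degrades gracefully on builds that lack the method. The function `twice` is a hypothetical example.

```julia
# Hypothetical function with a less specific and a more specific method.
twice(x::Integer) = 2 * x
twice(x::Int) = x + x

# Look up the less specific method, the one `invoke(twice, Tuple{Integer}, 1)` would call.
m = which(twice, (Integer,))

# Guard: only call the three-argument form if this build defines it.
if hasmethod(precompile, Tuple{typeof(twice), Tuple{DataType}, Method})
    # Compile `twice(::Integer)` specialized for an `Int` argument,
    # even though ordinary dispatch would choose `twice(::Int)`.
    ok = precompile(twice, (Int,), m)
    println("precompiled the Integer method for Int arguments: ", ok)
else
    println("this Julia build does not provide precompile(f, argtypes, m::Method)")
end
```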

10 comments on commit dd375e1

@aviatesk
Member

@nanosoldier runbenchmarks("inference", vs="@3b1c54d91fe8ed9965ba9dc4880530c714c3f82b")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk
Member

This commit seems to have introduced some regressions in the inference benchmarks. The regression in the `quadratic` example seems especially critical.

@aviatesk
Member

@Keno
Member

Keno commented on dd375e1 Sep 5, 2022

I may have fixed this in #46584

@aviatesk
Member

It seems that the very latest master is still as slow as this commit.

@Keno
Member

Keno commented on dd375e1 Sep 5, 2022

ok, thanks for checking

@vtjnash
Member

vtjnash commented on dd375e1 Sep 6, 2022

Profiling indicates (on 31d4c22):

   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   248 @Base/compiler/typeinfer.jl:276; _typeinf(interp::BaseBenchmarks.InferenceBenchmarks.Infere...
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    248 @Base/compiler/typeinfer.jl:563; store_backedges
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     247 @Base/compiler/typeinfer.jl:571; store_backedges(caller::Core.MethodInstance, edges::Vecto...
 31╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 237 /Users/jameson/julia1/src/gf.c:1514; ijl_method_instance_add_backedge
  6╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  6   /Users/jameson/julia1/src/method.c:0; get_next_edge
 16╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  16  /Users/jameson/julia1/src/method.c:798; get_next_edge
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  35  /Users/jameson/julia1/src/method.c:799; get_next_edge
 17╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   17  /Users/jameson/julia1/src/./julia.h:1006; jl_array_ptr_ref
 18╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   18  /Users/jameson/julia1/src/./julia.h:1008; jl_array_ptr_ref
135╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  135 /Users/jameson/julia1/src/method.c:800; get_next_edge
  5╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  5   /Users/jameson/julia1/src/method.c:802; get_next_edge
  9╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9   /Users/jameson/julia1/src/method.c:804; get_next_edge
  9╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9   /Users/jameson/julia1/src/gf.c:1517; ijl_method_instance_add_backedge
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 1   /Users/jameson/julia1/src/gf.c:1527; ijl_method_instance_add_backedge
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  1   ...s/jameson/julia1/src/./julia_locks.h:81; jl_mutex_unlock
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   1   /Users/jameson/julia1/src/threading.c:708; _jl_mutex_unlock
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    1   /Users/jameson/julia1/src/threading.c:691; _jl_mutex_unlock_nogc
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     1   /Users/jameson/julia1/src/threading.c:95; ijl_get_pgcstack
  1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 1   .../lib/system/libsystem_pthread.dylib:?; pthread_getspecific
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     1   @Base/compiler/typeinfer.jl:576; store_backedges(caller::Core.MethodInstance, edges::Vecto...
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 1   @Base/compiler/utilities.jl:278; iterate(iter::Core.Compiler.BackedgeIterator, i::Int64)
  1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  1   @Base/boot.jl:841; Pair
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     308 ...e/compiler/abstractinterpretation.jl:2620; typeinf_local(interp::BaseBenchmarks.InferenceBenchmarks...
  1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 1   @Base/compiler/typelattice.jl:0; stoverwrite1!(state::Vector{Core.Compiler.VarState}, cha...
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 236 @Base/compiler/typelattice.jl:590; stoverwrite1!(state::Vector{Core.Compiler.VarState}, cha...
 16╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  16  @Base/compiler/typelattice.jl:0; invalidate_slotwrapper
107╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  134 @Base/compiler/typelattice.jl:519; invalidate_slotwrapper
 27╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   27  @Base/compiler/typelattice.jl:203; ignorelimited(typ::Any)
  2╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  2   @Base/compiler/typelattice.jl:520; invalidate_slotwrapper
 84╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  84  @Base/essentials.jl:13; getindex
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 30  @Base/compiler/typelattice.jl:594; stoverwrite1!(state::Vector{Core.Compiler.VarState}, cha...
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  30  @Base/range.jl:885; iterate
 30╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   30  @Base/promotion.jl:499; ==
Total snapshots: 627. Utilization: 100% across all threads and tasks. Use the `groupby` kwarg to break down by thread and/or task

Looking in Cthulhu, no reason is given for why `Core.Compiler.ignorelimited` is very expensive to call and is uninferred (it just returns its argument). Checking this also crashes Cthulhu. I think it is possibly due to our inference-limiting heuristic, under which we stop inferring methods after the type reaches `Any`, unaware that this might thwart significant inlining optimization later.

Secondly, this method needs to add 10k backedges from
+(Int64, Int64) from +(T, T) where {T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8}}
to
quadratic(Int64) from quadratic(Any)
where this MethodInstance already has about 5k backedges,

and that check changed in this PR from a very cheap pointer comparison for each element in the loop to loading the type tag of each element of that array, which is very expensive.

@timholy
Member Author

Am I right to read @vtjnash's analysis as suggesting there is no obvious easy fix for this issue? We'd have to make the backedges list homogeneous in type but then come up with some other storage mechanism for handling `invoke`?

@aviatesk
Member

I think I have fixes for this regression locally. I will try to submit a PR for this.
