remove gridcontent type parameterization #30
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master      #30      +/-  ##
==========================================
+ Coverage   86.89%   91.76%   +4.86%
==========================================
  Files           7        6       -1
  Lines        1351     1312      -39
==========================================
+ Hits         1174     1204      +30
+ Misses        177      108      -69
```
Continue to review full report at Codecov.
@timholy I've been trying to incorporate most of the changes from your older PR (I made some other changes in the meantime that I needed to work around, and I also really wanted to understand everything that was going on), but now I'm a bit surprised at the output of this snippet:

```julia
using SnoopCompile
using GridLayoutBase

tinf = @snoopi_deep begin
    gl = GridLayout()
    gl2 = GridLayout()
    gl[1, 1] = gl2
    nothing
end
```

It seems that there are large flames that are not red, and I thought I'd specifically addressed them in the precompilation file. Why are these lines still being inferred again? Is there a trick to it that I'm not getting?

Somehow all my "improvements" didn't seem to have any effect on timings. I have a little benchmark package which just compares runs across commits/branches, and I compared master to my branch for this code:

```julia
# using GridLayoutBase
@timed using GridLayoutBase

# GridLayout constructor
@timed GridLayout()

# Big GridLayout
@timed let
    gl = GridLayout()
    for i in 1:10, j in 1:10, k in 1:10
        gl[i, j] = GridLayout()
    end
end
```

Here's the output; the optimizations branch is not really faster (here it might look like 10ms for

If you have time, I'd be glad for any enlightening thoughts you might have on this :)
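For reference, the usual workflow for turning such `@snoopi_deep` results into a precompilation file is SnoopCompile's `parcel`/`write` pair. This is a minimal sketch, not the actual setup used in this PR; the workload and the output path are illustrative:

```julia
# Sketch: generate precompile directives from an inference profile.
# The workload below is illustrative, not the benchmark from this PR.
using SnoopCompile, GridLayoutBase

tinf = @snoopi_deep begin
    gl = GridLayout()
    gl[1, 1] = GridLayout()
end

# Group the inference results by the module that owns each method...
ttot, pcs = SnoopCompile.parcel(tinf)

# ...and emit one precompile_PackageName.jl file per module.
SnoopCompile.write("/tmp/precompiles_GridLayoutBase", pcs)
```

The emitted `precompile(...)` directives for the package's own module can then be `include`d from its source tree.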
Thanks for picking this up! To explain the flamegraphs: unless the external (Base) code is inlined into your functions, you can't save the inference results. Julia only saves inference results for module-owned methods. On JuliaLang/julia#43990, here's what the flamegraph looks like: 😄

On that PR I get

```julia
julia> tinf
InferenceTimingNode: 1.951009/1.951067 on Core.Compiler.Timings.ROOT() with 1 direct children
```

whereas on Julia 1.6 I get

```julia
julia> tinf
InferenceTimingNode: 2.413862/2.990975 on Core.Compiler.Timings.ROOT() with 71 direct children
```
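One common way to work with the "module-owned methods only" restriction is to give the workload an entry point that the precompiling package itself owns, so the cached inference hangs off a module-owned root. A hypothetical sketch (the module and function names here are invented for illustration, not taken from GridLayoutBase's sources):

```julia
# Hypothetical sketch: *.ji files only cache inference results for methods
# owned by the precompiling module, so we define and run a module-owned
# wrapper at precompile time. `_layout_workload` is an invented name.
module PrecompileSketch

using GridLayoutBase

function _layout_workload()
    gl = GridLayout()
    gl[1, 1] = GridLayout()
    return nothing
end

# Running this during precompilation forces inference of the wrapper
# (and reachable, non-inlined callees) so the results land in the cache.
_layout_workload()

end # module
```

Inference results for Base methods that get inlined into `_layout_workload` come along for free; non-inlined external code may still be re-inferred on load, which is what the JuliaLang/julia PR above addresses.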
Ok cool, so in your new branch basically no additional inference has to run (everything is precompiled and saved)? I still don't understand, though, why something like
Yep.

It immediately demands some Base methods, and those bars are stacked right on top of

But you're right that the presence of

```julia
julia> using GridLayoutBase, MethodAnalysis

julia> mis = methodinstances(GridLayoutBase.compute_rowcols)
1-element Vector{Core.MethodInstance}:
 MethodInstance for compute_rowcols(::GridLayout, ::GeometryBasics.HyperRectangle{2, Float32})

julia> mi = first(mis)
MethodInstance for compute_rowcols(::GridLayout, ::GeometryBasics.HyperRectangle{2, Float32})

julia> mi.backedges
1-element Vector{Any}:
 MethodInstance for align_to_bbox!(::GridLayout, ::GeometryBasics.HyperRectangle{2, Float32})

julia> ci = mi.cache
Core.CodeInstance(MethodInstance for compute_rowcols(::GridLayout, ::GeometryBasics.HyperRectangle{2, Float32}), #undef, 0x00000000000073d6, 0xffffffffffffffff, Tuple{GridLayoutBase.RowCols{Vector{Float32}}, GridLayoutBase.RowCols{Vector{Float32}}}, #undef, nothing, false, false, Ptr{Nothing} @0x0000000000000000, Ptr{Nothing} @0x0000000000000000)

julia> ci.inferred # where did the inferred code go?

julia> ci.invoke # because we loaded this from a *.ji file, we also don't have a useful pointer for the native code
Ptr{Nothing} @0x0000000000000000

julia> begin
           gl = GridLayout()
           gl2 = GridLayout()
           gl[1, 1] = gl2
           nothing
       end

julia> ci.inferred # stripped again!

julia> ci.invoke # but we at least have the native code
Ptr{Nothing} @0x00007fa6e6840d00
```

One of the changes in my PR is to stop deleting the inferred code if you're doing precompilation; see the change in my PR to
Interesting, I didn't know about this behavior! But it looks like, inference-wise, this package will be in good shape once your changes have landed in Julia. So I don't have to spend more time on that, although some algorithmic improvements might decrease general compilation duration. What is the current best practice for generating a list of compilation time per function? If I understood it correctly,
There's also
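For the per-function breakdown asked about above, SnoopCompile's `flatten` and `accumulate_by_source` are the usual tools; a minimal sketch (the workload is illustrative, not the PR's benchmark):

```julia
# Sketch: list inference time per function from a @snoopi_deep profile.
using SnoopCompile, GridLayoutBase

tinf = @snoopi_deep begin
    gl = GridLayout()
    gl[1, 1] = GridLayout()
end

# One InferenceTiming entry per inferred MethodInstance, sorted by time.
timings = flatten(tinf)

# Aggregate all instances of the same method, so you see cost per function
# rather than per specialization; the most expensive entries come last.
accumulate_by_source(timings)
```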
Thanks for tackling this! Lower priority, but it still might be worth making
I had tried those changes out, and they didn't seem to have much effect on timings. Because those changes were breaking, I reverted them, figuring it was not worth the effort.
That's reasonable!