Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

latency: fix cat invalidations #216

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

johnnychen94
Copy link

@johnnychen94 johnnychen94 commented Sep 26, 2023

This patch removes the invalidation on cat and thus reduces the OneHotArrays loading time from 4.5s to 0.5s (the normal status)

using OneHotArrays alone in a new Julia process needs about 0.57s. However, using LinearMaps before that makes the loading time of OneHotArrays to about 5s.

julia> @time using OneHotArrays # v0.2.4
  0.569807 seconds (539.89 k allocations: 36.725 MiB, 4.48% compilation time)
# in a new Julia process
# LinearMaps master branch
julia> @time using LinearMaps
  0.023610 seconds (6.77 k allocations: 501.328 KiB)

julia> @time using OneHotArrays
  5.038436 seconds (13.72 M allocations: 514.807 MiB, 2.32% gc time, 0.51% compilation time)

With SnoopCompile I've noticed LinearMaps invalidates the cat methods and OneHotArrays extends the cat methods:

julia> using SnoopCompileCore

julia> invalidations = @snoopr begin
           using LinearMaps
       end
174-element Vector{Any}:
   MethodInstance for Core.kwcall(::@NamedTuple{dims::Val{1}}, ::typeof(cat), ::Vector{Base.PkgId}, ::Vararg{Any})
   "invalidate_mt_cache"
   ...

This patch tweaks the cat method extension by dispatching on _cat instead of cat, and somehow it does work:

# also a new Julia process
# on this PR branch
julia> @time using LinearMaps
  0.012467 seconds (11.94 k allocations: 741.312 KiB)

julia> @time using OneHotArrays # 🎉 back to normal
  0.576323 seconds (539.17 k allocations: 36.551 MiB, 4.37% compilation time)

P.S. I figure this is related to JuliaLang/julia#45028 (because the Core.kwcall hint in the invalidations list) but I don't know why this works. Maybe @aviatesk or @timholy knows the magic behind?

Commenting out the Base.cat method in above link in OneHotArrays also reduces the latency, but I figure it's better to do it here since OneHotArrays looks more innocent while this packages code generates many cat methods in a seemingly wild way.
FWIW, if we reduce the number for k in 1:8 to something like for k in 1:4 the latency won't be that severe, too. As an experiment:

diff --git a/src/blockmap.jl b/src/blockmap.jl
index 72c2bb9..49ba024 100644
--- a/src/blockmap.jl
+++ b/src/blockmap.jl
@@ -497,7 +497,7 @@ BlockDiagonalMap(maps::LinearMap...) =

 # since the below methods are more specific than the Base method,
 # they would redefine Base/SparseArrays behavior
-for k in 1:8 # is 8 sufficient?
+for k in 1:16 # is 8 sufficient?
     Is = ntuple(n->:($(Symbol(:A, n))::AbstractVecOrMatOrQ), Val(k-1))
     # yields (:A1, :A2, :A3, ..., :A(k-1))
     L = :($(Symbol(:A, k))::LinearMap)

This makes using OneHotArrays takes forever.

This patch removes the invalidation on cat and thus reduces the OneHotArrays loading time from 4.5s to 0.5s (the normal status)
else
throw(ArgumentError("dims keyword in cat of LinearMaps must be (1,2)"))
@static if VERSION >= v"1.8"
# Dispatching on `cat` makes compiler hard to infer types and causes invalidations
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some random explaination; I figure this is related to the aggresive type inference introduced by JuliaLang/julia#45028 but I don't know what's actually happening behind the scene.

@codecov
Copy link

codecov bot commented Sep 26, 2023

Codecov Report

Attention: Patch coverage is 50.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 99.43%. Comparing base (c7a3c56) to head (0956320).

Current head 0956320 differs from pull request most recent head da9eded

Please upload reports for the commit da9eded to get more accurate results.

Files Patch % Lines
src/blockmap.jl 50.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #216      +/-   ##
==========================================
+ Coverage   99.30%   99.43%   +0.12%     
==========================================
  Files          22       22              
  Lines        1591     1595       +4     
==========================================
+ Hits         1580     1586       +6     
+ Misses         11        9       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dkarrasch
Copy link
Member

First I was afraid this would bring back #123 (see #124 for the offending method signature), but this works since you keep the method signature as is. Unfortunately, on my machine, this PR increases loading time from 0.05 seconds to about 7 seconds on v1.9!

@johnnychen94
Copy link
Author

Looks like this isn't the right fix. I'll try to see if there's any better solution then.

@dkarrasch
Copy link
Member

Interestingly, this seems to work well for Julia v1.10+. Unfortunately, load time of OneHotArrays.jl doubles from v1.10 to v1.11 and master even without any interference by LinearMaps.jl. But this PR increases load time of LinearMaps.jl by just a little bit, and decreases load time of OneHotArrays.jl after LinearMaps.jl dramatically (factor 8, roughly).

@dkarrasch
Copy link
Member

I like this, and I think we should push this even further: ride the generic call chain and catch somewhere much deeper, where there is almost no method competition. Doesn't work yet, but pushed to continue working on it from work tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants