I guess (and did a very quick search) LLVM does not have a mod intrinsic, otherwise I would expect it to do this optimization.
mod(i::Integer, 2^n) could be rewritten as something like i & ((1 << n) - 1). I don't know how to benchmark small functions, but here's a wonky benchmark:
f(i) = sin(mod(i, 4))
mod4(i) = 0b11 & i
g(i) = sin(mod4(i))
function facc(n)
    a = 0.0
    for i = 1:n
        a += f(i)
    end
    return a
end
function gacc(n)
    a = 0.0
    for i = 1:n
        a += g(i)
    end
    return a
end
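The mask rewrite can be checked against mod directly. This is only a minimal sketch: mod_pow2 is a hypothetical helper name, not something from Base or from this thread, and it assumes n is small enough that neither the shift nor 2^n overflows the integer type.

```julia
# Hypothetical helper: mod by 2^n using a bit mask instead of mod.
# Assumption: n is small enough that (1 << n) and 2^n do not overflow T.
mod_pow2(i::T, n::Integer) where {T<:Integer} = i & ((one(T) << n) - one(T))

# Compare with Base.mod over negative and positive inputs.
# For a positive modulus, mod returns a non-negative result, which is what
# the two's-complement mask produces as well.
for n in 0:10, i in -1000:1000
    @assert mod_pow2(i, n) == mod(i, 2^n)
end
```

With n = 2 the mask is 0b11, which is the mod4 used in the benchmark above.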
I believe the main cause of the slowdown in the OP is that inlining fails. The mod produces slightly less efficient code than the bitwise and, but that does not matter compared to the slow sin.
The problem is that, during cost estimation for inlining, mod is assumed to use a division instruction. In the end, of course, the LLVM backend doesn't generate a divide instruction (on x86 systems).
Cost estimation is a difficult problem on modern CPUs. The case in the OP should be resolvable with @inline, so I don't think it's essential to fix. I think we need tips about inlining for non-developers. I plan to implement a hints feature in the CodeCosts.jl package, but I haven't started yet.
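As a rough illustration of the @inline suggestion, here is the mod-based function from the OP with an inlining hint added. The names f_inl and facc_inl are made up for this sketch, and whether the hint actually closes the gap to the bitwise version would still have to be measured on a given Julia version.

```julia
# Same body as f in the OP, with an explicit inlining hint for the compiler.
@inline f_inl(i) = sin(mod(i, 4))

# Accumulation loop with the same shape as facc, calling the hinted function.
function facc_inl(n)
    a = 0.0
    for i = 1:n
        a += f_inl(i)
    end
    return a
end
```

The hint only changes how the call is compiled; the result should match facc for the same n.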