Load from GC frame preventing vectorization #13301

Closed
yuyichao opened this issue Sep 24, 2015 · 4 comments · Fixed by #13463
Labels: compiler:codegen (Generation of LLVM IR and native code), performance (Must go faster), regression (Regression in behavior compared to a previous version)

Comments

@yuyichao
Contributor

The following code (with T = Float32) fails to vectorize on current master, although it vectorizes on 0.4-release:

function test_scale2{T}(nele, factor::T)
    ary1 = Vector{T}(nele)
    @inbounds @simd for i in 1:nele
        v = ary1[i] + factor
        ary1[i] = v
    end
end

The function vectorizes successfully if the allocation of the array is moved out of the function.
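
A minimal sketch of that variant, with the allocation moved to the caller. It is written in current Julia syntax (`where {T}`) rather than the 0.4-era syntax of the report, and the name `test_scale2_arg!` is hypothetical, not taken from the original code.

```julia
# Hypothetical variant: the caller allocates the array, so the loop
# works on a function argument instead of a freshly allocated local.
function test_scale2_arg!(ary1::Vector{T}, factor::T) where {T}
    @inbounds @simd for i in 1:length(ary1)
        v = ary1[i] + factor
        ary1[i] = v
    end
    return ary1
end

# Usage: allocate outside, then scale in place.
ary = zeros(Float32, 16)
test_scale2_arg!(ary, 2.0f0)
```

Whether this form actually vectorizes depends on the Julia and LLVM versions in use, so it is worth checking the generated IR rather than assuming it.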

According to code_llvm_raw, there is a load in the loop that doesn't have a TBAA node attached.

AFAICT, that line is the load of the array from the GC frame. So,

  1. We can probably have a TBAA node for the GC frame, which should help this case and probably improve other code in general.
  2. Do we really ever want to load from the GC frame? Its slots are leaked to the global state immediately, so LLVM can never optimize the loads away, yet nothing outside the current function will actually change them.
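
One way to look for the untagged load described above is to dump the IR as a string and search it. The sketch below assumes a current Julia release, where `code_llvm` from `InteractiveUtils` accepts a `raw` keyword that keeps metadata such as `!tbaa` in the output (the `code_llvm_raw` helper named in this report is from the 0.4/0.5 era). The vector-type regex is only a rough heuristic, and the result of either check depends on the Julia and LLVM versions.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Same loop as in the report, rewritten in current syntax
# (`where {T}` and an `undef`-initialized array).
function test_scale2(nele, factor::T) where {T}
    ary1 = Vector{T}(undef, nele)
    @inbounds @simd for i in 1:nele
        v = ary1[i] + factor
        ary1[i] = v
    end
end

# Capture the IR as a string; raw=true keeps metadata annotations.
ir = sprint(io -> code_llvm(io, test_scale2, (Int, Float32); raw=true))

# Heuristic checks: any LLVM vector-of-float type, and any TBAA tags at all.
has_vector_float = occursin(r"<\d+ x float>", ir)
has_tbaa = occursin("!tbaa", ir)
println("vector float ops: ", has_vector_float, ", TBAA tags present: ", has_tbaa)
```

To find a specific load without a tag, the `ir` string can be split into lines and filtered for `load` instructions that do not contain `!tbaa`.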

@vtjnash @simonster

@yuyichao added the compiler:codegen (Generation of LLVM IR and native code) label Sep 24, 2015
@simonster added the performance (Must go faster) and regression (Regression in behavior compared to a previous version) labels Sep 24, 2015
@simonster
Member

Conceivably we could bracket stores to the GC frame in llvm.invariant.end/llvm.invariant.start to tell LLVM that nothing else will ever modify it, even other functions. Not sure if LLVM will really use these though.

@yuyichao
Contributor Author

Will that work if we have multiple stores to it? (Allocation in a loop or reuse of slots)

@simonster
Member

I think so. The idea is to tell LLVM that the GC frame stops being invariant before a store and starts being invariant again afterwards. Might still be useful sometimes to have TBAA for the store so that LLVM knows it doesn't alias other loads. But as I noted in #8867 (comment) it's not clear if LLVM can actually make use of this metadata...

@JeffBezanson
Sponsor Member

I see vector instructions in this example now. Fixed?


3 participants