1.9 compatibility #1710
Sorting regression (a

    using CUDA
    function kernel()
        @cuda dynamic=true threads=Int32(1) blocks=Int64(1) identity(nothing)
        return
    end
    function main()
        @cuda kernel()
    end

Looks like there's some inference regression when splatting heterogeneous kwarg tuples. Further reduced to:

    child(; kwargs...) = return
    function parent()
        child(; a=1f0, b=1.0)
        return
    end

Further reduced to:

    using GPUCompiler
    child(; kwargs...) = return
    function parent()
        child(; a=1f0, b=1.0)
        return
    end
    # this override introduces a `jl_invoke`
    GPUCompiler.@override GPUCompiler.GLOBAL_METHOD_TABLE @noinline Core.throw_inexacterror(f::Symbol, ::Type{T}, val) where {T} =
        return
    module DummyRuntime
        # dummy methods
        signal_exception() = return
        malloc(sz) = C_NULL
        report_oom(sz) = return
        report_exception(ex) = return
        report_exception_name(ex) = return
        report_exception_frame(idx, func, file, line) = return
    end
    struct DummyCompilerParams <: AbstractCompilerParams end
    GPUCompiler.runtime_module(::CompilerJob{<:Any,DummyCompilerParams}) = DummyRuntime
    function main()
        source = FunctionSpec(typeof(parent))
        target = NativeCompilerTarget()
        params = DummyCompilerParams()
        job = CompilerJob(target, source, params)
        JuliaContext() do ctx
            string(GPUCompiler.compile(:llvm, job; ctx)[1])
        end
    end
    isinteractive() || main()

i.e. adding that overlay on
Now bisected to JuliaLang/julia#44224. @aviatesk, any quick thoughts? I'll also try to reduce this to a simpler AbsInt+overlay MWE. |
The shmem issue can be reproduced with:

    @inline shmem() = Base.llvmcall(("""
        @shmem = internal global [1 x i8] zeroinitializer, align 32
        define i8* @entry() #0 {
            ret i8* getelementptr inbounds ([1 x i8], [1 x i8]* @shmem, i64 0, i64 0)
        }
        attributes #0 = { alwaysinline }""", "entry"),
        Core.LLVMPtr{Int8,0}, Tuple{})
    function main()
        ptr1 = reinterpret(Ptr{Int8}, shmem())
        arr1 = unsafe_wrap(Array, ptr1, 1)
        ptr2 = reinterpret(Ptr{Int8}, shmem())
        arr2 = unsafe_wrap(Array, ptr2, 1)
        @inbounds begin
            arr1[] = 1
            arr2[]
        end
    end
    using InteractiveUtils
    @code_llvm debuginfo=:none dump_module=true main()
    @show main()

On 1.8, this yields two separate shmem variables:

    @shmem = internal global [1 x i8] zeroinitializer, align 32
    @shmem.5 = internal global [1 x i8] zeroinitializer, align 32
    ...
    %6 = call nonnull {}* inttoptr (i64 140193080728800 to {}* ({}*, i64, i64, i32)*)({}* inttoptr (i64 140192742526720 to {}*), i64 ptrtoint ([1 x i8]* @shmem to i64), i64 1, i32 0)
    %8 = call nonnull {}* inttoptr (i64 140193080728800 to {}* ({}*, i64, i64, i32)*)({}* inttoptr (i64 140192742526720 to {}*), i64 ptrtoint ([1 x i8]* @shmem.5 to i64), i64 1, i32 0)

While on 1.9:

    @shmem = internal global [1 x i8] zeroinitializer, align 32
    %6 = call nonnull {}* inttoptr (i64 140218959557040 to {}* ({}*, i64, i64, i32)*)({}* inttoptr (i64 140218633426224 to {}*), i64 ptrtoint ([1 x i8]* @shmem to i64), i64 1, i32 0)
    %8 = call nonnull {}* inttoptr (i64 140218959557040 to {}* ({}*, i64, i64, i32)*)({}* inttoptr (i64 140218633426224 to {}*), i64 ptrtoint ([1 x i8]* @shmem to i64), i64 1, i32 0)

Bisected to JuliaLang/julia#44440. cc @jpsamaroo @pchintalapudi |
@dkarrasch Can you chime in on the The problem is with |
What's the return type of 3-arg |
3-arg similar demotes back to (Cu)Array, while 2-arg preserves the structure:

    julia> a = CUDA.rand(10,10);
    julia> typeof(similar(Hermitian(a, :L), Float32, size(Hermitian(a, :L))))
    CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
    julia> typeof(similar(Hermitian(Array(a), :L), Float32, size(Hermitian(Array(a), :L))))
    Matrix{Float32} (alias for Array{Float32, 2})
    julia> typeof(similar(Hermitian(a, :L), Float32))
    Hermitian{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}
    julia> typeof(similar(Hermitian(Array(a), :L), Float32))
    Hermitian{Float32, Matrix{Float32}}

But this is, as demonstrated, similar to how Array works. |
Aha, I think it requires the same fix as in JuliaLinearAlgebra/BandedMatrices.jl#276. If you overload (potentially in a

    LinearAlgebra.cholcopy(A::RealHermSymComplexHerm{<:Any,<:CuArray}) =
        copyto!(similar(A, LinearAlgebra.choltype(A)), A)

does that fix the issue? The reason I generically switched to |
Yep, that seems to work, thanks! |
All tests work on the beta3 branch from JuliaLang/julia#48075. |
extern, Use plain llvmcall calling convention for WMMA intrinsics. #1709
Core.throw_inexacterror overlay: Avoid a couple of InexactErrors in the IdDict code. JuliaLang/julia#48116