-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace several intrinsics with Julia equivalents #22202
Conversation
@nanosoldier |
base/math.jl
Outdated
@@ -435,8 +435,14 @@ Compute sine and cosine of `x`, where `x` is in radians. | |||
return res | |||
end | |||
|
|||
sqrt(x::Float64) = sqrt_llvm(x) | |||
sqrt(x::Float32) = sqrt_llvm(x) | |||
@inline function sqrt(x::Float64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use dispatch for the f32 and f64 version?
src/julia.h
Outdated
@@ -521,7 +521,6 @@ extern JL_DLLEXPORT jl_value_t *jl_stackovf_exception; | |||
extern JL_DLLEXPORT jl_value_t *jl_memory_exception; | |||
extern JL_DLLEXPORT jl_value_t *jl_readonlymemory_exception; | |||
extern JL_DLLEXPORT jl_value_t *jl_diverror_exception; | |||
extern JL_DLLEXPORT jl_value_t *jl_domain_exception; | |||
extern JL_DLLEXPORT jl_value_t *jl_overflow_exception; | |||
extern JL_DLLEXPORT jl_value_t *jl_inexact_exception; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I fully understand, but if nothing else throws inexact exceptions from C can this be removed (like for domain_exception
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, looks like you're right. There are no uses of cvt_iintrinsic_checked
after this, so even though it looks like there is another use of jl_inexact_exception
in runtime_intrinsics, it seems to be dead code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already have that removed in the next PR, but I might as well move that to this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of that next PR, I'm noticing that the elegant mechanism here doesn't seem to work as I'm suspecting it was intended:
be1(a, i) = (if i > 1 throw(BoundsError(a, i)) end; @inbounds ret = a[1]; ret)
be2(a, i) = (if i > 1 Base.throw_boundserror(a, i) end; @inbounds ret = a[1]; ret)
a = rand(5)
julia> @code_llvm be1(a, 2)
define double @julia_be1_62876(i8** dereferenceable(40), i64) #0 !dbg !5 {
top:
%ptls_i8 = call i8* asm "movq %fs:0, $0;\0Aaddq $$-10928, $0", "=r,~{dirflag},~{fpsr},~{flags}"() #3
%ptls = bitcast i8* %ptls_i8 to i8****
%2 = alloca [3 x i8**], align 8
%.sub = getelementptr inbounds [3 x i8**], [3 x i8**]* %2, i64 0, i64 0
%3 = getelementptr [3 x i8**], [3 x i8**]* %2, i64 0, i64 2
store i8** null, i8*** %3, align 8
%4 = bitcast [3 x i8**]* %2 to i64*
store i64 2, i64* %4, align 8
%5 = getelementptr [3 x i8**], [3 x i8**]* %2, i64 0, i64 1
%6 = bitcast i8* %ptls_i8 to i64*
%7 = load i64, i64* %6, align 8
%8 = bitcast i8*** %5 to i64*
store i64 %7, i64* %8, align 8
store i8*** %.sub, i8**** %ptls, align 8
%9 = icmp slt i64 %1, 2
br i1 %9, label %L5, label %if
if: ; preds = %top
%10 = call i8** @jlsys_Type_59491(i8** inttoptr (i64 140329698287600 to i8**), i8** nonnull %0, i64 %1)
store i8** %10, i8*** %3, align 8
call void @jl_throw(i8** %10)
unreachable
L5: ; preds = %top
%11 = bitcast i8** %0 to double**
%12 = load double*, double** %11, align 8
%13 = load double, double* %12, align 8
%14 = load i64, i64* %8, align 8
store i64 %14, i64* %6, align 8
ret double %13
}
julia> @code_llvm be2(a, 2)
define double @julia_be2_62879(i8** dereferenceable(40), i64) #0 !dbg !5 {
top:
%2 = icmp slt i64 %1, 2
br i1 %2, label %L5, label %if
if: ; preds = %top
call void @julia_throw_boundserror_62880(i8** nonnull %0, i64 %1)
call void @llvm.trap()
unreachable
L5: ; preds = %top
%3 = bitcast i8** %0 to double**
%4 = load double*, double** %3, align 8
%5 = load double, double* %4, align 8
ret double %5
}
Obviously I can introduce throw_inexacterror
if necessary.
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
base/math.jl
Outdated
sqrt_llvm(x) | ||
end | ||
@inline function sqrt(x::Float32) | ||
x < 0.0f0 && throw(DomainError()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to just use x < zero(x)
and combine the two methods?
c3aa2b8
to
9d0d30a
Compare
So, some of those regressions do not seem to be reproducible, but others are real; one of the most affected is x = 1+2im
y = UInt(1)+UInt(2)im
myplus(x, y) = x+y If you inspect I'm not exactly sure how to handle this. It seems that the right solution is to improve the inliner (e.g., make it depend on LLVM IR rather than julia expressions), but that also seems like a big job. |
9d0d30a
to
36eb30e
Compare
Rebased on top of #22210, at least that specific problem is resolved. Time for another @nanosoldier |
36eb30e
to
d5ec1c6
Compare
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Looks good. The only two of these that are reproducible are |
The only uses of the function were removed in #22202, so building with Clang previously emitted an unused function warning.
The only uses of the function were removed in #22202, so building with Clang previously emitted an unused function warning.
While thinking about #18521/#20005 a bit more, I realized that perhaps an easier approach would be to ensure that every creation of certain exception types occurred from Julia code. To cover both
InexactError
andDomainError
(probably our two most irritating errors), that required Julia replacements for 4 intrinsics.I've checked that the LLVM code generation is identical, but it's probably a good idea to @nanosoldier
runbenchmarks((ALL, vs = ":master")
.