Replace several intrinsics with Julia equivalents #22202

timholy · 2017-06-03T13:41:16Z

While thinking about #18521/#20005 a bit more, I realized that perhaps an easier approach would be to ensure that every creation of certain exception types occurred from Julia code. To cover both InexactError and DomainError (probably our two most irritating errors), that required Julia replacements for 4 intrinsics.

I've checked that the LLVM code generation is identical, but it's probably a good idea to run @nanosoldier runbenchmarks(ALL, vs = ":master").

timholy · 2017-06-03T13:43:10Z

@nanosoldier runbenchmarks(ALL, vs = ":master")

musm · 2017-06-03T15:54:31Z

base/math.jl

@@ -435,8 +435,14 @@ Compute sine and cosine of `x`, where `x` is in radians.
    return res
 end

-sqrt(x::Float64) = sqrt_llvm(x)
-sqrt(x::Float32) = sqrt_llvm(x)
+@inline function sqrt(x::Float64)


Why not use dispatch for the Float32 and Float64 versions?

jebej · 2017-06-03T15:57:56Z

src/julia.h

@@ -521,7 +521,6 @@ extern JL_DLLEXPORT jl_value_t *jl_stackovf_exception;
 extern JL_DLLEXPORT jl_value_t *jl_memory_exception;
 extern JL_DLLEXPORT jl_value_t *jl_readonlymemory_exception;
 extern JL_DLLEXPORT jl_value_t *jl_diverror_exception;
-extern JL_DLLEXPORT jl_value_t *jl_domain_exception;
 extern JL_DLLEXPORT jl_value_t *jl_overflow_exception;
 extern JL_DLLEXPORT jl_value_t *jl_inexact_exception;


I'm not sure I fully understand, but if nothing else throws inexact exceptions from C can this be removed (like for domain_exception)?

Yes, looks like you're right. There are no uses of cvt_iintrinsic_checked after this, so even though it looks like there is another use of jl_inexact_exception in runtime_intrinsics, it seems to be dead code.

I already have that removed in the next PR, but I might as well move that to this one.

Speaking of that next PR, I'm noticing that the elegant mechanism here doesn't seem to work the way I suspect it was intended to:

be1(a, i) = (if i > 1 throw(BoundsError(a, i)) end; @inbounds ret = a[1]; ret)
be2(a, i) = (if i > 1 Base.throw_boundserror(a, i) end; @inbounds ret = a[1]; ret)
a = rand(5)

julia> @code_llvm be1(a, 2)

define double @julia_be1_62876(i8** dereferenceable(40), i64) #0 !dbg !5 {
top:
  %ptls_i8 = call i8* asm "movq %fs:0, $0;\0Aaddq $$-10928, $0", "=r,~{dirflag},~{fpsr},~{flags}"() #3
  %ptls = bitcast i8* %ptls_i8 to i8****
  %2 = alloca [3 x i8**], align 8
  %.sub = getelementptr inbounds [3 x i8**], [3 x i8**]* %2, i64 0, i64 0
  %3 = getelementptr [3 x i8**], [3 x i8**]* %2, i64 0, i64 2
  store i8** null, i8*** %3, align 8
  %4 = bitcast [3 x i8**]* %2 to i64*
  store i64 2, i64* %4, align 8
  %5 = getelementptr [3 x i8**], [3 x i8**]* %2, i64 0, i64 1
  %6 = bitcast i8* %ptls_i8 to i64*
  %7 = load i64, i64* %6, align 8
  %8 = bitcast i8*** %5 to i64*
  store i64 %7, i64* %8, align 8
  store i8*** %.sub, i8**** %ptls, align 8
  %9 = icmp slt i64 %1, 2
  br i1 %9, label %L5, label %if

if:                                               ; preds = %top
  %10 = call i8** @jlsys_Type_59491(i8** inttoptr (i64 140329698287600 to i8**), i8** nonnull %0, i64 %1)
  store i8** %10, i8*** %3, align 8
  call void @jl_throw(i8** %10)
  unreachable

L5:                                               ; preds = %top
  %11 = bitcast i8** %0 to double**
  %12 = load double*, double** %11, align 8
  %13 = load double, double* %12, align 8
  %14 = load i64, i64* %8, align 8
  store i64 %14, i64* %6, align 8
  ret double %13
}

julia> @code_llvm be2(a, 2)

define double @julia_be2_62879(i8** dereferenceable(40), i64) #0 !dbg !5 {
top:
  %2 = icmp slt i64 %1, 2
  br i1 %2, label %L5, label %if

if:                                               ; preds = %top
  call void @julia_throw_boundserror_62880(i8** nonnull %0, i64 %1)
  call void @llvm.trap()
  unreachable

L5:                                               ; preds = %top
  %3 = bitcast i8** %0 to double**
  %4 = load double*, double** %3, align 8
  %5 = load double, double* %4, align 8
  ret double %5
}

Obviously I can introduce throw_inexacterror if necessary.
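
The gap between be1 and be2 in the listing above comes down to where the exception object gets built: be1 constructs it inline and so needs a GC frame on the hot path, while be2 hands the work to a separate function. A minimal sketch of that same pattern for InexactError might look like the following. It assumes Julia 1.x, where InexactError takes a function name, a target type, and the offending value; throw_inexact_sketch and to_uint8_sketch are hypothetical names for illustration, not functions from this PR.

```julia
# Hypothetical helper: @noinline keeps the allocation of the exception
# object out of the caller's body, so the caller's fast path stays lean.
@noinline function throw_inexact_sketch(name::Symbol, ::Type{T}, x) where {T}
    throw(InexactError(name, T, x))
end

# Hypothetical checked narrowing conversion written in plain Julia.
function to_uint8_sketch(x::Int)
    # Range check in Julia code; the error path is a single call.
    (0 <= x <= 255) || throw_inexact_sketch(:to_uint8_sketch, UInt8, x)
    # `x % UInt8` truncates to the low 8 bits, which is exact after the check.
    return x % UInt8
end
```

Comparing `@code_llvm to_uint8_sketch(5)` against a variant that calls `throw(InexactError(...))` directly is one way to check whether the helper makes a difference on a given Julia version.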

nanosoldier · 2017-06-03T16:39:56Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

ararslan · 2017-06-03T18:26:43Z

base/math.jl

+    sqrt_llvm(x)
+end
+@inline function sqrt(x::Float32)
+    x < 0.0f0 && throw(DomainError())


Would it be possible to just use x < zero(x) and combine the two methods?
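
For reference, the combined form suggested here could be sketched roughly as below. This is an illustration only, assuming Julia 1.x (where DomainError takes the offending value, unlike the zero-argument DomainError() in the diff); sqrt_checked is a hypothetical name chosen so that Base.sqrt is left untouched.

```julia
# Hypothetical combined method covering both float widths.
@inline function sqrt_checked(x::Union{Float32,Float64})
    # zero(x) has the same type as x, so one body serves Float32 and Float64.
    x < zero(x) && throw(DomainError(x, "sqrt_checked requires a non-negative argument"))
    # The intrinsic itself performs no domain check; that now lives in Julia code.
    return Core.Intrinsics.sqrt_llvm(x)
end
```

Since the compiler specializes on the concrete argument type, the Union signature should not by itself cost anything at run time compared with two separate methods.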

timholy · 2017-06-03T22:30:00Z

So, some of those regressions do not seem to be reproducible, but others are real; one of the most affected is

x = 1+2im
y = UInt(1)+UInt(2)im

myplus(x, y) = x+y

If you inspect @code_llvm x+y, you get identical code on this branch and on master; but @code_llvm myplus(x,y) fails to inline + on this branch (but not on master). This seems to arise because our inline-worthiness algorithm counts expressions, and an intrinsic that generates the same LLVM as Julia code nevertheless "looks" simpler by hiding the instructions from the inliner. So while I can force-inline the convert operation itself, things that use the convert will be more heavily penalized.

I'm not exactly sure how to handle this. It seems that the right solution is to improve the inliner (e.g., make it depend on LLVM IR rather than julia expressions), but that also seems like a big job.

timholy · 2017-07-08T08:29:02Z

Rebased on top of #22210; at least that specific problem is resolved.

Time for another @nanosoldier runbenchmarks(ALL, vs = ":master")

nanosoldier · 2017-07-08T08:53:07Z

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @jrevels

timholy · 2017-07-08T09:20:35Z

@nanosoldier runbenchmarks(ALL, vs = ":master")

nanosoldier · 2017-07-08T12:20:26Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

timholy · 2017-07-08T14:08:16Z

Looks good. The only two of these that are reproducible are ["io","read","readstring"] and ["scalar","intfuncs",("nextpow2","Int64","-")], and locally these are ~10% effects. They are due to very small differences in cost accounting ending up on one side or the other of the inlining threshold (e.g., cost 92 vs 105). If we really cared we could bump the threshold a bit.

JeffBezanson approved these changes Jun 3, 2017

musm reviewed Jun 3, 2017

jebej reviewed Jun 3, 2017

ararslan reviewed Jun 3, 2017

ararslan added the maths (Mathematical functions) and performance (Must go faster) labels Jun 3, 2017

timholy force-pushed the teh/purge_intrinsics branch from c3aa2b8 to 9d0d30a June 3, 2017 22:13

timholy mentioned this pull request Jun 4, 2017

Decide inline-worthiness based on a more nuanced cost model #22210

Merged

timholy mentioned this pull request Jun 16, 2017

Should DomainErrors accept arguments? #12152

Closed

timholy force-pushed the teh/purge_intrinsics branch from 9d0d30a to 36eb30e July 8, 2017 08:27

timholy added 3 commits July 8, 2017 04:01

Replace check_top_bit intrinsic with Julia code

3f588ce

Eliminate sqrt_llvm_fast but move DomainError to Julia

914c446

Convert checked_trunc_[su]int intrinsics to Julia code

d5ec1c6

timholy force-pushed the teh/purge_intrinsics branch from 36eb30e to d5ec1c6 July 8, 2017 09:02

timholy merged commit 0c9a858 into master Jul 8, 2017

timholy deleted the teh/purge_intrinsics branch July 8, 2017 14:08

ararslan added a commit that referenced this pull request Jul 8, 2017

Remove the unused all_eq function in runtime_intrinsics

0f818c8

The only uses of the function were removed in #22202, so building with Clang previously emitted an unused function warning.

ararslan mentioned this pull request Jul 8, 2017

Remove the unused all_eq function in runtime_intrinsics #22714

Merged

ararslan added a commit that referenced this pull request Jul 9, 2017

Remove the unused all_eq function in runtime_intrinsics

047a688

The only uses of the function were removed in #22202, so building with Clang previously emitted an unused function warning.

timholy mentioned this pull request Jul 11, 2017

RGB(2.0,0.0,0.0) should throw an error JuliaGraphics/ColorTypes.jl#85

Closed
