
Make DemoteFloat16 a conditional pass #43327

Merged — 10 commits into JuliaLang:master, Nov 21, 2022
Conversation

@gbaraldi (Member) commented Dec 3, 2021

Attempt at #40216.
For now it's just an if statement, which might be enough. I wasn't sure whether the check should live inside the pass or whether the pass itself should be conditional; for now it's outside the pass.
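For reference, a minimal sketch of what such a conditional pass registration could look like (an illustration with assumed names, not the PR's actual diff; createDemoteFloat16Pass and the +fullfp16 feature key follow Julia's existing legacy-PM pipeline code):

auto FS = TM->getTargetFeatureString();
// Only schedule the demotion pass when the target lacks native
// Float16 arithmetic; targets with +fullfp16 keep their half ops.
if (FS.find("+fullfp16") == llvm::StringRef::npos)
    PM->add(createDemoteFloat16Pass());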

julia> a = Float16(1)
Float16(1.0)

julia> b = Float16(2)
Float16(2.0)

julia> f(a,b) = a + b
f (generic function with 1 method)

# before
julia> @code_llvm f(a,b)
;  @ REPL[4]:1 within `f`
define half @julia_f_127(half %0, half %1) #0 {
top:
; ┌ @ float.jl:397 within `+`
   %2 = fpext half %0 to float
   %3 = fpext half %1 to float
   %4 = fadd float %2, %3
   %5 = fptrunc float %4 to half
; └
  ret half %5
}
julia> @btime sum(x) setup = x = rand(Float16,100000)
  56.083 μs (0 allocations: 0 bytes)
Float16(5.002e4)

# after
julia> @code_llvm f(a,b)
;  @ REPL[5]:1 within `f`
define half @julia_f_155(half %0, half %1) #0 {
top:
; ┌ @ float.jl:397 within `+`
   %2 = fadd half %0, %1
; └
  ret half %2
}
julia> @btime sum(x) setup = x = rand(Float16,100000)
  4.018 μs (0 allocations: 0 bytes)
Float16(4.992e4)
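For intuition, the "before" IR above is exactly what the demotion rewrite produces: each half operation is widened to float and truncated back. An equivalent C++ illustration (not the pass itself; _Float16 support is compiler- and target-dependent):

// Matches the fpext / fadd / fptrunc sequence in the "before" IR.
_Float16 demoted_add(_Float16 a, _Float16 b) {
    return (_Float16)((float)a + (float)b);
}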

For this to work on the M1, #41924 needs to be merged first.

Fix #40216.

@gbaraldi changed the title from "Make DemoteFloat16 an option pass" to "Make DemoteFloat16 a conditional pass" Dec 3, 2021
@vchuravy requested a review from maleadt December 5, 2021 17:17
@vchuravy (Member) commented Dec 5, 2021

This needs to integrate with multiversioning so that we can create a sysimage containing both enabled and disabled variants, with the right one loaded conditionally; see #40216 (comment).

@gbaraldi (Member, Author) commented Dec 6, 2021

The multiversioning stuff is a bit out of my depth, so what I did was mostly pattern-matching on the existing code.

src/aotcompile.cpp (outdated review thread)
@gbaraldi (Member, Author) commented:

Bump :)

@oscardssmith added the float16 and forget me not (PRs that one wants to make sure aren't forgotten) labels Jan 20, 2022
if (optlevel > 1)
    PM->add(createGVNPass());
auto feat_string = TM->getTargetFeatureString();
// Run DemoteFloat16 only when the target lacks native FP16 arithmetic.
if (feat_string.find("+fp16fml") == llvm::StringRef::npos || feat_string.find("+fullfp16") == llvm::StringRef::npos) {
Member:

So instead of doing this check here, I think we need to do it on a per-function basis, within the pass.

The way this works is that the multiversioning pass will clone any function that contains Float16 ops, leaving two copies: one with the target feature set and one without. On the copy lacking the target feature we still want to run the DemoteFloat16 pass.

For the JIT context there is likely a similar issue that @vtjnash, @JeffBezanson and I discussed yesterday; see #43085 (comment) for some more context.
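A rough sketch of what the per-function variant of the check could look like (an assumption about the shape, not the merged code; it keys off the "target-features" function attribute that multiversioning clones carry):

#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Function.h"

// Demote Float16 in this function unless its clone was granted
// native FP16 arithmetic via the "target-features" attribute.
static bool demoteFloat16(const llvm::Function &F) {
    llvm::Attribute A = F.getFnAttribute("target-features");
    if (!A.isStringAttribute())
        return true; // no per-function features: conservatively demote
    return !A.getValueAsString().contains("+fullfp16");
}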

@gbaraldi (Member, Author):

I moved the check inside the pass, but I don't think the way it's done is ideal.

@vtjnash (Sponsor Member) commented Sep 15, 2022:

The multiversioning check part seems correct (per-function). The issue is that jl_ExecutionEngine->getTargetFeatureString can right now be looking at the wrong target for many people (e.g. GPUs, static compilation, sysimage building), which is what https://reviews.llvm.org/D120585 was hoping to solve. (edit: fixed link)

@gbaraldi (Member, Author) commented Sep 15, 2022:

Yeah, I just did it similarly to how we do FMA. Could we get the feature flags from TTI, or does that need your change?
Also, it seems the backend already does what DemoteFloat16 would do, but I guess it would then miss some possible optimizations.

@vtjnash (Sponsor Member) commented Sep 15, 2022:

Right now, only builtin LLVM passes are allowed to access that data; it is forbidden to external passes. That PR would make it available to all passes, but it is possible that the LLVM devs will conceptually dislike the idea of fixing this bug.

@gbaraldi (Member, Author):

I guess the "correct" thing would be to add a hasFloat16 function or something like it. But that sounds like a very roundabout way to solve something that seems quite obvious.

@gbaraldi (Member, Author):

Do you have any other idea of how this could be done? Also, isn't it an issue if the TargetMachine disagrees with the TargetFeatureString? What does machine-code generation use as the source of truth?

@vchuravy added this to the 1.9 milestone Aug 2, 2022
@gbaraldi requested a review from maleadt August 10, 2022 18:30
@maleadt (Member) left a comment:

Superficially looks OK, but I'm really not that familiar with the multiversioning pass.

@maleadt requested a review from yuyichao August 11, 2022 08:56
    return true;
}
#else
if (FS.find("+avx512fp16") != llvm::StringRef::npos) {
Member:

Note https://reviews.llvm.org/D107082: we are not there yet, but LLVM will support _Float16 correctly on SSE2 and above.

Note that the LLVM PR also changes the ABI to match GCC 12 and thus is going to break us in fun ways. I haven't found out how -fexcess-precision=16 is going to be implemented in LLVM.

@gbaraldi (Member, Author):

I just added this here because GCC was complaining about FS being unused. That branch doesn't matter for now, since x86 is currently treated as never having native Float16.
On x86 the F16C extension provides only fast conversion instructions, which we might already use; I know we use the equivalent conversions on AArch64 at least. The first native Float16 arithmetic operations are the AVX512-FP16 ones.
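To illustrate the conversion-only nature of F16C (a hypothetical snippet, compiled with -mf16c; the intrinsic is from immintrin.h):

#include <immintrin.h>

// F16C provides half<->float conversions (VCVTPH2PS/VCVTPS2PH) but no
// half arithmetic, so the addition itself still runs in single precision.
float f16c_add(unsigned short a_bits, unsigned short b_bits) {
    return _cvtsh_ss(a_bits) + _cvtsh_ss(b_bits);
}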

@KristofferC removed this from the 1.9 milestone Nov 15, 2022
@vchuravy added the merge me (PR is reviewed; merge when all tests are passing) and backport 1.9 (change should be backported to release-1.9) labels Nov 16, 2022
@giordano merged commit d18fd47 into JuliaLang:master Nov 21, 2022
@giordano removed the merge me label Nov 21, 2022
KristofferC pushed a commit that referenced this pull request Jan 16, 2023
* add TargetMachine check

* Add initial float16 multiversioning stuff

* make check more robust and remove x86 check

* move check to inside the pass

* C++ is hard

* Comment out the check because it won't work inside the pass

* whitespace in the comment

* Change the logic not to depend on a TM

* Add preliminary support for x86 test

* Cosmetic changes

(cherry picked from commit d18fd47)
@KristofferC removed the backport 1.9 label Jan 17, 2023
@DilumAluthge removed the forget me not label Jun 5, 2024
Successfully merging this pull request may close these issues: Hardware Float16 on A64fx (#40216).