JIT: Faster Math.Max/Min for x64 #65625
Tagging subscribers to this area: @JulieLeeMSFT
Issue Details
Optimize
float Test(float a) => Math.Max(a, 10);
Currently emits:
; Method Tests4:Test(float):float:this
G_M55200_IG01:
vzeroupper
G_M55200_IG02:
vmovss xmm0, dword ptr [reloc @RWD00]
vucomiss xmm1, xmm0
jp SHORT G_M55200_IG03
je SHORT G_M55200_IG06
G_M55200_IG03:
vucomiss xmm1, xmm1
jp SHORT G_M55200_IG05
vucomiss xmm1, xmm0
ja SHORT G_M55200_IG04
jmp SHORT G_M55200_IG08
G_M55200_IG04:
vmovaps xmm0, xmm1
jmp SHORT G_M55200_IG08
G_M55200_IG05:
vmovaps xmm0, xmm1
jmp SHORT G_M55200_IG08
G_M55200_IG06:
vmovaps xmm1, xmm0
vmovd eax, xmm1
test eax, eax
jl SHORT G_M55200_IG07
jmp SHORT G_M55200_IG08
G_M55200_IG07:
vmovss xmm0, dword ptr [reloc @RWD00]
G_M55200_IG08:
ret
RWD00 dd 41200000h ; 10
; Total bytes of code: 68
Expected codegen:
; Method Tests4:Test(float):float:this
vzeroupper
vmaxss xmm0, xmm1, dword ptr [reloc @RWD00]
ret
RWD00 dd 41200000h ; 10.0
; Total bytes of code: 12
#65584 did it for ARM where we could do it even for both non-constants
|
In particular, x86/x64 provide What this functionally means is that if both inputs are unknown, we can't really "optimize" and instead have to use
This ensures that |
Hey, everyone. I would like to take a look at the issue. @EgorBo Could you please tell me how you got the "Math.Max" call inlined? |
Hey, sure! What do you mean inlined? Intrinsic? |
@EgorBo Thank you, so basically you are changing the call into an intrinsic. |
@EgorBo Seems like I figured out how to make the changes, but I have a few questions. |
Yes, we don't expand intrinsics in tier0 (unoptimized code) - only a few "must expand" ones. So I'd suggest disabling tiered compilation during development |
@EgorBo please take a look at my PR, I would like to know what you think of this. |
linking #65700 |
…#69434)
* Add xarch optimization for min/max (#65625)
* Changes according to the requirements (#65625)
* Draft for Math.Max/Math.Min optimization (#65625)
* Draft for optimizing Math.Max/Math.Min with a const (#65625)
* Fix tests (#65625)
* Refactoring of the conditions (#65625)
* Fix of the summary (#65625)
* Refactoring due to the PR comments (#65625)
* Add spilling side effect + Fix of formats (#65625)
* Update src/coreclr/jit/importer.cpp
Co-authored-by: Jakob Botsch Nielsen <Jakob.botsch.nielsen@gmail.com>
* Update src/coreclr/jit/importer.cpp
Co-authored-by: Jakob Botsch Nielsen <Jakob.botsch.nielsen@gmail.com>
Co-authored-by: Jakob Botsch Nielsen <Jakob.botsch.nielsen@gmail.com>
Optimize Math.Max/Min to a single instruction on x64 when one of the arguments is a constant (not NaN, and whoever implements it has to be careful around -/+0.0).
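To illustrate why the constant must not be NaN and why -/+0.0 needs care, here is a minimal C++ sketch (C++ being the language of the JIT sources touched in this thread). It is not the code from the PR. It assumes the documented x86/x64 behaviour of scalar MAXSS, which returns its second source operand when either input is NaN or when both inputs are zero, and it assumes Math.Max propagates NaN and treats +0.0 as greater than -0.0. All function names below (`maxss_model`, `math_max_model`, `same_result`, `constant_first_matches`) are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Software model of scalar MAXSS (assumed from the instruction's documented
// behaviour): the SECOND source operand is returned when either input is NaN
// or when both inputs are zero of either sign; otherwise the larger value.
float maxss_model(float src1, float src2) {
    if (std::isnan(src1) || std::isnan(src2)) return src2;
    if (src1 == 0.0f && src2 == 0.0f) return src2;
    return (src1 > src2) ? src1 : src2;
}

// Model of the semantics assumed for Math.Max: NaN propagates and
// +0.0 compares greater than -0.0.
float math_max_model(float a, float b) {
    if (std::isnan(a)) return a;
    if (std::isnan(b)) return b;
    if (a == 0.0f && b == 0.0f) return std::signbit(a) ? b : a;
    return (a > b) ? a : b;
}

// Hypothetical comparison helper: any NaN equals any NaN, and the sign of
// zero is significant, so +0.0 and -0.0 count as different results.
bool same_result(float x, float y) {
    if (std::isnan(x) || std::isnan(y)) return std::isnan(x) && std::isnan(y);
    return x == y && std::signbit(x) == std::signbit(y);
}

// Checks, over a handful of probe values, whether a single modelled MAXSS
// with the constant as the FIRST operand agrees with math_max_model.
bool constant_first_matches(float cns) {
    assert(!std::isnan(cns)); // the issue restricts the constant to non-NaN
    const float inf = std::numeric_limits<float>::infinity();
    const float nan = std::numeric_limits<float>::quiet_NaN();
    const float probes[] = {nan, -inf, -1.0f, -0.0f, 0.0f, 5.0f, 20.0f, inf};
    for (float v : probes) {
        if (!same_result(maxss_model(cns, v), math_max_model(v, cns))) {
            return false;
        }
    }
    return true;
}
```

Under this model, `constant_first_matches(10.0f)` holds, because an unknown NaN ends up in the second operand and is therefore returned. `constant_first_matches(0.0f)` does not hold, since an unknown -0.0 in the second operand would be returned instead of +0.0. With two unknown inputs neither operand order is safe, which matches the remark in the thread that the both-unknown case cannot be reduced to a single instruction.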