-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revise how constant SIMD vectors are defined in BCL #44115
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
RWD00 dq FF01FFFFFF00FFFFh, FF03FFFFFF02FFFFh ; <-- it's here Is this aligned? |
Both the 16 and 32-byte SIMD constants will be properly aligned if we aren't emitting "small" code: |
Nevermind, the example above is already folding 👍 |
@tannergooding there is a minor issue with alignment: public static void Test(ref Vector128<double> a, ref Vector256<double> b)
{
a = Vector128.Create(1.0, 2.0);
b = Vector256.Create(11.0, 12.0, 13.0, 14.0);
} ; Method Egor:Test(byref,byref)
G_M24708_IG01:
vzeroupper
G_M24708_IG02:
vmovupd xmm0, xmmword ptr [reloc @RWD00]
vmovupd xmmword ptr [rcx], xmm0
vmovupd ymm0, ymmword ptr[reloc @RWD32]
vmovupd ymmword ptr[rdx], ymm0
G_M24708_IG03:
vzeroupper
ret
RWD00 dq 3FF0000000000000h, 4000000000000000h
RWD16 dd 00000000h, 00000000h, 00000000h, 00000000h ;; <--- padding
RWD32 dq 4026000000000000h, 4028000000000000h, 402A000000000000h, 402C000000000000h
; Total bytes of code: 31 It could be: RWD00 dq 4026000000000000h, 4028000000000000h, 402A000000000000h, 402C000000000000h
RWD32 dq 3FF0000000000000h, 4000000000000000h slightly more compact |
Right, we don't currently do any sorting of values and I don't know how amiable the existing data structures are to having that happen. |
Tagging subscribers to this area: @tannergooding, @jeffhandley |
We need to take into account that on older runtimes there may be a regression by using |
Marked as |
Do we need to care about the older runtimes or just use the current best pattern? |
This mostly applies to hardware intrinsics which are .NET Core 3.1+ only, so I don't believe we have much to be concerned about here (I don't believe we are compiling for If we do target downlevel TFMs, then we should consider the perf implications as .NET 3.1 will have continued support through Dec 2022: https://dotnet.microsoft.com/platform/support/policy/dotnet-core |
I can create a PR (next week or so), then we can discuss more over there?! |
ML.NET's highest .NET version (for the relevant project) is .NET Core 3.1, so current code over there seems to be the best option. Using Vector{128|256}.Create would be a de-optimization. Places: ASP.NET Core uses Vector{128|256}.Create already, no need to change something over there. |
From #44111 (comment)
There are 3 patterns we currently use across the BCL for const vectors:
Here is the current codegen for these cases:
The first case used to be avoided due to some codegen issues, but looks like those were resolved (e.g. JIT now saves such vectors into the data section, does Value Numbering for SIMDs including constant vectors, does CSE, etc - #31834?) so we now have a lot of
static readonly
fields and we can revise them and convert intoCase1(1.1)
-style where possible (maybe even if we need to duplicate them in different methods), e.g.:Places to revise:
Known limitations for Case1:
/cc @stephentoub @GrabYourPitchforks @benaadams @tannergooding
The text was updated successfully, but these errors were encountered: