Mark and expose additional Vector functions as Intrinsic #77562
Note: This serves as a reminder that when your PR modifies a ref *.cs file and adds or modifies public APIs, please make sure the API implementation in the src *.cs file is documented with triple-slash comments, so the PR reviewers can sign off on that change.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
This resolves #76593
@@ -1150,7 +1151,7 @@ internal static Vector256<T> Create<T>(Vector128<T> lower, Vector128<T> upper)
 /// <returns>A new <see cref="Vector256{T}" /> instance with the first element initialized to <paramref name="value" /> and the remaining elements initialized to zero.</returns>
 /// <exception cref="NotSupportedException">The type of <paramref name="value" /> (<typeparamref name="T" />) is not supported.</exception>
 [MethodImpl(MethodImplOptions.AggressiveInlining)]
-internal static Vector256<T> CreateScalar<T>(T value)
+public static Vector256<T> CreateScalar<T>(T value)
Could we change this to
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector256<T> CreateScalar<T>(T value)
    where T : struct
{
    if (Avx.IsSupported)
    {
        return Vector128.CreateScalar(value).ToVector256Unsafe();
    }
    return Vector128.CreateScalar(value).ToVector256();
}
I often end up not using this method because it currently emits an extra vmovaps to clear the upper lane after the scalar move, even though the VEX-encoded scalar move will already have zeroed the upper lane. The extra vmovaps should get no-op'ed by the CPU front-end anyway, but this is a common need, so smaller codegen would be better.
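For readers following along, here is a minimal sketch of the contract the doc comment above describes: element 0 holds the value and every other element is zero. The class and method names (`CreateScalarSketch`, `ScalarViaVector128`, `UpperElementsAreZero`) are hypothetical and exist only for illustration. The sketch uses the portable `ToVector256()` path, which zero-extends, rather than `ToVector256Unsafe()`, which leaves the upper bits undefined at the API level. It says nothing about what the JIT actually emits.

```csharp
using System.Runtime.Intrinsics;

// Hypothetical illustration class; not part of the PR.
public static class CreateScalarSketch
{
    // Builds a Vector256 whose first element is `value` and whose
    // remaining elements are zero, via the zero-extending conversion.
    public static Vector256<float> ScalarViaVector128(float value)
    {
        return Vector128.CreateScalar(value).ToVector256();
    }

    // Returns true when every element except element 0 is zero.
    public static bool UpperElementsAreZero(Vector256<float> v)
    {
        for (int i = 1; i < Vector256<float>.Count; i++)
        {
            if (v.GetElement(i) != 0.0f)
            {
                return false;
            }
        }
        return true;
    }
}
```

With this helper, `CreateScalarSketch.ScalarViaVector128(3.5f)` and `Vector256.CreateScalar(3.5f)` should compare equal, since both are specified to zero everything past the first element.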
I'm going to update this to be a proper intrinsic in a follow-up PR.
CC @dotnet/jit-contrib for the runtime-side tweaks.
2ad678f to a52829b
4ad6072 to 8d9c929
@@ -19573,9 +19681,22 @@ GenTree* Compiler::gtNewSimdBinOpNode(genTreeOps op,
case GT_RSH:
case GT_RSZ:
{
assert(!varTypeIsFloating(simdBaseType));
I see that Vector*<T> is going to support shifting, so it makes sense that the restriction is lifted and that you then normalize the base type.
@@ -2030,17 +2055,21 @@ GenTree* Compiler::impBaseIntrinsic(NamedIntrinsic intrinsic,
if ((simdSize != 32) || compExactlyDependsOn(InstructionSet_AVX2))
{
genTreeOps op = varTypeIsUnsigned(simdBaseType) ? GT_RSZ : GT_RSH;
If the base type is unsigned, is op_RightShift basically going to be the same thing as op_UnsignedRightShift?
Correct, that's how op_RightShift is expected to behave for unsigned types.
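A small sketch of that behavior, assuming .NET 8 or later (where `Vector128<T>` exposes the `>>` and `>>>` operators) and C# 11 or later for the `>>>` syntax. The `ShiftSketch` class and its method names are hypothetical. For an unsigned element type both operators do a logical shift, so the results match. For a signed element type with the sign bit set, `>>` is arithmetic and `>>>` is logical, so they differ.

```csharp
using System.Runtime.Intrinsics;

// Hypothetical illustration class; not part of the PR.
public static class ShiftSketch
{
    // For unsigned elements, >> and >>> are both logical shifts.
    public static bool ShiftsAgreeUnsigned(uint value, int count)
    {
        Vector128<uint> v = Vector128.Create(value);
        return (v >> count) == (v >>> count);
    }

    // For signed elements, >> sign-extends while >>> shifts in zeros,
    // so the results differ whenever the sign bit is set and count > 0.
    public static bool ShiftsAgreeSigned(int value, int count)
    {
        Vector128<int> v = Vector128.Create(value);
        return (v >> count) == (v >>> count);
    }
}
```

For example, `ShiftSketch.ShiftsAgreeUnsigned(0x80000000u, 1)` is true, while `ShiftSketch.ShiftsAgreeSigned(-8, 1)` is false because the arithmetic shift yields -4 and the logical shift yields 0x7FFFFFFC.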
The JIT side of things looks good to me.
/azp run runtime-coreclr jitstress-isas-x86, runtime-coreclr jitstress-isas-arm, runtime-coreclr outerloop
Azure Pipelines successfully started running 3 pipeline(s).