Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT ARM64-SVE: Add AddAcross #101674

Merged
merged 5 commits into from
May 1, 2024
Merged

Conversation

a74nh
Copy link
Contributor

@a74nh a74nh commented Apr 29, 2024

Contributes towards #99957

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI new-api-needs-documentation labels Apr 29, 2024
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@@ -5314,11 +5314,11 @@ void CodeGen::genArm64EmitterUnitTestsSve()
#endif // ALL_ARM64_EMITTER_UNIT_TESTS_SVE_UNSUPPORTED

// IF_SVE_AI_3A
theEmitter->emitIns_R_R_R(INS_sve_saddv, EA_1BYTE, REG_V1, REG_P4, REG_V2,
theEmitter->emitIns_R_R_R(INS_sve_saddv, EA_SCALABLE, REG_V1, REG_P4, REG_V2,
Copy link
Contributor Author

@a74nh a74nh Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the codegen changes:

For these instructions, arg2 (EA_1BYTE etc) is never used as the return value is dependent on the input type which is already specified in opt.
Switching arg2 to EA_SCALABLE means there is no need to write special hwinstrinsiccodegen code.

I've changed the bare minimal of instructions needed to make this patch work. There are quite a few more reduction like instructions - we should do those as we get to them in the API

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 29, 2024
@@ -0,0 +1,302 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is a copy of _SveUnaryOpTestTemplate.template with conditional tests removed as they can't be used for reduction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do cndSel(mask, AddAcross(a), falseVal). The result of this would be:

  • If 0th lane was active in mask, it would have the result of AddAcross(a)
  • For all the other lanes, it will either have 0 (for active lanes) or falseVal for in-active lanes

If that makes sense, can you add those in your new template? This might not be super meaningful but want to make sure that we test the jit code paths at least.
cc: @tannergooding

@a74nh a74nh force-pushed the sve_addacross_github branch 2 times, most recently from a0aaade to df9385b Compare April 29, 2024 11:17
@a74nh a74nh force-pushed the sve_addacross_github branch from df9385b to a3d0161 Compare April 29, 2024 12:22
src/coreclr/jit/hwintrinsic.h Outdated Show resolved Hide resolved
@a74nh a74nh marked this pull request as ready for review April 29, 2024 12:56
@a74nh
Copy link
Contributor Author

a74nh commented Apr 29, 2024

@kunalspathak @dotnet/arm64-contrib

@kunalspathak
Copy link
Member

superpmi-* failures are #101685

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some questions and minor changes.

src/coreclr/jit/hwintrinsic.h Outdated Show resolved Hide resolved
src/coreclr/jit/hwintrinsic.h Outdated Show resolved Hide resolved
@@ -0,0 +1,302 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do cndSel(mask, AddAcross(a), falseVal). The result of this would be:

  • If 0th lane was active in mask, it would have the result of AddAcross(a)
  • For all the other lanes, it will either have 0 (for active lanes) or falseVal for in-active lanes

If that makes sense, can you add those in your new template? This might not be super meaningful but want to make sure that we test the jit code paths at least.
cc: @tannergooding

@@ -1769,6 +1803,8 @@ private static byte HighNarrowing(ushort op1, bool round)

public static ushort AddWidening(ushort op1, byte op2) => (ushort)(op1 + op2);

public static ulong AddWidening(ulong op1, byte op2) => (ulong)(op1 + (ulong)op2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we also not need the following?

        public static uint AddWidening(uint op1, byte op2) => (uint)(op1 + (uint)op2);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for the testing. Sve.AddAcross() always widens to 64bits regardless of the input.

helper.cs seems to be taking the approach of only adding helper functions when they are needed.

return reduceOp(op1[0], op1[1]);
}

if (op1.Length % 2 != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have input parameters that exercise this condition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This would require an vector where the number of elements was not a power of 2.

I wasn't sure if there was a way of raising an exception in the testing. So instead, NaN would ensure the test failed.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kunalspathak
Copy link
Member

/ba-g Failure is #101721

@kunalspathak kunalspathak merged commit a700005 into dotnet:main May 1, 2024
158 of 168 checks passed
@a74nh a74nh deleted the sve_addacross_github branch May 1, 2024 08:11
michaelgsharp pushed a commit to michaelgsharp/runtime that referenced this pull request May 9, 2024
* JIT ARM64-SVE: Add AddAcross

* Remove enum changes

* Fix SVE tests max vector size to 512bit

* fix zip test cases

---------

Co-authored-by: Kunal Pathak <Kunal.Pathak@microsoft.com>
Ruihan-Yin pushed a commit to Ruihan-Yin/runtime that referenced this pull request May 30, 2024
* JIT ARM64-SVE: Add AddAcross

* Remove enum changes

* Fix SVE tests max vector size to 512bit

* fix zip test cases

---------

Co-authored-by: Kunal Pathak <Kunal.Pathak@microsoft.com>
@github-actions github-actions bot locked and limited conversation to collaborators May 31, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI arm-sve Work related to arm64 SVE/SVE2 support community-contribution Indicates that the PR has been added by a community member new-api-needs-documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants