[Mono] [Arm64] Added SIMD support for vector 2/3/4 methods #98761
Conversation
/azp run runtime-extra-platforms
Azure Pipelines successfully started running 1 pipeline(s).
Overall, LGTM!
Looking good, thanks for implementing these. I left a few comments with questions.
Before merging, please check whether full AOT with LLVM compiles these new intrinsics correctly.
src/mono/mono/mini/simd-intrinsics.c (outdated)
case SN_Lerp: {
#if defined (TARGET_ARM64)
	MonoInst* v1 = args [1];
	if (!strcmp ("Quaternion", m_class_get_name (klass))) {
Quaternion.Lerp is not marked as intrinsic in the libraries:
runtime/src/libraries/System.Private.CoreLib/src/System/Numerics/Quaternion.cs
Lines 493 to 498 in db7d269
/// <summary>Performs a linear interpolation between two quaternions based on a value that specifies the weighting of the second quaternion.</summary>
/// <param name="quaternion1">The first quaternion.</param>
/// <param name="quaternion2">The second quaternion.</param>
/// <param name="amount">The relative weight of <paramref name="quaternion2" /> in the interpolation.</param>
/// <returns>The interpolated quaternion.</returns>
public static Quaternion Lerp(Quaternion quaternion1, Quaternion quaternion2, float amount)
However, if this implementation generates better codegen we should probably keep it.
The intrinsified version is around 40% faster on my machine.
Why is the intrinsified version faster here? Is it fundamentally doing something differently from the managed implementation or is there potentially a missing JIT optimization?
Or perhaps there is simply a missing change on the managed side, so it's using scalar logic rather than any actual vectorization, and a better fix is to update the managed implementation?
We've typically tried to keep a clear separation between intrinsic functionality and more complex methods.

APIs like operator + or Sqrt are generally mapped to exactly 1 hardware instruction, and this is the case for most platforms.

APIs like DotProduct or even Create may be mapped to exactly 1 hardware instruction on some platforms and are fairly "core" to the general throughput considerations of many platforms.

APIs like Quaternion.Lerp or CopyTo are more complex functions which use multiple instructions on all platforms and which may even require branching or masking logic. So we've typically tried to keep them in managed code and have them use the intrinsic APIs instead.
I agree that we should align Mono's behavior with CoreCLR by not intrinsifying Quaternion.Lerp or CopyTo in Mono either.
Why is the intrinsified version faster here? Is it fundamentally doing something differently from the managed implementation or is there potentially a missing JIT optimization?
In general, Mono's mini JIT doesn't have as comprehensive optimizations as CoreCLR's RyuJIT.
src/mono/mono/mini/simd-intrinsics.c (outdated)
MonoInst *sum = emit_simd_ins (cfg, klass, OP_ARM64_XADDV, arg->dreg, -1);
sum->inst_c0 = INTRINS_AARCH64_ADV_SIMD_FADDV;
sum->inst_c1 = MONO_TYPE_R4;
Worth noting that Arm has typically pushed us away from using FADDV as it does not perform well on some hardware. Rather, they had us use a sequence of FADDP (AddPairwise) instructions, which tend to have better perf/throughput: https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/gentree.cpp#L25190-L25210
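A scalar model of the recommended reduction may make the distinction concrete: instead of one across-vector FADDV, the horizontal sum of four lanes is built from two pairwise-add (FADDP-style) steps. This is an illustrative sketch of the instruction pattern, not the code the JIT actually emits:

```c
/* Scalar model of reducing a 4-lane float vector with pairwise adds
   (the FADDP sequence) instead of a single across-vector FADDV. */
static float horizontal_sum_faddp(const float v[4]) {
    /* Step 1: pairwise add adjacent lanes, as FADDP on a .4s vector would. */
    float p0 = v[0] + v[1];
    float p1 = v[2] + v[3];
    /* Step 2: one more pairwise add collapses the two partial sums. */
    return p0 + p1;
}
```

The pairwise tree also matches the associativity the hardware uses, so each step only depends on the previous one, which is part of why the sequence tends to pipeline better than a single cross-lane reduction on some cores.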
Thanks for sharing the information, @tannergooding. @jkurdek Feel free to create an issue to address it in a future PR.
Force-pushed from f68c44b to 1ebb378.
Implements #91394.