Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Marking Matrix3x2, Matrix4x4, Plane, and Quaternion as Intrinsic #41829

Merged
merged 1 commit into from
Sep 4, 2020

Conversation

tannergooding
Copy link
Member

This resolves #41738 by marking Matrix3x2, Matrix4x4, Plane, and Quaternion as intrinsic causing the JitInlineSIMDMultipler to be applied as it was for .NET Core 3.1

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 3, 2020
@tannergooding
Copy link
Member Author

[2020/09/03 14:22:22][INFO] // * Summary *
[2020/09/03 14:22:22][INFO]
[2020/09/03 14:22:22][INFO] BenchmarkDotNet=v0.12.1.1405-nightly, OS=Windows 10.0.19041.388 (2004/May2020Update/20H1)
[2020/09/03 14:22:22][INFO] AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
[2020/09/03 14:22:22][INFO] .NET Core SDK=6.0.100-alpha.1.20453.5
[2020/09/03 14:22:22][INFO]   [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.41714, CoreFX 5.0.20.41714), X64 RyuJIT
[2020/09/03 14:22:22][INFO]   Job-EMLNHA : .NET Core 5.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT
[2020/09/03 14:22:22][INFO]
[2020/09/03 14:22:22][INFO] PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable  Toolchain=CoreRun
[2020/09/03 14:22:22][INFO] IterationTime=250.0000 ms  MaxIterationCount=20  MinIterationCount=15
[2020/09/03 14:22:22][INFO] WarmupCount=1
[2020/09/03 14:22:22][INFO]
[2020/09/03 14:22:22][INFO] |             Method |     Mean |     Error |    StdDev |   Median |       Min |      Max | Code Size | Gen 0 | Gen 1 | Gen 2 | Allocated |
[2020/09/03 14:22:22][INFO] |------------------- |---------:|----------:|----------:|---------:|----------:|---------:|----------:|------:|------:|------:|----------:|
[2020/09/03 14:22:22][INFO] | ConjugateBenchmark | 1.030 ns | 0.0309 ns | 0.0289 ns | 1.021 ns | 0.9983 ns | 1.104 ns |     130 B |     - |     - |     - |         - |
[2020/09/03 14:22:22][INFO] |    NegateBenchmark | 1.215 ns | 0.0165 ns | 0.0147 ns | 1.208 ns | 1.1971 ns | 1.239 ns |     142 B |     - |     - |     - |         - |
[2020/09/03 14:22:22][INFO]

Copy link
Contributor

@pgovind pgovind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. You might want to tag the perf folks/Andy as FYIs

@CarolEidt
Copy link
Contributor

Thanks @tannergooding - should this be considered for backporting?

Copy link
Member

@jeffhandley jeffhandley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; thanks for taking care of this. Once we have sign-off and this is merged into master, let's submit it for Ask Mode approval.

  1. Use the port bot to port this to release/5.0-rc2
  2. Fill out the PR template in the PR created by the bot, including notes for how this was caught, showing the regression numbers, and capturing the perf numbers you show here
  3. Label it with Servicing-consider

Once we have that, I'll email Tactics to request approval.

@tannergooding
Copy link
Member Author

Can you DM me instructions for using port bot?

@jeffhandley
Copy link
Member

Can you DM me instructions for using port bot?

Here's an example: #41817 (comment)

@AndyAyersMS
Copy link
Member

@tannergooding can you run diffs? I know it's a bit clunky with libraries changes, but I am curious what impact this has.

@tannergooding
Copy link
Member Author

can you run diffs? I know it's a bit clunky with libraries changes, but I am curious what impact this has.

Sure thing. It will take a bit to run, but I expect most changes will only be caught in our tests/benchmarks as we don't have much product code using these types.

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you both @tannergooding and @AndyAyersMS for identifying and solving the regression!

@tannergooding
Copy link
Member Author

@AndyAyersMS, it was about as I expected with no visible diffs for anything except for the MicroBenchmarks that are covering the regression:
--frameworks

Found 802 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of diff: 0 (0.00% of base)

0 total files with Code Size differences (0 improved, 0 regressed), 267 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 259483 unchanged.

1 files had text diffs but no metric diffs.
diff\System.Private.CoreLib.dasm had 74 diff

--tests

Found 10519 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of diff: 0 (0.00% of base)

0 total files with Code Size differences (0 improved, 0 regressed), 3485 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 477140 unchanged.

64 files had text diffs but no metric diffs.
diff\JIT\Methodical\JIT.Methodical.XUnitWrapper.dasm had 133525 diffs
diff\JIT\Regression\JIT.Regression.XUnitWrapper.dasm had 108850 diffs
diff\JIT\HardwareIntrinsics\JIT.HardwareIntrinsics.XUnitWrapper.dasm had 61600 diffs
diff\JIT\Directed\JIT.Directed.XUnitWrapper.dasm had 24150 diffs
diff\JIT\IL_Conformance\JIT.IL_Conformance.XUnitWrapper.dasm had 19600 diffs
diff\JIT\jit64\JIT.jit64.XUnitWrapper.dasm had 18375 diffs
diff\JIT\SIMD\JIT.SIMD.XUnitWrapper.dasm had 16975 diffs
diff\Interop\PInvoke\Interop.PInvoke.XUnitWrapper.dasm had 15050 diffs
diff\JIT\Performance\JIT.Performance.XUnitWrapper.dasm had 14875 diffs
diff\JIT\opt\JIT.opt.XUnitWrapper.dasm had 11900 diffs
diff\JIT\Generics\JIT.Generics.XUnitWrapper.dasm had 7175 diffs
diff\JIT\CodeGenBringUpTests\JIT.CodeGenBringUpTests.XUnitWrapper.dasm had 5600 diffs
diff\JIT\Intrinsics\JIT.Intrinsics.XUnitWrapper.dasm had 4025 diffs
diff\Interop\COM\Interop.COM.XUnitWrapper.dasm had 3500 diffs
diff\Interop\IJW\Interop.IJW.XUnitWrapper.dasm had 1050 diffs
diff\Interop\StructMarshalling\Interop.StructMarshalling.XUnitWrapper.dasm had 875 diffs
diff\JIT\Stress\JIT.Stress.XUnitWrapper.dasm had 875 diffs
diff\Interop\NativeLibrary\Interop.NativeLibrary.XUnitWrapper.dasm had 700 diffs
diff\Interop\StringMarshalling\Interop.StringMarshalling.XUnitWrapper.dasm had 700 diffs
diff\Interop\ICustomMarshaler\Interop.ICustomMarshaler.XUnitWrapper.dasm had 525 diffs

--benchmarks

Found 246 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of diff: 0 (0.00% of base)

0 total files with Code Size differences (0 improved, 0 regressed), 82 unchanged.

0 total methods with Code Size differences (0 improved, 0 regressed), 1891 unchanged.

MicroBenchmarks.dll

Found 4 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of diff: 171 (0.01% of base)
    diff is a regression.

Top file regressions (bytes):
         171 : diff\MicroBenchmarks.dasm (0.01% of base)

1 total files with Code Size differences (0 improved, 1 regressed), 0 unchanged.

Top method regressions (bytes):
          61 (75.31% of base) : diff\MicroBenchmarks.dasm - Perf_Quaternion:NegationOperatorBenchmark():Quaternion:this
          61 (75.31% of base) : diff\MicroBenchmarks.dasm - Perf_Quaternion:NegateBenchmark():Quaternion:this
          49 (60.49% of base) : diff\MicroBenchmarks.dasm - Perf_Quaternion:ConjugateBenchmark():Quaternion:this

Top method regressions (percentages):
          61 (75.31% of base) : diff\MicroBenchmarks.dasm - Perf_Quaternion:NegationOperatorBenchmark():Quaternion:this
          61 (75.31% of base) : diff\MicroBenchmarks.dasm - Perf_Quaternion:NegateBenchmark():Quaternion:this
          49 (60.49% of base) : diff\MicroBenchmarks.dasm - Perf_Quaternion:ConjugateBenchmark():Quaternion:this

3 total methods with Code Size differences (0 improved, 3 regressed), 8326 unchanged.

@JulieLeeMSFT
Copy link
Member

@tannergooding @AndyAyersMS Thanks a lot for rootcausing the issue and fixing it.

@tannergooding tannergooding merged commit a6aac1f into dotnet:master Sep 4, 2020
@tannergooding
Copy link
Member Author

/backport to release/5.0-rc2

@github-actions
Copy link
Contributor

github-actions bot commented Sep 4, 2020

Started backporting to release/5.0-rc2: https://github.com/dotnet/runtime/actions/runs/239889633

@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2020
@tannergooding tannergooding deleted the fix-41738 branch November 11, 2022 15:30
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance regressions in Quaternion.Conjugate and Quaternion.Negate
8 participants