Crossgen method bodies having ARM64 intrinsic #38060

Merged Jun 20, 2020 (6 commits)

@kunalspathak kunalspathak commented Jun 17, 2020

Today, method bodies containing ARM64 intrinsics are not crossgen'd. Few methods currently use ARM64 intrinsics, but as the APIs tracked in #33308 are gradually intrinsified, startup time for those methods will suffer because they must be JITted before they can execute. This PR crossgens such method bodies for faster startup. Additionally, methods of Vector64 and Vector128 were being treated as method calls, but since they are already intrinsified (also as part of #33308), it is safe to crossgen their method bodies.

Below is the code size difference. There is a regression of about 1% (0.935% of base), but the change should still give us good startup time benefits.


Summary of Code Size diffs:
(Lower is better)

Total bytes of diff: 53216 (0.935% of base)
    diff is a regression.

Total byte diff includes 59916 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had 1477 unique methods,    59916 unique bytes

Top file regressions (bytes):
       53216 : System.Private.CoreLib.dasm (0.935% of base)

1 total files with Code Size differences (0 improved, 1 regressed), 0 unchanged.

Top method regressions (bytes):
         500 (10.113% of base) : System.Private.CoreLib.dasm - Internal.Runtime.CompilerServices.Unsafe:As(byref):byref (292 base, 323 diff methods)
         220 (117.021% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[Single][System.Single]:GetHashCode():int:this
         208 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:WithLower(System.Runtime.Intrinsics.Vector128`1[__Canon],System.Runtime.Intrinsics.Vector64`1[__Canon]):System.Runtime.Intrinsics.Vector128`1[__Canon] (0 base, 1 diff methods)
         208 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:WithUpper(System.Runtime.Intrinsics.Vector128`1[__Canon],System.Runtime.Intrinsics.Vector64`1[__Canon]):System.Runtime.Intrinsics.Vector128`1[__Canon] (0 base, 1 diff methods)
         180 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte,ubyte):System.Runtime.Intrinsics.Vector128`1[Byte] (0 base, 1 diff methods)
         180 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(byte,byte,byte,byte,byte,byte,byte,byte,byte,byte,byte,byte,byte,byte,byte,byte):System.Runtime.Intrinsics.Vector128`1[SByte] (0 base, 1 diff methods)
         136 (75.556% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[UInt32][System.UInt32]:<Equals>g__SoftwareFallback|12_0(byref,System.Runtime.Intrinsics.Vector128`1[UInt32]):bool
         136 (75.556% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[Int32][System.Int32]:<Equals>g__SoftwareFallback|12_0(byref,System.Runtime.Intrinsics.Vector128`1[Int32]):bool
         112 (70.000% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Int16][System.Int16]:GetHashCode():int:this
         112 (70.000% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[UInt32][System.UInt32]:GetHashCode():int:this
         112 (70.000% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[Int32][System.Int32]:GetHashCode():int:this
         112 (70.000% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[UInt16][System.UInt16]:GetHashCode():int:this
         108 (     ∞ of base) : System.Private.CoreLib.dasm - Arm64:InsertSelectedScalar(System.Runtime.Intrinsics.Vector128`1[Byte],ubyte,System.Runtime.Intrinsics.Vector64`1[Byte],ubyte):System.Runtime.Intrinsics.Vector128`1[Byte] (0 base, 1 diff methods)
         108 (     ∞ of base) : System.Private.CoreLib.dasm - Arm64:InsertSelectedScalar(System.Runtime.Intrinsics.Vector128`1[Byte],ubyte,System.Runtime.Intrinsics.Vector128`1[Byte],ubyte):System.Runtime.Intrinsics.Vector128`1[Byte] (0 base, 1 diff methods)
         108 (     ∞ of base) : System.Private.CoreLib.dasm - Arm64:InsertSelectedScalar(System.Runtime.Intrinsics.Vector128`1[Double],ubyte,System.Runtime.Intrinsics.Vector128`1[Double],ubyte):System.Runtime.Intrinsics.Vector128`1[Double] (0 base, 1 diff methods)
         108 (     ∞ of base) : System.Private.CoreLib.dasm - Arm64:InsertSelectedScalar(System.Runtime.Intrinsics.Vector128`1[Int16],ubyte,System.Runtime.Intrinsics.Vector64`1[Int16],ubyte):System.Runtime.Intrinsics.Vector128`1[Int16] (0 base, 1 diff methods)
         108 (     ∞ of base) : System.Private.CoreLib.dasm - Arm64:InsertSelectedScalar(System.Runtime.Intrinsics.Vector128`1[Int16],ubyte,System.Runtime.Intrinsics.Vector128`1[Int16],ubyte):System.Runtime.Intrinsics.Vector128`1[Int16] (0 base, 1 diff methods)
         108 (     ∞ of base) : System.Private.CoreLib.dasm - Arm64:InsertSelectedScalar(System.Runtime.Intrinsics.Vector128`1[Int32],ubyte,System.Runtime.Intrinsics.Vector64`1[Int32],ubyte):System.Runtime.Intrinsics.Vector128`1[Int32] (0 base, 1 diff methods)
         108 (     ∞ of base) : System.Private.CoreLib.dasm - Arm64:InsertSelectedScalar(System.Runtime.Intrinsics.Vector128`1[Int32],ubyte,System.Runtime.Intrinsics.Vector128`1[Int32],ubyte):System.Runtime.Intrinsics.Vector128`1[Int32] (0 base, 1 diff methods)
         108 (     ∞ of base) : System.Private.CoreLib.dasm - Arm64:InsertSelectedScalar(System.Runtime.Intrinsics.Vector128`1[Int64],ubyte,System.Runtime.Intrinsics.Vector128`1[Int64],ubyte):System.Runtime.Intrinsics.Vector128`1[Int64] (0 base, 1 diff methods)

Top method improvements (bytes):
        -236 (-19.601% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Int64][System.Int64]:ToString():System.String:this
        -232 (-21.561% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Double][System.Double]:ToString():System.String:this
        -212 (-20.623% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[UInt64][System.UInt64]:ToString():System.String:this
        -176 (-3.392% of base) : System.Private.CoreLib.dasm - System.Numerics.Matrix4x4:<Invert>g__SseImpl|59_0(System.Numerics.Matrix4x4,byref):bool
        -156 (-9.466% of base) : System.Private.CoreLib.dasm - System.Text.ASCIIUtility:GetIndexOfFirstNonAsciiChar_Sse2(long,long):long
        -148 (-8.959% of base) : System.Private.CoreLib.dasm - System.Text.Latin1Utility:GetIndexOfFirstNonLatin1Char_Sse2(long,long):long
        -104 (-54.167% of base) : System.Private.CoreLib.dasm - System.SpanHelpers:UnalignedCountVector128(byref):long (2 methods)
        -104 (-57.778% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Int64][System.Int64]:Equals(System.Runtime.Intrinsics.Vector64`1[Int64]):bool:this
        -104 (-57.778% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[UInt64][System.UInt64]:Equals(System.Runtime.Intrinsics.Vector64`1[UInt64]):bool:this
         -88 (-44.898% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Double][System.Double]:Equals(System.Runtime.Intrinsics.Vector64`1[Double]):bool:this
         -84 (-21.429% of base) : System.Private.CoreLib.dasm - System.Text.ASCIIUtility:NarrowUtf16ToAscii_Sse2(long,long,long):long
         -84 (-20.588% of base) : System.Private.CoreLib.dasm - System.Text.Latin1Utility:NarrowUtf16ToLatin1_Sse2(long,long,long):long
         -76 (-14.179% of base) : System.Private.CoreLib.dasm - System.Text.Latin1Utility:WidenLatin1ToUtf16_Sse2(long,long,long)
         -76 (-38.776% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Double][System.Double]:GetHashCode():int:this
         -76 (-46.341% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Int64][System.Int64]:GetHashCode():int:this
         -76 (-46.341% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[UInt64][System.UInt64]:GetHashCode():int:this
         -68 (-30.909% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1[SByte][System.SByte]:<Equals>g__SoftwareFallback|14_0(byref,System.Runtime.Intrinsics.Vector256`1[SByte]):bool
         -68 (-30.909% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1[Byte][System.Byte]:<Equals>g__SoftwareFallback|14_0(byref,System.Runtime.Intrinsics.Vector256`1[Byte]):bool
         -64 (-29.091% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1[UInt16][System.UInt16]:<Equals>g__SoftwareFallback|14_0(byref,System.Runtime.Intrinsics.Vector256`1[UInt16]):bool
         -64 (-29.091% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector256`1[Int16][System.Int16]:<Equals>g__SoftwareFallback|14_0(byref,System.Runtime.Intrinsics.Vector256`1[Int16]):bool

Top method regressions (percentages):
          40 (     ∞ of base) : System.Private.CoreLib.dasm - System.Math:CopySign(double,double):double (0 base, 1 diff methods)
          28 (     ∞ of base) : System.Private.CoreLib.dasm - System.MathF:CopySign(float,float):float (0 base, 1 diff methods)
          80 (     ∞ of base) : System.Private.CoreLib.dasm - System.SpanHelpers:TryFindFirstMatchedLane(System.Runtime.Intrinsics.Vector128`1[Byte],System.Runtime.Intrinsics.Vector128`1[Byte],byref):bool (0 base, 1 diff methods)
          72 (     ∞ of base) : System.Private.CoreLib.dasm - System.SpanHelpers:TryFindFirstMatchedLane(System.Runtime.Intrinsics.Vector128`1[UInt16],byref):bool (0 base, 1 diff methods)
          20 (     ∞ of base) : System.Private.CoreLib.dasm - System.Numerics.BitOperations:LeadingZeroCount(int):int (0 base, 1 diff methods)
          20 (     ∞ of base) : System.Private.CoreLib.dasm - System.Numerics.BitOperations:LeadingZeroCount(long):int (0 base, 1 diff methods)
          28 (     ∞ of base) : System.Private.CoreLib.dasm - System.Numerics.BitOperations:Log2(int):int (0 base, 1 diff methods)
          28 (     ∞ of base) : System.Private.CoreLib.dasm - System.Numerics.BitOperations:Log2(long):int (0 base, 1 diff methods)
          36 (     ∞ of base) : System.Private.CoreLib.dasm - System.Numerics.BitOperations:PopCount(int):int (0 base, 1 diff methods)
          32 (     ∞ of base) : System.Private.CoreLib.dasm - System.Numerics.BitOperations:PopCount(long):int (0 base, 1 diff methods)
          48 (     ∞ of base) : System.Private.CoreLib.dasm - System.Numerics.VectorMath:ConditionalSelectBitwise(System.Runtime.Intrinsics.Vector128`1[Single],System.Runtime.Intrinsics.Vector128`1[Single],System.Runtime.Intrinsics.Vector128`1[Single]):System.Runtime.Intrinsics.Vector128`1[Single] (0 base, 1 diff methods)
          48 (     ∞ of base) : System.Private.CoreLib.dasm - System.Numerics.VectorMath:ConditionalSelectBitwise(System.Runtime.Intrinsics.Vector128`1[Double],System.Runtime.Intrinsics.Vector128`1[Double],System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector128`1[Double] (0 base, 1 diff methods)
          28 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(ubyte):System.Runtime.Intrinsics.Vector128`1[Byte] (0 base, 1 diff methods)
          24 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(double):System.Runtime.Intrinsics.Vector128`1[Double] (0 base, 1 diff methods)
          28 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(short):System.Runtime.Intrinsics.Vector128`1[Int16] (0 base, 1 diff methods)
          24 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(int):System.Runtime.Intrinsics.Vector128`1[Int32] (0 base, 1 diff methods)
          24 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(long):System.Runtime.Intrinsics.Vector128`1[Int64] (0 base, 1 diff methods)
          28 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(byte):System.Runtime.Intrinsics.Vector128`1[SByte] (0 base, 1 diff methods)
          24 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(float):System.Runtime.Intrinsics.Vector128`1[Single] (0 base, 1 diff methods)
          28 (     ∞ of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128:Create(ushort):System.Runtime.Intrinsics.Vector128`1[UInt16] (0 base, 1 diff methods)

Top method improvements (percentages):
         -44 (-64.706% of base) : System.Private.CoreLib.dasm - System.SpanHelpers:GetCharVector128SpanLength(long,long):long
         -44 (-64.706% of base) : System.Private.CoreLib.dasm - System.SpanHelpers:GetCharVector256SpanLength(long,long):long
         -40 (-58.824% of base) : System.Private.CoreLib.dasm - System.SpanHelpers:GetByteVector128SpanLength(long,int):long
         -40 (-58.824% of base) : System.Private.CoreLib.dasm - System.SpanHelpers:GetByteVector256SpanLength(long,int):long
        -104 (-57.778% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Int64][System.Int64]:Equals(System.Runtime.Intrinsics.Vector64`1[Int64]):bool:this
        -104 (-57.778% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[UInt64][System.UInt64]:Equals(System.Runtime.Intrinsics.Vector64`1[UInt64]):bool:this
        -104 (-54.167% of base) : System.Private.CoreLib.dasm - System.SpanHelpers:UnalignedCountVector128(byref):long (2 methods)
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[Byte][System.Byte]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[Byte]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Double][System.Double]:get_AllBitsSet():System.Runtime.Intrinsics.Vector64`1[Double]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Int16][System.Int16]:get_AllBitsSet():System.Runtime.Intrinsics.Vector64`1[Int16]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Int64][System.Int64]:get_AllBitsSet():System.Runtime.Intrinsics.Vector64`1[Int64]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector64`1[Int32][System.Int32]:get_AllBitsSet():System.Runtime.Intrinsics.Vector64`1[Int32]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[UInt16][System.UInt16]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[UInt16]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[SByte][System.SByte]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[SByte]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[Single][System.Single]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[Single]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[UInt64][System.UInt64]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[UInt64]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[UInt32][System.UInt32]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[UInt32]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[Double][System.Double]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[Double]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[Int16][System.Int16]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[Int16]
         -28 (-53.846% of base) : System.Private.CoreLib.dasm - System.Runtime.Intrinsics.Vector128`1[Int64][System.Int64]:get_AllBitsSet():System.Runtime.Intrinsics.Vector128`1[Int64]

1825 total methods with Code Size differences (314 improved, 1511 regressed), 26424 unchanged.
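
As a quick sanity check on the summary above, the headline figures can be reconciled with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes the total diff is the sum of the bytes from methods unique to one side and the net change in methods present on both sides. The implied base size is approximate because the reported percentage is rounded.

```python
# Figures copied from the code size summary above.
total_diff_bytes = 53216      # "Total bytes of diff"
unique_diff_bytes = 59916     # bytes from methods that exist only in the diff
unique_base_bytes = 0         # bytes from methods that exist only in the base
percent_of_base = 0.935       # reported regression, in percent (rounded)
improved, regressed = 314, 1511

# Assumption: total diff = (unique diff bytes - unique base bytes)
#                          + net change in methods common to base and diff.
common_methods_delta = total_diff_bytes - (unique_diff_bytes - unique_base_bytes)

# Implied size of the base, approximate because the percentage is rounded.
approx_base_bytes = round(total_diff_bytes / (percent_of_base / 100))

# Methods with any code size difference.
methods_with_diffs = improved + regressed

print(f"net change in common methods: {common_methods_delta} bytes")
print(f"implied base size: ~{approx_base_bytes / 1e6:.2f} MB")
print(f"methods with diffs: {methods_with_diffs}")
```

Under that assumption, the regression is accounted for by the 1477 newly compiled method bodies, while the methods that were already precompiled shrink by 6700 bytes net.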

@kunalspathak kunalspathak changed the title from "Arm64 crossgen" to "Crossgen method bodies having ARM64 intrinsic" on Jun 18, 2020

@davidwrighton davidwrighton left a comment

Please fix my comment about wrapping the usage of the InstructionSet flags inside of a TARGET_ARM64 block.
