[MichalPetryka] [X64] Intrinsify Unsafe.Read/Write/Copy, handle struct BitCast #140

Open
MihuBot opened this issue Jul 22, 2023 · 2 comments

Comments

@MihuBot
Owner

MihuBot commented Jul 22, 2023

Build completed in 2 hours 4 minutes.
dotnet/runtime#85562

CoreLib diffs

Found 2 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 6626343
Total bytes of diff: 6626338
Total bytes of delta: -5 (-0.00 % of base)
Total relative delta: -0.00
    diff is an improvement.
    relative diff is an improvement.


Top file improvements (bytes):
          -5 : System.Private.CoreLib.dasm (-0.00 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
          -5 (-0.37 % of base) : System.Private.CoreLib.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int (FullOpts)

Top method improvements (percentages):
          -5 (-0.37 % of base) : System.Private.CoreLib.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int (FullOpts)

1 total methods with Code Size differences (1 improved, 0 regressed), 52819 unchanged.

--------------------------------------------------------------------------------

Frameworks diffs

Diffs
Found 298 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 38419853
Total bytes of diff: 38420618
Total bytes of delta: 765 (0.00 % of base)
Total relative delta: 0.22
    diff is a regression.
    relative diff is a regression.


Top file regressions (bytes):
         269 : ILCompiler.Reflection.ReadyToRun.dasm (0.11 % of base)
         239 : System.Reflection.Metadata.dasm (0.04 % of base)
          98 : System.Text.Json.dasm (0.01 % of base)
          90 : System.Diagnostics.FileVersionInfo.dasm (0.88 % of base)
          41 : System.Reflection.MetadataLoadContext.dasm (0.02 % of base)
          40 : System.Security.Cryptography.dasm (0.00 % of base)

Top file improvements (bytes):
          -7 : System.Diagnostics.DiagnosticSource.dasm (-0.00 % of base)
          -5 : System.Private.CoreLib.dasm (-0.00 % of base)

8 total files with Code Size differences (2 improved, 6 regressed), 247 unchanged.

Top method regressions (bytes):
         245 (5.52 % of base) : ILCompiler.Reflection.ReadyToRun.dasm - ILCompiler.Reflection.ReadyToRun.ReadyToRunMethod:.ctor(ILCompiler.Reflection.ReadyToRun.ReadyToRunReader,ILCompiler.Reflection.ReadyToRun.IAssemblyMetadata,System.Reflection.Metadata.EntityHandle,int,System.String,System.String,System.String[],System.Nullable`1[int]):this (FullOpts)
          98 (4.04 % of base) : System.Text.Json.dasm - System.Text.Json.JsonDocument:Parse(System.ReadOnlySpan`1[ubyte],System.Text.Json.JsonReaderOptions,byref,byref) (FullOpts)
          90 (5.20 % of base) : System.Diagnostics.FileVersionInfo.dasm - System.Diagnostics.FileVersionInfo:GetStringAttributeArgumentValue(System.Reflection.Metadata.MetadataReader,System.Reflection.Metadata.CustomAttribute,byref) (FullOpts)
          54 (2.51 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[System.__Canon]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[System.__Canon]:this (FullOpts)
          54 (3.09 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaCustomAttributeHelpers:TypeMatchesNameAndNamespace(System.Reflection.Metadata.EntityHandle,System.ReadOnlySpan`1[ubyte],System.ReadOnlySpan`1[ubyte],System.Reflection.Metadata.MetadataReader):bool (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[double]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[double]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[int]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[int]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[long]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[long]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[short]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[short]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[System.Nullable`1[int]]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[System.Nullable`1[int]]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[System.Numerics.Vector`1[float]]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[System.Numerics.Vector`1[float]]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[ubyte]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[ubyte]:this (FullOpts)
          24 (1.51 % of base) : ILCompiler.Reflection.ReadyToRun.dasm - ILCompiler.Reflection.ReadyToRun.MetadataNameFormatter:EmitMethodDefinitionName(System.Reflection.Metadata.MethodDefinitionHandle,System.String,System.String):System.String:this (FullOpts)
          21 (0.86 % of base) : System.Security.Cryptography.dasm - System.Security.Cryptography.CapiHelper:ToRSAParameters(ubyte[],bool):System.Security.Cryptography.RSAParameters (FullOpts)
          19 (0.48 % of base) : System.Security.Cryptography.dasm - System.Security.Cryptography.CapiHelper:ToDSAParameters(ubyte[],bool,ubyte[]):System.Security.Cryptography.DSAParameters (FullOpts)
           2 (0.29 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaField:ComputeFieldType():System.Type:this (FullOpts)
           2 (0.32 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaGenericTypeParameterType:ComputeDeclaringType():System.Reflection.TypeLoading.RoType:this (FullOpts)
           2 (0.50 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaSignatureTypeProviderForToString:GetTypeFromSpecification(System.Reflection.Metadata.MetadataReader,System.Reflection.TypeLoading.TypeContext,System.Reflection.Metadata.TypeSpecificationHandle,ubyte):System.String:this (FullOpts)

Top method improvements (bytes):
         -19 (-4.48 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaModule:GetTypeFromSpecification(System.Reflection.Metadata.MetadataReader,System.Reflection.TypeLoading.TypeContext,System.Reflection.Metadata.TypeSpecificationHandle,ubyte):System.Reflection.TypeLoading.RoType:this (FullOpts)
         -11 (-1.40 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.AssemblyReference:GetAssemblyName():System.Reflection.AssemblyName:this (FullOpts)
          -7 (-5.34 % of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.ActivityTraceId:CreateRandom():System.Diagnostics.ActivityTraceId (FullOpts)
          -5 (-0.37 % of base) : System.Private.CoreLib.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int (FullOpts)

Top method regressions (percentages):
         245 (5.52 % of base) : ILCompiler.Reflection.ReadyToRun.dasm - ILCompiler.Reflection.ReadyToRun.ReadyToRunMethod:.ctor(ILCompiler.Reflection.ReadyToRun.ReadyToRunReader,ILCompiler.Reflection.ReadyToRun.IAssemblyMetadata,System.Reflection.Metadata.EntityHandle,int,System.String,System.String,System.String[],System.Nullable`1[int]):this (FullOpts)
          90 (5.20 % of base) : System.Diagnostics.FileVersionInfo.dasm - System.Diagnostics.FileVersionInfo:GetStringAttributeArgumentValue(System.Reflection.Metadata.MetadataReader,System.Reflection.Metadata.CustomAttribute,byref) (FullOpts)
          98 (4.04 % of base) : System.Text.Json.dasm - System.Text.Json.JsonDocument:Parse(System.ReadOnlySpan`1[ubyte],System.Text.Json.JsonReaderOptions,byref,byref) (FullOpts)
          54 (3.09 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaCustomAttributeHelpers:TypeMatchesNameAndNamespace(System.Reflection.Metadata.EntityHandle,System.ReadOnlySpan`1[ubyte],System.ReadOnlySpan`1[ubyte],System.Reflection.Metadata.MetadataReader):bool (FullOpts)
          54 (2.51 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[System.__Canon]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[System.__Canon]:this (FullOpts)
          24 (1.51 % of base) : ILCompiler.Reflection.ReadyToRun.dasm - ILCompiler.Reflection.ReadyToRun.MetadataNameFormatter:EmitMethodDefinitionName(System.Reflection.Metadata.MethodDefinitionHandle,System.String,System.String):System.String:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[double]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[double]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[int]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[int]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[long]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[long]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[short]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[short]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[System.Nullable`1[int]]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[System.Nullable`1[int]]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[System.Numerics.Vector`1[float]]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[System.Numerics.Vector`1[float]]:this (FullOpts)
          28 (1.34 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.Ecma335.CustomAttributeDecoder`1[ubyte]:DecodeValue(System.Reflection.Metadata.EntityHandle,System.Reflection.Metadata.BlobHandle):System.Reflection.Metadata.CustomAttributeValue`1[ubyte]:this (FullOpts)
          21 (0.86 % of base) : System.Security.Cryptography.dasm - System.Security.Cryptography.CapiHelper:ToRSAParameters(ubyte[],bool):System.Security.Cryptography.RSAParameters (FullOpts)
           2 (0.50 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaSignatureTypeProviderForToString:GetTypeFromSpecification(System.Reflection.Metadata.MetadataReader,System.Reflection.TypeLoading.TypeContext,System.Reflection.Metadata.TypeSpecificationHandle,ubyte):System.String:this (FullOpts)
          19 (0.48 % of base) : System.Security.Cryptography.dasm - System.Security.Cryptography.CapiHelper:ToDSAParameters(ubyte[],bool,ubyte[]):System.Security.Cryptography.DSAParameters (FullOpts)
           2 (0.32 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaGenericTypeParameterType:ComputeDeclaringType():System.Reflection.TypeLoading.RoType:this (FullOpts)
           2 (0.29 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaField:ComputeFieldType():System.Type:this (FullOpts)

Top method improvements (percentages):
          -7 (-5.34 % of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.ActivityTraceId:CreateRandom():System.Diagnostics.ActivityTraceId (FullOpts)
         -19 (-4.48 % of base) : System.Reflection.MetadataLoadContext.dasm - System.Reflection.TypeLoading.Ecma.EcmaModule:GetTypeFromSpecification(System.Reflection.Metadata.MetadataReader,System.Reflection.TypeLoading.TypeContext,System.Reflection.Metadata.TypeSpecificationHandle,ubyte):System.Reflection.TypeLoading.RoType:this (FullOpts)
         -11 (-1.40 % of base) : System.Reflection.Metadata.dasm - System.Reflection.Metadata.AssemblyReference:GetAssemblyName():System.Reflection.AssemblyName:this (FullOpts)
          -5 (-0.37 % of base) : System.Private.CoreLib.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int (FullOpts)

22 total methods with Code Size differences (4 improved, 18 regressed), 233234 unchanged.

--------------------------------------------------------------------------------

Artifacts:

@MihuBot
Owner Author

MihuBot commented Jul 22, 2023

Top method improvements

-5 (-0.37 % of base) - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int
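
The only instruction-level change in the part of the listing shown here is in `G_M33313_IG20`, where `ror dx, 8` / `movzx rdx, dx` / `mov word ptr [rbx], dx` becomes a single `movbe word ptr [rbx], dx`; that block shrinks from 89 to 84 bytes, which matches the 5-byte improvement. The sketch below is a Python model (not the JIT's code) of why the two sequences write the same bytes: rotating a 16-bit value by 8 and storing it little-endian is the same as storing the original value big-endian. The helper names are hypothetical.

```python
import struct


def store_via_ror(value: int) -> bytes:
    """Model of the base code: rotate 16 bits by 8, then a little-endian store."""
    value &= 0xFFFF
    rotated = ((value >> 8) | (value << 8)) & 0xFFFF
    return struct.pack("<H", rotated)


def store_via_movbe(value: int) -> bytes:
    """Model of the diff code: a single byte-swapping (big-endian) store."""
    return struct.pack(">H", value & 0xFFFF)


# Exhaustively check every 16-bit value.
assert all(store_via_ror(v) == store_via_movbe(v) for v in range(0x10000))

# 0xC3A9 holds the two UTF-8 bytes of U+00E9 with the lead byte in the high half.
print(store_via_movbe(0xC3A9).hex())
```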
 ; Assembly listing for method System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
 ; fully interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 31 single block inlinees; 14 inlinees without PGO data
+; 0 inlinees with PGO data; 12 single block inlinees; 14 inlinees without PGO data
 ; Final local variable assignments
 ;
 ;  V00 arg0         [V00,T02] ( 61, 8839.50)    long  ->  r15        
-;  V01 arg1         [V01,T26] ( 12,    8.50)     int  ->  r13        
+;  V01 arg1         [V01,T25] ( 12,    8.50)     int  ->  r13        
 ;  V02 arg2         [V02,T01] ( 60, 9795   )    long  ->  rbx        
 ;  V03 arg3         [V03,T03] ( 45, 8076.50)     int  ->  r14        
-;  V04 arg4         [V04,T29] (  4,    3   )   byref  ->  r12         single-def
-;  V05 arg5         [V05,T30] (  4,    3   )   byref  ->  [rbp-30H]   single-def
+;  V04 arg4         [V04,T28] (  4,    3   )   byref  ->  r12         single-def
+;  V05 arg5         [V05,T29] (  4,    3   )   byref  ->  [rbp-30H]   single-def
 ;  V06 loc0         [V06,T04] ( 10, 1629   )    long  ->  rax        
-;  V07 loc1         [V07,T36] (  2,   64.50)  simd16  ->  mm0         ld-addr-op <System.Runtime.Intrinsics.Vector128`1[short]>
+;  V07 loc1         [V07,T35] (  2,   64.50)  simd16  ->  mm0         ld-addr-op <System.Runtime.Intrinsics.Vector128`1[short]>
 ;  V08 loc2         [V08,T00] ( 45,15440.50)     int  ->  registers  
-;  V09 loc3         [V09,T27] ( 12,    6   )     int  ->  rax        
-;  V10 loc4         [V10,T33] (  5,    2.50)     int  ->  rax        
-;  V11 loc5         [V11,T28] (  6,    5   )    long  ->  rax        
+;  V09 loc3         [V09,T26] ( 12,    6   )     int  ->  rax        
+;  V10 loc4         [V10,T32] (  5,    2.50)     int  ->  rax        
+;  V11 loc5         [V11,T27] (  6,    5   )    long  ->  rax        
 ;* V12 loc6         [V12    ] (  0,    0   )     int  ->  zero-ref   
 ;* V13 loc7         [V13    ] (  0,    0   )     int  ->  zero-ref   
-;  V14 loc8         [V14,T23] (  3,   24   )     int  ->  rdx        
-;  V15 loc9         [V15,T15] (  3,   80   )     int  ->  rcx        
-;  V16 loc10        [V16,T16] (  8,   64   )    long  ->  rcx        
+;  V14 loc8         [V14,T22] (  3,   24   )     int  ->  rdx        
+;  V15 loc9         [V15,T14] (  3,   80   )     int  ->  rcx        
+;  V16 loc10        [V16,T15] (  8,   64   )    long  ->  rcx        
 ;  V17 loc11        [V17,T13] (  6,  216   )     int  ->  rdi        
-;  V18 loc12        [V18,T35] ( 11,  312   )  simd16  ->  mm1         <System.Runtime.Intrinsics.Vector128`1[short]>
+;  V18 loc12        [V18,T34] ( 11,  312   )  simd16  ->  mm1         <System.Runtime.Intrinsics.Vector128`1[short]>
 ;* V19 loc13        [V19    ] (  0,    0   )     ref  ->  zero-ref    class-hnd <System.Object>
 ;* V20 loc14        [V20    ] (  0,    0   )     ref  ->  zero-ref    class-hnd <System.Object>
 ;* V21 loc15        [V21    ] (  0,    0   )     ref  ->  zero-ref    class-hnd <System.Object>
 ;* V22 loc16        [V22    ] (  0,    0   )     ref  ->  zero-ref    class-hnd <System.Object>
 ;* V23 loc17        [V23    ] (  0,    0   )     ref  ->  zero-ref    class-hnd <System.Object>
 ;* V24 loc18        [V24    ] (  0,    0   )     int  ->  zero-ref   
 ;* V25 loc19        [V25    ] (  0,    0   )     int  ->  zero-ref   
 ;* V26 loc20        [V26    ] (  0,    0   )     int  ->  zero-ref   
 ;# V27 OutArgs      [V27    ] (  1,    1   )  struct ( 0) [rsp+00H]   do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;  V28 tmp1         [V28,T34] (  2,    2   )     int  ->  rdx         "Inline return value spill temp"
-;* V29 tmp2         [V29    ] (  0,    0   )  ushort  ->  zero-ref    "Inlining Arg"
-;  V30 tmp3         [V30,T24] (  3,   24   )    long  ->  rdx         "Inline return value spill temp"
-;  V31 tmp4         [V31,T20] (  3,   48   )    long  ->  rcx         "Inlining Arg"
-;  V32 tmp5         [V32,T21] (  3,   48   )    long  ->  rdx         "Inlining Arg"
-;* V33 tmp6         [V33    ] (  0,    0   )     int  ->  zero-ref    "Inlining Arg"
-;* V34 tmp7         [V34    ] (  0,    0   )     int  ->  zero-ref    "Inlining Arg"
-;* V35 tmp8         [V35    ] (  0,    0   )  ushort  ->  zero-ref    "Inlining Arg"
-;* V36 tmp9         [V36,T17] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V37 tmp10        [V37,T18] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V38 tmp11        [V38,T08] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V39 tmp12        [V39    ] (  0,    0   )     int  ->  zero-ref    "Inlining Arg"
-;* V40 tmp13        [V40,T05] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V41 tmp14        [V41    ] (  0,    0   )     int  ->  zero-ref    "Inlining Arg"
+;  V28 tmp1         [V28,T33] (  2,    2   )     int  ->  rdx         "Inline return value spill temp"
+;  V29 tmp2         [V29,T23] (  3,   24   )    long  ->  rdx         "Inline return value spill temp"
+;  V30 tmp3         [V30,T19] (  3,   48   )    long  ->  rcx         "Inlining Arg"
+;  V31 tmp4         [V31,T20] (  3,   48   )    long  ->  rdx         "Inlining Arg"
+;* V32 tmp5         [V32,T16] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V33 tmp6         [V33,T17] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V34 tmp7         [V34,T08] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V35 tmp8         [V35,T05] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V36 tmp9         [V36    ] (  0,    0   )     int  ->  zero-ref    "Inlining Arg"
+;* V37 tmp10        [V37    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
+;* V38 tmp11        [V38,T18] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V39 tmp12        [V39,T11] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V40 tmp13        [V40,T09] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V41 tmp14        [V41,T12] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
 ;* V42 tmp15        [V42    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;  V43 tmp16        [V43,T14] (  2,  128   )  ushort  ->  rdx         "Inlining Arg"
-;* V44 tmp17        [V44,T19] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V45 tmp18        [V45,T11] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V46 tmp19        [V46,T09] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V47 tmp20        [V47,T12] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V48 tmp21        [V48    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;* V49 tmp22        [V49    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;* V50 tmp23        [V50    ] (  0,    0   )     int  ->  zero-ref    "Inlining Arg"
-;* V51 tmp24        [V51    ] (  0,    0   )   byref  ->  zero-ref    "Inlining Arg"
-;* V52 tmp25        [V52    ] (  0,    0   )  ushort  ->  zero-ref    "Inlining Arg"
-;* V53 tmp26        [V53,T06] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V43 tmp16        [V43    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
+;* V44 tmp17        [V44,T06] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V45 tmp18        [V45    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
+;* V46 tmp19        [V46    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
+;* V47 tmp20        [V47,T10] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V48 tmp21        [V48,T07] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;* V49 tmp22        [V49,T30] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
+;  V50 tmp23        [V50,T24] (  5,   20   )     int  ->  rcx         "Inlining Arg"
+;* V51 tmp24        [V51    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
+;* V52 tmp25        [V52    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
+;* V53 tmp26        [V53    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
 ;* V54 tmp27        [V54    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;* V55 tmp28        [V55    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;* V56 tmp29        [V56    ] (  0,    0   )  ushort  ->  zero-ref    "Inlining Arg"
-;* V57 tmp30        [V57,T10] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V58 tmp31        [V58,T07] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;* V59 tmp32        [V59,T31] (  0,    0   )    bool  ->  zero-ref    "Inline return value spill temp"
-;  V60 tmp33        [V60,T25] (  5,   20   )     int  ->  rcx         "Inlining Arg"
-;* V61 tmp34        [V61    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;* V62 tmp35        [V62    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;* V63 tmp36        [V63    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;* V64 tmp37        [V64    ] (  0,    0   )     int  ->  zero-ref    "Inline stloc first use temp"
-;* V65 tmp38        [V65    ] (  0,    0   )     int  ->  zero-ref    "Inlining Arg"
-;  V66 rat0         [V66,T22] (  3,   48   )    long  ->  rcx         "ReplaceWithLclVar is creating a new local variable"
-;  V67 rat1         [V67,T32] (  3,    3   )    long  ->  rax         "ReplaceWithLclVar is creating a new local variable"
+;  V55 rat0         [V55,T21] (  3,   48   )    long  ->  rcx         "ReplaceWithLclVar is creating a new local variable"
+;  V56 rat1         [V56,T31] (  3,    3   )    long  ->  rax         "ReplaceWithLclVar is creating a new local variable"
 ;
 ; Lcl frame size = 8
 ; BEGIN METHOD System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int
 
 G_M33313_IG01:
        push     rbp
        push     r15
        push     r14
        push     r13
        push     r12
        push     rbx
        push     rax
        vzeroupper 
        lea      rbp, [rsp+30H]
        mov      bword ptr [rbp-30H], r9
        mov      r15, rdi
        mov      r13d, esi
        mov      rbx, rdx
        mov      r14d, ecx
        mov      r12, r8
 						;; size=38 bbWeight=1 PerfScore 10.75
 G_M33313_IG02:
        cmp      r13d, r14d
        mov      edx, r14d
        cmovle   edx, r13d
        mov      rdi, r15
        mov      rsi, rbx
        mov      rcx, 0xD1FFAB1E      ; code for System.Text.Ascii:NarrowUtf16ToAscii(ulong,ulong,ulong):ulong
        call     [rcx]System.Text.Ascii:NarrowUtf16ToAscii(ulong,ulong,ulong):ulong
        lea      r15, [r15+2*rax]
        add      rbx, rax
        cmp      eax, r13d
        jne      SHORT G_M33313_IG05
 						;; size=40 bbWeight=1 PerfScore 6.50
 G_M33313_IG03:
        mov      qword ptr [r12], r15
        mov      r12, bword ptr [rbp-30H]
        mov      qword ptr [r12], rbx
        xor      eax, eax
 						;; size=14 bbWeight=0.50 PerfScore 1.62
 G_M33313_IG04:
        add      rsp, 8
        pop      rbx
        pop      r12
        pop      r13
        pop      r14
        pop      r15
        pop      rbp
        ret      
 						;; size=15 bbWeight=0.50 PerfScore 2.12
 G_M33313_IG05:
        sub      r13d, eax
        sub      r14d, eax
        cmp      r13d, 2
        jl       G_M33313_IG31
        mov      eax, r13d
        lea      rax, [r15+2*rax-04H]
        vmovups  xmm0, xmmword ptr [reloc @RWD00]
 						;; size=32 bbWeight=0.50 PerfScore 3.00
 G_M33313_IG06:
        mov      ecx, dword ptr [r15]
 						;; size=3 bbWeight=4 PerfScore 8.00
 G_M33313_IG07:
        test     ecx, 0xD1FFAB1E
        jne      G_M33313_IG15
 						;; size=12 bbWeight=32 PerfScore 40.00
 G_M33313_IG08:
        cmp      r14d, 2
        jl       G_M33313_IG33
        mov      edx, ecx
        shr      edx, 8
        or       edx, ecx
        mov      word  ptr [rbx], dx
        add      r15, 4
        add      rbx, 2
        add      r14d, -2
        mov      rcx, rax
        sub      rcx, r15
        mov      rdx, rcx
        shr      rdx, 63
        add      rcx, rdx
        sar      rcx, 1
        add      ecx, 2
        movsxd   rdx, r14d
        cmp      rcx, rdx
        jle      SHORT G_M33313_IG09
        jmp      SHORT G_M33313_IG10
        align    [0 bytes for IG11]
 						;; size=64 bbWeight=8 PerfScore 78.00
 G_M33313_IG09:
        mov      rdx, rcx
 						;; size=3 bbWeight=8 PerfScore 2.00
 G_M33313_IG10:
        mov      ecx, edx
        shr      ecx, 3
        xor      edi, edi
        test     ecx, ecx
        jbe      SHORT G_M33313_IG12
 						;; size=11 bbWeight=8 PerfScore 18.00
 G_M33313_IG11:
        vmovups  xmm1, xmmword ptr [r15]
        vptest   xmm1, xmm0
        jne      SHORT G_M33313_IG13
        vpackuswb xmm1, xmm1, xmm1
        vmovq    qword ptr [rbx], xmm1
        add      r15, 16
        add      rbx, 8
        inc      edi
        cmp      edi, ecx
        jb       SHORT G_M33313_IG11
 						;; size=34 bbWeight=64 PerfScore 832.00
 G_M33313_IG12:
        shl      edi, 3
        sub      r14d, edi
        test     dl, 4
        je       G_M33313_IG29
        mov      rcx, qword ptr [r15]
        mov      rdx, 0xD1FFAB1E
        test     rcx, rdx
        jne      SHORT G_M33313_IG14
        vmovd    xmm1, rcx
        vpackuswb xmm1, xmm1, xmm1
        vmovd    dword ptr [rbx], xmm1
        add      r15, 8
        jmp      G_M33313_IG28
 						;; size=55 bbWeight=8 PerfScore 102.00
 G_M33313_IG13:
        lea      ecx, [8*rdi]
        sub      r14d, ecx
        vmovd    rcx, xmm1
        mov      rdx, 0xD1FFAB1E
        test     rcx, rdx
        jne      SHORT G_M33313_IG14
        vpackuswb xmm2, xmm1, xmm1
        vmovd    dword ptr [rbx], xmm2
        add      r15, 8
        add      rbx, 4
        add      r14d, -4
        vpextrq  rcx, xmm1, 1
 						;; size=56 bbWeight=8 PerfScore 88.00
 G_M33313_IG14:
        mov      edx, ecx
        test     edx, 0xD1FFAB1E
        jne      G_M33313_IG19
        mov      edi, edx
        shr      edi, 8
        or       edi, edx
        mov      word  ptr [rbx], di
        add      r15, 4
        add      rbx, 2
        add      r14d, -2
        shr      rcx, 32
        mov      edx, ecx
        mov      ecx, edx
 						;; size=44 bbWeight=8 PerfScore 42.00
 G_M33313_IG15:
        test     ecx, 0xFF80
        jne      SHORT G_M33313_IG17
 						;; size=8 bbWeight=128 PerfScore 160.00
 G_M33313_IG16:
        test     r14d, r14d
        je       G_M33313_IG41
        mov      byte  ptr [rbx], cl
        add      r15, 2
        inc      rbx
        dec      r14d
        cmp      r15, rax
        ja       G_M33313_IG30
        mov      ecx, dword ptr [r15]
 						;; size=33 bbWeight=32 PerfScore 200.00
 G_M33313_IG17:
        test     ecx, 0xF800
        jne      G_M33313_IG22
 						;; size=12 bbWeight=128 PerfScore 160.00
 G_M33313_IG18:
        lea      edx, [rcx+D1FFAB1EH]
        cmp      edx, 0xD1FFAB1E
        ja       SHORT G_M33313_IG20
        cmp      r14d, 4
        jl       G_M33313_IG33
        mov      edx, ecx
        shr      edx, 6
        and      edx, 0xD1FFAB1E
        shl      ecx, 8
        and      ecx, 0xD1FFAB1E
        add      ecx, edx
        add      ecx, 0xD1FFAB1E
        mov      dword ptr [rbx], ecx
        add      r15, 4
        add      rbx, 4
        add      r14d, -4
        cmp      r15, rax
        ja       G_M33313_IG30
        mov      ecx, dword ptr [r15]
        lea      edx, [rcx-80H]
        movzx    rdx, dx
        cmp      edx, 0x780
        jb       SHORT G_M33313_IG18
        jmp      G_M33313_IG07
 						;; size=97 bbWeight=512 PerfScore 7296.00
 G_M33313_IG19:
        mov      ecx, edx
        jmp      G_M33313_IG15
 						;; size=7 bbWeight=4 PerfScore 9.00
 G_M33313_IG20:
        cmp      r14d, 2
        jl       G_M33313_IG41
        lea      edx, [4*rcx]
        and      edx, 0x1F00
        mov      edi, ecx
        and      edi, 63
        lea      edx, [rdx+rdi+C080H]
        movzx    rdx, dx
-       ror      dx, 8
-       movzx    rdx, dx
-       mov      word  ptr [rbx], dx
+       movbe    word  ptr [rbx], dx
        cmp      ecx, 0xD1FFAB1E
        jae      SHORT G_M33313_IG21
        cmp      r14d, 3
        jl       G_M33313_IG32
        shr      ecx, 16
        mov      byte  ptr [rbx+02H], cl
        add      r15, 4
        add      rbx, 3
        add      r14d, -3
        jmp      G_M33313_IG29
-						;; size=89 bbWeight=32 PerfScore 392.00
+						;; size=84 bbWeight=32 PerfScore 368.00
 G_M33313_IG21:
        add      r15, 2
        add      rbx, 2
        add      r14d, -2
        cmp      r15, rax
        ja       G_M33313_IG30
        mov      ecx, dword ptr [r15]
 						;; size=24 bbWeight=32 PerfScore 128.00
 G_M33313_IG22:
        lea      edx, [rcx-D800H]
        test     edx, 0xF800
        je       G_M33313_IG27
        test     ecx, 0xD1FFAB1E
        je       G_M33313_IG24
 						;; size=30 bbWeight=1024 PerfScore 3072.00
 G_M33313_IG23:
        lea      edx, [rcx+D1FFAB1EH]
        cmp      edx, 0xD1FFAB1E
        jb       G_M33313_IG24
        cmp      r14d, 6
        jl       G_M33313_IG24
        lea      edx, [4*rcx]
        and      edx, 0x3F00
        mov      edi, ecx
        and      edi, 63
        shl      edi, 16
        or       edx, edi
        mov      edi, ecx
        shr      edi, 4
        and      edi, 0xD1FFAB1E
        mov      esi, ecx
        shr      esi, 12
        and      esi, 15
        or       edi, esi
        add      edx, edi
        add      edx, 0xD1FFAB1E
        mov      dword ptr [rbx], edx
        mov      edx, ecx
        shr      edx, 22
        and      edx, 63
        shr      ecx, 8
        and      ecx, 0x3F00
        add      ecx, edx
        add      ecx, 0x8080
        mov      word  ptr [rbx+04H], cx
        add      r15, 4
        add      rbx, 6
        add      r14d, -6
        cmp      r15, rax
        ja       G_M33313_IG30
        mov      ecx, dword ptr [r15]
        test     ecx, 0xF800
        jne      G_M33313_IG22
        jmp      G_M33313_IG07
 						;; size=152 bbWeight=512 PerfScore 9856.00
 G_M33313_IG24:
        cmp      r14d, 3
        jl       G_M33313_IG41
        lea      edx, [4*rcx]
        and      edx, 0x3F00
        movzx    rdi, cx
        shr      edi, 12
        add      edx, edi
        add      edx, 0x80E0
        mov      word  ptr [rbx], dx
        mov      edx, ecx
        and      edx, 63
        or       edx, -128
        mov      byte  ptr [rbx+02H], dl
        add      r15, 2
        add      rbx, 3
        add      r14d, -3
        cmp      ecx, 0xD1FFAB1E
        jae      SHORT G_M33313_IG26
 						;; size=71 bbWeight=1024 PerfScore 8192.00
 G_M33313_IG25:
        test     r14d, r14d
        je       G_M33313_IG41
        shr      ecx, 16
        mov      byte  ptr [rbx], cl
        add      r15, 2
        inc      rbx
        dec      r14d
        cmp      r15, rax
        ja       G_M33313_IG30
        mov      ecx, dword ptr [r15]
        test     ecx, 0xF800
        jne      G_M33313_IG22
        jmp      G_M33313_IG07
 						;; size=53 bbWeight=512 PerfScore 5120.00
 G_M33313_IG26:
        cmp      r15, rax
        ja       SHORT G_M33313_IG30
        mov      ecx, dword ptr [r15]
        jmp      G_M33313_IG15
 						;; size=13 bbWeight=16 PerfScore 84.00
 G_M33313_IG27:
        lea      edx, [rcx+D1FFAB1EH]
        test     edx, 0xD1FFAB1E
        jne      G_M33313_IG42
        cmp      r14d, 4
        jl       G_M33313_IG41
        add      ecx, 64
        mov      edx, ecx
        and      edx, 3
        shl      edx, 20
        or       edx, 0xD1FFAB1E
        mov      edi, ecx
        and      edi, 0xD1FFAB1E
        bswap    edi
        rol      edi, 16
        or       edx, edi
        mov      edi, ecx
        shr      edi, 6
        and      edi, 0xD1FFAB1E
        or       edx, edi
        and      ecx, 252
        shl      ecx, 6
        or       ecx, edx
        mov      dword ptr [rbx], ecx
        add      r15, 4
 						;; size=90 bbWeight=2 PerfScore 19.50
 G_M33313_IG28:
        add      rbx, 4
        add      r14d, -4
 						;; size=8 bbWeight=2 PerfScore 1.00
 G_M33313_IG29:
        cmp      r15, rax
        jbe      G_M33313_IG06
 						;; size=9 bbWeight=4 PerfScore 5.00
 G_M33313_IG30:
        sub      rax, r15
        mov      r13, rax
        shr      r13, 63
        add      r13, rax
        sar      r13, 1
        add      r13d, 2
 						;; size=20 bbWeight=0.50 PerfScore 1.00
 G_M33313_IG31:
        test     r13d, r13d
        je       G_M33313_IG39
        movzx    rax, word  ptr [r15]
        jmp      SHORT G_M33313_IG34
 						;; size=15 bbWeight=0.50 PerfScore 2.62
 G_M33313_IG32:
        add      r15, 2
        add      rbx, 2
        jmp      G_M33313_IG41
 						;; size=13 bbWeight=0.50 PerfScore 1.25
 G_M33313_IG33:
        movzx    rax, cx
 						;; size=3 bbWeight=0.50 PerfScore 0.12
 G_M33313_IG34:
        cmp      eax, 127
        ja       SHORT G_M33313_IG35
        test     r14d, r14d
        je       G_M33313_IG41
        mov      byte  ptr [rbx], al
        add      r15, 2
        inc      rbx
        jmp      SHORT G_M33313_IG38
 						;; size=25 bbWeight=0.50 PerfScore 3.00
 G_M33313_IG35:
        cmp      eax, 0x800
        jae      SHORT G_M33313_IG36
        cmp      r14d, 2
        jl       SHORT G_M33313_IG41
        mov      ecx, eax
        and      ecx, 63
        or       ecx, -128
        mov      byte  ptr [rbx+01H], cl
        shr      eax, 6
        or       eax, -64
        mov      byte  ptr [rbx], al
        add      r15, 2
        add      rbx, 2
        jmp      SHORT G_M33313_IG38
 						;; size=42 bbWeight=0.50 PerfScore 4.25
 G_M33313_IG36:
        lea      ecx, [rax-D800H]
        cmp      ecx, 0x7FF
        jbe      SHORT G_M33313_IG37
        cmp      r14d, 3
        jl       SHORT G_M33313_IG41
        mov      ecx, eax
        and      ecx, 63
        or       ecx, -128
        mov      byte  ptr [rbx+02H], cl
        mov      ecx, eax
        shr      ecx, 6
        and      ecx, 63
        or       ecx, -128
        mov      byte  ptr [rbx+01H], cl
        shr      eax, 12
        or       eax, -32
        mov      byte  ptr [rbx], al
        add      r15, 2
        add      rbx, 3
        jmp      SHORT G_M33313_IG38
 						;; size=63 bbWeight=0.50 PerfScore 5.62
 G_M33313_IG37:
        cmp      eax, 0xDBFF
        ja       SHORT G_M33313_IG42
        jmp      SHORT G_M33313_IG40
 						;; size=9 bbWeight=0.50 PerfScore 1.62
 G_M33313_IG38:
        cmp      r13d, 1
        jg       SHORT G_M33313_IG41
 						;; size=6 bbWeight=0.50 PerfScore 0.62
 G_M33313_IG39:
        xor      eax, eax
        jmp      SHORT G_M33313_IG43
 						;; size=4 bbWeight=0.50 PerfScore 1.12
 G_M33313_IG40:
        mov      eax, 2
        jmp      SHORT G_M33313_IG43
 						;; size=7 bbWeight=0.50 PerfScore 1.12
 G_M33313_IG41:
        mov      eax, 1
        jmp      SHORT G_M33313_IG43
 						;; size=7 bbWeight=0.50 PerfScore 1.12
 G_M33313_IG42:
        mov      eax, 3
 						;; size=5 bbWeight=0.50 PerfScore 0.12
 G_M33313_IG43:
        mov      qword ptr [r12], r15
        mov      r12, bword ptr [rbp-30H]
        mov      qword ptr [r12], rbx
 						;; size=12 bbWeight=0.50 PerfScore 1.50
 G_M33313_IG44:
        add      rsp, 8
        pop      rbx
        pop      r12
        pop      r13
        pop      r14
        pop      r15
        pop      rbp
        ret      
 						;; size=15 bbWeight=0.50 PerfScore 2.12
 RWD00  	dq	FF80FF80FF80FF80h, FF80FF80FF80FF80h
 
 ; END METHOD System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int
 
-; Total bytes of code 1363, prolog size 38, PerfScore 36092.05, instruction count 374, allocated bytes for code 1363 (MethodHash=85587dde) for method System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int (FullOpts)
+; Total bytes of code 1358, prolog size 38, PerfScore 36067.55, instruction count 372, allocated bytes for code 1358 (MethodHash=85587dde) for method System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ulong,int,ulong,int,byref,byref):int (FullOpts)
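
The only change in this method is in G_M33313_IG20, where the `ror dx, 8` / `movzx rdx, dx` / `mov word ptr [rbx], dx` sequence becomes a single `movbe word ptr [rbx], dx`. That accounts for the 5-byte drop (block size 89 to 84, method size 1363 to 1358) and the two fewer instructions. A minimal C sketch of why the two forms write the same bytes, assuming a little-endian host such as x64; the function names are hypothetical and this is not the CoreLib source:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Old sequence: swap the two bytes of the 16-bit value (ror dx, 8),
   then do an ordinary native-endian 2-byte store. */
static void store_swapped_then_native(uint8_t *dst, uint16_t value)
{
    uint16_t swapped = (uint16_t)((value >> 8) | (value << 8));
    memcpy(dst, &swapped, sizeof swapped);
}

/* New sequence: the effect of one movbe store, spelled out byte by byte.
   The high byte goes to the lower address. */
static void store_big_endian(uint8_t *dst, uint16_t value)
{
    dst[0] = (uint8_t)(value >> 8);
    dst[1] = (uint8_t)(value & 0xFF);
}
```

For example, with the value 0xC3A9 (the two-byte UTF-8 encoding of U+00E9) both functions leave the bytes C3 A9 in the destination on a little-endian machine.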

@MihuBot
Owner Author

MihuBot commented Jul 22, 2023

@MichalPetryka
