Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jit: run throw helper merge phase before morph #35255

Merged
merged 1 commit into from
Apr 23, 2020

Conversation

AndyAyersMS
Copy link
Member

@AndyAyersMS AndyAyersMS commented Apr 21, 2020

Now that we have pred lists before morph, we can move the throw helper
tail merge phase earlier in the phase list.

This has two benefits:

  • we can now merge a few more cases, because morph can introduce unique
    temps for otherwise identical calls;
  • it saves some throughput, because we no longer need to morph duplicate
    calls.

There is more opportunity here to reduce code size if we can find the right
heuristic in morph to decide if throw helpers should be called or tail-called,
though the overall benefit is small (~600 methods, ~2000 bytes). I left the
current heuristic in place as I couldn't come up with anything better.

Fixes #35135.

Now that we have pred lists before morph, we can move the throw helper
tail merge phase earlier in the phase list.

This has two benefits:
* we can now merge a few more cases, because morph can introduce unique
temps for otherwise identical calls;
* it saves some throughput, because we no longer need to morph duplicate
calls.

There is more opportunity here to reduce code size if we can find the right
heuristic in morph to decide if throw helpers should be called or tail-called,
though the overall benefit is small (~600 methods, ~2000k bytes). I left the
current heuristic in place as I couldn't come up with anything better.

Fixes dotnet#35135.
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 21, 2020
@AndyAyersMS
Copy link
Member Author

cc @dotnet/jit-contrib @benaadams

Diffs on Bens' example:

PMI CodeSize Diffs for c:\bugs\asp-throw\ex.exe for x64 default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -83 (-0.59% of base)
    diff is an improvement.
Top file improvements (bytes):
         -83 : ex.dasm (-0.59% of base)
1 total files with Code Size differences (1 improved, 0 regressed), 0 unchanged.
Top method improvements (bytes):
         -83 (-9.78% of base) : ex.dasm - HttpParser:ParseRequestLine(IHttpRequestLineHandler,ReadOnlySpan`1):this
Top method improvements (percentages):
         -83 (-9.78% of base) : ex.dasm - HttpParser:ParseRequestLine(IHttpRequestLineHandler,ReadOnlySpan`1):this
1 total methods with Code Size differences (1 improved, 0 regressed), 113 unchanged.

Some hits across full FX as well:

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for x64 default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -606 (-0.00% of base)
    diff is an improvement.
Top file regressions (bytes):
         101 : Microsoft.CodeAnalysis.CSharp.dasm (0.00% of base)
           6 : System.Transactions.Local.dasm (0.01% of base)
           1 : Microsoft.CodeAnalysis.dasm (0.00% of base)
Top file improvements (bytes):
        -374 : System.Linq.Parallel.dasm (-0.02% of base)
        -114 : System.Collections.Concurrent.dasm (-0.03% of base)
         -88 : System.Private.Xml.dasm (-0.00% of base)
         -73 : System.Private.CoreLib.dasm (-0.00% of base)
         -33 : System.Text.Json.dasm (-0.00% of base)
         -11 : System.Diagnostics.PerformanceCounter.dasm (-0.01% of base)
          -8 : System.Net.Primitives.dasm (-0.01% of base)
          -7 : System.Net.Sockets.dasm (-0.00% of base)
          -2 : System.Security.Cryptography.Algorithms.dasm (-0.00% of base)
          -2 : System.Security.Cryptography.Cng.dasm (-0.00% of base)
          -2 : System.Security.Cryptography.Pkcs.dasm (-0.00% of base)
14 total files with Code Size differences (11 improved, 3 regressed), 252 unchanged.
Top method regressions (bytes):
         149 ( 4.15% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SourceMemberContainerTypeSymbol:ComputeInterfaceImplementations(DiagnosticBag,CancellationToken):ImmutableArray`1:this
           6 ( 2.08% of base) : System.Transactions.Local.dasm - TransactionScope:SetCurrent(Transaction):this
           4 ( 0.18% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SourceMemberContainerTypeSymbol:ForceComplete(SourceLocation,CancellationToken):this
           3 ( 0.46% of base) : System.Private.CoreLib.dasm - SorterGenericArray:DownHeap(int,int,int):this
           2 ( 0.08% of base) : System.Private.Xml.dasm - XmlSerializationReader:ReadArray(String,String):Object:this
           1 ( 0.20% of base) : Microsoft.CodeAnalysis.CSharp.dasm - DocumentationCommentCompiler:VisitNamespace(NamespaceSymbol):this
           1 ( 0.02% of base) : Microsoft.CodeAnalysis.dasm - CommonCompiler:RunCore(TextWriter,ErrorLogger,CancellationToken):int:this
Top method improvements (bytes):
        -374 (-1.98% of base) : System.Linq.Parallel.dasm - SortHelper`2:MergeSortCooperatively():this (7 methods)
         -45 (-3.19% of base) : System.Private.Xml.dasm - XmlCharCheckingReader:Read():bool:this
         -45 (-0.86% of base) : System.Private.Xml.dasm - <ReadAsync>d__36:MoveNext():this
         -44 (-3.30% of base) : System.Private.CoreLib.dasm - Task:WaitAllCore(ref,int,CancellationToken):bool
         -42 (-12.32% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:ThrowIfInvalidObjectValue(Object) (7 methods)
         -40 (-2.20% of base) : Microsoft.CodeAnalysis.CSharp.dasm - DocumentationCommentCompiler:DefaultVisit(Symbol):this
         -36 (-1.68% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:System.Collections.IDictionary.Add(Object,Object):this (7 methods)
         -36 (-1.68% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:System.Collections.IDictionary.set_Item(Object,Object):this (7 methods)
         -33 (-0.17% of base) : System.Text.Json.dasm - <ReadAsync>d__11`1:MoveNext():this (7 methods)
         -11 (-2.03% of base) : System.Diagnostics.PerformanceCounter.dasm - PerformanceDataRegistryKey:GetValue(String,bool):ref:this
         -10 (-0.39% of base) : Microsoft.CodeAnalysis.CSharp.dasm - MethodCompiler:CompileNamedType(NamedTypeSymbol):this
          -9 (-1.31% of base) : System.Private.CoreLib.dasm - SorterGenericArray:InsertionSort(int,int):this
          -8 (-0.26% of base) : System.Net.Primitives.dasm - CookieContainer:AgeCookies(String):bool:this
          -7 (-3.98% of base) : System.Net.Sockets.dasm - IOControlKeepAlive:Fill(ref):this
          -7 (-1.89% of base) : System.Private.CoreLib.dasm - TextInfo:AddTitlecaseLetter(byref,byref,int,int):int:this
          -3 (-0.71% of base) : Microsoft.CodeAnalysis.CSharp.dasm - CSharpSemanticModel:GetParameterSymbol(ImmutableArray`1,ParameterSyntax,CancellationToken):ParameterSymbol:this
          -3 (-0.34% of base) : System.Private.CoreLib.dasm - Vector128`1:GetHashCode():int:this (6 methods)
          -3 (-0.39% of base) : System.Private.CoreLib.dasm - Vector256`1:GetHashCode():int:this (6 methods)
          -3 (-0.08% of base) : System.Private.CoreLib.dasm - Vector256`1:ToString():String:this (6 methods)
          -3 (-0.36% of base) : System.Private.CoreLib.dasm - Vector256`1:<Equals>g__SoftwareFallback|12_0(byref,Vector256`1):bool (6 methods)
Top method regressions (percentages):
         149 ( 4.15% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SourceMemberContainerTypeSymbol:ComputeInterfaceImplementations(DiagnosticBag,CancellationToken):ImmutableArray`1:this
           6 ( 2.08% of base) : System.Transactions.Local.dasm - TransactionScope:SetCurrent(Transaction):this
           3 ( 0.46% of base) : System.Private.CoreLib.dasm - SorterGenericArray:DownHeap(int,int,int):this
           1 ( 0.20% of base) : Microsoft.CodeAnalysis.CSharp.dasm - DocumentationCommentCompiler:VisitNamespace(NamespaceSymbol):this
           4 ( 0.18% of base) : Microsoft.CodeAnalysis.CSharp.dasm - SourceMemberContainerTypeSymbol:ForceComplete(SourceLocation,CancellationToken):this
           2 ( 0.08% of base) : System.Private.Xml.dasm - XmlSerializationReader:ReadArray(String,String):Object:this
           1 ( 0.02% of base) : Microsoft.CodeAnalysis.dasm - CommonCompiler:RunCore(TextWriter,ErrorLogger,CancellationToken):int:this
Top method improvements (percentages):
         -42 (-12.32% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:ThrowIfInvalidObjectValue(Object) (7 methods)
          -7 (-3.98% of base) : System.Net.Sockets.dasm - IOControlKeepAlive:Fill(ref):this
         -44 (-3.30% of base) : System.Private.CoreLib.dasm - Task:WaitAllCore(ref,int,CancellationToken):bool
         -45 (-3.19% of base) : System.Private.Xml.dasm - XmlCharCheckingReader:Read():bool:this
         -40 (-2.20% of base) : Microsoft.CodeAnalysis.CSharp.dasm - DocumentationCommentCompiler:DefaultVisit(Symbol):this
         -11 (-2.03% of base) : System.Diagnostics.PerformanceCounter.dasm - PerformanceDataRegistryKey:GetValue(String,bool):ref:this
        -374 (-1.98% of base) : System.Linq.Parallel.dasm - SortHelper`2:MergeSortCooperatively():this (7 methods)
          -7 (-1.89% of base) : System.Private.CoreLib.dasm - TextInfo:AddTitlecaseLetter(byref,byref,int,int):int:this
         -36 (-1.68% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:System.Collections.IDictionary.set_Item(Object,Object):this (7 methods)
         -36 (-1.68% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:System.Collections.IDictionary.Add(Object,Object):this (7 methods)
          -9 (-1.31% of base) : System.Private.CoreLib.dasm - SorterGenericArray:InsertionSort(int,int):this
         -45 (-0.86% of base) : System.Private.Xml.dasm - <ReadAsync>d__36:MoveNext():this
          -3 (-0.71% of base) : Microsoft.CodeAnalysis.CSharp.dasm - CSharpSemanticModel:GetParameterSymbol(ImmutableArray`1,ParameterSyntax,CancellationToken):ParameterSymbol:this
          -3 (-0.51% of base) : System.Private.CoreLib.dasm - Vector64`1:GetHashCode():int:this (6 methods)
          -3 (-0.39% of base) : System.Private.CoreLib.dasm - Vector256`1:GetHashCode():int:this (6 methods)
         -10 (-0.39% of base) : Microsoft.CodeAnalysis.CSharp.dasm - MethodCompiler:CompileNamedType(NamedTypeSymbol):this
          -3 (-0.36% of base) : System.Private.CoreLib.dasm - Vector256`1:<Equals>g__SoftwareFallback|12_0(byref,Vector256`1):bool (6 methods)
          -3 (-0.34% of base) : System.Private.CoreLib.dasm - Vector128`1:GetHashCode():int:this (6 methods)
          -8 (-0.26% of base) : System.Net.Primitives.dasm - CookieContainer:AgeCookies(String):bool:this
          -2 (-0.18% of base) : System.Security.Cryptography.Pkcs.dasm - KeyFormatHelper:WriteEncryptedPkcs8(ReadOnlySpan`1,ReadOnlySpan`1,AsnWriter,PbeParameters):AsnWriter
32 total methods with Code Size differences (25 improved, 7 regressed), 244486 unchanged.

The one big regression is block weight/RA related.

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Apr 21, 2020

Odd, a comment by @EgorBo was here but vanished. I'll recreate it...

Is it going to fold comparisons of constant strings? (don't know where GenTree::Compare is used)

Yes, it could, for reference equality checks. GenTree::Compare is invoked by gtFoldExprCompare. I didn't see this happen, but will play with some test cases. We probably fold those away later on during the optimizer, so this might not lead to observable diffs.

@EgorBo
Copy link
Member

EgorBo commented Apr 21, 2020

@AndyAyersMS
I removed my comment because I realized that constant strings are compared via string.Equals call instead of GT_EQ/GT_NE 🙂

@benaadams
Copy link
Member

Thank you; this is very helpful for throw helpers that take larger data types (e.g. Span<byte>) as there is a bunch of preamble in passing them

@AndyAyersMS
Copy link
Member Author

@dotnet/jit-contrib ping...

Copy link
Member

@erozenfeld erozenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one question.


for (; !iter.Equal(end); iter++)
{
BasicBlock* const nonCanonicalBlock = iter.Get();
BasicBlock* const canonicalBlock = iter.GetValue();
flowList* nextPredEdge = nullptr;
bool updated = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this new local?

Copy link
Member Author

@AndyAyersMS AndyAyersMS Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously updateCount would overstate the number of cases transformed, so I added updated to get the accounting right.

Admittedly we don't use this now more accurate the count for much. I had ambitions of using it to adjust optNoReturnCallCount so that the tail call heuristic in morph would also see a more accurate value and hence make better decisions, but doing that made things worse overall (bigger code size).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, github shows diffs inaccurately, which that confused me and I didn't see that there was a semantic change. Previously we incremented updateCount once per predecessor and now we increment updateCount once per non-canonical block if there was a predecessor change. That makes sense.

@AndyAyersMS
Copy link
Member Author

Not sure what's up with the perf leg, am going to ignore it.

@AndyAyersMS AndyAyersMS merged commit b828309 into dotnet:master Apr 23, 2020
@AndyAyersMS AndyAyersMS deleted the RunThrowMergingBeforeMorph branch April 23, 2020 21:48
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: consider running throw helper merge pass earlier
5 participants