Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: build pred lists before inlining #81000

Merged
merged 2 commits into from
Jan 24, 2023

Conversation

AndyAyersMS
Copy link
Member

Move pred list building up a bit further. Note that this impacts both the root method and all inlinees, since we run a number of the initial phases on both.

Since inlinees build their basic blocks and flow edges in the root compiler's memory pool, the only work required to unify the pred lists for a successful inline is to get things right at the boundaries. And for failed inlines there is no cross-referencing so we can just let the new pred lists leak away (like we alredy do for the inlinee blocks).

Contributes towards #80193.

Move pred list building up a bit further. Note that this impacts both the root
method and all inlinees, since we run a number of the initial phases on both.

Since inlinees build their basic blocks and flow edges in the root compiler's
memory pool, the only work required to unify the pred lists for a successful
inline is to get things right at the boundaries. And for failed inlines there
is no cross-referencing so we can just let the new pred lists leak away (like
we alredy do for the inlinee blocks).

Contributes towards dotnet#80193.
@ghost ghost assigned AndyAyersMS Jan 22, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 22, 2023
@ghost
Copy link

ghost commented Jan 22, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Move pred list building up a bit further. Note that this impacts both the root method and all inlinees, since we run a number of the initial phases on both.

Since inlinees build their basic blocks and flow edges in the root compiler's memory pool, the only work required to unify the pred lists for a successful inline is to get things right at the boundaries. And for failed inlines there is no cross-referencing so we can just let the new pred lists leak away (like we alredy do for the inlinee blocks).

Contributes towards #80193.

Author: AndyAyersMS
Assignees: AndyAyersMS
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@EgorBo PTAL
cc @dotnet/jit-contrib

Should be zero diff because I take pains to renumber after inlining. At some point we should think about how to address the implicit dependence some phases have on bbNums.

@AndyAyersMS
Copy link
Member Author

Hmm, crossgen unhappy.

@AndyAyersMS
Copy link
Member Author

Hmm, crossgen unhappy.

We were running post-phase checks on failed inlines. Not surprisingly they found things they didn't like. We can actually bail out of the phase list once an inline fails (which is can during importation). So added code for this too -- might actually give us a bit of TP improvement.

@AndyAyersMS
Copy link
Member Author

might actually give us a bit of TP improvement.

Diffs show a 0.2%ish TP regression when optimizing.

A bit more costly than I'd expect; let me poke at it a bit.

@AndyAyersMS
Copy link
Member Author

might actually give us a bit of TP improvement.

Diffs show a 0.2%ish TP regression when optimizing.

A bit more costly than I'd expect; let me poke at it a bit.

Looks like it is just the extra cost of building pred lists for inlinees, which we need to pay anyways.

@EgorBo
Copy link
Member

EgorBo commented Jan 24, 2023

Looks like it is just the extra cost of building pred lists for inlinees, which we need to pay anyways.

Probably some more places need if (compIsForInlining())? Although, I'm fine with it as long as you think it's an unavoidable cost and JIT TP for MinOpts is not affected

@AndyAyersMS
Copy link
Member Author

you think it's an unavoidable cost and JIT TP for MinOpts is not affected

Min opts is not affected.

I think we might be able to win back some of the lost perf, say by making bbID a real thing and using that to order the pred lists (so that they don't need reordering after renumbering).

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@AndyAyersMS
Copy link
Member Author

Detailed TP diff using @SingleAccretion 's pin tool for the benchmarks collection.

Method Base Diff Delta %Delta %Total
Grand Total 34792005497 34872975213 80969716 100% 0.23%
?fgRenumberBlocks@Compiler@@QEAA_NXZ 44544527 75012827 30468300 38% 0.09%
?fgAddRefPred@Compiler@@QEAAPEAUflowList@@PEAUBasicBlock@@0PEAU2@_N@Z 18999448 30165686 11166238 14% 0.03%
?fgComputePreds@Compiler@@QEAAXXZ 26088746 32101797 6013051 7% 0.02%
?compCompile@Compiler@@IEAAXPEAPEAXPEAIPEAVJitFlags@@@z 32536579 38528577 5991998 7% 0.02%
??9iterator@StatementList@@QEBA_NAEBV01@@z 28034492 32647436 4612944 6% 0.01%
?reorderPredList@BasicBlock@@QEAAXPEAVCompiler@@@z 5119793 8550818 3431025 4% 0.01%
jitstd 398728559 401244398 2515839 3% 0.01%
?PostPhase@Phase@@MEAAXW4PhaseStatus@@@z 37945934 40409577 2463643 3% 0.01%
ActionPhase<`Compiler 6197418 8649178 2451760 3% 0.01%
?fgRemoveRefPred@Compiler@@QEAAPEAUflowList@@PEAUBasicBlock@@0@Z 8988022 11423168 2435146 3% 0.01%
?compIsForInlining@Compiler@@QEBA_NXZ 7511871 9514632 2002761 2% 0.01%
?fgInsertInlineeBlocks@Compiler@@AEAAXPEAUInlineInfo@@@z 38208667 40041772 1833105 2% 0.01%
?InlDecisionIsFailure@@YA_NW4InlineDecision@@@z 11974945 13557798 1582853 2% 0.00%
?fgReplacePred@Compiler@@QEAAXPEAUBasicBlock@@00@Z 15185222 16740126 1554904 2% 0.00%
memset 290165795 291096514 930719 1% 0.00%
?PrePhase@Phase@@MEAAXXZ 11675672 12433716 758044 1% 0.00%
?IsFailure@InlineResult@@QEBA_NXZ 10770 600651 589881 1% 0.00%

@AndyAyersMS
Copy link
Member Author

coreclr jitstress failure is #80666

libraries jitstress failures are not ones I've seen before, but also seem unrelated. Both are on linux arm runs:

export DOTNET_TieredCompilation=0
export DOTNET_DbgEnableMiniDump=1
export DOTNET_DbgMiniDumpName=$HELIX_DUMP_FOLDER/coredump.%d.dmp
export DOTNET_ReadyToRun=0
export DOTNET_ZapDisable=1
  Starting:    System.Text.Json.Tests (parallel test collections = on, max threads = 2)

Assert failure(PID 28 [0x0000001c], Thread: 37 [0x0025]): GCHeapUtilities::GetGCHeap()->IsInFrozenSegment(obj)
    File: /__w/1/s/src/coreclr/vm/syncblk.cpp Line: 2048
    Image: /root/helix/work/correlation/dotnet

and

export DOTNET_TieredCompilation=0
export DOTNET_DbgEnableMiniDump=1
export DOTNET_DbgMiniDumpName=$HELIX_DUMP_FOLDER/coredump.%d.dmp
export DOTNET_ReadyToRun=0
export DOTNET_ZapDisable=1

  Starting:    Microsoft.Extensions.Logging.Generators.Roslyn4.0.Tests (parallel test collections = on, max threads = 2)

Assert failure(PID 27 [0x0000001b], Thread: 41 [0x0029]): Consistency check failed: FAILED: (!implSlot.IsNull() || pMT->IsInterface()) && "Valid method implementation was not found."
    File: /__w/1/s/src/coreclr/vm/virtualcallstub.cpp Line: 2206
    Image: /root/helix/work/correlation/dotnet

@AndyAyersMS AndyAyersMS merged commit e58174b into dotnet:main Jan 24, 2023
@EgorBo
Copy link
Member

EgorBo commented Jan 24, 2023

My understanding that on 32bit we have a sort of random asserts from Object::Validate on heap corruption/gc holes, while on x64 we mostly stop at that assert that checks that Object Header's padding is 4 bytes of all zeros and garbage almost never pass that assert. While on 32bit we continue validating object by checking some header bits and calling GC apis. In this case it was BIT_SBLK_GC_RESERVE + IsInFrozenSegment

@AndyAyersMS AndyAyersMS mentioned this pull request Jan 25, 2023
44 tasks
@ghost ghost locked as resolved and limited conversation to collaborators Feb 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants