JIT: Port loop unrolling to new loop representation #96454

Merged

jakobbotsch
Member

@jakobbotsch jakobbotsch commented Jan 3, 2024

Port loop unrolling to the new loop representation. Slightly change the strategy for how loop unrolling works:

  • If the bottom block of the loop is a BBJ_COND, create a "redirection" block to jump to its fallthrough. This is similar to how loop cloning works and saves a lot of annoying special casing around updating pred lists.
  • Leave the old loop unreachable in the flow graph after loop unrolling. Remove these blocks by running fgDfsBlocksAndRemove. Previously we would create a chain of BBJ_ALWAYS going through all the previous blocks, keeping them all reachable, likely because the old fgComputeDoms does not handle statically unreachable blocks correctly.
  • We run unrolling in a sort of "closure" algorithm, allowing only one unrolling in each loop nest per iteration. This avoids us having to maintain changed blocks of descendant loops on the side as we unroll.
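
The second bullet (leave the old loop unreachable, then prune it with a DFS) can be illustrated with a minimal sketch. This is a toy model, not the JIT's code: `ToyFlowGraph`, `ComputeReachable` and `FindUnreachableBlocks` are hypothetical names, and the real `fgDfsBlocksAndRemove` works on `BasicBlock`s and also has to deal with details this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy flow graph (hypothetical, not the JIT's BasicBlock):
// succs[i] lists the indices of block i's successors.
struct ToyFlowGraph
{
    std::vector<std::vector<std::size_t>> succs;
};

// Marks every block reachable from 'entry' using an iterative DFS.
std::vector<bool> ComputeReachable(const ToyFlowGraph& graph, std::size_t entry)
{
    assert(entry < graph.succs.size());

    std::vector<bool>        reachable(graph.succs.size(), false);
    std::vector<std::size_t> worklist{entry};
    reachable[entry] = true;

    while (!worklist.empty())
    {
        std::size_t block = worklist.back();
        worklist.pop_back();

        for (std::size_t succ : graph.succs[block])
        {
            if (!reachable[succ])
            {
                reachable[succ] = true;
                worklist.push_back(succ);
            }
        }
    }

    return reachable;
}

// Returns the indices of blocks the DFS never visited; in this toy model
// these stand in for the blocks of the old, now unreachable, loop.
std::vector<std::size_t> FindUnreachableBlocks(const ToyFlowGraph& graph, std::size_t entry)
{
    std::vector<bool>        reachable = ComputeReachable(graph, entry);
    std::vector<std::size_t> dead;

    for (std::size_t i = 0; i < reachable.size(); i++)
    {
        if (!reachable[i])
        {
            dead.push_back(i);
        }
    }

    return dead;
}
```

For example, if block 0 has been redirected to an unrolled copy (block 1) that jumps straight to the exit (block 4), the old loop blocks 2 and 3 still point at each other and at the exit, but nothing reaches them, so they are the ones reported for removal.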

Some minor diffs are expected:

  • We no longer recompute the old loop table in some cases (unrolling nested loops). This means, for instance, that hoisting may not kick in for some of those loops because of the "has matching old loop" quirk in hoisting. This should go away later when we remove that quirk.
  • Different weights because the old unrolling leaves the loop around as a chain of BBJ_ALWAYS, keeping their weight; when we later compact them, we propagate the original "loop" weight to the blocks we compact with. This causes differences in if-conversion, register allocation and block layout.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 3, 2024
@ghost ghost assigned jakobbotsch Jan 3, 2024
@ghost

ghost commented Jan 3, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

Azure Pipelines successfully started running 2 pipeline(s).

m_dfsTree = fgComputeDfs();
optFindNewLoops();
passes++;
goto RETRY_UNROLL;
Member Author

This pass could use some factoring into a few separate functions... I'll do that as a mechanical follow-up.
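
As a rough illustration of the "closure" structure behind the retry in the snippet above, here is a minimal sketch written as a loop rather than a `goto`. It is a toy model under stated assumptions: `ToyLoop` and `UnrollToFixedPoint` are hypothetical names, "unrolling" is modelled as the loop simply disappearing, and rebuilding the loop list stands in for the `fgComputeDfs` / `optFindNewLoops` recomputation.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Toy loop descriptor (hypothetical): which nest it belongs to and
// whether the unrolling heuristics would accept it.
struct ToyLoop
{
    int  nestId;
    bool canUnroll;
};

// Repeats unrolling passes until a pass changes nothing. Each pass unrolls
// at most one loop per nest, so no bookkeeping about changed blocks of
// descendant loops is needed within a pass.
// Returns the number of passes that unrolled at least one loop.
int UnrollToFixedPoint(std::vector<ToyLoop>& loops)
{
    int passes = 0;

    while (true)
    {
        std::set<int>        touchedNests; // nests already unrolled in this pass
        std::vector<ToyLoop> remaining;    // loops that survive this pass

        for (const ToyLoop& loop : loops)
        {
            bool nestFree = touchedNests.count(loop.nestId) == 0;
            if (loop.canUnroll && nestFree)
            {
                // "Unroll" the loop: it no longer exists as a loop afterwards.
                touchedNests.insert(loop.nestId);
            }
            else
            {
                remaining.push_back(loop);
            }
        }

        if (touchedNests.empty())
        {
            break; // fixed point reached: nothing was unrolled in this pass
        }

        // Stand-in for recomputing the DFS tree and rediscovering loops.
        assert(remaining.size() < loops.size());
        loops = std::move(remaining);
        passes++;
    }

    return passes;
}
```

With two unrollable loops in nest 0, one in nest 1 and a non-unrollable loop in nest 2, the first pass unrolls one loop from each of nests 0 and 1, the second pass unrolls the remaining loop in nest 0, and the third pass finds nothing left to do.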

@jakobbotsch jakobbotsch marked this pull request as ready for review January 3, 2024 21:44
@jakobbotsch
Member Author

cc @dotnet/jit-contrib PTAL @BruceForstall

Diffs. See the description above for the causes; in particular, most of the diffs I spot-checked were due to the weight change.

libraries-jitstress failures are #96464 and #86565

@@ -4914,6 +4914,8 @@ void Compiler::compCompile(void** methodCodePtr, uint32_t* methodCodeSize, JitFl

while (iterations > 0)
{
fgModified = false;
Member Author

This used to be set to false by fgComputeDoms at the end of loop unrolling. It seems better to explicitly do it here, given the comment below on PHASE_OPT_UPDATE_FLOW_GRAPH.

(It seems a bit questionable not to run the phase if we loop cloned/unrolled -- I saw some beneficial diffs when I originally didn't have this.)

Comment on lines +2576 to +2579
else if (block->bbMemorySsaPhiFunc[memoryKind] == BasicBlock::EmptyMemoryPhiDef)
{
printf(" = phi([not filled])\n");
}
Member Author

I've hit crashes here a few times during JITDUMP when I've messed up the flow graph.
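
The guard in the snippet above follows a common pattern: check for a sentinel value before walking a list. A minimal sketch of that pattern, assuming a toy `ToyPhiArg` list and a hypothetical `DescribePhi` helper (none of these are the JIT's actual types or dump routines):

```cpp
#include <cassert>
#include <string>

// Toy phi argument list node (hypothetical stand-in).
struct ToyPhiArg
{
    unsigned   ssaNum;
    ToyPhiArg* next;
};

// Sentinel meaning "a phi exists here but its arguments are not filled in yet".
// In this toy model it is the address of a dummy object; it must never be
// walked as if it were a real argument list.
static ToyPhiArg       g_emptyPhiDefStorage{0, nullptr};
static ToyPhiArg* const EmptyPhiDef = &g_emptyPhiDefStorage;

// Produces a dump string for a phi, checking the sentinel before walking.
std::string DescribePhi(const ToyPhiArg* phi)
{
    if (phi == nullptr)
    {
        return ""; // no phi at all
    }

    if (phi == EmptyPhiDef)
    {
        return "phi([not filled])"; // sentinel: do not dereference as a list
    }

    std::string text = "phi(";
    for (const ToyPhiArg* arg = phi; arg != nullptr; arg = arg->next)
    {
        assert(arg != EmptyPhiDef); // the sentinel never appears inside a list
        if (arg != phi)
        {
            text += ", ";
        }
        text += std::to_string(arg->ssaNum);
    }
    text += ")";
    return text;
}
```

The design point is only that a dump routine should stay usable when the flow graph is in a broken or half-built state, since that is exactly when the dump is most needed.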

optFindNewLoops();

fgDomsComputed = false;
fgRenumberBlocks();
Member Author

Not sure why this would be necessary, but there is a preexisting fgDebugCheckBBlist below that validates sequential bbNum values. Something for a future PR.

Member

@BruceForstall BruceForstall left a comment

LGTM

@jakobbotsch jakobbotsch merged commit 045e55a into dotnet:main Jan 4, 2024
163 of 171 checks passed
@jakobbotsch jakobbotsch deleted the loop-unrolling-new-representation branch January 4, 2024 09:11
@github-actions github-actions bot locked and limited conversation to collaborators Feb 4, 2024