-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue where resolution move would overwrite the source of operand of JCMP #48833
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||
---|---|---|---|---|---|---|---|---|
|
@@ -7656,7 +7656,7 @@ void LinearScan::resolveRegisters() | |||||||
printf("Resolution Candidates: "); | ||||||||
dumpConvertedVarSet(compiler, resolutionCandidateVars); | ||||||||
printf("\n"); | ||||||||
printf("Has %sCritical Edges\n\n", hasCriticalEdges ? "" : "No"); | ||||||||
printf("Has %sCritical Edges\n\n", hasCriticalEdges ? "" : "No "); | ||||||||
|
||||||||
printf("Prior to Resolution\n"); | ||||||||
foreach_block(compiler, block) | ||||||||
|
@@ -8277,31 +8277,46 @@ void LinearScan::handleOutgoingCriticalEdges(BasicBlock* block) | |||||||
} | ||||||||
} | ||||||||
|
||||||||
// Next, if this blocks ends with a switch table, we have to make sure not to copy | ||||||||
// into the registers that it uses. | ||||||||
regMaskTP switchRegs = RBM_NONE; | ||||||||
// Next, if this blocks ends with a switch table, or for Arm64, ends with JCMP instruction, | ||||||||
// make sure to not copy into the registers that are consumed at the end of this block. | ||||||||
// | ||||||||
// Note: Only switches and JCMP (for Arm4) have input regs (and so can be fed by copies), so those | ||||||||
// are the only block-ending branches that need special handling. | ||||||||
regMaskTP consumedRegs = RBM_NONE; | ||||||||
if (block->bbJumpKind == BBJ_SWITCH) | ||||||||
{ | ||||||||
// At this point, Lowering has transformed any non-switch-table blocks into | ||||||||
// cascading ifs. | ||||||||
GenTree* switchTable = LIR::AsRange(block).LastNode(); | ||||||||
assert(switchTable != nullptr && switchTable->OperGet() == GT_SWITCH_TABLE); | ||||||||
|
||||||||
switchRegs = switchTable->gtRsvdRegs; | ||||||||
consumedRegs = switchTable->gtRsvdRegs; | ||||||||
GenTree* op1 = switchTable->gtGetOp1(); | ||||||||
GenTree* op2 = switchTable->gtGetOp2(); | ||||||||
noway_assert(op1 != nullptr && op2 != nullptr); | ||||||||
assert(op1->GetRegNum() != REG_NA && op2->GetRegNum() != REG_NA); | ||||||||
// No floating point values, so no need to worry about the register type | ||||||||
// (i.e. for ARM32, where we used the genRegMask overload with a type). | ||||||||
assert(varTypeIsIntegralOrI(op1) && varTypeIsIntegralOrI(op2)); | ||||||||
switchRegs |= genRegMask(op1->GetRegNum()); | ||||||||
switchRegs |= genRegMask(op2->GetRegNum()); | ||||||||
consumedRegs |= genRegMask(op1->GetRegNum()); | ||||||||
consumedRegs |= genRegMask(op2->GetRegNum()); | ||||||||
} | ||||||||
|
||||||||
#ifdef TARGET_ARM64 | ||||||||
// Next, if this blocks ends with a JCMP, we have to make sure not to copy | ||||||||
// into the register that it uses or modify the local variable it must consume | ||||||||
// Next, if this blocks ends with a JCMP, we have to make sure: | ||||||||
// 1. Not to copy into the register that JCMP uses | ||||||||
// e.g. JCMP w21, BRANCH | ||||||||
// 2. Not to copy into the source of JCMP's operand before it is consumed | ||||||||
// e.g. Should not use w0 since it will contain wrong value after resolution | ||||||||
// call METHOD | ||||||||
// ; mov w0, w19 <-- should not resolve in w0 here. | ||||||||
// mov w21, w0 | ||||||||
// JCMP w21, BRANCH | ||||||||
// 3. Not to modify the local variable it must consume | ||||||||
|
||||||||
// Note: GT_COPY has special handling in codegen and its generation is merged with the | ||||||||
// node that consumes its result. So both, the input and output regs of GT_COPY must be | ||||||||
// excluded from the set available for resolution. | ||||||||
LclVarDsc* jcmpLocalVarDsc = nullptr; | ||||||||
if (block->bbJumpKind == BBJ_COND) | ||||||||
{ | ||||||||
|
@@ -8310,7 +8325,13 @@ void LinearScan::handleOutgoingCriticalEdges(BasicBlock* block) | |||||||
if (lastNode->OperIs(GT_JCMP)) | ||||||||
{ | ||||||||
GenTree* op1 = lastNode->gtGetOp1(); | ||||||||
switchRegs |= genRegMask(op1->GetRegNum()); | ||||||||
consumedRegs |= genRegMask(op1->GetRegNum()); | ||||||||
|
||||||||
if (op1->OperIs(GT_COPY)) | ||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are locals and copies the only special cases that might happen here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's the only ones I could think of. runtime/src/coreclr/jit/codegenarmarch.cpp Lines 379 to 381 in cb606ad
So in the test that I was investigating, the
Regarding |
||||||||
{ | ||||||||
GenTree* srcOp1 = op1->gtGetOp1(); | ||||||||
consumedRegs |= genRegMask(srcOp1->GetRegNum()); | ||||||||
} | ||||||||
|
||||||||
if (op1->IsLocal()) | ||||||||
{ | ||||||||
|
@@ -8388,9 +8409,11 @@ void LinearScan::handleOutgoingCriticalEdges(BasicBlock* block) | |||||||
{ | ||||||||
sameToReg = REG_NA; | ||||||||
} | ||||||||
// If this register is used by a switch table at the end of the block, we can't do the copy | ||||||||
// in this block (since we can't insert it after the switch). | ||||||||
if ((sameToRegMask & switchRegs) != RBM_NONE) | ||||||||
// If this register is busy because it is used by a switch table at the end of the block | ||||||||
// (or for Arm64, it is consumed by JCMP), we can't do the copy in this block since we can't | ||||||||
// insert it after the switch (or for Arm64, can't insert and overwrite the operand/source | ||||||||
// of operand of JCMP). | ||||||||
if ((sameToRegMask & consumedRegs) != RBM_NONE) | ||||||||
{ | ||||||||
sameToReg = REG_NA; | ||||||||
} | ||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not obvious to me why the copy needs special casing here -- could we not insert the resolution move between the copy and jump?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inserting the resolution move between the copy and jump should be fine for some cases, for example:
but when we generate code for
JCMP
, it is tied with the copy and so the resolution move happens before the copy/jmp, like this:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment that the
GT_COPY
gets special handling in codegen and is conceptually merged with the operation that consumes its result, so both its input and output regs must be excluded from the set available for resolution.And only switches and jcmp have input regs (and so can be fed by copies), so those are the only block-ending branches that need special handling.
Which makes me wonder if we might need similar logic for switches. It looks like we don't check if the input regs for switches come from
GT_COPY
. Switches are rare enough that we might not have good stress coverage.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Possibly, but I am not confident to make that change before seeing a scenario, so lets postpone it for now.