Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISC-V] Fix initReg usage in genPushCalleeSavedRegisters #99353

Merged
merged 1 commit into from
Mar 7, 2024

Conversation

sirntar
Copy link
Member

@sirntar sirntar commented Mar 6, 2024

This PR is the result of a discussion in #99156 and #99313

Tests fixed by this PR:

  • JIT/jit64/hfa/main/testC/hfa_nd2C_d
  • JIT/jit64/hfa/main/testC/hfa_sd2C_d

After a brief discussion in #99313, we decided to mimic arm64 approach:

// Probe large frames now, if necessary, since genPushCalleeSavedRegisters() will allocate the frame. Note that
// for arm64, genAllocLclFrame only probes the frame; it does not actually allocate it (it does not change SP).
// For arm64, we are probing the frame before the callee-saved registers are saved. The 'initReg' might have
// been calculated to be one of the callee-saved registers (say, if all the integer argument registers are
// in use, and perhaps with other conditions being satisfied). This is ok in other cases, after the callee-saved
// registers have been saved. So instead of letting genAllocLclFrame use initReg as a temporary register,
// always use REG_SCRATCH. We don't care if it trashes it, so ignore the initRegZeroed output argument.
bool ignoreInitRegZeroed = false;
genAllocLclFrame(compiler->compLclFrameSize, REG_SCRATCH, &ignoreInitRegZeroed,
intRegState.rsCalleeRegArgMaskLiveIn);

Why there was a problem in the first place?

// Next we prefer to use one of the unused argument registers.
// If they aren't available we use one of the caller-saved integer registers.
else
{
tempMask = regSet.rsGetModifiedRegsMask() & RBM_ALLINT & ~excludeMask & ~regSet.rsMaskResvd;
if (tempMask != RBM_NONE)
{
// We pick the lowest register number
tempMask = genFindLowestBit(tempMask);
initReg = genRegNumFromMask(tempMask);
}
}

As you can see initReg is created from RBM_ALLINT, where RBM_ALLINT is RBM_INT_CALLEE_SAVED | RBM_INT_CALLEE_TRASH, so in fact what matters here is which register have lower rnum.
If T0, T1 and T2 are excluded or reserved for some unknown reason, the next lowest register is S1, not T3, because S1 mask is 0x0200 and T3 is 0x10000000.

Part of #84834
cc @dotnet/samsung

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 6, 2024
@ghost
Copy link

ghost commented Mar 6, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This PR is the result of a discussion in #99156 and #99313

Tests fixed by this PR:

  • JIT/jit64/hfa/main/testC/hfa_nd2C_d
  • JIT/jit64/hfa/main/testC/hfa_sd2C_d

After a brief discussion in #99313, we decided to mimic arm64 approach:

// Probe large frames now, if necessary, since genPushCalleeSavedRegisters() will allocate the frame. Note that
// for arm64, genAllocLclFrame only probes the frame; it does not actually allocate it (it does not change SP).
// For arm64, we are probing the frame before the callee-saved registers are saved. The 'initReg' might have
// been calculated to be one of the callee-saved registers (say, if all the integer argument registers are
// in use, and perhaps with other conditions being satisfied). This is ok in other cases, after the callee-saved
// registers have been saved. So instead of letting genAllocLclFrame use initReg as a temporary register,
// always use REG_SCRATCH. We don't care if it trashes it, so ignore the initRegZeroed output argument.
bool ignoreInitRegZeroed = false;
genAllocLclFrame(compiler->compLclFrameSize, REG_SCRATCH, &ignoreInitRegZeroed,
intRegState.rsCalleeRegArgMaskLiveIn);

Why there was a problem in the first place?

// Next we prefer to use one of the unused argument registers.
// If they aren't available we use one of the caller-saved integer registers.
else
{
tempMask = regSet.rsGetModifiedRegsMask() & RBM_ALLINT & ~excludeMask & ~regSet.rsMaskResvd;
if (tempMask != RBM_NONE)
{
// We pick the lowest register number
tempMask = genFindLowestBit(tempMask);
initReg = genRegNumFromMask(tempMask);
}
}

As you can see initReg is created from RBM_ALLINT, where RBM_ALLINT is RBM_INT_CALLEE_SAVED | RBM_INT_CALLEE_TRASH, so in fact what matters here is which register have lower rnum.
If T0, T1 and T2 are excluded or reserved for some unknown reason, the next lowest register is S1, not T3, because S1 mask is 0x0200 and T3 is 0x10000000.

Part of #84834
cc @dotnet/samsung

Author: sirntar
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@clamp03 clamp03 added the arch-riscv Related to the RISC-V architecture label Mar 6, 2024
@BruceForstall BruceForstall merged commit 8330db9 into dotnet:main Mar 7, 2024
129 checks passed
@clamp03
Copy link
Member

clamp03 commented Mar 7, 2024

cc @shushanhf @LuckyXu-HF IMO, LOONGARCH64 also need to fix it.

@sirntar
Copy link
Member Author

sirntar commented Mar 7, 2024

cc @shushanhf @LuckyXu-HF IMO, LOONGARCH64 also need to fix it.

I think the reason it didn't fail on the LA64 is because of the differences in register architecture - the RBM_INT_CALLEE_TRASH registers have lower numbers than the RBM_INT_CALLEE_SAVED registers.

@shushanhf
Copy link
Contributor

cc @shushanhf @LuckyXu-HF IMO, LOONGARCH64 also need to fix it.

  • JIT/jit64/hfa/main/testC/hfa_nd2C_d
  • JIT/jit64/hfa/main/testC/hfa_sd2C_d

Thanks for your reminder.
These tests are ok for LA64 now which we had fixed these during runtime8.

@clamp03
Copy link
Member

clamp03 commented Mar 7, 2024

@sirntar @shushanhf Oh, I was wrong. Thanks for the comments. :)

@shushanhf
Copy link
Contributor

shushanhf commented Mar 7, 2024

@sirntar @shushanhf Oh, I was wrong. Thanks for the comments. :)

Thanks for your reminder.
I think the RISCV64 meets this for adding OSR recently.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 7, 2024
@sirntar sirntar deleted the riscv64-fix-initReg-usage branch October 9, 2024 09:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-riscv Related to the RISC-V architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants