[PPC][AIX] Save/restore r31 when using base pointer #100182
…epilogue When the base pointer r30 is used to hold the stack pointer, r30 is spilled in the prologue. On AIX, registers are saved from highest to lowest, so r31 also needs to be saved. Setting needsFP to true on AIX when the base pointer is used allows r31 to be saved and restored as well.
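The "highest to lowest" rule in the description can be pictured with a small standalone model. This is only a sketch under the assumption that the saved GPR set on AIX must form a contiguous run ending at r31. The names `SavedGPRs`, `lowestSavedGPR`, `isContiguousToR31` and `padToR31` are hypothetical and are not part of LLVM.

```cpp
#include <bitset>

// Hypothetical model: bit i set means GPR r<i> is saved in the prologue.
using SavedGPRs = std::bitset<32>;

// Returns the lowest-numbered saved GPR, or 32 if nothing is saved.
inline unsigned lowestSavedGPR(const SavedGPRs &Saved) {
  for (unsigned R = 0; R < 32; ++R)
    if (Saved.test(R))
      return R;
  return 32;
}

// True when the saved set is empty or is exactly r<N>..r31 for some N.
// Assumption: this is the shape the AIX convention expects.
inline bool isContiguousToR31(const SavedGPRs &Saved) {
  for (unsigned R = lowestSavedGPR(Saved); R < 32; ++R)
    if (!Saved.test(R))
      return false;
  return true;
}

// Fills every hole between the lowest saved register and r31.
inline SavedGPRs padToR31(SavedGPRs Saved) {
  for (unsigned R = lowestSavedGPR(Saved); R < 32; ++R)
    Saved.set(R);
  return Saved;
}
```

In this model, a set containing only r30 fails `isContiguousToR31`, and `padToR31` adds r31, which is the situation the PR addresses.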
@llvm/pr-subscribers-backend-powerpc
Author: Zaara Syeda (syzaara)
Patch is 31.64 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/100182.diff
3 Files Affected:
diff --git a/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp b/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
index 1963582ce6863..11332dbd8147c 100644
--- a/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
@@ -376,6 +376,9 @@ bool PPCFrameLowering::needsFP(const MachineFunction &MF) const {
if (MF.getFunction().hasFnAttribute(Attribute::Naked))
return false;
+ if (Subtarget.isAIXABI() && Subtarget.getRegisterInfo()->hasBasePointer(MF))
+ return true;
+
return MF.getTarget().Options.DisableFramePointerElim(MF) ||
MFI.hasVarSizedObjects() || MFI.hasStackMap() || MFI.hasPatchPoint() ||
MF.exposesReturnsTwice() ||
diff --git a/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll b/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
index ab222d770360c..cc4f0ee92c5dc 100644
--- a/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
+++ b/llvm/test/CodeGen/PowerPC/aix-base-pointer.ll
@@ -6,8 +6,9 @@
; Use an overaligned buffer to force base-pointer usage. Test verifies:
; - base pointer register (r30) is saved/defined/restored.
+; - frame pointer register (r31) is saved/defined/restored.
; - stack frame is allocated with correct alignment.
-; - Address of %AlignedBuffer is calculated based off offset from the stack
+; - Address of %AlignedBuffer is calculated based off offset from the frame
; pointer.
define float @caller(float %f) {
@@ -19,23 +20,29 @@ define float @caller(float %f) {
declare void @callee(ptr)
; 32BIT-LABEL: .caller:
+; 32BIT: stw 31, -12(1)
; 32BIT: stw 30, -16(1)
; 32BIT: mr 30, 1
; 32BIT: clrlwi 0, 1, 27
; 32BIT: subfic 0, 0, -224
; 32BIT: stwux 1, 1, 0
-; 32BIT: addi 3, 1, 64
+; 32BIT: mr 31, 1
+; 32BIT: addi 3, 31, 64
; 32BIT: bl .callee
; 32BIT: mr 1, 30
+; 32BIT: lwz 31, -12(1)
; 32BIT: lwz 30, -16(1)
; 64BIT-LABEL: .caller:
+; 64BIT: std 31, -16(1)
; 64BIT: std 30, -24(1)
; 64BIT: mr 30, 1
; 64BIT: clrldi 0, 1, 59
; 64BIT: subfic 0, 0, -288
; 64BIT: stdux 1, 1, 0
-; 64BIT: addi 3, 1, 128
+; 64BIT: mr 31, 1
+; 64BIT: addi 3, 31, 128
; 64BIT: bl .callee
; 64BIT: mr 1, 30
+; 64BIT: ld 31, -16(1)
; 64BIT: ld 30, -24(1)
diff --git a/llvm/test/CodeGen/PowerPC/ppc64-rop-protection-aix.ll b/llvm/test/CodeGen/PowerPC/ppc64-rop-protection-aix.ll
index 8955835f41ea6..318b6d2fc6aa3 100644
--- a/llvm/test/CodeGen/PowerPC/ppc64-rop-protection-aix.ll
+++ b/llvm/test/CodeGen/PowerPC/ppc64-rop-protection-aix.ll
@@ -2297,510 +2297,546 @@ define dso_local zeroext i32 @aligned(ptr nocapture readonly %in) #0 {
; BE-P10-LABEL: aligned:
; BE-P10: # %bb.0: # %entry
; BE-P10-NEXT: mflr r0
+; BE-P10-NEXT: std r31, -8(r1)
; BE-P10-NEXT: std r30, -16(r1)
; BE-P10-NEXT: lis r12, -1
; BE-P10-NEXT: mr r30, r1
; BE-P10-NEXT: std r0, 16(r1)
-; BE-P10-NEXT: hashst r0, -24(r1)
+; BE-P10-NEXT: hashst r0, -32(r1)
; BE-P10-NEXT: clrldi r0, r1, 49
; BE-P10-NEXT: subc r0, r12, r0
; BE-P10-NEXT: stdux r1, r1, r0
-; BE-P10-NEXT: std r31, -8(r30) # 8-byte Folded Spill
-; BE-P10-NEXT: mr r31, r3
+; BE-P10-NEXT: std r29, -24(r30) # 8-byte Folded Spill
+; BE-P10-NEXT: mr r29, r3
; BE-P10-NEXT: lwz r3, 4(r3)
; BE-P10-NEXT: lis r4, 0
-; BE-P10-NEXT: addi r5, r1, 32764
-; BE-P10-NEXT: ori r4, r4, 65508
-; BE-P10-NEXT: stwx r3, r1, r4
-; BE-P10-NEXT: lwz r3, 12(r31)
+; BE-P10-NEXT: mr r31, r1
+; BE-P10-NEXT: ori r4, r4, 65500
+; BE-P10-NEXT: stwx r3, r31, r4
+; BE-P10-NEXT: lwz r3, 12(r29)
; BE-P10-NEXT: lis r4, 0
; BE-P10-NEXT: ori r4, r4, 32768
-; BE-P10-NEXT: stwx r3, r1, r4
-; BE-P10-NEXT: lwz r3, 20(r31)
+; BE-P10-NEXT: stwx r3, r31, r4
+; BE-P10-NEXT: lwz r3, 20(r29)
; BE-P10-NEXT: lis r4, 0
-; BE-P10-NEXT: ori r4, r4, 65508
-; BE-P10-NEXT: add r4, r1, r4
-; BE-P10-NEXT: stw r3, 32764(r1)
+; BE-P10-NEXT: ori r4, r4, 65500
+; BE-P10-NEXT: stw r3, 32764(r31)
; BE-P10-NEXT: lis r3, 0
; BE-P10-NEXT: ori r3, r3, 32768
-; BE-P10-NEXT: add r3, r1, r3
+; BE-P10-NEXT: add r3, r31, r3
+; BE-P10-NEXT: add r4, r31, r4
+; BE-P10-NEXT: addi r5, r31, 32764
; BE-P10-NEXT: bl .callee3[PR]
; BE-P10-NEXT: nop
-; BE-P10-NEXT: lwz r4, 16(r31)
-; BE-P10-NEXT: ld r31, -8(r30) # 8-byte Folded Reload
+; BE-P10-NEXT: lwz r4, 16(r29)
+; BE-P10-NEXT: ld r29, -24(r30) # 8-byte Folded Reload
; BE-P10-NEXT: add r3, r4, r3
; BE-P10-NEXT: clrldi r3, r3, 32
; BE-P10-NEXT: mr r1, r30
; BE-P10-NEXT: ld r0, 16(r1)
-; BE-P10-NEXT: ld r30, -16(r1)
+; BE-P10-NEXT: ld r31, -8(r1)
; BE-P10-NEXT: mtlr r0
-; BE-P10-NEXT: hashchk r0, -24(r1)
+; BE-P10-NEXT: ld r30, -16(r1)
+; BE-P10-NEXT: hashchk r0, -32(r1)
; BE-P10-NEXT: blr
;
; BE-P9-LABEL: aligned:
; BE-P9: # %bb.0: # %entry
; BE-P9-NEXT: mflr r0
-; BE-P9-NEXT: std r30, -16(r1)
+; BE-P9-NEXT: std r31, -8(r1)
; BE-P9-NEXT: lis r12, -1
+; BE-P9-NEXT: std r30, -16(r1)
; BE-P9-NEXT: mr r30, r1
; BE-P9-NEXT: std r0, 16(r1)
-; BE-P9-NEXT: hashst r0, -24(r1)
+; BE-P9-NEXT: hashst r0, -32(r1)
; BE-P9-NEXT: clrldi r0, r1, 49
; BE-P9-NEXT: subc r0, r12, r0
; BE-P9-NEXT: stdux r1, r1, r0
-; BE-P9-NEXT: std r31, -8(r30) # 8-byte Folded Spill
-; BE-P9-NEXT: mr r31, r3
+; BE-P9-NEXT: std r29, -24(r30) # 8-byte Folded Spill
+; BE-P9-NEXT: mr r29, r3
; BE-P9-NEXT: lwz r3, 4(r3)
; BE-P9-NEXT: lis r4, 0
-; BE-P9-NEXT: addi r5, r1, 32764
-; BE-P9-NEXT: ori r4, r4, 65508
-; BE-P9-NEXT: stwx r3, r1, r4
-; BE-P9-NEXT: lwz r3, 12(r31)
+; BE-P9-NEXT: mr r31, r1
+; BE-P9-NEXT: ori r4, r4, 65500
+; BE-P9-NEXT: addi r5, r31, 32764
+; BE-P9-NEXT: stwx r3, r31, r4
+; BE-P9-NEXT: lwz r3, 12(r29)
; BE-P9-NEXT: lis r4, 0
; BE-P9-NEXT: ori r4, r4, 32768
-; BE-P9-NEXT: stwx r3, r1, r4
-; BE-P9-NEXT: lwz r3, 20(r31)
+; BE-P9-NEXT: stwx r3, r31, r4
+; BE-P9-NEXT: lwz r3, 20(r29)
; BE-P9-NEXT: lis r4, 0
-; BE-P9-NEXT: ori r4, r4, 65508
-; BE-P9-NEXT: stw r3, 32764(r1)
+; BE-P9-NEXT: ori r4, r4, 65500
+; BE-P9-NEXT: stw r3, 32764(r31)
; BE-P9-NEXT: lis r3, 0
-; BE-P9-NEXT: add r4, r1, r4
+; BE-P9-NEXT: add r4, r31, r4
; BE-P9-NEXT: ori r3, r3, 32768
-; BE-P9-NEXT: add r3, r1, r3
+; BE-P9-NEXT: add r3, r31, r3
; BE-P9-NEXT: bl .callee3[PR]
; BE-P9-NEXT: nop
-; BE-P9-NEXT: lwz r4, 16(r31)
-; BE-P9-NEXT: ld r31, -8(r30) # 8-byte Folded Reload
+; BE-P9-NEXT: lwz r4, 16(r29)
+; BE-P9-NEXT: ld r29, -24(r30) # 8-byte Folded Reload
; BE-P9-NEXT: add r3, r4, r3
; BE-P9-NEXT: clrldi r3, r3, 32
; BE-P9-NEXT: mr r1, r30
; BE-P9-NEXT: ld r0, 16(r1)
+; BE-P9-NEXT: ld r31, -8(r1)
; BE-P9-NEXT: ld r30, -16(r1)
; BE-P9-NEXT: mtlr r0
-; BE-P9-NEXT: hashchk r0, -24(r1)
+; BE-P9-NEXT: hashchk r0, -32(r1)
; BE-P9-NEXT: blr
;
; BE-P8-LABEL: aligned:
; BE-P8: # %bb.0: # %entry
; BE-P8-NEXT: mflr r0
+; BE-P8-NEXT: std r31, -8(r1)
; BE-P8-NEXT: std r30, -16(r1)
; BE-P8-NEXT: lis r12, -1
; BE-P8-NEXT: mr r30, r1
; BE-P8-NEXT: std r0, 16(r1)
-; BE-P8-NEXT: hashst r0, -24(r1)
+; BE-P8-NEXT: hashst r0, -32(r1)
; BE-P8-NEXT: clrldi r0, r1, 49
; BE-P8-NEXT: subc r0, r12, r0
; BE-P8-NEXT: stdux r1, r1, r0
; BE-P8-NEXT: lis r4, 0
-; BE-P8-NEXT: std r31, -8(r30) # 8-byte Folded Spill
-; BE-P8-NEXT: mr r31, r3
+; BE-P8-NEXT: std r29, -24(r30) # 8-byte Folded Spill
+; BE-P8-NEXT: mr r29, r3
; BE-P8-NEXT: lwz r3, 4(r3)
-; BE-P8-NEXT: addi r5, r1, 32764
-; BE-P8-NEXT: ori r4, r4, 65508
-; BE-P8-NEXT: stwx r3, r1, r4
+; BE-P8-NEXT: mr r31, r1
+; BE-P8-NEXT: ori r4, r4, 65500
+; BE-P8-NEXT: addi r5, r31, 32764
+; BE-P8-NEXT: stwx r3, r31, r4
; BE-P8-NEXT: lis r4, 0
-; BE-P8-NEXT: lwz r3, 12(r31)
+; BE-P8-NEXT: lwz r3, 12(r29)
; BE-P8-NEXT: ori r4, r4, 32768
-; BE-P8-NEXT: stwx r3, r1, r4
-; BE-P8-NEXT: lwz r3, 20(r31)
+; BE-P8-NEXT: stwx r3, r31, r4
+; BE-P8-NEXT: lwz r3, 20(r29)
; BE-P8-NEXT: lis r4, 0
-; BE-P8-NEXT: ori r4, r4, 65508
-; BE-P8-NEXT: stw r3, 32764(r1)
+; BE-P8-NEXT: ori r4, r4, 65500
+; BE-P8-NEXT: stw r3, 32764(r31)
; BE-P8-NEXT: lis r3, 0
-; BE-P8-NEXT: add r4, r1, r4
+; BE-P8-NEXT: add r4, r31, r4
; BE-P8-NEXT: ori r3, r3, 32768
-; BE-P8-NEXT: add r3, r1, r3
+; BE-P8-NEXT: add r3, r31, r3
; BE-P8-NEXT: bl .callee3[PR]
; BE-P8-NEXT: nop
-; BE-P8-NEXT: lwz r4, 16(r31)
-; BE-P8-NEXT: ld r31, -8(r30) # 8-byte Folded Reload
+; BE-P8-NEXT: lwz r4, 16(r29)
+; BE-P8-NEXT: ld r29, -24(r30) # 8-byte Folded Reload
; BE-P8-NEXT: add r3, r4, r3
; BE-P8-NEXT: clrldi r3, r3, 32
; BE-P8-NEXT: mr r1, r30
; BE-P8-NEXT: ld r0, 16(r1)
+; BE-P8-NEXT: ld r31, -8(r1)
; BE-P8-NEXT: ld r30, -16(r1)
-; BE-P8-NEXT: hashchk r0, -24(r1)
+; BE-P8-NEXT: hashchk r0, -32(r1)
; BE-P8-NEXT: mtlr r0
; BE-P8-NEXT: blr
;
; BE-32BIT-P10-LABEL: aligned:
; BE-32BIT-P10: # %bb.0: # %entry
; BE-32BIT-P10-NEXT: mflr r0
+; BE-32BIT-P10-NEXT: stw r31, -4(r1)
; BE-32BIT-P10-NEXT: stw r30, -8(r1)
; BE-32BIT-P10-NEXT: lis r12, -1
; BE-32BIT-P10-NEXT: mr r30, r1
; BE-32BIT-P10-NEXT: stw r0, 8(r1)
-; BE-32BIT-P10-NEXT: hashst r0, -16(r1)
+; BE-32BIT-P10-NEXT: hashst r0, -24(r1)
; BE-32BIT-P10-NEXT: clrlwi r0, r1, 17
; BE-32BIT-P10-NEXT: subc r0, r12, r0
; BE-32BIT-P10-NEXT: stwux r1, r1, r0
-; BE-32BIT-P10-NEXT: stw r31, -4(r30) # 4-byte Folded Spill
-; BE-32BIT-P10-NEXT: mr r31, r3
+; BE-32BIT-P10-NEXT: stw r29, -12(r30) # 4-byte Folded Spill
+; BE-32BIT-P10-NEXT: mr r29, r3
; BE-32BIT-P10-NEXT: lwz r3, 4(r3)
; BE-32BIT-P10-NEXT: lis r4, 0
-; BE-32BIT-P10-NEXT: addi r5, r1, 32764
-; BE-32BIT-P10-NEXT: ori r4, r4, 65516
-; BE-32BIT-P10-NEXT: stwx r3, r1, r4
-; BE-32BIT-P10-NEXT: lwz r3, 12(r31)
+; BE-32BIT-P10-NEXT: mr r31, r1
+; BE-32BIT-P10-NEXT: ori r4, r4, 65508
+; BE-32BIT-P10-NEXT: stwx r3, r31, r4
+; BE-32BIT-P10-NEXT: lwz r3, 12(r29)
; BE-32BIT-P10-NEXT: lis r4, 0
; BE-32BIT-P10-NEXT: ori r4, r4, 32768
-; BE-32BIT-P10-NEXT: stwx r3, r1, r4
-; BE-32BIT-P10-NEXT: lwz r3, 20(r31)
+; BE-32BIT-P10-NEXT: stwx r3, r31, r4
+; BE-32BIT-P10-NEXT: lwz r3, 20(r29)
; BE-32BIT-P10-NEXT: lis r4, 0
-; BE-32BIT-P10-NEXT: ori r4, r4, 65516
-; BE-32BIT-P10-NEXT: add r4, r1, r4
-; BE-32BIT-P10-NEXT: stw r3, 32764(r1)
+; BE-32BIT-P10-NEXT: ori r4, r4, 65508
+; BE-32BIT-P10-NEXT: stw r3, 32764(r31)
; BE-32BIT-P10-NEXT: lis r3, 0
; BE-32BIT-P10-NEXT: ori r3, r3, 32768
-; BE-32BIT-P10-NEXT: add r3, r1, r3
+; BE-32BIT-P10-NEXT: add r3, r31, r3
+; BE-32BIT-P10-NEXT: add r4, r31, r4
+; BE-32BIT-P10-NEXT: addi r5, r31, 32764
; BE-32BIT-P10-NEXT: bl .callee3[PR]
; BE-32BIT-P10-NEXT: nop
-; BE-32BIT-P10-NEXT: lwz r4, 16(r31)
-; BE-32BIT-P10-NEXT: lwz r31, -4(r30) # 4-byte Folded Reload
+; BE-32BIT-P10-NEXT: lwz r4, 16(r29)
+; BE-32BIT-P10-NEXT: lwz r29, -12(r30) # 4-byte Folded Reload
; BE-32BIT-P10-NEXT: add r3, r4, r3
; BE-32BIT-P10-NEXT: mr r1, r30
; BE-32BIT-P10-NEXT: lwz r0, 8(r1)
-; BE-32BIT-P10-NEXT: lwz r30, -8(r1)
+; BE-32BIT-P10-NEXT: lwz r31, -4(r1)
; BE-32BIT-P10-NEXT: mtlr r0
-; BE-32BIT-P10-NEXT: hashchk r0, -16(r1)
+; BE-32BIT-P10-NEXT: lwz r30, -8(r1)
+; BE-32BIT-P10-NEXT: hashchk r0, -24(r1)
; BE-32BIT-P10-NEXT: blr
;
; BE-32BIT-P9-LABEL: aligned:
; BE-32BIT-P9: # %bb.0: # %entry
; BE-32BIT-P9-NEXT: mflr r0
-; BE-32BIT-P9-NEXT: stw r30, -8(r1)
+; BE-32BIT-P9-NEXT: stw r31, -4(r1)
; BE-32BIT-P9-NEXT: lis r12, -1
+; BE-32BIT-P9-NEXT: stw r30, -8(r1)
; BE-32BIT-P9-NEXT: mr r30, r1
; BE-32BIT-P9-NEXT: stw r0, 8(r1)
-; BE-32BIT-P9-NEXT: hashst r0, -16(r1)
+; BE-32BIT-P9-NEXT: hashst r0, -24(r1)
; BE-32BIT-P9-NEXT: clrlwi r0, r1, 17
; BE-32BIT-P9-NEXT: subc r0, r12, r0
; BE-32BIT-P9-NEXT: stwux r1, r1, r0
-; BE-32BIT-P9-NEXT: stw r31, -4(r30) # 4-byte Folded Spill
-; BE-32BIT-P9-NEXT: mr r31, r3
+; BE-32BIT-P9-NEXT: stw r29, -12(r30) # 4-byte Folded Spill
+; BE-32BIT-P9-NEXT: mr r29, r3
; BE-32BIT-P9-NEXT: lwz r3, 4(r3)
; BE-32BIT-P9-NEXT: lis r4, 0
-; BE-32BIT-P9-NEXT: addi r5, r1, 32764
-; BE-32BIT-P9-NEXT: ori r4, r4, 65516
-; BE-32BIT-P9-NEXT: stwx r3, r1, r4
-; BE-32BIT-P9-NEXT: lwz r3, 12(r31)
+; BE-32BIT-P9-NEXT: mr r31, r1
+; BE-32BIT-P9-NEXT: ori r4, r4, 65508
+; BE-32BIT-P9-NEXT: addi r5, r31, 32764
+; BE-32BIT-P9-NEXT: stwx r3, r31, r4
+; BE-32BIT-P9-NEXT: lwz r3, 12(r29)
; BE-32BIT-P9-NEXT: lis r4, 0
; BE-32BIT-P9-NEXT: ori r4, r4, 32768
-; BE-32BIT-P9-NEXT: stwx r3, r1, r4
-; BE-32BIT-P9-NEXT: lwz r3, 20(r31)
+; BE-32BIT-P9-NEXT: stwx r3, r31, r4
+; BE-32BIT-P9-NEXT: lwz r3, 20(r29)
; BE-32BIT-P9-NEXT: lis r4, 0
-; BE-32BIT-P9-NEXT: ori r4, r4, 65516
-; BE-32BIT-P9-NEXT: stw r3, 32764(r1)
+; BE-32BIT-P9-NEXT: ori r4, r4, 65508
+; BE-32BIT-P9-NEXT: stw r3, 32764(r31)
; BE-32BIT-P9-NEXT: lis r3, 0
-; BE-32BIT-P9-NEXT: add r4, r1, r4
+; BE-32BIT-P9-NEXT: add r4, r31, r4
; BE-32BIT-P9-NEXT: ori r3, r3, 32768
-; BE-32BIT-P9-NEXT: add r3, r1, r3
+; BE-32BIT-P9-NEXT: add r3, r31, r3
; BE-32BIT-P9-NEXT: bl .callee3[PR]
; BE-32BIT-P9-NEXT: nop
-; BE-32BIT-P9-NEXT: lwz r4, 16(r31)
-; BE-32BIT-P9-NEXT: lwz r31, -4(r30) # 4-byte Folded Reload
+; BE-32BIT-P9-NEXT: lwz r4, 16(r29)
+; BE-32BIT-P9-NEXT: lwz r29, -12(r30) # 4-byte Folded Reload
; BE-32BIT-P9-NEXT: add r3, r4, r3
; BE-32BIT-P9-NEXT: mr r1, r30
; BE-32BIT-P9-NEXT: lwz r0, 8(r1)
+; BE-32BIT-P9-NEXT: lwz r31, -4(r1)
; BE-32BIT-P9-NEXT: lwz r30, -8(r1)
; BE-32BIT-P9-NEXT: mtlr r0
-; BE-32BIT-P9-NEXT: hashchk r0, -16(r1)
+; BE-32BIT-P9-NEXT: hashchk r0, -24(r1)
; BE-32BIT-P9-NEXT: blr
;
; BE-32BIT-P8-LABEL: aligned:
; BE-32BIT-P8: # %bb.0: # %entry
; BE-32BIT-P8-NEXT: mflr r0
+; BE-32BIT-P8-NEXT: stw r31, -4(r1)
; BE-32BIT-P8-NEXT: stw r30, -8(r1)
; BE-32BIT-P8-NEXT: lis r12, -1
; BE-32BIT-P8-NEXT: mr r30, r1
; BE-32BIT-P8-NEXT: stw r0, 8(r1)
-; BE-32BIT-P8-NEXT: hashst r0, -16(r1)
+; BE-32BIT-P8-NEXT: hashst r0, -24(r1)
; BE-32BIT-P8-NEXT: clrlwi r0, r1, 17
; BE-32BIT-P8-NEXT: subc r0, r12, r0
; BE-32BIT-P8-NEXT: stwux r1, r1, r0
; BE-32BIT-P8-NEXT: lis r4, 0
-; BE-32BIT-P8-NEXT: stw r31, -4(r30) # 4-byte Folded Spill
-; BE-32BIT-P8-NEXT: mr r31, r3
+; BE-32BIT-P8-NEXT: stw r29, -12(r30) # 4-byte Folded Spill
+; BE-32BIT-P8-NEXT: mr r29, r3
; BE-32BIT-P8-NEXT: lwz r3, 4(r3)
-; BE-32BIT-P8-NEXT: addi r5, r1, 32764
-; BE-32BIT-P8-NEXT: ori r4, r4, 65516
-; BE-32BIT-P8-NEXT: stwx r3, r1, r4
+; BE-32BIT-P8-NEXT: mr r31, r1
+; BE-32BIT-P8-NEXT: ori r4, r4, 65508
+; BE-32BIT-P8-NEXT: addi r5, r31, 32764
+; BE-32BIT-P8-NEXT: stwx r3, r31, r4
; BE-32BIT-P8-NEXT: lis r4, 0
-; BE-32BIT-P8-NEXT: lwz r3, 12(r31)
+; BE-32BIT-P8-NEXT: lwz r3, 12(r29)
; BE-32BIT-P8-NEXT: ori r4, r4, 32768
-; BE-32BIT-P8-NEXT: stwx r3, r1, r4
-; BE-32BIT-P8-NEXT: lwz r3, 20(r31)
+; BE-32BIT-P8-NEXT: stwx r3, r31, r4
+; BE-32BIT-P8-NEXT: lwz r3, 20(r29)
; BE-32BIT-P8-NEXT: lis r4, 0
-; BE-32BIT-P8-NEXT: ori r4, r4, 65516
-; BE-32BIT-P8-NEXT: stw r3, 32764(r1)
+; BE-32BIT-P8-NEXT: ori r4, r4, 65508
+; BE-32BIT-P8-NEXT: stw r3, 32764(r31)
; BE-32BIT-P8-NEXT: lis r3, 0
-; BE-32BIT-P8-NEXT: add r4, r1, r4
+; BE-32BIT-P8-NEXT: add r4, r31, r4
; BE-32BIT-P8-NEXT: ori r3, r3, 32768
-; BE-32BIT-P8-NEXT: add r3, r1, r3
+; BE-32BIT-P8-NEXT: add r3, r31, r3
; BE-32BIT-P8-NEXT: bl .callee3[PR]
; BE-32BIT-P8-NEXT: nop
-; BE-32BIT-P8-NEXT: lwz r4, 16(r31)
-; BE-32BIT-P8-NEXT: lwz r31, -4(r30) # 4-byte Folded Reload
+; BE-32BIT-P8-NEXT: lwz r4, 16(r29)
+; BE-32BIT-P8-NEXT: lwz r29, -12(r30) # 4-byte Folded Reload
; BE-32BIT-P8-NEXT: add r3, r4, r3
; BE-32BIT-P8-NEXT: mr r1, r30
; BE-32BIT-P8-NEXT: lwz r0, 8(r1)
+; BE-32BIT-P8-NEXT: lwz r31, -4(r1)
; BE-32BIT-P8-NEXT: lwz r30, -8(r1)
-; BE-32BIT-P8-NEXT: hashchk r0, -16(r1)
+; BE-32BIT-P8-NEXT: hashchk r0, -24(r1)
; BE-32BIT-P8-NEXT: mtlr r0
; BE-32BIT-P8-NEXT: blr
;
; BE-P10-PRIV-LABEL: aligned:
; BE-P10-PRIV: # %bb.0: # %entry
; BE-P10-PRIV-NEXT: mflr r0
+; BE-P10-PRIV-NEXT: std r31, -8(r1)
; BE-P10-PRIV-NEXT: std r30, -16(r1)
; BE-P10-PRIV-NEXT: lis r12, -1
; BE-P10-PRIV-NEXT: mr r30, r1
; BE-P10-PRIV-NEXT: std r0, 16(r1)
-; BE-P10-PRIV-NEXT: hashstp r0, -24(r1)
+; BE-P10-PRIV-NEXT: hashstp r0, -32(r1)
; BE-P10-PRIV-NEXT: clrldi r0, r1, 49
; BE-P10-PRIV-NEXT: subc r0, r12, r0
; BE-P10-PRIV-NEXT: stdux r1, r1, r0
-; BE-P10-PRIV-NEXT: std r31, -8(r30) # 8-byte Folded Spill
-; BE-P10-PRIV-NEXT: mr r31, r3
+; BE-P10-PRIV-NEXT: std r29, -24(r30) # 8-byte Folded Spill
+; BE-P10-PRIV-NEXT: mr r29, r3
; BE-P10-PRIV-NEXT: lwz r3, 4(r3)
; BE-P10-PRIV-NEXT: lis r4, 0
-; BE-P10-PRIV-NEXT: addi r5, r1, 32764
-; BE-P10-PRIV-NEXT: ori r4, r4, 65508
-; BE-P10-PRIV-NEXT: stwx r3, r1, r4
-; BE-P10-PRIV-NEXT: lwz r3, 12(r31)
+; BE-P10-PRIV-NEXT: mr r31, r1
+; BE-P10-PRIV-NEXT: ori r4, r4, 65500
+; BE-P10-PRIV-NEXT: stwx r3, r31, r4
+; BE-P10-PRIV-NEXT: lwz r3, 12(r29)
; BE-P10-PRIV-NEXT: lis r4, 0
; BE-P10-PRIV-NEXT: ori r4, r4, 32768
-; BE-P10-PRIV-NEXT: stwx r3, r1, r4
-; BE-P10-PRIV-NEXT: lwz r3, 20(r31)
+; BE-P10-PRIV-NEXT: stwx r3, r31, r4
+; BE-P10-PRIV-NEXT: lwz r3, 20(r29)
; BE-P10-PRIV-NEXT: lis r4, 0
-; BE-P10-PRIV-NEXT: ori r4, r4, 65508
-; BE-P10-PRIV-NEXT: add r4, r1, r4
-; BE-P10-PRIV-NEXT: stw r3, 32764(r1)
+; BE-P10-PRIV-NEXT: ori r4, r4, 65500
+; BE-P10-PRIV-NEXT: stw r3, 32764(r31)
; BE-P10-PRIV-NEXT: lis r3, 0
; BE-P10-PRIV-NEXT: ori r3, r3, 32768
-; BE-P10-PRIV-NEXT: add r3, r1, r3
+; BE-P10-PRIV-NEXT: add r3, r31, r3
+; BE-P10-PRIV-NEXT: add r4, r31, r4
+; BE-P10-PRIV-NEXT: addi r5, r31, 32764
; BE-P10-PRIV-NEXT: bl .callee3[PR]
; BE-P10-PRIV-NEXT: nop
-; BE-P10-PRIV-NEXT: lwz r4, 16(r31)
-; BE-P10-PRIV-NEXT: ld r31, -8(r30) # 8-byte Folded Reload
+; BE-P10-PRIV-NEXT: lwz r4, 16(r29)
+; BE-P10-PRIV-NEXT: ld r29, -24(r30) # 8-byte Folded Reload
; BE-P10-PRIV-NEXT: add r3, r4, r3
; BE-P10-PRIV-NEXT: clrldi r3, r3, 32
; BE-P10-PRIV-NEXT: mr r1, r30
; BE-P10-PRIV-NEXT: ld r0, 16(r1)
-; BE-P10-PRIV-NEXT: ld r30, -16(r1)
+; BE-P10-PRIV-NEXT: ld r31, -8(r1)
; BE-P10-PRIV-NEXT: mtlr r0
-; BE-P10-PRIV-NEXT: hashchkp r0, -24(r1)
+; BE-P10-PRIV-NEXT: ld r30, -16(r1)
+; BE-P10-PRIV-NEXT: hashchkp r0, -32(r1)
; BE-P10-PRIV-NEXT: blr
;
; BE-P9-PRIV-LABEL: aligned:
; BE-P9-PRIV: # %bb.0: # %entry
; BE-P9-PRIV-NEXT: mflr r0
-; BE-P9-PRIV-NEXT: std r30, -16(r1)
+; BE-P9-PRIV-NEXT: std r31, -8(r1)
; BE-P9-PRIV-NEXT: lis r12, -1
+; BE-P9-PRIV-NEXT: std r30, -16(r1)
; BE-P9-PRIV-NEXT: mr r30, r1
; BE-P9-PRIV-NEXT: std r0, 16(r1)
-; BE-P9-PRIV-NEXT: hashstp r0, -24(r1)
+; BE-P9-PRIV-NEXT: hashstp r0, -32(r1)
; BE-P9-PRIV-NEXT: clrldi r0, r1, 49
; BE-P9-PRIV-NEXT: subc r0, r12, r0
; BE-P9-PRIV-NEXT: stdux r1, r1, r0
-; BE-P9-PRIV-NEXT: std r31, -8(r30) # 8-byte Folded Spill
-; BE-P9-PRIV-NEXT: mr r31, r3
+; BE-P9-PRIV-NEXT: std r29, -24(r30) # 8-byte Folded Spill
+; BE-P9-PRIV-NEXT: mr r29, r3
; BE-P9-PRIV-NEXT: lwz r3, 4(r3)
; BE-P9-PRIV-NEXT: lis r4, 0
-; BE-P9-PRIV-NEXT: addi r5, r1, 32764
-; BE-P9-PRIV-NEXT: ori r4, r4, 65508
-; BE-P9-PRIV-NEXT: stwx r3, r1, r4
-; BE-P9-PRIV-NEXT: lwz r3, 12(r31)
+; BE-P9-PRIV-NEXT: mr r31, r1
+; BE-P9-PRIV-NEXT: ori r4, r4, 65500
+; BE-P9-PRIV-NEXT: addi r5, r31, 32764
+; BE-P9-PRIV-NEXT: stwx r3, r31, r4
+; BE-P9-PRIV-NEXT: lwz r3, 12(r29)
; BE-P9-PRIV-NEXT: lis r4, 0
; BE-P9-PRIV-NEXT: ori r4, r4, 32768
-; BE-P9-PRIV-NEXT: stwx r3, r1, r4
-; BE-P9-PRIV-NEXT: lwz r3, 20(r31)
+; BE-P9-PR...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
LGTM. Thanks very much for fixing this.
I know this PR has already been approved, but would it be possible to instead update the initialization of HasFP to include Subtarget.isAIXABI() && RegInfo->hasBasePointer(MF), and to modify the needsFP helper function to take RegInfo and Subtarget as arguments, with the RegInfo->hasBasePointer(MF) && Subtarget.isAIXABI() check in its body? My worry is that under refactoring or updating we could add a new check that misses the AIX condition. Having them in the init/helper body removes that possibility.
Thanks for taking a look. It is always good to have more discussion : ) For your recommendation, @syzaara did exactly the same thing in her first commit, see e5fd2aa. I suggested we do not backup/restore the
I agree the new change is hard to maintain. I took another look. Since backup/restore
diff --git a/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp b/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
index 1963582ce686..2f787e5c731d 100644
--- a/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
@@ -2025,8 +2025,18 @@ void PPCFrameLowering::determineCalleeSaves(MachineFunction &MF,
// code. Same goes for the base pointer and the PIC base register.
if (needsFP(MF))
SavedRegs.reset(isPPC64 ? PPC::X31 : PPC::R31);
- if (RegInfo->hasBasePointer(MF))
+ if (RegInfo->hasBasePointer(MF)) {
SavedRegs.reset(RegInfo->getBaseRegister(MF));
+ // On AIX, when BaseRegister(R30) is used, need to spill r31 too to match
+ // AIX trackback table requirement.
+ if (!needsFP(MF) && !SavedRegs.test(isPPC64 ? PPC::X31 : PPC::R31) &&
+ Subtarget.isAIXABI()) {
+ assert(
+ (RegInfo->getBaseRegister(MF) == (isPPC64 ? PPC::X30 : PPC::R30)) &&
+ "Invalid base register on AIX!");
+ SavedRegs.set(isPPC64 ? PPC::X31 : PPC::R31);
+ }
+ }
(This patch is not well tested!) Sorry, @syzaara, for the solution change... What do you guys think?
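The decision made by the hunk above can be traced in isolation with a small standalone model. This is a sketch only: `FrameState`, `adjustSavedRegs`, `BaseReg` and `FrameReg` are hypothetical names, not LLVM APIs. It assumes that a set bit means the register goes through the generic callee-saved spill code, while a cleared bit means either that it is handled by the dedicated prologue/epilogue code or that it is not saved at all.

```cpp
#include <bitset>

// Hypothetical stand-in for the state the proposed hunk consults.
struct FrameState {
  bool IsAIX;          // models Subtarget.isAIXABI()
  bool NeedsFP;        // models needsFP(MF)
  bool HasBasePointer; // models RegInfo->hasBasePointer(MF)
};

constexpr unsigned BaseReg = 30;  // r30, the base pointer in this discussion
constexpr unsigned FrameReg = 31; // r31, the frame pointer

// Follows the order of operations in the proposed hunk: the frame and base
// pointers are first removed from the generic set, then r31 is added back
// on AIX when a base pointer is in use and no frame pointer is needed.
inline std::bitset<32> adjustSavedRegs(std::bitset<32> Saved,
                                       const FrameState &S) {
  if (S.NeedsFP)
    Saved.reset(FrameReg);
  if (S.HasBasePointer) {
    Saved.reset(BaseReg);
    if (!S.NeedsFP && !Saved.test(FrameReg) && S.IsAIX)
      Saved.set(FrameReg);
  }
  return Saved;
}
```

With `FrameState{true, false, true}` the model marks r31 for a generic spill. With `IsAIX` false, or with `NeedsFP` true, r31 is left out of the set, so the extra spill only appears in the AIX base-pointer case.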
Still LGTM. Thanks very much.
SavedRegs.reset(RegInfo->getBaseRegister(MF));
// On AIX, when BaseRegister(R30) is used, need to spill r31 too to match
nit: maybe it's good to make R30 and r31 be consistent, my bad : )
Thanks for the insight, Zheng. I never considered that we would be wasting an allocatable register. I think the new direction alleviates my concerns as well.
When the base pointer r30 is used to hold the stack pointer, r30 is spilled in the prologue. On AIX registers are saved from highest to lowest, so r31 also needs to be saved. Fixes #96411
/cherry-pick 953d1f1
Error: Command failed due to missing milestone.
/cherry-pick 953d1f1
Failed to cherry-pick: 953d1f1 https://github.com/llvm/llvm-project/actions/runs/10372003179 Please manually backport the fix and push it to your github fork. Once this is done, please create a pull request
/cherry-pick d07f106
When the base pointer r30 is used to hold the stack pointer, r30 is spilled in the prologue. On AIX registers are saved from highest to lowest, so r31 also needs to be saved. Fixes llvm#96411 (cherry picked from commit d07f106)
/pull-request #103301