-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
R11 not preserved with PAC-M and AAPCS frame chain defect fix #81249
Conversation
When PAC-RET and AAPCS were enabled, R12 was pushed to the stack between R11 (the frame pointer) and the link register, which violated the principle that the frame pointer and link register must always be adjacent on the stack. This change separates R12 into a different push instruction to R11 and the link register when R11 is the frame pointer, meaning the frame pointer and link register are adjacent on the stack.
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write If you have received no comments on your PR for a week, you can request a review If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-backend-arm Author: James Westwood (jwestwood921) ChangesWhen PAC-RET and AAPCS were enabled, R12 was pushed to the stack between R11 (the frame pointer) and the link register, which violated the principle that the frame pointer and link register must always be adjacent on the stack. This change separates R12 into a different push instruction to R11 and the link register when R11 is the frame pointer, meaning the frame pointer and link register are adjacent on the stack. Full diff: https://github.com/llvm/llvm-project/pull/81249.diff 4 Files Affected:
diff --git a/llvm/lib/Target/ARM/ARMBaseRegisterInfo.h b/llvm/lib/Target/ARM/ARMBaseRegisterInfo.h
index 926d702b4092a5..95c108ed9d4379 100644
--- a/llvm/lib/Target/ARM/ARMBaseRegisterInfo.h
+++ b/llvm/lib/Target/ARM/ARMBaseRegisterInfo.h
@@ -78,9 +78,12 @@ static inline bool isSplitFPArea1Register(unsigned Reg,
switch (Reg) {
case R0: case R1: case R2: case R3:
case R4: case R5: case R6: case R7:
- case R8: case R9: case R10: case R12:
- case SP: case PC:
+ case R8: case R9: case SP: case PC:
return true;
+ case R10: case R12:
+ return !SplitFramePushPop;
+ case LR:
+ return SplitFramePushPop;
default:
return false;
}
@@ -91,8 +94,12 @@ static inline bool isSplitFPArea2Register(unsigned Reg,
using namespace ARM;
switch (Reg) {
- case R11: case LR:
+ case R10: case R12:
+ return SplitFramePushPop;
+ case R11:
return true;
+ case LR:
+ return !SplitFramePushPop;
default:
return false;
}
diff --git a/llvm/lib/Target/ARM/ARMSubtarget.cpp b/llvm/lib/Target/ARM/ARMSubtarget.cpp
index 922fa93226f298..3a28a2cc04c6c4 100644
--- a/llvm/lib/Target/ARM/ARMSubtarget.cpp
+++ b/llvm/lib/Target/ARM/ARMSubtarget.cpp
@@ -496,6 +496,27 @@ bool ARMSubtarget::ignoreCSRForAllocationOrder(const MachineFunction &MF,
bool ARMSubtarget::splitFramePointerPush(const MachineFunction &MF) const {
const Function &F = MF.getFunction();
+ const std::vector<CalleeSavedInfo> CSI = MF.getFrameInfo().getCalleeSavedInfo();
+
+ if (CSI.size() > 1 && MF.getInfo<ARMFunctionInfo>()->shouldSignReturnAddress()) {
+ bool r11InCSI = false;
+ bool lrInCSI = false;
+ unsigned long r11Idx = 0;
+ unsigned long lrIdx = 0;
+ for (unsigned long i = 0; i < CSI.size(); i++) {
+ if (CSI[i].getReg() == ARM::LR) {
+ lrIdx = i;
+ lrInCSI = true;
+ }
+ else if (CSI[i].getReg() == ARM::R11) {
+ r11Idx = i;
+ r11InCSI = true;
+ }
+ }
+ if (lrIdx +1 != r11Idx && r11InCSI && lrInCSI)
+ return true;
+ }
+
if (!MF.getTarget().getMCAsmInfo()->usesWindowsCFI() ||
!F.needsUnwindTableEntry())
return false;
diff --git a/llvm/lib/Target/ARM/ARMSubtarget.h b/llvm/lib/Target/ARM/ARMSubtarget.h
index 91f3978b041a3a..19dac4ffcb3b23 100644
--- a/llvm/lib/Target/ARM/ARMSubtarget.h
+++ b/llvm/lib/Target/ARM/ARMSubtarget.h
@@ -445,7 +445,7 @@ class ARMSubtarget : public ARMGenSubtargetInfo {
/// to lr. This is always required on Thumb1-only targets, as the push and
/// pop instructions can't access the high registers.
bool splitFramePushPop(const MachineFunction &MF) const {
- if (MF.getInfo<ARMFunctionInfo>()->shouldSignReturnAddress())
+ if (MF.getInfo<ARMFunctionInfo>()->shouldSignReturnAddress() && !createAAPCSFrameChain())
return true;
return (getFramePointerReg() == ARM::R7 &&
MF.getTarget().Options.DisableFramePointerElim(MF)) ||
diff --git a/llvm/test/CodeGen/Thumb2/pacbti-m-frame-chain.ll b/llvm/test/CodeGen/Thumb2/pacbti-m-frame-chain.ll
new file mode 100644
index 00000000000000..638bb86777dfc7
--- /dev/null
+++ b/llvm/test/CodeGen/Thumb2/pacbti-m-frame-chain.ll
@@ -0,0 +1,82 @@
+; RUN: llc -filetype asm -o - %s --frame-pointer=all -mattr=+aapcs-frame-chain -mattr=+aapcs-frame-chain-leaf -force-dwarf-frame-section | FileCheck %s
+target triple = "thumbv8m.main-none-none-eabi"
+
+; int f() {
+; return 0;
+; }
+;
+; int x(int, char *);
+; int y(int n) {
+; char a[n];
+; return 1 + x(n, a);
+; }
+
+define hidden i32 @f() local_unnamed_addr {
+entry:
+ ret i32 0;
+}
+
+define hidden i32 @x(i32 noundef %n) local_unnamed_addr {
+entry:
+ %vla = alloca i8, i32 %n, align 1
+ %call = call i32 @y(i32 noundef %n, ptr noundef nonnull %vla)
+ %add = add nsw i32 %call, 1
+ ret i32 %add
+}
+
+declare dso_local i32 @y(i32 noundef, ptr noundef) local_unnamed_addr
+
+; CHECK-LABEL: f:
+; CHECK: pac r12, lr, sp
+; CHECK-NEXT: .save {ra_auth_code}
+; CHECK-NEXT: str r12, [sp, #-4]!
+; CHECK-NEXT: .cfi_def_cfa_offset 4
+; CHECK-NEXT: .cfi_offset lr, -4
+; CHECK-NEXT: .cfi_offset r12, -8
+; CHECK-NEXT: .cfi_offset r11, -12
+; CHECK-NEXT: .save {r11, lr}
+; CHECK-NEXT: push.w {r11, lr}
+; CHECK-NEXT: .setfp r11, sp
+; CHECK-NEXT: mov r11, sp
+; CHECK-NEXT: .cfi_def_cfa r11, 12
+; CHECK-NEXT: .pad #8
+; CHECK-NEXT: sub sp, #8
+; CHECK-NEXT: movs r0, #0
+; CHECK-NEXT: pop.w {r11, lr}
+; CHECK-NEXT: ldr r12, [sp], #4
+; CHECK-NEXT: aut r12, lr, sp
+; CHECK-NEXT: bx lr
+
+; CHECK-LABEL: x:
+; CHECK: pac r12, lr, sp
+; CHECK-NEXT: .save {r4, r7, ra_auth_code}
+; CHECK-NEXT: push.w {r4, r7, r12}
+; CHECK-NEXT: .cfi_def_cfa_offset 12
+; CHECK-NEXT: .cfi_offset lr, -4
+; CHECK-NEXT: .cfi_offset r12, -8
+; CHECK-NEXT: .cfi_offset r11, -12
+; CHECK-NEXT: .cfi_offset r7, -16
+; CHECK-NEXT: .cfi_offset r4, -20
+; CHECK-NEXT: .save {r11, lr}
+; CHECK-NEXT: push.w {r11, lr}
+; CHECK-NEXT: .setfp r11, sp
+; CHECK-NEXT: mov r11, sp
+; CHECK-NEXT: .cfi_def_cfa_register r11
+; CHECK-NEXT: .pad #12
+; CHECK-NEXT: sub sp, #12
+; CHECK-NEXT: adds r1, r0, #7
+; CHECK-NEXT: bic r1, r1, #7
+; CHECK-NEXT: sub.w r1, sp, r1
+; CHECK-NEXT: mov sp, r1
+; CHECK-NEXT: bl y
+; CHECK-NEXT: sub.w r4, r11, #8
+; CHECK-NEXT: adds r0, #1
+; CHECK-NEXT: mov sp, r4
+; CHECK-NEXT: pop.w {r11, lr}
+; CHECK-NEXT: pop.w {r4, r7, r12}
+; CHECK-NEXT: aut r12, lr, sp
+; CHECK-NEXT: bx lr
+
+!llvm.module.flags = !{!0}
+
+!0 = !{i32 8, !"sign-return-address", i32 1}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
When PAC-RET and AAPCS were enabled, R12 was pushed to the stack between R11 (the frame pointer) and the link register, which violated the principle that the frame pointer and link register must always be adjacent on the stack. This change separates R12 into a different push instruction to R11 and the link register when R11 is the frame pointer, meaning the frame pointer and link register are adjacent on the stack.
When PAC-RET and AAPCS were enabled, R12 was pushed to the stack between R11 (the frame pointer) and the link register, which violated the principle that the frame pointer and link register must always be adjacent on the stack. This change separates R12 into a different push instruction to R11 and the link register when R11 is the frame pointer, meaning the frame pointer and link register are adjacent on the stack.
…vm-project into r11_preservation
We have two different ways of splitting the pushes of callee-saved registers onto the stack, controlled by the confusingly similar names STI.splitFramePushPop() and STI.splitFramePointerPush(). This removes those functions and replaces them with a single function which returns an enum. This is in preparation for adding another value to that enum. The original work of this patch was done by James Westwood, reviewed as llvm#82801 and llvm#81249, with some tidy-ups done by Mark Murray and myself.
When using AAPCS-compliant frame chains with PACBTI return address signing, there ware a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as llvm#82801 and llvm#81249, with some tidy-ups done by Mark Murray and myself.
We have two different ways of splitting the pushes of callee-saved registers onto the stack, controlled by the confusingly similar names STI.splitFramePushPop() and STI.splitFramePointerPush(). This removes those functions and replaces them with a single function which returns an enum. This is in preparation for adding another value to that enum. The original work of this patch was done by James Westwood, reviewed as #82801 and #81249, with some tidy-ups done by Mark Murray and myself.
When using AAPCS-compliant frame chains with PACBTI return address signing, there ware a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as llvm#82801 and llvm#81249, with some tidy-ups done by Mark Murray and myself.
When using AAPCS-compliant frame chains with PACBTI return address signing, there ware a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as #82801 and #81249, with some tidy-ups done by Mark Murray and myself.
When using AAPCS-compliant frame chains with PACBTI return address signing, there ware a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as llvm#82801 and llvm#81249, with some tidy-ups done by Mark Murray and myself.
When using AAPCS-compliant frame chains with PACBTI return address signing, there ware a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as llvm#82801 and llvm#81249, with some tidy-ups done by Mark Murray and myself.
When using AAPCS-compliant frame chains with PACBTI return address signing, there ware a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as #82801 and #81249, with some tidy-ups done by Mark Murray and myself.
When using AAPCS-compliant frame chains with PACBTI return address signing, there ware a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as llvm#82801 and llvm#81249, with some tidy-ups done by Mark Murray and myself.
When PAC-RET and AAPCS were enabled, R12 was pushed to the stack between R11 (the frame pointer) and the link register, which violated the principle that the frame pointer and link register must always be adjacent on the stack. This change separates R12 into a different push instruction to R11 and the link register when R11 is the frame pointer, meaning the frame pointer and link register are adjacent on the stack.