builtin trap placed after frame pop on Arm 32-bit #113154
@llvm/issue-subscribers-backend-arm Author: David Spickett (DavidSpickett)
You can work around the issue by putting something after the trap:
Then you get the expected order:
ARM:
RISC-V:
However this does not happen if the only code in the body is
…en issue

Since #109628 landed, this test has been failing on 32-bit Arm. This is due to a codegen problem (whether added or uncovered by the change, not known) where the trap instruction is placed after the frame pointer and link register are restored.

#113154

So the code was:
```
std::__1::vector<int>::operator[](unsigned int):
  sub sp, sp, #8
  str r0, [sp, #4]
  str r1, [sp]
  add sp, sp, #8
  .inst 0xe7ffdefe
  bx lr
```
When lldb saw the trap, the PC was inside operator[] but the frame information actually pointed to g.

This bug only happens for leaf functions so adding a return type works around it:
```
std::__1::vector<int>::operator[](unsigned int):
  push {r11, lr}
  mov r11, sp
  sub sp, sp, #8
  str r0, [sp, #4]
  str r1, [sp]
  mov sp, r11
  pop {r11, lr}
  .inst 0xe7ffdefe
  bx lr
```
(and operator[] should return T& anyway)

Now the PC location and frame information should match and the test passes.
I'm not sure how you're building it, but if it's a leaf function, you could try eliminating the frame pointer (FP) by using the
I'm building with:
No explicit options related to frame pointers. Yes, adding that flag helps, but for the lldb test case I was able to add a return type to make it non-leaf and avoid this.
Regarding #109628: the behavior for non-leaf functions is the same before and after the patch, so I don't believe the patch introduced a bug. For example: https://godbolt.org/z/vrqn71qKa
Right, this has just revealed what was already there by adding frame information to the function in question. As long as no one backtraces it, it would work fine; it's only because lldb inspected it that we found this. Without frame pointers I can still see the trap being placed after the stack pointer restore code:
From CE's pipeline viewer:
Again it's inserting the frame destroy before the trap. It does not do this for the RISC-V equivalent options:
And this happens prior to your PR.
I think the problem is that the Arm "trap" instruction is marked "isTerminator = 1" in ARMInstrInfo.td. No other target does this, as far as I can tell, precisely because it leads to this sort of weird result.
This actually fails MachineVerifier: "Bad machine code: Non-terminator instruction after the first terminator".
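To illustrate why the terminator flag matters, here is a toy model in C++. It is only a sketch, not LLVM code: `Instr`, `insert_epilogue`, and `index_of` are hypothetical names, and the model assumes (as the outputs in this thread suggest) that frame-destroy code is placed immediately before the first terminator in the block.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction. This is NOT LLVM's MachineInstr;
// it only carries the one flag discussed in this thread.
struct Instr {
    std::string text;
    bool is_terminator;
};

// Assumption: the frame-destroy code goes before the first terminator of the
// block, or at the end of the block when there is no terminator.
std::vector<Instr> insert_epilogue(std::vector<Instr> block) {
    std::size_t pos = 0;
    while (pos < block.size() && !block[pos].is_terminator) {
        ++pos;
    }
    block.insert(block.begin() + static_cast<std::ptrdiff_t>(pos),
                 Instr{"pop {r11, lr}", false});
    return block;
}

// Position of the first instruction with the given text, or block.size()
// when it is not present.
std::size_t index_of(const std::vector<Instr>& block, const std::string& text) {
    for (std::size_t i = 0; i < block.size(); ++i) {
        if (block[i].text == text) {
            return i;
        }
    }
    return block.size();
}
```

In this model, flagging the trap as a terminator makes it the first terminator, so the `pop` lands in front of it. Clearing the flag leaves `bx lr` as the first terminator, so the `pop` lands between the trap and the return.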
Fixes #113154

The encodings used for llvm.trap() on ARM were all marked as barriers and terminators. This led to stack frame destroy code being inserted before the trap if the trap was the last thing in the function and it had no return statement.
```
void fn() {
  volatile int i = 0;
  __builtin_trap();
}
```
Produced:
```
fn:
  push {r11, lr} << stack frame create
  <...>
  mov sp, r11
  pop {r11, lr} << stack frame destroy
  .inst 0xe7ffdefe << trap
  bx lr
```
All the other targets don't mark them this way; instead they mark them with isTrap. I've changed ARM to do this, which fixes the code generation:
```
fn:
  push {r11, lr} << stack frame create
  <...>
  .inst 0xe7ffdefe << trap
  mov sp, r11
  pop {r11, lr} << stack frame destroy
  bx lr
```
I've updated the existing trap test to force the need for a stack frame, then check that the instruction immediately after the trap is resetting the stack pointer.

debugtrap was already working but I've added the same checks for it anyway.
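For anyone who wants to try the reproducer locally, the sketch below wraps the `fn` from the commit message in a small harness. It is not part of the LLVM test suite: the caller `g` and the helper `dies_by_signal` are hypothetical names added for illustration, and the harness assumes a POSIX system where `fork` and `waitpid` are available.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Reproducer from the commit message: the volatile local forces a stack
// frame, and the trap is the last thing in the body with no return statement.
void fn() {
    volatile int i = 0;
    __builtin_trap();
}

// Hypothetical caller, standing in for the user-written frame a debugger
// would be expected to show in the backtrace.
void g() { fn(); }

// Hypothetical helper: run `f` in a child process and report whether the
// child was killed by a signal, which is how a trap shows up to the parent.
bool dies_by_signal(void (*f)()) {
    pid_t pid = fork();
    if (pid < 0) {
        return false;  // fork failed, nothing was observed
    }
    if (pid == 0) {
        f();
        _exit(0);  // only reached if f returned without trapping
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return false;
    }
    return WIFSIGNALED(status);
}
```

Building this for a 32-bit Arm target and disassembling `fn` should show whether the frame destroy comes before or after the trap instruction. The harness itself only confirms that the trap fires.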
Since #109628, we have had an lldb test failure on 32-bit Arm:
https://lab.llvm.org/buildbot/#/builders/18/builds/5545

This test checks that lldb can show the first frame of user-written code when there's an error in the STL. So in this case we expect to see `g()` in the backtrace, rather than the `operator[]`, or `main` as we now get.

This appears to be because the trap instruction generated by `__builtin_trap` is put after the frame information is reset:

I can show this with another example: https://godbolt.org/z/jnGWh1Wqh

Produces:

`fn` has the trap after the frame pointer and link register are reset. This is not the case on RISC-V:

And I think changing the frame information before the trap is misleading because a debugger will see a PC within `fn` but the unwind information will look as if it is already in the caller of `fn`.

I don't know right now whether the linked change has caused the problem or merely exposed the problem by changing how leaf functions are handled.