-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the MacOS remote unwinder for VS4Mac #63405
Conversation
Tagging subscribers to this area: @tommcdon Issue DetailsCustomer ImpactThe VS4Mac team reported that native frames missing crashreport.json causing problems with their Watson analysis. The wrong module was being passed to the remote unwinder because the load bias for shared modules Issue: #63309 TestingLocal testing. Ran diagnostics tests against this change. Will get VS4Mac team to verify after this PR's build is finished. RiskLow. It undo's a change/regression from PR #61063.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. We should take for consideration in 6.0.x
/cc: @kdubau |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Talked to Mike offline, it looks fixed for x64 but not for Arm64. |
@mikem8361 is this ready to merge? |
There is still this issue on arm64 and have a fix that I'm testing. Can we wait until this afternoon? |
We should go ahead and merge. The arm64 changes are taking longer than I thought and will require more testing. I'll create another PR. |
Let's hold until all of the fixes are in hand. |
@mikem8361 @mmitche are we taking this in to release/6.0 still or should we retarget to the vs4mac branch? |
We have another set of fixes that we plan on brining along with this one. It is pending validation and porting to this branch. |
The wrong module was being passed to the remote unwinder because the load bias for shared modules was being calculated incorrectly. Issue: dotnet#63309
From PR in main: dotnet#63598 Add arm64 version of StepWithCompactNoEncoding for syscall leaf node wrappers that have compact encoding of 0. Fix ReadCompactEncodingRegister so it actually decrements the addr. Change StepWithCompactEncodingArm64 to match what MacOS libunwind does for framed and frameless stepping. arm64 can have frames with the same SP (but different IPs). Increment SP for this condition so createdump's unwind loop doesn't break out on the "SP not increasing" check and the frames are added to the thread frame list in the correct order. Add getting the unwind info for tail called functions like this: __ZL14PROCEndProcessPvji: 36630: f6 57 bd a9 stp x22, x21, [sp, #-48]! 36634: f4 4f 01 a9 stp x20, x19, [sp, dotnet#16] 36638: fd 7b 02 a9 stp x29, x30, [sp, dotnet#32] 3663c: fd 83 00 91 add x29, sp, dotnet#32 ... 367ac: e9 01 80 52 mov w9, dotnet#15 367b0: 7f 3e 02 71 cmp w19, dotnet#143 367b4: 20 01 88 1a csel w0, w9, w8, eq 367b8: 2e 00 00 94 bl _PROCAbort _TerminateProcess: -> 367bc: 22 00 80 52 mov w2, dotnet#1 367c0: 9c ff ff 17 b __ZL14PROCEndProcessPvji The IP (367bc) returns the (incorrect) frameless encoding with nothing on the stack (uses an incorrect LR to unwind). To fix this get the unwind info for PC -1 which points to PROCEndProcess with the correct unwind info. This matches how lldb unwinds this frame. Always address module segment to IP lookup list instead of checking the module regions. Strip pointer authentication bits on PC/LR.
5c448d2
to
064cbe0
Compare
// are absolute address. | ||
if (segment->fileoff == 0 && segment->filesize > 0) | ||
// segment to get the actual address. | ||
if (strcmp(segment->segname, SEG_TEXT) == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any guarantee that the first segment is the text segment, and that there's only one ever (the spec seems to imply that there is always at least one, but nothing about there being multiple or the ordering)? Otherwise we may get here more than once or have a wrong offset.
Also, what happens with the bias of data segments? Granted, this was already never considered if the first segment was always text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is only __TEXT segment and it always seems to be the first one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that is not a concern, LGTM
I verified the latest from Mike on both x64 and Arm64 and it looks like the issue is resolved. 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. We should consider for 6.0.x
Let's discuss in Tactics this morning. |
/backport to release/6.0-vs4mac |
Started backporting to release/6.0-vs4mac: https://github.com/dotnet/runtime/actions/runs/1694232125 |
Customer Impact
The VS4Mac team reported that native frames missing crashreport.json causing problems with their Watson analysis.
The wrong module was being passed to the remote unwinder because the load bias for shared modules
was being calculated incorrectly.
The arm64 remote unwinding that createdump uses was very broken. Fixed various issues (see the below commit for details).
Issue: #63309
Testing
Local testing. Ran diagnostics tests against this change. Will get VS4Mac team to verify after this PR's build is finished.
Risk
Low. The changes are restricted to MacOS arm64 with a couple of low risk exceptions. Part of this change undo's a change/regression from PR #61063.