Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xcode organizer reports crash on writeBinaryImage #1892

Closed
jaroslavas opened this issue Jun 16, 2022 · 31 comments
Closed

Xcode organizer reports crash on writeBinaryImage #1892

jaroslavas opened this issue Jun 16, 2022 · 31 comments

Comments

@jaroslavas
Copy link

Platform

iOS

Installed

CocoaPods

Version

7.16.0

Steps to Reproduce

Add sentry to app and start reporting issues.

SentrySDK.start { (options) in
            options.dsn = dsn
            options.debug = false
            options.enableAutoSessionTracking = false
            options.attachStacktrace = false
            options.maxBreadcrumbs = 25

            options.beforeSend = { event in
                // Some data is removed to make reports lighter
                event.debugMeta = nil
                return event
            }
        }

Expected Result

Crashes should be seen in sentry as well as in XCode organizer.

Actual Result

XCode organizer starts showing a lot of different crashes as one, where Sentrys writeBinaryImage is shown as the cause for the crash.
Screenshot 2022-06-16 at 10 11 22
Screenshot 2022-06-16 at 10 11 46

@philipphofmann
Copy link
Member

What is the error message of the crash, @jaroslavas? When does this happen? Do you maybe have a repro for us?

@jaroslavas
Copy link
Author

jaroslavas commented Jun 17, 2022

Nope, I only have different stack traces ending in a crash by sentry and binary image, either writing to it or reading:
Screenshot 2022-06-17 at 12 43 47
Screenshot 2022-06-17 at 12 43 59

Unfortunately I can't reproduce this, only see a lot of reports in organizer from live users.

@philipphofmann
Copy link
Member

a lot of reports in organizer from live users.

What exactly do you mean by that?

@brustolin
Copy link
Contributor

It means crash reports downloaded from iTunes Connect.

@philipphofmann
Copy link
Member

Did this start happening in 7.16.0 or did you see this problem for earlier versions as well, @jaroslavas?

@jaroslavas
Copy link
Author

We have not used it for some time, but I'm almost sure it was there before 7.16.0

@philipphofmann
Copy link
Member

It seems like we can't handle a signal properly, and we crash in writeBinaryImage. Do you maybe have any clue which type of crash this could be?

@armcknight, can you look at the stacktrace above, and do you maybe have any clue what could cause a crash when calling into _dyld_get_image_header in sentrycrashdl_getBinaryImage?

@armcknight
Copy link
Member

@jaroslavas Could you provide a full crash report for some of these examples? There should be a thread that is indicated as the "Crashed thread", and maybe other info like an exception type or backtrace.

XCode organizer starts showing a lot of different crashes as one, where Sentrys writeBinaryImage is shown as the cause

My guess is that since our symbols show up in crash reports (which is normal; our crash handler logic is running to collect crash info) Xcode sees us as a common element amongst the reports and is blaming us. In practice, automatically choosing a frame to blame for a crash report is an inexact art.

@philipphofmann philipphofmann changed the title XCode organizer reports crash on writeBinaryImage Xcode organizer reports crash on writeBinaryImage Jun 28, 2022
@jaroslavas
Copy link
Author

@armcknight Can I send them to the email in your profile?

@philipphofmann
Copy link
Member

Yes, you can send them to philipp.hofmann@sentry.io.

@jaroslavas
Copy link
Author

Interesting finding: I went through several of crashes with different stacks, but all them had one thing in common: app was killed by the watchdog with infamous 0x8BADF00D code.

@philipphofmann
Copy link
Member

philipphofmann commented Jun 30, 2022

As you pointed out the crash is a watchdog termination. The crash reports you sent me contain

72180 exhausted real (wall clock) time allowance of 10.00 seconds
ProcessVisibility: Background
ProcessState: Running
WatchdogEvent: scene-update
WatchdogVisibility: Foreground

So it seems like the app gets killed because it hangs when being in the background.

0   libsystem_kernel.dylib        	0x00000001b7c91138 __ulock_wait + 8
1   libsystem_platform.dylib      	0x00000001f0df39ac _os_unfair_lock_lock_slow + 172 (lock.c:578)
2   dyld                          	0x00000001027406f4 dyld4::RuntimeState::withLoadersReadLock(void () block_pointer) + 76 (DyldRuntimeState.cpp:138)
3   dyld                          	0x000000010274697c dyld4::APIs::_dyld_get_image_header(unsigned int) + 136 (DyldAPIs.cpp:222)
4   Sentry                        	0x00000001047dd754 sentrycrashdl_getBinaryImage + 28 (SentryCrashDynamicLinker.c:345)
5   Sentry                        	0x00000001047ec748 writeBinaryImage + 32 (SentryCrashReport.c:1191)
6   Sentry                        	0x00000001047ec748 writeBinaryImages + 124 (SentryCrashReport.c:1234)
7   Sentry                        	0x00000001047ec748 sentrycrashreport_writeStandardReport + 460 (SentryCrashReport.c:1752)
8   Sentry                        	0x00000001047d9ca4 onCrash + 176 (SentryCrashC.c:107)
9   Sentry                        	0x00000001047e5ac8 0x1047c4000 + 137928 (SentryCrashMonitor.c:241)
10  Sentry                        	0x00000001047e8074 handleSignal + 336 (SentryCrashMonitor_Signal.c:108)
11  libsystem_platform.dylib      	0x00000001f0df4c10 _sigtramp + 56 (sigtramp.c:116)

It seems like the SentrySDK tries to write a crash report for a signal it received and it gets stuck. I'm not sure what is causing this. Worth checking some forks of KSCrash to see if they updated some logic in files similar to SentryCrashReport, SentryCrashDynamicLinker, and SentryCrashC.

Thanks for reporting this @jaroslavas. BTW, if you can come up with a repro for us this would be awesome 👏.

@armcknight
Copy link
Member

Looking again at the stack trace, the app has already crashed for some reason, as our signal handler is running (onCrash). However, the fact that _dyld_get_image_header takes a lock is dangerous; you can only make reentrant calls from within a signal handler, and taking an unfair lock is not reentrant. So, the system is terminating the app with the 0x8BADF00D watchdog code because we've deadlocked and the app is taking too long to terminate.

SentryCrash (and KSCrash, in turn) need to avoid the _dyld call by gathering the list of loaded binary images beforehand. This can be done with _dyld_register_func_for_add_image and _dyld_register_func_for_remove_image. Then, if/when a crash occurs, we already have the list of loaded images and don't have to query dyld in the handler function.

@philipphofmann
Copy link
Member

@jaroslavas, can you tell us the percentage of crashes from the Xcode organizer that crashed on writeOnBinaryImage of the total amount of crashes of the Xcode organizers, so we know how bad this is? Does our SDK miss other crashes as well that the Xcode organizer has? If yes, again could you tell us how many, please?

@philipphofmann
Copy link
Member

This could be related to #1864.

@jaroslavas
Copy link
Author

@jaroslavas, can you tell us the percentage of crashes from the Xcode organizer that crashed on writeOnBinaryImage of the total amount of crashes of the Xcode organizers, so we know how bad this is?

Actually, most of the crashes are from sentry atm :) Here is a screenshot from TestFlight, version on the bottom has sentry enabled, version on the top has it disabled:
Screenshot 2022-07-11 at 11 15 03

@philipphofmann
Copy link
Member

Thanks for the update, @jaroslavas. I would be surprised if this would be the case for most apps. I'm trying to understand what could cause this, although we already know how to fix it. Do you have any other crash reporting tools installed in parallel or another library that hooks into signal handlers?

@jaroslavas
Copy link
Author

We only use Sentry and reports provided by Apple automatically.

@philipphofmann
Copy link
Member

Again thanks for the update, @jaroslavas.

@philipphofmann
Copy link
Member

Might also fix #1979.

@github-actions
Copy link

github-actions bot commented Aug 9, 2022

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@jaroslavas
Copy link
Author

So, is there a plan to fix it before bot closes the issue? :)

@brustolin
Copy link
Contributor

Yes, we definitely plan to fix this. We had a minor set back that delayed this, but we're working to have it fixed.

@bruno-garcia
Copy link
Member

As the message says we need a backlog label to avoid the stale bot

jaroslavas added a commit to ProtonVPN/ios-mac-app that referenced this issue Aug 19, 2022
@jaroslavas
Copy link
Author

Any updates on this issue?

@philipphofmann
Copy link
Member

No, not yet, @jaroslavas. Sorry, we didn't have the bandwidth yet to get to that issue.

@jacopom
Copy link

jacopom commented Nov 24, 2022

Hi @philipphofmann do you have any news? thank you

@philipphofmann
Copy link
Member

Nope not yet; sorry, @jacopom.

@brustolin
Copy link
Contributor

Fixed with #2939

@jaroslavas
Copy link
Author

Hi @brustolin @philipphofmann
We've tried to add back Sentry, but looks like the problem is still there. Watchdog kills the app after eating bad food :(

Exception Type:  EXC_CRASH (SIGKILL)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Termination Reason: FRONTBOARD 2343432205 
<RBSTerminateContext| domain:10 code:0x8BADF00D explanation:scene-update watchdog transgression: app<ch.protonmail.vpn(F71EC81F-DD3E-43E5-A342-4052E7631350)>:990 exhausted real (wall clock) time allowance of 10.00 seconds
ProcessVisibility: Foreground
ProcessState: Running
WatchdogEvent: scene-update
WatchdogVisibility: Foreground
WatchdogCPUStatistics: (
"Elapsed total CPU time (seconds): 8.600 (user 8.600, system 0.000), 16% CPU",
"Elapsed application CPU time (seconds): 0.008, 0% CPU"
) reportType:CrashLog maxTerminationResistance:Interactive>

And here is the main thread that crashed:

Thread 0 name:   Dispatch queue: com.apple.main-thread
Thread 0 Crashed:
0   libsystem_kernel.dylib        	       0x1f394dbf8 __ulock_wait + 8
1   libsystem_platform.dylib      	       0x2161562dc _os_unfair_lock_lock_slow + 176
2   dyld                          	       0x1cef7915c dyld4::RuntimeLocks::withLoadersReadLock(void () block_pointer) + 84
3   dyld                          	       0x1cef86a74 dyld4::APIs::_dyld_get_image_header(unsigned int) + 140
4   ProtonVPN                     	       0x104a3c338 sentrycrashdl_dladdr (in ProtonVPN) (SentryCrashDynamicLinker.c:215) + 1442616
5   ProtonVPN                     	       0x104a43a94 symbolicate_internal (in ProtonVPN) (SentryCrashSymbolicator.c:75) + 1473172
6   ProtonVPN                     	       0x104a38ee8 writeThread (in ProtonVPN) (SentryCrashReport.c:1089) + 1429224
7   ProtonVPN                     	       0x104a39e68 writeAllThreads (in ProtonVPN) (SentryCrashReport.c:1142) + 1433192
8   ProtonVPN                     	       0x104a39ae8 sentrycrashreport_writeStandardReport (in ProtonVPN) (SentryCrashReport.c:1717) + 1432296
9   ProtonVPN                     	       0x104a360d8 onCrash (in ProtonVPN) (SentryCrashC.c:95) + 1417432
10  ProtonVPN                     	       0x104a31020 sentrycrashcm_handleException (in ProtonVPN) (SentryCrashMonitor.c:235) + 1396768
11  ProtonVPN                     	       0x104a32fe8 handleSignal (in ProtonVPN) (SentryCrashMonitor_Signal.c:110) + 1404904
12  libsystem_platform.dylib      	       0x216158888 _sigtramp + 56
13  libobjc.A.dylib               	       0x1a45c38dc class_createInstance + 68
14  libobjc.A.dylib               	       0x1a45c38dc class_createInstance + 68
15  CoreFoundation                	       0x1ac23ae30 __CFAllocateObject + 20
16  CoreFoundation                	       0x1ac26aea4 __NSSetM_new + 132
17  CoreFoundation                	       0x1ac26a15c +[NSSet set] + 28
18  BaseBoard                     	       0x1b435ea78 -[BSKeyedSettings initWithSettings:] + 228
19  FrontBoardServices            	       0x1c46a0600 -[FBSSettings _settings:] + 120
20  FrontBoardServices            	       0x1c46a056c -[FBSSettings _rootSettingsForLocal:createIfNeeded:] + 44
21  FrontBoardServices            	       0x1c46a3890 -[FBSSettings _setValue:forSetting:] + 176
22  FrontBoardServices            	       0x1c46a119c -[FBSSettings(FBSSceneExtension) setValue:forProperty:] + 72
23  UIKitCore                     	       0x1ae9efa90 -[_UISceneHostingDiffAction _performActionsForUIScene:withUpdatedFBSScene:settingsDiff:fromSettings:transitionContext:lifecycleActionType:] + 232
24  UIKitCore                     	       0x1ae9010a8 __64-[UIScene scene:didUpdateWithDiff:transitionContext:completion:]_block_invoke.225 + 612
25  UIKitCore                     	       0x1ae57c454 -[UIScene _emitSceneSettingsUpdateResponseForCompletion:afterSceneUpdateWork:] + 216
26  UIKitCore                     	       0x1ae57c2c4 -[UIScene scene:didUpdateWithDiff:transitionContext:completion:] + 244
27  UIKitCore                     	       0x1ae57c104 -[UIApplicationSceneClientAgent scene:handleEvent:withCompletion:] + 336
28  FrontBoardServices            	       0x1c46ac5b4 -[FBSScene updater:didUpdateSettings:withDiff:transitionContext:completion:] + 660
29  FrontBoardServices            	       0x1c46ac300 __94-[FBSWorkspaceScenesClient _queue_updateScene:withSettings:diff:transitionContext:completion:]_block_invoke_2 + 152
30  FrontBoardServices            	       0x1c46ac19c -[FBSWorkspace _calloutQueue_executeCalloutFromSource:withBlock:] + 168
31  FrontBoardServices            	       0x1c46ac0b8 __94-[FBSWorkspaceScenesClient _queue_updateScene:withSettings:diff:transitionContext:completion:]_block_invoke + 344
32  libdispatch.dylib             	       0x1b420d300 _dispatch_client_callout + 20
33  libdispatch.dylib             	       0x1b4210d48 _dispatch_block_invoke_direct + 284
34  FrontBoardServices            	       0x1c46a8520 __FBSSERIALQUEUE_IS_CALLING_OUT_TO_A_BLOCK__ + 52
35  FrontBoardServices            	       0x1c46a84a0 -[FBSMainRunLoopSerialQueue _targetQueue_performNextIfPossible] + 240
36  FrontBoardServices            	       0x1c46a8378 -[FBSMainRunLoopSerialQueue _performNextFromRunLoopSource] + 28
37  CoreFoundation                	       0x1ac25112c __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
38  CoreFoundation                	       0x1ac2503a8 __CFRunLoopDoSource0 + 176
39  CoreFoundation                	       0x1ac24eb5c __CFRunLoopDoSources0 + 244
40  CoreFoundation                	       0x1ac24d898 __CFRunLoopRun + 828
41  CoreFoundation                	       0x1ac24d478 CFRunLoopRunSpecific + 608
42  GraphicsServices              	       0x1ef7a64f8 GSEventRunModal + 164
43  UIKitCore                     	       0x1ae67162c -[UIApplication _run] + 888
44  UIKitCore                     	       0x1ae670c68 UIApplicationMain + 340
45  ProtonVPN                     	       0x1048e2480 main (in ProtonVPN) (<compiler-generated>:0) + 25728
46  dyld                          	       0x1cef72dcc start + 2240

As you can see _dyld_get_image_header is still in the stack.

Is there any way we can help you debug and fix this?

@philipphofmann
Copy link
Member

@jaroslavas, I think that is a different crash that could be related to this issue, but I would prefer if you could open an new issue for that, please.

protonjohn pushed a commit to ProtonVPN/ios-mac-app that referenced this issue Aug 22, 2024
getsentry/sentry-cocoa#1892 is not fixed, or
the fix was not enough for us to get sentry back working. We still
get crash reports where the source of the crash is reported as
Sentry itself.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

No branches or pull requests

6 participants