-
-
Notifications
You must be signed in to change notification settings - Fork 316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add SIGTERM support #3895
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @naftaly! I think it's fine if we leave in the SIGTRAP logic, it looks like that was a bug that has since been fixed upstream. Would you mind making that small change?
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3895 +/- ##
=============================================
- Coverage 90.815% 90.604% -0.211%
=============================================
Files 590 589 -1
Lines 45946 45854 -92
Branches 16380 16273 -107
=============================================
- Hits 41726 41546 -180
- Misses 4040 4217 +177
+ Partials 180 91 -89
... and 42 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code changes look good to me, we might just want to push this diff up in one of our own branches to get all the other tests running.
@armcknight is there anything you need me to do to get this on in? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with the results of the tests in #3946! 🙏🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some missing tests. Apart from that LGTM. Thanks for doing this @naftaly 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the test @naftaly 👏
Add support for catching sigterm signals. Co-authored-by: Philipp Hofmann <philipp.hofmann@sentry.io>
Add support for catching sigterm signals. Co-authored-by: Philipp Hofmann <philipp.hofmann@sentry.io>
What is this telling me when I get this on iOS and tvOS? Is this a good or a bad thing? What would Sentry have been reporting otherwise? The Watchdog event? |
It’s the OS trying to tell you that it wants apps to exit gracefully. I’ve seen it happen in many different scenarios. Some of the interesting ones are when the device is shutdown, when an app will be terminated to update it and when something changes in traits that might require the OS to restart apps. What’s interesting is that previously this would all go unnoticed to us the developers. If these happen when the user can see them, or perceive them, then we have an issue that need to be resolved. If they happen in the background, then we just hope that state restauration is well implemented and users don’t notice it.
Likely nothing, it would have gone unnoticed. |
We've always had a large number of |
I guess it’s possible since the Watchdog code in Sentry (if I’m not mistaken) simply returns at the end of its heuristics and calls it an OOM/Watchdog event. So if the actual signal isn’t caught, then that might be the default. FWIW, I’ve also proposed a PR to add real OOM recognition so that could help with other false positives. It might end up in KSCrash first though which is what Sentry uses for crash reporting. |
@eric if you’re getting SIGTERM’s now, it would be interesting to know what type of event is going down (if any). That would help understand exactly what is happening or will happen for your app. |
I just received the first SIGTERM on our Testflight beta today, and so wasn't sure what to do with it... it didn't really feel actionable. |
We do currently send memory usage with all of our breadcrumbs to try to give us a better idea of where we have actual memory issues. It always seemed silly to me that Sentry couldn't at least sample the memory usage once per second and store it somewhere to give a more informed message related to the Watchdog events. |
If it’s background and it feels unactionable, then I would not prioritize it, but still keep in mind it’s there and happening. If it’s foreground, then I usually start adding breadcrumbs to make it actionable if the stack isn’t enough to give you an idea of why it is going on. These issues are definitely going to harder than your run of the mill exception since these happen based on what happened in the past vs. A mistake that might have happened at that point in time. If you can share the stack (of all threads) then I would be happy to take a look and see if I spot anything. |
It doesn't seem like much of anything is happening here: https://fancy-bits-llc.sentry.io/issues/5386894502/ What is the signal that is sent when the OS decides that it's time for your app to stop running in the background when it needs to get more memory? Is SIGTERM ever going to be sent during normal operation? |
I don’t have access. Maybe copy it to a gist or something.
That would be SIGKILL, we can’t catch it. Sometimes, if we’re really lucky, a SIGTERM comes before it, but I’ve not seen it often.
Depends what normal operation is, but yes, it can happen at any time. It’s a vestige of unix where it’s meant to tell apps to do their cleanup and exit otherwise be “exited”. |
We have that planned here: #3518 (comment), but we haven't gotten to it yet, sorry.
Yes, that makes sense. I created an issue. Thanks for the input 👏 , @eric. #4003 |
📜 Description
Added support for catching the SIGTERM signal.
💡 Motivation and Context
All Apple OS's send SIGTERM (like any good unix based system) when they want to request a graceful termination of an application, when the app doesn't quit in those circumstances, it'll receive a SIGKILL. This often gives us the opportunity to get a stack trace where we know Apple wants the app to be terminated but don't know why (watchdog events). This is just one more piece of information that can be added to the puzzle of figuring out unexplained app terminations.
💚 How did you test it?
I ran the tests as well as Tested on a real app.
📝 Checklist
You have to check all boxes before merging:
sendDefaultPII
is enabled.🔮 Next steps