-
-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: blocking detection #2709
feat: blocking detection #2709
Conversation
|
720a534
to
579fc98
Compare
since I haven't had a change to look into this for a long time, some notes I have:
|
@bruno-garcia are we sure we want this as the call stack? I know you intentially changed skipframes from |
Braindump during chat with @jamescrosswell
|
These frames all come after the blocking call so I don't think they're relevant to the event. |
…dotnet into feat/blocking-detector
@bruno-garcia @jamescrosswell glancing at the code here this seems different from what I changed for Java SDK. We removed some auto generated IDs from class names in the stacktrace so they don't mess up grouping in the issue you linked. |
yes, it's a totally different use case. But the goal here is to send frames that we'd like to show, but not group by. Which can be achieved by changing the server the similarly to how you did it. |
@vaind added u as reviewer since @jamescrosswell wrote the code since the initial push I made so I guess both of us are like "not sure I should review or merge this" :D |
@@ -155,7 +155,7 @@ private SentryException BuildSentryException(Exception exception, int id, int? p | |||
sentryEx.Mechanism = mechanism; | |||
} | |||
|
|||
sentryEx.Stacktrace = SentryStackTraceFactoryAccessor().Create(exception); | |||
sentryEx.Stacktrace ??= SentryStackTraceFactoryAccessor().Create(exception); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the implication here? Previously, the stacktrace would get overwritten. Why is this change necessary? Where would the stacktrace come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the stack trace is already set in the event. We get it from the blocking detection logic so we won't want it overwriten
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sentryEx.Stacktrace ??= SentryStackTraceFactoryAccessor().Create(exception); | |
sentryEx.Stacktrace = SentryStackTraceFactoryAccessor().Create(exception); |
The sentryEx
gets newly instantiated in this method and the StackTrace
has to be set explicitly. I think this change does nothing.
We get it from the blocking detection logic so we won't want it overwriten
That all happens in the event processor.
// Skip frames relating to the async state machine | ||
|| frameInfo?.StartsWith("System.Threading") == true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can skip doing this and improve grouping on the server side for this kind of stuff in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bitsandfoxes could you link to the associated PR where this logic gets changed on the server?
Also, if we do skip this here, we'll need to make some changes to set a custom fingerprint that excludes all this stuff (since by default the entire stacktrace is used as the fingerprint).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll follow up on this. The blocking detection is opt-in and the improvements to the grouping are not blocking this PR. #3202
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll follow up on this. The blocking detection is opt-in and the improvements to the grouping are not blocking this PR. #3202
If we remove the logic above from the client, the grouping will be broken unless some similar logic exists on the server.
I had a chat to Bruno though... Not sure we need to be doing this on the server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good to me!
Thanks folks! |
if (_options.CaptureBlockingCalls && _monitor is not null) | ||
{ | ||
var syncCtx = SynchronizationContext.Current; | ||
SynchronizationContext.SetSynchronizationContext(syncCtx == null ? _detectBlockingSyncCtx : new DetectBlockingSynchronizationContext(_monitor, syncCtx)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bruno-garcia won't the SynchronizationContext in ASP.NET Core always be null? Is this code here just in case SDK users have implemented a custom sync context for some reason?
Just playing around, but after a thread on twitter (https://twitter.com/brungarc/status/1711484812702171309). Based in the Ben Adams BlockingDetector.
Suppressing Blocking Detection
If users are intentionally making blocking calls in their code for some reason, the blocking detection can be suppressed (temporarily) via:
Volume Concerns
If blocking calls exist on a hot path, we may end up sending lots of events.
We did create a solution to this in #3174 but decided to roll this back in this PR due to concerns about using metrics.
Top Frame Decision
Bruno had experimented, in the initial code, with the number of frames to skip when capturing a stack trace. The thinking was that it might be nice to see one or two calls into the Sentry blocking detection code. This turns out to be problematic when none of the stack frames are InApp, since Sentry doesn't know which frame to highlight. So I've reverted to Ben Adams' original code that leaves the culprit blocking call at the top of the stack. This makes it easy to find regardless of whether there are InApp frames present in the stack trace or not.
Grouping
There were concerns about grouping and fingerprinting... By default Sentry uses the stack trace as a fingerprint and we can get different stack traces, depending on how the async state machine schedules the task. However this is also related to the top frame decision and given that we're skipping any frames related to the TPL now, the grouping problem has gone away.
Resources
Aparently you have to be called Stephen to understand Synchronization 😜