-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
instrumentation changes #55888
instrumentation changes #55888
Conversation
Tagging subscribers to this area: @dotnet/gc Issue DetailsWe've never had ETW events that had fields about time - we rely on the the timestamps of the ETW events themselves to calculate time. This checkin introduces some new events/event fields that will include time info instead of firing individual events that otherwise don't carry much info, ie, they'd just be there so we could use their timestamps, which would be a nuisance when we have many heaps. The rationale behind firing events with time already calculated is
Added MarkWithType events for marking due to dependent handles, newly promoted due to dead finalizable objects and Recorded time for marking roots (but sizedref is separate), short weak, ScanForFinalization, long weak, relocate, compact and sweep. Made the 100k alloc tick event threshold configurable (set with Provided a new rundown Added a Added a verbose level
Fixed issues - For BGC we were reporting the pinned object count the same as the last FGC..and that causes confusion so fixed that. Fixed #45375 While fixing #45375, I noticed we have another bug related to alloc tick which is we are not firing the alloc tick events correctly for LOH and POH since the ETW alloc tracking didn't seperate them... fixed this too. Added the POH type for Did some cleanup in eventtrace.h - we don't need the info that's not used which means we just ended up duplicating things like Note, I realize that I do have some inconsistency with the
|
@dotnet/gc please review. @PeterSolMS please let me know if you are ok with how I structured the new @noahfalk PTAL at the changes in clretwall.man and the related interface changes. |
a85f903
to
eeccd35
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looked fine to me. A few thoughts overall:
- I wasn't sure if the config mechanism for AllocTick events will be readily usable by profiler vendors. They might prefer a profiler API.
- Emitting the rundown settings at startup too is an option if it is useful now or in the future
- Given that these events are mostly intended for GC team's own analysis I didn't push for docs.
We've never had ETW events that had fields about time - we rely on the the timestamps of the ETW events themselves to calculate time. This checkin introduces some new events/event fields that will include time info instead of firing individual events that otherwise don't carry much info, ie, they'd just be there so we could use their timestamps, which would be a nuisance when we have many heaps.
The rationale behind firing events with time already calculated is
GCStart
/GCEnd
which can be correlated with other events (eg, you getGCStart
, and then get a bunch of other events so you know those happened after a GC started) wheras things like very GC internal things don't have this property, ie, we are not gonna care that "these other events happened during a GC and specifically during the relocation phase".Added MarkWithType events for marking due to dependent handles, newly promoted due to dead finalizable objects and
mark_steal
. Perfview needs to be updated to work with this otherwise you can't see the GCStats view (I'm submitting a PR for that).Recorded time for marking roots (but sizedref is separate), short weak, ScanForFinalization, long weak, relocate, compact and sweep.
Added a new version that includes the size of the object that triggered the event. This is for a request from this issue. I'm unclear if the config for the threshold was requested as a runtime config or something configurable via profiling API so asked for clarification there.
Provided a new rundown
GCSettings
event that has info on settings hard to get from traces.Added a
GCLOHCompact
event which is fired for all heaps (heaps that didn't actually do LOH compact would have values of all 0s). I'm trying to add events that don't require a lot of correlation with other events to interpret. This is to help get an idea how long it takes to compact LOH and how reference rich it is.Added a verbose level
GCFitBucketInfo
event which helps us with FL fitting investigation. I'm firing this for 2 things in a gen1 GC:for plugs that allocated with
allocate_in_condemned_generations
the event will capture all of them with the same bucketing as we do for gen2 FL;for gen2 FL we look at the largest free items that take up 25% of the FL space, or if there are too many of them we stop after walking a certain number of free items as we have to limit the amount of time we are spending here.
Fixed issues -
For BGC we were reporting the pinned object count the same as the last FGC..and that causes confusion so fixed that.
Fixed #45375
While fixing #45375, I noticed we have another bug related to alloc tick which is we are not firing the alloc tick events correctly for LOH and POH since the ETW alloc tracking didn't seperate them... fixed this too.
Added the POH type for
GCSegmentTypeMap
which was missing in the manifest.Did some cleanup in eventtrace.h - we don't need the info that's not used which means we just ended up duplicating things like
_GC_ROOT_KIND
in more places than needed.Note, I realize that I do have some inconsistency with the
FEAETURE_EVENT_TRACE
here, as in, some code should be under an#ifdef
check but is not. I will look into a remedy for that with a separate PR.