Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] Using the EventPipeEventDelivered profiler callback for allocation profiling #43345

Closed
d-schneider opened this issue Oct 13, 2020 · 8 comments
Labels
area-Diagnostics-coreclr question Answer questions and provide assistance, not an issue with source code or documentation.
Milestone

Comments

@d-schneider
Copy link

d-schneider commented Oct 13, 2020

I would like to use the profiler API for receiving EventPipe events that was introduced in .NET 5 preview 8 for memory profiling (ICorProfilerCallback10::EventPipeEventDelivered). One goal is to track the approximate number and size of objects allocated at specific allocation sites (identified by the call stack).
For testing, I used the the rc1 version of .NET 5.

“GCAllocationTick_V3” event:
The address passed in the event data can’t be used to read the object’s metadata during the Callback. This would be necessary to obtain e.g. the object size for the allocated object that triggered the callback, since the event data only contains the cumulative allocated size since the last event. While it is possible to get the object size later, this is more complex and error prone since we can only access it between after the callback and possibly before the next garbage collection.
The question is if this behavior is intended or if it would be possible to e.g. trigger the event pipe event after the allocation is finished, so that the object metadata can be accessed during the callback since the callback is called synchronously? Or is there another way to get the allocated object size during the callback?

The EventPipeEventDelivered callback contains a stackFrames parameter. However, during my testing that parameter was always empty. I am not sure if there should be no call stack for that event or if it has to be enabled separately?

I also tried to enable the “GCSampledObjectAllocationLow” and “GCSampledObjectAllocationHigh” events that could be better suited for this. However, these must be enabled during application startup and I have not managed to enable them during the profiler initialize. I found a check at https://github.com/dotnet/runtime/blob/master/src/coreclr/src/vm/eventtrace.cpp#L2840 if they are enabled during startup, which seems to be executed before the profiler initialize where I can start the EventPipe session at earliest. Is there any way to enable such events during profiler startup?

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-Diagnostics-coreclr untriaged New issue has not been triaged by the area owner labels Oct 13, 2020
@ghost
Copy link

ghost commented Oct 13, 2020

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

@tommcdon tommcdon added question Answer questions and provide assistance, not an issue with source code or documentation. and removed untriaged New issue has not been triaged by the area owner labels Oct 13, 2020
@tommcdon tommcdon added this to the 6.0.0 milestone Oct 13, 2020
@tommcdon
Copy link
Member

@davmason

@davmason
Copy link
Member

Hi @d-schneider, thanks for reaching out.

Is there a reason you are not using the ICorProfiler::ObjectAllocated callbacks? That should give you exactly what you want without having to go through all the trouble of dealing with EventPipe. That will be a lot easier than dealing with all the overhead and parsing of EventPipe data. And then you can use SurvivingReferences() to map moved objects from a GC.

If you have other things making you intent on using EventPipe, GCSampledObjectAllocationLow and GCSampledObjectAllocationHigh shouldn't need to be enabled at startup - there is code here to detect changes after startup:

void ETW::TypeSystemLog::OnKeywordsChanged()
{
LIMITED_METHOD_CONTRACT;
// If the desired frequencey for the GCSampledObjectAllocation events has changed,
// update our state.
s_fHeapAllocLowEventEnabledNow = ETW_TRACING_CATEGORY_ENABLED(MICROSOFT_WINDOWS_DOTNETRUNTIME_PROVIDER_DOTNET_Context, TRACE_LEVEL_INFORMATION, CLR_GCHEAPALLOCLOW_KEYWORD);
s_fHeapAllocHighEventEnabledNow = ETW_TRACING_CATEGORY_ENABLED(MICROSOFT_WINDOWS_DOTNETRUNTIME_PROVIDER_DOTNET_Context, TRACE_LEVEL_INFORMATION, CLR_GCHEAPALLOCHIGH_KEYWORD);
// FUTURE: Would be nice here to log an error event if (s_fHeapAllocLowEventEnabledNow ||
// s_fHeapAllocHighEventEnabledNow), but !s_fHeapAllocEventEnabledOnStartup
// If the type events should be turned off, eliminate the hash tables that tracked
// which types were logged. (If type events are turned back on later, we'll re-log
// them all as we encounter them.) Note that all we can really test for is that the
// Types keyword on the runtime provider is off. Not necessarily that it was on and
// was just turned off with this request. But either way, TypeSystemLog can handle it
// because it is extremely smart.
if (!ETW_TRACING_CATEGORY_ENABLED(MICROSOFT_WINDOWS_DOTNETRUNTIME_PROVIDER_DOTNET_Context, TRACE_LEVEL_INFORMATION, CLR_TYPE_KEYWORD))
OnTypesKeywordTurnedOff();
}
.

Eventing is intialized in the runtime before profiling so there isn't a way to set events truly at the beginning of the process, but events should be able to turn on and off throughout the life of the process. If that is not the case I'm happy to investigate any repro you can share.

@d-schneider
Copy link
Author

d-schneider commented Oct 15, 2020

Hi @davmason, thanks for the fast reply.

The main reason I'm trying to use the EventPipe events is that the runtime performance overhead from just enabling the ObjectAllocated callback is significantly higher than for the AllocationTick event. Likely because the AllocationTick event is only triggered once every ~100KB, which is sufficient for me.

In the method you linked for the GCSampledObjectAllocation events, there is also a comment which mentions that an error should be logged if the flags weren't set at startup, but are enabled now:

// FUTURE: Would be nice here to log an error event if (s_fHeapAllocLowEventEnabledNow ||
// s_fHeapAllocHighEventEnabledNow), but !s_fHeapAllocEventEnabledOnStartup

Additionally, at the beginning of the SendObjectAllocatedEvent method, there is a check if the events were enabled at startup and otherwise returns before checking if the events are enabled now:

void ETW::TypeSystemLog::SendObjectAllocatedEvent(Object * pObject)
{
CONTRACTL
{
NOTHROW;
GC_NOTRIGGER;
MODE_COOPERATIVE;
}
CONTRACTL_END;
// No-op if the appropriate keywords were not enabled on startup (or we're not yet
// started up)
if (!s_fHeapAllocEventEnabledOnStartup || !g_fEEStarted)
return;

I haven't prepared a full reproducer that I can share, but I think the relevant part is that I started the EventPipe session in the profiler Initialize method with following provider settings:

COR_PRF_EVENTPIPE_PROVIDER_CONFIG providers[] = {
		{L"Microsoft-Windows-DotNETRuntime",
		 // GCHEAPALLOCHIGH_KEYWORD | GCHEAPALLOCLOW_KEYWORD,
		 0x200000 | 0x2000000,
		 COR_PRF_EVENTPIPE_INFORMATIONAL,
		 nullptr}};

HRESULT hr = corProfilerInfo12->EventPipeStartSession(1, providers, false, &eventPipeSession);

Then I checked if I receive any Events with the ids 20(GCSampledObjectAllocationHigh) or 32(GCSampledObjectAllocationLow) during the EventPipeEventDelivered callback.

@davmason
Copy link
Member

davmason commented Oct 15, 2020

Thanks for pointing that out. I clearly didn't look hard enough at the code path here. You're right that they have to be enabled at startup, which makes sense since the jit has to use the slow path allocation helpers to be able to track every allocation. Normally the jit has assembly helpers that bump the thread local allocation pointer and only call in to the runtime if the allocation limit is exceeded. There currently isn't a way to set EventPipe keywords truly at startup via the profiler, the only workaround would be to create a session on startup through some other means with the keywords enabled on startup, then close it once your profiler was up and running.

However, even if you get the heap allocation keywords working you will see the same overhead as using ICorProfilerCallback::ObjectAllocated since the runtime would be using the slow path allocation helpers for every allocation.

To answer a question from your original post, the behavior of GCAllocationTick_V3 is as intended, most of the events are intended to be emitted by the runtime and then bulk processed by a tool like perfview later. In the perfview situation it isn't trying to do live inspection on the object. Which isn't to say you can't make it work in a live profiler session, just that it wasn't the original design intent of the event.

GCAllocationTick_V3 has a TypeName field which should be able to be used with IMetaDataImport::FindTypeDefByName to look up the object metadata, although I haven't actually personally tried that approach so you could run in to issues. Hopefully it just works, but you might need to massage the name format, or do a manual search through metadata to make it work. If you can't make that work, the TypeID field of GCAllocationTick_V3 is a MethodTable*, which is the same thing a ClassID is (it's just casted to a uintptr_t to make it opaque). So it's technically unsupported behavior but calling the ICorProfilerInfo* methods with that TypeID casted to a ClassID should work. Although I haven't tried that before either, so it should work as in "I'm reading the code and it looks like it should work"

@d-schneider
Copy link
Author

Thanks for the information about the overhead of the ObjectAllocation events. In that case, it seems that the GCAllocationTick_V3 event is probably the better choice anyway.

I tried your suggestion of using the TypeID from the GCAllocationTick_V3 event to get the size. However, it seems that this does not work for array types, since the object size of those is variable. I haven't found a way to obtain the size of an array object without the object address.

@davmason
Copy link
Member

There isn't an way to get the array size inside the GCAllocationTick_V3 callback since the callback happens before the object's methodtable is set, as you have discovered.

You unfortunately have to choose between the perf hit of the ICorProfilerCallback::ObjectAllocated events, but being able to query object information in the callback, or doing some extra work to get the information later.

I don't think it would be all that much work to get the information later, though. Objects will be safe to inspect in the GarbageCollectionStarted callback, so it should be fairly straightforward to keep a list of objects you want information about and get all that information in GarbageCollectionStarted.

@d-schneider
Copy link
Author

Thank you for the information. In that case, the best approach would probably be to inspect the objects in the GarbageCollectionStarted callback, like you suggested.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Diagnostics-coreclr question Answer questions and provide assistance, not an issue with source code or documentation.
Projects
None yet
Development

No branches or pull requests

4 participants