Rare race condition in EventSource dispose/finalizer #55441
Comments
Migrating the conversation from #55862 to here:
Correct, |
Having investigated this a little bit I think there are a couple things we could do to solve it:
Neither of the above solutions actually prevents a user from writing their own code that attempts to write events while an event source is being disabled. It's worth noting that this has only been observed on arm32, most likely because its instruction ordering constraints are looser than those of other architectures. As a result, I'm leaning towards option 2. |
Moving to future as there isn't a strong signal that this is a pressing issue. If we receive more reports of this specific failure we can pull it back on the backlog. CC @tommcdon |
We have daily crashes during process shutdown probably related to this issue. Platform: .NET6, Debian
To mitigate it we tried to dispose all
We hope this bug report encourages further investigation and fixing of the issue. |
Hi @SergeiPavlov, thanks for reaching out. Are you able to share a core dump of the crash? We have ways to share them privately if that is a concern |
Unfortunately I cannot. BTW the mentioned NRE is probably a race condition between our call of And we still have a similar crash problem on stage when |
There is a rare race that can result in a use-after-free on the native end of `EventPipeEventProvider` and potentially the `EtwEventProvider`.
runtime/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Tracing/EventSource.cs
Lines 1420 to 1484 in c90aa4d
If one thread (A) is calling `EventSource.Dispose()`, and another (B) is in the process of writing, it is possible for the following sequence to occur:
A: (1) `Dispose` -> (3) `m_eventSourceEnabled = false` -> (4) `m_eventPipeProvider.Dispose()` -> (6) `m_eventPipeProvider = null`
B: (2) `if (IsEnabled)` -> (5) use `m_eventPipeProvider`
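The interleaving above can be replayed deterministically on a single thread to see why the `IsEnabled` check does not protect the writer. This is only a toy model: `ToyProvider`, `ToySource`, and `RaceModel` are hypothetical stand-ins invented for illustration, and a boolean flag stands in for the freed native memory. Only the field names `m_eventSourceEnabled` and `m_eventPipeProvider` come from the issue.

```csharp
using System;

// Hypothetical stand-in for the managed provider; not the runtime's type.
sealed class ToyProvider
{
    public bool NativeFreed { get; private set; }

    // Models EventUnregister() deleting the native structures.
    public void Dispose() => NativeFreed = true;

    // Returns true when a write would touch freed native state.
    public bool WriteTouchesFreedMemory() => NativeFreed;
}

// Hypothetical stand-in holding the two fields named in the sequence.
sealed class ToySource
{
    public bool m_eventSourceEnabled = true;
    public ToyProvider m_eventPipeProvider = new ToyProvider();

    public bool IsEnabled() => m_eventSourceEnabled;
}

static class RaceModel
{
    // Replays steps (2) through (6) in the order given in the issue.
    public static bool ReplayInterleaving()
    {
        var source = new ToySource();

        bool enabledSeenByB = source.IsEnabled();          // (2) B passes the check
        ToyProvider seenByB = source.m_eventPipeProvider;  //     and holds the provider

        source.m_eventSourceEnabled = false;               // (3) A disables the source
        source.m_eventPipeProvider.Dispose();              // (4) A frees the native side

        // (5) B proceeds on its stale check and uses the provider.
        bool useAfterFree = enabledSeenByB && seenByB.WriteTouchesFreedMemory();

        source.m_eventPipeProvider = null;                 // (6) too late to help B
        return useAfterFree;
    }
}
```

Calling `RaceModel.ReplayInterleaving()` returns `true`: B's check at step (2) is already stale by the time it reaches step (5), and nulling the field at step (6) does nothing for a reference B already holds.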
`EventPipeEventProvider.Dispose()` calls `EventPipeEventProvider.EventUnregister()`. This deletes the underlying native structures (the only time `EventPipeProvider::m_pEventList` is set to `nullptr`). The managed code does not unset the `m_provHandle` member, so if someone got a reference to this managed object, they would have a pointer to freed memory. The managed provider has been marked as disabled; however, not all code paths check that value. Specifically:
runtime/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Tracing/EventPipeEventProvider.cs
Lines 81 to 87 in c90aa4d
which is where we AV in this case. We get here from `TraceLoggingEventSource.WriteImpl()`, which ends up in `NameInfo.GetOrCreateEventHandle()`, which calls `DefineEvent` on the provider:
runtime/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Tracing/TraceLogging/NameInfo.cs
Lines 81 to 123 in 9da4d07
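One way to picture the missing "unset the handle" step is a wrapper that zeroes the handle atomically on dispose and refuses to call into native code once it is zero. This is a minimal sketch under assumptions, not the runtime's code or the fix that was adopted: `GuardedHandle`, `TryDefineEvent`, and `FreeCount` are hypothetical names, and a `long` field stands in for `m_provHandle`.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper; names are illustrative, not the runtime's API.
sealed class GuardedHandle
{
    private long _handle;                      // stands in for m_provHandle

    // Counts how many times the (modeled) native release ran.
    public int FreeCount { get; private set; }

    public GuardedHandle(long handle) => _handle = handle;

    // Swaps the handle to 0 atomically so only one caller releases it.
    public void Dispose()
    {
        long old = Interlocked.Exchange(ref _handle, 0);
        if (old != 0)
        {
            FreeCount++;                       // a real provider would unregister here
        }
    }

    // Reports false instead of proceeding when the handle has been cleared.
    public bool TryDefineEvent(out long handleUsed)
    {
        handleUsed = Interlocked.Read(ref _handle);
        return handleUsed != 0;
    }
}
```

A guard like this only narrows the window. A writer that read a non-zero handle just before `Dispose()` ran can still pass it to native code afterwards, so closing the race fully would need additional synchronization between writers and the disposing thread.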
I'm still pinning down the exact sequencing of events in the hopes I can create a deterministic repro.
I believe this is the cause for the failures on dotnet/coreclr#28179 and #55240.
CC @tommcdon @noahfalk @dotnet/dotnet-diag