Prevent the native loader from being unloaded while sending telemetry (#5944 => V2) #5957
…#5944)

## Summary of changes

Prevent the native loader from being unloaded while sending telemetry.

## Reason for change

We send telemetry when we decide not to instrument a process (for instance because of an EOL runtime). To send the telemetry, we spawn the telemetry forwarder. This is done in a background thread to avoid blocking the startup of the process. However, .NET unloads the profiler when the `CorProfilerInfo::Initialize` method returns, even if the background thread is still running, causing a segfault.

## Implementation details

We increment the reference count of the module to prevent it from being unloaded by the .NET runtime.

Two different implementations:

- On Windows, we use `GetModuleHandleEx` to increment the reference count. Then, when we're done sending the telemetry, we use `FreeLibraryAndExitThread` to safely unload the module.
- On Linux, we use `dlopen` to increment the reference count. Unfortunately there is no safe way to unload the module from within itself, so we keep it in memory.

If for some reason we failed to increment the reference count, we give up on sending telemetry.

## Test coverage

`OnEolFrameworkInSsi_WhenForwarderPathExists_CallsForwarderWithExpectedTelemetry` segfaults on 3.0 without this change.
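For readers who want to see the shape of the two implementations described above, here is a minimal sketch. It is not the code from this PR: the names `ModuleRef`, `PinModuleContaining` and `ReleaseFromWorkerThread` are hypothetical, and locating the module through an address inside it (`GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS` on Windows, `dladdr` elsewhere) is an assumption about how the module could be found. Only `GetModuleHandleEx`, `FreeLibraryAndExitThread` and `dlopen` are named in the description.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
using ModuleRef = HMODULE;   // hypothetical alias, for illustration only
#else
#include <dlfcn.h>           // dladdr, dlopen, dlsym
using ModuleRef = void*;     // hypothetical alias, for illustration only
#endif

// Hypothetical helper (not from the PR): takes one extra reference on the
// module that contains `address`. Returns nullptr on failure, in which case
// the caller should give up on sending telemetry.
ModuleRef PinModuleContaining(const void* address)
{
#ifdef _WIN32
    HMODULE module = nullptr;
    // FROM_ADDRESS: the second argument is an address inside the module.
    // Without UNCHANGED_REFCOUNT, the call increments the reference count.
    if (!GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS,
                            reinterpret_cast<LPCWSTR>(address), &module))
    {
        return nullptr;
    }
    return module;
#else
    Dl_info info{};
    if (dladdr(address, &info) == 0 || info.dli_fname == nullptr)
    {
        return nullptr;
    }
    // dlopen on an already-loaded object returns its handle and adds a reference.
    return dlopen(info.dli_fname, RTLD_LAZY);
#endif
}

// Hypothetical helper: called by the background thread once telemetry is sent.
void ReleaseFromWorkerThread(ModuleRef module)
{
#ifdef _WIN32
    // Drops the reference and ends the thread in a single call, so no code
    // from the module runs after it may have been unmapped. Does not return.
    FreeLibraryAndExitThread(module, 0);
#else
    // No safe equivalent: a dlclose here could unmap the code that is running.
    // The reference is deliberately kept, so the module stays in memory.
    (void)module;
#endif
}
```

In this sketch a caller would pin the module before starting the background thread, skip telemetry entirely when `PinModuleContaining` returns `nullptr`, and call `ReleaseFromWorkerThread` as the last statement of the thread function.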
## Datadog Report

Branch report: ✅ 0 Failed, 346867 Passed, 2080 Skipped, 15h 44m 58.52s Total Time

⌛ Performance Regressions vs Default Branch (14)
## Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Execution time (ms) for this PR (5957):

| Sample | Runtime | Configuration | Mean (ms) | p99 interval (ms) |
|---|---|---|---|---|
| FakeDbCommand | .NET Framework 4.6.2 | Baseline | 76 | 67 to 86 |
| FakeDbCommand | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | 1,027 | 1009 to 1045 |
| FakeDbCommand | .NET Core 3.1 | Baseline | 109 | 105 to 113 |
| FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | 707 | 686 to 728 |
| FakeDbCommand | .NET 6 | Baseline | 93 | 90 to 96 |
| FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | 664 | 643 to 684 |
| HttpMessageHandler | .NET Framework 4.6.2 | Baseline | 190 | 187 to 193 |
| HttpMessageHandler | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | 1,109 | 1090 to 1129 |
| HttpMessageHandler | .NET Core 3.1 | Baseline | 276 | 271 to 282 |
| HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | 883 | 864 to 902 |
| HttpMessageHandler | .NET 6 | Baseline | 264 | 259 to 269 |
| HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | 863 | 843 to 883 |
This is a backport of #5944 to v2.