[Profiler] Avoid too many syscall calls for timer-create-based CPU profiler #6067

Merged
gleocadie merged 1 commit into master from gleocadie/remove-syscall-calls-for-cpu-profiler
Sep 24, 2024


gleocadie
Collaborator

Summary of changes

Avoid issuing too many syscalls

Reason for change

TL;DR: syscalls add noticeable overhead to the application. The more you add, the more the application suffers.

In a recent PR, I added a way to disarm/rearm the timer_create-based CPU profiler. The goal was to avoid attributing the CPU time spent unwinding stacks (by all profilers) to managed threads.
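To make the cost of the disarm/rearm approach concrete, here is a minimal sketch using the POSIX timer API on Linux. It is not the profiler's actual code: `CreateThreadCpuTimer`, `SetPeriod`, `IsArmed` and `DisarmScope` are hypothetical names, and the timer uses `SIGEV_NONE` so that the sketch never delivers a signal. Each disarm and each rearm is one `timer_settime` call, so every scope costs two syscalls.

```cpp
#include <cassert>
#include <csignal>
#include <ctime>

// Hypothetical helper: creates a timer driven by the calling thread's CPU
// time. SIGEV_NONE means expirations are recorded but no signal is sent.
inline timer_t CreateThreadCpuTimer()
{
    sigevent sev{};
    sev.sigev_notify = SIGEV_NONE;
    timer_t id{};
    const int rc = timer_create(CLOCK_THREAD_CPUTIME_ID, &sev, &id);
    assert(rc == 0);
    (void)rc;
    return id;
}

// Arms the timer with the given period (must be below one second).
// A period of 0 disarms it. Either way this is one timer_settime syscall.
inline int SetPeriod(timer_t id, long periodNs)
{
    itimerspec its{};
    its.it_value.tv_nsec = periodNs;
    its.it_interval.tv_nsec = periodNs;
    return timer_settime(id, 0, &its, nullptr);
}

// Reports whether the timer is armed (non-zero time until next expiration).
inline bool IsArmed(timer_t id)
{
    itimerspec cur{};
    timer_gettime(id, &cur);
    return cur.it_value.tv_sec != 0 || cur.it_value.tv_nsec != 0;
}

// Hypothetical RAII scope: disarms on entry, rearms on exit (2 syscalls).
class DisarmScope
{
public:
    DisarmScope(timer_t id, long periodNs) : _id(id), _periodNs(periodNs)
    {
        SetPeriod(_id, 0);
    }
    ~DisarmScope()
    {
        SetPeriod(_id, _periodNs);
    }
    DisarmScope(const DisarmScope&) = delete;
    DisarmScope& operator=(const DisarmScope&) = delete;

private:
    timer_t _id;
    long _periodNs;
};
```

Wrapping each stack unwind in such a scope keeps the CPU timer quiet while the unwind runs, at the price of two extra kernel transitions per unwind.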

This adds a significant number of syscalls (e.g. Walltime => 5 * 2 (threads with no trace context) + 10 * 2 (threads with trace context: Code Hotspot) every 10 ms).
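The arithmetic above can be spelled out as a small sketch. It assumes the reading that 5 and 10 are the numbers of threads sampled per tick, and that the factor 2 is one disarm plus one rearm; the function names are illustrative only.

```cpp
#include <cassert>

// Extra syscalls per sampling tick: every sampled thread pays
// `syscallsPerThread` (assumed: one disarm + one rearm).
constexpr int SyscallsPerTick(int threadsWithoutContext, int threadsWithContext, int syscallsPerThread)
{
    return (threadsWithoutContext + threadsWithContext) * syscallsPerThread;
}

// Scales the per-tick figure to one second for a given tick period in ms.
constexpr int SyscallsPerSecond(int syscallsPerTick, int tickPeriodMs)
{
    return syscallsPerTick * (1000 / tickPeriodMs);
}

// Figures from the description: 5 * 2 + 10 * 2 every 10 ms.
static_assert(SyscallsPerTick(5, 10, 2) == 30, "30 extra syscalls per tick");
static_assert(SyscallsPerSecond(30, 10) == 3000, "3000 extra syscalls per second");
```

Under these assumptions the walltime profiler alone would add about 3,000 syscalls per second.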

Implementation details

Remove the CpuProfilerDisableScope class.

Test coverage

Other details

@gleocadie gleocadie requested a review from a team as a code owner September 24, 2024 08:38
@github-actions github-actions bot added the area:profiler Issues related to the continous-profiler label Sep 24, 2024
@datadog-ddstaging

datadog-ddstaging bot commented Sep 24, 2024

Datadog Report

Branch report: gleocadie/remove-syscall-calls-for-cpu-profiler
Commit report: dd16791
Test service: dd-trace-dotnet

✅ 0 Failed, 366526 Passed, 2340 Skipped, 16h 41m 51.65s Total Time

@andrewlock
Member

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the total time it takes to execute a program, and are intended to capture one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test at a 5% significance level
  • Only results indicating a difference greater than 5% and 5 ms are considered.
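As an illustration of the second threshold only, here is a hedged sketch. It assumes the 5% is measured relative to the master mean, and it does not model the Welch test at all. `ExceedsThresholds` is a hypothetical name, not part of the benchmark tooling.

```cpp
#include <cassert>

// Thresholds quoted above: a difference greater than 5% and 5 ms.
constexpr double kMinRelativeDiff = 0.05;
constexpr double kMinAbsoluteDiffMs = 5.0;

// Returns true when the PR mean is slower than the master mean by more than
// BOTH thresholds. Assumption: the percentage is relative to the master mean.
inline bool ExceedsThresholds(double prMeanMs, double masterMeanMs)
{
    const double diffMs = prMeanMs - masterMeanMs;
    return diffMs > kMinAbsoluteDiffMs && diffMs > kMinRelativeDiff * masterMeanMs;
}
```

For example, a PR mean of 925 ms against a master mean of 922 ms is 3 ms slower, which is under both thresholds, so it would not be reported as worse.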

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6067) - mean (70ms)  : 68, 72
     .   : milestone, 70,
    master - mean (70ms)  : 67, 73
     .   : milestone, 70,

    section CallTarget+Inlining+NGEN
    This PR (6067) - mean (1,100ms)  : 1081, 1119
     .   : milestone, 1100,
    master - mean (1,105ms)  : 1067, 1143
     .   : milestone, 1105,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6067) - mean (109ms)  : 106, 112
     .   : milestone, 109,
    master - mean (109ms)  : 105, 112
     .   : milestone, 109,

    section CallTarget+Inlining+NGEN
    This PR (6067) - mean (768ms)  : 757, 779
     .   : milestone, 768,
    master - mean (766ms)  : 751, 782
     .   : milestone, 766,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6067) - mean (92ms)  : 89, 94
     .   : milestone, 92,
    master - mean (92ms)  : 90, 94
     .   : milestone, 92,

    section CallTarget+Inlining+NGEN
    This PR (6067) - mean (725ms)  : 714, 737
     .   : milestone, 725,
    master - mean (727ms)  : 707, 747
     .   : milestone, 727,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6067) - mean (190ms)  : 187, 194
     .   : milestone, 190,
    master - mean (191ms)  : 187, 195
     .   : milestone, 191,

    section CallTarget+Inlining+NGEN
    This PR (6067) - mean (1,190ms)  : 1168, 1211
     .   : milestone, 1190,
    master - mean (1,202ms)  : 1165, 1240
     .   : milestone, 1202,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6067) - mean (276ms)  : 271, 281
     .   : milestone, 276,
    master - mean (276ms)  : 270, 281
     .   : milestone, 276,

    section CallTarget+Inlining+NGEN
    This PR (6067) - mean (939ms)  : 918, 961
     .   : milestone, 939,
    master - mean (938ms)  : 918, 958
     .   : milestone, 938,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6067) - mean (264ms)  : 261, 268
     .   : milestone, 264,
    master - mean (265ms)  : 261, 269
     .   : milestone, 265,

    section CallTarget+Inlining+NGEN
    This PR (6067) - mean (925ms)  : 908, 942
     .   : milestone, 925,
    master - mean (922ms)  : 906, 939
     .   : milestone, 922,


@andrewlock
Member

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where the throughput results for the PR are worse than latest master (a drop of 5% or more) are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!
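The 5% rule can be checked by hand against the totals reported in this comment. The sketch below is illustrative: `RelativeDropPercent` and `IsFlagged` are hypothetical names, and the drop is assumed to be measured relative to master.

```cpp
#include <cassert>

// Percentage by which the PR total falls below the master total.
// Negative values mean the PR handled more requests than master.
inline double RelativeDropPercent(double prTotal, double masterTotal)
{
    return (masterTotal - prTotal) / masterTotal * 100.0;
}

// A result is flagged when the drop is 5% or greater.
inline bool IsFlagged(double prTotal, double masterTotal)
{
    return RelativeDropPercent(prTotal, masterTotal) >= 5.0;
}
```

With the Linux arm64 Baseline totals (9,403,805 for this PR against 9,538,166 for master) the drop is about 1.4%, well under the threshold. The Linux x64 Baseline run is slightly faster than master, so its drop is negative.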

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6067) (11.245M)   : 0, 11245443
    master (11.101M)   : 0, 11101013
    benchmarks/2.9.0 (11.081M)   : 0, 11080577

    section Automatic
    This PR (6067) (7.430M)   : 0, 7429939
    master (7.322M)   : 0, 7321945
    benchmarks/2.9.0 (7.732M)   : 0, 7732233

    section Trace stats
    master (7.648M)   : 0, 7647979

    section Manual
    master (11.159M)   : 0, 11158874

    section Manual + Automatic
    This PR (6067) (6.779M)   : 0, 6778675
    master (6.756M)   : 0, 6755889

    section DD_TRACE_ENABLED=0
    master (10.104M)   : 0, 10103630

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6067) (9.404M)   : 0, 9403805
    master (9.538M)   : 0, 9538166
    benchmarks/2.9.0 (9.798M)   : 0, 9798067

    section Automatic
    This PR (6067) (6.598M)   : 0, 6598285
    master (6.560M)   : 0, 6559669

    section Trace stats
    master (6.899M)   : 0, 6898524

    section Manual
    master (9.530M)   : 0, 9529944

    section Manual + Automatic
    This PR (6067) (6.133M)   : 0, 6132517
    master (6.111M)   : 0, 6111240

    section DD_TRACE_ENABLED=0
    master (8.837M)   : 0, 8836832

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6067) (10.099M)   : 0, 10098813
    master (10.017M)   : 0, 10017307
    benchmarks/2.9.0 (10.067M)   : 0, 10067315

    section Automatic
    This PR (6067) (6.776M)   : 0, 6776096
    master (6.715M)   : 0, 6715040
    benchmarks/2.9.0 (7.552M)   : 0, 7552193

    section Trace stats
    master (7.357M)   : 0, 7357178

    section Manual
    master (9.959M)   : 0, 9958707

    section Manual + Automatic
    This PR (6067) (6.275M)   : 0, 6274970
    master (6.139M)   : 0, 6139159

    section DD_TRACE_ENABLED=0
    master (9.384M)   : 0, 9383899


Contributor

@chrisnas chrisnas left a comment

LGTM

@gleocadie gleocadie merged commit 599c92f into master Sep 24, 2024
78 of 81 checks passed
@gleocadie gleocadie deleted the gleocadie/remove-syscall-calls-for-cpu-profiler branch September 24, 2024 15:14
@github-actions github-actions bot added this to the vNext-v3 milestone Sep 24, 2024
@andrewlock andrewlock added the type:performance Performance, speed, latency, resource usage (CPU, memory) label Oct 23, 2024
Labels
area:profiler Issues related to the continous-profiler type:performance Performance, speed, latency, resource usage (CPU, memory)

3 participants