Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock access to rejitters #5757

Merged
merged 1 commit into from
Jul 2, 2024
Merged

Lock access to rejitters #5757

merged 1 commit into from
Jul 2, 2024

Conversation

kevingosse
Copy link
Collaborator

Summary of changes

Add a lock when accessing the rejitters in RejitHandler

Reason for change

RegisterRejitter can be accessed concurrently with NotifyReJITParameters, causing chaos.

Implementation details

Reused the existing reader/writer lock implementation

Test coverage

See if stuff keep crashing, I guess.

@kevingosse kevingosse requested a review from a team as a code owner July 1, 2024 11:10
Copy link
Contributor

@daniel-romano-DD daniel-romano-DD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot

@datadog-ddstaging
Copy link

datadog-ddstaging bot commented Jul 1, 2024

Datadog Report

Branch report: kevin/lock_rejit
Commit report: a6f07e4
Test service: dd-trace-dotnet

✅ 0 Failed, 345001 Passed, 2108 Skipped, 22h 13m 7.04s Total Time
❄️ 1 New Flaky

New Flaky Tests (1)

  • SimpleActivitiesAndSpansTest - Datadog.Trace.Tests.ActivityTests - Last Failure

    Expand for error
     Expected childActivity.ParentSpanId.ToString() to be "af53c96e888daa20", but "1f14738d539bc651" differs near "1f1" (index 0).
    

@andrewlock
Copy link
Member

andrewlock commented Jul 1, 2024

Throughput/Crank Report:zap:

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where throughput results for the PR are worse than latest master (5% drop or greater), results are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (5757) (11.923M)   : 0, 11922621
    master (11.885M)   : 0, 11885449
    benchmarks/2.9.0 (11.959M)   : 0, 11959218

    section Automatic
    This PR (5757) (8.005M)   : 0, 8004540
    master (8.083M)   : 0, 8083143
    benchmarks/2.9.0 (8.424M)   : 0, 8423539

    section Trace stats
    master (8.397M)   : 0, 8397245

    section Manual
    This PR (5757) (10.263M)   : 0, 10262840
    master (10.421M)   : 0, 10420552

    section Manual + Automatic
    This PR (5757) (7.545M)   : 0, 7545468
    master (7.723M)   : 0, 7723457

    section Version Conflict
    master (6.894M)   : 0, 6893618

Loading
gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (5757) (9.595M)   : 0, 9595293
    master (9.460M)   : 0, 9460018
    benchmarks/2.9.0 (9.647M)   : 0, 9646678

    section Automatic
    This PR (5757) (6.600M)   : 0, 6600277
    master (6.295M)   : 0, 6295393

    section Trace stats
    master (6.890M)   : 0, 6889897

    section Manual
    This PR (5757) (8.182M)   : 0, 8182205
    master (8.322M)   : 0, 8322214

    section Manual + Automatic
    This PR (5757) (6.250M)   : 0, 6249646
    master (6.028M)   : 0, 6027602

    section Version Conflict
    master (5.501M)   : 0, 5501210

Loading
gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (5757) (10.132M)   : 0, 10132463
    master (10.045M)   : 0, 10044712
    benchmarks/2.9.0 (10.154M)   : 0, 10153990

    section Automatic
    This PR (5757) (7.297M)   : 0, 7297110
    master (7.251M)   : 0, 7250974
    benchmarks/2.9.0 (7.563M)   : 0, 7562893

    section Trace stats
    master (7.555M)   : 0, 7554611

    section Manual
    This PR (5757) (9.026M)   : 0, 9025618
    master (9.057M)   : 0, 9056875

    section Manual + Automatic
    This PR (5757) (6.924M)   : 0, 6924012
    master (7.113M)   : 0, 7113130

    section Version Conflict
    master (6.383M)   : 0, 6382724

Loading

@andrewlock
Copy link
Member

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch test with statistical test for significance of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5757) - mean (74ms)  : 63, 86
     .   : milestone, 74,
    master - mean (72ms)  : 63, 81
     .   : milestone, 72,

    section CallTarget+Inlining+NGEN
    This PR (5757) - mean (898ms)  : 869, 927
     .   : milestone, 898,
    master - mean (895ms)  : 874, 916
     .   : milestone, 895,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5757) - mean (110ms)  : 107, 113
     .   : milestone, 110,
    master - mean (109ms)  : 106, 113
     .   : milestone, 109,

    section CallTarget+Inlining+NGEN
    This PR (5757) - mean (630ms)  : 614, 646
     .   : milestone, 630,
    master - mean (634ms)  : 618, 651
     .   : milestone, 634,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5757) - mean (93ms)  : 89, 96
     .   : milestone, 93,
    master - mean (93ms)  : 90, 96
     .   : milestone, 93,

    section CallTarget+Inlining+NGEN
    This PR (5757) - mean (590ms)  : 575, 604
     .   : milestone, 590,
    master - mean (591ms)  : 572, 610
     .   : milestone, 591,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5757) - mean (192ms)  : 188, 195
     .   : milestone, 192,
    master - mean (192ms)  : 188, 196
     .   : milestone, 192,

    section CallTarget+Inlining+NGEN
    This PR (5757) - mean (999ms)  : 973, 1025
     .   : milestone, 999,
    master - mean (1,003ms)  : 974, 1031
     .   : milestone, 1003,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5757) - mean (276ms)  : 271, 281
     .   : milestone, 276,
    master - mean (275ms)  : 270, 280
     .   : milestone, 275,

    section CallTarget+Inlining+NGEN
    This PR (5757) - mean (818ms)  : 795, 840
     .   : milestone, 818,
    master - mean (823ms)  : 791, 855
     .   : milestone, 823,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5757) - mean (265ms)  : 261, 270
     .   : milestone, 265,
    master - mean (264ms)  : 260, 269
     .   : milestone, 264,

    section CallTarget+Inlining+NGEN
    This PR (5757) - mean (805ms)  : 774, 835
     .   : milestone, 805,
    master - mean (811ms)  : 780, 843
     .   : milestone, 811,

Loading

@andrewlock andrewlock merged commit 421a272 into master Jul 2, 2024
60 of 62 checks passed
@andrewlock andrewlock deleted the kevin/lock_rejit branch July 2, 2024 14:12
@github-actions github-actions bot added this to the vNext-v2 milestone Jul 2, 2024
andrewlock added a commit that referenced this pull request Jul 3, 2024
andrewlock added a commit that referenced this pull request Jul 3, 2024
…5768)

## Summary of changes

Reverts recent change that moved definitions to the native side

## Reason for change

We've seen crashes on the native side in smoke tests etc that we haven't
got to the bottom of yet. Out of caution, choosing to revert these
changes to investigate further

## Implementation details

Reverted the following PRs: 
- #5757 (introduced to
fix a bug in 5592)
- #5592

## Test coverage

This is the main test, will run a full installer test suite on this PR
to confirm we don't see the issues immediately. Will require further
monitoring to look for other issues

## Other details

A less drastic version of 
- #5767

as we expect the issue is in the moved definitions.

<!-- ⚠️ Note: where possible, please obtain 2 approvals prior to
merging. Unless CODEOWNERS specifies otherwise, for external teams it is
typically best to have one review from a team member, and one review
from apm-dotnet. Trivial changes do not require 2 reviews. -->
@e-n-0 e-n-0 added the area:native-library Automatic instrumentation native C++ code (Datadog.Trace.ClrProfiler.Native) label Jul 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:native-library Automatic instrumentation native C++ code (Datadog.Trace.ClrProfiler.Native)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants