[CI Visibility] UDS and NamedPipes support #5634
LGTM in general, mostly just suggest some tidying up in the tests to reduce duplication
case TracesTransportType.WindowsNamedPipe:
    Log.Information<string?, int>("Using " + nameof(NamedPipeClientStreamFactory) + " for CI Visibility transport, with pipe name {TracesPipeName} and timeout {TracesPipeTimeoutMs}ms.", exporterSettings.TracesPipeNameInternal, exporterSettings.TracesPipeTimeoutMsInternal);
Is this identical to the default Agent strategy? Can't you use AgentTransportStrategy if so? What's different here? 🤔
The AutomaticDecompression part and the custom timeout, I think.
Changed in 8a129d8 to reuse the strategy if the transport is not TCP.
Could I suggest an alternative? Just add support for the automatic decompression and custom timeout to the default strategy?
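One way to read this suggestion is that the shared factory takes the two extra settings as optional inputs, so callers that do not need them are unaffected. The sketch below is only an illustration under that assumption: `TransportOptions` and `DefaultTransportSketch` are hypothetical names, not types from the tracer, and it covers only an HTTP (TCP) client built with the standard `HttpClientHandler` and `HttpClient` APIs.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical options type for illustration; not part of the tracer's API.
public sealed class TransportOptions
{
    // Custom request timeout; null keeps the HttpClient default.
    public TimeSpan? Timeout { get; set; }

    // Whether responses should be transparently decompressed.
    public bool AutomaticDecompression { get; set; }
}

public static class DefaultTransportSketch
{
    // Builds an HttpClient, applying the optional settings only when requested.
    public static HttpClient CreateHttpClient(Uri baseEndpoint, TransportOptions options = null)
    {
        var handler = new HttpClientHandler();
        if (options?.AutomaticDecompression == true)
        {
            handler.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }

        var client = new HttpClient(handler) { BaseAddress = baseEndpoint };
        if (options?.Timeout is TimeSpan timeout)
        {
            client.Timeout = timeout;
        }

        return client;
    }
}
```

With this shape, the CI Visibility path would pass its own timeout and decompression flag, while existing callers keep calling the factory with no options.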
tracer/test/Datadog.Trace.ClrProfiler.IntegrationTests/CI/PipesXUnitTests.cs
tracer/test/snapshots/PipesXUnitTests.SubmitTraces_packageVersion=all.verified.txt
Nice 👍 Just one small suggestion about extending the default agent strategy instead of conditionally calling it. Could always do that in a separate PR if you agree though
…etries up to 5 to reduce flakiness
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples, comparing this PR (5634) with master. Execution-time benchmarks measure the whole time it takes to execute a program and are intended to measure the one-off costs. The report flags cases where the PR's execution time is worse than latest master.
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Each interval below is the p99 interval based on the mean and StdDev of the test run.
Sample | Runtime | Scenario | This PR (5634) mean (ms) | This PR p99 interval (ms) | master mean (ms) | master p99 interval (ms)
---|---|---|---|---|---|---
FakeDbCommand | .NET Framework 4.6.2 | Baseline | 72 | 64 to 81 | 75 | 61 to 90
FakeDbCommand | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | 976 | 957 to 995 | 978 | 947 to 1010
FakeDbCommand | .NET Core 3.1 | Baseline | 109 | 106 to 112 | 109 | 107 to 112
FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | 683 | 660 to 706 | 688 | 669 to 707
FakeDbCommand | .NET 6 | Baseline | 93 | 90 to 96 | 93 | 90 to 96
FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | 644 | 617 to 670 | 643 | 622 to 663
HttpMessageHandler | .NET Framework 4.6.2 | Baseline | 192 | 186 to 197 | 192 | 188 to 196
HttpMessageHandler | .NET Framework 4.6.2 | CallTarget+Inlining+NGEN | 1072 | 1047 to 1097 | 1066 | 1041 to 1090
HttpMessageHandler | .NET Core 3.1 | Baseline | 275 | 270 to 281 | 276 | 272 to 280
HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | 860 | 839 to 882 | 864 | 840 to 888
HttpMessageHandler | .NET 6 | Baseline | 266 | 262 to 269 | 266 | 262 to 270
HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | 852 | 824 to 880 | 855 | 828 to 881
Throughput/Crank Report ⚡
Throughput results for AspNetCoreSimpleController, comparing this PR (5634), master, and benchmarks/2.9.0. The report flags cases where the PR's throughput is worse than latest master by 5% or more. Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!
Values are total requests; "-" means the report has no entry for that branch.
Platform | Scenario | This PR (5634) | master | benchmarks/2.9.0
---|---|---|---|---
Linux x64 | Baseline | 11866204 | 11792642 | 11919161
Linux x64 | Automatic | 8068286 | 8100271 | 8336508
Linux x64 | Trace stats | - | 8391668 | -
Linux x64 | Manual | 10403094 | 10382409 | -
Linux x64 | Manual + Automatic | 7673613 | 7604410 | -
Linux x64 | Version Conflict | - | 6894993 | -
Linux arm64 | Baseline | 9708502 | 9469402 | 9645709
Linux arm64 | Automatic | 6496151 | 6624452 | -
Linux arm64 | Trace stats | - | 6913938 | -
Linux arm64 | Manual | 8224423 | 8381087 | -
Linux arm64 | Manual + Automatic | 6286200 | 6165101 | -
Linux arm64 | Version Conflict | - | 5727364 | -
Windows x64 | Baseline | 9747485 | 9839143 | 10035695
Windows x64 | Automatic | 6930328 | 6892972 | 7475419
Windows x64 | Trace stats | - | 7392031 | -
Windows x64 | Manual | 8597415 | 8677977 | -
Windows x64 | Manual + Automatic | 6666118 | 6648668 | -
Windows x64 | Version Conflict | - | 6142246 | -
Benchmarks Report for tracer 🐌
Benchmarks for #5634 compared to master.
Allocation changes below 0.5% are ignored.
Benchmark details
Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️
Benchmarks.Trace.SpanBenchmark - Slower
Slower in #5634 (diff/base ratio above 1):
Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality
---|---|---|---|---
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑netcoreapp3.1 | 1.116 | 538.53 | 600.74 |
Faster in #5634 (base/diff ratio above 1):
Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality
---|---|---|---|---
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net472 | 1.198 | 694.85 | 580.07 |
Benchmarks.Trace.SpanBenchmark.StartFinishScope‑netcoreapp3.1 | 1.144 | 782.32 | 683.74 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated
---|---|---|---|---|---|---|---|---|---
master | StartFinishSpan | net6.0 | 398ns | 0.112ns | 0.433ns | 0.00817 | 0 | 0 | 576 B
master | StartFinishSpan | netcoreapp3.1 | 539ns | 0.183ns | 0.708ns | 0.00767 | 0 | 0 | 576 B
master | StartFinishSpan | net472 | 694ns | 0.544ns | 2.11ns | 0.0917 | 0 | 0 | 578 B
master | StartFinishScope | net6.0 | 478ns | 0.161ns | 0.625ns | 0.00984 | 0 | 0 | 696 B
master | StartFinishScope | netcoreapp3.1 | 783ns | 0.287ns | 1.07ns | 0.00935 | 0 | 0 | 696 B
master | StartFinishScope | net472 | 847ns | 0.518ns | 2.01ns | 0.104 | 0 | 0 | 658 B
#5634 | StartFinishSpan | net6.0 | 390ns | 0.167ns | 0.623ns | 0.00819 | 0 | 0 | 576 B
#5634 | StartFinishSpan | netcoreapp3.1 | 598ns | 2.82ns | 10.9ns | 0.00786 | 0 | 0 | 576 B
#5634 | StartFinishSpan | net472 | 580ns | 0.501ns | 1.94ns | 0.0917 | 0 | 0 | 578 B
#5634 | StartFinishScope | net6.0 | 467ns | 0.104ns | 0.402ns | 0.00985 | 0 | 0 | 696 B
#5634 | StartFinishScope | netcoreapp3.1 | 684ns | 0.303ns | 1.17ns | 0.00933 | 0 | 0 | 696 B
#5634 | StartFinishScope | net472 | 916ns | 0.454ns | 1.76ns | 0.105 | 0 | 0 | 658 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated
---|---|---|---|---|---|---|---|---|---
master | RunOnMethodBegin | net6.0 | 626ns | 0.209ns | 0.809ns | 0.00965 | 0 | 0 | 696 B
master | RunOnMethodBegin | netcoreapp3.1 | 944ns | 0.488ns | 1.82ns | 0.00921 | 0 | 0 | 696 B
master | RunOnMethodBegin | net472 | 1.04μs | 0.451ns | 1.75ns | 0.104 | 0 | 0 | 658 B
#5634 | RunOnMethodBegin | net6.0 | 590ns | 0.297ns | 1.15ns | 0.00976 | 0 | 0 | 696 B
#5634 | RunOnMethodBegin | netcoreapp3.1 | 941ns | 0.521ns | 2.02ns | 0.00938 | 0 | 0 | 696 B
#5634 | RunOnMethodBegin | net472 | 1.09μs | 0.461ns | 1.79ns | 0.104 | 0 | 0 | 658 B
Datadog Report
Branch report: ✅ 0 Failed, 340419 Passed, 2131 Skipped, 22h 18m 40.44s Total Time
// The server implementation of named pipes is flaky so have 5 attempts
var attemptsRemaining = 5;
5 attempts? Is it really that flaky?! IIRC, looking at the metrics, we normally need 1 retry, tops 😬
I saw a failure with 3 attempts, so I'm trying to play it safe...
I still find 3 consecutive failures concerning 🙈
Next week I'll look deeper into why the flakiness happens, but it seems to be related to our mock server: the flaky test opens parallel connections and sends approximately 120 spans over them. It is very stable for small payloads.
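For readers following the thread, the bounded-retry pattern under discussion can be sketched roughly as below. This is a minimal illustration and not the PR's actual test code: `RetryHelper`, `RunWithRetriesAsync`, and `testBody` are hypothetical names, and only the `attemptsRemaining` counter starting at 5 comes from the diff.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not taken from the PR.
public static class RetryHelper
{
    // Runs testBody until it succeeds or maxAttempts attempts have been made.
    // The exception from the final attempt is allowed to propagate.
    public static async Task RunWithRetriesAsync(Func<Task> testBody, int maxAttempts = 5)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        var attemptsRemaining = maxAttempts;
        while (true)
        {
            try
            {
                attemptsRemaining--;
                await testBody().ConfigureAwait(false);
                return;
            }
            catch (Exception ex) when (attemptsRemaining > 0)
            {
                // Only swallow the failure while attempts remain.
                Console.WriteLine($"Attempt failed, {attemptsRemaining} remaining: {ex.Message}");
            }
        }
    }
}
```

A bound like this hides intermittent mock-server failures from CI but does not explain them, which is the concern raised above about needing more than three consecutive attempts.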
Summary of changes
This PR adds UDS and Windows Named Pipes transport support for CI Visibility.
Reason for change
The feature is supported in APM but missing in CI Visibility.
Implementation details
Copied the APM code and adapted it to the CI Visibility case.
Test coverage
Two new test classes, UdsXUnitTest and PipesXUnitTest, were added as copies of XUnitTest that use different transports.
The same was done with UdsXunitEvpTests.
Other details