Port Channels benchmarks #79

Merged: 4 commits into dotnet:master on Jul 3, 2018

adamsitnik
Member

Fixes #72

@ViktorHofer this is a good example of benchmarks that need to be set up once before running and that produce very stable results.

Results reported by xunit-performance:

| System.Threading.Channels.Performance.Tests.dll | Metric | Unit | Iterations | Average | STDEV.S | Min | Max |
|---|---|---|---|---|---|---|---|
| System.Threading.Channels.Tests.BoundedChannelPerfTests.PingPong | Duration | msec | 4 | 2504.966 | 14.556 | 2486.916 | 2519.338 |
| System.Threading.Channels.Tests.BoundedChannelPerfTests.PingPong | Allocation Size on Benchmark Execution Thread | bytes | 4 | -6874.000 | 34188.135 | -39888.000 | 29808.000 |
| System.Threading.Channels.Tests.BoundedChannelPerfTests.ReadAsyncThenWriteAsync | Duration | msec | 100 | 92.164 | 3.370 | 89.415 | 116.241 |
| System.Threading.Channels.Tests.BoundedChannelPerfTests.ReadAsyncThenWriteAsync | Allocation Size on Benchmark Execution Thread | bytes | 100 | 0.000 | 0.000 | 0.000 | 0.000 |
| System.Threading.Channels.Tests.BoundedChannelPerfTests.TryWriteThenTryRead | Duration | msec | 100 | 49.657 | 0.656 | 48.887 | 53.348 |
| System.Threading.Channels.Tests.BoundedChannelPerfTests.TryWriteThenTryRead | Allocation Size on Benchmark Execution Thread | bytes | 100 | 0.000 | 0.000 | 0.000 | 0.000 |
| System.Threading.Channels.Tests.BoundedChannelPerfTests.WriteAsyncThenReadAsync | Duration | msec | 100 | 77.709 | 6.011 | 72.436 | 103.316 |
| System.Threading.Channels.Tests.BoundedChannelPerfTests.WriteAsyncThenReadAsync | Allocation Size on Benchmark Execution Thread | bytes | 100 | 0.000 | 0.000 | 0.000 | 0.000 |
| System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.PingPong | Duration | msec | 5 | 2451.046 | 35.630 | 2414.985 | 2505.484 |
| System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.PingPong | Allocation Size on Benchmark Execution Thread | bytes | 5 | -440.000 | 29448.079 | -40208.000 | 42952.000 |
| System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.ReadAsyncThenWriteAsync | Duration | msec | 100 | 91.562 | 0.901 | 90.826 | 97.535 |
| System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.ReadAsyncThenWriteAsync | Allocation Size on Benchmark Execution Thread | bytes | 100 | 0.000 | 0.000 | 0.000 | 0.000 |
| System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.TryWriteThenTryRead | Duration | msec | 100 | 30.574 | 0.191 | 30.373 | 31.380 |
| System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.TryWriteThenTryRead | Allocation Size on Benchmark Execution Thread | bytes | 100 | 0.000 | 0.000 | 0.000 | 0.000 |
| System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.WriteAsyncThenReadAsync | Duration | msec | 100 | 72.449 | 0.744 | 71.742 | 75.072 |
| System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.WriteAsyncThenReadAsync | Allocation Size on Benchmark Execution Thread | bytes | 100 | 0.000 | 0.000 | 0.000 | 0.000 |
| System.Threading.Channels.Tests.UnboundedChannelPerfTests.PingPong | Duration | msec | 4 | 2565.998 | 32.485 | 2525.836 | 2602.982 |
| System.Threading.Channels.Tests.UnboundedChannelPerfTests.PingPong | Allocation Size on Benchmark Execution Thread | bytes | 4 | -7770.000 | 184523.815 | -256784.000 | 189120.000 |
| System.Threading.Channels.Tests.UnboundedChannelPerfTests.ReadAsyncThenWriteAsync | Duration | msec | 94 | 107.246 | 1.138 | 106.359 | 113.047 |
| System.Threading.Channels.Tests.UnboundedChannelPerfTests.ReadAsyncThenWriteAsync | Allocation Size on Benchmark Execution Thread | bytes | 94 | 0.000 | 0.000 | 0.000 | 0.000 |
| System.Threading.Channels.Tests.UnboundedChannelPerfTests.TryWriteThenTryRead | Duration | msec | 100 | 44.448 | 2.991 | 42.320 | 64.041 |
| System.Threading.Channels.Tests.UnboundedChannelPerfTests.TryWriteThenTryRead | Allocation Size on Benchmark Execution Thread | bytes | 100 | 0.000 | 0.000 | 0.000 | 0.000 |
| System.Threading.Channels.Tests.UnboundedChannelPerfTests.WriteAsyncThenReadAsync | Duration | msec | 100 | 72.118 | 4.670 | 66.332 | 84.937 |
| System.Threading.Channels.Tests.UnboundedChannelPerfTests.WriteAsyncThenReadAsync | Allocation Size on Benchmark Execution Thread | bytes | 100 | 0.000 | 0.000 | 0.000 | 0.000 |

Results reported by BenchmarkDotNet:

| Type | Method | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| BoundedChannelPerfTests | TryWriteThenTryRead | 51.29 ms | 0.1146 ms | 0.1016 ms | 51.29 ms | 51.12 ms | 51.46 ms | - | 0 B |
| SpscUnboundedChannelPerfTests | TryWriteThenTryRead | 29.64 ms | 0.0197 ms | 0.0184 ms | 29.65 ms | 29.61 ms | 29.67 ms | - | 0 B |
| UnboundedChannelPerfTests | TryWriteThenTryRead | 42.51 ms | 0.1769 ms | 0.1477 ms | 42.47 ms | 42.34 ms | 42.87 ms | - | 0 B |
| BoundedChannelPerfTests | WriteAsyncThenReadAsync | 73.55 ms | 2.0606 ms | 1.9275 ms | 72.83 ms | 72.38 ms | 79.19 ms | - | 0 B |
| SpscUnboundedChannelPerfTests | WriteAsyncThenReadAsync | 57.50 ms | 0.1168 ms | 0.0975 ms | 57.47 ms | 57.34 ms | 57.67 ms | - | 0 B |
| UnboundedChannelPerfTests | WriteAsyncThenReadAsync | 69.54 ms | 2.2016 ms | 2.4470 ms | 69.72 ms | 66.63 ms | 74.75 ms | - | 0 B |
| BoundedChannelPerfTests | ReadAsyncThenWriteAsync | 93.21 ms | 2.7695 ms | 3.0783 ms | 91.41 ms | 90.31 ms | 100.63 ms | - | 0 B |
| SpscUnboundedChannelPerfTests | ReadAsyncThenWriteAsync | 92.76 ms | 2.2358 ms | 2.5748 ms | 91.40 ms | 90.85 ms | 99.89 ms | - | 0 B |
| UnboundedChannelPerfTests | ReadAsyncThenWriteAsync | 109.82 ms | 2.7086 ms | 3.1192 ms | 108.83 ms | 106.55 ms | 118.01 ms | - | 0 B |
| BoundedChannelPerfTests | PingPong | 2,544.90 ms | 25.1021 ms | 23.4805 ms | 2,547.47 ms | 2,512.98 ms | 2,577.61 ms | 20000.0000 | 792 B |
| SpscUnboundedChannelPerfTests | PingPong | 2,441.26 ms | 20.7453 ms | 19.4052 ms | 2,440.15 ms | 2,398.93 ms | 2,486.24 ms | 20000.0000 | 792 B |
| UnboundedChannelPerfTests | PingPong | 2,606.69 ms | 34.3391 ms | 32.1208 ms | 2,607.97 ms | 2,556.90 ms | 2,659.98 ms | 20000.0000 | 792 B |

Hint: to join the results of multiple types from the same namespace into a single summary, I used `--join` combined with the `--namespace` filter (`dotnet run -c Release -f netcoreapp2.1 -- --namespace=System.Threading.Channels.Tests --join`).
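For readers unfamiliar with the "set up once" pattern mentioned at the top of this PR, here is a minimal sketch of how it can look in BenchmarkDotNet. It is not the code from this PR: the class name `ChannelSetupSketch`, the constant `ItemCount`, and the returned value are made up for illustration, and it assumes the BenchmarkDotNet NuGet package is referenced. The command-line flags quoted in the hint are passed through `args` and may differ between BenchmarkDotNet versions.

```csharp
using System.Threading.Channels;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser] // reports the Gen 0 / Allocated columns
public class ChannelSetupSketch
{
    // Hypothetical name; the value matches the loop bound shown in this PR.
    private const int ItemCount = 1_000_000;

    private ChannelReader<int> _reader;
    private ChannelWriter<int> _writer;

    // Runs once before the iterations of a benchmark, so creating the
    // channel is not part of the measured time.
    [GlobalSetup]
    public void Setup()
    {
        Channel<int> channel = Channel.CreateUnbounded<int>();
        _reader = channel.Reader;
        _writer = channel.Writer;
    }

    [Benchmark]
    public async Task<int> WriteAsyncThenReadAsync()
    {
        ChannelReader<int> reader = _reader;
        ChannelWriter<int> writer = _writer;

        int last = -1;
        for (int i = 0; i < ItemCount; i++)
        {
            await writer.WriteAsync(i);
            last = await reader.ReadAsync();
        }
        return last; // returned so the reads are observably used
    }
}

public static class Program
{
    // Forwards the command-line arguments (filters, --join, ...) to BenchmarkDotNet.
    public static void Main(string[] args) =>
        BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);
}
```

Because the channel is created in `Setup`, every iteration starts from the same empty channel and the measured loop contains only the write and read calls.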

Comments:

  1. BenchmarkDotNet (BDN) needed 205.31 sec to run the benchmarks; xunit-performance needed only half of that. The reason is that xunit-performance does not run many iterations for time-consuming benchmarks. For the PingPong benchmark, which takes 2.5 s to execute, BDN ran 20 iterations whereas xunit-performance ran only 4 or 5. I added this information to "Ideas for reducing the time required to run all benchmarks" (#78).
  2. BenchmarkDotNet produced much better results: the standard deviation was smaller and the distribution was narrower.
  3. For some reason, xunit-performance reported negative values for allocated memory in the PingPong benchmarks.
  4. The PingPong benchmarks suffer from "MemoryDiagnoser should include memory allocated by all Threads that were live during benchmark execution" (BenchmarkDotNet#723), which could be solved by https://github.com/dotnet/corefx/issues/30644. We report the number of bytes allocated by a single thread, whereas these benchmarks involve many threads.
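The limitation in point 4 can be illustrated with a small sketch. It does not reproduce how either harness measures memory; it only shows that a per-thread counter such as `GC.GetAllocatedBytesForCurrentThread()` does not see allocations made on another thread. The class and method names are hypothetical.

```csharp
using System;
using System.Threading;

public static class PerThreadAllocationDemo
{
    // Allocates a large array on a helper thread and returns the number of
    // bytes that the *calling* thread's allocation counter observed.
    public static long MeasureOnCallingThread(int arrayLength)
    {
        // Holder is created before measuring so it is not counted.
        var sink = new byte[1][];

        long before = GC.GetAllocatedBytesForCurrentThread();

        // The big allocation happens on the worker thread, not on this one.
        var worker = new Thread(() => sink[0] = new byte[arrayLength]);
        worker.Start();
        worker.Join();

        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(sink);
        return after - before;
    }

    public static void Main()
    {
        const int length = 10_000_000;
        long observed = MeasureOnCallingThread(length);
        Console.WriteLine($"Worker allocated at least {length} B; calling thread observed {observed} B");
    }
}
```

The calling thread only observes its own small allocations (the `Thread` object and the delegate), so a multi-threaded benchmark such as PingPong can report far fewer bytes than it actually allocates.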

ChannelReader<int> reader = _reader;
ChannelWriter<int> writer = _writer;

for (int i = 0; i < 1_000_000; i++)
Member

1_000_000

This magic number is in 6 different places in the source file. Could we make it a const int?

Member Author

@jorive done!
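A minimal sketch of the kind of change requested in this review thread, assuming a single shared constant replaces the repeated literal. The constant name `ItemCount` and the class below are hypothetical; the actual name used in the merged commit is not shown in this conversation.

```csharp
using System;
using System.Threading.Channels;

public class ChannelLoopSketch
{
    // One named constant instead of repeating the literal 1_000_000
    // in every benchmark method (hypothetical name).
    private const int ItemCount = 1_000_000;

    private readonly Channel<int> _channel = Channel.CreateUnbounded<int>();

    // Writes then immediately reads ItemCount items; returns how many round trips succeeded.
    public int TryWriteThenTryRead()
    {
        ChannelReader<int> reader = _channel.Reader;
        ChannelWriter<int> writer = _channel.Writer;

        int completed = 0;
        for (int i = 0; i < ItemCount; i++)
        {
            if (writer.TryWrite(i) && reader.TryRead(out _))
            {
                completed++;
            }
        }
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(new ChannelLoopSketch().TryWriteThenTryRead());
    }
}
```

With the bound in one place, changing the workload size for all benchmarks in the file is a one-line edit.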

Member

@jorive jorive left a comment

LGTM modulo comment.

@adamsitnik adamsitnik merged commit 48990a7 into dotnet:master Jul 3, 2018
@adamsitnik adamsitnik deleted the channels branch October 17, 2018 15:05