[Perf] Changes at 5/21/2021 6:35:06 PM #6086

Open
performanceautofiler bot opened this issue May 25, 2021 · 5 comments

@performanceautofiler

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.Sort<Int32>

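| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LinqOrderByExtension - Duration of single invocation | 47.71 μs | 51.24 μs | 1.07 | 0.01 | | | | | |
| LinqQuery - Duration of single invocation | 47.60 μs | 51.23 μs | 1.08 | 0.01 | | | | | |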

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.Sort<Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.Sort<Int32>.LinqOrderByExtension(Size: 512)


System.Collections.Sort<Int32>.LinqQuery(Size: 512)


Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Threading.Tests.Perf_Timer

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ShortScheduleAndDispose - Duration of single invocation | 173.06 ns | 184.72 ns | 1.07 | 0.02 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Threading.Tests.Perf_Timer*'

Payloads

Baseline
Compare

Histogram

System.Threading.Tests.Perf_Timer.ShortScheduleAndDispose



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.ContainsTrue<Int32>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ImmutableSortedSet - Duration of single invocation | 31.61 μs | 35.17 μs | 1.11 | 0.00 | | | | | |
| SortedSet - Duration of single invocation | 34.16 μs | 36.28 μs | 1.06 | 0.00 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.ContainsTrue<Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.ContainsTrue<Int32>.ImmutableSortedSet(Size: 512)


System.Collections.ContainsTrue<Int32>.SortedSet(Size: 512)



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.CtorFromCollection<Int32>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SortedList - Duration of single invocation | 8.60 μs | 9.57 μs | 1.11 | 0.01 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.CtorFromCollection<Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.CtorFromCollection<Int32>.SortedList(Size: 512)



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.TryGetValueFalse<Int32, Int32>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ImmutableSortedDictionary - Duration of single invocation | 37.35 μs | 43.23 μs | 1.16 | 0.01 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.TryGetValueFalse<Int32, Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.TryGetValueFalse<Int32, Int32>.ImmutableSortedDictionary(Size: 512)



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.ContainsKeyTrue<Int32, Int32>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ImmutableSortedDictionary - Duration of single invocation | 35.90 μs | 38.79 μs | 1.08 | 0.00 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.ContainsKeyTrue<Int32, Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.ContainsKeyTrue<Int32, Int32>.ImmutableSortedDictionary(Size: 512)



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Text.Tests.Perf_StringBuilder

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Append_Char - Duration of single invocation | 227.50 μs | 259.18 μs | 1.14 | 0.05 | | | | | |
| Append_Char - Duration of single invocation | 329.92 ns | 361.08 ns | 1.09 | 0.07 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Text.Tests.Perf_StringBuilder*'

Payloads

Baseline
Compare

Histogram

System.Text.Tests.Perf_StringBuilder.Append_Char(length: 100000)


System.Text.Tests.Perf_StringBuilder.Append_Char(length: 100)



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.CreateAddAndClear<Int32>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SortedSet - Duration of single invocation | 47.34 μs | 53.47 μs | 1.13 | 0.00 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.CreateAddAndClear<Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.CreateAddAndClear<Int32>.SortedSet(Size: 512)



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.ContainsKeyFalse<Int32, Int32>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ImmutableSortedDictionary - Duration of single invocation | 38.05 μs | 44.24 μs | 1.16 | 0.01 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.ContainsKeyFalse<Int32, Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.ContainsKeyFalse<Int32, Int32>.ImmutableSortedDictionary(Size: 512)



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in Microsoft.Extensions.Primitives.StringSegmentBenchmark

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IndexOfAny - Duration of single invocation | 6.61 ns | 8.11 ns | 1.23 | 0.04 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'Microsoft.Extensions.Primitives.StringSegmentBenchmark*'

Payloads

Baseline
Compare

Histogram

Microsoft.Extensions.Primitives.StringSegmentBenchmark.IndexOfAny



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Diagnostics.Perf_Process

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GetCurrentProcess - Duration of single invocation | 102.15 ns | 111.38 ns | 1.09 | 0.01 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Diagnostics.Perf_Process*'

Payloads

Baseline
Compare

Histogram

System.Diagnostics.Perf_Process.GetCurrentProcess



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.TryGetValueTrue<Int32, Int32>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ImmutableSortedDictionary - Duration of single invocation | 35.55 μs | 39.34 μs | 1.11 | 0.00 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.TryGetValueTrue<Int32, Int32>*'

Payloads

Baseline
Compare

Histogram

System.Collections.TryGetValueTrue<Int32, Int32>.ImmutableSortedDictionary(Size: 512)



Run Information

Architecture x64
OS Windows 10.0.18362
Baseline becccb2c3c4556bbcdfde27a101e7615544da7ab
Compare c0860776f133fa98d3d2ef412821c9d3dd48b4fb
Diff Diff

Regressions in System.Collections.CreateAddAndClear<String>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HashSet - Duration of single invocation | 16.36 μs | 17.66 μs | 1.08 | 0.05 | | | | | |

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.CreateAddAndClear<String>*'

Payloads

Baseline
Compare

Histogram

System.Collections.CreateAddAndClear<String>.HashSet(Size: 512)



@DrewScoggins
Owner

@AndyAyersMS PGO

@AndyAyersMS
Collaborator

The change in System.Collections.CtorFromCollection<Int32> came from dotnet/runtime#52966.

@jakobbotsch can you look at the PGO data before/after and see whether anything stands out that might have impacted this? If you open the details tab under the chart, you can find downloads of the two versions that were tested.

@jakobbotsch

dotnet/runtime@becccb2 is using the 1.0.0-prerelease.21267.7 nuget package of the PGO data while dotnet/runtime@c086077 is using 1.0.0-prerelease.21270.4. Downloading those, merging them (to simulate what the build does) and then comparing them gives the following: https://gist.github.com/jakobbotsch/edb6f940d089aa98b50761351d0b49a4
Note that this compares the sparse edge profiles, which I'm not 100% sure about. With that said, some interesting observations:

  1. The latter profile has way more methods (54951 vs 45951) including way more methods with profile data.
  2. The histogram looks decent:
When comparing the flow-graphs of the matching methods, their overlaps break down as follows:
100% ███████████████████████████████████████████████████████████████████████████████████████▍ (80.3%)
>95% ██████████████▍ (13.3%)
>90% ██▍ (2.3%)
>85% █▏ (1.1%)
>80% ▊ (0.7%)
>75% ▍ (0.4%)
>70% ▍ (0.4%)
>65% ▍ (0.4%)
>60% ▏ (0.2%)
>55% ▎ (0.2%)
>50% ▏ (0.1%)
>45% ▏ (0.1%)
>40% ▏ (0.1%)
>35% ▏ (0.1%)
>30% ▏ (0.1%)
>25% ▏ (0.1%)
>20% ▏ (0.0%)
>15% ▏ (0.1%)
>10% ▏ (0.0%)
> 5% ▏ (0.0%)
> 0% ▏ (0.1%)
0%   ▏ (0.0%)

But:

  3. Some of the worst offenders in terms of overlap are very important. 126 of the methods have overlap < 50%, including:

  • [S.P.CoreLib]System.Collections.Generic.Dictionary`2<System.ValueTuple`2<System.__Canon,System.__Canon>,System.__Canon>.TryGetValue(ValueTuple`2<__Canon,__Canon>,__Canon&) (0.07% overlap)
  • [S.P.CoreLib]System.SpanHelpers.IndexOf<int32>(int32&,int32,int32) (0.09% overlap)
  • [S.P.CoreLib]System.Collections.Generic.HashSet`1<System.__Canon>.ConstructFrom(HashSet`1<__Canon>) (0.68% overlap) (could explain the HashSet regression?)
  • [S.P.CoreLib]System.Array.Clear(Array) (27.65% overlap)
  • [S.P.CoreLib]System.Collections.Generic.ArraySortHelper`1<System.__Canon>.IntroSort(Span`1<__Canon>,int32,Comparison`1<__Canon>) (28.44% overlap) (could explain the sorting regression?)

Some interesting data here. I can see if I can overlay the edge counts on some of their flow graphs tomorrow, though not entirely sure how easy it is to do that with the sparse edge profiles.
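
The thread does not state how the overlap percentages above are computed. One common way to define overlap between two count profiles, assumed here purely for illustration and not confirmed to be what the comparison in the gist uses, is to normalize each profile's edge counts to fractions and sum the per-edge minimum. The function name, block names, and counts below are hypothetical:

```python
def overlap(profile_a: dict, profile_b: dict) -> float:
    """Overlap in [0, 1] between two edge-count profiles of the same method.

    Each profile maps an edge (source block, target block) to its count.
    Counts are normalized to fractions, so only the distribution matters,
    not how many samples were collected.
    """
    total_a = sum(profile_a.values())
    total_b = sum(profile_b.values())
    if total_a == 0 or total_b == 0:
        return 0.0  # assumption: an empty profile overlaps with nothing
    edges = set(profile_a) | set(profile_b)
    return sum(
        min(profile_a.get(e, 0) / total_a, profile_b.get(e, 0) / total_b)
        for e in edges
    )


# Hypothetical two-way branch whose hot side flips between the two profiles.
old = {("BB0", "BB1"): 90, ("BB0", "BB2"): 10}
new = {("BB0", "BB1"): 10, ("BB0", "BB2"): 90}

print(overlap(old, new))  # about 0.2: the profiles disagree on the hot path
print(overlap(old, {("BB0", "BB1"): 900, ("BB0", "BB2"): 100}))  # about 1.0: same shape, scaled counts
```

Under this definition, a method with very low overlap is one where the two profiles disagree about which paths are hot, which is the kind of change that could plausibly alter the optimizer's decisions for that method.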

@jakobbotsch

I tried reproducing this locally by downloading the baseline/compare versions in the post above, but the variance seems very high. I tried with the Append_Char tests and I can basically get any conclusion by running it a few times. Here are the results from a few runs:

| Method | Job | Toolchain | length | Mean | Error | Median | Min | Max | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Append_Char | Job-ROJTIC | \baseline\Core_Root\corerun.exe | 100 | 495.5 μs | NA | 495.5 μs | 495.5 μs | 495.5 μs | 1.00 | - | - | - | 592 B |
| Append_Char | Job-HCLUJV | \compare\Core_Root\corerun.exe | 100 | 459.0 μs | NA | 459.0 μs | 459.0 μs | 459.0 μs | 0.93 | - | - | - | 592 B |
| Append_Char | Job-ROJTIC | \baseline\Core_Root\corerun.exe | 100000 | 757.6 μs | NA | 757.6 μs | 757.6 μs | 757.6 μs | 1.00 | - | - | - | 210,016 B |
| Append_Char | Job-HCLUJV | \compare\Core_Root\corerun.exe | 100000 | 669.3 μs | NA | 669.3 μs | 669.3 μs | 669.3 μs | 0.88 | - | - | - | 210,016 B |

| Method | Job | Toolchain | length | Mean | Error | Median | Min | Max | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Append_Char | Job-TAMWIF | \baseline\Core_Root\corerun.exe | 100 | 614.1 μs | NA | 614.1 μs | 614.1 μs | 614.1 μs | 1.00 | - | - | - | 592 B |
| Append_Char | Job-XZVEFO | \compare\Core_Root\corerun.exe | 100 | 549.1 μs | NA | 549.1 μs | 549.1 μs | 549.1 μs | 0.89 | - | - | - | 592 B |
| Append_Char | Job-TAMWIF | \baseline\Core_Root\corerun.exe | 100000 | 717.1 μs | NA | 717.1 μs | 717.1 μs | 717.1 μs | 1.00 | - | - | - | 210,016 B |
| Append_Char | Job-XZVEFO | \compare\Core_Root\corerun.exe | 100000 | 718.4 μs | NA | 718.4 μs | 718.4 μs | 718.4 μs | 1.00 | - | - | - | 210,016 B |

| Method | Job | Toolchain | length | Mean | Error | Median | Min | Max | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Append_Char | Job-AQYTDT | \baseline\Core_Root\corerun.exe | 100 | 561.2 μs | NA | 561.2 μs | 561.2 μs | 561.2 μs | 1.00 | - | - | - | 592 B |
| Append_Char | Job-ILDEMG | \compare\Core_Root\corerun.exe | 100 | 518.9 μs | NA | 518.9 μs | 518.9 μs | 518.9 μs | 0.92 | - | - | - | 592 B |
| Append_Char | Job-AQYTDT | \baseline\Core_Root\corerun.exe | 100000 | 817.6 μs | NA | 817.6 μs | 817.6 μs | 817.6 μs | 1.00 | - | - | - | 210,016 B |
| Append_Char | Job-ILDEMG | \compare\Core_Root\corerun.exe | 100000 | 631.9 μs | NA | 631.9 μs | 631.9 μs | 631.9 μs | 0.77 | - | - | - | 210,016 B |

| Method | Job | Toolchain | length | Mean | Error | Median | Min | Max | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Append_Char | Job-QIOYSB | \baseline\Core_Root\corerun.exe | 100 | 744.3 μs | NA | 744.3 μs | 744.3 μs | 744.3 μs | 1.00 | - | - | - | 592 B |
| Append_Char | Job-IYIFDF | \compare\Core_Root\corerun.exe | 100 | 559.7 μs | NA | 559.7 μs | 559.7 μs | 559.7 μs | 0.75 | - | - | - | 592 B |
| Append_Char | Job-QIOYSB | \baseline\Core_Root\corerun.exe | 100000 | 695.5 μs | NA | 695.5 μs | 695.5 μs | 695.5 μs | 1.00 | - | - | - | 210,016 B |
| Append_Char | Job-IYIFDF | \compare\Core_Root\corerun.exe | 100000 | 734.5 μs | NA | 734.5 μs | 734.5 μs | 734.5 μs | 1.06 | - | - | - | 210,016 B |

I used the command in the "repro" section and added --corerun baseline\Core_Root\corerun.exe compare\Core_Root\corerun.exe to it. Is that the right way to go @DrewScoggins?
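
To put a number on the variance in the four runs above, here is a small Python sketch that recomputes the compare/baseline ratios and measures how much the baseline alone moves from run to run. The means are copied from the results above; the helper name `summarize` is made up for illustration:

```python
# (baseline, compare) means in microseconds, copied from the four runs above.
runs = {
    100: [(495.5, 459.0), (614.1, 549.1), (561.2, 518.9), (744.3, 559.7)],
    100_000: [(757.6, 669.3), (717.1, 718.4), (817.6, 631.9), (695.5, 734.5)],
}


def summarize(pairs):
    """Return the per-run compare/baseline ratios and the max/min spread of the baseline."""
    ratios = [round(compare / baseline, 2) for baseline, compare in pairs]
    baselines = [baseline for baseline, _ in pairs]
    baseline_spread = round(max(baselines) / min(baselines), 2)
    return ratios, baseline_spread


for length, pairs in runs.items():
    ratios, spread = summarize(pairs)
    print(f"length={length}: ratios={ratios}, baseline max/min={spread}")
# prints:
# length=100: ratios=[0.93, 0.89, 0.92, 0.75], baseline max/min=1.5
# length=100000: ratios=[0.88, 1.0, 0.77, 1.06], baseline max/min=1.18
```

On this data the baseline alone moves by about 1.5x between runs at length 100, which is far larger than the 1.09 to 1.14 ratios reported for Append_Char in the issue. Each run also reports a single value (Error is NA and Mean, Min, and Max are identical), so these runs can neither confirm nor refute the reported regression.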

@DrewScoggins
Owner

I think the problem is that Append_Char is simply a noisy test that also happened to show a regression across this boundary. I would look at any of the others except System.Collections.CreateAddAndClear<String>.HashSet, as the rest all look less noisy.
