JIT: defer some flow graph reordering until after loop recognition #69878

Merged

AndyAyersMS
Member

The JIT currently will aggressively reorder the flow graph before running its
loop recognition phases. When there is PGO data this sometimes perturbs the
block order so that loops are no longer recognized, and we miss out on some
loop optimizations.

This change defers most block reordering until after the JIT has gone through
the optimization phases. There is still a limited form of flow cleanup done
early on.

There is also a compensating change in loop recognition in one place where it was
relying on adjacent blocks being combined.

Fixes #67318.
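
To illustrate why block order matters here, the following is a minimal sketch of a toy recognizer that only accepts a loop when its blocks are contiguous in the layout. It is not the JIT's actual algorithm; the block names, the `natural_loop` and `lexical_loops` helpers, and the reordered layout are all hypothetical.

```python
def natural_loop(succs, head, tail):
    """Blocks that can reach `tail` without passing through `head`, plus `head`.

    This set depends only on the graph edges, not on the block layout.
    """
    preds = {}
    for block, targets in succs.items():
        for target in targets:
            preds.setdefault(target, set()).add(block)
    body = {head, tail}
    work = [tail] if tail != head else []
    while work:
        block = work.pop()
        for pred in preds.get(block, ()):
            if pred not in body:
                body.add(pred)
                work.append(pred)
    return body


def lexical_loops(layout, succs):
    """Toy layout-sensitive recognizer (an assumption for illustration).

    A loop is accepted only if a backward jump in the layout encloses
    exactly the blocks of the corresponding natural loop.
    """
    pos = {block: i for i, block in enumerate(layout)}
    found = []
    for tail in layout:
        for head in succs.get(tail, ()):
            if pos[head] <= pos[tail]:  # backward jump in the current layout
                span = set(layout[pos[head]:pos[tail] + 1])
                if span == natural_loop(succs, head, tail):
                    found.append((head, tail))
    return found


# Hypothetical flow graph: ENTRY -> HEAD -> BODY -> LATCH -> HEAD, HEAD -> EXIT.
SUCCS = {
    "ENTRY": ["HEAD"],
    "HEAD": ["BODY", "EXIT"],
    "BODY": ["LATCH"],
    "LATCH": ["HEAD"],
    "EXIT": [],
}

original = ["ENTRY", "HEAD", "BODY", "LATCH", "EXIT"]
# Same graph, but a profile-driven reorder has moved EXIT next to HEAD.
reordered = ["ENTRY", "HEAD", "EXIT", "BODY", "LATCH"]

print(lexical_loops(original, SUCCS))
print(lexical_loops(reordered, SUCCS))
```

The flow graph is identical in both calls; only the layout differs. The toy recognizer finds the loop in the first layout and misses it in the second, which is the kind of sensitivity that deferring reordering is meant to avoid.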

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 26, 2022
@ghost ghost assigned AndyAyersMS May 26, 2022
@ghost

ghost commented May 26, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

@AndyAyersMS
Member Author

@BruceForstall PTAL
cc @dotnet/jit-contrib

This is going to lead to a massive number of diffs. There's no way I've found to incrementally approach this.

If/when we go to a more graph-based loop recognition we can reconsider, but in general we should not be concerned with block ordering early on in the phase pipeline. So I think deferring these sorts of ordering opts is the right long-term approach.

Member

@BruceForstall BruceForstall left a comment

LGTM.

Will be interesting to see how the perf runs react.

Should run jitstress on this.

@AndyAyersMS
Member Author

Needed to pull in a bit more of fgCompactBlocks logic.

@AndyAyersMS
Member Author

@dotnet/dncenghot CI seems to be in a bad way

 Retrying 'FindPackagesByIdAsync' for source 'https://pkgs.dev.azure.com/dnceng/9ee6d478-d288-47f7-aacc-f6e6d082ae6d/_packaging/d1622942-d16f-48e5-bc83-96f4539e7601/nuget/v3/flat2/microsoft.extensions.dependencymodel/index.json'.
  Response status code does not indicate success: 401 (Unauthorized - TF400813: Resource not available for anonymous access. Client authentication required. (DevOps Activity ID: 4CE74E9C-9CEC-474D-BC0C-08094706C5D6)).

@MattGal
Member

MattGal commented May 26, 2022

@epananth can you handle this? Repro is as simple as going to https://pkgs.dev.azure.com/dnceng/9ee6d478-d288-47f7-aacc-f6e6d082ae6d/_packaging/d1622942-d16f-48e5-bc83-96f4539e7601/nuget/v3/flat2/microsoft.netcore.app.ref/index.json from an incognito window; this should (and does with auth) serve up this:

{"$id":"1","innerException":null,"message":"Can't find the package 'microsoft.netcore.app.ref' in feed 'dotnet-tools'.","typeName":"Microsoft.VisualStudio.Services.NuGet.WebApi.Exceptions.PackageNotFoundException, Microsoft.VisualStudio.Services.NuGet.WebApi","typeKey":"PackageNotFoundException","errorCode":0,"eventId":3000}

but instead it prompts for auth despite dotnet-tools being a public feed.

@epananth
Member

Yes on it

@AndyAyersMS
Member Author

Local x64 windows diff summary. Overall regression expected as we should be enabling more cloning.

aspnet     8765 total methods with Code Size differences (3416 improved, 5349 regressed), 1821 unchanged.
           Total bytes of delta: 38659 (0.17% of base)
benchmarks 781 total methods with Code Size differences (442 improved, 339 regressed), 369 unchanged.
           Total bytes of delta: -4627 (-0.05% of base)
clr.test   9817 total methods with Code Size differences (8460 improved, 1357 regressed), 1434 unchanged.
           Total bytes of delta: -52739 (-0.04% of base)
lib.cross  1515 total methods with Code Size differences (801 improved, 714 regressed), 956 unchanged.
           Total bytes of delta: 3231 (0.01% of base)
lib.pmi    11671 total methods with Code Size differences (6009 improved, 5662 regressed), 5132 unchanged.
           Total bytes of delta: 67054 (0.13% of base)
lib.test   36329 total methods with Code Size differences (14817 improved, 21512 regressed), 6302 unchanged.
           Total bytes of delta: 134808 (0.10% of base)

Generating all these dasm and rc took forever.
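
For anyone who wants to aggregate a summary like the one above, here is a minimal sketch that parses the two-line-per-collection text and totals it. The regular expressions and the `parse_summary` helper are assumptions written against the text as pasted in this comment; this is not a SuperPMI tool or API.

```python
import re

# Summary text copied from the comment above.
SUMMARY = """\
aspnet     8765 total methods with Code Size differences (3416 improved, 5349 regressed), 1821 unchanged.
           Total bytes of delta: 38659 (0.17% of base)
benchmarks 781 total methods with Code Size differences (442 improved, 339 regressed), 369 unchanged.
           Total bytes of delta: -4627 (-0.05% of base)
clr.test   9817 total methods with Code Size differences (8460 improved, 1357 regressed), 1434 unchanged.
           Total bytes of delta: -52739 (-0.04 % of base)
lib.cross  1515 total methods with Code Size differences (801 improved, 714 regressed), 956 unchanged.
           Total bytes of delta: 3231 (0.01 % of base)
lib.pmi    11671 total methods with Code Size differences (6009 improved, 5662 regressed), 5132 unchanged.
           Total bytes of delta: 67054 (0.13 % of base)
lib.test   36329 total methods with Code Size differences (14817 improved, 21512 regressed), 6302 unchanged.
           Total bytes of delta: 134808 (0.10 % of base)
"""

HEADER = re.compile(
    r"^(\S+)\s+(\d+) total methods with Code Size differences "
    r"\((\d+) improved, (\d+) regressed\), (\d+) unchanged\.$"
)
# Tolerates both "0.17% of base" and "0.10 % of base".
DELTA = re.compile(r"^\s*Total bytes of delta: (-?\d+) \((-?[\d.]+)\s*% of base\)$")


def parse_summary(text):
    """Return one dict per collection, in the order they appear."""
    rows = []
    current = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            current = {
                "collection": m.group(1),
                "methods": int(m.group(2)),
                "improved": int(m.group(3)),
                "regressed": int(m.group(4)),
                "unchanged": int(m.group(5)),
            }
            rows.append(current)
            continue
        m = DELTA.match(line)
        if m and current is not None:
            current["bytes"] = int(m.group(1))
            current["percent"] = float(m.group(2))
    return rows


rows = parse_summary(SUMMARY)
total_bytes = sum(r["bytes"] for r in rows)
total_improved = sum(r["improved"] for r in rows)
total_regressed = sum(r["regressed"] for r in rows)
print(f"{len(rows)} collections, {total_bytes:+d} bytes, "
      f"{total_improved} improved, {total_regressed} regressed")
```

Summed across the six collections, the byte delta is positive, which matches the expectation stated above that the overall change is a code-size regression.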

@AndyAyersMS AndyAyersMS reopened this May 27, 2022
@AndyAyersMS
Member Author

AndyAyersMS commented May 27, 2022

SPMI for arm32 timed out, too many diffs to cope with (looks like it finished running but got killed before summary could upload)

...
[17:29:43] 27461 total methods with Code Size differences (10106 improved, 17355 regressed), 10037 unchanged.
[17:29:43] Elapsed time: 1:22:17.110646
['unix-arm' END OF WORK ITEM LOG: Command timed out, and was killed]

Artifacts for the libraries arm run missing so hard to say why it failed. Am going to retry.

@AndyAyersMS
Member Author

/azp run runtime-coreclr jitstress

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@AndyAyersMS
Member Author

Got an arm artifact from the failure this time, looks like something to investigate:

Assert failure(PID 23 [0x00000017], Thread: 64 [0x0040]): Assertion failed 'remainingSize == 4' in 'Microsoft.CodeAnalysis.CSharp.Formatting.NewLineUserSettingFormattingRule:WithOptions(Microsoft.CodeAnalysis.Diagnostics.AnalyzerConfigOptions):Microsoft.CodeAnalysis.Formatting.Rules.AbstractFormattingRule:this' during 'Generate code' (IL size 31; hash 0x3116fdc4; Tier0)

    File: /__w/1/s/src/coreclr/jit/codegenarmarch.cpp Line: 1119
    Image: /root/helix/work/correlation/dotnet

@BruceForstall
Member

BruceForstall commented May 27, 2022

#60705

Seems to be happening more now

@AndyAyersMS
Member Author

JitStress not happy.

@AndyAyersMS
Member Author

No luck reproing this yet.

@AndyAyersMS
Member Author

My local build is a bit behind main, going to merge up.

@AndyAyersMS
Member Author

I can repro now, but kind of hard to believe this change is responsible.

There are very similar failures in https://dev.azure.com/dnceng/public/_build/results?buildId=1790509&view=ms.vss-test-web.build-test-results-tab which is a few days old.

So looks like jit stress is in a bad way in general. Last green was 9859f70 and the next rolling after that at 0dba0ee is in bad shape.

@AndyAyersMS
Member Author

AndyAyersMS commented May 28, 2022

Suspect this is related to the W^X changes. I can see the JIT is crashing in emitter::emitOutputByte; it seems emitter::emitOutputAlign is not invoking it with the proper offset in its stress clause.

@drieseng
Contributor

@AndyAyersMS Add test from #67318?

@AndyAyersMS
Member Author

I have a follow-on change in the works where I'll add these...

@AndyAyersMS
Member Author

/azp run runtime-coreclr jitstress

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@DrewScoggins
Member

Improvements Arm64-Windows: dotnet/perf-autofiling-issues#5755

@AndyAyersMS
Member Author

AndyAyersMS commented Jun 4, 2022

We knew this PR was going to shake things up a bit ... here's an attempt to collect up all the reports we know of. Note many of these seemingly are duplicates.

Regressions

Improvements

@mrsharm
Member

mrsharm commented Aug 10, 2022

@AndyAyersMS - we found the following regressions that seem to line up with this PR. We detected them in our analysis while creating the perf report for August. Would you consider these regressions "by design", given that there are improvements associated with this PR as noted above?

Regressions:

  • System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "\p{Ll}", Options: NonBacktracking)
  • System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "aei", Options: Compiled)
  • System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "aqj", Options: Compiled)
  • System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "aqj", Options: NonBacktracking)
  • System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "aqj", Options: None)
  • System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "zqj", Options: Compiled)
  • System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "zqj", Options: NonBacktracking)
  • System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "zqj", Options: None)
  • System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, Ordinal, False))
  • System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (, IgnoreCase, False))
  • System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, IgnoreCase, False))
  • System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, IgnoreCase, False))
  • System.Globalization.Tests.StringSearch.IsSuffix_DifferentLastChar(Options: (en-US, Ordinal, False))
  • System.Linq.Tests.Perf_Enumerable.Count(input: IEnumerable)
  • System.IO.Tests.Perf_Path.GetDirectoryName
  • System.Tests.Perf_Double.Parse(value: "1.7976931348623157e+308")
  • System.Tests.Perf_Double.Parse(value: "-1.7976931348623157e+308")
  • System.Tests.Perf_Double.Parse(value: "12345")
  • System.Tests.Perf_Double.TryParse(value: "-1.7976931348623157e+308")
  • System.Tests.Perf_Double.TryParse(value: "1.7976931348623157e+308")
  • System.Tests.Perf_UInt64.TryParse(value: "18446744073709551615")
  • System.Tests.Perf_UInt16.TryParse(value: "12345")
  • System.Tests.Perf_Double.TryParse(value: "12345")
  • System.Tests.Perf_Double.TryParse(value: "1.7976931348623157e+308")

EDIT(s):

  1. Found more regressions related to *IndexOf_Word_NotFound, particularly for the Ubuntu 18.04 x64 configuration:

Ubuntu 18.04 x64 - System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, IgnoreCase, False)):

[image]

  2. System.Linq.Tests.Perf_Enumerable.Count(input: IEnumerable):

[image]

  3. System.IO.Tests.Perf_Path.GetDirectoryName:

[image]

  4. System.Tests.Perf_Double.Parse, System.Tests.Perf_Double.TryParse and System.Tests.Perf_UInt64.TryParse, which all have a similar shape:

[image]

System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "zqj", Options: None)

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Same | 1.00 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.98 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Faster | 1.32 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 0.97 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.02 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.01 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.02 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Same | 1.01 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 1.00 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 1.00 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Same | 1.00 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Same | 1.00 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Same | 1.04 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 0.96 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.93 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Slower | 0.72 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Faster | 1.17 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Slower | 0.89 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.59 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Same | 0.90 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.69 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Slower | 0.83 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.17 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.86 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Same | 0.93 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Same | 0.89 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "aqj", Options: None)

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Same | 1.00 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.98 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Faster | 1.32 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 1.04 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 0.96 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.99 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.97 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Same | 0.99 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Faster | 1.31 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Faster | 1.30 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Same | 0.94 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Faster | 1.45 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Same | 1.00 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 0.94 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.92 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Slower | 0.89 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Faster | 1.18 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Slower | 0.86 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.83 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Slower | 0.86 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.71 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Slower | 0.83 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.48 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.86 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Same | 0.93 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Slower | 0.89 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "aei", Options: Compiled)

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Same | 1.00 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.97 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Faster | 1.25 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 1.00 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 0.99 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.03 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.03 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Same | 1.00 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 1.07 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 1.01 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Faster | 1.14 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Same | 1.01 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Same | 1.04 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 0.95 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.90 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Slower | 0.78 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Faster | 1.26 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Same | 0.94 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.80 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Same | 0.90 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.74 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Slower | 0.78 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.70 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 0.89 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Slower | 0.82 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Slower | 0.89 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "aqj", Options: Compiled)

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Same | 1.00 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.98 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Faster | 1.32 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 1.00 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.02 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.95 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.99 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Same | 1.00 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Faster | 1.33 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 0.99 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Faster | 1.43 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Same | 0.99 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Same | 0.99 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 0.95 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.94 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Slower | 0.72 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Faster | 1.18 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Same | 0.90 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.67 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Same | 0.92 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.71 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Slower | 0.71 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.31 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.88 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Slower | 0.80 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Slower | 0.88 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "aqj", Options: NonBacktracking)

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Same | 1.00 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.98 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Faster | 1.32 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 1.00 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.02 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.00 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.98 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Same | 0.98 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Faster | 1.31 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Faster | 1.29 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Same | 0.92 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Faster | 1.45 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Same | 1.02 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 0.93 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.93 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Slower | 0.72 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Faster | 1.18 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Same | 0.90 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.83 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Same | 1.14 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.69 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Slower | 0.63 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.30 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.88 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Same | 0.93 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Slower | 0.88 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, Ordinal, False))

| Result | Base | Diff | Ratio | Alloc Delta | Operating System | Bit | Processor Name | Modality |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Same | 64.46 | 65.08 | 0.99 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Same | 63.99 | 64.31 | 1.00 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Faster | 29.34 | 22.67 | 1.29 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max | |
| Same | 21.05 | 22.06 | 0.95 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| Same | 22.35 | 20.91 | 1.07 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) | |
| Faster | 24.06 | 20.22 | 1.19 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) | |
| Same | 26.09 | 26.14 | 1.00 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | |
| Same | 16.87 | 16.72 | 1.01 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz | |
| Same | 15.96 | 15.90 | 1.00 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores | |
| Same | 17.50 | 16.76 | 1.04 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X | |
| Same | 11.33 | 11.68 | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X | |
| Same | 12.18 | 12.67 | 0.96 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X | |
| Same | 18.79 | 18.85 | 1.00 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Same | 15.59 | 15.87 | 0.98 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz | |
| Slower | 15.75 | 19.99 | 0.79 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz | |
| Slower | 19.38 | 22.35 | 0.87 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| Same | 61.16 | 60.97 | 1.00 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) | |
| Same | 19.15 | 20.72 | 0.92 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 11.30 | 16.19 | 0.70 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X | |
| Slower | 15.26 | 19.06 | 0.80 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz | |
| Slower | 20.48 | 26.77 | 0.76 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz | several? |
| Slower | 20.04 | 24.96 | 0.80 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) | |
| Slower | 15.96 | 19.29 | 0.83 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores | |
| Slower | 24.83 | 29.23 | 0.85 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) | |
| Slower | 22.58 | 26.93 | 0.84 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | |
| Slower | 24.75 | 29.04 | 0.85 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) | |
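
As a reading aid for the table above, the Ratio column appears to be Base divided by Diff, so values above 1.00 mean the diff run was faster. That is an inference from the printed numbers rather than something the report states. A minimal sketch that checks it against three rows (the `ratio` helper is hypothetical, and the Faster/Slower/Same labels come from the reporting tool and are not reproduced here):

```python
def ratio(base, diff):
    """Base time divided by diff time, rounded to two decimals like the table."""
    return round(base / diff, 2)


# (configuration, Base, Diff, Ratio as printed in the table above)
SAMPLES = [
    ("macOS Monterey 12.3 Arm64 Apple M1 Max", 29.34, 22.67, 1.29),
    ("Windows 11 X64 11th Gen Intel Core i9-11900H", 15.75, 19.99, 0.79),
    ("ubuntu 20.04 X64 AMD Ryzen 9 5900X", 11.30, 16.19, 0.70),
]

for name, base, diff, printed in SAMPLES:
    computed = ratio(base, diff)
    print(f"{name}: computed {computed:.2f}, table {printed:.2f}")
```

Because Base and Diff are themselves rounded in the table, a recomputed ratio for other rows could differ from the printed one in the last digit.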

System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, IgnoreCase, False))

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Same | 0.98 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.99 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Slower | 0.30 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 1.11 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.05 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.93 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.03 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Same | 1.03 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.98 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 1.00 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Same | 1.01 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Same | 0.96 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Same | 1.00 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 1.07 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.94 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Slower | 0.81 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 0.93 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Slower | 0.63 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.82 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Slower | 0.80 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.96 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.00 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.02 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.80 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Slower | 0.82 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Slower | 0.79 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Globalization.Tests.StringSearch.IsSuffix_DifferentLastChar(Options: (en-US, Ordinal, False))

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Same | 0.99 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.96 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 1.03 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 0.98 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 0.97 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.99 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.99 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Same | 0.93 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Same | 0.91 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Same | 1.00 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 1.10 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Noise | - | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Slower | 0.85 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Faster | 1.19 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Slower | 0.73 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.66 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Slower | 0.75 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 1.04 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 0.91 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.01 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.85 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Slower | 0.84 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Slower | 0.84 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Linq.Tests.Perf_Enumerable.Count(input: IEnumerable)

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Same | 1.02 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 1.00 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.99 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 0.93 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Slower | 0.83 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.92 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.88 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Slower | 0.82 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.83 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.85 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Slower | 0.86 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Same | 1.00 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Same | 0.91 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.78 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.84 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Same | 0.95 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.09 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Same | 0.96 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 0.90 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Slower | 0.83 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.86 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.00 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 1.00 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 1.00 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Same | 0.95 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Same | 1.00 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.IO.Tests.Perf_Path.GetDirectoryName

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Faster | 1.22 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.99 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Slower | 0.48 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Same | 0.88 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 0.95 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.89 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.88 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Same | 0.99 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.87 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.88 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Same | 0.96 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Slower | 0.81 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Slower | 0.87 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Slower | 0.88 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.81 | +0 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Same | 0.98 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 0.99 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Same | 1.03 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 1.00 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Same | 0.94 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.84 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Slower | 0.77 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.92 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 1.02 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Same | 1.02 | +0 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Same | 0.99 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

System.Tests.Perf_Double.Parse(value: "1.7976931348623157e+308")

Result Ratio Alloc Delta Operating System Bit Processor Name
Same 0.95 +0 Windows 11 Arm64 Microsoft SQ1 3.0 GHz
Same 0.94 +0 Windows 11 Arm64 Microsoft SQ1 3.0 GHz
Slower 0.79 +0 macOS Monterey 12.3 Arm64 Apple M1 Max
Same 0.91 +0 Windows 10 X64 Intel Xeon CPU E5-1650 v4 3.60GHz
Slower 0.76 +0 Windows 10 X64 Intel Core i7-6700 CPU 3.40GHz (Skylake)
Same 0.93 +0 Windows 10 X64 Intel Core i7-6700 CPU 3.40GHz (Skylake)
Same 0.91 +0 Windows 10 X64 Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R)
Slower 0.77 +0 Windows 10 X64 Intel Core i9-10900K CPU 3.70GHz
Slower 0.82 +0 Windows 11 X64 AMD Ryzen Threadripper PRO 3945WX 12-Cores
Slower 0.75 +0 Windows 11 X64 AMD Ryzen 9 3950X
Slower 0.83 +0 Windows 11 X64 AMD Ryzen 9 5900X
Slower 0.82 +0 Windows 11 X64 AMD Ryzen 9 5950X
Same 0.91 +0 Windows 11 X64 Intel Core i7-8700 CPU 3.20GHz (Coffee Lake)
Slower 0.77 +0 Windows 11 X64 Intel Core i9-10900K CPU 3.70GHz
Slower 0.82 +0 Windows 11 X64 11th Gen Intel Core i9-11900H 2.50GHz
Same 0.94 +0 ubuntu 18.04 X64 Intel Xeon CPU E5-1650 v4 3.60GHz
Same 1.03 +0 ubuntu 18.04 X64 Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge)
Slower 0.89 +0 ubuntu 18.04 X64 Intel Core i7-8700 CPU 3.20GHz (Coffee Lake)
Slower 0.89 +0 ubuntu 20.04 X64 AMD Ryzen 9 5900X
Slower 0.74 +0 ubuntu 20.04 X64 Intel Core i9-10900K CPU 3.70GHz
Same 0.96 +0 Windows 10 X86 Intel Xeon CPU E5-1650 v4 3.60GHz
Slower 0.88 +0 Windows 10 X86 Intel Core i7-6700 CPU 3.40GHz (Skylake)
Slower 0.81 +0 Windows 11 X86 AMD Ryzen Threadripper PRO 3945WX 12-Cores
Same 0.95 +0 macOS Big Sur 11.6.8 X64 Intel Core i5-4278U CPU 2.60GHz (Haswell)
Same 0.94 +0 macOS Monterey 12.3.1 X64 Intel Core i7-5557U CPU 3.10GHz (Broadwell)
Same 0.94 +0 macOS Monterey 12.4 X64 Intel Core i5-4278U CPU 2.60GHz (Haswell)

@AndyAyersMS
Member Author

Would you consider these regressions as "by design" as there are improvements associated with this PR as noted above?

Yes.

This kind of change will always end up causing some regressions.

@stephentoub
Member

stephentoub commented Aug 11, 2022

I don't know what the absolute numbers are, but some of these appear to be huge, e.g.

Slower 0.17 +0 Windows 11 X86 AMD Ryzen Threadripper PRO 3945WX 12-Cores

Am I reading that correctly as a 6x regression? Is that reproducible?
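
The 6x reading can be checked with a little arithmetic. A minimal sketch, assuming the Ratio column is baseline time divided by new time (which matches values below 1 being labeled Slower); `slowdown_factor` is a hypothetical helper name, not part of any perf tooling:

```python
def slowdown_factor(ratio: float) -> float:
    """Convert a perf-report ratio (assumed baseline/new) into a slowdown multiple."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / ratio


# The row quoted above reports a ratio of 0.17.
print(round(slowdown_factor(0.17), 2))  # 5.88, i.e. roughly 6x slower
```

Under the same assumption, a ratio of 0.48 corresponds to a bit more than 2x slower.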

@AndyAyersMS
Member Author

I wonder if that was data from @tannergooding

The perf lab doesn't have any x86+amd data. We have x64+amd (not sure which HW version), and it shows at best a small regression:

amd x64 windows
newplot - 2022-08-11T082049 388

intel x86 windows -- more "typical" regression ~ 25%

newplot - 2022-08-11T082640 495

@dakersnar
Contributor

Another regression caused by this from the perf report, this time for System.Collections.IterateForEach<Int32>.HashSet(Size: 512):

System.Collections.IterateForEach.HashSet(Size: 512)

Result Ratio Operating System Bit Processor Name
Same 0.97 Windows 11 Arm64 Microsoft SQ1 3.0 GHz
Same 0.96 Windows 11 Arm64 Microsoft SQ1 3.0 GHz
Slower 0.47 macOS Monterey 12.3 Arm64 Apple M1 Max
Slower 0.88 Windows 10 X64 Intel Xeon CPU E5-1650 v4 3.60GHz
Slower 0.86 Windows 10 X64 Intel Core i7-6700 CPU 3.40GHz (Skylake)
Same 0.90 Windows 10 X64 Intel Core i7-6700 CPU 3.40GHz (Skylake)
Slower 0.89 Windows 10 X64 Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R)
Slower 0.88 Windows 10 X64 Intel Core i9-10900K CPU 3.70GHz
Same 0.97 Windows 11 X64 AMD Ryzen Threadripper PRO 3945WX 12-Cores
Same 0.98 Windows 11 X64 AMD Ryzen 9 3950X
Slower 0.89 Windows 11 X64 AMD Ryzen 9 5900X
Slower 0.87 Windows 11 X64 AMD Ryzen 9 5950X
Slower 0.88 Windows 11 X64 Intel Core i7-8700 CPU 3.20GHz (Coffee Lake)
Slower 0.89 Windows 11 X64 Intel Core i9-10900K CPU 3.70GHz
Slower 0.83 Windows 11 X64 11th Gen Intel Core i9-11900H 2.50GHz
Slower 0.77 ubuntu 18.04 X64 Intel Xeon CPU E5-1650 v4 3.60GHz
Same 1.02 ubuntu 18.04 X64 Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge)
Slower 0.90 ubuntu 18.04 X64 Intel Core i7-8700 CPU 3.20GHz (Coffee Lake)
Slower 0.89 ubuntu 20.04 X64 AMD Ryzen 9 5900X
Slower 0.88 ubuntu 20.04 X64 Intel Core i9-10900K CPU 3.70GHz
Same 1.00 Windows 10 X86 Intel Xeon CPU E5-1650 v4 3.60GHz
Same 0.99 Windows 10 X86 Intel Core i7-6700 CPU 3.40GHz (Skylake)
Same 1.00 Windows 11 X86 AMD Ryzen Threadripper PRO 3945WX 12-Cores
Slower 0.84 macOS Big Sur 11.6.8 X64 Intel Core i5-4278U CPU 2.60GHz (Haswell)
Slower 0.84 macOS Monterey 12.3.1 X64 Intel Core i7-5557U CPU 3.10GHz (Broadwell)
Slower 0.85 macOS Monterey 12.4 X64 Intel Core i5-4278U CPU 2.60GHz (Haswell)

@tannergooding
Member

I wonder if that was data from @tannergooding

Not this time, my two are (both Windows 11 x64):

  • AMD Ryzen 9 5950X
  • 11th Gen Intel Core i9-11900H 2.50GHz

The reporting AMD Ryzen Threadripper PRO 3945WX 12-Cores machine has wildly differing numbers across many tests. It's not at all what I'd expect, particularly given that it's just 12-core/24-thread and is effectively an AMD Ryzen 9 3900X but with a higher TDP, more PCI-e lanes, and more memory channels. I'm not aware of any Infinity Fabric or cache latency issues like the really high core count 2990WX had.

@mrsharm
Member

mrsharm commented Aug 11, 2022

I wonder if that was data from @tannergooding

As far as I can tell, this data was supplied by @adamsitnik:

image

Reaching out to try to repro locally.

@adamsitnik
Member

I've re-run the Regex benchmarks for x86 on the same machine for .NET 6, .NET 7 Preview 5, and .NET 7 Preview 7.

For this particular case ("zqj" + Options.None) the Mean was:

  • 512,958.50 ns for .NET 6
  • 19,770.31 ns for .NET 7 preview 5
  • 59,506.42 ns for .NET 7 preview 7

Compared to .NET 6, Preview 7 is 8 times faster, but compared to Preview 5 it's 3 times slower.
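
The "8 times" and "3 times" figures follow directly from the three means quoted above. A small sketch of that arithmetic; the dictionary keys and the `times_faster` helper are illustrative names only:

```python
# Mean times in nanoseconds, copied from the list above.
means_ns = {
    ".NET 6": 512_958.50,
    ".NET 7 preview 5": 19_770.31,
    ".NET 7 preview 7": 59_506.42,
}


def times_faster(old_ns: float, new_ns: float) -> float:
    """How many times faster the new measurement is than the old one."""
    return old_ns / new_ns


# .NET 6 -> Preview 7: an improvement.
net6_to_p7 = times_faster(means_ns[".NET 6"], means_ns[".NET 7 preview 7"])
# Preview 5 -> Preview 7: a regression, so report the inverse (times slower).
p5_to_p7_slower = 1.0 / times_faster(
    means_ns[".NET 7 preview 5"], means_ns[".NET 7 preview 7"]
)

print(f"{net6_to_p7:.2f}x faster than .NET 6")         # 8.62x
print(f"{p5_to_p7_slower:.2f}x slower than preview 5")  # 3.01x
```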

BenchmarkDotNet=v0.13.1.1845-nightly, OS=Windows 11 (10.0.22000.856/21H2)
AMD Ryzen Threadripper PRO 3945WX 12-Cores, 1 CPU, 24 logical and 12 physical cores
.NET SDK=6.0.109
  [Host]     : .NET 6.0.8 (6.0.822.36306), X86 RyuJIT AVX2
  Job-NDWKRI : .NET 6.0.8 (6.0.822.36306), X86 RyuJIT AVX2
Method Pattern Options Mean Error StdDev Median Min Max Gen 0 Allocated
Count (?i)Holmes None 754,052.6 ns 8,550.92 ns 6,675.99 ns 753,024.7 ns 744,325.3 ns 765,359.4 ns 10.4167 57910 B
Count (?i)Holmes Compiled 737,189.9 ns 9,014.99 ns 8,432.63 ns 738,131.5 ns 722,725.9 ns 751,870.7 ns 8.5227 57910 B
Count (?i)Sher[a-z]+ Hol[a-z]+ None 6,255,396.7 ns 38,372.43 ns 34,016.16 ns 6,253,234.4 ns 6,204,706.2 ns 6,329,583.3 ns -
Count (?i)Sher[a-z]+ Hol[a-z]+ Compiled 2,296,957.6 ns 25,846.78 ns 21,583.23 ns 2,297,146.4 ns 2,266,178.6 ns 2,326,673.4 ns 10.4167
Count (?i)Sherlock None 478,897.3 ns 1,470.25 ns 1,303.34 ns 478,651.5 ns 476,986.7 ns 481,901.1 ns 1.8939 12649 B
Count (?i)Sherlock Compiled 509,051.5 ns 3,840.10 ns 3,592.03 ns 508,388.7 ns 504,555.8 ns 515,516.1 ns 2.0161 12649 B
Count (?i)Sherlock Holmes None 380,587.6 ns 6,886.80 ns 6,104.96 ns 377,577.9 ns 374,469.4 ns 394,870.4 ns 1.6026 11905 B
Count (?i)Sherlock Holmes Compiled 310,444.9 ns 6,886.11 ns 7,930.06 ns 307,458.9 ns 300,406.7 ns 327,912.0 ns 1.2255 11905 B
Count (?i)Sherlock Holmes Watson None 7,377,067.5 ns 123,508.79 ns 103,135.43 ns 7,352,261.5 ns 7,270,724.0 ns 7,612,353.1 ns
Count (?i)Sherlock Holmes Watson Compiled 2,648,212.9 ns 29,553.20 ns 27,644.08 ns 2,634,536.2 ns 2,619,642.5 ns 2,713,097.5 ns
Count (?i)Sherlock (...)er John Baker [49] None 23,259,363.3 ns 64,653.02 ns 57,313.21 ns 23,268,271.4 ns 23,102,985.7 ns
Count (?i)Sherlock (...)er John Baker [49] Compiled 3,625,237.7 ns 11,794.66 ns 10,455.66 ns 3,621,028.1 ns 3,611,293.8 ns
Count (?i)the None 2,338,722.2 ns 11,413.97 ns 10,676.64 ns 2,339,698.2 ns 2,323,433.9 ns 2,359,912.5 ns 187.5000 990394 B
Count (?i)the Compiled 2,075,730.5 ns 12,768.46 ns 11,318.91 ns 2,078,493.3 ns 2,048,216.1 ns 2,091,713.4 ns 187.5000 990394 B
Count (?m)^Sherlock(...)rlock Holmes$ [37] None 75,058.6 ns 195.47 ns 173.28 ns 74,989.5 ns 74,885.1 ns 75,461.1 ns 0.5981 4216 B
Count (?m)^Sherlock(...)rlock Holmes$ [37] Compiled 47,775.9 ns 2,489.12 ns 2,663.33 ns 46,429.2 ns 45,933.4 ns 54,122.9 ns 0.7396 4216 B
Count (?s).* None 1,424,171.7 ns 16,126.38 ns 13,466.26 ns 1,425,738.8 ns 1,407,753.8 ns 1,445,335.0 ns - 256 B
Count (?s).* Compiled 115.3 ns 1.06 ns 0.88 ns 115.1 ns 114.0 ns 117.1 ns 0.0473 248 B
Count .* None 2,307,370.5 ns 14,163.39 ns 13,248.44 ns 2,303,607.3 ns 2,288,772.9 ns 2,333,109.4 ns 614.5833 3237027 B
Count .* Compiled 1,836,786.3 ns 13,906.08 ns 12,327.38 ns 1,837,340.3 ns 1,820,222.9 ns 1,863,186.8 ns 618.0556 3237025 B
Count Holmes None 410,846.5 ns 697.15 ns 582.15 ns 410,719.2 ns 409,989.5 ns 412,097.0 ns 9.8684 57165 B
Count Holmes Compiled 542,072.5 ns 901.52 ns 703.85 ns 541,791.7 ns 541,294.1 ns 543,711.3 ns 10.7759 57166 B
Count Holmes.{0,25}(...).{0,25}Holmes [39] None 1,255,836.4 ns 23,564.43 ns 19,677.36 ns 1,247,566.3 ns 1,227,308.7 ns 1,296,488.9 ns - 871 B
Count Holmes.{0,25}(...).{0,25}Holmes [39] Compiled 122,396.2 ns 2,319.42 ns 2,169.58 ns 121,956.8 ns 120,000.5 ns 127,622.1 ns - 868 B
Count Sher[a-z]+ Hol[a-z]+ None 1,193,239.9 ns 13,280.03 ns 10,368.17 ns 1,190,771.9 ns 1,182,279.8 ns 1,221,570.7 ns 9.6154
Count Sher[a-z]+ Hol[a-z]+ Compiled 96,148.9 ns 1,884.05 ns 1,762.34 ns 95,533.7 ns 94,249.3 ns 100,192.5 ns 13.7649
Count Sherlock None 255,848.2 ns 2,561.47 ns 2,138.94 ns 255,315.9 ns 253,726.9 ns 261,010.4 ns 2.0161 12029 B
Count Sherlock Compiled 385,333.7 ns 1,415.14 ns 1,254.48 ns 385,459.4 ns 383,172.2 ns 387,239.1 ns 1.6447 12029 B
Count Sherlock Holmes None 196,090.3 ns 318.44 ns 248.61 ns 196,151.4 ns 195,701.3 ns 196,524.5 ns 1.5625 11285 B
Count Sherlock Holmes Compiled 225,889.7 ns 332.66 ns 311.17 ns 225,936.6 ns 225,191.9 ns 226,529.5 ns 1.8116 11285 B
Count Sherlock\s+Holmes None 252,966.5 ns 542.47 ns 507.43 ns 253,074.7 ns 252,337.3 ns 253,720.9 ns 2.0161 12029 B
Count Sherlock\s+Holmes Compiled 383,518.6 ns 2,429.83 ns 2,272.87 ns 382,350.9 ns 381,245.1 ns 388,755.9 ns 1.5244 12029 B
Count Sherlock Holmes None 1,135,940.0 ns 2,081.96 ns 1,947.47 ns 1,135,918.3 ns 1,131,050.4 ns 1,139,510.7 ns 8.9286
Count Sherlock Holmes Compiled 92,451.5 ns 1,497.40 ns 1,327.41 ns 92,343.8 ns 90,716.8 ns 94,574.0 ns 13.1579
Count Sherlock Holmes Watson None 1,217,614.4 ns 3,873.24 ns 3,623.03 ns 1,215,806.5 ns 1,213,037.3 ns 1,223,630.0 ns
Count Sherlock Holmes Watson Compiled 122,251.8 ns 1,088.67 ns 1,018.34 ns 122,570.3 ns 120,466.9 ns 124,030.9 ns
Count Sherlock Holm(...)er John Baker [45] None 2,071,894.9 ns 15,147.57 ns 13,427.93 ns 2,067,807.5 ns 2,052,608.8 ns
Count Sherlock Holm(...)er John Baker [45] Compiled 628,119.3 ns 11,864.50 ns 11,098.06 ns 625,781.0 ns 617,344.5 ns
Count Sherlock Street None 98,758.7 ns 783.06 ns 694.17 ns 98,670.1 ns 97,673.5 ns 100,041.7 ns 3.6526
Count Sherlock Street Compiled 56,883.9 ns 712.46 ns 594.94 ns 56,613.9 ns 56,296.2 ns 58,377.9 ns 3.5971
Count The None 910,171.8 ns 15,270.80 ns 13,537.17 ns 907,817.8 ns 895,176.1 ns 935,730.1 ns 14.7059 91887 B
Count The Compiled 707,564.1 ns 3,349.17 ns 3,132.82 ns 708,556.8 ns 699,982.7 ns 712,019.6 ns 17.0455 91886 B
Count [^\n]* None 2,241,900.1 ns 9,616.83 ns 8,995.59 ns 2,241,119.8 ns 2,224,885.4 ns 2,255,783.3 ns 614.5833 3237027 B
Count [^\n]* Compiled 1,855,034.0 ns 16,799.32 ns 15,714.09 ns 1,852,061.1 ns 1,838,416.7 ns 1,892,242.4 ns 618.0556 3237025 B
Count [a-q][^u-z]{13}x None 28,243,473.3 ns 72,088.14 ns 67,431.29 ns 28,236,616.7 ns 28,122,733.3 ns 28,366,666.7 ns - 17715 B
Count [a-q][^u-z]{13}x Compiled 6,208,659.2 ns 8,894.83 ns 8,320.22 ns 6,210,318.8 ns 6,190,372.9 ns 6,222,572.9 ns - 17622 B
Count [a-zA-Z]+ing None 27,460,775.8 ns 66,660.13 ns 55,664.23 ns 27,466,628.6 ns 27,349,185.7 ns 27,553,314.3 ns - 350268 B
Count [a-zA-Z]+ing Compiled 9,320,517.2 ns 35,355.55 ns 31,341.78 ns 9,328,232.8 ns 9,275,456.2 ns 9,368,743.8 ns 62.5000 350198 B
Count \b\w+n\b None 23,642,685.0 ns 72,834.84 ns 64,566.19 ns 23,649,580.0 ns 23,476,610.0 ns 23,743,930.0 ns 100.0000 1037537 B
Count \b\w+n\b Compiled 14,010,602.7 ns 66,211.82 ns 55,289.87 ns 13,987,635.7 ns 13,938,128.6 ns 14,114,507.1 ns 142.8571 1037430 B
Count \p{Ll} None 37,144,155.8 ns 143,124.02 ns 119,515.04 ns 37,176,175.0 ns 36,955,900.0 ns 37,405,875.0 ns 10250.0000 53689681 B
Count \p{Ll} Compiled 24,425,516.2 ns 88,646.77 ns 82,920.25 ns 24,425,357.1 ns 24,291,785.7 ns 24,565,585.7 ns 10142.8571 53689612 B
Count \p{Lu} None 2,314,689.2 ns 21,513.23 ns 19,070.92 ns 2,316,081.2 ns 2,288,788.8 ns 2,357,571.2 ns 325.0000 1758328 B
Count \p{Lu} Compiled 1,354,781.1 ns 48,762.06 ns 54,198.90 ns 1,342,019.3 ns 1,289,817.2 ns 1,473,167.2 ns 333.3333 1758323 B
Count \p{L} None 36,454,935.7 ns 178,946.16 ns 158,631.11 ns 36,434,562.5 ns 36,270,300.0 ns 36,778,600.0 ns 10500.0000 55448001 B
Count \p{L} Compiled 26,320,703.1 ns 192,794.87 ns 170,907.63 ns 26,291,464.3 ns 26,047,171.4 ns 26,651,957.1 ns 10571.4286 55447932 B
Count \s[a-zA-Z]{0,12}ing\s None 15,334,079.0 ns 305,064.63 ns 254,742.78 ns 15,412,400.0 ns 14,502,272.7 ns 15,493,509.1 ns - 258103 B
Count \s[a-zA-Z]{0,12}ing\s Compiled 6,754,800.2 ns 5,278.49 ns 4,121.10 ns 6,755,254.2 ns 6,745,866.7 ns 6,760,225.0 ns 41.6667 258058 B
Count \w+ None 13,970,175.2 ns 119,150.43 ns 105,623.75 ns 13,951,300.0 ns 13,845,373.3 ns 14,196,253.3 ns 2533.3333 13542579 B
Count \w+ Compiled 8,120,073.6 ns 225,842.98 ns 260,081.12 ns 8,060,685.9 ns 7,830,065.6 ns 8,797,225.0 ns 2562.5000 13542556 B
Count \w+\s+Holmes None 11,612,346.2 ns 69,034.22 ns 64,574.65 ns 11,613,693.8 ns 11,524,993.8 ns 11,735,565.6 ns - 39578 B
Count \w+\s+Holmes Compiled 3,427,146.6 ns 15,167.31 ns 14,187.51 ns 3,422,823.8 ns 3,409,331.2 ns 3,458,786.2 ns - 39565 B
Count \w+\s+Holmes\s+\w+ None 12,175,925.4 ns 36,424.54 ns 32,289.40 ns 12,169,659.4 ns 12,131,246.9 ns 12,240,868.8 ns - 17010 B
Count \w+\s+Holmes\s+\w+ Compiled 3,390,937.9 ns 4,190.32 ns 3,499.11 ns 3,389,681.2 ns 3,386,808.8 ns 3,398,675.0 ns - 16997 B
Count aei None 653,255.6 ns 376.33 ns 314.25 ns 653,357.6 ns 652,639.3 ns 653,790.9 ns - 2 B
Count aei Compiled 677,740.5 ns 6,375.48 ns 5,963.63 ns 676,190.8 ns 667,298.4 ns 688,459.0 ns - 2 B
Count aqj None 524,937.5 ns 1,354.29 ns 1,200.54 ns 524,932.3 ns 523,307.3 ns 527,059.4 ns - 1 B
Count aqj Compiled 873,146.4 ns 1,017.16 ns 901.69 ns 873,112.3 ns 871,718.8 ns 874,519.4 ns - 2 B
Count the None 1,534,109.3 ns 13,902.69 ns 13,004.59 ns 1,527,801.9 ns 1,513,433.8 ns 1,556,267.5 ns 168.7500 895036 B
Count the Compiled 1,541,660.1 ns 11,250.23 ns 9,394.46 ns 1,541,351.9 ns 1,525,743.1 ns 1,559,265.6 ns 168.7500 895036 B
Count the\s+\w+ None 1,912,813.1 ns 12,829.54 ns 12,000.76 ns 1,910,495.3 ns 1,895,882.8 ns 1,932,555.5 ns 125.0000 670845 B
Count the\s+\w+ Compiled 1,537,566.9 ns 29,616.08 ns 27,702.90 ns 1,524,593.4 ns 1,505,029.7 ns 1,584,471.6 ns 125.0000 670844 B
Count zqj None 512,958.5 ns 341.39 ns 266.54 ns 512,967.7 ns 512,380.8 ns 513,520.6 ns - 1 B
Count zqj Compiled 733,671.8 ns 5,018.51 ns 4,190.68 ns 733,996.0 ns 728,043.5 ns 743,370.5 ns - 2 B
BenchmarkDotNet=v0.13.1.1845-nightly, OS=Windows 11 (10.0.22000.856/21H2)
AMD Ryzen Threadripper PRO 3945WX 12-Cores, 1 CPU, 24 logical and 12 physical cores
.NET SDK=7.0.100-preview.5.22276.3
  [Host]     : .NET 7.0.0 (7.0.22.27203), X86 RyuJIT AVX2
  Job-LKHRHJ : .NET 7.0.0 (7.0.22.27203), X86 RyuJIT AVX2
Method Pattern Options Mean Error StdDev Median Min Max Allocated
Count (?i)Holmes None 646,946.08 ns 3,867.702 ns 3,617.851 ns 648,069.27 ns 641,228.91 ns 651,123.96 ns 2 B
Count (?i)Holmes Compiled 430,208.98 ns 1,786.607 ns 1,671.193 ns 430,141.67 ns 427,976.04 ns 432,561.98 ns 1 B
Count (?i)Holmes NonBacktracking 646,694.97 ns 1,638.770 ns 1,532.906 ns 646,206.38 ns 644,638.12 ns 649,005.38 ns 2 B
Count (?i)Sher[a-z]+ Hol[a-z]+ None 4,378,969.93 ns 15,075.222 ns 14,101.372 ns 4,374,923.96 ns 4,359,044.79 ns 4,401,065.62 ns
Count (?i)Sher[a-z]+ Hol[a-z]+ Compiled 1,041,150.86 ns 4,136.001 ns 3,666.458 ns 1,039,328.54 ns 1,036,364.17 ns 1,046,628.75 ns
Count (?i)Sher[a-z]+ Hol[a-z]+ NonBacktracking 1,224,805.50 ns 5,381.831 ns 5,034.168 ns 1,223,244.95 ns 1,219,868.99 ns 1,235,895.43 ns
Count (?i)Sherlock None 120,214.83 ns 216.163 ns 180.506 ns 120,146.95 ns 120,010.02 ns 120,627.10 ns -
Count (?i)Sherlock Compiled 79,875.57 ns 469.463 ns 439.136 ns 79,969.36 ns 79,149.74 ns 80,639.38 ns -
Count (?i)Sherlock NonBacktracking 122,289.22 ns 295.570 ns 276.477 ns 122,299.12 ns 121,685.40 ns 122,699.56 ns -
Count (?i)Sherlock Holmes None 121,166.54 ns 565.185 ns 528.674 ns 121,055.14 ns 120,598.12 ns 122,250.72 ns -
Count (?i)Sherlock Holmes Compiled 80,108.84 ns 984.809 ns 873.007 ns 79,936.89 ns 78,941.53 ns 82,063.69 ns -
Count (?i)Sherlock Holmes NonBacktracking 122,296.93 ns 419.358 ns 392.268 ns 122,066.26 ns 121,933.50 ns 122,959.52 ns -
Count (?i)Sherlock Holmes Watson None 7,411,806.25 ns 24,898.263 ns 23,289.851 ns 7,411,034.38 ns 7,369,271.88 ns
Count (?i)Sherlock Holmes Watson Compiled 1,457,524.92 ns 5,292.173 ns 4,691.374 ns 1,458,263.35 ns 1,451,628.98 ns
Count (?i)Sherlock Holmes Watson NonBacktracking 2,688,159.23 ns 3,197.720 ns 2,834.695 ns 2,687,523.44 ns 2,684,653.12 ns
Count (?i)Sherlock (...)er John Baker [49] None 25,054,462.50 ns 146,748.809 ns 137,268.927 ns 25,043,925.00 ns
Count (?i)Sherlock (...)er John Baker [49] Compiled 2,688,395.38 ns 6,343.531 ns 5,933.742 ns 2,686,054.69 ns
Count (?i)Sherlock (...)er John Baker [49] NonBacktracking 4,004,464.06 ns 4,125.406 ns 3,657.065 ns 4,004,419.35 ns
Count (?i)the None 1,301,173.68 ns 2,773.877 ns 2,594.687 ns 1,299,801.04 ns 1,297,869.27 ns 1,306,241.15 ns 3 B
Count (?i)the Compiled 817,298.59 ns 1,624.212 ns 1,439.822 ns 817,223.44 ns 813,642.81 ns 820,120.31 ns 2 B
Count (?i)the NonBacktracking 1,427,193.88 ns 7,131.521 ns 6,670.829 ns 1,426,678.69 ns 1,417,481.53 ns 1,440,190.62 ns 4 B
Count (?m)^Sherlock(...)rlock Holmes$ [37] None 46,861.94 ns 1,474.091 ns 1,577.261 ns 46,085.94 ns 45,315.81 ns 50,157.77 ns -
Count (?m)^Sherlock(...)rlock Holmes$ [37] Compiled 31,912.70 ns 120.225 ns 93.864 ns 31,878.78 ns 31,836.72 ns 32,150.60 ns -
Count (?m)^Sherlock(...)rlock Holmes$ [37] NonBacktracking 53,825.51 ns 46.922 ns 41.595 ns 53,810.70 ns 53,774.48 ns 53,910.29 ns -
Count (?s).* None 1,291,129.46 ns 77,167.437 ns 88,866.135 ns 1,288,893.47 ns 1,180,896.59 ns 1,403,653.41 ns 4 B
Count (?s).* Compiled 62.97 ns 0.174 ns 0.145 ns 62.94 ns 62.80 ns 63.26 ns -
Count (?s).* NonBacktracking 6,820,515.81 ns 25,445.756 ns 21,248.359 ns 6,818,205.56 ns 6,785,602.78 ns 6,858,294.44 ns 18 B
Count .* None 1,626,390.62 ns 3,235.473 ns 2,701.767 ns 1,627,285.94 ns 1,621,200.78 ns 1,630,686.72 ns 5 B
Count .* Compiled 1,017,730.81 ns 4,583.084 ns 4,287.020 ns 1,018,182.03 ns 1,010,559.38 ns 1,025,371.48 ns 3 B
Count .* NonBacktracking 8,664,434.07 ns 22,457.024 ns 18,752.632 ns 8,660,080.95 ns 8,642,652.38 ns 8,697,076.19 ns 31 B
Count Holmes None 63,372.29 ns 751.695 ns 627.700 ns 63,260.33 ns 62,736.57 ns 64,531.75 ns -
Count Holmes Compiled 56,339.37 ns 343.970 ns 321.750 ns 56,153.45 ns 56,025.81 ns 56,933.71 ns -
Count Holmes NonBacktracking 80,862.78 ns 398.767 ns 373.007 ns 80,788.81 ns 80,410.90 ns 81,488.53 ns -
Count Holmes.{0,25}(...).{0,25}Holmes [39] None 343,834.81 ns 2,254.715 ns 2,109.062 ns 343,935.33 ns 339,772.55 ns 346,704.21 ns 1 B
Count Holmes.{0,25}(...).{0,25}Holmes [39] Compiled 72,381.25 ns 534.150 ns 473.510 ns 72,323.65 ns 71,501.53 ns 73,494.23 ns -
Count Holmes.{0,25}(...).{0,25}Holmes [39] NonBacktracking 151,278.39 ns 888.762 ns 831.349 ns 151,180.47 ns 150,120.67 ns 152,781.73 ns -
Count Sher[a-z]+ Hol[a-z]+ None 202,882.54 ns 736.313 ns 614.855 ns 202,850.48 ns 202,249.60 ns 204,378.69 ns
Count Sher[a-z]+ Hol[a-z]+ Compiled 79,896.31 ns 419.085 ns 392.013 ns 79,812.76 ns 79,393.14 ns 80,634.46 ns
Count Sher[a-z]+ Hol[a-z]+ NonBacktracking 170,335.81 ns 607.677 ns 568.422 ns 170,330.24 ns 169,194.76 ns 171,261.16 ns
Count Sherlock None 34,512.49 ns 126.855 ns 112.454 ns 34,546.51 ns 34,270.96 ns 34,690.53 ns -
Count Sherlock Compiled 40,791.99 ns 561.604 ns 497.847 ns 40,632.91 ns 40,267.13 ns 41,698.28 ns -
Count Sherlock NonBacktracking 40,088.45 ns 626.785 ns 523.394 ns 39,883.84 ns 39,534.43 ns 41,292.63 ns -
Count Sherlock Holmes None 35,333.48 ns 112.533 ns 93.970 ns 35,305.33 ns 35,218.27 ns 35,564.56 ns -
Count Sherlock Holmes Compiled 41,287.87 ns 821.219 ns 806.547 ns 40,907.35 ns 40,453.77 ns 42,733.12 ns -
Count Sherlock Holmes NonBacktracking 43,932.22 ns 824.338 ns 846.535 ns 43,655.57 ns 42,931.18 ns 45,529.74 ns -
Count Sherlock\s+Holmes None 45,760.38 ns 234.942 ns 208.270 ns 45,751.99 ns 45,487.17 ns 46,167.86 ns -
Count Sherlock\s+Holmes Compiled 32,567.37 ns 489.021 ns 433.505 ns 32,359.70 ns 32,202.59 ns 33,452.54 ns -
Count Sherlock\s+Holmes NonBacktracking 53,985.46 ns 1,623.421 ns 1,737.042 ns 53,600.85 ns 52,399.81 ns 57,223.62 ns -
Count Sherlock Holmes None 192,959.61 ns 2,346.615 ns 1,832.083 ns 192,847.78 ns 190,563.05 ns 195,515.19 ns
Count Sherlock Holmes Compiled 75,870.62 ns 375.906 ns 313.898 ns 75,949.54 ns 75,414.17 ns 76,423.91 ns
Count Sherlock Holmes NonBacktracking 159,808.71 ns 964.267 ns 752.836 ns 159,993.78 ns 158,522.96 ns 160,948.34 ns
Count Sherlock Holmes Watson None 297,645.47 ns 6,950.681 ns 7,137.838 ns 295,717.27 ns 289,962.84 ns
Count Sherlock Holmes Watson Compiled 108,032.71 ns 1,524.985 ns 1,426.472 ns 107,313.87 ns 106,452.10 ns
Count Sherlock Holmes Watson NonBacktracking 221,386.36 ns 10,608.780 ns 12,217.087 ns 218,307.73 ns 206,925.08 ns
Count Sherlock Holm(...)er John Baker [45] None 2,498,383.30 ns 106,417.950 ns 122,551.069 ns 2,494,182.59 ns
Count Sherlock Holm(...)er John Baker [45] Compiled 2,326,725.45 ns 50,548.835 ns 58,212.113 ns 2,320,651.34 ns
Count Sherlock Holm(...)er John Baker [45] NonBacktracking 3,040,149.82 ns 111,615.816 ns 128,536.940 ns 3,044,579.41 ns
Count Sherlock Street None 97,991.29 ns 3,479.661 ns 3,723.198 ns 97,120.77 ns 94,660.16 ns 107,899.07 ns
Count Sherlock Street Compiled 45,965.71 ns 2,157.325 ns 2,484.379 ns 44,934.34 ns 43,248.63 ns 51,043.13 ns
Count Sherlock Street NonBacktracking 71,697.68 ns 523.836 ns 489.997 ns 71,673.09 ns 71,011.95 ns 72,794.91 ns
Count The None 80,579.11 ns 109.332 ns 96.920 ns 80,586.57 ns 80,451.45 ns 80,752.48 ns -
Count The Compiled 69,958.12 ns 2,149.200 ns 2,475.022 ns 69,751.17 ns 66,456.68 ns 74,210.51 ns -
Count The NonBacktracking 102,790.78 ns 1,353.020 ns 1,199.417 ns 102,611.51 ns 101,399.92 ns 105,849.72 ns -
Count [^\n]* None 1,697,820.39 ns 25,593.561 ns 23,940.233 ns 1,685,109.77 ns 1,671,973.83 ns 1,750,891.02 ns 5 B
Count [^\n]* Compiled 1,065,630.76 ns 35,084.170 ns 38,995.962 ns 1,054,713.96 ns 1,020,881.88 ns 1,155,991.46 ns 3 B
Count [^\n]* NonBacktracking 8,761,397.14 ns 68,787.421 ns 64,343.796 ns 8,757,475.00 ns 8,627,071.43 ns 8,860,625.00 ns 23 B
Count [a-q][^u-z]{13}x None 60,877.28 ns 392.223 ns 327.524 ns 60,809.64 ns 60,342.56 ns 61,514.20 ns -
Count [a-q][^u-z]{13}x Compiled 37,179.32 ns 210.435 ns 196.841 ns 37,096.70 ns 36,973.02 ns 37,632.77 ns -
Count [a-q][^u-z]{13}x NonBacktracking 77,278.60 ns 1,950.748 ns 2,246.484 ns 76,282.45 ns 75,242.79 ns 83,176.93 ns -
Count [a-zA-Z]+ing None 13,081,931.47 ns 227,989.484 ns 202,106.739 ns 13,020,460.94 ns 12,899,087.50 ns 13,579,800.00 ns 20 B
Count [a-zA-Z]+ing Compiled 4,090,437.01 ns 80,753.838 ns 79,311.067 ns 4,104,550.78 ns 3,915,623.44 ns 4,248,964.06 ns 10 B
Count [a-zA-Z]+ing NonBacktracking 6,285,562.88 ns 38,411.802 ns 29,989.408 ns 6,281,562.12 ns 6,226,424.24 ns 6,334,981.82 ns 20 B
Count \b\w+n\b None 26,173,396.43 ns 416,161.316 ns 368,916.167 ns 26,194,090.00 ns 25,733,840.00 ns 26,933,390.00 ns 65 B
Count \b\w+n\b Compiled 9,790,192.92 ns 183,615.221 ns 153,327.025 ns 9,758,700.00 ns 9,650,908.00 ns 10,181,276.00 ns 26 B
Count \b\w+n\b NonBacktracking 9,180,711.72 ns 105,735.091 ns 88,293.590 ns 9,154,866.67 ns 9,113,252.38 ns 9,387,990.48 ns 31 B
Count \p{Ll} None 23,437,684.52 ns 160,826.198 ns 125,562.517 ns 23,426,121.43 ns 23,277,628.57 ns 23,686,942.86 ns 93 B
Count \p{Ll} Compiled 13,805,341.43 ns 69,150.426 ns 61,300.052 ns 13,763,286.67 ns 13,748,540.00 ns 13,910,840.00 ns 43 B
Count \p{Ll} NonBacktracking 32,941,116.25 ns 694,417.120 ns 799,691.788 ns 32,472,037.50 ns 32,211,400.00 ns 34,415,950.00 ns 163 B
Count \p{Lu} None 1,840,413.10 ns 31,127.712 ns 29,116.881 ns 1,840,221.48 ns 1,805,855.86 ns 1,883,094.92 ns 5 B
Count \p{Lu} Compiled 1,267,890.59 ns 268,226.165 ns 308,889.651 ns 1,240,147.98 ns 940,860.66 ns 1,698,140.44 ns 2 B
Count \p{Lu} NonBacktracking 2,299,865.69 ns 19,420.555 ns 17,215.816 ns 2,301,878.57 ns 2,266,858.04 ns 2,322,925.89 ns 6 B
Count \p{L} None 24,461,938.89 ns 483,541.832 ns 474,902.734 ns 24,338,272.22 ns 24,020,077.78 ns 25,838,922.22 ns 72 B
Count \p{L} Compiled 14,171,134.69 ns 98,394.603 ns 87,224.252 ns 14,130,964.29 ns 14,082,021.43 ns 14,365,657.14 ns 47 B
Count \p{L} NonBacktracking 33,591,002.50 ns 1,147,391.032 ns 1,321,337.219 ns 33,395,912.50 ns 32,076,975.00 ns 36,342,400.00 ns 163 B
Count \s[a-zA-Z]{0,12}ing\s None 15,975,055.38 ns 130,410.360 ns 121,985.932 ns 16,031,753.85 ns 15,733,923.08 ns 16,125,715.38 ns 50 B
Count \s[a-zA-Z]{0,12}ing\s Compiled 4,663,690.43 ns 89,011.730 ns 87,421.421 ns 4,635,842.97 ns 4,565,529.69 ns 4,902,931.25 ns 10 B
Count \s[a-zA-Z]{0,12}ing\s NonBacktracking 3,653,747.13 ns 40,876.681 ns 34,133.880 ns 3,655,028.57 ns 3,610,180.95 ns 3,726,588.89 ns 10 B
Count \w+ None 9,320,812.62 ns 102,584.938 ns 85,663.070 ns 9,367,857.81 ns 9,120,895.31 ns 9,405,673.44 ns 20 B
Count \w+ Compiled 6,072,624.68 ns 61,290.833 ns 51,180.622 ns 6,056,775.00 ns 6,018,793.75 ns 6,189,152.08 ns 14 B
Count \w+ NonBacktracking 17,113,112.43 ns 201,186.443 ns 167,999.792 ns 17,052,515.38 ns 16,860,030.77 ns 17,461,915.38 ns 50 B
Count \w+\s+Holmes None 9,182,075.22 ns 176,080.911 ns 156,091.141 ns 9,161,896.88 ns 9,008,753.12 ns 9,576,321.88 ns 20 B
Count \w+\s+Holmes Compiled 3,765,560.78 ns 56,342.471 ns 57,859.570 ns 3,769,866.67 ns 3,690,047.92 ns 3,902,016.67 ns 14 B
Count \w+\s+Holmes NonBacktracking 3,286,613.66 ns 33,689.535 ns 29,864.895 ns 3,281,168.75 ns 3,238,023.75 ns 3,335,206.25 ns 8 B
Count \w+\s+Holmes\s+\w+ None 9,848,147.20 ns 420,111.755 ns 466,953.103 ns 9,671,090.62 ns 9,319,512.50 ns 10,859,328.12 ns 20 B
Count \w+\s+Holmes\s+\w+ Compiled 3,740,501.25 ns 17,558.187 ns 16,423.939 ns 3,738,806.25 ns 3,709,787.50 ns 3,771,356.25 ns 14 B
Count \w+\s+Holmes\s+\w+ NonBacktracking 3,428,630.25 ns 121,849.477 ns 140,322.039 ns 3,388,320.00 ns 3,256,893.75 ns 3,679,370.00 ns 8 B
Count aei None 49,517.30 ns 573.698 ns 536.637 ns 49,628.68 ns 48,729.05 ns 50,615.87 ns -
Count aei Compiled 47,077.01 ns 2,326.809 ns 2,679.557 ns 47,065.28 ns 42,822.84 ns 51,147.65 ns -
Count aei NonBacktracking 43,601.68 ns 1,695.002 ns 1,951.967 ns 43,122.35 ns 41,020.60 ns 47,268.47 ns -
Count aqj None 27,075.59 ns 122.318 ns 108.432 ns 27,072.68 ns 26,883.43 ns 27,306.27 ns -
Count aqj Compiled 35,466.61 ns 111.285 ns 104.096 ns 35,452.05 ns 35,296.95 ns 35,670.58 ns -
Count aqj NonBacktracking 28,300.61 ns 1,104.688 ns 1,182.004 ns 27,996.04 ns 27,059.61 ns 31,220.68 ns -
Count the None 595,360.48 ns 2,001.653 ns 1,872.347 ns 595,607.18 ns 591,411.34 ns 597,613.19 ns 2 B
Count the Compiled 389,823.09 ns 1,175.979 ns 918.127 ns 389,937.80 ns 387,896.34 ns 390,941.92 ns 1 B
Count the NonBacktracking 753,336.19 ns 1,178.208 ns 1,102.096 ns 753,452.98 ns 751,694.05 ns 754,796.43 ns 2 B
Count the\s+\w+ None 884,488.08 ns 3,496.388 ns 3,270.523 ns 884,077.08 ns 880,477.50 ns 890,084.17 ns 3 B
Count the\s+\w+ Compiled 553,158.59 ns 582.333 ns 516.223 ns 553,278.24 ns 552,165.85 ns 553,758.04 ns 1 B
Count the\s+\w+ NonBacktracking 1,381,084.83 ns 1,792.035 ns 1,676.270 ns 1,380,364.38 ns 1,378,920.00 ns 1,383,577.50 ns 4 B
Count zqj None 19,770.31 ns 57.805 ns 51.242 ns 19,769.80 ns 19,663.57 ns 19,852.86 ns -
Count zqj Compiled 19,819.53 ns 40.767 ns 38.134 ns 19,835.13 ns 19,751.46 ns 19,871.93 ns -
Count zqj NonBacktracking 19,776.54 ns 44.960 ns 42.055 ns 19,799.35 ns 19,711.39 ns 19,822.75 ns -
BenchmarkDotNet=v0.13.1.1845-nightly, OS=Windows 11 (10.0.22000.856/21H2)
AMD Ryzen Threadripper PRO 3945WX 12-Cores, 1 CPU, 24 logical and 12 physical cores
.NET SDK=7.0.100-preview.7.22370.3
  [Host]     : .NET 7.0.0 (7.0.22.36904), X86 RyuJIT AVX2
  Job-USYZGL : .NET 7.0.0 (7.0.22.36904), X86 RyuJIT AVX2
Method Pattern Options Mean Error StdDev Median Min Max Allocated
Count (?i)Holmes None 741,321.97 ns 2,279.055 ns 2,131.829 ns 741,051.70 ns 737,929.26 ns 745,145.17 ns 2 B
Count (?i)Holmes Compiled 426,563.54 ns 3,720.293 ns 3,479.965 ns 426,531.08 ns 421,231.25 ns 431,254.22 ns 1 B
Count (?i)Holmes NonBacktracking 825,943.54 ns 1,880.510 ns 1,759.030 ns 826,249.17 ns 822,250.50 ns 828,635.31 ns 2 B
Count (?i)Sher[a-z]+ Hol[a-z]+ None 4,279,106.94 ns 24,429.468 ns 22,851.339 ns 4,275,406.25 ns 4,251,943.75 ns 4,324,056.25 ns
Count (?i)Sher[a-z]+ Hol[a-z]+ Compiled 953,636.62 ns 2,924.768 ns 2,735.829 ns 953,916.54 ns 949,325.74 ns 957,706.99 ns
Count (?i)Sher[a-z]+ Hol[a-z]+ NonBacktracking 1,397,789.50 ns 4,109.367 ns 3,843.905 ns 1,396,670.95 ns 1,392,957.54 ns 1,404,191.62 ns
Count (?i)Sherlock None 131,753.08 ns 1,068.638 ns 999.605 ns 131,378.99 ns 130,669.49 ns 133,709.35 ns -
Count (?i)Sherlock Compiled 78,770.19 ns 317.231 ns 296.738 ns 78,759.38 ns 78,242.99 ns 79,353.12 ns -
Count (?i)Sherlock NonBacktracking 131,892.38 ns 747.601 ns 699.306 ns 132,122.37 ns 130,647.00 ns 132,846.37 ns -
Count (?i)Sherlock Holmes None 129,969.60 ns 418.149 ns 391.137 ns 130,086.15 ns 129,421.41 ns 130,568.96 ns -
Count (?i)Sherlock Holmes Compiled 79,172.17 ns 520.264 ns 461.200 ns 79,220.66 ns 78,170.45 ns 79,785.86 ns -
Count (?i)Sherlock Holmes NonBacktracking 130,810.55 ns 502.617 ns 470.148 ns 130,765.02 ns 130,209.91 ns 131,684.25 ns -
Count (?i)Sherlock Holmes Watson None 6,415,992.92 ns 33,353.437 ns 31,198.826 ns 6,401,908.33 ns 6,375,960.42 ns
Count (?i)Sherlock Holmes Watson Compiled 1,472,941.86 ns 13,827.898 ns 12,934.624 ns 1,470,821.59 ns 1,457,102.27 ns
Count (?i)Sherlock Holmes Watson NonBacktracking 2,908,484.59 ns 13,909.460 ns 12,330.374 ns 2,909,225.66 ns 2,884,773.68 ns
Count (?i)Sherlock (...)er John Baker [49] None 21,653,801.79 ns 87,626.443 ns 77,678.559 ns 21,660,962.50 ns
Count (?i)Sherlock (...)er John Baker [49] Compiled 2,590,903.40 ns 9,207.718 ns 8,612.905 ns 2,587,761.46 ns
Count (?i)Sherlock (...)er John Baker [49] NonBacktracking 4,504,339.80 ns 5,031.831 ns 4,460.588 ns 4,505,100.00 ns
Count (?i)the None 1,551,821.92 ns 1,832.605 ns 1,624.557 ns 1,551,544.79 ns 1,550,020.14 ns 1,555,325.69 ns 5 B
Count (?i)the Compiled 775,506.96 ns 818.147 ns 765.295 ns 775,528.27 ns 773,405.06 ns 776,426.19 ns 2 B
Count (?i)the NonBacktracking 1,806,590.27 ns 2,552.914 ns 2,387.997 ns 1,806,682.48 ns 1,801,436.50 ns 1,811,562.04 ns 5 B
Count (?m)^Sherlock(...)rlock Holmes$ [37] None 46,989.01 ns 225.737 ns 211.154 ns 47,005.71 ns 46,640.33 ns 47,258.73 ns -
Count (?m)^Sherlock(...)rlock Holmes$ [37] Compiled 41,157.66 ns 223.304 ns 208.879 ns 41,175.61 ns 40,853.56 ns 41,425.12 ns -
Count (?m)^Sherlock(...)rlock Holmes$ [37] NonBacktracking 64,344.20 ns 11,361.152 ns 13,083.519 ns 53,494.51 ns 52,559.67 ns 79,393.56 ns -
Count (?s).* None 1,128,344.34 ns 2,627.509 ns 2,194.089 ns 1,128,822.22 ns 1,125,234.03 ns 1,132,725.69 ns 5 B
Count (?s).* Compiled 63.00 ns 0.184 ns 0.172 ns 62.98 ns 62.69 ns 63.28 ns -
Count (?s).* NonBacktracking 4,463,599.62 ns 18,065.352 ns 16,014.464 ns 4,461,514.29 ns 4,439,576.79 ns 4,491,780.36 ns 12 B
Count .* None 1,544,370.19 ns 5,324.080 ns 4,980.147 ns 1,545,652.78 ns 1,536,868.75 ns 1,550,210.42 ns 5 B
Count .* Compiled 880,479.19 ns 4,778.296 ns 4,235.835 ns 879,598.78 ns 874,610.42 ns 889,532.64 ns 2 B
Count .* NonBacktracking 6,730,591.67 ns 19,576.168 ns 18,311.559 ns 6,733,846.88 ns 6,695,634.38 ns 6,752,771.88 ns 20 B
Count Holmes None 85,025.16 ns 10,289.669 ns 11,849.598 ns 84,132.75 ns 71,491.15 ns 100,281.13 ns -
Count Holmes Compiled 58,763.59 ns 370.257 ns 309.181 ns 58,833.72 ns 58,164.44 ns 59,305.55 ns -
Count Holmes NonBacktracking 118,438.18 ns 1,843.344 ns 1,724.265 ns 119,353.90 ns 113,845.47 ns 119,701.21 ns -
Count Holmes.{0,25}(...).{0,25}Holmes [39] None 270,321.58 ns 780.023 ns 729.634 ns 270,428.81 ns 268,606.89 ns 271,357.52 ns 1 B
Count Holmes.{0,25}(...).{0,25}Holmes [39] Compiled 70,500.12 ns 530.281 ns 470.080 ns 70,608.14 ns 69,691.44 ns 71,078.41 ns -
Count Holmes.{0,25}(...).{0,25}Holmes [39] NonBacktracking 142,449.76 ns 523.469 ns 408.690 ns 142,658.36 ns 141,737.33 ns 142,857.53 ns -
Count Sher[a-z]+ Hol[a-z]+ None 217,141.51 ns 783.807 ns 694.824 ns 216,976.80 ns 216,140.67 ns 218,611.04 ns
Count Sher[a-z]+ Hol[a-z]+ Compiled 78,780.65 ns 420.480 ns 393.317 ns 78,640.07 ns 78,256.82 ns 79,438.71 ns
Count Sher[a-z]+ Hol[a-z]+ NonBacktracking 176,068.38 ns 731.072 ns 683.845 ns 176,057.44 ns 175,064.68 ns 177,279.21 ns
Count Sherlock None 44,443.82 ns 212.009 ns 198.314 ns 44,488.77 ns 43,993.18 ns 44,621.46 ns -
Count Sherlock Compiled 41,524.64 ns 793.277 ns 662.422 ns 41,399.04 ns 40,861.47 ns 43,563.25 ns -
Count Sherlock NonBacktracking 69,466.36 ns 6,347.151 ns 7,309.388 ns 74,418.01 ns 53,733.55 ns 76,643.10 ns -
Count Sherlock Holmes None 61,680.07 ns 10,077.161 ns 11,604.873 ns 70,127.94 ns 45,430.78 ns 71,589.36 ns -
Count Sherlock Holmes Compiled 41,506.39 ns 273.825 ns 213.784 ns 41,542.18 ns 41,098.86 ns 41,718.85 ns -
Count Sherlock Holmes NonBacktracking 51,287.50 ns 310.806 ns 290.728 ns 51,175.61 ns 50,839.30 ns 51,927.01 ns -
Count Sherlock\s+Holmes None 64,831.03 ns 9,499.629 ns 10,939.787 ns 70,929.13 ns 48,272.46 ns 74,997.46 ns -
Count Sherlock\s+Holmes Compiled 42,478.63 ns 118.535 ns 110.877 ns 42,468.07 ns 42,241.39 ns 42,649.62 ns -
Count Sherlock\s+Holmes NonBacktracking 78,911.53 ns 7,413.543 ns 8,537.447 ns 82,914.42 ns 54,058.62 ns 84,964.29 ns -
Count Sherlock Holmes None 198,358.55 ns 447.026 ns 418.148 ns 198,336.87 ns 197,806.17 ns 199,298.18 ns
Count Sherlock Holmes Compiled 75,139.30 ns 525.243 ns 491.312 ns 75,203.41 ns 74,314.77 ns 76,055.95 ns
Count Sherlock Holmes NonBacktracking 146,348.85 ns 497.559 ns 441.073 ns 146,263.26 ns 145,890.83 ns 147,413.96 ns
Count Sherlock Holmes Watson None 283,500.93 ns 729.597 ns 609.246 ns 283,603.41 ns 282,352.95 ns
Count Sherlock Holmes Watson Compiled 96,065.24 ns 281.901 ns 249.898 ns 96,036.43 ns 95,731.82 ns
Count Sherlock Holmes Watson NonBacktracking 188,843.53 ns 422.868 ns 353.114 ns 188,717.85 ns 188,407.38 ns
Count Sherlock Holm(...)er John Baker [45] None 2,040,253.44 ns 126,919.499 ns 146,160.684 ns 2,033,977.43 ns
Count Sherlock Holm(...)er John Baker [45] Compiled 2,098,517.45 ns 8,521.910 ns 7,971.400 ns 2,096,242.97 ns
Count Sherlock Holm(...)er John Baker [45] NonBacktracking 3,233,434.34 ns 10,504.980 ns 9,312.392 ns 3,236,646.15 ns
Count Sherlock Street None 76,744.42 ns 445.326 ns 416.558 ns 76,708.84 ns 76,112.62 ns 77,491.10 ns
Count Sherlock Street Compiled 37,605.29 ns 397.772 ns 372.076 ns 37,548.20 ns 37,051.87 ns 38,290.07 ns
Count Sherlock Street NonBacktracking 67,275.87 ns 310.667 ns 290.599 ns 67,160.30 ns 66,697.08 ns 67,804.69 ns
Count The None 106,844.21 ns 10,370.004 ns 11,942.112 ns 115,062.57 ns 86,965.68 ns 117,204.52 ns -
Count The Compiled 69,350.34 ns 193.595 ns 181.089 ns 69,389.08 ns 69,092.80 ns 69,678.48 ns -
Count The NonBacktracking 139,221.51 ns 2,782.555 ns 3,092.802 ns 140,340.42 ns 133,371.05 ns 142,713.20 ns -
Count [^\n]* None 1,645,870.14 ns 7,479.366 ns 6,996.203 ns 1,646,688.19 ns 1,636,769.44 ns 1,657,963.19 ns 5 B
Count [^\n]* Compiled 886,535.32 ns 5,983.866 ns 5,597.312 ns 886,417.71 ns 879,584.38 ns 896,233.33 ns 2 B
Count [^\n]* NonBacktracking 6,576,847.68 ns 28,324.348 ns 26,494.612 ns 6,581,639.39 ns 6,530,966.67 ns 6,613,163.64 ns 20 B
Count [a-q][^u-z]{13}x None 61,406.99 ns 253.170 ns 224.428 ns 61,438.48 ns 60,972.44 ns 61,759.99 ns -
Count [a-q][^u-z]{13}x Compiled 32,901.34 ns 127.085 ns 112.658 ns 32,886.72 ns 32,774.87 ns 33,123.13 ns -
Count [a-q][^u-z]{13}x NonBacktracking 65,297.14 ns 354.743 ns 331.827 ns 65,333.69 ns 64,649.02 ns 65,733.51 ns -
Count [a-zA-Z]+ing None 12,612,592.71 ns 82,227.728 ns 76,915.868 ns 12,625,771.88 ns 12,496,171.88 ns 12,750,781.25 ns 20 B
Count [a-zA-Z]+ing Compiled 3,739,264.02 ns 10,163.774 ns 9,009.921 ns 3,739,509.38 ns 3,723,246.88 ns 3,755,883.12 ns 8 B
Count [a-zA-Z]+ing NonBacktracking 5,051,221.83 ns 33,967.913 ns 31,773.606 ns 5,052,695.00 ns 4,998,330.00 ns 5,107,302.50 ns 16 B
Count \b\w+n\b None 27,807,980.95 ns 134,807.991 ns 119,503.772 ns 27,837,438.89 ns 27,494,544.44 ns 27,980,122.22 ns 72 B
Count \b\w+n\b Compiled 9,440,941.67 ns 19,744.721 ns 15,415.379 ns 9,444,880.00 ns 9,401,606.67 ns 9,455,040.00 ns 43 B
Count \b\w+n\b NonBacktracking 6,953,168.33 ns 45,320.347 ns 42,392.681 ns 6,961,718.75 ns 6,877,478.12 ns 7,006,471.88 ns 20 B
Count \p{Ll} None 28,127,407.86 ns 759,823.644 ns 875,014.038 ns 28,501,192.86 ns 26,223,642.86 ns 28,737,114.29 ns 93 B
Count \p{Ll} Compiled 13,321,410.71 ns 81,390.662 ns 72,150.702 ns 13,347,462.50 ns 13,167,200.00 ns 13,413,100.00 ns 41 B
Count \p{Ll} NonBacktracking 39,419,346.25 ns 801,683.581 ns 923,220.004 ns 39,830,937.50 ns 37,700,000.00 ns 40,786,225.00 ns 163 B
Count \p{Lu} None 2,023,986.48 ns 92,251.105 ns 106,236.509 ns 1,969,701.17 ns 1,925,703.12 ns 2,231,607.03 ns 5 B
Count \p{Lu} Compiled 1,315,508.80 ns 271,856.433 ns 313,070.273 ns 1,513,709.05 ns 844,541.78 ns 1,573,706.25 ns 2 B
Count \p{Lu} NonBacktracking 2,221,876.11 ns 43,444.586 ns 44,614.391 ns 2,208,992.98 ns 2,177,269.30 ns 2,331,979.82 ns 6 B
Count \p{L} None 25,654,160.71 ns 68,741.539 ns 53,668.872 ns 25,669,200.00 ns 25,544,957.14 ns 25,726,928.57 ns 93 B
Count \p{L} Compiled 18,014,849.78 ns 200,464.710 ns 187,514.814 ns 18,043,300.00 ns 17,415,020.00 ns 18,219,566.67 ns 43 B
Count \p{L} NonBacktracking 38,934,993.33 ns 652,844.754 ns 610,671.388 ns 39,364,225.00 ns 38,141,225.00 ns 39,496,325.00 ns 163 B
Count \s[a-zA-Z]{0,12}ing\s None 12,208,412.18 ns 11,155.275 ns 8,709.305 ns 12,207,561.54 ns 12,195,753.85 ns 12,223,115.38 ns 50 B
Count \s[a-zA-Z]{0,12}ing\s Compiled 4,419,674.90 ns 17,145.051 ns 16,037.491 ns 4,421,226.56 ns 4,382,335.94 ns 4,441,600.00 ns 10 B
Count \s[a-zA-Z]{0,12}ing\s NonBacktracking 2,796,513.55 ns 12,937.848 ns 11,469.066 ns 2,799,361.54 ns 2,775,414.10 ns 2,812,706.41 ns 8 B
Count \w+ None 10,636,459.17 ns 58,583.876 ns 54,799.394 ns 10,651,481.25 ns 10,556,512.50 ns 10,740,368.75 ns 20 B
Count \w+ Compiled 6,328,674.58 ns 106,299.649 ns 99,432.758 ns 6,371,910.42 ns 6,168,962.50 ns 6,445,389.58 ns 14 B
Count \w+ NonBacktracking 15,494,032.50 ns 96,047.840 ns 89,843.209 ns 15,484,793.75 ns 15,338,325.00 ns 15,665,362.50 ns 41 B
Count \w+\s+Holmes None 9,694,694.27 ns 67,672.275 ns 63,300.688 ns 9,675,701.56 ns 9,623,120.31 ns 9,848,145.31 ns 20 B
Count \w+\s+Holmes Compiled 3,615,191.96 ns 17,485.753 ns 15,500.664 ns 3,612,416.67 ns 3,595,335.42 ns 3,650,470.83 ns 14 B
Count \w+\s+Holmes NonBacktracking 2,444,741.61 ns 13,154.856 ns 12,305.060 ns 2,445,942.11 ns 2,425,164.21 ns 2,466,281.05 ns 7 B
Count \w+\s+Holmes\s+\w+ None 9,641,982.08 ns 30,395.594 ns 28,432.057 ns 9,641,915.62 ns 9,603,881.25 ns 9,689,093.75 ns 20 B
Count \w+\s+Holmes\s+\w+ Compiled 3,619,839.44 ns 15,794.409 ns 14,774.100 ns 3,616,418.75 ns 3,600,887.50 ns 3,642,425.00 ns 14 B
Count \w+\s+Holmes\s+\w+ NonBacktracking 2,434,820.91 ns 10,082.239 ns 8,937.642 ns 2,436,627.60 ns 2,409,888.54 ns 2,445,075.00 ns 7 B
Count aei None 75,163.44 ns 11,332.904 ns 13,050.990 ns 82,078.59 ns 53,796.44 ns 85,911.35 ns -
Count aei Compiled 53,334.08 ns 471.858 ns 441.377 ns 53,447.37 ns 52,223.94 ns 54,061.25 ns -
Count aei NonBacktracking 52,699.06 ns 121.886 ns 108.049 ns 52,702.46 ns 52,425.68 ns 52,834.65 ns -
Count aqj None 53,953.03 ns 8,966.194 ns 10,325.482 ns 55,561.38 ns 37,174.11 ns 65,091.75 ns -
Count aqj Compiled 74,327.56 ns 34,235.426 ns 39,425.567 ns 55,996.39 ns 36,241.40 ns 124,153.28 ns -
Count aqj NonBacktracking 45,510.05 ns 14,692.055 ns 15,720.334 ns 36,751.48 ns 36,410.73 ns 82,800.97 ns -
Count the None 582,786.16 ns 1,148.235 ns 1,017.880 ns 582,431.25 ns 581,885.47 ns 584,756.72 ns 2 B
Count the Compiled 390,300.33 ns 957.133 ns 895.303 ns 390,229.53 ns 388,299.69 ns 391,867.97 ns 1 B
Count the NonBacktracking 889,485.57 ns 11,831.208 ns 9,879.595 ns 887,103.53 ns 875,157.24 ns 908,464.31 ns 2 B
Count the\s+\w+ None 925,506.70 ns 3,146.730 ns 2,627.662 ns 925,341.91 ns 919,962.87 ns 930,500.00 ns 2 B
Count the\s+\w+ Compiled 551,541.44 ns 3,716.995 ns 3,476.880 ns 553,286.85 ns 546,624.78 ns 556,300.22 ns 1 B
Count the\s+\w+ NonBacktracking 1,233,466.70 ns 9,386.297 ns 8,320.708 ns 1,231,738.97 ns 1,222,754.41 ns 1,253,408.33 ns 3 B
Count zqj None 59,506.42 ns 29,040.412 ns 33,442.982 ns 39,845.82 ns 36,619.92 ns 122,955.75 ns -
Count zqj Compiled 86,479.15 ns 34,960.757 ns 40,260.859 ns 113,196.14 ns 38,201.71 ns 122,509.62 ns -
Count zqj NonBacktracking 93,503.47 ns 36,241.934 ns 41,736.265 ns 125,229.36 ns 39,296.77 ns 127,768.44 ns -

@stephentoub
Member

stephentoub commented Aug 12, 2022

this particular case ("zqj" + Options.None)...
Compared to .NET 6, Preview 7 is 8 times faster, but compared to Preview 5 it's 3 times slower.

This regex is going to be heavily dominated by calls to span.IndexOf(string). Would this PR have had material impact on that method's codegen? It's unlikely changes anywhere else in or impacting other areas of regex would show up significantly with this expression. Do benchmarks directly on IndexOf show such a hit between previews 5 and 7?
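
To illustrate why a literal pattern such as `zqj` should track `IndexOf` performance, here is a minimal sketch, assuming that counting matches of a literal reduces to repeated ordinal `IndexOf` calls. `LiteralCount` and its method names are hypothetical, and this is not the regex engine's actual code path.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralCount
{
    // Counts non-overlapping occurrences of a literal needle by repeatedly
    // calling span.IndexOf and resuming just past each match.
    public static int CountWithIndexOf(string haystack, string needle)
    {
        if (needle.Length == 0)
            throw new ArgumentException("Needle must be non-empty.", nameof(needle));

        ReadOnlySpan<char> remaining = haystack.AsSpan();
        int count = 0;
        while (true)
        {
            int i = remaining.IndexOf(needle.AsSpan()); // ordinal search
            if (i < 0)
                return count;
            count++;
            remaining = remaining.Slice(i + needle.Length);
        }
    }

    // Reference result from the regex engine for the same literal.
    public static int CountWithRegex(string haystack, string needle) =>
        Regex.Matches(haystack, Regex.Escape(needle)).Count;
}
```

For a needle that never occurs, the loop body runs once, so the cost is a single `IndexOf` scan over the whole input.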

@mrsharm
Member

mrsharm commented Aug 12, 2022

Do benchmarks directly on IndexOf show such a hit between previews 5 and 7?

Seems like the Span.IndexOf(string..) tests haven't regressed between preview 5 and 7. Here's an example for System.Memory.ReadOnlySpan.IndexOfString(input: "AAAAA5AAAA", value: "5", comparisonType: InvariantCulture)

image

Please let me know if you have a specific test you'd like data for.

@AndyAyersMS
Member Author

How about data for Span.IndexOf(string..) on Adam's box?

@stephentoub
Member

stephentoub commented Aug 13, 2022

And specifically, a test like this:

private string _haystack;
private string _needle;

[GlobalSetup]
public async Task Setup()
{
    using HttpClient hc = new HttpClient();
    _haystack = await hc.GetStringAsync("https://www.gutenberg.org/files/1661/1661-0.txt");
    _needle = "zqj";
}

[Benchmark]
public int IndexOf() => _haystack.AsSpan().IndexOf(_needle);

This appears to have regressed significantly for me between .NET 6 and .NET 7 (this is on my Win11 x64); I've not checked if it regressed between Preview 5 and 7 on my machine.

@EgorBo, I know we spoke about this particular case before, but I thought it was mostly addressed. Is this still expected?

| Method | Runtime | Mean | Error | StdDev | Median | Ratio | RatioSD |
|--- |--- |---:|---:|---:|---:|---:|---:|
| IndexOf | .NET 6.0 | 25.67 us | 0.508 us | 0.643 us | 25.27 us | 1.00 | 0.00 |
| IndexOf | .NET 7.0 | 39.07 us | 0.241 us | 0.226 us | 39.00 us | 1.53 | 0.04 |

@EgorBo
Member

EgorBo commented Aug 15, 2022

@stephentoub Yeah, that's the worst case for the new algorithm: an extremely low density of the first char ('z'). In that case plain IndexOf, where we align the data and perform only a single comparison at a time, is faster. I'll check again what I can do for it, but overall it's expected.

E.g. other inputs for your benchmark:

[Benchmark]
[Arguments("zqj")]
[Arguments("tbd")]
[Arguments("suppressing")]
public int IndexOf(string needle) => _haystack.AsSpan().IndexOf(needle);

|  Method |        |      needle |        Mean |
|-------- |------- |------------ |------------:|
| IndexOf | net6.0 | suppressing | 1,103.95 us |
| IndexOf | net7.0 | suppressing |    83.68 us |
| IndexOf | net6.0 |         tbd | 1,663.58 us |
| IndexOf | net7.0 |         tbd |   123.91 us |
| IndexOf | net6.0 |         zqj |    61.75 us |
| IndexOf | net7.0 |         zqj |    78.24 us |
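
The "plain IndexOf" strategy mentioned above (scan for the first char, then verify the rest) can be sketched in scalar form as follows. This is an illustrative sketch only, not the runtime's implementation; `FirstCharSearch` and `IndexOfFirstCharThenVerify` are hypothetical names.

```csharp
using System;

public static class FirstCharSearch
{
    // Finds the needle by scanning for its first char with IndexOf(char),
    // then comparing the remaining chars only at each candidate position.
    public static int IndexOfFirstCharThenVerify(ReadOnlySpan<char> haystack, ReadOnlySpan<char> needle)
    {
        if (needle.IsEmpty)
            return 0;

        char first = needle[0];
        ReadOnlySpan<char> tail = needle.Slice(1);
        int offset = 0;

        while (haystack.Length - offset >= needle.Length)
        {
            // Only search positions where the full needle still fits.
            int searchLength = haystack.Length - offset - tail.Length;
            int rel = haystack.Slice(offset, searchLength).IndexOf(first);
            if (rel < 0)
                return -1;

            int candidate = offset + rel;
            if (haystack.Slice(candidate + 1, tail.Length).SequenceEqual(tail))
                return candidate;

            offset = candidate + 1; // resume after the failed candidate
        }

        return -1;
    }
}
```

When the first char is rare, as with 'z' in English text, almost all of the work is the single-char scan and the verification step hardly ever runs, which is consistent with this input being the least favorable one for an approach that does more work per position.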

@stephentoub
Member

Thanks. I'm interested in the results on @adamsitnik's box. Your box is showing a 25% regression for that input. My box is showing a 50% regression for that input. And apparently something on Adam's box was resulting in this input being 6x slower for a regex that should basically boil down to that same IndexOf call.
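
The rounded percentages above can be checked against the reported means with a small helper. This is a sketch; `RegressionMath` and `SlowdownPercent` are hypothetical names, and the inputs are the `zqj` means quoted in this thread.

```csharp
using System;

public static class RegressionMath
{
    // Percentage by which `current` is slower than `baseline`
    // (positive means a regression, negative means an improvement).
    public static double SlowdownPercent(double baseline, double current) =>
        (current / baseline - 1.0) * 100.0;
}
```

With the numbers from the two tables, `SlowdownPercent(61.75, 78.24)` comes out at roughly 27 and `SlowdownPercent(25.67, 39.07)` at roughly 52, in line with the "25%" and "50%" figures.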

@jozkee
Member

jozkee commented Oct 17, 2022

The regression for [Last]IndexOf_Word_NotFound* showed up in the .NET 6 vs. .NET 7 RC2 report:

System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (, IgnoreCase, False))

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name | Modality |
|--- |---:|---:|--- |--- |--- |--- |
| Same | 1.00 | +0 | ubuntu 18.04 | Arm64 | Unknown processor | |
| Same | 1.11 | +0 | Windows 11 | Arm64 | Unknown processor | |
| Same | 1.10 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Same | 1.10 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Slower | 0.69 | +0 | macOS Monterey 12.6 | Arm64 | Apple M1 | |
| Slower | 0.71 | +0 | macOS Monterey 12.6 | Arm64 | Apple M1 Max | |
| Same | 0.90 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | bimodal |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores | |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X | |
| Same | 1.06 | +0 | Windows 11 | X64 | AMD Ryzen 9 7950X | |
| Slower | 0.87 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.68 | +0 | debian 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.83 | +0 | ubuntu 18.04 | X64 | AMD Ryzen 9 5900X | |
| Slower | 0.71 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| Slower | 0.78 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X | |
| Same | 0.98 | +0 | ubuntu 20.04 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | |
| Slower | 0.71 | +0 | ubuntu 20.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.79 | +0 | macOS Big Sur 11.7 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) | |
| Slower | 0.80 | +0 | macOS Monterey 12.6 | X64 | Intel Core i7-4870HQ CPU 2.50GHz (Haswell) | |

System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, IgnoreCase, False))

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name | Modality |
|--- |---:|---:|--- |--- |--- |--- |
| Same | 1.00 | +0 | ubuntu 18.04 | Arm64 | Unknown processor | |
| Same | 1.11 | +0 | Windows 11 | Arm64 | Unknown processor | |
| Same | 1.08 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Same | 1.10 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Slower | 0.67 | +0 | macOS Monterey 12.6 | Arm64 | Apple M1 | |
| Slower | 0.71 | +0 | macOS Monterey 12.6 | Arm64 | Apple M1 Max | |
| Slower | 0.88 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores | |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X | |
| Same | 1.07 | +0 | Windows 11 | X64 | AMD Ryzen 9 7950X | |
| Slower | 0.87 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.68 | +0 | debian 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.83 | +0 | ubuntu 18.04 | X64 | AMD Ryzen 9 5900X | |
| Slower | 0.74 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| Slower | 0.77 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X | |
| Faster | 1.23 | +0 | ubuntu 20.04 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | several? |
| Slower | 0.71 | +0 | ubuntu 20.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.80 | +0 | macOS Big Sur 11.7 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) | |
| Slower | 0.78 | +0 | macOS Monterey 12.6 | X64 | Intel Core i7-4870HQ CPU 2.50GHz (Haswell) | |

System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (, IgnoreCase, False))

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name | Modality |
|--- |---:|---:|--- |--- |--- |--- |
| Same | 1.01 | +0 | ubuntu 18.04 | Arm64 | Unknown processor | |
| Faster | 1.12 | +0 | Windows 11 | Arm64 | Unknown processor | |
| Same | 1.09 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Same | 1.09 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Slower | 0.71 | +0 | macOS Monterey 12.6 | Arm64 | Apple M1 | |
| Slower | 0.67 | +0 | macOS Monterey 12.6 | Arm64 | Apple M1 Max | |
| Same | 0.93 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | |
| Same | 1.01 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores | |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X | |
| Same | 1.07 | +0 | Windows 11 | X64 | AMD Ryzen 9 7950X | |
| Same | 0.94 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.67 | +0 | debian 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.82 | +0 | ubuntu 18.04 | X64 | AMD Ryzen 9 5900X | |
| Slower | 0.81 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| Slower | 0.78 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X | |
| Same | 1.10 | +0 | ubuntu 20.04 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | |
| Slower | 0.71 | +0 | ubuntu 20.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.85 | +0 | macOS Big Sur 11.7 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) | |
| Slower | 0.84 | +0 | macOS Monterey 12.6 | X64 | Intel Core i7-4870HQ CPU 2.50GHz (Haswell) | |

System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, IgnoreCase, False))

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name | Modality |
|--- |---:|---:|--- |--- |--- |--- |
| Same | 1.01 | +0 | ubuntu 18.04 | Arm64 | Unknown processor | |
| Faster | 1.13 | +0 | Windows 11 | Arm64 | Unknown processor | |
| Same | 1.08 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Same | 1.09 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Slower | 0.67 | +0 | macOS Monterey 12.6 | Arm64 | Apple M1 | |
| Slower | 0.68 | +0 | macOS Monterey 12.6 | Arm64 | Apple M1 Max | |
| Same | 0.94 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | bimodal |
| Same | 1.00 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores | |
| Same | 0.96 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X | |
| Same | 1.08 | +0 | Windows 11 | X64 | AMD Ryzen 9 7950X | |
| Same | 0.93 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.67 | +0 | debian 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.82 | +0 | ubuntu 18.04 | X64 | AMD Ryzen 9 5900X | |
| Slower | 0.80 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| Slower | 0.78 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X | |
| Faster | 1.25 | +0 | ubuntu 20.04 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | |
| Slower | 0.69 | +0 | ubuntu 20.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Slower | 0.85 | +0 | macOS Big Sur 11.7 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) | |
| Slower | 0.85 | +0 | macOS Monterey 12.6 | X64 | Intel Core i7-4870HQ CPU 2.50GHz (Haswell) | |

Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Development

Successfully merging this pull request may close these issues.

JIT: PGO-based block reordering interferes with loop recognition