Utilize stackalloc and access static byte data directly #767

Merged: 7 commits, Feb 14, 2022

Conversation

iamcarbon
Contributor

@iamcarbon iamcarbon changed the title from "Various improvements" to "Utilize stackalloc and access static byte data directly" on Feb 13, 2022
@iamcarbon
Contributor Author

@jstedfast ready for review.

- char[] chars = new char[count];
+ Span<char> chars = count < 16
+     ? stackalloc char[16]
+     : new char[count];

Owner

Does using Span<> here really help?

Owner

Okay, so if my quick research is correct (don't have time for a full reading because I'm about to run out), I guess using Span<> with stackalloc removes the need to declare the method unsafe?

Contributor Author

@iamcarbon iamcarbon Feb 13, 2022

This is always a throughput win on .NET 5+ (we eliminate an allocation and avoid zeroing the stack), but there's a small penalty on older runtimes where we don't have SkipLocalsInit.

You are also right that using stackalloc with a Span is safe. Without the Span target, stackalloc's natural type is a pointer, which requires an unsafe context.
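
For readers following along, here is a minimal sketch of the pattern discussed in this thread. It is not code from this PR: the class, the `ToUpperAscii` helper, and the `StackThreshold` constant are hypothetical names chosen for illustration. It shows a small scratch buffer taken from the stack, a heap fallback for larger inputs, and no `unsafe` keyword anywhere.

```csharp
using System;

static class StackallocSketch
{
    // Hypothetical threshold; the PR uses 16 and 32 in different places.
    const int StackThreshold = 32;

    // Hypothetical helper: upper-cases ASCII letters via a scratch buffer.
    public static string ToUpperAscii (string text)
    {
        // Small inputs use stack memory; larger ones fall back to a heap array.
        // Assigning the stackalloc result to a Span<char> keeps the method safe.
        Span<char> buffer = text.Length <= StackThreshold
            ? stackalloc char[StackThreshold]
            : new char[text.Length];

        // The stack buffer may be longer than the input, so slice to the logical length.
        Span<char> chars = buffer.Slice (0, text.Length);

        for (int i = 0; i < text.Length; i++) {
            char c = text[i];
            chars[i] = (c >= 'a' && c <= 'z') ? (char) (c - 32) : c;
        }

        // Span<char>.ToString () creates a string from the span's characters.
        return chars.ToString ();
    }
}
```

Note the slice: because the stack buffer has a fixed size, code that previously relied on `chars.Length == count` has to track the logical length itself.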

- var chars = new char[fieldNameLength];
+ Span<char> chars = fieldNameLength <= 32
+     ? stackalloc char[32]
+     : new char[fieldNameLength];

Owner

Same question about Span<>, but also why 32 here but 16 elsewhere?

Contributor Author

I spot-checked the average sizes and found them to be either small or large (around 12 bytes or 100+ bytes). Standardizing on 32 seems reasonable, however.

@@ -312,9 +312,14 @@ protected Header (ParserOptions options, HeaderId id, string name, byte[] field,
  /// <param name="fieldNameLength">The length of the field name (not including trailing whitespace).</param>
  /// <param name="value">The raw value of the header.</param>
  /// <param name="invalid"><c>true</c> if the header field is invalid; otherwise, <c>false</c>.</param>
+ #if NET5_0_OR_GREATER
+ [System.Runtime.CompilerServices.SkipLocalsInit]
+ #endif

Owner

Not sure what this does...

Contributor Author

Yep. This is what gives us the win. Using the stack is basically free when the memory is left uncleared, and we also avoid the heap allocation.
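
A small sketch of how the attribute is typically applied, for anyone unfamiliar with it. The `FormatDigits` helper and its class are hypothetical and not part of this PR. One caveat to be aware of: the C# compiler only accepts `[SkipLocalsInit]` when the project enables `AllowUnsafeBlocks`, even if the method body contains no unsafe code. The method must also write every element before reading it, since the buffer contents are otherwise undefined.

```csharp
using System;
using System.Runtime.CompilerServices;

static class SkipLocalsInitSketch
{
    // Hypothetical helper: formats a non-negative number through a stack buffer.
#if NET5_0_OR_GREATER
    [SkipLocalsInit] // the stackalloc'd buffer below is not zeroed on entry
#endif
    public static string FormatDigits (int value)
    {
        Span<char> buffer = stackalloc char[16];
        int pos = buffer.Length;

        // Fill from the end; only the written tail is ever read back.
        do {
            buffer[--pos] = (char) ('0' + value % 10);
            value /= 10;
        } while (value > 0);

        return buffer.Slice (pos).ToString ();
    }
}
```

On targets older than .NET 5 the attribute is compiled out and the method still works, just with the usual zero-initialization cost.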

@@ -37,7 +37,7 @@ namespace MimeKit.Encodings {
  /// </remarks>
  public class Base64Decoder : IMimeDecoder
  {
- static readonly byte[] base64_rank = new byte[256] {
+ static ReadOnlySpan<byte> base64_rank => new byte[256] {
  255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,

Owner

should this be static readonly ReadOnlySpan<byte> ...? and why => vs =?

Contributor Author

@iamcarbon iamcarbon Feb 13, 2022

This is a really odd syntax that requires you to understand how Roslyn transforms it. It looks like it allocates on every use, but Roslyn instead embeds the data directly in the compiled assembly. The code can then access the data directly, which eliminates various indirections and bounds checks.
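
To make the shape of the pattern concrete, here is a minimal sketch with a much smaller table. The `HexTable` class, `HexAlphabet` property, and `ToHex` method are hypothetical names, not MimeKit code. The pattern relies on an expression-bodied property (`=>`) that returns a `new byte[]` with constant elements; a `static readonly ReadOnlySpan<byte>` field would not compile, because a span cannot be stored in a static field.

```csharp
using System;

static class HexTable
{
    // Looks like an allocation per call, but for a byte array with constant
    // elements the compiler returns a span over data stored in the assembly.
    static ReadOnlySpan<byte> HexAlphabet => new byte[] {
        (byte) '0', (byte) '1', (byte) '2', (byte) '3',
        (byte) '4', (byte) '5', (byte) '6', (byte) '7',
        (byte) '8', (byte) '9', (byte) 'A', (byte) 'B',
        (byte) 'C', (byte) 'D', (byte) 'E', (byte) 'F'
    };

    // Hypothetical consumer: converts one byte to two upper-case hex digits.
    public static string ToHex (byte value)
    {
        ReadOnlySpan<byte> alphabet = HexAlphabet;
        Span<char> chars = stackalloc char[2];

        chars[0] = (char) alphabet[value >> 4];
        chars[1] = (char) alphabet[value & 0x0F];

        return chars.ToString ();
    }
}
```

For example, `HexTable.ToHex (0xAB)` returns `"AB"`. As far as I know, the allocation-free behavior depends on the element type being byte-sized and every element being a constant, so it is worth confirming with a profiler or by inspecting the IL before relying on it.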

@jstedfast
Owner

Thanks for the link to https://vcsjones.dev/csharp-readonly-span-bytes-static/ in one of your commits. Very useful documentation.

@@ -508,6 +508,8 @@ public static string Quote (string text)
  return Quote (text.AsSpan ());
  }

+ private static readonly char[] unquoteChars = new[] { '\r', '\n', '\t', '\\', '"' };
+
  /// <summary>

Owner

I prefer to define my "constants" at the top and to name them in PascalCase.

Contributor Author

@iamcarbon iamcarbon Feb 13, 2022

updated.

@jstedfast jstedfast merged commit 045f5c0 into jstedfast:master Feb 14, 2022
@jstedfast
Owner

jstedfast commented Feb 14, 2022

Pre-merge:

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i7-9700 CPU 3.00GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=6.0.200
[Host] : .NET 5.0.14 (5.0.1422.5710), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET 5.0.14 (5.0.1422.5710), X64 RyuJIT

| Method | Mean | Error | StdDev |
|---|---:|---:|---:|
| Base64Decoder | 420,581.7 ns | 4,948.10 ns | 4,628.46 ns |
| QuotedPrintableDecoder | 848.7 ns | 16.74 ns | 25.05 ns |
| UUDecoder | 606,899.6 ns | 1,058.71 ns | 884.07 ns |

| Method | Mean | Error | StdDev |
|---|---:|---:|---:|
| Base64Encoder | 254.2 us | 2.62 us | 2.46 us |
| HexEncoder | 737.8 us | 3.72 us | 3.30 us |
| UUEncoder | 213.3 us | 1.00 us | 0.88 us |

Post-merge:

| Method | Mean | Error | StdDev |
|---|---:|---:|---:|
| Base64Decoder | 415,734.2 ns | 8,241.76 ns | 19,746.72 ns |
| QuotedPrintableDecoder | 840.8 ns | 6.52 ns | 6.10 ns |
| UUDecoder | 564,594.0 ns | 5,206.78 ns | 4,615.67 ns |

| Method | Mean | Error | StdDev |
|---|---:|---:|---:|
| Base64Encoder | 208.0 us | 1.14 us | 1.06 us |
| HexEncoder | 706.4 us | 2.41 us | 2.13 us |
| UUEncoder | 213.4 us | 0.87 us | 0.77 us |

@jstedfast
Owner

Just for fun:

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i7-9700 CPU 3.00GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=6.0.200
[Host] : .NET 6.0.2 (6.0.222.6406), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET 6.0.2 (6.0.222.6406), X64 RyuJIT

| Method | Mean | Error | StdDev |
|---|---:|---:|---:|
| Base64Decoder | 363,079.5 ns | 4,604.11 ns | 4,081.43 ns |
| QuotedPrintableDecoder | 690.8 ns | 12.53 ns | 19.51 ns |
| UUDecoder | 457,311.7 ns | 2,263.63 ns | 2,006.65 ns |

| Method | Mean | Error | StdDev |
|---|---:|---:|---:|
| Base64Encoder | 210.8 us | 2.21 us | 1.96 us |
| HexEncoder | 630.1 us | 3.45 us | 3.06 us |
| UUEncoder | 214.3 us | 0.85 us | 0.71 us |

@iamcarbon
Contributor Author

iamcarbon commented Feb 14, 2022

Ahh, .NET 6.0! ❤️

@jstedfast
Owner

If you really want to be blown away:

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i7-9700 CPU 3.00GHz, 1 CPU, 8 logical and 8 physical cores
[Host] : .NET Framework 4.8 (4.8.4420.0), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET Framework 4.8 (4.8.4420.0), X64 RyuJIT

| Method | Mean | Error | StdDev | Median |
|---|---:|---:|---:|---:|
| MimeParser_StarTrekMessage | 353.67 ms | 6.933 ms | 6.809 ms | 353.74 ms |
| MimeParser_StarTrekMessagePersistent | 262.98 ms | 5.230 ms | 6.800 ms | 261.44 ms |
| MimeParser_ContentLengthMbox | 32.55 ms | 0.717 ms | 2.104 ms | 31.49 ms |
| MimeParser_ContentLengthMboxPersistent | 26.94 ms | 0.105 ms | 0.087 ms | 26.93 ms |
| MimeParser_JwzMbox | 263.29 ms | 3.201 ms | 2.994 ms | 263.95 ms |
| MimeParser_JwzMboxPersistent | 211.77 ms | 1.594 ms | 1.491 ms | 211.61 ms |
| MimeParser_HeaderStressTest | 56.94 ms | 0.393 ms | 0.368 ms | 56.78 ms |
| ExperimentalMimeParser_StarTrekMessage | 333.69 ms | 5.282 ms | 4.940 ms | 331.58 ms |
| ExperimentalMimeParser_StarTrekMessagePersistent | 240.90 ms | 1.865 ms | 1.745 ms | 241.26 ms |
| ExperimentalMimeParser_ContentLengthMbox | 30.78 ms | 0.190 ms | 0.168 ms | 30.75 ms |
| ExperimentalMimeParser_ContentLengthMboxPersistent | 26.51 ms | 0.129 ms | 0.121 ms | 26.53 ms |
| ExperimentalMimeParser_JwzMbox | 258.84 ms | 2.870 ms | 2.544 ms | 258.67 ms |
| ExperimentalMimeParser_JwzMboxPersistent | 209.61 ms | 1.351 ms | 1.197 ms | 209.75 ms |
| ExperimentalMimeParser_HeaderStressTest | 52.84 ms | 0.637 ms | 0.532 ms | 52.88 ms |
| MimeReader_StarTrekMessage | 176.17 ms | 0.984 ms | 0.921 ms | 176.26 ms |
| MimeReader_ContentLengthMbox | 12.56 ms | 0.036 ms | 0.030 ms | 12.57 ms |
| MimeReader_JwzMbox | 130.99 ms | 0.531 ms | 0.497 ms | 130.99 ms |
| MimeReader_HeaderStressTest | 15.33 ms | 0.130 ms | 0.121 ms | 15.35 ms |

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i7-9700 CPU 3.00GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=6.0.200
[Host] : .NET 5.0.14 (5.0.1422.5710), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET 5.0.14 (5.0.1422.5710), X64 RyuJIT

| Method | Mean | Error | StdDev | Median |
|---|---:|---:|---:|---:|
| MimeParser_StarTrekMessage | 293.98 ms | 4.907 ms | 6.381 ms | 293.48 ms |
| MimeParser_StarTrekMessagePersistent | 231.20 ms | 4.423 ms | 11.574 ms | 225.23 ms |
| MimeParser_ContentLengthMbox | 26.45 ms | 0.151 ms | 0.134 ms | 26.42 ms |
| MimeParser_ContentLengthMboxPersistent | 22.94 ms | 0.208 ms | 0.184 ms | 22.94 ms |
| MimeParser_JwzMbox | 228.91 ms | 1.944 ms | 1.623 ms | 228.72 ms |
| MimeParser_JwzMboxPersistent | 191.09 ms | 3.742 ms | 3.500 ms | 190.50 ms |
| MimeParser_HeaderStressTest | 49.65 ms | 0.784 ms | 0.695 ms | 49.33 ms |
| ExperimentalMimeParser_StarTrekMessage | 270.28 ms | 2.162 ms | 2.022 ms | 270.39 ms |
| ExperimentalMimeParser_StarTrekMessagePersistent | 223.18 ms | 1.491 ms | 1.395 ms | 223.41 ms |
| ExperimentalMimeParser_ContentLengthMbox | 26.93 ms | 0.404 ms | 0.378 ms | 26.98 ms |
| ExperimentalMimeParser_ContentLengthMboxPersistent | 23.18 ms | 0.377 ms | 0.352 ms | 23.08 ms |
| ExperimentalMimeParser_JwzMbox | 233.51 ms | 3.549 ms | 5.629 ms | 231.61 ms |
| ExperimentalMimeParser_JwzMboxPersistent | 185.65 ms | 1.778 ms | 1.576 ms | 185.33 ms |
| ExperimentalMimeParser_HeaderStressTest | 41.11 ms | 0.518 ms | 0.485 ms | 41.13 ms |
| MimeReader_StarTrekMessage | 172.53 ms | 3.328 ms | 3.417 ms | 171.93 ms |
| MimeReader_ContentLengthMbox | 11.23 ms | 0.045 ms | 0.040 ms | 11.25 ms |
| MimeReader_JwzMbox | 122.64 ms | 0.535 ms | 0.501 ms | 122.73 ms |
| MimeReader_HeaderStressTest | 12.14 ms | 0.118 ms | 0.111 ms | 12.13 ms |

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i7-9700 CPU 3.00GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=6.0.200
[Host] : .NET 6.0.2 (6.0.222.6406), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET 6.0.2 (6.0.222.6406), X64 RyuJIT

| Method | Mean | Error | StdDev | Median |
|---|---:|---:|---:|---:|
| MimeParser_StarTrekMessage | 284.47 ms | 3.984 ms | 3.327 ms | 284.93 ms |
| MimeParser_StarTrekMessagePersistent | 221.08 ms | 4.297 ms | 9.959 ms | 215.28 ms |
| MimeParser_ContentLengthMbox | 25.78 ms | 0.204 ms | 0.181 ms | 25.70 ms |
| MimeParser_ContentLengthMboxPersistent | 21.96 ms | 0.214 ms | 0.200 ms | 21.91 ms |
| MimeParser_JwzMbox | 225.38 ms | 1.505 ms | 1.334 ms | 225.36 ms |
| MimeParser_JwzMboxPersistent | 181.97 ms | 1.724 ms | 1.613 ms | 181.89 ms |
| MimeParser_HeaderStressTest | 47.07 ms | 0.421 ms | 0.394 ms | 47.02 ms |
| ExperimentalMimeParser_StarTrekMessage | 264.82 ms | 1.183 ms | 1.107 ms | 265.28 ms |
| ExperimentalMimeParser_StarTrekMessagePersistent | 208.25 ms | 1.522 ms | 1.350 ms | 208.46 ms |
| ExperimentalMimeParser_ContentLengthMbox | 25.28 ms | 0.179 ms | 0.159 ms | 25.27 ms |
| ExperimentalMimeParser_ContentLengthMboxPersistent | 21.88 ms | 0.164 ms | 0.145 ms | 21.89 ms |
| ExperimentalMimeParser_JwzMbox | 225.96 ms | 3.570 ms | 3.339 ms | 225.59 ms |
| ExperimentalMimeParser_JwzMboxPersistent | 178.55 ms | 1.198 ms | 1.062 ms | 178.42 ms |
| ExperimentalMimeParser_HeaderStressTest | 38.61 ms | 0.329 ms | 0.308 ms | 38.55 ms |
| MimeReader_StarTrekMessage | 159.09 ms | 0.728 ms | 0.681 ms | 159.18 ms |
| MimeReader_ContentLengthMbox | 10.73 ms | 0.082 ms | 0.073 ms | 10.73 ms |
| MimeReader_JwzMbox | 118.88 ms | 0.707 ms | 0.591 ms | 118.82 ms |
| MimeReader_HeaderStressTest | 11.24 ms | 0.062 ms | 0.048 ms | 11.24 ms |

@iamcarbon
Contributor Author

iamcarbon commented Feb 14, 2022

Those are lovely performance improvements. Excited to see these improve even further on .NET 7.0!
