Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorized Websocket HandshakeHelpers.IsRequestKeyValid #13196

Closed
wants to merge 11 commits into from

Conversation

gfoidl
Copy link
Member

@gfoidl gfoidl commented Aug 16, 2019

Description

Based on #12386 (comment) here's a vectorized version of Websocket's HandshakeHelpers.IsRequestKeyValid.
The code is commented to give the outline of the algorithm, which is based on dotnet/corefx#34529.

In short:

  • if the key-length is invalid, we short-circuit validation
  • for the validation no actual base64 decoding is performed, it's just validation if the input is correct to the base64 alphabet and if padding is correct

Benchmarks

Code

Valid key

|     Method |      Mean |     Error |    StdDev | Ratio | RatioSD |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------- |----------:|----------:|----------:|------:|--------:|-------:|------:|------:|----------:|
|   Baseline | 94.167 ns | 1.7203 ns | 2.3548 ns |  1.00 |    0.00 | 0.0126 |     - |     - |      40 B |
|    PR12386 | 52.838 ns | 0.5165 ns | 0.4831 ns |  0.56 |    0.02 |      - |     - |     - |         - |
| Vectorized |  3.773 ns | 0.0518 ns | 0.0485 ns |  0.04 |    0.00 |      - |     - |     - |         - |

Invalid key

|     Method |                  Key |          Mean |       Error |      StdDev |        Median | Ratio | RatioSD |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------- |--------------------- |--------------:|------------:|------------:|--------------:|------:|--------:|-------:|------:|------:|----------:|
|   Baseline | 2006a(...)c8e29 [32] |    118.798 ns |   2.3807 ns |   4.4128 ns |    116.486 ns |  1.00 |    0.00 | 0.0153 |     - |     - |      48 B |
|    PR12386 | 2006a(...)c8e29 [32] |     74.311 ns |   0.7227 ns |   0.6760 ns |     74.115 ns |  0.60 |    0.02 |      - |     - |     - |         - |
| Vectorized | 2006a(...)c8e29 [32] |      1.581 ns |   0.0112 ns |   0.0100 ns |      1.579 ns |  0.01 |    0.00 |      - |     - |     - |         - |
|            |                      |               |             |             |               |       |         |        |       |       |           |
|   Baseline | fP5Xt(...)PCg== [24] | 23,878.036 ns | 219.9314 ns | 205.7239 ns | 23,955.622 ns | 1.000 |    0.00 | 0.1831 |     - |     - |     576 B |
|    PR12386 | fP5Xt(...)PCg== [24] |     69.002 ns |   0.6012 ns |   0.5330 ns |     69.021 ns | 0.003 |    0.00 |      - |     - |     - |         - |
| Vectorized | fP5Xt(...)PCg== [24] |      3.871 ns |   0.0359 ns |   0.0336 ns |      3.865 ns | 0.000 |    0.00 |      - |     - |     - |         - |

Open questions

@gfoidl
Copy link
Member Author

gfoidl commented Aug 16, 2019

/cc: @BrennanConroy

And how did you run the benchmarks like in #12386 (comment)?

@BrennanConroy
Copy link
Member

BrennanConroy commented Aug 16, 2019

I added some microbenchmarks as part of the PR

https://github.com/aspnet/AspNetCore/pull/12386/files#diff-9053f4069af50f90080ba287c034c066

@BrennanConroy
Copy link
Member

BrennanConroy commented Aug 16, 2019

I couldn't find any unit-tests for IsRequestKeyValid -- as it is an internal member, is it tested through public apis or should I add tests for it?

I have some tests in the PR https://github.com/aspnet/AspNetCore/pull/12386/files#diff-66cc560b423b43d03d69a6fbd14ea5f8

@gfoidl
Copy link
Member Author

gfoidl commented Aug 16, 2019

Ohh, shame on me. I've viewed the PR and can't see the changes 😉
Thanks!

Can I copy these (bench + tests) to this PR? Or is there a better way to avoid a duplication?

@BrennanConroy
Copy link
Member

I'm ok with you copying them :)

… isrequestkeyvalid_vectorize

Removed / reverted unrelated changes to this PR.
@BrennanConroy BrennanConroy added the feature-websockets Includes: WebSockets label Aug 16, 2019
@gfoidl
Copy link
Member Author

gfoidl commented Aug 16, 2019

Baseline

Method Mean Error StdDev Op/s Gen 0 Allocated
IsRequestKeyValid 142.1 ns 2.098 ns 1.963 ns 7,037,173.7 0.0267 400 B

PR12386

Method Mean Error StdDev Op/s Allocated
IsRequestKeyValid 84.50 ns 1.180 ns 1.104 ns 11,834,976.3 0 B

This PR (vectorized)

Method Mean Error StdDev Op/s Allocated
IsRequestKeyValid 7.907 ns 0.1167 ns 0.1092 ns 126,468,298.9 0 B
HW info
BenchmarkDotNet=v0.10.13, OS=ubuntu 16.04
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 4 logical cores and 4 physical cores
.NET Core SDK=3.0.100-preview9-013927
  [Host]     : .NET Core 3.0.0-preview9-19415-13 (CoreCLR 4.700.19.40902, CoreFX 4.700.19.40917), 64bit RyuJIT
  Job-NGKFFV : .NET Core 3.0.0-preview9-19415-13 (CoreCLR 4.700.19.40902, CoreFX 4.700.19.40917), 64bit RyuJIT

Runtime=Core  Server=True  Toolchain=.NET Core 3.0
RunStrategy=Throughput

public class HandshakeTests
{
[Fact]
public void CreatesCorrectResponseKey()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just merged the branch from #12386 into my branch. That's why this unrelated (to this PR) test is also here.

@BrennanConroy
Copy link
Member

The "interesting part" of the algorithm is based on https://github.com/aklomp/base64, which is BSD 2-Clause, that's why in dotnet/corefx#34529 the license info got updated. Do we need an update here too?

@Pilchie

@GrabYourPitchforks
Copy link
Member

Lots of unsafe code being added as part of this PR. This appears to be something that occurs only once per WebSocket request and which already only takes fewer than 100 nanoseconds.

aspnet team, do you believe the added complexity to be a worthwhile trade-off? If so, please consider using APIs like Unsafe.ReadUnaligned instead of Unsafe.As when performing widening conversions unless you can guarantee that the data is already aligned. I know it's not strictly required for the current implementation of the JIT, but it is good hygiene.

Additionally, if using the packuswb instruction it'd be good to have unit tests that include non-ASCII string values, that way you can test that saturation and overflow conditions are being properly handled and aren't being treated as false positives.

@analogrelay analogrelay added this to the 5.0.0-alpha1 milestone Aug 17, 2019
@Tratcher
Copy link
Member

Tratcher commented Aug 18, 2019

I'm with Levi on this one, I don't see the benefit of micro optimizing this code path, especially in such an unmaintainable way.

@BrennanConroy
Copy link
Member

@gfoidl Looks like we're not going to take this at this time, sorry.

I do like seeing the amazing improvements vectorizing code can do!

Thanks for trying it out.

@gfoidl gfoidl deleted the isrequestkeyvalid_vectorize branch August 28, 2019 16:19
@gfoidl
Copy link
Member Author

gfoidl commented Aug 28, 2019

I do like seeing the amazing improvements vectorizing code can do!

Me too 😃 -- unfortunately code gets non-intuitive quickly...

@amcasey amcasey added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions feature-websockets Includes: WebSockets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants