Can Span<T>.SequenceEqual be optimized further to be faster for small buffers (buffer.Length < 5)? #32363
Comments
If you want to get more wild with gotos I think it could be done as per #402 (comment)
+1 to Ben's suggestion. Another pattern which is more common in C-style libraries is to expose a secondary method that uses an alternative algorithm. Let's call it |
Do you have any links? I have never seen multiple |
I've never seen it for memcmp specifically, but symcrypt follows this pattern in a few places for other APIs. Here's an example of two different wipe (mem clear) routines. They're functionally the same but implemented differently. The caller can select whichever implementation is better optimized for their scenario.
Yes, I can imagine that niche APIs can use a pattern like this, especially when the difference between the approaches is substantial. I do not think it makes sense to introduce this complexity for common APIs like copying or comparing memory. There is no way people would get their use right. I have my fingers crossed that the hardware will eventually put the transistors to good use and introduce memory copy and compare instructions that will be strictly better than any alternative software equivalent.
An alternative route to SequenceEqual for some of the System.Text.Json use cases might be comparison against integer values, as implemented in the ASP.NET Core route matcher or in SpanJson. That approach needs codegen though.
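A minimal sketch of the integer-comparison idea, not the actual ASP.NET Core or SpanJson code: a short, known ASCII name is matched by reading the candidate bytes as one integer and comparing it to a precomputed constant. The class name `IntegerKeyMatcher`, the method `IsRefProperty`, and the choice of `$ref` as the key are assumptions made for illustration. A code generator would normally emit the constant directly instead of computing it at startup.

```csharp
using System;
using System.Buffers.Binary;

public static class IntegerKeyMatcher
{
    // Hypothetical 4-byte key ("$ref" in ASCII), packed into one little-endian uint.
    // Generated code would typically bake this value in as a literal constant.
    private static readonly uint RefKey = BinaryPrimitives.ReadUInt32LittleEndian(
        new byte[] { (byte)'$', (byte)'r', (byte)'e', (byte)'f' });

    // Returns true when utf8Name is exactly the four bytes of "$ref".
    // The length check comes first so the 4-byte read can never go out of range.
    public static bool IsRefProperty(ReadOnlySpan<byte> utf8Name)
    {
        return utf8Name.Length == 4
            && BinaryPrimitives.ReadUInt32LittleEndian(utf8Name) == RefKey;
    }
}
```

The comparison is a single integer compare with no loop, which is why it suits small fixed names. Names longer than 8 bytes would need several such reads, and that per-name layout is what makes code generation attractive here.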
Though note that's with the SequenceEqual benchmark being (benchmark in gist on linked issue):

    public bool UseSequenceEqual()
    {
        var input = _input;
        return SequenceEqual(
            ref MemoryMarshal.GetArrayDataReference(input),
            ref MemoryMarshal.GetArrayDataReference(_expected),
            (nuint)input.Length);
    }

rather than creating two Spans as part of it, and with the linear search being behind a call (so they match):

    [Benchmark(Baseline = true)]
    public bool UseLinearSearch()
    {
        return LinearSearch(_input, _expected);
    }
Think I've got it; only worse on 2 and 3 #32371
@Tornhoof - can you share links to the example pattern?
E.g. @stephentoub used it for RegEx in #1654. It is pretty standard for high-performance codegened parsers/formatters. Some of the JSON libraries that are beating System.Text.Json on performance use it as well.
@ahsonkhan aspnetcore for Routing:
SequenceEqual performs much better than a naive linear search for any buffer with length >= 8. However, the linear search does better for buffers of length 0-4 (and is on par for 5-7).
It would be great if it was always better, even for smaller buffers. Is that possible to achieve (of course, without regressing perf for larger buffers)?
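To make the trade-off concrete, here is a rough sketch of the two approaches being compared, plus a hybrid that switches between them. This is not the runtime's implementation and not the benchmark code from the issue. The names `SmallBufferCompare`, `NaiveLinearEquals`, `HybridEquals`, and the cutoff constant are hypothetical, and the cutoff of 5 is simply taken from the lengths quoted above.

```csharp
using System;

public static class SmallBufferCompare
{
    // Hypothetical cutoff: the issue reports linear search winning for lengths 0-4.
    private const int SmallLengthThreshold = 5;

    // Naive byte-by-byte comparison, the "linear search" baseline.
    public static bool NaiveLinearEquals(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right)
    {
        if (left.Length != right.Length)
        {
            return false;
        }

        for (int i = 0; i < left.Length; i++)
        {
            if (left[i] != right[i])
            {
                return false;
            }
        }

        return true;
    }

    // Uses the simple loop for very short inputs and defers to
    // MemoryExtensions.SequenceEqual for everything else.
    public static bool HybridEquals(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right)
    {
        if (left.Length < SmallLengthThreshold)
        {
            return NaiveLinearEquals(left, right);
        }

        return left.SequenceEqual(right);
    }
}
```

A wrapper like this adds a branch on every call, so whether it actually helps would have to be measured. The question in this issue is whether the same effect can be achieved inside SequenceEqual itself without regressing larger buffers.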
Specifically Span<byte>.SequenceEqual.
runtime/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.cs
Lines 421 to 437 in 83e30cc
SequenceEqualThreshold benchmark
Benchmark results
This shows up in implementations within the JSON context like comparing known ASCII property names (which tend to be small).
Processing JSON metadata properties benchmark
Benchmark results
@benaadams, @GrabYourPitchforks - got any suggestions?
FYI @jozkee, @steveharter