Fix offset + globalization issues in StringSegment #45022
Tagging subscribers to this area: @eerhardt, @maryamariyan
Issue Details
Fixes #39140.
I also experimented with adding nullable annotations to this type to get some additional error checking, but I ended up reverting that commit because it was out of scope of this work.
Marked draft because I haven't run any tests whatsoever over this code. It's just a demonstration of how I'm thinking of solving the problem.
Performance tests!! 😄
@davidfowl There are lots of perf wins we could get from this type if we were willing to change some of its behavior. I leave that to a different issue.
Yeah, I just don't want regressions; this is in a hot path.
What methods specifically do you need optimized? The only one that is likely to regress is
Reiterating my earlier comment: if this is really in a hot path, seriously consider allowing some behavioral changes here. For instance, the fact that there's special-casing all over the place to distinguish between
Allowing this change in behavior would make existing methods dumb wrappers around the already highly-optimized span versions. For example:
// Pretty much the whole call stack gets inlined into the caller at this point.
public bool Equals(StringSegment other) => this.AsSpan().SequenceEqual(other.AsSpan());
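To make the "dumb wrapper" idea concrete, here is a minimal sketch. The `Segment` struct below is a hypothetical stand-in written for illustration, not the real `StringSegment`; only `MemoryExtensions.AsSpan` and `MemoryExtensions.SequenceEqual` are actual library calls. It assumes equality may be defined purely by the characters in the span.

```csharp
using System;

// Hypothetical stand-in for StringSegment; illustration only.
public readonly struct Segment
{
    private readonly string _buffer;
    private readonly int _offset;
    private readonly int _length;

    public Segment(string buffer, int offset, int length)
    {
        if (buffer is null) throw new ArgumentNullException(nameof(buffer));
        // Unsigned comparisons reject negative values and out-of-range values in one test each.
        if ((uint)offset > (uint)buffer.Length || (uint)length > (uint)(buffer.Length - offset))
            throw new ArgumentOutOfRangeException();
        _buffer = buffer;
        _offset = offset;
        _length = length;
    }

    // For default(Segment) the buffer is null; AsSpan(null, 0, 0) returns an empty span.
    public ReadOnlySpan<char> AsSpan() => _buffer.AsSpan(_offset, _length);

    // Ordinal, character-by-character equality delegated to the span helper.
    public bool Equals(Segment other) => AsSpan().SequenceEqual(other.AsSpan());
}

public static class SegmentDemo
{
    public static void Main()
    {
        var a = new Segment("hello world", 6, 5);
        var b = new Segment("world", 0, 5);
        Console.WriteLine(a.Equals(b));                                    // True
        Console.WriteLine(default(Segment).Equals(new Segment("", 0, 0))); // True
    }
}
```

Note that in this sketch a default segment and an empty segment compare equal, which is exactly the kind of behavioral change that would need to be agreed on before simplifying the real type.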
I haven't looked deeply enough to tell if that change will have an impact. I'm just saying these types are only used in ASP.NET Core, so make sure you do some performance testing.
@maryamariyan @eerhardt this is now ready for review. I've added a bunch of unit tests to check various edge cases, especially with respect to globalization. I've also added extra argument checks around the Equals and Contains fast-paths to mirror the checks that |
fixed (char* p = Buffer)
int i;
for (i = span.Length - 1; (uint)i < (uint)span.Length; i--)
(uint)i < (uint)span.Length
Is this micro optimization worth the mental exercise someone reading this code has to go through? Honestly, this might be the first time I've ever seen a loop written this way.
It's to elide the bounds check, but we could write it using standard i >= 0 syntax if desired. Last I checked, the JIT still emitted the bounds check, which would make the "simple" way a perf regression compared to the existing code.
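For readers unfamiliar with the idiom, the following sketch shows why the unsigned comparison still terminates a descending loop: once `i` drops to -1, the cast to `uint` wraps to 4294967295, which is never less than the span length. The class and method names are hypothetical, chosen for illustration; the sketch says nothing about what the JIT actually emits.

```csharp
using System;

public static class ReverseLoopDemo
{
    // Descending loop using the unsigned-comparison idiom from the diff.
    public static int LastWhiteSpaceUnsigned(ReadOnlySpan<char> span)
    {
        for (int i = span.Length - 1; (uint)i < (uint)span.Length; i--)
        {
            if (char.IsWhiteSpace(span[i]))
            {
                return i;
            }
        }
        return -1;
    }

    // The same loop with the conventional signed condition.
    public static int LastWhiteSpaceSigned(ReadOnlySpan<char> span)
    {
        for (int i = span.Length - 1; i >= 0; i--)
        {
            if (char.IsWhiteSpace(span[i]))
            {
                return i;
            }
        }
        return -1;
    }

    public static void Main()
    {
        int minusOne = -1;
        // In an unchecked context, -1 reinterpreted as uint is uint.MaxValue.
        Console.WriteLine(unchecked((uint)minusOne)); // 4294967295

        foreach (string s in new[] { "", "abc", "a b c", " x" })
        {
            Console.WriteLine(LastWhiteSpaceUnsigned(s) == LastWhiteSpaceSigned(s)); // True
        }
    }
}
```

Both loops visit the same indices in the same order, so the choice between them is purely about readability and generated code.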
I'm definitely not an expert here, so double-check my work to make sure I didn't do anything wrong. But the current JIT'd code looks worse to me.
UPDATE: I found my first mistake - it defaults to Debug. Updated the JIT'd code to Release and they look the same to me.
public static bool M1(string a)
{
    var span = a.AsSpan();
    int i;
    for (i = span.Length - 1; (uint)i < (uint)span.Length; i--)
    {
        if (char.IsWhiteSpace(span[i]))
        {
            return true;
        }
    }
    return false;
}

public static bool M2(string a)
{
    var span = a.AsSpan();
    int i;
    for (i = span.Length - 1; i >= 0; i--)
    {
        if (char.IsWhiteSpace(span[i]))
        {
            return true;
        }
    }
    return false;
}
That's x86 debug output. I'm trying to get x64 release output but it'll take a few minutes.
Edit: Looks like x64 release is nearly identical between the two cases. I'll do the simple thing.
Just one comment on code that I found surprising.
LGTM. Nice work on filling out the unit tests for this type.
I had forgotten that netfx (and netcore before 5.0) have some bugs w.r.t. the globalization IsPrefix and IsSuffix methods. I've suppressed on netfx the test cases which exercise those edge conditions.
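As a rough illustration of the methods being discussed, the sketch below calls `CompareInfo.IsPrefix` and `CompareInfo.IsSuffix` with both ordinal and linguistic options. The choice of the soft hyphen (U+00AD) as a probe character is an assumption made for this example and is not taken from the PR's test cases; the linguistic results are printed without an expected value because they can differ between runtimes and globalization backends.

```csharp
using System;
using System.Globalization;

public static class PrefixSuffixDemo
{
    public static void Main()
    {
        CompareInfo compareInfo = CultureInfo.InvariantCulture.CompareInfo;

        // Ordinal comparisons look only at UTF-16 code units, so results are stable.
        Console.WriteLine(compareInfo.IsPrefix("Hello", "He", CompareOptions.Ordinal));            // True
        Console.WriteLine(compareInfo.IsPrefix("Hello", "he", CompareOptions.Ordinal));            // False
        Console.WriteLine(compareInfo.IsSuffix("Hello", "LLO", CompareOptions.OrdinalIgnoreCase)); // True

        // Linguistic comparisons may treat some characters as ignorable.
        // The output here depends on the runtime and its globalization backend.
        string softHyphen = "\u00AD";
        Console.WriteLine(compareInfo.IsPrefix("Hello", softHyphen, CompareOptions.None));
        Console.WriteLine(compareInfo.IsSuffix("Hello", softHyphen, CompareOptions.None));
    }
}
```

Edge conditions of this kind are where a test suite may need per-runtime expectations or suppressions.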
Opened #47374 to track CI failures. Continuing merge since CI failures are unrelated.
@GrabYourPitchforks did you do any performance testing? @sebastienros Let's keep an eye out here for regressions since ASP.NET Core is the only consumer of this.
@davidfowl No formal benchmarking, but I looked at the codegen for hot methods like |