-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
StringBuilder.Replace with ReadOnlySpan<char> #93938
StringBuilder.Replace with ReadOnlySpan<char> #93938
Conversation
Converted the StringBuilder.Replace(string, string, int, int) method and underlaying methods to take ReadOnlySpan<char> as arguments instead. The Replace methods with string arguments create spans to use the same new method. Fix dotnet#77837
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
Tagging subscribers to this area: @dotnet/area-system-runtime Issue DetailsConverted the String-based tests still pass, as do the added ones for the Fix #77837
|
src/libraries/System.Private.CoreLib/src/System/Text/StringBuilder.cs
Outdated
Show resolved
Hide resolved
Moved null checks to the string-specific overloads, as spans won't be null. (PR feedback)
Re benchmarking, there are docs in dotnet/performance repo that may help. |
src/libraries/System.Private.CoreLib/src/System/Text/StringBuilder.cs
Outdated
Show resolved
Hide resolved
Fixed code-style failure and implemented further feedback.
I had found that documentation, @danmoseley, however when trying to run any benchmarks it kept complaining about SDK mismatches, which I wasn't able to solve for several days. |
OK, I don't work in this repo day to day any more, so @stephentoub may be more useful as he does this kind of benchmarking all the time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code LGTM, but it needs a little of polishing before it's approved. PTAL at my comments.
however when trying to run any benchmarks it kept complaining about SDK mismatches, which I wasn't able to solve for several days
You can use the python 3 script provided by the dotnet/performance repository. It's going to download the SDK for you and run the benchmarks:
git clone https://github.com/dotnet/performance.git
cd performance
python3 .\scripts\benchmarks_ci.py -f net8.0 --filter '*yourFilter*' --corerun $pathToCoreRunWithoutChanges $pathToCoreRunWithChanges
But one thing to keep in mind is that benchmarking new APIs is PITA. To make your life easier, you can simply benchmark the existing string-based overload:
public StringBuilder Replace(string oldValue, string? newValue, int startIndex, int count)
To do that you will need to add new benchmarks to https://github.com/dotnet/performance/blob/main/src/benchmarks/micro/libraries/System.Text/Perf.StringBuilder.cs as we currently don't have benchmarks for this particular overload.
If you can't run the benchmarks or simply don't have the time to do it, please let me know.
@TheMaximum thank you for your contribution!
src/libraries/System.Runtime/tests/System/Text/StringBuilderTests.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's almost ready 👍 Thank you @TheMaximum !
src/libraries/System.Runtime/tests/System/Text/StringBuilderReplaceTests.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Runtime/tests/System/Text/StringBuilderReplaceTests.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you @TheMaximum !
@adamsitnik, I've been able to run the benchmarks using the following command:
Created a benchmark as follows: [Benchmark]
public StringBuilder Replace_Strings()
{
StringBuilder builder = new StringBuilder("initialvalue");
builder.Replace("initial", "new");
builder.Replace("value", "text");
return builder;
} The results are as follows below ( I'm not 100% sure I'm reading this right, and also not sure if the benchmark is representative. Don't know if these results are satisfactory, or maybe you want to give some pointers (or run some yourself)? Not sure if these benchmarks are necessary either way (but wanted to attempt it anyway)...
A second run seems to produce similar results:
|
@@ -1950,6 +1961,23 @@ public bool Equals(ReadOnlySpan<char> span) | |||
/// are removed from this builder. | |||
/// </remarks> | |||
public StringBuilder Replace(string oldValue, string? newValue, int startIndex, int count) | |||
{ | |||
ArgumentException.ThrowIfNullOrEmpty(oldValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can just be ThrowIfNull. The called method has the check for empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would change the exception message (I think) in case it's an empty string (SR.Arg_EmptySpan
instead of SR.Argument_EmptyString
), I don't know if that would be an issue? It might be confusing to the user who's using strings to get an exception message indicating an empty span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if the benchmark is representative
The benchmark you have provided creates a relatively small StringBuilder
that is internally backed by a single segment and performs two simple replace operations that do find the strings and replace them with shorter values.
It's OK, but we need more scenarios to be covered.
Ideally we would need:
-
StrignBuilders that are represented by one and many internal segments. You can take a look at https://github.com/dotnet/performance/blob/b685a3f2e8381c872576aee81fc88730bbba0e9c/src/benchmarks/micro/libraries/System.Text/Perf.StringBuilder.cs#L39-L44 and https://github.com/dotnet/performance/blob/b685a3f2e8381c872576aee81fc88730bbba0e9c/src/benchmarks/micro/libraries/System.Text/Perf.StringBuilder.cs#L51-L56 to see how we create such builders.
-
Replace operations that do the following:
- don't find anything to replace
- replace, but:
- the new value is shorter than the old one
- the new value is longer than the old one
Once we have that you can tell the python script to run the benchmarks more than once by adding --bdn-arguments "--launchCount 6 --memoryRandomization true"
. It will run every benchmark six times and try to apply some memory randomization (so the alignment of internal string builder buffer is different every time)
@adamsitnik, I've attempted to create the (micro)benchmarks for the cases you've described: [Benchmark]
[Arguments(100)]
[Arguments(LOHAllocatedStringSize)]
public StringBuilder Replace_Strings_Simple(int length)
{
StringBuilder builder = new StringBuilder(new string('a', length));
builder.Replace("a", "b");
return builder;
}
[Benchmark]
[Arguments(100)]
[Arguments(LOHAllocatedStringSize)]
public StringBuilder Replace_Strings_NoReplace(int length)
{
StringBuilder builder = new StringBuilder(new string('a', length));
builder.Replace("b", "test");
return builder;
}
[Benchmark]
[Arguments(100)]
[Arguments(LOHAllocatedStringSize)]
public StringBuilder Replace_Strings_ReplaceShorter(int length)
{
StringBuilder builder = new StringBuilder();
builder.Insert(0, "ab", (length / 2));
builder.Replace("ab", "a");
return builder;
}
[Benchmark]
[Arguments(100)]
[Arguments(LOHAllocatedStringSize)]
public StringBuilder Replace_Strings_ReplaceLonger(int length)
{
StringBuilder builder = new StringBuilder();
builder.Insert(0, "ab", (length / 2));
builder.Replace("ab", "zab");
return builder;
} I got the following results with executing them 6 times and applying memory randomization. To be honest, I'd like to leave the conclusions to you - as I'm not so sure I can make any myself (and I'm not 100% sure the benchmarks are correct)...
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benchmark numbers LGTM, thank you for sharing the results @TheMaximum !
I've found two places where we could improve the comments. I'll apply my suggestions and merge the PR. Thanks!
src/libraries/System.Private.CoreLib/src/System/Text/StringBuilder.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Text/StringBuilder.cs
Outdated
Show resolved
Hide resolved
…-replace-readonlyspan # Conflicts: # src/libraries/System.Runtime/tests/System.Runtime.Tests/System/Text/StringBuilderReplaceTests.cs
Converted the
StringBuilder.Replace(string, string, int, int)
method and underlaying methods to takeReadOnlySpan<char>
as arguments instead. TheReplace
methods with string arguments are cast to spans (calling.AsSpan()
) to use the same new method.String-based tests still pass, as do the added ones for the
ReadOnlySpan<char>
variants. Was looking to see if I can somehow benchmark the implementations, but I can't figure out how to run these against the self-built runtime.Fix #77837