Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StringBuilder.Append(ROM<char>) to existing Append(ROS<char>) for perf #27427

Closed
eerhardt opened this issue Sep 19, 2018 · 23 comments
Closed
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime good first issue Issue should be easy to implement, good for first-time contributors help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Milestone

Comments

@eerhardt
Copy link
Member

using System;
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Running;

namespace MyBenchmarks
{
    public class StringBuilderROM
    {
        private readonly ReadOnlyMemory<char> m1 = "xAbCdEfG".AsMemory().Slice(1);
        private readonly ReadOnlyMemory<char> m2 = "xAbCdEfGabcdefghijklmnopqrstuvwxyz".AsMemory().Slice(1);
        private readonly StringBuilder s = new StringBuilder();

        [Benchmark]
        public StringBuilder ShortAppendROM() => s.Clear().Append(m1);

        [Benchmark]
        public StringBuilder ShortAppendROS() => s.Clear().Append(m1.Span);

        [Benchmark]
        public StringBuilder LongAppendROM() => s.Clear().Append(m2);

        [Benchmark]
        public StringBuilder LongAppendROS() => s.Clear().Append(m2.Span);
    }

    public class Program
    {
        public static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<StringBuilderROM>(DefaultConfig.Instance.With(MemoryDiagnoser.Default));
        }
    }
}

Results in:

BenchmarkDotNet=v0.11.1, OS=Windows 10.0.17134.286 (1803/April2018Update/Redstone4)
Intel Core i7-6700 CPU 3.40GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-alpha1-20180720-2
  [Host]     : .NET Core 2.1.3 (CoreCLR 4.6.26725.06, CoreFX 4.6.26725.05), 64bit RyuJIT
  DefaultJob : .NET Core 2.1.3 (CoreCLR 4.6.26725.06, CoreFX 4.6.26725.05), 64bit RyuJIT


         Method |     Mean |     Error |    StdDev |  Gen 0 | Allocated |
--------------- |---------:|----------:|----------:|-------:|----------:|
 ShortAppendROM | 34.48 ns | 0.7666 ns | 0.9968 ns | 0.0171 |      72 B |
 ShortAppendROS | 20.06 ns | 0.2060 ns | 0.1927 ns |      - |       0 B |
  LongAppendROM | 45.51 ns | 0.8662 ns | 0.8103 ns | 0.0305 |     128 B |
  LongAppendROS | 20.78 ns | 0.1742 ns | 0.1629 ns |      - |       0 B |

The reason StringBuilder.Append(ROM<char>) even works is because Append has an overload that takes an object, which just turns around and calls .ToString() on the object. ROM<char> overrides the ToString, and returns string.Substring, which is going to make a copy of the string. Thus resulting in worse performance.

I'm sure I'm not the last person who is going to make this mistake assuming that StringBuilder.Append(ROM<char>) should "just work" as expected.

Proposal

A simple fix for this would be to add another overload to StringBuilder.Append that takes a ReadOnlyMemory<char> and just calls the existing Append(ReadOnlySpan<char>) method, which doesn't need to copy the string before appending.

namespace System.Text
{
    public class StringBuilder
    {
        public StringBuilder Append(ReadOnlyMemory<char> value);
    }
}

/cc @GrabYourPitchforks @ahsonkhan @codemzs

@ViktorHofer
Copy link
Member

Good catch. I would have also added allocation numbers which outline the unnecessary string allocation.

@eerhardt
Copy link
Member Author

Good call. I've updated it with the allocations, and I've also added a "longer" string to show that it gets worse as the string gets longer. ROS barely even grows as the string gets longer. ROM, not so much.

@ViktorHofer
Copy link
Member

Thanks. It would be interesting to find out if we have other performance critical paths in the BCL that accept an object and call ToString on it.

@ahsonkhan
Copy link
Member

@joperezr, @pjanotti - do you have any thoughts before this API is marked as api-ready-for-review?

@joperezr
Copy link
Member

Api looks good to me. Marking it as ready for review.

@danmoseley danmoseley changed the title StringBuilder.Append(ROM<char>) a lot slower than Append(ROS<char>) Add StringBuilder.Append(ROM<char>) to existing Append(ROS<char>) for perf Sep 21, 2018
@GrabYourPitchforks
Copy link
Member

A simple fix for this would be to add another overload to StringBuilder.Append that takes a ReadOnlyMemory and just calls the existing Append(ReadOnlySpan) method, which doesn't need to copy the string before appending.

You may have a GC hole here. Consider the proposed implementation, which logically expands to the below.

public StringBuilder Append(ROM<char> rom) {
    var span = rom.Span;
    /* 'rom' is no longer referenced at this line */
    return Append(span);
}

If the ReadOnlyMemory<char> is backed by a string or a char[], the span will keep the underlying object alive by acting as a GC root, so there's no problem. But if the ReadOnlyMemory<char> is backed by a MemoryManager<char> (such as when pointing to native memory), the span will take on the value of the MemoryManager<char>'s underlying pointer, but there's potentially no longer a GC root to the MemoryManager<char> instance itself. This makes the MemoryManager<char> eligible for garbage collection, including finalization of any underlying SafeHandle instances, potentially while your call to Append(span) is still running.

A GC.KeepAlive would prevent this, as would anything else that forces rom._object to be referenced after the call to Append(span).

@GrabYourPitchforks
Copy link
Member

Also, if @jkotas or @Maoni0 or anybody else could comment on my GC hole theory above, that would be helpful. Perhaps I'm way off base here. But there's certainly no requirement that the caller keep the reference alive, so this seems like a variant on the classic "this gets disposed while a p/invoke is taking place" issue.

@Maoni0
Copy link
Member

Maoni0 commented Sep 21, 2018

@stephentoub could probably take a look?

@jkotas
Copy link
Member

jkotas commented Sep 21, 2018

Memory<T> is a fundamentally unsafe type, so one has to be really careful about lifetimes.

There is no GC hole in the fragment at the top (when taken as a whole) because of the Memory<T> is never created from unmanaged memory. The lifetime bugs only happen when Memory<T> is created from unmanaged memory (or manually managed pooled memory).

@GrabYourPitchforks I believe that you have written the rules that have to be followed for safe use of Memory<T>. Have they make it into the documentation?

IIRC, the rules said that the caller is responsible for ensuring the correct lifetime. It is actually not very practical to call GC.KeepAlive for the callee because of a simple GC.KeepAlive(rom) would cause boxing.

@ahsonkhan
Copy link
Member

I believe that you have written the rules that have to be followed for safe use of Memory<T>. Have they make it into the documentation?

It's a work in progress: dotnet/docs#4823

cc @rpetrusha, @mairaw

@ThadHouse
Copy link
Contributor

For the GC.KeepAlive boxing issue, what about adding a generic GC.KeepAlive that could take structs?

@GrabYourPitchforks
Copy link
Member

@jkotas They were more guidelines rather than strict rules. But perhaps you're right - this is more of a problem for the caller to handle and the callees shouldn't be responsible for handling it. I'll spin up a separate conversation about it so that we don't derail the discussion for this particular work item.

@danmoseley
Copy link
Member

@GrabYourPitchforks did clarity emerge sufficient to take this further (or not)?

@GrabYourPitchforks
Copy link
Member

@danmosemsft Don't block this work item on my comment.

@danmoseley
Copy link
Member

OK let's get this through review then
cc @eerhardt

@terrajobst
Copy link
Member

We asked ourselves whether we need both overloads (ReadOnlyMemory<T> and Memory<T>). It seems the answer is we don't need the Memory<T> one because the compiler prefers the implicit conversion to ReadOnlyMemory<T> over the conversion to object.

Approved as proposed.

@danmoseley
Copy link
Member

@eerhardt I guess we can mark this up for grabs?

@eerhardt
Copy link
Member Author

Yep - done.

@Wraith2
Copy link
Contributor

Wraith2 commented Nov 2, 2018

This is easy to add but how do you test it? If you don't have the ROM overload it'll just fallback to the object overload which is the reason there's a problem in the first place and it'll succeed anyway. Should the test do a reflection check to find the specific overload wanted is present and assert if it isn't?

@stephentoub
Copy link
Member

stephentoub commented Nov 2, 2018

This is easy to add but how do you test it?

The only benefit of the overload is performance, so you'd validate that benefit with a performance test. Functionally we'd make sure we have StringBuilder.Append coverage that passes in both object and ReadOnlyMemory, and make sure they both behave correctly.

Should the test do a reflection check to find the specific overload wanted is present and assert if it isn't?

The build system can do that at compile time, failing if the method is in the ref but not in the src. Reflection should not be used.

@Wraith2
Copy link
Contributor

Wraith2 commented Nov 2, 2018

StirngBuilder is in corelib so no ref project. How and where is the exposed surface checked?
I can add a perf test in System.Runtime.Performance.Tests but that will just start accumulating statistics allowing us to see future degradations. It won't validate that it compares favourably to object

@stephentoub
Copy link
Member

How and where is the exposed surface checked?

In corefx.

It won't validate that it compares favourably to object

Add both and compare.

@ViktorHofer
Copy link
Member

You can use https://apisof.net/ to check in which assembly an API is exposed. StringBuilder is exposed in System.Runtime/ref, implemented in System.Private.CoreLib, tests for it are in System.Runtime.Tests and performance tests in System.Runtime.Performance.Tests.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime good first issue Issue should be easy to implement, good for first-time contributors help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests