-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce allocations in string.Normalize #34774
Conversation
Tagging @tarekgh, @safern as an area owner |
src/libraries/Common/src/Interop/Unix/System.Globalization.Native/Interop.Normalization.cs
Outdated
Show resolved
Hide resolved
This one is 4096 (bytes, not chars)
I assume we should pick a number that is suitable everywhere since we generally do not know how much stack we have - and presumably it's the same number on all platforms for simplicity. @stephentoub are you comfortable with 4096? 1024? |
We've generally tried to not go above 1K. 4K feels like a lot, especially on macOS where the default stack size is lower than other platforms. In some cases there's a natural upper-bound based on the scenario, and we've been a bit more liberal in those cases. In others where it's arbitrary, we've talked about just standardizing on a size that we use everywhere, e.g. 256 bytes or 128 chars. I would still like us to go through and do such a standardization where appropriate, we've just not prioritized it. |
src/libraries/Common/src/Interop/Unix/System.Globalization.Native/Interop.Normalization.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Globalization/Normalization.Unix.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Globalization/Normalization.Unix.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Globalization.Native/Interop.Normalization.cs
Outdated
Show resolved
Hide resolved
It should be a new API that will do the right thing for the given context and platform without your worrying about it: #25423 |
src/libraries/Common/src/Interop/Unix/System.Globalization.Native/Interop.Normalization.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Globalization/Normalization.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Globalization.Native/Interop.Normalization.cs
Outdated
Show resolved
Hide resolved
{ | ||
return new string(buf, 0, realLen); | ||
ArrayPool<char>.Shared.Return(toReturn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heads up: we generally don't return buffers to the shared pool inside finally blocks. The sole exception is where we're in 100% control of every single code path that might be invoked inside the preceding try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heads up: we generally don't return buffers to the shared pool inside finally blocks. The sole exception is where we're in 100% control of every single code path that might be invoked inside the preceding try.
We're inconsistent on this, e.g.
runtime/src/libraries/System.Private.CoreLib/src/System/IO/Stream.cs
Lines 114 to 130 in 26b6e4e
private async Task CopyToAsyncInternal(Stream destination, int bufferSize, CancellationToken cancellationToken) | |
{ | |
byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize); | |
try | |
{ | |
while (true) | |
{ | |
int bytesRead = await ReadAsync(new Memory<byte>(buffer), cancellationToken).ConfigureAwait(false); | |
if (bytesRead == 0) break; | |
await destination.WriteAsync(new ReadOnlyMemory<byte>(buffer, 0, bytesRead), cancellationToken).ConfigureAwait(false); | |
} | |
} | |
finally | |
{ | |
ArrayPool<byte>.Shared.Return(buffer); | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, but we should try to get it under check for new code (like this). The GitHub UI is misbehaving for me so I can't see the rest of the code. So this usage might not be problematic at all but somebody should double check.
src/libraries/System.Private.CoreLib/src/System/Globalization/Normalization.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Globalization/Normalization.Unix.cs
Outdated
Show resolved
Hide resolved
Added a check to return the original string if normalization didn't change anything (which should be the common-case as well) |
src/libraries/System.Private.CoreLib/src/System/Globalization/Normalization.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Globalization/Normalization.Unix.cs
Outdated
Show resolved
Hide resolved
} | ||
int lastError = Marshal.GetLastWin32Error(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this valid even if realLength > 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I see the above comment now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question about comparing to Interop.BOOL.TRUE. Otherwise LGTM.
While inherently an allocatey API (
string Normalize(string)
), we can avoid allocating the temporary char[] buffer for the P/Invoke.These char[] allocations account for ~2% of allocated bytes when using HttpClient with a non-ascii Uri in my simple test. The allocated result (even if unchanged) another ~1.5%.
Is there a common stackalloc threshold we should agree upon across runtime? Highest I've seen so far is 1024 chars. I used 512 here as that is what IdnMapping uses as well.