Reduce allocations in string.Normalize #34774

Merged: 9 commits, Apr 10, 2020

@MihaZupan MihaZupan commented Apr 9, 2020

While string Normalize(string) is inherently an allocating API, we can avoid allocating the temporary char[] buffer for the P/Invoke.

These char[] allocations account for ~2% of allocated bytes when using HttpClient with a non-ASCII Uri in my simple test. The allocated result string (even if unchanged) accounts for another ~1.5%.

Is there a common stackalloc threshold we should agree upon across runtime? Highest I've seen so far is 1024 chars. I used 512 here as that is what IdnMapping uses as well.
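
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a stack buffer with a pooled fallback. It is not the code from this PR: `ScratchBufferSketch`, `ToUpperWithScratch`, and the use of `ToUpperInvariant` as a stand-in for the native normalization call are assumptions made for illustration. Only the 512-char threshold comes from the description above.

```csharp
using System;
using System.Buffers;

public static class ScratchBufferSketch
{
    // Threshold taken from the PR description (512 chars); the right value is what this thread debates.
    private const int StackallocThreshold = 512;

    // Hypothetical helper: upper-casing stands in for a call that fills a caller-supplied buffer.
    public static string ToUpperWithScratch(string input)
    {
        char[]? rented = null;

        // Small inputs use the stack; larger ones rent from the shared pool
        // instead of allocating a fresh char[] on every call.
        Span<char> buffer = input.Length <= StackallocThreshold
            ? stackalloc char[StackallocThreshold]
            : (rented = ArrayPool<char>.Shared.Rent(input.Length));

        int written = input.AsSpan().ToUpperInvariant(buffer);
        string result = buffer.Slice(0, written).ToString();

        // Return the rented array only if the pooled path was taken.
        if (rented != null)
        {
            ArrayPool<char>.Shared.Return(rented);
        }

        return result;
    }
}
```

The result string is still allocated, so this only removes the temporary buffer allocation, which is the part the description says can be avoided.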

@MihaZupan MihaZupan added this to the 5.0 milestone Apr 9, 2020
@MihaZupan MihaZupan requested review from stephentoub and tarekgh April 9, 2020 18:02
ghost commented Apr 9, 2020

Tagging @tarekgh, @safern as area owners

danmoseley commented Apr 9, 2020

> Is there a common stackalloc threshold we should agree upon across runtime? Highest I've seen so far is 1024 chars. I used 512 here as that is what IdnMapping uses as well.

This one is 4096 (bytes, not chars)

byte* pBuffer = stackalloc byte[PROC_PIDPATHINFO_MAXSIZE];

I assume we should pick a number that is suitable everywhere since we generally do not know how much stack we have - and presumably it's the same number on all platforms for simplicity.

@stephentoub are you comfortable with 4096? 1024?

stephentoub commented

We've generally tried to not go above 1K. 4K feels like a lot, especially on macOS where the default stack size is lower than other platforms. In some cases there's a natural upper-bound based on the scenario, and we've been a bit more liberal in those cases. In others where it's arbitrary, we've talked about just standardizing on a size that we use everywhere, e.g. 256 bytes or 128 chars. I would still like us to go through and do such a standardization where appropriate, we've just not prioritized it.
cc: @jkotas

jkotas commented Apr 9, 2020

> I would still like us to go through and do such a standardization where appropriate, we've just not prioritized it.

It should be a new API that will do the right thing for the given context and platform without your worrying about it: #25423

{
return new string(buf, 0, realLen);
ArrayPool<char>.Shared.Return(toReturn);

Heads up: we generally don't return buffers to the shared pool inside finally blocks. The sole exception is where we're in 100% control of every single code path that might be invoked inside the preceding try.

> Heads up: we generally don't return buffers to the shared pool inside finally blocks. The sole exception is where we're in 100% control of every single code path that might be invoked inside the preceding try.

We're inconsistent on this, e.g.

private async Task CopyToAsyncInternal(Stream destination, int bufferSize, CancellationToken cancellationToken)
{
    byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
    try
    {
        while (true)
        {
            int bytesRead = await ReadAsync(new Memory<byte>(buffer), cancellationToken).ConfigureAwait(false);
            if (bytesRead == 0) break;
            await destination.WriteAsync(new ReadOnlyMemory<byte>(buffer, 0, bytesRead), cancellationToken).ConfigureAwait(false);
        }
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer);
    }
}

I know, but we should try to keep it in check for new code (like this). The GitHub UI is misbehaving for me so I can't see the rest of the code. So this usage might not be problematic at all, but somebody should double-check.
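
The guideline discussed in this thread can be sketched as follows. This is an illustration, not code from the PR: `PooledReturnSketch` and `UseRented` are hypothetical names. The rationale in the comments is an assumption and is not spelled out in this thread: a buffer that is dropped after an exception is simply collected, whereas one returned to the pool while other code may still reference it could be handed to a second renter.

```csharp
using System;
using System.Buffers;

public static class PooledReturnSketch
{
    // Hypothetical helper: runs caller-supplied code against a rented buffer.
    public static int UseRented(int minimumLength, Func<byte[], int> work)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(minimumLength);

        // Deliberately no try/finally: 'work' is code this helper does not control.
        // If it throws, the buffer is not returned to the pool and is left for the GC
        // (assumed rationale: avoid pooling a buffer that may still be referenced).
        int result = work(buffer);

        // Success path only: return the buffer once nothing is using it.
        ArrayPool<byte>.Shared.Return(buffer);
        return result;
    }
}
```

The cost of this choice is that an exception forfeits one pooled array, which the pool tolerates; renting never requires a matching return.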

MihaZupan commented

Added a check to return the original string if normalization didn't change anything (which should also be the common case).
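
A minimal sketch of such a check, assuming the normalized characters are available as a span. `UnchangedCheckSketch` and `ToStringOrOriginal` are hypothetical names, not the ones used in the PR:

```csharp
using System;

public static class UnchangedCheckSketch
{
    // Hypothetical helper: reuse the original instance when the transformed
    // characters are identical, so no new string is allocated in that case.
    public static string ToStringOrOriginal(string original, ReadOnlySpan<char> transformed)
    {
        return transformed.SequenceEqual(original.AsSpan())
            ? original
            : transformed.ToString();
    }
}
```

The comparison costs one pass over the characters, which is cheaper than allocating and copying a new string when the input was already in the requested form.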

}
int lastError = Marshal.GetLastWin32Error();

Is this valid even if realLength > 0?

(I see the above comment now)

@stephentoub stephentoub left a comment

One question about comparing to Interop.BOOL.TRUE. Otherwise LGTM.

@jkotas jkotas merged commit cd185d9 into dotnet:master Apr 10, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
6 participants