-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Stream.ReadAtLeast perf regression in DataContractSerializer #69879
Conversation
dotnet#69272 changed DCS to no longer call Stream.Read inside a loop, but instead call the new ReadAtLeast method. ReadAtLeast only takes a Span<byte>, and not a byte[]. This caused a regression because the internal encoding stream wrapper classes don't override Read(Span<byte>), so the base implementation is used. The base implementation is slower because it needs to rent a byte[] from the pool, and do 2 copies. Overriding Read(Span<byte>) on the internal encoding stream wrapper classes allows ReadAtLeast to be just as fast. Fix dotnet#69730
{ | ||
try | ||
{ | ||
if (_byteCount == 0) | ||
{ | ||
if (_encodingCode == SupportedEncoding.UTF8) | ||
{ | ||
return _stream.Read(buffer, offset, count); | ||
return _stream.Read(buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that _stream
is always a BufferedStream
, which also overrides Read(Span<byte>)
, so we shouldn't have the same problem here.
{ | ||
try | ||
{ | ||
if (_byteCount == 0) | ||
{ | ||
if (_encodingCode == SupportedEncoding.UTF8) | ||
return _stream.Read(buffer, offset, count); | ||
return _stream.Read(buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, EncodingStreamWrapper isn't guaranteed to be a BufferedStream
- I'm not sure why these two encoding wrappers differ. The _stream
here is the Stream that was passed into the public API. So if the caller passes a Stream
that doesn't override Read(Span<byte>)
, this could run into the same problem.
public override int Read(byte[] buffer, int offset, int count) => | ||
Read(new Span<byte>(buffer, offset, count)); | ||
|
||
public override int Read(Span<byte> buffer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can/should we override Write(span) on each of these streams as well?
Are there other streams in dotnet/runtime for which we should provide these overrides?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can/should we override Write(span) on each of these streams as well?
Maybe? It could prevent future regressions, like the one I made here.
Looking at where these streams are used for writing, they only get instantiated when the Encoding is not UTF8 (which is different than read - where they always are used). And looking through all the "Write" methods to these streams, they all pass an array, offset, and count. I don't think I should add it since they will never get used. Thoughts?
Are there other streams in dotnet/runtime for which we should provide these overrides?
I looked through all the places that override Read(byte[] ...)
and looked to see if they override Read(Span)
as well.
Here's are ones I think we should add the additional override to:
- Passed to
BinaryReader
/ XmlReader / other place I changed to call ReadAtLeast / ReadExactly:
Line 87 in 472050d
public override int Read(byte[] buffer, int offset, int count) |
public override int Read(byte[] buffer, int offset, int count) |
Line 67 in 472050d
public override int Read(byte[] buffer, int offset, int count) |
Line 31 in 472050d
public override int Read(byte[] buffer, int offset, int count) |
- These get returned through public methods, so they probably should:
public override int Read(byte[] buffer, int offset, int count) |
runtime/src/libraries/System.IO.Packaging/src/System/IO/Packaging/ZipWrappingStream.cs
Line 61 in 472050d
public override int Read( |
public override int Read(byte[] buffer, int offset, int size) |
Here are the ones I'm not sure we need to:
- Only appears to be used by
FtpControlStream
, I can't tell if anyone ever calls Read with a Span. Looks easy to do though.
runtime/src/libraries/System.Net.Requests/src/System/Net/NetworkStreamWrapper.cs
Line 137 in 472050d
public override int Read(byte[] buffer, int offset, int size) |
- Has a lot of derived Stream classes, but I don't see them ever getting
Read(Span)
call on them. This also looks easy to do.
public override int Read(byte[] buffer, int offset, int count) |
Here's the ones I don't think we need to:
- Only gets BeginRead and EndRead called on it
runtime/src/libraries/System.Net.HttpListener/src/System/Net/HttpRequestStream.cs
Line 16 in 472050d
public override int Read(byte[] buffer, int offset, int count) |
- Only used for writing
runtime/src/libraries/System.IO.Packaging/src/System/IO/Packaging/IgnoreFlushAndCloseStream.cs
Line 124 in 472050d
public override int Read(byte[] buffer, int offset, int count) |
- Only do "async" reading, so we can't do anything with a Span
runtime/src/libraries/System.Net.Security/src/System/Net/Security/NegotiateStream.cs
Line 305 in 472050d
public override int Read(byte[] buffer, int offset, int count) |
public override int Read(byte[] buffer, int offset, int count) |
runtime/src/libraries/System.Security.Cryptography/src/System/Security/Cryptography/CryptoStream.cs
Line 292 in 472050d
public override int Read(byte[] buffer, int offset, int count) |
public override int Read(byte[] array, int offset, int count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the overrides for the first 4 locations above, and 2 of the next 3. The SqlBytes one is a little more involved, so I decided to open a separate issue for it - #69921.
…'t currently override the Span overloads.
@@ -424,6 +424,22 @@ public override int Read(byte[] buffer, int offset, int count) | |||
return iBytesRead; | |||
} | |||
|
|||
// Duplicate the Read(byte[]) logic here instead of refactoring both to use Spans | |||
// in case the backing _stream doesn't override Read(Span). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In most of our library code we've just assumed that any stream that cares about perf will end up overriding the relevant methods, and so for example wrapper streams like DeflateStream just use, say, ReadAsync(Memory)
rather than having two code paths. But if we think it's worthwhile here to special-case it, maybe because we expect streams wrapped here to be more legacy, that's ok. Eventually it'd be nice to consolidate, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was my thinking - this stream can come from a public constructor parameter, and this type is rather old - so older code could be calling it.
I took a quick look, and the first place I found actually has this problem:
internal SqlXml ToSqlXml()
{
SqlXml sx = new(ToStream());
return sx;
}
internal Stream ToStream()
{
return new SqlCachedStream(this);
}
and SqlCachedStream
doesn't override the Span-based API.
I've opened dotnet/SqlClient#1628 for the Sql team to consider providing Span-based overrides.
#69272 changed DCS to no longer call Stream.Read inside a loop, but instead call the new ReadAtLeast method. ReadAtLeast only takes a
Span<byte>
, and not abyte[]
. This caused a regression because the internal encoding stream wrapper classes don't overrideRead(Span<byte>)
, so the base implementation is used. The base implementation is slower because it needs to rent abyte[]
from the pool, and do 2 copies.Overriding
Read(Span<byte>)
on the internal encoding stream wrapper classes allows ReadAtLeast to be just as fast.Fix #69730