Base64.Decode: fixed latent bug for invalid input that is less than a block-size #79952
Tagging subscribers to this area: @dotnet/area-system-memory

Issue Details

Repro:

```csharp
ReadOnlySpan<byte> base64 = stackalloc byte[] { (byte)'A', (byte)'B', (byte)'C', (byte)'D' };
Span<byte> data = stackalloc byte[128];
base64 = base64[..3];
OperationStatus status = Base64.DecodeFromUtf8(base64, data, out int consumed, out int written);
Console.WriteLine($"status: {status}, consumed: {consumed}, written: {written}");
```
```diff
@@ -101,7 +101,7 @@ public static unsafe OperationStatus DecodeFromUtf8(ReadOnlySpan<byte> utf8, Spa
             }

             ref sbyte decodingMap = ref MemoryMarshal.GetReference(DecodingMap);
-            srcMax = srcBytes + (uint)maxSrcLength;
+            srcMax = srcBytes + maxSrcLength;
```
The `uint` cast was there to avoid the `movsxd` (on x86), so removing the cast will introduce the `movsxd`. I expect that to be cheaper than having an `if` to guard the loop.

PS: this is one reason why I dislike walking pointers around (`ptr++`) and prefer index-based addressing (`ptr[i]` or `ptr + offset`), as it's clearer where to start and where to end.

Thanks.
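
To make the effect of the cast concrete, here is a minimal sketch. It is not the decoder's code: the class and method names are hypothetical, and the value `-4` is only an assumed example of a negative length. It shows that casting a negative `int` to `uint` zero-extends the bit pattern, so the offset added to the pointer ends up larger than `int.MaxValue`, whereas the uncast `int` is sign-extended and stays negative.

```csharp
using System;

public static class SrcMaxOffsetDemo
{
    // Offset as seen by pointer arithmetic when the length is cast to uint first:
    // the 32-bit pattern is zero-extended, so a negative length becomes a huge positive offset.
    public static long OffsetWithUIntCast(int maxSrcLength) => unchecked((uint)maxSrcLength);

    // Offset without the cast: the int is sign-extended and stays negative.
    public static long OffsetWithoutCast(int maxSrcLength) => maxSrcLength;

    public static void Main()
    {
        // Hypothetical value, assumed for illustration only.
        int maxSrcLength = -4;

        Console.WriteLine($"with cast:    {OffsetWithUIntCast(maxSrcLength)}");
        Console.WriteLine($"without cast: {OffsetWithoutCast(maxSrcLength)}");
        Console.WriteLine($"beyond int.MaxValue: {OffsetWithUIntCast(maxSrcLength) > int.MaxValue}");
    }
}
```

With a sign-extended negative offset, an upper bound computed as `start + offset` lies before `start`, so a loop of the form `while (src < srcMax)` is never entered.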
In the repro, we fill `base64` with four valid base64 bytes, then slice it so that it contains only 3 bytes. Decoding should therefore result in `InvalidData` (which it does), and `consumed` and `written` should both be `0`. Instead they are `4` and `3`, which is wrong, because the decoder read beyond the valid range.

See #79334 (comment) for an investigation of this 🐛.
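
The expectation described above can be written down as a small check. This is only a sketch built around the repro, not the test added in this PR; the class and method names are hypothetical. On a runtime that contains the fix it should report `InvalidData` with `consumed` and `written` both `0`; on an affected runtime the same input reportedly yields `4` and `3`.

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;

public static class ShortInputCheck
{
    // Decodes the first three bytes of a four-byte buffer, mirroring the repro.
    public static (OperationStatus Status, int Consumed, int Written) DecodeFirstThreeOfFour()
    {
        // Four valid base64 bytes live in the buffer, but only three are handed to the decoder.
        ReadOnlySpan<byte> base64 = stackalloc byte[] { (byte)'A', (byte)'B', (byte)'C', (byte)'D' };
        Span<byte> data = stackalloc byte[128];

        OperationStatus status = Base64.DecodeFromUtf8(base64[..3], data, out int consumed, out int written);
        return (status, consumed, written);
    }

    public static void Main()
    {
        var (status, consumed, written) = DecodeFirstThreeOfFour();
        Console.WriteLine($"status: {status}, consumed: {consumed}, written: {written}");

        // Anything other than 0/0 means bytes outside the slice were touched.
        bool stayedInRange = consumed == 0 && written == 0;
        Console.WriteLine($"stayed within the 3-byte slice: {stayedInRange}");
    }
}
```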
As in the loop condition (runtime/src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Decoder.cs, lines 200 to 210 in dca7ee6) `srcMax` is more than `int.MaxValue` away from `src`, the loop may consume a lot of data outside of the valid range iff `[src, src + 4)` contains valid base64-encoded bytes.

There was a test hole: the test `BasicDecodingInvalidInputLength` uses too large a start for its range. Thus a new, specific test for this case (input length < BlockSize) was added.

This 🐛 was introduced with dotnet/corefx#34529 (🙈), so it has been there since .NET Core 3.1.
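And as I happen to know the author of that PR quite well: the `uint` cast was placed there to avoid a `movsxd`.

The repro above is artificial and was constructed to investigate #79334 (comment). In real-world usage it may or may not happen; that depends on the value read at `base64[3]`. If this happens to be a valid base64 byte, the 🐛 manifests.

Even if `InvalidData` is reported correctly, the real problem is that the decoder reads beyond the given / allowed range.

Since this bug has existed for quite some time and we don't have any bug reports for it, I don't consider it critical enough to backport this change.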
is reported correctly, the real problem is that it's read beyond the given / allowed range.Since this bug exists for quite some time now and we don't have any bug-reports for this, I don't assume it's critical enough to backport that change.