Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/9.0-rc2] NRBF Fuzzer and bug fixes #107788

Merged
merged 8 commits into from
Sep 17, 2024

Conversation

adamsitnik
Copy link
Member

@adamsitnik adamsitnik commented Sep 13, 2024

This PR combines the PRs that added Fuzzing for NrbfDecoder with the most recent bug fixes.

It contains of:

  1. [NRBF] Don't use Unsafe.As when decoding DateTime(s) #105749 where we have stopped using unsafe cast from ulong to DateTime and are now compliant
  2. Add NrbfDecoderFuzzer #107385 where we have added an excellent Fuzzer that has discovered plenty of edge case bugs
  3. [NRBF] Fix bugs discovered by the fuzzer #107368 where we have fixed some of the bugs discovered by the Fuzzer, mainly around enum parsing and throwing documented exceptions
  4. [NRBF] throw SerializationException when a surrogate character is read  #107532 where we have started throwing SerializationException rather than ArgumentException for surrogate characters
  5. [NRBF] Fuzzing non-seekable stream input #107605 where we have extended the Fuzzer to cover non-seekable Stream code path
  6. [NRBF] More bug fixes #107682 where we have removed int32 overflow, stopped using Debug.Fail and limited the max array length to Array.MaxLength (so far we were doing that only for single dimension arrays, now we do it for multi dimensional arrays as well)
  7. [NRBF] Comments and bug fixes from security review #107735 where we have stopped doing unsafe tricks to read arrays of boolean and DateTime, added null handling to SerializationRecord.TypeNameMatches copied a lot of valuable comments from the internal code review
  8. [NRBF] Address issues discovered by Threat Model  #106629 where we have addressed issues discovered by Threat Model by exposing a new API that allows the users to check the total number of elements of any array record, including potentially hostile jagged arrays.

Customer Impact

  • Customer reported
  • Found internally

Half of the bugs were discovered by the Fuzzer, the other half were reported internally by @GrabYourPitchforks.

Regression

  • Yes
  • No

[If yes, specify when the regression was introduced. Provide the PR or commit if known.]

Testing

All bugs have been turned into unit tests (and of course are passing now).

Risk

The bug fixes present in this PR were relatively simple and each of them individually represents a low risk. But the fact that there are so many increases the risk. Because of this risk, we decided to Mark the System.Formats.Nrbf assembly as [Experimental] with SYSLIB5005 (#107905).

adamsitnik and others added 6 commits September 13, 2024 13:31
* bug #1: don't allow for values out of the SerializationRecordType enum range

* bug #2: throw SerializationException rather than KeyNotFoundException when the referenced record is missing or it points to a record of different type

* bug #3: throw SerializationException rather than FormatException when it's being thrown by BinaryReader (or sth else that we use)

* bug #4: document the fact that IOException can be thrown

* bug #5: throw SerializationException rather than OverflowException when parsing the decimal fails

* bug #6: 0 and 17 are illegal values for PrimitiveType enum

* bug #7: throw SerializationException when a surrogate character is read (so far an ArgumentException was thrown)
# Conflicts:
#	src/libraries/System.Formats.Nrbf/src/System/Formats/Nrbf/NrbfDecoder.cs
- Don't use `Debug.Fail` not followed by an exception (it may cause problems for apps deployed in Debug)
- avoid Int32 overflow
- throw for unexpected enum values just in case parsing has not rejected them
- validate the number of chars read by BinaryReader.ReadChars
- pass serialization record id to ex message
- return false rather than throw EndOfStreamException when provided Stream has not enough data
- don't restore the position in finally 
- limit max SZ and MD array length to Array.MaxLength, stop using LinkedList<T> as List<T> will be able to hold all elements now
- remove internal enum values that were always illegal, but needed to be handled everywhere
- Fix DebuggerDisplay
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Sep 13, 2024
@adamsitnik adamsitnik added area-System.Formats.Nrbf and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Sep 13, 2024
* copy comments and asserts from Levis internal code review

* apply Levis suggestion: don't store Array.MaxLength as a const, as it may change in the future

* add missing and fix some of the existing comments

* first bug fix: SerializationRecord.TypeNameMatches should throw ArgumentNullException for null Type argument

* second bug fix: SerializationRecord.TypeNameMatches should know the difference between SZArray and single-dimension, non-zero offset arrays (example: int[] and int[*])

* third bug fix: don't cast bytes to booleans

* fourth bug fix: don't cast bytes to DateTimes

* add one test case that I've forgot in previous PR
# Conflicts:
#	src/libraries/System.Formats.Nrbf/src/System/Formats/Nrbf/SerializationRecord.cs
@adamsitnik adamsitnik changed the title [release/9.0] Backport NRBF fixes and Fuzzer [release/9.0] NRBF Fuzzer and bug fixes Sep 16, 2024
* introduce ArrayRecord.FlattenedLength

* do not include invalid Type or Assembly names in the exception messages, as it's most likely corrupted/tampered/malicious data and could be used as a vector of attack.

* It is possible to have binary array records have an element type of array without being marked as jagged
@adamsitnik adamsitnik marked this pull request as ready for review September 16, 2024 13:08
@adamsitnik
Copy link
Member Author

@MihuBot fuzz NrbfDecoder

@@ -75,29 +75,30 @@ ulong value when TypeNameMatches(typeof(UIntPtr)) => Create(new UIntPtr(value)),
_ => this
};
}
else if (HasMember("_ticks") && MemberValues[0] is long ticks && TypeNameMatches(typeof(TimeSpan)))
else if (HasMember("_ticks") && GetRawValue("_ticks") is long ticks && TypeNameMatches(typeof(TimeSpan)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is double-dipping the dictionary. Ideally it would be enhanced to use a TryGetValue approach.

That doubles the perf cost of this, and "double" isn't a threat, so it's not critical to fix, but would maybe make sense to do for vNext/main.

I was going to comment that the HasMember call was unnecessary, but it looks like GetMember (called by GetRawValue) will throw for an unknown name.

@carlossanlop
Copy link
Member

@adamsitnik this won't make it before the RC2, which I will do in a few minutes.

Please retarget this PR to the release/9.0-rc2 branch and send an email to Tactics requesting approval.

Assert.True(a);
bool b = bools[1];
Assert.True(b);
bool c = a && b;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you need to use single-ampersand to test what you were trying to test (reliably)

Sharplab shows that sometimes the compiler writes down the IL instruction "and" (which turns into the x86 "and") for both operations, and sometimes it uses brfalse for "&&".

The only time I've seen it use brfalse involved spans, e.g. this example on sharplab...

but & always ends up as IL:and.

Probably doesn't super matter, as this probably uses IL:and.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After finding some boundary examples, I believe that Roslyn turns expr1 && expr2 into and if expr2 is a literal-, local-, or parameter- expression, and brfalse otherwise. (In my linked example it used brfalse because the RHS was an indexer-expression).

If I'm right, then your && here will use the IL and instruction, hence the CPU and instruction, so it is a valid test.

But, wow, that took a long time to "prove".

writer.Write((byte)PrimitiveType.Char);
}

writer.Write((byte)0xC0); // a surrogate character
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there also tests that show that a valid surrogate pair work? Or does BinaryFormatter not support characters outside the BMP?

{
result = reader.ReadChars(count);
}
catch (ArgumentException) // A surrogate character was read.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a single character I can maybe see it throwing for a surrogate half; but I would expect a multiple character read to work for correctly paired surrogates.

Is the comment wrong, and it meant "an unpaired surrogate character was read" (emphasis on "unpaired") or does whatever calls this not support characters outside the BMP? (question is slightly repeated in the tests)


SerializationRecord record = NrbfDecoder.Decode(Serialize(input));

Assert.Throws<ArgumentNullException>(() => record.TypeNameMatches(type: null));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Assert.Throws<ArgumentNullException>(() => record.TypeNameMatches(type: null));
AssertExtensions.Throws<ArgumentNullException>("type", () => record.TypeNameMatches(type: null));

Is a better test, as it ensures that the parameter name is correct.

throw new SerializationException(ex.Message, ex);
}

return Unsafe.As<long, DateTime>(ref data);
[MethodImpl(MethodImplOptions.NoInlining)]
static DateTime CreateFromAmbiguousDst(ulong ticks)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see new tests for this code. Hopefully there are existing ones.

Copy link
Member

@bartonjs bartonjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything that would preclude backporting; but there are a few "room for improvement" things for main/vNext.

@adamsitnik adamsitnik changed the base branch from release/9.0 to release/9.0-rc2 September 17, 2024 18:58
@jeffhandley
Copy link
Member

We need to retarget this PR to the release/9.0-rc2 branch. Before we close this PR and replace it with one against the correct target (where we'd break the association with the review comments), we're going to try retargeting it to see if it will cleanly retarget without pulling undesired commits in with it.

@jeffhandley jeffhandley changed the title [release/9.0] NRBF Fuzzer and bug fixes [release/9.0-rc2] NRBF Fuzzer and bug fixes Sep 17, 2024
@jeffhandley jeffhandley added the Servicing-consider Issue for next servicing release review label Sep 17, 2024
@jeffhandley jeffhandley added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Sep 17, 2024
@carlossanlop carlossanlop merged commit fde8a3b into dotnet:release/9.0-rc2 Sep 17, 2024
86 of 93 checks passed
@jeffhandley jeffhandley added the binaryformatter-migration Issues related to the removal of BinaryFormatter and migrations away from it label Sep 18, 2024
Copy link
Contributor

Tagging subscribers to 'binaryformatter-migration': @adamsitnik, @bartonjs, @jeffhandley, @terrajobst

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Formats.Nrbf binaryformatter-migration Issues related to the removal of BinaryFormatter and migrations away from it Servicing-approved Approved for servicing release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants