Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Future of the IFormatter infrastructure, safe polymorphic deserialization (not just BinaryFormatter) #50909

Closed
koszeggy opened this issue Apr 8, 2021 · 13 comments
Labels
area-System.Runtime design-discussion Ongoing discussion about design without consensus
Milestone

Comments

@koszeggy
Copy link

koszeggy commented Apr 8, 2021

As we know, BinaryFormatter is being obsoleted in the upcoming .NET versions as it is a security risk (see Levi's excellent design document about the topic).

In #29976 he also mentions that a "safe" formatter (quotes from him) will be also implemented and it will be available in a separate NuGet package along with the obsoleted BinaryFormatter.

In this issue I would like to propose a generic way for resolving types by a serializer, and I would like to clarify the future of the IFormatter infrastructure itself.

If I'm not mistaken there are two requirements that must be fulfilled to make any polymorphic serializer safe:

  1. The already loaded assemblies must not have any serializable types that can be exploited for attacks
  2. It must not be allowed to load any assembly during the deserialization, even if the serialization stream contains assembly qualified type names

Due to point 1.) the .NET Framework cannot be considered safe because it has potentially dangerous serializable types in mscorlib.dll and System.dll. Though a custom IFormatter implementation may contain special handling for known potentially dangerous types such as TempFileCollection (delete files attack) or StructuralEqualityComparer (DoS attack). Starting with .NET Core these types are not serializable anymore, and TempFileCollection is not even part of the core assemblies as it has been moved into a separate NuGet package.

As for point 2.), the current implementation of BinaryFormatter loads all assemblies automatically that are referred in the serialization stream, which is a huge security issue (as illustrated in this video), therefore it cannot be considered safe even in .NET Core and above. But it does not mean it cannot be prevented by a safe type resolving logic. It could then throw an exception like this:

Cannot resolve assembly in safe mode: "MyDangerousAssembly, Version=1.0.0.0, PublicKeyToken=null".
You may try to preload the assembly before deserialization or to disable SafeMode if the serialization stream is from a trusted source.

Actually this is the error message from my own IFormatter implementation when safe mode is enabled. And this is how all polymorphic serializers should work (including text-based ones, eg. XML/JSON) when there is no custom control applied such as a SerializationBinder.

But even BinaryFormatter has totally safe usages: deep cloning objects, creating in-memory snapshots for undo/redo, etc. I understand that it is much more often misused than not so after all I can understand if it will be moved into a separate package.

But please do not move the IFormatter infrastructure itself (including all attributes, interfaces, GetUninitializedObject and so on) so it still will be possible to create (safely) serializable types and also safe(r) formatters without the need of referencing an external package, which btw. would load also the obsolete BinaryFormatter even it's not needed.

And finally, I can't really support the idea that totally harmless new types (such as Date and TimeOfDay) should not be [Serializable]. Obviously, in the .NET Framework it was too generously (mis)used so it lost it's 'I'm safe to be used by formatters' meaning, but at least this was fixed in .NET Core. And forcing users to disable safe mode to be able to serialize such types by formatters would just weaken the whole infrastructure even more.

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Apr 8, 2021
@danmoseley
Copy link
Member

@GrabYourPitchforks

@GrabYourPitchforks
Copy link
Member

he also mentions that a "safe" formatter (quotes from him) will be also implemented and it will be available in a separate NuGet package

I didn't commit to that plan, only stated that it was one possible course of action. At this time we don't have such a serializer funded.

totally harmless new types (such as Date and TimeOfDay)

Neither of those types would be safe given the way BinaryFormatter operates. It's a bit too nuanced for me to describe here, but there is almost no type in the entire framework that would be "safe" to deserialize.

@GrabYourPitchforks GrabYourPitchforks added area-Serialization design-discussion Ongoing discussion about design without consensus labels Apr 8, 2021
@koszeggy
Copy link
Author

koszeggy commented Apr 8, 2021

It's a bit too nuanced for me to describe here

If you mean the possibly out-of-range non-validated field values (eg. invalid Ticks, undefined enum, etc.), then I think I know what you mean (but such types can implement ISerializable and validate the data in their special constructor). Otherwise, I would love to see an example.

@koszeggy
Copy link
Author

koszeggy commented Apr 9, 2021

Just a few additional thoughts. I'm starting to believe that actually all types that are currently serializable in .NET 5 can be made secure for IFormatters while maintaining compatibility (at least in terms of their BinaryFormatter support, but see the note below).

We can divide the [Serializable] types into 3 groups:

  1. The type does not implement ISerializable and it's not a problem - The simplest group, though it has only a few members: primitive types, enums, KeyValuePair<TKey, TValue>, ValueTuple<...>, etc.
  2. The type already implements ISerializable - the SerializationInfo must be validated to prevent creating an invalid instance during the deserialization. The latest implementation of the Dictionary<TKey, TValue> is a good example for this (of course, these rules must apply for its TKey and TValue recursively).
  3. The type does not implement ISerializable but an invalid instance can be created by setting its fields randomly
    3.a. An [OnDeserialized] method could check the consistency and throw a SerializationException if the instance is invalid
    3.b. The type should implement ISerializable. Good news: It turns out that BinaryFormatter actually supports the custom deserialization of a default object graph (this happens also when a surrogate selector is used on deserialization), so for example, adding an ISerializable implementation to List<T> would not break even the deserialization of the legacy payloads. It's just that the original field names must be used in the SerializationInfo entries.

This could definitely restore the 'I'm safe to be serialized by an IFormatter' meaning of the [Serializable] attribute (at least in the 1st party assemblies).

Note for 3.b: I don't know whether other IFormatter implementations are able to custom-deserialize a default object graph. Though mine supports it as well (but it's not that widely used so who cares... 😁), 3.b. could be a breaking change for some other IFormatters.

And of course, the points in my opening post still apply. So (1) a TempFileCollection-like type must never be serializable, and (2) a formatter must not load assemblies during the deserialization in safe mode.

@GrabYourPitchforks
Copy link
Member

This could definitely restore the 'I'm safe to be serialized by an IFormatter' meaning of the [Serializable] attribute

The [Serializable] attribute never meant "I'm safe to be serialized" in any version of .NET. It only ever meant "it's possible to serialize / deserialize me." Repurposing the definition to imply safety would be impractical.

The latest implementation of the Dictionary<TKey, TValue> is a good example for this

Dictionary<TKey, TValue>'s currently implementation is not safe for deserialization from untrusted input. Even for "simple" instantiations like Dictionary<string, string>.

@koszeggy
Copy link
Author

koszeggy commented Apr 9, 2021

Dictionary<TKey, TValue>'s currently implementation is not safe for deserialization from untrusted input.

Ah, ok. I didn't review it just saw the throw statements in OnDeserialization so I was happy about finding a positive example.

@koszeggy
Copy link
Author

In the meantime I realized that making serializable types safe is just one more necessary but still not sufficient condition to make the whole deserialization safe.

BinaryFormatter is vulnerable at a deeper level (and I need to fix it in my serializer, too): the way of arrays (and maybe also strings) are allocated during the deserialization makes possible to cause OutOfMemoryException by a manipulated stream. The current Dictionary<TKey, TValue> implementation also has a possible OutOfMemoryException vulnerability due to its capacity initialization but a similar issue exists also in the formatter itself. In SafeMode it should not be allowed to allocate suspiciously large collections at once. Instead, they must be dynamically built, even if this actually means a larger memory consumption (due to the copies and resizes). A corrupted stream then could just cause a regular SerializationException either due to unexpected end of the stream, or when the invalid content is detected otherwise (eg. in the deserialization constructors or invalid reference ids, etc).

I try to build these protections into my serializer in SafeMode. On the other hand I begin to understand why BinaryFormatter is deemed so hopeless. All of these rules should have been strictly demanded from the beginning. Just one tiny gap here or there and the whole construction gets brittle.

So just back to the original topic: I don't mind if BinaryFormatter will be moved into a separate NuGet without fixing it. But I still would like to suggest leaving the IFormatter infrastructure itself in the core assemblies so 3rd party formatters will not require to reference the obsoleted BinaryFormatter and its package.

@koszeggy
Copy link
Author

FYI: I fixed the possible OutOfMemoryException attacks in my serializer when deserializing arrays and natively supported collections in SafeMode. As Dictionary<TKey, TValue> is among the natively supported types it is also safe, unless the ForceRecursiveSerializationOfSupportedTypes option was used on serialization, in which case the regular ISerializable constructor kicks in.
(Native support has another obvious benefit: the serialization stream will be extremely compact.)

BinaryFormatter is vulnerable at a deeper level (and I need to fix it in my serializer, too): the way of arrays (and maybe also strings) are allocated during the deserialization makes possible to cause OutOfMemoryException by a manipulated stream.

As for strings, I have to correct myself: BinaryFormatter uses BinaryReader to read strings, which is already safe against such attacks.

@ghost
Copy link

ghost commented Jul 22, 2021

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details

As we know, BinaryFormatter is being obsoleted in the upcoming .NET versions as it is a security risk (see Levi's excellent design document about the topic).

In #29976 he also mentions that a "safe" formatter (quotes from him) will be also implemented and it will be available in a separate NuGet package along with the obsoleted BinaryFormatter.

In this issue I would like to propose a generic way for resolving types by a serializer, and I would like to clarify the future of the IFormatter infrastructure itself.

If I'm not mistaken there are two requirements that must be fulfilled to make any polymorphic serializer safe:

  1. The already loaded assemblies must not have any serializable types that can be exploited for attacks
  2. It must not be allowed to load any assembly during the deserialization, even if the serialization stream contains assembly qualified type names

Due to point 1.) the .NET Framework cannot be considered safe because it has potentially dangerous serializable types in mscorlib.dll and System.dll. Though a custom IFormatter implementation may contain special handling for known potentially dangerous types such as TempFileCollection (delete files attack) or StructuralEqualityComparer (DoS attack). Starting with .NET Core these types are not serializable anymore, and TempFileCollection is not even part of the core assemblies as it has been moved into a separate NuGet package.

As for point 2.), the current implementation of BinaryFormatter loads all assemblies automatically that are referred in the serialization stream, which is a huge security issue (as illustrated in this video), therefore it cannot be considered safe even in .NET Core and above. But it does not mean it cannot be prevented by a safe type resolving logic. It could then throw an exception like this:

Cannot resolve assembly in safe mode: "MyDangerousAssembly, Version=1.0.0.0, PublicKeyToken=null".
You may try to preload the assembly before deserialization or to disable SafeMode if the serialization stream is from a trusted source.

Actually this is the error message from my own IFormatter implementation when safe mode is enabled. And this is how all polymorphic serializers should work (including text-based ones, eg. XML/JSON) when there is no custom control applied such as a SerializationBinder.

But even BinaryFormatter has totally safe usages: deep cloning objects, creating in-memory snapshots for undo/redo, etc. I understand that it is much more often misused than not so after all I can understand if it will be moved into a separate package.

But please do not move the IFormatter infrastructure itself (including all attributes, interfaces, GetUninitializedObject and so on) so it still will be possible to create (safely) serializable types and also safe(r) formatters without the need of referencing an external package, which btw. would load also the obsolete BinaryFormatter even it's not needed.

And finally, I can't really support the idea that totally harmless new types (such as Date and TimeOfDay) should not be [Serializable]. Obviously, in the .NET Framework it was too generously (mis)used so it lost it's 'I'm safe to be used by formatters' meaning, but at least this was fixed in .NET Core. And forcing users to disable safe mode to be able to serialize such types by formatters would just weaken the whole infrastructure even more.

Author: koszeggy
Assignees: -
Labels:

Design Discussion, area-Serialization, area-System.Runtime, untriaged

Milestone: -

@jeffhandley jeffhandley added this to the Future milestone Jul 30, 2021
@jeffhandley jeffhandley removed the untriaged New issue has not been triaged by the area owner label Jul 30, 2021
@GrabYourPitchforks
Copy link
Member

Closing this issue since no action is required on behalf of the libraries team.

@koszeggy
Copy link
Author

And what is the decision then? This was my suggestion:

But please do not move the IFormatter infrastructure itself (including all attributes, interfaces, GetUninitializedObject and so on) so it still will be possible to create (safely) serializable types and also safe(r) formatters without the need of referencing an external package, which btw. would load also the obsolete BinaryFormatter even when it's not needed.

@GrabYourPitchforks
Copy link
Member

And what is the decision then?

The IFormatter infrastructure will obviously stick around for compat. The final home for it hasn't been determined. The way things are today, simply using IFormatter doesn't activate the BinaryFormatter code paths, and if you're using .NET's assembly minification capabilities the BinaryFormatter type won't even exist in the final assembly. There's no reason to believe that would change going forward. But otherwise, the framework always needs the flexibility to be able to move types around as benefits the long-term health of the ecosystem.

Existing APIs like RuntimeHelpers.GetUninitializedObject will remain. There was never any plan to move those.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Runtime design-discussion Ongoing discussion about design without consensus
Projects
None yet
Development

No branches or pull requests

5 participants