-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Syntactic sugar for late-initialization non-nullable auto-properties #2452
Comments
/cc @divega @ajcvickers |
Thanks @roji. I suspect that this kind of sugar will be heavily requested by EF Core users once they start using the feature. |
@roji Might be a good idea to allow this at the class level in addition to individual properties and then in cases you want to opt out you could do the opposite |
Should it also put an exception in the setter if you try to set the value to null? Perhaps enabled with an optional parameter in the attribute? |
Makes sense to me... Maybe a member-level opting out of a type-level opt-in is a bit much, but a type-level opt-in definitely makes sense for pure DTO types. |
I think you're worried that "late" would be construed as "lazy"... I personally don't feel there's much risk of that - I don't recall having seen "late initialization" mean that in other contexts - but naming is always the hardest part :)
I don't think this makes sense - there's no implicit runtime checking for normal NRT properties, why should there be one for late-initialized ones? The general approach of the NRT feature is to rely on the compiler to avoid nulls where they don't belong. |
I wouldn't expect the compiler team to develop something like that. Instead, I'd develop myself a Fody "weaver" to achieve the requested behavior. It's been helping me a lot since I found it. |
Fody and weaving are great, but the issue this is trying to address is likely to affect many, many users (anyone using DTOs or various late initialization scenarios), so it seems reasonable to expect a solution from the compiler, rather than rely on an external weaving solution. |
It seems to me that this feature simply replaces the infamous Instead of doing this, I think that frameworks such EF should support entities that don't have a parameterless constructor and that work with constructors to initialize values. |
@ymassad |
@ymassad The difference between |
Agree very much with what @YairHalberstadt and @eyalsk wrote above - an early InvalidOperationException from the correct place is much better than a NullReferenceException thrown from some random place in the code where a null inadvertently ended up.
That simply doesn't always work. For example EF sometimes have to load two entities which reference each other - there's simply no way to initialize that kind of circular graph via constructors. Another scenario is an object that has some I/O operation as part of its initialization. This can't happen in the constructor (since constructors can't be async), so another initialization methods is needed. It seems wrong to make the relevant fields nullable just because they're initialized late, and force all access to perform null checks, because we know they'll never be null (assuming the type was properly initialized). |
I think, C# should change the meaning of Object initializer syntax and lift it from a syntactic sugar to an actual object constructor. This will be a breaking change, but breakage can be minimized if designed carefully. There can be a flag to enable previous behavior. And it will actually solve this problem, instead of some makeshift solution. |
@gulshan this proposal is more about DTO types and other late initialization scenarios, and not only about object initializer syntax. However, I agree in general that the current object initializer feature doesn't intersect well with the new nullability feature (see the last paragraph in my original issue description). |
This will break massive amounts of code, thus there's no chance it would be considered.
The team has been quite vocal about being against introducing flags/dialects to the language. The barrier for acceptance there is extraordinarily high. |
I'm a little suspicious of this proposal. It feels more like a library concern than a language concern, although I'm not sure how it could be solved as such. But the motivation seems to be around the NRT feature, and one of the design goals of that feature was to not affect the behavior of the code at all. So having a related feature which would change the behavior seems inappropriate. |
@HaloFour As object initialization became the default .net way of initializing objects, surely any change in the meaning of object initializer syntax will affect a massive amount of code. But I think, lifting it to valid object constrctor will break a lot less code. Any common example code that will break? |
So, if there's a constructor that smells sorta like the initializer it will automatically convert to calling that? What if only some of the properties match? What if there are multiple overloads? How is the compiler going to do the matching at all, especially given that the naming convention between properties and parameters is different. In the end, at best, it results in silently changing the meaning of a lot of existing code, which is a very bad thing. It would be very difficult to reason if, when and how property initializers are promoted to constructor parameters vs. remaining property setters.
As long as it changes the meaning of existing code there's really nothing to discuss. |
I'm not sure what you mean by library concern... This proposal comes from my experience after both annotating actual codebases and working on EF Core's support for nullability. My concern is that the uninitialized warnings introduced in the new nullability feature makes it very difficult/verbose to write DTOs or other late-initialized types - a very common scenario. Current options are to either manually wrap a nullable field as I showed above, or to suppress the uninitialized warning - which is very undesirable (could very easily lead to suppressing the warning on other fields which are not late-initialized, etc.).
This is a completely opt-in feature (developers would need to use the new attribute), so a change in behavior seems to be acceptable. It's also not part of the NRT feature - more like a supporting feature designed to make null-aware code easier and less verbose to write. Regardless, how do you see this problem? Do you agree that there's an issue here to be solved, and if so, do you have another solution in mind? |
I agree there's a problem. For normal late-initialized types I think the problem needs to be solved through better flow analysis that can take property assignment into consideration. The compiler needs to keep track of which properties must be initialized with a non-null value and which properties have been initialized and warn if the object in another manner or escapes before all properties are initialized. As for frameworks like EF or JSON.NET that construct these objects via reflection or other means this definitely gets a lot trickier. Ultimately I think those libraries should take into account NRTs and should throw if it can't properly initialize the type. This is akin to model validation with MVC, for instance. Of course that would take time to percolate out, and would likely need to be an opt-in feature at least initially. In either case, the NRTs warnings within the POCO is a problem that I think needs to be addressed without requiring additional language changes. |
It sounds like you're thinking of something that's a bit similar to #2328, where a non-nullable property is somehow allowed to remain null within the scope of a single method as long it's properly assigned within that method. There seem to be quite a few issues with that approach: as I wrote in #2328, it's very common for instantiation and population to happen in different methods, and I don't see how that could be tracked by the compiler. This seems to also imply that the compiler would never warn for non-nullable properties which aren't initialized in a constructor - which seems like a pretty vital part of the NRT feature.
I'm not sure we're talking about the same thing... There's no actual problem with those libraries - the question is how ordinary users are supposed to write the types which EF or JSON.NET populate.
But what would that mean, unless you're proposing to drop those warnings altogether somehow? |
I assume that there are going to be tons of edge cases with NRTs in general where the compiler can't track nullability, especially across a boundary like a method call. Only thing I can suggest for that is yet-more-attributes, e.g. one that would denote that a method can accept a partially-initialized object and will result in that object being initialized. But I don't think that the compiler will ever be able to solve all of these cases and we're going to have to adopt new patterns going forward.
I'm not implying that would apply universally. Perhaps only for POCOs that have parameterless constructors, or POCOs that are specifically annotated with an attribute to indicate that they are late-initialized.
Yes there is: those libraries haven't been updated to support NRTs. If they were, and could enforce the validation of the types as they were being created, there'd be no need for this proposal at all. Those libraries are the better place for this as they can manage the policy. For example, how do you propose this feature work with EF POCO properties that are optionally update on create? They'd be nullable on initialization and allowed to be nullable when passed to EF. If EF tries to read that property it would throw that exception. But those properties would always be non-nullable when read from EF.
It would mean that the warnings would move from the constructor to the code that is constructing/initializing that type. |
Do you have anything to back this claim, because I sure have never seen that. |
Wouldn't the same objection apply to Simplified parameter null validation code (#2145), which is a championed proposal? |
That seems orthogonal to NRTs, and the team seems to stress that it doesn't affect how the compiler interprets the type, only that value. Theoretically a parameter declared |
This is quite close to the discussion in #2328. Like I wrote there, I really can't see how attributes can help here in the general case. To me, late initialization is a property of a property, not of a type - some properties may be late-initialized on a type, while others not. So you'd have to express which properties are initialized in a given type by a given method which returns that type. That's starting to get a bit much, I think - but maybe I'm wrong.
So once again - in my mind this isn't all all-or-nothing setting on a type, but rather something that may need to be specified on a member-by-member basis (and it definitely can't be assumed just because a type has a parameterless constructor).
So you're writing with the assumption of some solution being already present, i.e. with the warnings having moved from the definition site to the construction site (as you wrote below).
Fair enough. I think that would depend on some cross-method boundary solution, which may make this unfeasable, but I may be wrong. |
I took a quick look at JSON.NET and found the following: https://github.com/JamesNK/Newtonsoft.Json/blob/b34a3ccca7ff6535c0c9e7981479ee1abf169a94/Src/Newtonsoft.Json/Serialization/JsonSerializerInternalReader.cs#L476 I don't know the codebase at all so may be totally mistaken, but there seems to be a I've also written various deserialization/materialization routines in my life, and this kind of separation seemed to be frequent: instantiation and population are typically different operations with different concerns. |
I'm not saying it's specific to a type. It can be specific to given properties. But whether or not an instance is initialized or not depends on whether the compiler determines that the required properties have been initialized for that instance.
I'm writing this under the assumption that many people have reported that NRTs with POCOs is useless and thus the team will work on a solution to address that.
In my opinion what JSON.NET does under the hood is pretty irrelevant. As a serialization library of generic types NRTs are largely not going to apply, at least to how those internals are written. |
The point was that trying to attack the problem of the method boundary via attributes seems like an extremely heavy approach for this, assuming you want the method to declare, for each property of its returned type, whether it has been initialized or not.
Of course, and that solution is exactly what we're discussing here. To attempt a quick summary:
Why is that? Isn't it a goal to allow libraries such as JSON.NET (and similar) to be written with NRTs? Regardless, we've been concentrating on POCOs and materialization, but late initialization is something that occurs in many other scenarios. |
The type of the |
Fair enough, although doesn't that show a problematic aspect of moving the uninitialized warnings from the definition to the construction? Any instance that gets cast to object loses any tracked information about which properties have been initialized, etc. But more generally, I was trying to say that late initialization goes beyond deserializers. Basically it would be good to hear something from the language team about the directions being considered here (or at least that the issue is under consideration). Any approach that removes the uninitialized warnings from the definition removes the immediate pain point, but opens up other problems. |
When constructing an instance of a type you always know the type (excluding via reflection). Usages of that instance (including casting it) before it has been initialised would produce a warning. |
It seems like we're going around in circles. If I'm the only person who thinks that initialization/population can occur across several methods (and even types), then fine - I'll drop it. In that case, a compile-time approach within a single method should be sufficient, while eliminating warnings from the definition. But if that's the case, let's be super clear about it, and there's no need to be discussing any attribute-based approach to the problem - because the problem is by definition limited to the scope of a single method. This ambiguity in the discussion has been part of the confusion from the start IMHO. |
Personally, I think the 99% case outside of reflection based creation and initialisation is to initialise something in a single method. An attribute based DSL might be useful to help with some of the remaining 1%, but it's not strictly necessary. Of course your experience in entity Framework is likely to be very different. |
To clarify again the case we want considered by the compiler: private Foo? _foo;
public Foo Foo
{
get => _foo ?? throw new InvalidOperationException($"The property has not been initialized");
set => _foo = value;
} The compiler does not and cannot know when |
I thought this was about non-nullable properties?
As it stands at the moment, any POCO I write that, for example, gets
instantiated by Dapper from a database, either lists a property as:
`public string? PropName { get; set; }`
Or:
`public string PropName { get; set; } = string. Empty;`
I'd like to have null options, because the database has them, but I'd
rather not have to pre-initialize every non-null, non-value-type property...
…On Mon, 22 Apr 2019, 22:52 Arthur Vickers, ***@***.***> wrote:
To clarify again the case we want considered by the compiler:
private Foo? _foo;
public Foo Foo
{
get => _foo ?? throw new InvalidOperationException($"The property has not been initialized");
set => _foo = value;
}
The compiler does not and cannot know when set is going to be called or
from where.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#2452 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADIEDQOUV55VTLLWNWFTEGLPRYXRBANCNFSM4HGXYFVA>
.
|
I can't say that I've ever seen initialization spread all over the place like that, at least not in code constructing concrete types. Maybe that's common for serialization libraries where a feature like this doesn't seem particularly useful.
I don't like a runtime approach to this, not the least of which one espoused by the compiler. There is too much policy here and too little control as to what is actually emitted. If source generators were a thing yet this would very clearly fall under their domain, especially since the suggestion here is to use attributes. Furthermore I can see this causing issues with serializers and ORMs that would otherwise handle a
This is what is currently being designed for NRTs. I expect that the POCO situation will be resolved in some way for the majority of use cases. That very well may not account for initialization spread out all over the place. It's tough to say what the non-POCO late initialization pass might look like, if it's considered at all. The problem with NRTs will be the same as the problem with this approach: the compiler has no idea which of the properties may have been initialized by any given constructor.
This is quite backwards to the approach that the team is taking with NRTs. |
I don't really understand this... Why serializers and ORMs (or anyone) be reading a property which hasn't yet been initialized? The whole point of NRTs is to make sure that a null isn't observed on non-nullable members, so how is it better for them to return null?
Once again, as I already wrote, late initialization is something more general than just deserialization of POCOs. An object may contain some properties that need to be populated from some other source, without this being doable from their constructor (e.g. because it requires async I/O) - but once properly initialized, will never contain null. Another possible example is two objects which need to contain references to one another. My intuition is that method-scoped compile-time analysis is not going to be enough in the general case, but I may be wrong.
It's definitely not ideal, just a proposal attempting to be pragmatic. It seems like we're no longer in the early days of C# 8, and for now this problem hasn't been addressed - and it would be better to have a runtime solution than nothing at all. In any case, it seems like what's needed next in this discussion is some input from the language folks, which have undoubtedly discussed this internally. |
Because not all operations are symmetrical. Properties which are initialized or updated via the ORM itself do not require initialization, or may allow optional initialization with a database default. Those properties would never be null on read, and reads would constitute the vast majority of operations, so it makes sense to mark the properties as not null. Maybe those aren't situations where you envision this feature being useful, but I can see another developer falling into the pit of failure of thinking that it should apply here.
I disagree. I think that scoping such initialization within a method would cover the majority of use cases of late-initialized objects. I think that the spreading of such initialization across multiple methods would be an edge case and one that I seriously doubt that NRTs will be able to solve to a degree that you'd likely find satisfactory.
C# 8.0 hasn't shipped yet and the team is still working on NRTs based on feedback. I fully expect that the team will address late-initialization before C# 8.0 goes live. You have a runtime solution today. It's just more verbose than you'd prefer. You could already make it less verbose through a helper method. Either way you're not blocked. But sure, I'd also like to hear the language team triage this. |
I think we have a disconnect here... If a property may contain null as an actual value (i.e. in the database), then obviously it should be nullable in C# and this whole discussion doesn't apply . On the other hand, if null isn't a valid data value, but needs to be temporarily allowed because of late initialization, then it should be a programming error for anyone to read it before it has been initialized, and therefore an InvalidOperationException makes sense, no?
That's true. But seeing what the repetition of the verbose pattern does to a codebase makes me fear that users (e.g. of EF Core) would prefer to abandon the NRT feature rather than going down that road.
How could a helper method help here? We're talking about the definition of the property as backed by a nullable field, no? One more point on this... The current alternative proposals seem to concentrate on removing the uninitialized warnings from the definition site, and instead relying on compile-time analysis at the initialization sites. While that may be good for late initialization, it seems important to keep the uninitialized warnings for properties which aren't late-initialized. If that is so, then users still need a means of distinguishing late-initialized properties from regular properties, i.e. a |
You can't differentiate between a programmer and some library doing reflection-based interrogation of that instance, such as a serializer or an ORM.
It's the same pattern that has to be employed by anyone using properties for anything other than auto-properties. There are other proposals that are trying to look into making them less verbose for simple cases which may apply here and would be significantly more useful than a one off. Source generators would make all proposals like this accomplishable via plugins and would require zero language changes. There are probably a half-dozen other similar proposals involving sticking an attribute on an auto-property to have the compiler automatically emit some kind of code within the accessors all of which have been put in the Source Generator bucket, for better or worse. As for users of EF Core, or any other ORM/deserializer, that responsibility should be built into the library. It should detect that it can't initialize a type that has required properties and throw immediately rather than leak out some half-initialized type that won't throw until some indeterminate time later.
You could use a helper method to return or throw from the property getter. It doesn't eliminate the property boilerplate, but it makes the getter more concise.
That is the point of those proposals. If you don't late-initialize the required properties during, the compiler will emit a warning. I do agree that either way an attribute is likely required here so that the compiler can tell which types are actually late-initialized vs. ones that assign their required properties in a parameterless constructor. Either way, given NRTs are shipping with C# 8.0 and they're actively working on that now I expect that a resolution for late-initialized types will come as a part of that effort. I personally wouldn't expect new work to be considered/designed/implemented within the C# 8.0 timeframe. |
I admit this conversation has made see that the value of a runtime check here would be more limited than I originally thought. As there seems to be general resistance to the idea, I'll close this. Here's a quick summary with the hope it helps.
|
This attempts to solve the same issues as #2328, but via a far simpler mechanism.
Users have raised concerned with the warning produced by the compiler when an NRT field is uninitialized (e.g. #36 (comment), #2411). This happens in many late initialization scenarios where a non-nullable property may be nullable until it is initialized at some point after construction. In a nutshell, it seems that NRT uninitialized warnings are making it very difficult/verbose to write DTOs and similar late-initialized objects in the NRT world, to the point where people are likely to just turn nullability off.
Examples:
These properties should never return null: if accessed without proper initialization/loading, they should typically cause InvalidOperationException to be thrown (it could be said that premature access is against the type's API contract).
It's currently possible to get around this problem via the following:
But duplicating this again and again gets old really quickly.
We could have syntactic sugar for auto-properties that produces the same thing:
The attribute would:
This could also be a partial solution to the object initializer problem, although an ideal solution wouldn't involve anything at the definition, only proper initialization at the call site (this would of course require a complete redefinition of what object initialization is).
Note: as the discussion in this thread is long, here is a summary below from my point of view
The text was updated successfully, but these errors were encountered: