Records as a collection of features #3137
Comments
There's a lot to love here, especially that complex data structures can be achieved through lots of little features that can be combined together in interesting ways. One notable version that appears to be missing would be the feature that enables "case classes", or very abbreviated data carriers, such as Also, I'm still dubious on potential designs around "init-only" properties. Don't get me wrong, I like the idea of extending object initialization, but the designs I've seen so far all seem to involve trickery such as exposing but trying to hide mutator methods. I'd hope that whatever solution could fit into the existing ecosystem and downlevel compilers. |
For direct constructor: 1st, to make it less repetitive and more identifiable from other members, could you consider some alternative syntax like: public class Point
{
public int X { get; }
public int Y { get; }
public new(X, Y)
{
// ... validation
}
} 2nd, there could be times both a property name and some other parameters are needed. e.g. public class Point
{
public int X { get; }
public int Y { get; }
public new(X, Y, bool skipValidation = false)
{
if (skipValidation) return;
// ... validation
}
} 3rd, the direct constructor can also solve the common dependency injection case, if we declare dependency as public property. If it can allow non-public properties and fields, that would be perfect. E.g. public class ProductController: Controller
{
public ProductService Products { get; }
private readonly ILogger<ProductController> logger;
public ProductController(
Products, // will work as in OP's proposal
logger, // if non-public can be allowed
IOptions<MyOption> options) // some parameter which will be read once.
{
}
} For primary constructor: public abstract class Person(public string Name { get; }); While gaining the ability, I'd hope to avoid multiple ways to do the exactly same thing without enough difference, which is a main issue with C++. The terse form |
I have a few reservations about the features suggested here. |
For var p2 = new Point { X = 1, ..p1 }; To someone unfamiliar with I think it would also be clearer to someone skimming code: we're used to looking for The case of cloning an object is also a bit clearer with this syntax IMO: var p2 = p1 with {};
var p2 = new Point { ..p2 }; |
@canton7 Javascript also has a spread operator to do basically the same. But in a strong-typed language, it has a drawback that you have to know and write the exact type, which typically is expected to be the same as |
The concern that I have is that it seems that changing the orders of your property could break the |
... in my admittedly somewhat limited experience, record-like data structures in languages tend to be non-inheritable, because value equality breaks immediately in the face of inheritance. Which means you'd want some sort of transform-and-compare on some specific set of properties. I assume that shapes would help in this area as well. |
There are many interesting points and I'm looking forward to the records feature. What I don't like is the proposed "validation accessors for auto-properties". The syntax is very limited. What if you want to assign an empty string in the given example instead of throwing? How do you get the previous/old value? I prefer here proposal #140. The given example could become: public string Name { get; set => field = value ?? throw new ArgumentNullException(nameof(Name)); } And with proposal #2145 we might simply write: public string Name! { get; set; } |
I don't quite understand the need for
I'm not sure this synergy is great. Code that is placed in such a prominent place (the first line of a class declaration), should stand on its own, it should not depend on something inside the body of the class. Otherwise, it makes understanding the code harder, not easier. |
@svick Making two objects of different runtime types not compare as equal is the purpose of the |
@Joe4evr But you don't need
|
Just because no one's mentioned it yet: the proposed EqualityContract breaks the Liskov Substitution Principle (as does GetType, or indeed any approach which maintains symmetry). Value equality among inherited types is fundamentally thorny (and often best avoided altogether, or at least carefully considered), and I'm not convinced it's something that the language itself should be getting involved with. (This doesn't apply to value equality among DU members, of course) |
The spread operator is perfect for Javascript / Typescript because of the lack of typing or structural typing respectively. In those braces you can spread almost any object. In C# however, you would only be able to spread on object of the same type as the target, which means it only really makes sense to be able to spread a single source object. "With" syntax makes more sense in a nominal typing environment. |
|
Init properties are there to reuse the initializer syntax. If you have more than, say, four properties, constructors become cumbersome. EqualityContract lets you ensure you always compare two values of the same runtime type. Otherwise you could compare an instance of the base class on the left with an instance of the derived class on the right using the base class logic. |
What benefit does |
It allows the subclasses to decide the contract. For example, i may decide in my case that my equality contract is such that i have: class Base { } sealed class Derived1 : Base { } sealed class Derived2 : Base { } And i want Derived1/Derived2 to be equatable (because they represent the same value-oriented data, perhaps with different impls for efficiency). I explicitly do not want a pregenerated check that the types must be equal. Instead, i want to say that i can compare the types as long as they agree on the equality contract. |
Would all of these features apply to structs? What differences would there be between structs and classes? |
To me, that seems to be niche use case and I think it's not a good enough reason to make the feature more complicated for everyone. |
Would using |
They'd be contextual keywords in that they only act like keywords when used as modifiers, where they're not currently legal. You could still have identifiers |
The intention is that this works. |
I guess this would be an orthogonal proposal to allow constructors to be specified with the |
Other than your proposed
|
Should you be allowed to use For example: value interface IPoint
{
public int X { get; }
public int Y { get; }
}
public data class CartesianPoint(int X, int Y) : IPoint;
public data class PolarPoint(int Radius, int Angle) : IPoint
{
public int X => ...
public int Y => ...
}
...
new PolarPoint(0, 0).Equals(new CartesianPoint(0, 0) |
I personally don't like that adding the Plus, it makes it impossible to add a private field to the class without having to rewrite it entirely using the old syntax. |
Says who? |
Maybe? Not sure what it would mean exactly. |
I get that criticism. At the same time, I want the abbreviation, and I can't think of a better way to trigger it. |
Just add a |
I don't think @orthoxerox is arguing that we should embrace this, and neither would I. I believe the LSP requires that you can substitute a subtype for a supertype and the code would still work, not that the behavior would be the same. Otherwise method overriding would pretty much be banned. |
Reading through the LDM notes for Jan 29, 2020, I was surprised to see the two approaches considered for "wither" implementation; were other approaches previously considered and rejected, or is there still scope for new ideas to be introduced? FWIW, the idea that popped into my head was to only allow withers to be used on types that have a copy constructor, and to solve (as a prerequisite) initialization syntax for immutable types. To reuse the same running example, a copy constructor for public class Point
{
public Point(Point original)
{
X = original.X;
Y = original.Y;
}
// elided
} For record types, this is trivially generated; for custom types, authors can easily opt-in to support of withers by writing one themselves. Using a copy constructor avoids the dangers of using Subsequent modification of the clone could take either of two different paths. If the property is writable, it can be directly set. If not, the same technique as introduced for initialization expressions of get-only properties could be reused. To illustrate, for a mutable var p1 = new Point(2, 4);
var p2 = p1 with { Y = 14 }; would generate var p1 = new Point(2, 4);
var p2 = new Point(p1);
p2.Y = 14; Does this approach have a fatal flaw that I'm not seeing? |
Is there going to be a way to use withers though only update? - mutating an existing value and return Save((db.Get() ?? new()) with { Property = newValue }); Even though immutable |
As I've read it so far, the whole point of withers is to allow easy creation of a near clone of an existing object without modification of the original, regardless of whether the object is immutable or mutable. Making the syntax return a different object if immutable, but the same object if mutable, would not only undermine half the motivating scenarios for the feature, but would be extremely confusing and likely a source of many subtle bugs. |
If you define your own What I'm saying is that the behavior largely depends on the target object. Only if it's a proper record you can be sure that it'll return a new object, So I don't mind the difference if I'm using with with a POCO. |
@alrz, |
I've come late to this thread and I've skipped the comments, so apologies if I'm repeating the thoughts of others. Whilst I understand where Mads is coming from with his cascade of features, I found myself a little lost toward the end. I'm hoping that I'll be able to write a record in one of two forms: data struct Point(int X, int Y);
data class Point(int X, int Y); And for both, I get:
var a = new Point(1, 2);
var b = new Point(1, 2);
a == b && a.Equals(b) with the caveat that
data class Email(string Address)
{
Email(string address)
{
if (! address is valid email) throw …
Address = address;
}
} Other people may see benefit to all the other features discussed, but they worry me. It risks turning a simple idea (a compact way of declaring a value/domain object) into something that takes a very long time to implement and that may therefore not make it in time for v9... |
Can Records also be used for structs? I would like to do more semantic programming so instead of So would this be possible: public data struct PersonId(int Value); and then would it be possible to add 'features' without coding them explicitly: public data struct PersonId(int Value) : IEquatable<PersonId>, IComparable<PersonId>;
// No implementation of IEquatable and IComparable is specified so a default is generated. Features could be a set on well known interface like and if a Record contains only one member, could we someway force conversion operators in it: public data struct PersonId(int Value) : implicit operator int;
// implicit conversion from and to 'int' is generated and would it be possible to enforce validation: public data struct Email(string Value) where ValidatorTools.IsEmail(Value); (I'd prefer this would be implemented using a static TryParse method and not with exceptions.) and would I be able to combine it: public data struct PersonId(int Value) where Value>=100000 && Value<=99999 : IEquatable<PersonId>, IComparable<PersonId>; and in the end would this code work: public data struct PersonId(int Value) : implicit operator int;
public data struct StudentId(int Value) : implicit operator int;
PersonId p = 23;
StudentId s = 14;
if (p == s) {} // generates compile error.
s = p; // generates compile error Sorry for the long comment... I can explain myself better with examples than with words. |
This is essentially what Rust does with "derivable" traits (interfaces). Note that for things like equality/comparison, you would not be able to do that with an inheritable class tree (Rust's structs don't have inheritance). I actually really want a good chunk of that stuff, although alas, I don't think contract syntax is coming anytime soon. But making "wrappers" simpler would go a long way to helping remove primitive obsession. @alrz - > Even though immutable with would be desirable in a lot of cases, there are places that we still need mutable records, an obvious example is database entities, or aspnet options, It's not obvious that either of those places need mutable types to me. |
@alrz Your case can be as simple as: Upsert(o => o.Property = newValue); |
Will |
I was reading the Jan 29 LDM, and it was a concern regarding withers about their possible backwards compatibility. And I want to suggest considering a way to deal with that which is a very common approach when binary compatibility is important. And it seems to me it might be a great fit for the feature as a separate general feature. Record case: public data class Point(int X, int Y);
var p2 = p1 with { X = 2 }; generates public class Point
{
// struct for withers
struct WithParameters
{
public int _x, _y;
public bool _x_provided, _y_provided;
public int X { set { _x = value; _x_provided = true; } }
public int Y { set { _y = value; _y_provided = true; } }
}
public virtual Point With(WithParameters p) => new Point(p._x_provided ? p._x : X, p._y_provided ? p._y : Y);
}
var p2 = p1.With(new Point.WithParameters { X = 2 }); So it
Some possible alternatives/enhancements:
And the feature can be coded manually independently from record types: class Data
{
public int Id { get; }
public string Name { get; }
public byte[] ExpensiveData { get; }
class WithState // I can use class instead or different name
{
public string Name; // just field
public byte[]? ExpensiveData; // I can vary how I define emptyness
public WithState(Data d)
{
Name = d.Name;
}
}
public Data With(WithState w) => w.ExpensiveData == null ? new Data(GetNextId(Id), w.Name) : new Data(GetNextId(Id), w.Name, w.ExpensiveData);
}
data with { ExpensiveData = new byte[1] } // cannot use Id
// translates to
data.With(new Data.WithState { ExpensiveData = new byte[1] }) |
This was discussed in a later LDM: https://github.com/dotnet/csharplang/blob/master/meetings/2020/LDM-2020-03-23.md |
Ah, that's almost exactly what I suggested. Thank you. That's sad that they think init-only had advantages over builders. |
Closing as the C# 9 records feature is now tracked by #39 |
Records as a collection of features
As we've been looking at adding a "records" feature to C#, it is evident that there are many different behaviors that you might want from such a feature. It is not obvious that they should all be available only when "bundled" together.
When we added LINQ in C# 3.0 it was in the form of many individual new features, that were independently useful (lambda expressions, extension methods, expression trees, etc.), as well as a syntax for "bundling them together", namely query expressions.
The records feature set revolves around succinctly expressing the shape and behavior of data, which in many ways is ill-served by object-oriented defaults. Let's try to catalog individual expressiveness that you might want to use independently, and suggest ways that those could be expressed as separate language features.
At the same time, just like query expressions, we want a way that these can all come together, and towards the end I make an attempt at that.
Running examples
We'll use two extremely simple running examples. One is a simple data class, and the other is a hierarchy of an abstract base class and a derived class (of which there would presumably be more). The former represents the simplest use case:
The latter example represents the use case that other languages use discriminated unions for: a family of data shapes united by a common type:
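The two listings did not survive in this copy of the post. A hedged reconstruction follows, inferred from the names used elsewhere in the thread (Point with X and Y, Person with a Name); the derived class Student and its ID property are assumptions made purely for illustration, not the author's original code.

```csharp
// Reconstruction for illustration only; not the author's original listing.

// Example 1: a simple data class in the plainest form C# allowed at the time.
public class Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

// Example 2: an abstract base class and one derived class, standing in for
// what other languages would express as a discriminated union.
public abstract class Person
{
    public string Name { get; set; } = "";
}

// "Student" and "ID" are hypothetical names chosen for this sketch.
public class Student : Person
{
    public int ID { get; set; }
}
```

In this form the properties are mutable and two Point objects with the same coordinates do not compare equal, because Equals falls back to reference equality.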
I've expressed them above in the simplest form allowed today. The simplicity leads to the following:
The following seem like common things you'd want to achieve independently or together, that are currently either difficult, verbose or downright impossible:
Value-based equality
Hand-implementing value-based equality is hard, cumbersome and error-prone - especially when inheritance is involved.
With the original records proposal we figured out how to augment the relatively straightforward automatic generation of equality for a given type with the structure that allows correct value-equality across a hierarchy of types.
The difficulty with implementing value-based equality across a type hierarchy lies in ensuring symmetry - that the two values agree on the equality being applied. In order to ensure that, types with value-based equality (or any custom equality) that can participate in a hierarchy of mutually comparable objects must essentially agree on who implements their equality. They can do that by declaring a Type-valued virtual property EqualityContract in the root class of the hierarchy, with every derived type that alters equality overriding that to return its own type. Part of the equality implementation then is to compare EqualityContract as well as the individual data members:
In order to auto-generate such support for value-based equality we need
Strawman: value members
Allow properties and fields to have a value modifier. If they do, equality-related members are declared and/or overridden to define equality in terms of those members, together with any inherited value equality, as specified by the records proposal.
Generates something like:
If value-based equality is inherited from a base class (i.e. it already has an EqualityContract), there'll be base calls to include that part of the equality computation:
Would turn into something like:
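The generated code is missing from this copy. As a rough, hand-written sketch of the EqualityContract pattern described in this section, using only C# that compiles today, something like the following could be what such generation amounts to. The class names, member names and the exact shape of the Equals overloads are assumptions for illustration, not the proposal's actual output.

```csharp
#nullable enable
using System;

public abstract class Person : IEquatable<Person>
{
    public string Name { get; }
    protected Person(string name) => Name = name;

    // Root of the hierarchy declares the contract; types that alter
    // equality override it to return their own type.
    protected virtual Type EqualityContract => typeof(Person);

    public override bool Equals(object? obj) => Equals(obj as Person);

    // Both sides must agree on the contract before members are compared.
    public virtual bool Equals(Person? other) =>
        !(other is null)
        && EqualityContract == other.EqualityContract
        && Name == other.Name;

    public override int GetHashCode() => HashCode.Combine(EqualityContract, Name);
}

public class Student : Person, IEquatable<Student>
{
    public int ID { get; }
    public Student(string name, int id) : base(name) => ID = id;

    protected override Type EqualityContract => typeof(Student);

    // Comparisons arriving through a Person reference are routed here.
    public override bool Equals(Person? other) => Equals(other as Student);

    // Base call covers the null check, the contract and Name; ID is added on top.
    public bool Equals(Student? other) =>
        base.Equals((Person?)other) && ID == other!.ID;

    public override int GetHashCode() => HashCode.Combine(base.GetHashCode(), ID);
}
```

Because both operands consult EqualityContract, a comparison started from either side gives the same answer, which is the symmetry property the section is concerned with.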
Open questions:
IEquatable<T> implementation as well?
== and !=?
EqualityContract or not?
value members on classes are mutable? Is this just fine, or should we warn that these will e.g. be lost in dictionaries if they mutate?
Strawman: value types
The value modifier could be allowed on types as well as on members. Just like the presence of a value member this would cause the type to generate or override value equality.
This addresses the issue with value members that you cannot get generated value equality with no (additional) members, whether in the base or a derived type. For instance
which generates:
This is a key "discriminated union" scenario, if we want to easily express value-based equality that works across a whole set of variants expressed as classes derived from an empty base class.
It also addresses being able to express whether a derived class with no new value members should override the EqualityContract or not:
Strawman: value type implies value members
In addition, we may want to have the value modifier on a type imply that all public properties and fields participate in equality. It would be a nice shorthand:
On the other hand this would make it impossible to specify no-member value equality when a public property, e.g. a computed property, is present. For overrides that could be achieved manually by simply overriding the EqualityContract property, but for the root of the hierarchy it is not quite that simple.
Removing construction boilerplate
Classes often have a lot of trivial boilerplate around declaring a member, declaring a corresponding constructor parameter, and then initializing the member. Before auto-properties there used to be even more, requiring both a property and a backing field, and that's still sometimes the case when auto-properties don't serve the needs.
To this end the records proposal has a primary constructor, where the class itself allows a parameter list, causing members to be automatically declared and initialized from those parameters.
Strawman: Direct constructor parameters
One way to eliminate some of the boilerplate would be to allow a constructor parameter list to directly mention members to be initialized instead of declaring a new parameter. A parameter would implicitly be declared of the same name and type as the member, and the initialization would happen at the beginning of the constructor body, in the order of appearance.
Would generate:
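The listing that followed is missing here. Going only by the rule stated above (a parameter of the same name and type as the member, with initialization at the start of the constructor body in order of appearance), a hand-written equivalent of a hypothetical `public Point(X, Y) { ... }` might look like this sketch:

```csharp
public class Point
{
    public int X { get; }
    public int Y { get; }

    // Assumed expansion of the proposed "direct" parameters X and Y:
    // each becomes an ordinary parameter named and typed after its member.
    public Point(int X, int Y)
    {
        // Initialization happens first, in parameter order.
        this.X = X;
        this.Y = Y;
        // Any user-written constructor body (e.g. validation) would run after this point.
    }
}
```

A caller would still write `new Point(1, 2)`; only the declaration side gets shorter under the strawman.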
One problem is repeated initialization in the face of inheritance. We could have a more discerning rule such that both direct and inherited members are allowed, but we only initialize the direct ones, whereas the inherited ones are expected to be passed to base:
Repeated initialization could still occur if a constructor calls another one from the same class with this(...). We could consider warning on this (they would have to use an ordinary parameter in the calling constructor), or we could refine the rule to not generate initialization of a property whenever a parameter is passed to this(...) or base(...).
Strawman: Primary constructors
We've often talked about allowing (and once almost shipped) primary constructors for all classes. The class name would be optionally followed by a constructor parameter list, and any base class could be followed by an argument list to call a base constructor with:
Other constructors in the body would have to call the primary constructor, directly or indirectly, through this(...). The parameters would be in scope for initializers. Possibly they'd also be available in the body of function members, and would automatically get captured into a private field if necessary.
We could adopt special syntax to provide a constructor body for primary constructors, e.g.:
Direct constructor parameters and primary constructors have great synergy:
Strawman: Primary constructor member declarations
For primary constructors we could allow a further shorthand of not just mentioning but directly declaring a property or field of the enclosing class in the primary constructor parameter list:
If this is allowed, there are likely to be a lot of classes with empty bodies. We could allow such class declarations to end in ; instead of {}, as shown above.
Open questions:
;s etc)?
params)?
Strawman: Primary constructor "inheritance"
One big source of boilerplate is the repetition required when the constructor of a derived class needs to take all the constructor parameters of the base class, only to directly pass them on to the base constructor.
We could introduce a syntax for automatically "inheriting" all the constructor parameters and passing them on to the base class. To do so would require
For example:
The ... means that the primary constructor parameters of the base class (string name in this case) are copied in with the same name, type and order, and that those parameters are implicitly passed on to the base:
Improvements for object initializers
Object initializers are an alternative/supplement to constructors that allow avoiding initialization code entirely at the declaration site, including constructors and their chaining, and provide flexibility at the call site (which properties in which order). The big downside is that they require the properties/fields to be mutable!
It would be desirable to allow this even for immutable properties and fields. Unfortunately, we cannot just start allowing object initializers on existing readonly fields and getter-only auto-properties, because that would bypass any constructor-based validation logic that authors were relying on. It would have to be enabled only for fields and properties that opt in.
Strawman: Init-only properties
We could introduce a new init accessor to properties, which is mutually exclusive with set. It works in the same way as a set accessor, except that it can only be used in the initialization of the object - including from an object initializer (or with-expression - more about those later).
Also, we could consider requiring that an init-only auto-property that is not initialized by the constructor and has no initializer must be initialized upon construction.
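For orientation, C# 9 later shipped an init accessor along these lines. The sketch below uses that shipped syntax, which may differ in detail from what this strawman had in mind; it assumes a compiler and runtime with C# 9 support (.NET 5 or later).

```csharp
public class Point
{
    // "init" takes the place of "set": assignable during object
    // initialization, read-only afterwards.
    public int X { get; init; }
    public int Y { get; init; }
}
```

With this declaration `var p = new Point { X = 1, Y = 2 };` compiles, while a later `p.X = 3;` is rejected by the compiler.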
Open questions:
Strawman: validation accessors for auto-properties
One common way to "fall off the cliff" with auto-properties is to need validation (or other) logic in the setter. We could extend auto-properties so that a set { ... } body can be provided. What still identifies it as an auto-property is that the get; accessor is empty.
A setter body in an auto-property doesn't have to - and doesn't get to - assign to the backing field, which remains anonymous. It is assigned automatically before (or after?) the specified setter body runs. The value contextual keyword is available as in all setters, and can be used for side effects (such as raising exceptions).
This would of course blend well with init-only properties, where validation logic could be placed directly in the init accessor of an auto-property.
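To make the "cliff" concrete, here is a sketch of what adding null validation to a property requires without this feature: the auto-property has to be abandoned in favour of an explicit backing field. The Person class and the null check are illustrative assumptions, not code from the proposal.

```csharp
#nullable enable
using System;

public class Person
{
    // Explicit backing field: exactly the boilerplate the strawman wants to avoid.
    private string name = "";

    public string Name
    {
        get => name;
        // Validation and assignment both have to be written by hand.
        set => name = value ?? throw new ArgumentNullException(nameof(value));
    }
}
```

Under the strawman the field and the assignment would stay implicit, and only the validation logic would appear in the setter body.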
When a connection between a constructor parameter and a property is specified (e.g. through direct constructor parameters), we could let a caller initialize the property through an object initializer, but have it mapped to a constructor parameter. For a class with direct constructor parameters:
We could have:
Translate to:
Non-destructive mutation and data classes
With immutable objects it is common to want to produce a new object from an old one, with just a few properties changed. There's an attractive object-initializer-like syntax we could use for that:
The idea is that p2 is an exact copy of p1 - including its runtime type and the value of properties that are not statically known at this point in the code - except for the changes specified in the object initializer part of the expression.
The question is what that means? There are two seemingly competing approaches. One overall question is whether "withing" is something that needs to be opted into? Can anyone copy any object by saying o with {}? If you need to opt in, what does that look like?
Strawman: withers through copy-and-update
For classes that rely on object initializers for property initialization and validation the desired behavior would be:
MemberwiseClone (can be done without opt-in but is it "safe"?) or some required virtual clone method on the class.
set or init accessor to be changed.
Validation happens per member, as the properties are called.
Strawman: withers through virtual factories
For classes that rely on constructors for property initialization and validation the desired behavior would be:
Validation happens again on all values passed through, even the ones that weren't changed, since the constructor that's ultimately called can't tell the difference.
For this kind of wither, several things would need to be in place for the compiler to know what to generate, both for the implementation of the virtual factory (let's call it With) and for the call to it:
With method needs to generate a constructor call, it needs to be able to collect the arguments to it from a) the properties assigned in the with expression and b) the existing property values in the source object. In practice it seems all the primary constructor parameters need to be property parameters
I'm going to tentatively "burn" the data modifier for the purpose of designating that a wither is desired. Later I'm going to hang more off of that.
Would generate:
Data classes inheriting other data classes are required to override the wither of the base class:
Generates:
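The generated listings are missing from this copy. The following is a hand-written sketch of the virtual-factory idea in present-day C#: a With method per data class, overridden in derived classes so that a copy made through a base-typed reference keeps its runtime type. The Person/Student names and the exact With signatures are assumptions for illustration only.

```csharp
public abstract class Person
{
    public string Name { get; }
    protected Person(string name) => Name = name;

    // Virtual factory taking one argument per constructor parameter.
    public abstract Person With(string name);
}

public class Student : Person
{
    public int ID { get; }
    public Student(string name, int id) : base(name) => ID = id;

    // Overrides the base wither: unchanged members (ID) are read from this object.
    public override Person With(string name) => With(name, ID);

    // The wither for Student's own full parameter list.
    public virtual Student With(string name, int id) => new Student(name, id);
}
```

Under this sketch, an expression such as `p with { Name = "Bob" }` on a Person-typed variable could be lowered to `p.With("Bob")`, and every value passes through the constructor again, as the section notes.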
Strawman: Auto-generated deconstructors
For "positional" data types, in particular small ones, it is often convenient to have a positional deconstructor that is the "inverse" of the primary constructor.
In order to auto-generate a deconstructor the compiler would need to know:
Those are exactly the same requirements as for withers, when implemented as virtual factories! It seems reasonable that deconstructors are controlled by the same opt-in as withers. In that case, all data classes would generate a wither as well as a deconstructor:
Would generate:
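The generated listing is not present in this copy. A positional deconstructor can already be written by hand with the Deconstruct pattern (available since C# 7); the sketch below shows roughly what auto-generation would save. The parameter names are illustrative.

```csharp
public class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // The "inverse" of the constructor: one out parameter per constructor
    // parameter, in the same order.
    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }
}
```

This enables `var (x, y) = new Point(1, 2);` at the call site.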
Strawman: Abbreviated data members
As proposed the data keyword requires that all primary constructor parameters map to a property or field, so they may be of the form X (referencing a declared member X) or public int X { get; } (if we allow members to be declared directly as constructor parameters), but we would not allow ordinary constructor parameters int X.
This means that the syntax int X is "free" to be used otherwise in data class primary constructor parameters. We could make it a shorthand for declaring a public getter-only property on the class, just as proposed in the records proposal:
Generates the same as above - in particular the members
For explicit member declarations, we could also consider coopting the "default" meaning of int X; to generate public init-only properties. This would allow a similar shorthand for data members that aren't part of the primary constructor, and thus for data classes that are less "positional" and more "nominal":
This would be shorthand for the class declaration:
Strawman: Implied inherited constructors
In derived data classes we could not only require that the primary constructor inherits its base members with ..., we could simply make it so, letting you - or making you - leave out the ... and just concatenating the constructor parameters by default. So:
Means the same as
Strawman: data classes as value classes
Across all of the above, there are two new main "kinds" of classes proposed: Value classes which automatically support value-based equality over a set of members, and data classes which automatically support non-destructive mutation and deconstruction. Both are brought into play with a modifier - value and data respectively - and while both operate over a set of members, it is not obvious that those would necessarily be the same members.
How can we most seamlessly and naturally combine them?
It does not seem appetizing to suggest that you need to use both the value and data modifiers if you want both sets of functionality. After all, both features are useful primarily in scenarios where data is immutable (otherwise value-based equality is "dangerous" when combined with e.g. dictionaries, and non-destructive mutation is really unnecessary when you have old-fashioned destructive mutation!), and it is going to be very common to want to apply them both.
Would it be reasonable to say that all value classes are data classes? Probably not. Data classes come with a lot of restrictions that don't seem warranted for value classes. You can certainly imagine classes that just want to add value equality, without being forced into primary constructors, mapping between constructor parameters and members, etc. Also, you may not want to allow your objects to be copied! Turning that off would not be easy, as you couldn't just provide your own implementation of something to take precedence over a generated one - the "something" is a With method that you don't want to have even exist!
Would it be reasonable to say that all data classes are value classes? Probably. The whole notion of non-destructive mutation sort of implies that object identity doesn't really matter, and that multiple physical objects can represent the same "value" at different times. If you really do want to keep reference equality by default, we could let you explicitly (and easily) implement it yourself, and let that implementation prevent one from being generated.
So the proposal is that data on a class also implies value equality. On which members, though?
I'd propose that this is somewhat left to the user, in the following way:
value keyword would need to be manually applied.
value is implied.
Thus:
This does leave a small wrinkle, similar to one we saw with value classes above: What if I want a derived data class with no new members that is "equality compatible" with its base class (doesn't override EqualityContract)? For value classes we solved it by whether the value modifier was on the type, but that's a no go here.
My best proposal is to allow the (...) or () empty primary constructor to be omitted, and for that to mean that value equality isn't overridden. Not the most obvious syntactic hint, but right there for the taking:
This declaration overrides the With method, but not the EqualityContract.
Conclusion
The above many sub-proposals together span most or all of what we've talked about records doing. In the end, data classes become the combined feature that lets you get (nearly) the same brevity as the all-in-one proposals, while many aspects are still factored out to be usable independently, most notably value equality.
There are many details to iron out but I think this paints a fairly promising picture of how full-blown records with unprecedented inheritance resilience can be achieved in a gradual, "cliff-less" fashion.
LDM notes: