Nominal records #3226

Closed
MadsTorgersen opened this issue Feb 22, 2020 · 15 comments
@MadsTorgersen
Contributor

MadsTorgersen commented Feb 22, 2020

Nominal records

Many of the recent LDM discussions around records tend to circle back to a fundamental question: do we want to support "nominal records"?

The "classic" proposal for records that got us going is what one might call "positional": proposals/csharp-9.0/records.md. The "primary" members of the record are listed as part of a primary-constructor-like parameter list on the class name itself, and everything "record-like" applies only to those members.

Here is an example of positional records:

public record Person(string FirstName, string LastName);
public record Student(string FirstName, string LastName, int ID) : Person(FirstName, LastName);

An alternative proposal is more "nominal": proposals/recordsv2.md. It considers all public fields and properties for "record-like" behavior, making the primary constructor an optional and largely orthogonal addendum, that more or less allows positional records as a special case.

Here is an example of nominal records:

public record Person { string FirstName; string LastName; }
public record Student: Person { int ID; }

These are both specific proposals, and there are many other ways each could have been achieved. But they represent two fundamentally different directions:

  • Positional-only records, where the record behavior for a given member comes from it being in the record's parameter list
  • Nominal records, where the record behavior for a given member derives from it being a member, not from whether it is listed in a primary constructor.

Nominal records are largely a superset of positional records. However, this is still a fundamental fork in the road, since there are many ways in which a positional-only implementation of records would make different choices than one that either includes or anticipates a generalization to nominal records.

The case for nominal records

The case for nominal records is really a two-step argument:

  1. In classes in general we should allow nominal creation (object initializers) for immutable properties - today we only allow it for mutable properties.
  2. If we go on to have both positional and nominal creation modes for immutable classes, those should both be equally available in records.

The nominal records proposal suggests a general feature of "initonly" properties and fields to support nominal creation of immutable objects (1), and then goes on to include those in records (2). Here I will just briefly summarize the problems of constructor-based object creation that are addressed by supporting object initializers:

  1. On the consumer side, creating objects with a constructor works well when they are small, but gets unwieldy for medium to large objects: argument order must be respected, default values are limited to what can be represented in metadata, and it's a binary break when the constructor adds a parameter, even an optional one.
  2. On the declaration side there's duplication between constructor parameters, the member declarations themselves, and the assignment between them. All this must be kept in sync, or bugs arise. Positional records can mitigate some of the duplication with default behaviors, but are still prone to bugs when members are manually declared.
  3. Constructors break the inheritance boundary by requiring a derived class to repeat the base class members, typically just to blindly pass them on. This leads to quadratic code growth down the inheritance chain, and brittleness in the interface between base and derived class: If a base class adds a member, the derived constructors break.

With nominal construction, you just add a member declaration, and it doesn't need to be mentioned anywhere else for the consumer of a class or its derived classes to be able to initialize it.
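
As a rough illustration of the difference, here is a sketch in today's C#. The class names (PersonP, StudentP, PersonN, StudentN, CreationDemo) are made up for this example and are not part of either proposal. The nominal half has to use settable properties, because object initializers currently only work with mutable members.

```csharp
using System;

// Constructor-based (positional) creation: StudentP must repeat PersonP's
// parameters just to pass them on to the base constructor.
public class PersonP
{
    public string FirstName { get; }
    public string LastName { get; }

    public PersonP(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

public class StudentP : PersonP
{
    public int ID { get; }

    public StudentP(string firstName, string lastName, int id)
        : base(firstName, lastName)   // base members repeated and forwarded
    {
        ID = id;
    }
}

// Nominal creation via object initializers. Today this needs settable
// (mutable) properties; hypothetical classes for illustration only.
public class PersonN
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
}

public class StudentN : PersonN
{
    public int ID { get; set; }       // the base members are not mentioned here
}

public static class CreationDemo
{
    public static StudentP Positional() => new StudentP("Mads", "Torgersen", 1);

    // Members are named, so they can be given in any order.
    public static StudentN Nominal() =>
        new StudentN { ID = 1, LastName = "Torgersen", FirstName = "Mads" };
}
```

Adding a member to PersonN requires no change to StudentN or to existing callers, whereas adding a constructor parameter to PersonP forces StudentP's constructor to change as well.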

The main thing you lose out on with nominal construction is a centralized place - the constructor body - for validation. Property setters can have member-wise validation, but cross-member holistic validation is not possible. However, for a feature such as records that is for data not behaviors, that seems to be a particularly small sacrifice.

The proposal is not to get rid of positional creation of records but to supplement it with a nominal alternative, just like constructors and object initializers co-exist today.

Record behaviors

The main behaviors bestowed by records in the current consensus are:

  • Abbreviated member declarations: Records allow a shorthand for declaring public immutable members that participate in the "recordness".
  • Value equality: It is the current plan that records should support some form of value equality by default.
  • Non-destructive mutation: records enable with-expressions that create a new object from an old one that is identical (including its runtime type) except for changes specified in the with-expression.
  • Auto-deconstructors: It is the current plan that records would generate a deconstructor to match their primary constructor.

Let's examine each of these in turn in the context of nominal records.

Abbreviated member declarations

Just like positional-only records, nominal records would support a shorthand whereby simple primary constructor parameters int X imply a public getter-only auto-property by default:

record Point(int X, int Y);
// means
record Point(X, Y)
{
    public int X { get; }
    public int Y { get; }
}

(Assuming some as-yet undecided means of tying constructor parameters and members together).

On top of that, a simple member declaration int X; would expand to a public init-only auto-property:

record Point { int X; int Y; }
// means
record Point
{
    public int X { get; init; }
    public int Y { get; init; }
}

(Assuming some as-yet undecided syntax for init-only auto-properties).

Both come with the risk that something that looks like a simple parameter or private field declaration assumes more meaning specifically within records. This may be confusing at first. Making records "more different" by giving them their own record keyword instead of class may help set the right expectations here; similar to how interface changes the defaults on accessibility.

Value equality

We've worked out the mechanics and implementation choices around how to generate code for value equality, even in the presence of inheritance. The main discussion has been how to opt in and which members participate.

We currently lean towards value equality being a separate feature from records, opted in at the type level and based on equality of the fields of the class (#3213).

However, this is one of the things that seems to hinge somewhat on the outcome of the nominal records discussion. If we are willing to commit to a positional-only approach, and we're willing to differ from other classes (or change our minds on them), then a positional-only approach has a clear set of members to base value equality off of: the "primary" members listed in the constructor parameter list of the record.

For nominal records, the set of members participating in equality must be broadened to include members that are not part of the constructor. There seem to be two avenues:

  1. Use the fields (the physical state of the object)
  2. Use the public fields and properties (the logical public state of the object)

While we are currently leaning in the direction of 1, our decision may hinge on what we decide on the following issue of non-destructive mutation, and to what extent we want a consistent approach between the two.
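
To make the two avenues concrete, here is a small hand-written sketch. The Temperature class, its private cache and the two comparison methods are hypothetical; neither method is what a compiler would necessarily generate.

```csharp
using System;

public class Temperature
{
    // Transient cache: part of the physical state, not of the public state.
    private string _cachedText = "";

    public double Celsius { get; }

    public Temperature(double celsius) { Celsius = celsius; }

    public string Describe()
    {
        if (_cachedText.Length == 0) _cachedText = Celsius + " C";
        return _cachedText;
    }

    // Avenue 1: compare the fields (Celsius stands in for its backing field).
    public bool FieldEquals(Temperature other) =>
        other != null && Celsius == other.Celsius && _cachedText == other._cachedText;

    // Avenue 2: compare only the public fields and properties.
    public bool PublicStateEquals(Temperature other) =>
        other != null && Celsius == other.Celsius;
}
```

Two Temperature objects with the same Celsius value compare equal under avenue 2 regardless of history, but under avenue 1 they stop being equal once Describe() has been called on only one of them.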

Non-destructive mutation

Non-destructive mutation is envisioned as a new with operator, that creates a new object as a copy of an old object with specified modifications: Point p2 = p1 with { Y = 6 };.

In the positional-only proposals, copy and modify are combined in a single virtual With method. It takes all the statically known members of the record as arguments, and each derived record overrides it to call the constructor of the object's runtime type, passing those arguments along with the object's own values for the members not statically known at the call site.

public record Person(string FirstName, string LastName);
public record Student(string FirstName, string LastName, int ID) : Person(FirstName, LastName);

Person p1 = new Student("Mads", "Kristensen", 1);
Person p2 = p1 with { LastName = "Torgersen" };

// becomes

public record Person(string FirstName, string LastName)
{
    ...
    public virtual Person With(string FirstName, string LastName) => new Person(FirstName, LastName);
}
public record Student(string FirstName, string LastName, int ID) : Person(FirstName, LastName)
{
    ...
    public sealed override Person With(string FirstName, string LastName) => With(FirstName, LastName, this.ID);
    public virtual Student With(string FirstName, string LastName, int ID) => new Student(FirstName, LastName, ID);
}

Person p1 = new Student("Mads", "Kristensen", 1);
Person p2 = p1.With(p1.FirstName, "Torgersen");

Note that the generated With call has to copy the FirstName property of the old object to the parameter list of the With method. The benefit of this complicated dance is that the primary constructor of the new object does end up getting called, and any validation code gets run again (provided we offer a feature to manually augment the primary constructor with validation code).
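
For anyone who wants to try the pattern today, the following is a sketch of the same shape written as ordinary classes. The null-or-empty check on LastName is a made-up stand-in for user validation code; it is not part of the proposal.

```csharp
using System;

public class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        // Hypothetical validation, re-run on every With call.
        if (string.IsNullOrEmpty(lastName))
            throw new ArgumentException("LastName is required", nameof(lastName));
        FirstName = firstName;
        LastName = lastName;
    }

    public virtual Person With(string firstName, string lastName) =>
        new Person(firstName, lastName);
}

public class Student : Person
{
    public int ID { get; }

    public Student(string firstName, string lastName, int id)
        : base(firstName, lastName)
    {
        ID = id;
    }

    // Keeps the runtime type and supplies the member the caller cannot see.
    public sealed override Person With(string firstName, string lastName) =>
        With(firstName, lastName, ID);

    public virtual Student With(string firstName, string lastName, int id) =>
        new Student(firstName, lastName, id);
}
```

Calling p1.With(p1.FirstName, "Torgersen") on a Person-typed reference to a Student yields a Student with the same ID, and passing an empty last name throws, because the constructor runs again.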

For nominal record members, nothing as complicated is necessary. There does need to be a means of copying the object itself with the correct runtime type and state, but then the modified members can just be assigned as a separate step after copying. Here is a version of that:

public record Person { string FirstName; string LastName; }
public record Student: Person { int ID; }

Person p1 = new Student { ID = 1, FirstName = "Mads", LastName = "Kristensen" };
Person p2 = p1 with { LastName = "Torgersen" };

// becomes

public record Person
{
    ...
    protected Person(Person p) => (FirstName, LastName) = (p.FirstName, p.LastName);
    public virtual Person With() => new Person(this);
}
public record Student: Person
{
    ...
    protected Student(Student s) : base(s) => ID = s.ID;
    public override Student With() => new Student(this); // Assuming covariant return
}

Person p1 = new Student { ID = 1, FirstName = "Mads", LastName = "Kristensen" };
Person p2 = p1.With(); p2.LastName = "Torgersen";

The question is: How to mix the nominal and positional approach to non-destructive mutation?

There seem to be two dimensions to this question:

  • Do we buy that the positional members need special treatment? I.e. is it important that they get sent through the primary constructor for validation? Or can we just apply the nominal approach to all of them?
  • Do we want the nominal copy behavior to be based off of the public members (fields and properties), which would take it through their setter code, including validation? Or should we just indiscriminately copy all the fields of the object, circumventing all validation?

Both of these questions go to our ability to validate input as new record objects are created from old ones. And even though they are separate questions, probably only two combinations of answers make sense (though I am very interested to hear counterarguments to that):

  1. Everything is validated the way it's supposed to be: Positional properties go through the primary constructor, and the rest get copied public member to public member, running all setters (and getters) accordingly.
  2. Nothing is validated. With-expressions just copy the fields and perform modifications.

Option 2 has not really been explored elsewhere, so let's spend a few paragraphs on it.

It is similar to the "natural value equality" approach to equality in its "just the state" philosophy. We could call it "natural non-destructive mutation" to mirror that. Just like that proposal, it has a strong parallel to how structs work: You don't get to decide how structs are copied, that's all determined by the physical state.

Just like "natural value equality" you could imagine [Key(boolean)] attributes to affect whether state is copied or not. If something is [Key(false)] then it doesn't participate in value equality (because it is just a transient cache) and it does not participate in non-destructive mutation (because it is just a transient cache).

The simplicity of option 2 does come with a severe cost: records and validation don't mix well. Records would truly be for data, not behavior.
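
To see what that cost looks like, here is a minimal sketch that uses the existing Object.MemberwiseClone as a stand-in for a compiler-generated field copy. The CopyFields method and the SetterCalls counter are invented for the demonstration; the proposal does not specify how the copy would be implemented.

```csharp
using System;

public class Person
{
    // Demo-only instrumentation: counts how often the setter runs.
    public static int SetterCalls;

    private string _lastName = "";

    public string LastName
    {
        get => _lastName;
        set { SetterCalls++; _lastName = value; }
    }

    // Shallow field-wise copy that keeps the runtime type.
    // No constructor and no setter runs for the copy.
    public Person CopyFields() => (Person)MemberwiseClone();
}

public class Student : Person
{
    public int ID { get; set; }
}
```

Copying a Student through a Person-typed reference produces a Student with the same field values, and SetterCalls does not change, so any validation in the setter is skipped for the copy.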

Whether we pick 1 or 2, this is a fundamental fork in the road that we can't easily change our minds on later.

The only way not to pick a fork is to limit records so much that there isn't any observable difference. If we make it so that you cannot provide user-defined behavior on object creation - no constructor bodies, no get/set bodies - then the remaining default behavior would be indistinguishable between option 1 and 2. Perhaps this is a good place to start in C# 9.0?

Auto-deconstructors

This seems to be the least complex feature to generalize towards nominal records! We can just say that if you provide a primary constructor, you get a corresponding deconstructor. This creates a subtle difference between record { int X; int Y; } and record (){ int X; int Y; } in that the former would not get a deconstructor and the latter would get an empty one. Would that surprise anyone?
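
For reference, this is roughly what a deconstructor matching a two-parameter primary constructor looks like when written by hand in current C#. The Point class here is an ordinary class used as an assumption about the generated shape, not the output of any compiler.

```csharp
public class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    // Mirrors the constructor's parameter list, in the same order.
    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }
}
```

With this in place, var (x, y) = new Point(3, 4); assigns 3 to x and 4 to y.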

LDM notes:

@MadsTorgersen MadsTorgersen added this to the 9.0 candidate milestone Feb 22, 2020
@MadsTorgersen MadsTorgersen self-assigned this Feb 22, 2020
@qrli

qrli commented Feb 22, 2020

From my experience, most simple data classes grow over time. The obstacles to adding new members to a positional record make me feel that it is better to avoid it.

The nominal record proposal has more or less the same level of terseness as positional in its minimum form. I doubt I'd ever use the positional form at all.

@orthoxerox

I agree with @qrli, the positional form is terser only if it becomes truly positional, that is, record Point(int, int);.

@HaloFour
Contributor

Again, I see absolutely no reason why this syntax and functionality can't be applied to struct. Having record be the type and imply class makes no sense. That Point type is a perfect use case for a struct record.

@HaloFour
Contributor

HaloFour commented Feb 22, 2020

Are these nominal record properties immutable or not? The expanded "wither" code implies that they're not. And even if the compiler pretends really hard that they are, they're not.

I've proposed following a builder pattern for the purposes of initializing required members of a nominal record. I also feel that they make it relatively simple to enable "withers", and they do so by promoting existing patterns that work across all versions of the runtime and languages in the ecosystem:

https://gist.github.com/HaloFour/bccd57c5e4f3261862e04404ce45909e

@WolvenRA

I like some of the simplicity of the "nominal" records. I've read through all of the issues related to records I could find but I still have a couple of questions...

Most importantly, exactly what are we trying to accomplish with "records"?
Why is immutability of some fields/properties a (seemingly) important thing?

Finally, looking at your examples of Non Destructive Mutation, I'm wondering why you need the "with" word at all?
Given example;

Person p1 = new Student("Mads", "Kristensen", 1);
Person p2 = p1 with { LastName = "Torgersen" };

Why not just;

Person p1 = new Student("Mads", "Kristensen", 1);
Person p2 = p1 { LastName = "Torgersen" };

It would seem quite apparent that you're overriding the p1 value of LastName for p2, even without the "with".

@ChristianHoe

A lot of the problems come from the mutability. So why not restrict records to const types?

public const class/struct Point { public int X { get; set; } /* setter can be only used by initializer */ }

@HaloFour
Contributor

So why not restrict records to const types?

What's a const type? Sounds like something that would depend on a completely different proposal which hasn't been proposed yet. And const already means something very different in C#.

The complexity isn't around "mutating" a record, it's around creating a new instance of the record with slightly different values. with doesn't modify the existing instance.

@WolvenRA

There are many different data types (video, sound, text, etc.). Since I'm a Business Application programmer, when I think of "records" I normally think of database table records (rows) and, generally, when I'm dealing with that type of "record" it's for the express purpose of changing (i.e. mutating) it. From that perspective, making "records" to be immutable by default would be a PITA. Imagine a database that only allowed immutable records (rows).

On the other hand, maybe database-based applications are not the point of this "records" proposal. If that's the case, maybe they should be called something other than "records".

@WolvenRA

@HaloFour If "const" doesn't mean immutable, what does it mean? I'm not trying to argue, I'm just trying to understand the difference between const and immutable.

@drewnoakes
Member

Bear with me, as this will seem unrelated at first...

I've long wanted to create "strict aliases" for common values.

typedef Kilometres double;
typedef Miles double;

Miles m = 123;
Kilometres k = m; // error

This would statically prevent using values in incorrect domains, just because they have the same physical representation. Today you'd have to create a new type with all the boilerplate stuff (so no one does it). There should be no runtime overhead for this safety.

Other motivating examples are identifiers (OrderId/CustomerId), coordinate systems (WorldPoint/ScreenPoint), etc.

With such a capability, the ordinal approach would be very similar to a "strict aliased" value tuple:

typedef Point (int X, int Y);

@HaloFour
Contributor

HaloFour commented Feb 24, 2020

@WolvenRA

If "const" doesn't mean immutable, what does it mean?

In C# it specifically refers to a value that can be computed by the compiler and embedded directly into the assembly. When you refer to that const by name the compiler actually replaces it with that constant value. This is why consts are limited to expressions that can be evaluated at compile time and not those that require runtime evaluation. A const can't be an instance of a class, or the result of a method invocation. And given a const is a single named value it doesn't make much sense for it to apply to a type.

Not saying that this can't change, but that would be a separate proposal.

Since I'm a Business Application programmer, when I think of "records" I normally think of database table records (rows) and, generally, when I'm dealing with that type of "record" it's for the express purpose of changing (i.e. mutating) it.

Unfortunately "record" is a massively overloaded term in computer science and many languages have an implementation of something that they call "records" that are completely different from what another language might call "records". Even in C# the term is being used to refer to very different things, which is why you'll hear "positional records" and "nominal records" to refer to records where the elements are defined positionally or by name. You could make the argument that C# structs are already records and you wouldn't be wrong.

@dsaf

dsaf commented Feb 24, 2020

The main thing you lose out on with nominal construction is a centralized place - the constructor body - for validation. Property setters can have member-wise validation, but cross-member holistic validation is not possible. However, for a feature such as records that is for data not behaviors, that seems to be a particularly small sacrifice.

Any news on method contracts? Maybe that could be used?

@TheUnlocked

So just to clarify, would mixing nominal and positional arguments like in:

record Person(string FirstName, string LastName) { string MiddleName; }
record Student(string FirstName, string LastName): Person(FirstName, LastName) { int ID; }

be legal?

@Joe4evr
Contributor

Joe4evr commented Feb 24, 2020

@drewnoakes Your post is not the same as the Records proposal talked about here, and is largely a duplicate of #1695. Please continue in that thread for any discussion on auto-wrappers/typedefs.

This was referenced Mar 18, 2020
@jcouv jcouv removed this from the 9.0 candidate milestone Nov 11, 2020
@jcouv
Member

jcouv commented Nov 11, 2020

Closing as the C# 9 records feature is now tracked by #39

@jcouv jcouv closed this as completed Nov 11, 2020