Exploration: Shapes and Extensions #164

MadsTorgersen · 2017-02-22T01:18:58Z

Maintainer

Shapes and Extensions

This is essentially a merger of two other proposals:

Extension everything, which allows types to be extended with most kinds of members in the manner of extension methods, and
Type Classes, which provide abstraction over sets of operations that can be added to a type separate from the type itself.

These two features have good synergy, and would benefit from being designed together. This proposal is a concrete shot at doing so, knowing full well that there are a myriad of different decisions that could be made. This is not to be particularly opinionated about those choices - it's just easier to understand and discuss a general proposal when it has a concrete shape.

Extensions

The idea behind "extension everything" in most proposals is to use a different approach to declaration syntax from today's "static methods with a this modifier", instead providing a type-like declaration with a name, an indication of the type to be extended, and a set of member declarations for that type. This syntactic approach generalizes more easily to other member kinds, including properties, static members and even operators.

Here is an example adding and using a static property:

public extension IntZero of int
{
    public static int Zero => 0;
}

WriteLine(5 + int.Zero); // in the scope of the extension, int has a Zero property

The name of the extension declaration (like that of the static class containing an extension method today) is useful primarily for disambiguation purposes. We'll get back to that later.

What should extension declarations compile into? The straightforward answer is a static class with static members (taking an extra parameter for the receiver if necessary). However, as we'll see below, this proposal suggests a different approach.
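
For comparison, the "straightforward" static-class encoding could look roughly like this in today's C#. This is only a sketch: the class name IntZeroExtension and the IsEven member are hypothetical, added to show how an instance member would take the receiver as an extra parameter.

```csharp
// Hypothetical static-class lowering of an extension on int.
public static class IntZeroExtension
{
    // A static extension property becomes a plain static property.
    public static int Zero => 0;

    // An instance extension member takes the receiver as an extra parameter.
    public static bool IsEven(int @this) => @this % 2 == 0;
}
```

Under this encoding, a use such as int.Zero in source would be rewritten by the compiler to IntZeroExtension.Zero.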

Shapes

Interfaces abstract over the shape of objects and values that are instances of types. The idea behind type classes is essentially to abstract over the shapes of the types themselves instead. Furthermore, where a type needs to opt in through its declaration to implement an interface, somebody else can make it implement a type class in separate code.

In C#, let's call type classes "shapes":

public shape SGroup<T>
{
    static T operator +(T t1, T t2);
    static T Zero { get; }
}

This declaration says that a type can be an SGroup<T> if it implements a + operator over T, and a Zero static property.

As an example, the int type is halfway to implementing SGroup<int>, since it has a + operator over int, and above we showed how to use an extension to add a static int-valued property Zero. Let's make it so that an extension declaration can also declare that the extended type implements a given shape:

public extension IntGroup of int : SGroup<int>
{
    public static int Zero => 0;
}

This declaration extends int not only with the Zero property, but with SGroup<int>-ness. In the scope of this extension, int is known to be an SGroup<int>.

In general, a "shape" declaration is very much like an interface declaration, except that it:

Can define almost any kind of member (including static members)
Can be implemented by an extension
Can be used like a type only in certain places

That last restriction is important: a shape is not a type. Instead, the primary purpose of a shape is to be used as a generic constraint, limiting type arguments to have the right shape, while allowing the body of the generic declaration to make use of that shape:

public static T AddAll<T>(T[] ts) where T : SGroup<T> // shape used as constraint
{
    var result = T.Zero;                   // Making use of the shape's Zero property
    foreach (var t in ts) { result += t; } // Making use of the shape's + operator
    return result;
}

So as an important special case, shapes address the long-desired goal of abstracting numeric and computational code over the specific data types being manipulated, while allowing clean use of operators.

Let's call the AddAll method with some ints:

int[] numbers = { 5, 1, 9, 2, 3, 10, 8, 4, 7, 6 };
WriteLine(AddAll(numbers)); // infers T = int

Clearly we need to check the constraint at the call site. If this is called within the scope of the IntGroup extension declaration above, the compiler does indeed know that T = int satisfies the SGroup<T> constraint. However, there is more going on: how does the AddAll method know how int is an SGroup<int>, when that is only known at the call site? There needs to be more information passed in than just the (inferred) int type argument and the numbers array.

Implementation

There is an implementation trick at play here, which is stolen straight out of the type classes proposal referenced above. The trick starts as follows:

Shapes are translated into interfaces, with each member (even static ones) turning into an instance member on the interface
Extensions are translated into structs, with each member (even static ones) turning into an instance member on the struct
If the extension implements one or more shapes, then the underlying struct implements the underlying interfaces of those shapes

In our example, the shape and extension declarations translate into this:

public interface SGroup<T>
{
    T op_Addition(T t1, T t2); // Can't use "operator +" here
    T Zero { get; }
}

public struct IntGroup : SGroup<int>
{
    public int op_Addition(int i1, int i2) => i1 + i2;
    public int Zero => 0;
}

(Instance declarations of operators aren't allowed in C#, so those + "methods" are encoded as instance methods, just as today's operator declarations are actually encoded as static methods in IL).
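
The static-method encoding of today's operators can be observed with reflection. A small sketch, using decimal because its + operator is user-defined (the built-in int + compiles to an IL add instruction instead); the class name OperatorEncodingDemo is made up for the illustration:

```csharp
using System;
using System.Reflection;

public static class OperatorEncodingDemo
{
    // Looks up the static method that decimal's "operator +" is compiled to.
    public static MethodInfo FindDecimalAddition() =>
        typeof(decimal).GetMethod("op_Addition", new[] { typeof(decimal), typeof(decimal) });
}
```

The returned method is static and carries the special-name flag, which is how compilers recognize it as an operator.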

Note that the struct encoding of IntGroup has a declaration for the + operator even though the original extension declaration doesn't. It captures what it thinks + means on ints, and thus fulfills the SGroup<int> interface.

The generic method taking shape-constrained type parameters is translated as follows:

For each type parameter that is constrained by one or more shapes, the generic method actually gets an extra type parameter constrained by struct and by the underlying interfaces of those shapes
The method creates and keeps an instance of each of those extra type parameters (it can because of the struct constraint)
Whenever an operation from the shape is used in the body, the translation instead calls it on the corresponding instance

Let's see that on the AddAll method from above:

public static T AddAll<T, Impl>(T[] ts) where Impl : struct, SGroup<T>
{
    var impl = new Impl();

    var result = impl.Zero;
    foreach (var t in ts) { result = impl.op_Addition(result, t); }
    return result;
}

See how the extra Impl type parameter carries the knowledge of how to do + and Zero into the method. The benefit of doing this with a struct type parameter, rather than, say, extra delegate parameters, is that the runtime does a really good job of optimizing it: it will specialize the generic method code for each different struct it gets called with, so that the method body can inline and optimize the specific + and Zero implementations. Measurements on the linked type classes proposal show incredibly good performance, with near-zero cost to the abstraction.

In general I show the translations instantiating the Impl structs once when possible, but it is essentially free to create an instance of an empty struct, so we could also consider instantiating it every single time we need to call a member on it. That's less readable though.
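
To make the last two points concrete, here is a self-contained sketch in today's C#. The first method instantiates the empty struct at every use via default(Impl) instead of keeping one instance around; the second is the delegate-based alternative mentioned above. AddAllWithDelegates and the GroupDemo class are hypothetical names used only for contrast, and the sketch makes no performance claim of its own.

```csharp
using System;

// Underlying interface of the shape, as in the translation above.
public interface SGroup<T>
{
    T op_Addition(T t1, T t2);
    T Zero { get; }
}

// Witness struct for int under addition.
public struct IntGroup : SGroup<int>
{
    public int op_Addition(int i1, int i2) => i1 + i2;
    public int Zero => 0;
}

public static class GroupDemo
{
    // The witness arrives as a struct type argument and is created at each use.
    public static T AddAll<T, Impl>(T[] ts) where Impl : struct, SGroup<T>
    {
        var result = default(Impl).Zero;
        foreach (var t in ts) { result = default(Impl).op_Addition(result, t); }
        return result;
    }

    // Hypothetical alternative: the operations arrive as ordinary arguments.
    public static T AddAllWithDelegates<T>(T[] ts, Func<T, T, T> add, T zero)
    {
        var result = zero;
        foreach (var t in ts) { result = add(result, t); }
        return result;
    }
}
```

Both versions compute the same result; they differ in whether the operations are known from a type argument or passed in as values at run time.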

Finally, there's a bit of extra work for the call site: It needs to infer and pass that extra type argument:

int[] numbers = { 5, 1, 9, 2, 3, 10, 8, 4, 7, 6 };
WriteLine(AddAll<int, IntGroup>(numbers));

It infers T = int the normal way, and then looks to find exactly one declaration in scope that implements SGroup<int> on int. Finding the IntGroup extension, it passes its underlying struct type. In case of ambiguities, the original code needs to disambiguate, just as when more than one extension applies elsewhere. We'll get to that later.

Implementing shapes directly

Once shapes are in the world, new types will want to implement them directly, instead of via an extension declaration:

public struct Z10 : SGroup<Z10>
{
    public readonly int I;
    public Z10(int i) => I = i % 10;
    public static Z10 operator +(Z10 z1, Z10 z2) => new Z10(z1.I + z2.I);
    public static Z10 Zero => new Z10(0);    
}

This is easily supported by simply a) checking that the type does indeed conform to the shape, and b) generating an extension next to the type declaration, or rather its underlying struct, witnessing the implementation:

public struct Z10
{
    public readonly int I;
    public Z10(int i) => I = i % 10;
    public static Z10 operator +(Z10 z1, Z10 z2) => new Z10(z1.I + z2.I);
    public static Z10 Zero => new Z10(0);
}

public struct __Z10_SGroup : SGroup<Z10>
{
    public Z10 op_Addition(Z10 t1, Z10 t2) => t1 + t2;
    public Z10 Zero => Z10.Zero;
}

Whenever the Z10 type is in scope, so is the fact that it is an SGroup<Z10>.
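
Put together, the translated pieces can be exercised in today's C# as follows. This is a sketch under the encoding described so far; Z10Witness is a stand-in name for the compiler-generated struct, and Z10Demo is made up for the illustration.

```csharp
public interface SGroup<T>
{
    T op_Addition(T t1, T t2);
    T Zero { get; }
}

public struct Z10
{
    public readonly int I;
    public Z10(int i) => I = i % 10;
    public static Z10 operator +(Z10 z1, Z10 z2) => new Z10(z1.I + z2.I);
    public static Z10 Zero => new Z10(0);
}

// Hypothetical name for the generated witness; it forwards to Z10's own members.
public struct Z10Witness : SGroup<Z10>
{
    public Z10 op_Addition(Z10 t1, Z10 t2) => t1 + t2;
    public Z10 Zero => Z10.Zero;
}

public static class Z10Demo
{
    public static T AddAll<T, Impl>(T[] ts) where Impl : struct, SGroup<T>
    {
        var impl = new Impl();
        var result = impl.Zero;
        foreach (var t in ts) { result = impl.op_Addition(result, t); }
        return result;
    }

    // Adds 7, 8 and 9 using Z10's modulo-10 addition.
    public static int SumExample() =>
        AddAll<Z10, Z10Witness>(new[] { new Z10(7), new Z10(8), new Z10(9) }).I;
}
```

Here SumExample() yields 4, since 7 + 8 + 9 = 24 and Z10 keeps only the remainder modulo 10.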

Instance members

So far we only explored shapes and extensions for static members, but they should apply equally to instance members.

public shape SComparable<T>
{
    int CompareTo(T t);
}

public extension IntComparable of int : SComparable<int>
{
    public int CompareTo(int t) => this - t;
}

In order to create the underlying interface for SComparable<T> we need to take a page out of the current extension methods feature and add an extra parameter to convey the receiver of the CompareTo call. What should be the type of that receiver? Well that depends on what type the shape is ultimately implemented on. In other words, we need to give the interface an extra type parameter representing the "this type", and let implementers fill that in:

public interface SComparable<This, T>
{
    int CompareTo(This @this, T t);
}

public struct IntComparable : SComparable<int, int>
{
    public int CompareTo(int @this, int t) => @this - t;
}

Essentially, any shape that defines instance members needs to also have an extra This type parameter.

From there on, the translation of generic methods over these shapes is unsurprising. This method:

public static T Max<T>(T[] ts) where T : SComparable<T>
{
    var result = ts[0];
    foreach (var t in ts) { if (result.CompareTo(t) < 0) result = t; }
    return result;
}

Translates to this:

public static T Max<T, Impl>(T[] ts) where Impl : struct, SComparable<T, T>
{
    var impl = new Impl();

    var result = ts[0];
    foreach (var t in ts) { if (impl.CompareTo(result, t) < 0) result = t; }
    return result;
}

The instance method call result.CompareTo(t) "on" the result gets translated into an instance method call impl.CompareTo(result, t) on the impl struct, taking the "receiver" as a first parameter.
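
The translated form already runs on today's C#. A minimal sketch, assuming the encoding above; ReverseIntComparable and MaxDemo are hypothetical additions that show how swapping the witness changes what "max" means without touching the method body. The witnesses call int.CompareTo rather than subtracting, which sidesteps overflow on extreme values.

```csharp
public interface SComparable<This, T>
{
    int CompareTo(This @this, T t);
}

public struct IntComparable : SComparable<int, int>
{
    // Uses int.CompareTo instead of subtraction to avoid overflow.
    public int CompareTo(int @this, int t) => @this.CompareTo(t);
}

// Hypothetical second witness that reverses the ordering.
public struct ReverseIntComparable : SComparable<int, int>
{
    public int CompareTo(int @this, int t) => t.CompareTo(@this);
}

public static class MaxDemo
{
    public static T Max<T, Impl>(T[] ts) where Impl : struct, SComparable<T, T>
    {
        var impl = new Impl();

        var result = ts[0];
        foreach (var t in ts) { if (impl.CompareTo(result, t) < 0) result = t; }
        return result;
    }
}
```

With the array from the AddAll example, Max<int, IntComparable> picks 10 and Max<int, ReverseIntComparable> picks 1.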

Extending interfaces with shapes

Note that the shape SComparable<T> is almost identical to the existing interface IComparable<T>. Obviously there's a completely trivial implementation of SComparable<T> on any T that implements IComparable<T>, and so we can write that implementation once and for all by extending the interface itself:

public extension Comparable<T> of IComparable<T> : SComparable<T> ;

We don't even need to provide a body; the compiler can just figure it out. We just have to say it to make it true (and to declare the underlying struct to "witness" the SComparableness to generic methods).

Under the hood, the compiler translates to:

public struct Comparable<T> : SComparable<T, T> where T: IComparable<T>
{
    public int CompareTo(T @this, T t) => @this.CompareTo(t);
}
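
A quick way to check that this one witness covers every IComparable<T> is to use it with unrelated types. A sketch in today's C#, with the witness passed explicitly since nothing infers it yet; ComparableDemo is a made-up name.

```csharp
using System;

public interface SComparable<This, T>
{
    int CompareTo(This @this, T t);
}

// One generic witness for every T that implements IComparable<T>.
public struct Comparable<T> : SComparable<T, T> where T : IComparable<T>
{
    public int CompareTo(T @this, T t) => @this.CompareTo(t);
}

public static class ComparableDemo
{
    public static T Max<T, Impl>(T[] ts) where Impl : struct, SComparable<T, T>
    {
        var impl = new Impl();
        var result = ts[0];
        foreach (var t in ts) { if (impl.CompareTo(result, t) < 0) result = t; }
        return result;
    }

    // The same witness struct, constructed over two different types.
    public static int MaxInt() => Max<int, Comparable<int>>(new[] { 3, 7, 5 });
    public static double MaxDouble() => Max<double, Comparable<double>>(new[] { 1.5, -2.0, 3.25 });
}
```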

Shapes in generic types

So far we've seen generic methods with type parameters constrained by shapes. We can do the same for generic classes, where a given type argument gets to come in with its own way of doing certain things.

As an example let's build a SortedList<T> where T needs to be SComparable<T>. This will then work for T's that inherently implement IComparable<T> (and hence SComparable<T>, if the previous section is applied), while for other T's the instantiator of SortedList<T> can apply an extension and imbue T with a suitable comparison to apply inside of the list (please forgive algorithmic errors! 😀):

public class SortedList<T> where T : SComparable<T>
{
    List<T> ts = new List<T>();
    
    public void Add(T t)
    {
        int l = 0, r = ts.Count;
        while (l < r)
        {
            int m = (l + r) / 2;
            if (t.CompareTo(ts[m]) < 0) { r = m; }
            else { l = m + 1; }
        }
        ts.Insert(l, t); // insert at the position found by the binary search
    }
}

We can implement this much like we do with generic methods, adding an extra type parameter to pass in the implementation struct. We can even store an instance of that struct in a static field if we want.

public class SortedList<T, Impl> where Impl : struct, SComparable<T, T>
{
    static Impl impl = new Impl();

    List<T> ts = new List<T>();
    
    public void Add(T t)
    {
        int l = 0, r = ts.Count;
        while (l < r)
        {
            int m = (l + r) / 2;
            if (impl.CompareTo(t, ts[m]) < 0) { r = m; }
            else { l = m + 1; }
        }
        ts.Insert(l, t); // insert at the position found by the binary search
    }
}

One problem here is that the Impl type argument becomes part of the type identity of the constructed SortedList type. So if SortedList<T> is constructed with the same explicit type argument in two different places that implement SComparable<T> with different extensions, those are different constructed SortedList<T> types! The shape implementation becomes part of the type identity, and if it differs, those types are not interchangeable.
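
The identity problem is easy to reproduce with the translated form. A sketch in today's C#: ShapedSortedList is a hypothetical name (chosen to stay clear of System.Collections.Generic.SortedList), and Ascending and Descending stand in for two competing extensions on int.

```csharp
using System;
using System.Collections.Generic;

public interface SComparable<This, T>
{
    int CompareTo(This @this, T t);
}

public struct Ascending : SComparable<int, int>
{
    public int CompareTo(int @this, int t) => @this.CompareTo(t);
}

public struct Descending : SComparable<int, int>
{
    public int CompareTo(int @this, int t) => t.CompareTo(@this);
}

public class ShapedSortedList<T, Impl> where Impl : struct, SComparable<T, T>
{
    static readonly Impl impl = new Impl();
    readonly List<T> ts = new List<T>();

    public IReadOnlyList<T> Items => ts;

    public void Add(T t)
    {
        // Binary search for the insertion point, then insert there.
        int l = 0, r = ts.Count;
        while (l < r)
        {
            int m = (l + r) / 2;
            if (impl.CompareTo(t, ts[m]) < 0) { r = m; }
            else { l = m + 1; }
        }
        ts.Insert(l, t);
    }
}

public static class IdentityDemo
{
    // Same element type, different witness: two distinct constructed types.
    public static bool SameType() =>
        typeof(ShapedSortedList<int, Ascending>) == typeof(ShapedSortedList<int, Descending>);
}
```

SameType() is false, so a list built with one witness cannot be passed where a list built with the other is expected, even though both read as a sorted list of int in source.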

Also, generic types can be overloaded on arity, so introducing secret extra type parameters can potentially throw a wrench into families of generic types all differing only on arity.

The type classes proposal linked above actually makes the "implicit" type parameters explicit. This comes with its own problems, but does have the advantage that the number of type parameters shown in source code corresponds to the number in IL.

Extensions on shapes

Using an approach similar to the shape-parameterized types above, we can let extensions extend shapes, not just types. Let's say we want to write an extension that offers the trivial implementation of all the comparison operators on everything that implements SComparable<T>:

public extension Comparison<T> of SComparable<T>
{
    public bool operator ==(T t1, T t2) => t1.CompareTo(t2) == 0;
    public bool operator !=(T t1, T t2) => t1.CompareTo(t2) != 0;
    public bool operator > (T t1, T t2) => t1.CompareTo(t2) >  0;
    public bool operator >=(T t1, T t2) => t1.CompareTo(t2) >= 0;
    public bool operator < (T t1, T t2) => t1.CompareTo(t2) <  0;
    public bool operator <=(T t1, T t2) => t1.CompareTo(t2) <= 0;
}

Just like the generic methods and types explored above, the underlying struct for this extension needs to have an extra type parameter for the implementation of the SComparable<T, T> interface:

public struct Comparison<T, Impl> where Impl : struct, SComparable<T, T>
{
    static Impl impl = new Impl();

    public bool op_Equality(T t1, T t2) => impl.CompareTo(t1, t2) == 0;
    public bool op_Inequality(T t1, T t2) => impl.CompareTo(t1, t2) != 0;
    public bool op_GreaterThan(T t1, T t2) => impl.CompareTo(t1, t2) > 0;
    public bool op_GreaterThanOrEqual(T t1, T t2) => impl.CompareTo(t1, t2) >= 0;
    public bool op_LessThan(T t1, T t2) => impl.CompareTo(t1, t2) < 0;
    public bool op_LessThanOrEqual(T t1, T t2) => impl.CompareTo(t1, t2) <= 0;
}

If that extension is in scope at the declaration of the Max method above, the comparison operators can now be used directly:

public static T Max<T>(T[] ts) where T : SComparable<T>
{
    var result = ts[0];
    foreach (var t in ts) { if (result < t) result = t; }
    return result;
}

This gets straightforwardly implemented by passing the method's Impl type parameter (implementing SComparable) to the Comparison struct above, instantiating that, and calling its operator implementations:

public static T Max<T, Impl>(T[] ts) where Impl : struct, SComparable<T, T>
{
    var impl = new Comparison<T, Impl>();

    var result = ts[0];
    foreach (var t in ts) { if (impl.op_LessThan(result, t)) result = t; }
    return result;
}

Explicit implementation and disambiguation

This section is a potentially useful tangent, that one can choose to go down only a certain part of the way.

We can consider explicit implementation, akin to what interfaces have, where the shape's members don't show up on the extended types themselves, but only when accessed through the shape directly. For instance, integers can also be viewed as a group under multiplication, but since that would mean implementing + as * and Zero as 1, we would not have those versions show up directly on the int type:

public extension IntMulGroup of int : SGroup<int>
{
    static int operator SGroup<int>.+(int i1, int i2) => i1 * i2;
    static int SGroup<int>.Zero => 1;
}

Thus, if both IntGroup and IntMulGroup were in scope, int.Zero would still yield 0, not 1.

When passing an SGroup constrained type argument, however, we'd still want to be able to disambiguate whether we meant "int with addition" or "int with multiplication".

Specifying which shape or extension to use

When there is more than one declaration in scope providing a given member or shape implementation, the compiler cannot automatically infer which one to use. We may be able to give sensible resolution rules that deal with a lot of cases, but there's going to be situations where you want to specify which extension declaration you meant to use.

AddAll(numbers); // use IntGroup or IntMulGroup?
AddAll<int>(numbers); // Doesn't help, it's Impl that can't be inferred, not T

An approach to this could be to simply allow the name of the extension declaration itself as a type name, with the rough meaning of "same type as the extended type, but give priority to this extension." It's sort of similar to base meaning "this type, but start member lookup in the base type":

AddAll<IntMulGroup>(numbers); // becomes AddAll<int, IntMulGroup>(numbers)
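
In terms of the underlying translation, the disambiguation just selects which witness struct is passed. A sketch of the lowered form in today's C#, where the second type argument is written by hand; DisambiguationDemo is a made-up name.

```csharp
public interface SGroup<T>
{
    T op_Addition(T t1, T t2);
    T Zero { get; }
}

// "int with addition"
public struct IntGroup : SGroup<int>
{
    public int op_Addition(int i1, int i2) => i1 + i2;
    public int Zero => 0;
}

// "int with multiplication": + means * and Zero means 1.
public struct IntMulGroup : SGroup<int>
{
    public int op_Addition(int i1, int i2) => i1 * i2;
    public int Zero => 1;
}

public static class DisambiguationDemo
{
    public static T AddAll<T, Impl>(T[] ts) where Impl : struct, SGroup<T>
    {
        var impl = new Impl();
        var result = impl.Zero;
        foreach (var t in ts) { result = impl.op_Addition(result, t); }
        return result;
    }
}
```

For the array { 1, 2, 3, 4 }, AddAll<int, IntGroup> gives the sum 10 while AddAll<int, IntMulGroup> gives the product 24.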

This would also work as an approach to get at explicitly implemented members:

IntMulGroup.Zero; // 1;

When accessing instance members on a receiver, to get at an explicitly implemented member, or to choose an extension to "view it as", cast the instance to the shape or extension name:

((SComparable<Point>)p1).CompareTo(p2); // Access an explicitly implemented but unambiguous member
((PointComparable)p1).CompareTo(p2);   // Access an ambiguous member by naming the declaring extension

Or maybe it looks better with an as expression:

(p1 as SComparable<Point>).CompareTo(p2);
(p1 as PointComparable).CompareTo(p2);

Using extensions as types

The number of places where you can use shapes as types is very limited: we've only seen them as constraints and in disambiguating uses. That is because they do not correspond to a single underlying type.

Extensions however, really do correspond to a single underlying type: the one that they extend. We could therefore imagine allowing them to be used as types of fields, parameters, etc. They would then denote, at runtime, the underlying type, but the compiler would know to "view it as" the extension.

Let's again imagine that PointComparable explicitly implements SComparable<Point> on the type Point. But now I want to write code that compares Points all the time, and I don't want to have to cast every single time. Instead, can I just declare that I want to view these particular Points as PointComparable's?:

PointComparable[] ps = GetPoints();
...
ps[i].CompareTo(ps[j]);

This translates into:

var impl = new PointComparable();

Point[] ps = GetPoints();
...
impl.CompareTo(ps[i], ps[j]);

For public interfaces we would have a way to signal the "overlay" extension type in metadata, e.g. through an attribute.

Extensions as wrapper types

One potentially useful further step to this, is to allow extensions to explicitly implement their own members, not just ones from shapes. What it would mean is, they don't actually expose the member on the underlying extended type, but only when the extension itself is used as the type.

This can be used to create compile time "wrapper types", that compile down to using the underlying type at runtime, but give it an extra face at compile time:

public extension JPoint of JObject
{
	public int JPoint.X => (int)this["X"];
	public int JPoint.Y => (int)this["Y"];
}

JObject o = GetObject();
WriteLine(o.X); // Error: X is not exposed on JObject, because it is explicitly implemented
JPoint p = o;
WriteLine(p.X); // Now the JObject is seen as a JPoint, so X is there

This is an example of giving a typed overlay to something less typed. That appears to be a common scenario, and is the whole basis for e.g. TypeScript's type system. Whether or not this is the right mechanism for it is probably debatable, but it is certainly a mechanism.
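
For a feel of the scenario, the closest approximation in today's C# is a hand-written wrapper struct. The sketch below is not the proposed mechanism: unlike an extension used as a type, JPointView is a distinct runtime type, and it wraps a plain dictionary as a stand-in for JObject. All names in it are hypothetical.

```csharp
using System.Collections.Generic;

// Typed view over an untyped property bag.
public readonly struct JPointView
{
    private readonly IReadOnlyDictionary<string, object> _inner;

    public JPointView(IReadOnlyDictionary<string, object> inner) => _inner = inner;

    // The typed members exist only on the view, not on the dictionary.
    public int X => (int)_inner["X"];
    public int Y => (int)_inner["Y"];

    // Lets "JPointView p = dictionary;" mirror "JPoint p = o;" above.
    public static implicit operator JPointView(Dictionary<string, object> d) => new JPointView(d);
}
```

The proposal would give the same source-level experience without the extra struct existing at runtime.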

Discussion

This is a very high level proposal - it is more of a proof of concept that a design exists, and many details would need to be locked down (and changed) if we want to pursue this, e.g.:

How is an extension brought into scope? Does it need to be usinged, or is it in effect just through its presence?
How exactly are instance extension members encoded, so that they can have the extra @this parameter?
Which rules should be used to pick which extension members are more specific, so that there aren't ambiguities all the time?
Etc...

Some issues with the proposal as it currently stands:

Two new "type declaration" forms to the language make it heavy on "concept".
Shapes and extensions are only "halfway" types, which may be a confusing notion to wrap your head around.
Hidden type parameters introduce a split between source and IL level generics, that may be ugly to pave over

Other directions one might explore to achieve some of the same goals:

Find a way to extend interfaces to play the role of shapes here: declare static members, apply after the fact, etc. This would likely require runtime changes, but maybe that's better all-up.
Something more dynamic: structural typing, duck typing, whatever the term. This has the potential to fail at runtime, if something "turns out" not to fit the shape you assumed at compile time, and also doesn't clearly address some of the more generic scenarios.
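
One data point on the first direction: C# 11 (with .NET 7) later added static abstract members in interfaces, which covers the "declare static members" part but not the "apply after the fact" part. A minimal sketch, assuming a C# 11 or later compiler; IGroup<T> and StaticAbstractDemo are names made up for this illustration.

```csharp
using System;

// An interface playing the role of the SGroup shape.
public interface IGroup<T> where T : IGroup<T>
{
    static abstract T Zero { get; }
    static abstract T operator +(T t1, T t2);
}

public readonly struct Z10 : IGroup<Z10>
{
    public readonly int I;
    public Z10(int i) => I = i % 10;
    public static Z10 Zero => new Z10(0);
    public static Z10 operator +(Z10 z1, Z10 z2) => new Z10(z1.I + z2.I);
}

public static class StaticAbstractDemo
{
    // No hidden type parameter: the runtime resolves T.Zero and + directly.
    public static T AddAll<T>(T[] ts) where T : IGroup<T>
    {
        var result = T.Zero;
        foreach (var t in ts) { result += t; }
        return result;
    }
}
```

Note that int cannot be made an IGroup<int> from the outside here, which is the gap the extension half of this proposal is aimed at.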

Looking forward to further discussion of the pros and cons!

Mads

gafter · 2017-02-22T01:54:19Z

A type class (shape) can declare conversions. I suspect you don't intend to allow extension declarations of conversions. Do you?

gafter · 2017-02-22T02:04:49Z

The only reason I can think of that a declaration such as

public extension Comparable<T> of IComparable<T> : SComparable<T> ;

is necessary is that not every type that implements IComparable<T> has members with the same signature (because if some type explicitly implements the interface, it doesn't have those members directly in its type).

svick · 2017-02-22T03:08:59Z

Collaborator

This looks really interesting. My thoughts:

Why do extension members need to be marked as public? Is there any other option? (The proposal examples also seem inconsistent about whether explicit extension members should be marked public.)
Also, generic types can be overloaded on arity, so introducing secret extra type parameters can potentially throw a wrench into families of generic types all differing only on arity.

Generic types are not overloaded on arity in IL, instead the number of type parameters is included in the name, after a backtick.

So, what would happen if the actual type name contained the number of generic parameters in source? E.g. C# SortedList<T> → IL SortedList`1<T, Impl> (and not SortedList`2<T, Impl>).

I think this would resolve the conflict, but it could throw off some tools that don't expect this (and it's also not CLS compatible, in case that matters).
Can extensions as types propagate somehow? Consider this code:
```
IntMulGroup a = 2;
var b = a + a;         // a is IntMulGroup, so + is IntMulGroup.+
IntMulGroup c = b + b; // b is int (?), so + is int.+ (??)
```
Based on the proposal, I would expect that they don't propagate, so b is just an int and c would be 8. But I think the user would expect the result to be 16 (using IntMulGroup.+ in both cases), and so this would be an easy to make bug.

Maybe declaring IntMulGroup.+ like this could work?
```
static IntMulGroup operator SGroup<int>.+(int i1, int i2) => i1 * i2;
```
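
The backtick-and-arity naming mentioned in the second point can be seen from reflection on existing BCL types. A small sketch; ArityDemo is a made-up class name.

```csharp
using System;
using System.Collections.Generic;

public static class ArityDemo
{
    // Metadata names of open generic types include the type parameter count.
    public static string ListName() => typeof(List<>).Name;
    public static string DictionaryName() => typeof(Dictionary<,>).Name;
}
```

ListName() returns "List`1" and DictionaryName() returns "Dictionary`2", which is the convention a hidden extra type parameter would have to either follow or deliberately break.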

Joe4evr · 2017-02-22T04:35:42Z

Shapes and extensions are only "halfway" types, which may be a confusing notion to wrap your head around.

Well, I have to say that this explanation of "Type classes" as shapes and how it could extend anything of a type was what I needed to wrap my head around the concept, at least, instead of needing a Ph.D. in functional language jargon. This is now making me pretty excited for the possibilities this could bring once the kinks are worked out.

alrz · 2017-02-22T06:09:01Z

Find a way to extend interfaces to play the role of shapes here: declare static members, apply after the fact, etc. This would likely require runtime changes, but maybe that's better all-up.

I'd rather wait for proper CLR support for "virtual extension methods" (#52) and further, apply after the fact (dotnet/roslyn#8127). This looks too magical, and honestly I don't see much value added for "generic numeric operations", especially since code generators can seamlessly support that scenario without the performance penalty imposed by this proposal. Meanwhile, a portion of "extension everything" can be implemented right now, as it doesn't need voodoo under the hood.

Thaina · 2017-02-22T07:10:56Z

This proposal seems to completely summarize the functionality we have requested so far.

Still, there are some things I disagree with:

This is just personal, but I wish there were a better keyword than shape
- some keyword plus interface, such as static interface, maybe?
Please don't introduce the extension ... of keywords. At least drop the of keyword.

Also, I still want to propose a static extend syntax:

public static class IntExt : int,SGroup<int>
// must place type to extend before any shapes
// same rule as class : class,interface,interface
{
    // same new implementation
    // allow static class extended from type can implement non static member
}

However, this is a very interesting transpilation approach.

gafter · 2017-02-22T12:16:20Z

@alrz I don't know what performance penalty you're referring to. The invocations of shape methods get specialized by the JIT so they become simple non-virtual invocations of static methods.

MgSam · 2017-02-22T16:08:20Z

@alrz Could you elaborate as to how you foresee code generators seamlessly supporting writing generic numeric operations? What would that even look like?

alrz · 2017-02-22T16:16:19Z

Built-in operators don't emit methods, but shapes need a method call anyways. With generators one can overload a method over numeric types,

[NumericOverload]
int AddAll(int[] nums) {..}

I'd admit that wouldn't be a straightforward generator (and doesn't provide much flexibility), but it is still possible. I think other use cases like duck typing etc. would be addressed by #52, in which case it doesn't need complex code generation on the compiler side.

MgSam · 2017-02-22T17:03:57Z

So you'd have to use int as a proxy for all numeric types? Seems pretty messy to me. Working outside of the type system rather than with it.

I think the "just do it with generators instead" argument is kind of a rabbit hole. You could make an argument that many of the features already in C# shouldn't be there because they could be done with generators instead. Of course, doing this means the tooling support is vastly inferior and the code much harder to follow.

I think generators are a great feature, but they need to be used carefully and thoughtfully. Speaking from experience using PostSharp, though it's powerful, it makes it much harder to reason about what's happening with your code.

Also, I'm not sure why you keep making a perf argument against this feature. Mads/Gafter already said that tests showed nearly zero perf hit. If that ultimately changes I think they should take that into account but as it stands I'm satisfied to take their word on this.

ufcpp · 2017-02-22T17:44:57Z

@alrz

The JIT optimizes struct generics very well. In Release builds, virtual calls are replaced with non-virtual calls, and in many cases the non-virtual calls are inlined. As a result, the performance of op_Addition is almost the same as the built-in add instruction.

Here is the benchmark:

https://gist.github.com/ufcpp/15af6d3d7606fb3771a91c81898dcfa3

YaakovDavis · 2017-02-22T20:25:49Z

Regarding disambiguation, some new syntactic support could be handy:

AddAll<int as IntMulGroup>(numbers);
AddAll<int as IntGroup>(numbers);

The as Shape clause is necessary only in ambiguous cases.

SamPruden · 2017-02-22T21:15:26Z

I imagine and hope the answer is yes, but would the following be a valid extension?

public extension PointlessExtension<T> of T
    where T: class
{
    public T PointlessReferenceToSelf => this;
}

In short, would the extension be applicable directly to the generic argument?

iam3yal · 2017-02-22T22:21:34Z

@TheOtherSamP How would it work? Going by this proposal, it would get implemented like this:

public struct PointlessExtension<T> : T where T: class
{
}

Disregard the fact that it is a struct, but can you derive from T today? Something like the following might work:

public extension PointlessExtension<T> of Object
    where T: class
{
    public T PointlessReferenceToSelf => this;
}

SamPruden · 2017-02-22T22:54:17Z

@eyalsk That's a fair question, and one I'll confess I gave little thought. I honestly don't know how to best make that work internally, but it feels like a fairly essential capability to me. That's something you can do with current extension methods, and something I use often enough that I know I'd miss it if it were lacking in these new extensions.

To give a slightly more useful (while still simple) example:

public static T[] ToArrayContaining<T>(this T item) => new []{item};

This is an extension method using the current syntax that uses this capability. Yes that would continue to work, but the ability to do things like this would be nice in the new syntax too. Without it, the new extension features would feel incomplete to me because they're taking a step back, and I'd end up mixing and matching between the two extension method syntaxes in my code. That feels wrong.

IS4Code · 2020-08-11T09:56:30Z

@IllidanS4 The CLR needs to help with this: (SFoo)(object)bar. (Where bar may be an object parameter from arbitrarily far away in terms of program execution.)

That seems too ambitious and dangerous for the CLR to ever implement. Matching a shape is not casting, and it is done on a per-type basis, not per-instance. The actual type that matches SFoo is most of the time also necessary, and a cast like this cannot recover it.

Plus imagine this:

shape SCountable
{
    int Count { get; }
}

Should (object)new int[10] match SCountable in such a cast? If yes, should Count invoke the implementation of ICollection<int>.Count, or ICollection.Count? If no, what decides which of the "types" of an instance apply? You may be tempted to say the most derived type, but what if it hides base members by name? You work with an instance of Foo, see that it has Foo.Bar(int) so you can cast it to a shape with Bar(int), but suddenly you cast it to object and it no longer works.
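
The ambiguity is real for arrays: int[] exposes no public Count property of its own, yet it provides Count through both collection interfaces. A small sketch in today's C#; ArrayCountDemo is a made-up name.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class ArrayCountDemo
{
    // Count as seen through the generic interface.
    public static int GenericCount(int[] a) => ((ICollection<int>)a).Count;

    // Count as seen through the non-generic interface.
    public static int NonGenericCount(int[] a) => ((ICollection)a).Count;
}
```

Both return the array length here, but nothing in the type system says they must agree for an arbitrary type, so a runtime shape match would have to pick one.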

This overcomplicates a proposal that is intended to bridge the gap between static members and interfaces with complex, inconsistent, and dangerous operations carried out by the CLR. Besides, there are already good dynamic duck-typing libraries that can be used for this purpose, so if a simple piece of syntax is really needed, something like (object)x to SFoo could simply be mapped to DuckTyper.Implement<SFoo>(x). But for all other intents and purposes, a shape should behave like a static class.

You can explicitly cast an instance of a type to an Extension for that type. This has no direct effect on code gen. In the type system it wraps the underlying instance in a wrapper struct of the Extension type. An Extension type is considered to inherit from the underlying type, and so has all its interfaces+members, as well as the ones it declares.

You can explicitly use an Extension as a type parameter. It is considered to fulfill the struct constraint.

You can box a struct with an Extension static type. This creates a box object of the Extension type. A boxed Extension type is considered to inherit from the underlying type, and so has all its interfaces+members, as well as the ones it declares.

You are describing a role. Essentially it means a value type can "derive" from any concrete type if it adds no fields and overrides no methods. I like this notion of value type polymorphism, since if even deriving from a generic parameter was possible, it would be a step towards tags. However, I still think this is an extension of the original proposal, and not a completion.

Being able to generate a type implementing particular interfaces and delegating all instance calls to static calls is a good feature on its own. The rest is just giving hints to the compiler.

0 replies

HaloFour · 2020-08-11T12:49:07Z

HaloFour
Aug 11, 2020

@YairHalberstadt

There could be separate helper methods in the reflection namespace which list all Extensions for a type and lets you pick one.

How do the Extensions get into this list and when? Do Extensions for the base type suffice? Or any of the implemented interfaces? There could be any number of such Extensions in an app, each of which does something different. This also doesn't seem to help with the case of attempting to match an arbitrary type to a shape at runtime, where no such Extensions would exist unless some developer already established that relationship.

Anyway, I don't particularly care how "shapes" (or whatever they end up being called) are implemented as long as they fit the bill of being a named constraint of required members and that it's possible to provide "glue" to make an existing type meet that constraint via extension members or similar, and with a minimum of overhead in the process. It's my opinion that "shapes" be an evolution of interfaces rather than a completely new "type" so that it can be used within the existing ecosystem.

0 replies

YairHalberstadt · 2020-08-11T14:02:09Z

YairHalberstadt
Aug 11, 2020
Collaborator

@IllidanS4

I think there are a number of separate problems which the shapes/roles/extensions discussion is trying to solve:

1. Defining virtual/abstract static members on types (making sure that a type conforms to a particular shape) for use in generic contexts.
2. Extending an existing type with new members, and interfaces.
3. Improving performance of generic code over interfaces by finding some way of devirtualizing the interface calls and avoiding allocations (think SEnumerable, Exploration: What would it take to achieve cost free Linq #2482).

I think they've been muddied together which is leading to a lot of confusion. Whilst solutions to these problems might be related, they aren't necessarily.

Specifically my proposal above for how to put this in the runtime solves 1 and 2 but not 3. Just traits/shapes in their original compiler only incarnation would solve 1 and 3 but not 2.

As an alternative to shapes for solving 3, I think Associated Types could be a reasonable idea.

For example in #2482 I suggest the following as a way to achieve an allocation free Select method:

public interface IEnumerable<T, TEnumerator> where TEnumerator : struct, IEnumerator<T>
{
   TEnumerator GetEnumerator();
}

public static SelectEnumerable<TEnumerable, TEnumerator, TSource, TResult> Select<TEnumerable, TEnumerator, TSource, TResult>(this TEnumerable source, Func<TSource, TResult> func) where TEnumerable : struct, IEnumerable<TSource, TEnumerator> where TEnumerator : struct, IEnumerator<TSource>;

public struct SelectEnumerable<TEnumerable, TEnumerator, TSource, TResult> : IEnumerable<TResult, SelectEnumerator<TEnumerator, TSource, TResult>> where TEnumerable : struct, IEnumerable<TSource, TEnumerator> where TEnumerator : struct, IEnumerator<TSource> {...}

public struct SelectEnumerator<TEnumerator, TSource, TResult> : IEnumerator<TResult> where TEnumerator : IEnumerator<TSource> {...}

We could use associated types to replace that with:

public interface IEnumerable
{
    public type Enumerator : struct, IEnumerator;
    Enumerator GetEnumerator();
}

public interface IEnumerator
{
    public type T;
    T Current { get; }
    bool MoveNext();
}

public static SelectEnumerable<TResult, SourceEnumerable> Select<SourceEnumerable , TResult>(this SourceEnumerable enumerable,  Func<SourceEnumerable.Enumerator.T, TResult> func) where SourceEnumerable : IEnumerable;

public struct SelectEnumerable<TResult, SourceEnumerable> : IEnumerable where SourceEnumerable : IEnumerable
{
     type Enumerator = SelectEnumerator<TResult, SourceEnumerable.Enumerator>;
}

public struct SelectEnumerator<TResult, SourceEnumerator> : IEnumerator where SourceEnumerator : IEnumerator
{
     type T = TResult;
}

This allows us to hide most of the type parameters from the signature of Select. The library author needs to deal with them, but the library consumer can ignore them.

This has similar performance implications to shapes, but doesn't require creating a whole new category of type.
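To show what the first, four-type-parameter version costs the caller in today's C#, here is a minimal runnable sketch. The `IStruct...` interface names, `ArrayEnumerable`, `ArrayEnumerator`, and `StructLinq` are all assumptions introduced for illustration (renamed so they don't clash with the BCL interfaces); the elided bodies from #2482 are filled in with the simplest thing that works, not with the original code.

```csharp
using System;

// Stand-ins for the two-parameter IEnumerable/IEnumerator pair in the comment above.
public interface IStructEnumerator<T>
{
    T Current { get; }
    bool MoveNext();
}

public interface IStructEnumerable<T, TEnumerator>
    where TEnumerator : struct, IStructEnumerator<T>
{
    TEnumerator GetEnumerator();
}

// A simple struct source over an array (hypothetical, for the example only).
public struct ArrayEnumerator<T> : IStructEnumerator<T>
{
    private readonly T[] _items;
    private int _index;

    public ArrayEnumerator(T[] items) { _items = items; _index = -1; }

    public T Current => _items[_index];
    public bool MoveNext() => ++_index < _items.Length;
}

public readonly struct ArrayEnumerable<T> : IStructEnumerable<T, ArrayEnumerator<T>>
{
    private readonly T[] _items;

    public ArrayEnumerable(T[] items) { _items = items; }

    public ArrayEnumerator<T> GetEnumerator() => new ArrayEnumerator<T>(_items);
}

public struct SelectEnumerator<TEnumerator, TSource, TResult> : IStructEnumerator<TResult>
    where TEnumerator : struct, IStructEnumerator<TSource>
{
    private TEnumerator _source; // a mutable struct, so this field must not be readonly
    private readonly Func<TSource, TResult> _selector;

    public SelectEnumerator(TEnumerator source, Func<TSource, TResult> selector)
    {
        _source = source;
        _selector = selector;
    }

    public TResult Current => _selector(_source.Current);
    public bool MoveNext() => _source.MoveNext();
}

public readonly struct SelectEnumerable<TEnumerable, TEnumerator, TSource, TResult>
    : IStructEnumerable<TResult, SelectEnumerator<TEnumerator, TSource, TResult>>
    where TEnumerable : struct, IStructEnumerable<TSource, TEnumerator>
    where TEnumerator : struct, IStructEnumerator<TSource>
{
    private readonly TEnumerable _source;
    private readonly Func<TSource, TResult> _selector;

    public SelectEnumerable(TEnumerable source, Func<TSource, TResult> selector)
    {
        _source = source;
        _selector = selector;
    }

    public SelectEnumerator<TEnumerator, TSource, TResult> GetEnumerator() =>
        new SelectEnumerator<TEnumerator, TSource, TResult>(_source.GetEnumerator(), _selector);
}

public static class StructLinq
{
    // Every enumerator stays a concrete struct, so nothing is boxed along the chain.
    public static SelectEnumerable<TEnumerable, TEnumerator, TSource, TResult>
        Select<TEnumerable, TEnumerator, TSource, TResult>(
            this TEnumerable source, Func<TSource, TResult> selector)
        where TEnumerable : struct, IStructEnumerable<TSource, TEnumerator>
        where TEnumerator : struct, IStructEnumerator<TSource>
        => new SelectEnumerable<TEnumerable, TEnumerator, TSource, TResult>(source, selector);
}
```

Because C# type inference does not look at constraints, a call has to spell out all four type arguments, for example `source.Select<ArrayEnumerable<int>, ArrayEnumerator<int>, int, int>(x => x * 2)` for a `source` of type `ArrayEnumerable<int>`. That call-site noise is what the associated-types version is meant to remove.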

0 replies

peter-dolkens · 2020-09-03T05:27:45Z

peter-dolkens
Sep 3, 2020

I'd just like to highlight that whichever implementation approach is selected, performance should absolutely be a top priority for this feature.

The type system and method invocation are fundamental features of the language, and any implementation of a feature that incurs a performance penalty for behavior which is almost indistinguishable from a "free" operation should be avoided.

As a developer, I don't expect a heavy performance penalty for using myobject.MethodA() instead of myobject.MethodB()

As a developer, I don't expect a heavy performance penalty for using class MyType : SomeThing<Int> instead of class MyType : SomeThingElse<Int>

One objective of this is to facilitate clean, readable code. If I'm doing a code review, I shouldn't get caught out because I didn't know that a developer had changed an Interface into a Shape at some point, and code which was previously optimal now takes a performance hit.

Some areas like gaming, HFT, ML etc. were mentioned above - these areas care more about performance than they do about clean code. If we're going to consider these sectors, then we need to consider their actual objectives. Wasting cycles isn't a luxury that can be afforded in these high frequency, or hyper-scale sectors.

0 replies

ByteEater-pl · 2020-09-03T14:33:33Z

ByteEater-pl
Sep 3, 2020

@peter-dolkens, if a neat syntactic disambiguation can be devised, I'm all for it, but if not, I believe your objection shouldn't prevail and block this feature. Unlike C++ and Rust, C# and .NET in general don't hold the tenet of zero cost abstractions (besides, you'd pay in performance overhead only if you actually use this stuff). This would rather be a usage scenario for linters, code colourers and possibly other tools aiding in code review and integration.

0 replies

IS4Code · 2020-09-03T14:49:01Z

IS4Code
Sep 3, 2020

As far as the original approach to this issue goes, a witness struct constrained to an interface has the potential of having zero additional costs, considering there is no boxing and all methods called on the witness are concrete and potentially even inlineable.
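A minimal sketch of that witness-struct pattern in current C#, assuming a group-like interface modeled loosely on the SGroup example from the original post. `IGroup<T>`, `IntGroup`, and `GroupMath` are illustrative names, not what the proposal would actually generate.

```csharp
// The "shape", expressed as an ordinary interface over instance members.
public interface IGroup<T>
{
    T Zero { get; }
    T Add(T left, T right);
}

// The witness: a struct with no fields that says how int satisfies the shape.
public readonly struct IntGroup : IGroup<int>
{
    public int Zero => 0;
    public int Add(int left, int right) => left + right;
}

public static class GroupMath
{
    // TGroup is constrained to a struct, so default(TGroup) needs no allocation
    // and calls on it are made on a concrete type, with no boxing involved.
    public static T Sum<T, TGroup>(T[] values) where TGroup : struct, IGroup<T>
    {
        var group = default(TGroup);
        T total = group.Zero;
        foreach (T value in values)
        {
            total = group.Add(total, value);
        }
        return total;
    }
}
```

Calling `GroupMath.Sum<int, IntGroup>(new[] { 1, 2, 3 })` returns 6. Whether the calls are actually inlined is up to the JIT, which is the "potential" in the comment above.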

0 replies

peter-dolkens · 2020-09-03T19:35:06Z

peter-dolkens
Sep 3, 2020

@ByteEater-pl I never said the feature should be blocked, just that it should be implemented in the appropriate place - if that's the CLR, then so be it. If it can be done efficiently outside the CLR, then fine.

C# and .NET historically don't hold the tenet of zero cost abstractions - but there has been a massive shift with dotnet core, to the point where I believe even basics like string methods have been rewritten in native C# because they perform faster than reusing the native C implementations.

In regards to linters, etc - C# historically relied heavily on the power of the VS IDE, but again we're seeing a rapid shift away from the heavy reliance on visual studio, and far more emphasis on lightweight, portable code that is accessible, and doesn't require a full VS installation to be effective.

dotnet core, and the emphasis on performance is a massive transformation for the language, and company as a whole, and it's seeing a surge of increased interest as a result - we can't be so willing to fall back into the "old ways" just because they're easier, or else we risk losing that interest and excitement that we've spent the last 5 years building.

0 replies

CyrusNajmabadi · 2020-09-03T21:35:51Z

CyrusNajmabadi
Sep 3, 2020
Collaborator

The topic of zero-cost abstraction (wrt this space) is absolutely part of the design discussion. It will absolutely be considered, though we may or may not choose which direction to go here based on our personal thinking on the pros/cons of the different options.

0 replies

birbilis · 2020-10-19T06:20:27Z

birbilis
Oct 19, 2020

Why not just call them generic interfaces? If generics don't cover such scenaria they could be extended. Why introduce another separate term and a keyword that means other things in graphics-related code?

10 replies

Iron-E Dec 22, 2020

What about something like implied interface? Since the interface was not explicitly declared at the data structure:

public implied interface IGroup<T>
{
    static T operator +(T t1, T t2);
    static T Zero { get; }
}

We could also remove the of int and just put the type before extension. I feel like this would be more familiar.

public int extension IntGroup : IGroup<int>
{
    public static int Zero => 0;
}

AgentLintZeal Dec 22, 2020

What @Iron-E suggests reads well syntactically and in my mind it really is an implied interface we are talking about here. And I've long wished extension classes had their own keyword and do away with just adding the name 'Extension' to the end of every static class holding them.

yaakov-h Dec 22, 2020

We do already have the keyword implicit.

AgentLintZeal Dec 22, 2020

The nuances of implicit are pretty close, but not quite the same thing - implied is more succinct.

HaloFour Dec 23, 2020

IMO having two flavors of interface only splits the ecosystem unnecessarily. I think that any existing interface should be able to participate as a "shape".

birbilis · 2020-12-23T07:43:11Z

birbilis
Dec 23, 2020

Ideally every interface should be able to define a shape the class already implements. Could glue it on at compile time by extending "using", say "using xx as yy". At runtime could have a special cast (a safe one that uses reflection to check and probably an unsafe one too for already marked-as-unsafe code contexts for speed)

1 reply

huoyaoyuan Dec 23, 2020

Due to the way the runtime organizes objects, such a cast cannot be efficient (it would be as inefficient as in dynamic languages).
Shapes can provide more metadata to the runtime so that it can do the cast in an efficient way.

birbilis · 2020-12-23T08:26:34Z

birbilis
Dec 23, 2020

Could the "efficient" way prove to be an insecure way?

1 reply

huoyaoyuan Dec 23, 2020

No. Interface methods are called through a vtable. There's no such memory region for an object that doesn't implement the interface.

birbilis · 2020-12-23T08:42:54Z

birbilis
Dec 23, 2020

The compiler would make one if you do the gluing at compile time. The problem is whether you trust the calling library to have it, unless the class loader generates it.
At runtime it's an even worse case: such duck-typing casts should be safely checked so that the security design of the called library (or of other called libraries to which you pass the received object) doesn't break. Unless you end up not trusting anything and go back to C/C++

17 replies

rhaokiel Dec 30, 2020

@julealgon

Seems counterintuitive to me to have to force a keyword like that to "allow" the behavior to be loose. Just make it loose by default.

Because there is an underlying cost to casting a type to an interface it doesn't implement by creating a shadow shape. It might also make sense to include explicit casting where the cast doesn't currently exist, i.e. Foo((implicit IList)bar); and possibly a pre-compiler directive to default to loose interfaces.

Iron-E Dec 30, 2020

Shapes should be accepted as a generic parameter, not just as a normal argument via polymorphism. It's no use if you don't have access to the actual type.

Yeah, Foo<implicit Bar> should work too.

rhaokiel Dec 30, 2020

@IllidanS4

Shapes should be accepted as a generic parameter, not just as a normal argument via polymorphism. It's no use if you don't have access to the actual type.

I assume you mean as a generic constraint. Like this:
public void Foo<T>(T val) where T : implicit IList

Once again that is something that would be satisfied by the compiler, making a shape boiler plate for types that don't explicitly implement IList.

theunrepentantgeek Dec 31, 2020

@julealgon wrote

Seems counterintuitive to me to have to force a keyword like that to "allow" the behavior to be loose. Just make it loose by default.

Removing the need for a keyword certainly seems to be desirable (it would be my preference too) - but would lead to breaking changes, so I'm pretty sure we're going to be stuck with it.

Here's an example of the breaking change that we'd get by having no keyword:

Consider the fairly common pattern where overloads are used to give special handling:

public void Print<T>(IEnumerable<T> docs) { ... }
public void Print<T>(T doc) { ... }

And assume you have a simple document that does not implement IEnumerable<T>:

public class DemoDoc { ... }

At the moment, calling Print(demoDoc) will use the overload Print<T>(T doc).

But remember that one of the features of shapes is that you can adapt instances by providing extension methods (etc).

So it would be possible that just by upgrading the C# compiler, the code would then call the overload Print<T>(IEnumerable<T> docs). Whoops.
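A minimal sketch of the current behavior, assuming the methods return a string instead of `void` so the chosen overload can be observed. The `Printer` class name and the returned strings are made up for illustration.

```csharp
using System.Collections.Generic;

// A document type that does not implement IEnumerable<T>.
public class DemoDoc { }

// Hypothetical container for the two overloads from the comment above.
public static class Printer
{
    // Special handling for sequences of documents.
    public static string Print<T>(IEnumerable<T> docs) => "sequence overload";

    // General handling for a single document.
    public static string Print<T>(T doc) => "single-item overload";
}
```

Today `Printer.Print(new DemoDoc())` returns "single-item overload", because type inference for the `IEnumerable<T>` overload fails and that overload is never a candidate. An argument whose static type is `IEnumerable<DemoDoc>` returns "sequence overload". If an in-scope extension could silently make DemoDoc satisfy `IEnumerable<T>`, the first call could start binding differently without any change to the source.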

HaloFour Dec 31, 2020

Agreed, I think "eagerness of witnessing" is unrealistic, even if it would require witnesses to be declared and brought into scope. I can kind of see some syntax used at the call site to explicitly glue the type to the target through the witness type.

birbilis · 2020-12-31T02:28:10Z

birbilis
Dec 31, 2020

The caller of some method should be the one telling the compiler (or even the runtime in dynamic cases) that they're passing something as an implicit interface implementation; the method itself shouldn't have to say that it only accepts shapes, or that it accepts interfaces and shapes. I also think adding more terminology and having a split world is problematic. What I'd expect is compile-time and runtime generated interface implementation wrappers around objects (and structs) based on caller claims.

0 replies

declard · 2021-03-17T16:35:18Z

declard
Mar 17, 2021

Were relations between multiple types considered as well?

public shape SConvertible<TFrom, TTo>
{
    TTo Convert(TFrom value);
}

public extension SConvertible<List<TFrom>, List<TTo>> when SConvertible<TFrom, TTo>
{
    List<TTo> Convert(List<TFrom> value) => value.Select(Convert).ToList();
}

0 replies

dylanrafael05 · 2021-04-28T03:34:31Z

dylanrafael05
Apr 28, 2021

This functionality is amazing, but it idolizes the practice of relying on the compiler's optimization skills. Relying on optimizations is especially dangerous for large-scale projects. Just look at C++'s standard templates for example. While functional, things like std::is_null<T> feel hacky, especially when they rely on structs. At the very least, there should be new syntax for no-content structures and inlined generics before this gets implemented, since new implies the creation of some data. Furthermore, secondhand compilers may not function with this idea, since it relies on the compiler's intelligence. Rather than doing this, C# should specify the requirement that a compiler inlines generic structs with no data, and possibly introduce a new keyword in order to enforce this inlining strictly, for example implementation.

0 replies
