[Proposal] defined some "sort of interfaces" not implemented by the type #6638

Closed
MatthieuMEZIL opened this issue Nov 7, 2015 · 14 comments
Labels
Area-Language Design Resolution-Duplicate The described behavior is tracked in another issue

Comments

@MatthieuMEZIL

If you think about Roslyn's SyntaxNode (as an example), there are 9 types of SyntaxNode with an OperatorToken property but no common type.
Instead of writing a visitor that has 9 specific Visit methods (VisitBinaryExpression, VisitPrefixUnaryExpression, VisitPostfixUnaryExpression, etc.) to be able to use the OperatorToken, I would love to be able to write something like this:

template SyntaxNodeWithToken
{
   SyntaxToken OperatorToken { get; }
}

Then in my visitor, I could just use the Visit:

public override void Visit(SyntaxNode node)
{
    var n = node as SyntaxNodeWithToken;
    if (n != null)
        Foo(n.OperatorToken);
}

"template" may not be the best name, but I think you get the idea.

@svick
Contributor

svick commented Nov 7, 2015

How exactly would this be implemented?

There are already proposals to do something very similar using dynamic: #5306 and #3012. And #2146 is somewhat related too.

@MatthieuMEZIL
Author

No, I don't want to use dynamic, for performance reasons.
I could also use reflection, by the way.

At compile time, the compiler knows which types inherit from SyntaxNode (in my sample), so it can determine which of them have the members defined in my pseudo-interface and use them.

However, since this has a huge impact (it introduces a notion of "pseudo-type" with some strange conversions), another option may be for the compiler to generate an intermediate solution like this:

public class SyntaxNodeWithOperatorToken
{
    private Func<SyntaxToken> getOperatorToken;

    public SyntaxNodeWithOperatorToken(Func<SyntaxToken> getOperatorToken)
    {
        this.getOperatorToken = getOperatorToken;
    }

    public SyntaxToken OperatorToken
    {
        get { return getOperatorToken(); }
    }
}

and replace my "var n = node as SyntaxNodeWithToken;" by this:

SyntaxNodeWithOperatorToken n = null;
var nodeAsBinaryExpressionSyntax = node as BinaryExpressionSyntax;
if (nodeAsBinaryExpressionSyntax != null)
{
    n = new SyntaxNodeWithOperatorToken(() => nodeAsBinaryExpressionSyntax.OperatorToken);
}
else
{
    var nodeAsPrefixUnaryExpression = node as PrefixUnaryExpressionSyntax;
    if (nodeAsPrefixUnaryExpression != null)
    {
        n = new SyntaxNodeWithOperatorToken(() => nodeAsPrefixUnaryExpression.OperatorToken);
    }
// etc.
}

I think this last solution won't be so hard to implement.
Of course, I prefer the first solution that keeps my original reference instead of instantiating a new type and that avoids delegate usage.

What do you think?
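
For readers following along, the type-test chain above can also be written by hand without the delegate wrapper. This is a minimal sketch, assuming C# 7.0 or later pattern matching (which postdates this thread) and small stand-in classes rather than the real Roslyn types. The helper name `TryGetOperatorToken`, the `string` token type, and the class shapes are all hypothetical.

```csharp
// Stand-in types for illustration only; the real Roslyn classes differ.
public abstract class SyntaxNode { }

public sealed class BinaryExpressionSyntax : SyntaxNode
{
    public string OperatorToken { get; set; }
}

public sealed class PrefixUnaryExpressionSyntax : SyntaxNode
{
    public string OperatorToken { get; set; }
}

public sealed class IdentifierNameSyntax : SyntaxNode { }

public static class OperatorTokens
{
    // Hypothetical helper: returns true and the token when the node's
    // runtime type is one of the listed types that declare OperatorToken.
    public static bool TryGetOperatorToken(SyntaxNode node, out string token)
    {
        switch (node)
        {
            case BinaryExpressionSyntax b:
                token = b.OperatorToken;
                return true;
            case PrefixUnaryExpressionSyntax p:
                token = p.OperatorToken;
                return true;
            default:
                // Covers null and every type not listed above.
                token = null;
                return false;
        }
    }
}
```

The list of cases still has to be maintained by hand and only covers types visible to the compiling assembly, which is the limitation discussed in the replies below.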

@alrz
Member

alrz commented Nov 7, 2015 via email

@svick
Contributor

svick commented Nov 7, 2015

@MatthieuMEZIL I don't see how that could work in the general case, when the classes are in different assemblies. When compiling your Visit(), the C# compiler does not know all the types that inherit from SyntaxNode, so I think it can't use either of your proposals.

(BTW, could you format code in your posts? It would make them much easier to read.)

@bbarry

bbarry commented Nov 8, 2015

I have written something very similar to this more times than I care to admit:

using System.Reflection;

public struct SyntaxNodeWithOperatorToken
{
    readonly object _node;
    readonly MethodInfo _operatorTokenGetter;
    static readonly object[] empty = new object[0];

    public SyntaxNodeWithOperatorToken(SyntaxNode node)
    {
        _node = node;
        // Look up the property getter by name on the node's runtime type.
        _operatorTokenGetter = _node.GetType().GetProperty(nameof(OperatorToken)).GetMethod;
    }

    // Typed as SyntaxToken to match the property being wrapped.
    public SyntaxToken OperatorToken
    {
        get
        {
            return (SyntaxToken)_operatorTokenGetter.Invoke(_node, empty);
        }
    }
}

nameof allows me to get rid of the string I used to have, but it would be nice to get rid of the 100x+ perf penalty somehow. I am fortunate not to have needed to do this in performance-critical code.
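
One common way to reduce the per-call cost of a wrapper like the one above is to do the reflection lookup once per runtime type and cache a compiled delegate. The following is only a sketch under assumptions: it uses stand-in classes with a `string` token instead of the real Roslyn types, and `OperatorTokenAccessor` is a hypothetical helper, not something proposed in this thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Reflection;

// Stand-in types for illustration only; the real Roslyn classes differ.
public abstract class SyntaxNode { }

public sealed class BinaryExpressionSyntax : SyntaxNode
{
    public string OperatorToken { get; set; }
}

public sealed class IdentifierNameSyntax : SyntaxNode { }

public static class OperatorTokenAccessor
{
    // One compiled getter per runtime type, built on first use.
    // A null entry records that the type has no suitable property.
    private static readonly ConcurrentDictionary<Type, Func<object, string>> Getters =
        new ConcurrentDictionary<Type, Func<object, string>>();

    // Hypothetical helper: reflection runs once per type; later calls
    // go through the cached delegate. Returns null when the node's type
    // has no public string OperatorToken property.
    public static string GetOperatorToken(SyntaxNode node)
    {
        Func<object, string> getter = Getters.GetOrAdd(node.GetType(), BuildGetter);
        return getter == null ? null : getter(node);
    }

    private static Func<object, string> BuildGetter(Type type)
    {
        PropertyInfo property = type.GetProperty("OperatorToken");
        if (property == null || property.PropertyType != typeof(string))
            return null;

        // Builds the equivalent of: (object o) => ((TNode)o).OperatorToken
        ParameterExpression o = Expression.Parameter(typeof(object), "o");
        MemberExpression body = Expression.Property(Expression.Convert(o, type), property);
        return Expression.Lambda<Func<object, string>>(body, o).Compile();
    }
}
```

This still gives no compile-time safety, so it addresses only the performance concern, not the language feature being requested.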

@MatthieuMEZIL
Author

@alrz: not in this case.
If I also wanted this to work with generics, generics wouldn't let me use any sort of "as" or "is".

@MatthieuMEZIL
Author

@svick: As this is a new feature, you can define any restriction you want. Think about partial methods: they can only be private because of how they are implemented. It would be too complex to allow protected partial methods, for example.
So a restriction that it is only usable on types known to the assembly that hosts the code seems very reasonable to me.

@MatthieuMEZIL
Author

@bbarry: I want to avoid reflection.

@alrz
Member

alrz commented Nov 8, 2015

You don't need it; this is already possible in F# using inline functions with member constraints:

let inline f (node: ^T) = (^T : (member OperatorToken : SyntaxToken) node)

But it has to be inline, because it wouldn't be possible if this member comes from different classes, as @svick said. The compiler must know which method is called inside the function at compile time (to avoid any kind of reflection).

@bbarry

bbarry commented Nov 9, 2015

Agreed, I don't want to do it with reflection either. I do it today with reflection because it gives me a syntax that isn't terrible and I can afford the cost. (After playing around with dynamic, I think I will be using that instead from now on; one of the best things about reading random issues on GitHub projects like this is learning things I didn't know before.) Even with dynamic being surprisingly fast at this, I would still prefer some compile-time safety.
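
As an illustration of the dynamic approach mentioned here, the following is a minimal sketch, again using stand-in classes with a `string` token rather than the real Roslyn types. `DynamicOperatorToken.GetOrNull` is a hypothetical helper name. The member lookup happens at run time, so a node without the property only fails when the call executes.

```csharp
using Microsoft.CSharp.RuntimeBinder;

// Stand-in types for illustration only; the real Roslyn classes differ.
public abstract class SyntaxNode { }

public sealed class BinaryExpressionSyntax : SyntaxNode
{
    public string OperatorToken { get; set; }
}

public sealed class IdentifierNameSyntax : SyntaxNode { }

public static class DynamicOperatorToken
{
    // Hypothetical helper: binds OperatorToken at run time and returns
    // null when the node's runtime type has no such member.
    public static string GetOrNull(SyntaxNode node)
    {
        try
        {
            dynamic d = node;
            return d.OperatorToken;
        }
        catch (RuntimeBinderException)
        {
            // Thrown when the member cannot be bound on this runtime type.
            return null;
        }
    }
}
```

Using an exception to signal "no such member" is costly on the failure path, so this shape suits callers that expect most nodes to have the property.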

@MatthieuMEZIL
Author

After thinking more about it, I don't think my intermediate solution is acceptable. Indeed, if we have thousands of subtypes, performing thousands of "as" checks may be worse than using reflection...

@JoergWMittag

Unless I completely misunderstand what you are trying to do, it looks like you want structural types. C# has what is called a nominative type system, i.e. a type system where the relationships between types (compatibility, subtyping, etc.) are based on the names of the types.

The basic idea of structural types is that relationships between types are based on the structure of the types. Two types are compatible if they have the same structure. What exactly is considered to be part of the structure differs between languages, e.g. whether, for record types, the names of the fields are considered part of the structure or only their types. E.g. in some languages record { x: Int, y: Int, z: Int } and record { year: Int, month: Int, day: Int } are considered to be compatible because they both have type Int × Int × Int; in others, they aren't, because the names differ. In languages with subtyping, record { x: Int, y: Int, z: Int } would usually be considered a subtype of record { x: Int, y: Int }.

The most prominent example of a language that has both nominative and structural types is Scala (where it is called a refinement). Go's interfaces are also an example of structural typing.

In Scala, you can declare a type to be a (structural) refinement of another (nominative) type, i.e. you can add further restrictions on top of what the nominative type prescribes (or, if you are refining the top type, the only restrictions). The most common use case is to leave out the nominative type from the declaration, in which case it is implicitly the top type. Structural refinements mirror the syntax of the corresponding declarations exactly.

def foo(bar: { def baz(x: Int): Long; val quux: String })

This declares a method foo that can take any object which has a readonly field named quux (note: the name of the member is part of the structural signature) of type String and a method named baz which takes an Int (note: the parameter name is not part of the signature) and returns a Long.

This means that I can pass an instance of this class:

class TotallyUnrelated {
  def baz(xyzzy: Int) = xyzzy.toLong
  val quux = "Hello"
}

In your case, it would look something like this:

override def visit(node: SyntaxNode { val operatorToken: SyntaxToken }) =
  if (node != null) foo(node.operatorToken)

I.e. we have declared visit to take a parameter which is a subtype of SyntaxNode with the additional refinement that it must have a val (readonly property) named operatorToken which is a (subtype of) SyntaxToken.

Drawing from the same idea as Scala, re-using the declaration syntax as the type syntax, it might look something like this in C#:

public override void Visit(SyntaxNode { SyntaxToken OperatorToken { get; } } node)
{
    if (node != null) Foo(node.OperatorToken);
}

Note that in the current version of Scala on the JVM, in some cases, calling such methods is indeed implemented using reflection and thus has a rather high performance cost. Even the spec, which tries to steer clear of implementation issues, has this footnote:

A reference to a structurally defined member (method call or access to a value or variable) may generate binary code that is significantly slower than an equivalent code to a non-structural member.

But, that's a question of implementation, not language design, that was driven by constraints of the target language of the compiler, in particular the fact that such types cannot be represented in the JVM's type system. Maybe, at some point, the implementation can be switched over to use invokedynamic, now that Scala has dropped support for JVMs which don't have it.

The situation is different in case of C#:

  • maybe the CLR can represent such types, it is, after all, a rather different beast than the JVM
  • even if it can't, maybe it can be changed to do so (platform-language co-evolution is an option the Scala team doesn't have with the JVM!)
  • even if it won't, maybe it doesn't need reflection (e.g. implemented using the dynamic support)
  • even if it does, maybe the specific access patterns can be special-cased in the CLR
  • even if they can't, maybe they turn out to be not so slow, after all
  • and lastly, even if they turn out to be slow, maybe they are still useful

Now that I think about it: C# actually has a limited form of structural types in the form of anonymous types.

A natural evolution in terms of already existing language semantics would be to allow not only instantiating but also declaring anonymous types. Currently, you can only instantiate an anonymous type with a syntax that resembles an object initializer with the class name missing. A natural extension of this syntax toward structural types would be to

  • lift the restriction on read-only properties as the only allowed members
  • allow anonymous types to appear as … well … types, with a syntax similar to the initialization syntax, i.e. leaving out the name
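
To make the "limited form of structural types" concrete, here is a small sketch of how existing anonymous types behave. Within one assembly, two anonymous object creation expressions with the same property names and types in the same order produce the same type, and changing the order produces a different one. The class and method names are illustrative only.

```csharp
public static class AnonymousTypeDemo
{
    // Same property names and types, in the same order: one shared type.
    public static bool SameShapeSharesType()
    {
        var point = new { X = 1, Y = 2 };
        var other = new { X = 3, Y = 4 };
        return point.GetType() == other.GetType();
    }

    // Same properties in a different order: two distinct types.
    public static bool DifferentOrderSharesType()
    {
        var xy = new { X = 1, Y = 2 };
        var yx = new { Y = 2, X = 1 };
        return xy.GetType() == yx.GetType();
    }
}
```

The identity is based on shape rather than on a declared name, but it stops there: an anonymous type cannot be written as a parameter or cast target, which is the gap the two bullet points above would close.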

@MatthieuMEZIL
Author

@JoergWMittag: "Two types are compatible if they have the same structure."
I was thinking more of a C#-interface-like structure (so including methods, for example, which I'm not sure your explanation covers), but yes, I think you got the main idea of what I'm asking for.

However, in my case, I would prefer the following syntax to the one you proposed:

public override void Visit(SyntaxNode node)
{
    ...
    var nodeWithOperatorToken = ({ SyntaxToken OperatorToken { get; } })node;
    if (nodeWithOperatorToken != null) Foo(nodeWithOperatorToken.OperatorToken);
    ...
}

Either form could be the better fit, depending on whether the Visit method also needs to handle SyntaxNode instances that do not have the OperatorToken property.

@gafter
Member

gafter commented Nov 21, 2015

This is "structural interfaces". See also #154.

@gafter gafter closed this as completed Nov 21, 2015
@gafter gafter added the Resolution-Duplicate The described behavior is tracked in another issue label Nov 21, 2015