Proposal: user-defined null/default check (non-defaultable value types / nullable-like types) #15108

Closed
ufcpp opened this issue Nov 9, 2016 · 23 comments

Comments

@ufcpp
Contributor

ufcpp commented Nov 9, 2016

Nullable types - Nullable<T> or any reference types - are specially treated with some syntax:

  • propagate an invalid value with a ?. operator
  • serve an alternative value with a ?? operator
  • [planned] Flow-analysis based validity checking like non-nullable reference types

There are some types behaving like nullable types, and I would like these "nullable-like" types to be "first-class" in terms of special treatment like ?., ??, and flow-analysis.

Nullable-like types examples

1. value-constrained struct

Suppose that you implement a type whose value has some constraints: for instance, an integer type which is constrained to be positive:

struct PositiveInt
{
    public int Value { get; }
    public PositiveInt(int value)
    {
        if (value <= 0) throw new InvalidOperationException();
        Value = value;
    }
}

If the C# compiler had DbC (Design by Contract) and record types, this sample could be written like:

struct PositiveInt(int Value) requires Value > 0;

This struct is meant never to be zero or less, but it can still be zero in exactly one case: default(PositiveInt). The default should be treated as an invalid value, like null.
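
To make the loophole concrete, here is a minimal sketch built on the PositiveInt struct above. DefaultDemo and its method names are hypothetical and exist only for illustration. Both methods obtain a PositiveInt without running the constructor, so the value check never executes.

```csharp
using System;

public struct PositiveInt
{
    public int Value { get; }

    public PositiveInt(int value)
    {
        // The only place where the "positive" rule is enforced.
        if (value <= 0) throw new InvalidOperationException();
        Value = value;
    }
}

// Hypothetical helper class, not part of the proposal.
public static class DefaultDemo
{
    // default(T) zero-initializes the struct and never calls the constructor.
    public static int ValueOfDefault() => default(PositiveInt).Value;

    // Array elements of a struct type are zero-initialized in the same way.
    public static int ValueOfArrayElement() => (new PositiveInt[1])[0].Value;
}
```

Both methods return 0, a value the constructor would have rejected.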

2. Expected

There is a problem with using null as an invalid value: it does not tell why the operation returned null. To solve this problem, some people prefer a type similar to expected<T> in C++. It is a union type of T and Exception, as follows:

struct Expected<T>
{
    public T Value { get; }
    public Exception Exception { get; }
    public bool HasValue => Exception == null;
}

When I use such a type, I want to write the following:

Expected<string> s;
Expected<int> len = s?.Length;
int x = len ?? 0;

This code uses an "exception propagating operator" ?. and an "exception coalescing operator" ??, by analogy with the null-propagating and null-coalescing operators.

Proposed syntax

I want some syntax to introduce "nullable-like" types to C#. One idea is "operator null":

// definition
struct Expected<T>
{
    public T Value { get; }
    public Exception Exception { get; }
    public static bool operator null(Expected<T> e) => e.Exception != null;
}

// usage
Expected<int> e;
int x = e ?? 0;

// generated code
Expected<int> e;
int x = operator null(e) ? 0 : e.Value;
@dsaf

dsaf commented Nov 9, 2016

So it's like an Option type but with error details + syntax sugar? Can't we use a tuple and an extension method?

(int result, Exception error) len = s?.Length;
int x = len.GetValueOrDefault();

public static T GetValueOrDefault<T>(this (T Result, Exception Error) source, T defaultValue = default(T))
{
    return source.Error == null ? source.Result : defaultValue;
}

@dsaf

dsaf commented Nov 9, 2016

@alrz
Member

alrz commented Nov 9, 2016

@dsaf This is Result in Rust and FSharpResult in F#.

@dsaf

dsaf commented Nov 9, 2016

@alrz any decent modern language basically :). So a CoreFX issue unless we want the ?? operator? They already have this https://github.com/dotnet/corefx/issues/538 but sadly it's just the presence/absence - no error details.

@DavidArno

There already exists a proposed (and possibly still planned) feature that would address this requirement: the is operator for patterns. By changing your type to:

struct Expected<T>
{
    public T Value { get; }
    public Exception Exception { get; }

    public static bool operator is(Expected<T> e, out T v)
    {
        // A static operator has no 'this', so members are read through the parameter e.
        v = e.Exception == null ? e.Value : default(T);
        return e.Exception == null;
    }
}

Then you'd be able to do:

Expected<int> e;
int x = e is int ? e.Value : 0;

Or, using the new "is var" scope leakage, and the proposed wildcard feature, it just becomes:

Expected<int> e;
_ = e is int x;
// x will either be default(int) or e.Value here

@HaloFour

HaloFour commented Nov 9, 2016

I don't like the idea of conflating null with this concept. An Expected<T> with an exception isn't null.

If the primary goal is to support the ?. and ?? operators then perhaps both can be supported by convention. For example, if the type has a HasValue property (or HasValue() extension method, unless extension properties become a thing), then it would be pretty easy to support both operators. The ?? operator could be further supported by a GetValueOrDefault(T) method, although that wouldn't be strictly necessary.

@alrz
Member

alrz commented Nov 9, 2016

I smell monad ...

It'd be nice to see a proposal for defining monad operators to cover all these cases. A HasValue convention would only work for handling value absence, but I think it'd be worth exploring a general solution for these kinds of things. As far as I remember, someone implemented a null check with task-like types, e.g. await does the null check and returns early when the value is null. However, awaitables aren't really a suitable mechanism to implement such behavior.

It's funny that Rust uses the ? operator on the Result type for exactly this purpose.

https://github.com/rust-lang/rfcs/blob/master/text/0243-trait-based-exception-handling.md

@ufcpp
Contributor Author

ufcpp commented Nov 9, 2016

@dsaf Yes, Option is another example. I also wish we could use tuples and extensions.

@ufcpp
Contributor Author

ufcpp commented Nov 9, 2016

Some more examples:

  • Option<T> or Maybe<T>, where T can be either a reference type or a value type, in contrast to Nullable<T> where T : struct
  • Thin wrapper structs such as TaskAwaiter which should contain a non-null reference
  • UnityEngine.Object overloads operator true and == with which a value is treated as null if internal native resources have been disposed

@ufcpp
Contributor Author

ufcpp commented Nov 9, 2016

To tell the truth, what I want the most is "non-default value types".

@ufcpp
Contributor Author

ufcpp commented Nov 9, 2016

default(T) produces null members when T is a struct that contains members of reference types. Thus, if C# gets non-null reference analysis, C# should have non-default value analysis too.

@HaloFour

HaloFour commented Nov 9, 2016

There's nothing special about default(T). All C# does is have the CLR zero-init that memory. The net effect of that is that reference types are null, including those as fields of a value type. This is all C# or the CLR can do when it doesn't know what T is.

@ufcpp
Contributor Author

ufcpp commented Nov 9, 2016

@HaloFour Non-nullable reference types in C# vNext are based on flow analysis. This analysis should also cover the "defaultability" of value types.

@HaloFour

HaloFour commented Nov 9, 2016

I see what you're saying. However, in most cases a default value type is perfectly fine and it doesn't make sense to bother warning on them. I think that there would be a lot of potential value in allowing for analyzers to potentially tap into the flow analysis so that custom types like Option<T> or whatever could participate, but I can't see there being a general purpose rule here.

@svick
Contributor

svick commented Nov 9, 2016

@alrz I don't think this should go full monad. For example, consider the IEnumerable<T>/sequence monad. I would be very surprised if e.g. ImmutableArray.Create(1, 2, 3)?.ToString() returned IEnumerable<string> or ImmutableArray<string>.

To me, the ?. operator only makes sense for types that can have zero or one values, not more. And HasValue should be sufficient for that.

(Plus I'd say it's consistent with LINQ, which technically does go full monad, but is not used that way in practice, and is strongly biased towards the sequence monad.)

@dsaf

dsaf commented Nov 10, 2016

@HaloFour

However, in most cases a default value type is perfectly fine and it doesn't make sense to bother warning on them.

Is the value of disallowing null in preventing the NullReferenceException or in making the default value choice more conscious?

@DavidArno

@ufcpp,

To tell the truth, what I want the most is "non-default value types".

Many people want this and I thought it was coming with the #13921 proposal, by allowing parameterless constructors to be defined for structs. Sadly having properly read it, I've realised it's actually a proposal to make things worse.

Looks like we'll have to wait a while longer.

@alrz
Member

alrz commented Nov 10, 2016

@svick "exactly this purpose" refers to the subject of this proposal. Check out the link in my previous comment. For monads in general, I can imagine something between Haskell's do-notation and F#'s computation expressions, for example, var x <- m; ... could be translated to return m.Bind(x => ...) and require the method return an appropriate type, the same rule that we need to satisfy to use await.
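
As a rough sketch of that translation (not a proposal for actual syntax): the Option<T> type, its Bind method, and the ParseInt helper below are all hypothetical names chosen for illustration. The comment shows the imagined var x <- m; form, and the code underneath is what it would expand to when written by hand.

```csharp
using System;

// Hypothetical minimal option type; default(Option<T>) doubles as None.
public readonly struct Option<T>
{
    private readonly T _value;
    public bool HasValue { get; }

    private Option(T value) { _value = value; HasValue = true; }

    public static Option<T> Some(T value) => new Option<T>(value);
    public static Option<T> None => default(Option<T>);

    public T GetValueOrDefault(T fallback) => HasValue ? _value : fallback;

    // Runs f only when a value is present; otherwise propagates None.
    public Option<U> Bind<U>(Func<T, Option<U>> f) =>
        HasValue ? f(_value) : Option<U>.None;
}

public static class BindDemo
{
    private static Option<int> ParseInt(string s) =>
        int.TryParse(s, out var n) ? Option<int>.Some(n) : Option<int>.None;

    // Imagined source:
    //     var a <- ParseInt(x);
    //     var b <- ParseInt(y);
    //     return Some(a + b);
    // Hand-written expansion using nested Bind calls:
    public static Option<int> Add(string x, string y) =>
        ParseInt(x).Bind(a =>
        ParseInt(y).Bind(b =>
        Option<int>.Some(a + b)));
}
```

If either input fails to parse, the remaining lambdas are never invoked and the result is None, which is the early-return behavior discussed in this thread.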

@ufcpp
Contributor Author

ufcpp commented Nov 11, 2016

@HaloFour
Yes, in most cases. So, we might need some kind of annotation on structs.

@DavidArno
A parameterless constructor is useful anyway, but default(T) still remains. That is a separate issue.

@DavidArno

@ufcpp,

I disagree (as I explain in a comment near the bottom of that topic). Parameterless constructors on structs will only be useful if default(T) results in the CLR invoking them, otherwise they'll just make things worse.

@ufcpp ufcpp changed the title Proposal: user-defined null/default check (nullable-like types) Proposal: user-defined null/default check (non-defaultable value types / nullable-like types) Nov 12, 2016
@ufcpp
Contributor Author

ufcpp commented Nov 12, 2016

To recap:

This issue might have to be separated into two parts:

  • Non-defaultable value types: value types whose instances are guaranteed never to be default(T)
  • nullable-like types: user-defined types which can use ?. and ?? operators

Non-defaultable value types

Method Contracts, and nullability checking in particular, rely on certain initialization code having run. However, if T is a struct, default(T) sets all members to 0 or null and skips that initialization. Thus, flow analysis is needed not only for contracts and nullability checking of reference types but also for "defaultability" checking of value types.

Example 1: Thin Wrapper of Reference

For performance reasons, we sometimes need to create a thin wrapper struct which contains only a few members of reference types, e.g. ImmutableArray. Now that performance is one of the big themes in C#, the need for such wrapper structs will keep increasing.

Given the code such as:

struct Wrapper<T> where T : class
{
    public T Value { get; }
    public Wrapper(T value)
    {
        Value = value ?? throw new ArgumentNullException(nameof(value));
    }
}

If Records and nullability checking were introduced to C#, this could be written more briefly as follows:

struct Wrapper<T>(T Value) where T : class;

Under that feature, T is non-nullable, and you would write T? if null is required. The problem here is that an instance of this struct can still contain null via default(Wrapper<T>), although T Value is supposed to be non-nullable.
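
A small sketch of the problem, using the long form of Wrapper<T> from above. WrapperDemo and its methods are hypothetical names added only to make the two code paths easy to compare.

```csharp
using System;

public struct Wrapper<T> where T : class
{
    public T Value { get; }

    public Wrapper(T value)
    {
        // The constructor refuses null...
        Value = value ?? throw new ArgumentNullException(nameof(value));
    }
}

// Hypothetical helper class, not part of the proposal.
public static class WrapperDemo
{
    // ...but default(Wrapper<T>) bypasses the constructor, so Value is null.
    public static bool DefaultHoldsNull() =>
        default(Wrapper<string>).Value == null;

    // Going through the constructor with null is rejected as intended.
    public static bool ConstructorRejectsNull()
    {
        try
        {
            new Wrapper<string>(null);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }
}
```

Both methods return true: the constructor guards against null, yet a default instance still carries it.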

Example 2: Value-Constrained Structs

Suppose that you implement a type whose value has some constraints: for instance, an integer type which is constrained to be positive:

struct PositiveInt
{
    public int Value { get; }
    public PositiveInt(int value)
    {
        if (value <= 0) throw new InvalidOperationException();
        Value = value;
    }
}

If Records and Method Contracts were introduced to C#, this could be written more briefly as follows:

struct PositiveInt(int Value) requires Value > 0;

This looks good at first glance, but default(PositiveInt) yields 0, which is an invalid value.

Proposal

Nullability checking proposed in #5032 is based on flow analysis. Defaultability checking is essentially the same analysis as the nullability checking and could be implemented by the same algorithm.

However, in most cases, default(T) is a valid value of T. Therefore, some kind of annotation would be required. For instance, how about using a [NonDefault] attribute on structs?

[NonDefault]
struct Wrapper<T>(T Value) where T : class;

[NonDefault]
struct PositiveInt(int Value) requires Value > 0;

I call these types "non-defaultable value types". As with non-nullable reference types, defaultability should be checked by using flow analysis.

PositiveInt x = default(PositiveInt); // warning
PositiveInt? y = default(PositiveInt); // OK
PositiveInt z = y; // warning
PositiveInt w = y ?? new PositiveInt(1); // OK

If T is a non-defaultable value type, T? doesn't require Nullable<T> because default(T) is an invalid value and can be used as null.

Moreover, ordinary value types should not be allowed to have members of non-nullable reference types. The only types allowed to have them would be reference types and non-defaultable value types.

nullable-like types

Currently, reference types and Nullable<T> have a special position in that they can be used with the ?. and ?? operators. However, we sometimes want to use these operators on other types.

There are some alternative approaches that use query expressions and task-like types, but in my opinion, these are abuses. I would like to use ?. and ?? for propagation and coalescence of invalid values.
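
For readers unfamiliar with the query-expression alternative, here is a minimal sketch of what it can look like today. It assumes an Expected<T> struct like the one used elsewhere in this issue. The ExpectedQuery extension class and the QueryDemo helper are hypothetical names, not an existing library.

```csharp
using System;

public struct Expected<T>
{
    public T Value { get; }
    public Exception Exception { get; }

    public Expected(T value) { Value = value; Exception = null; }
    public Expected(Exception exception) { Value = default(T); Exception = exception; }

    public bool HasValue => Exception == null;
}

// Hypothetical extension: a Select method is all the compiler needs
// to accept a simple "from ... select ..." query over Expected<T>.
public static class ExpectedQuery
{
    public static Expected<U> Select<T, U>(this Expected<T> source, Func<T, U> selector) =>
        source.HasValue
            ? new Expected<U>(selector(source.Value))   // map the valid value
            : new Expected<U>(source.Exception);        // carry the error along
}

public static class QueryDemo
{
    // Written with ?. this would be:  Expected<int> len = s?.Length;
    public static Expected<int> Length(Expected<string> s) =>
        from v in s select v.Length;
}
```

The query form works, but it borrows sequence-oriented syntax for a type that holds at most one value, which is the mismatch described above.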

Example 1: UnityEngine.Object

UnityEngine.Object - a common base type in Unity Game Engine - overloads operator == with which a value is treated as null (x == null is true) if internal native resources have been disposed.

However, ?. and ?? on reference types do not call operator ==. Instead, the C# compiler emits a brtrue instruction which simply checks whether the reference itself is null. Thus, ?. and ?? on UnityEngine.Object don't work properly:

int? X(UnityEngine.Object obj)
{
    // OK
    if (obj == null) return null;
    return obj.GetInstanceID();
}

// runtime exception: the native resource is disposed
int? Y(UnityEngine.Object obj) => obj?.GetInstanceID();

So far, this has not been a big problem because Unity uses C# 3.0. However, Unity 5.5 will update C# to 6.0, and this behavior of the ?. operator could become a pitfall.
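
The same effect can be reproduced without Unity. In the sketch below, FakeNativeObject is a hypothetical stand-in that only imitates the described behavior (an overloaded == that treats a destroyed object as null, and a member that throws once destroyed). It is not Unity's actual implementation.

```csharp
using System;

// Hypothetical stand-in for a type like UnityEngine.Object.
public class FakeNativeObject
{
    private bool _destroyed;

    public void Destroy() => _destroyed = true;

    public int GetInstanceID()
    {
        if (_destroyed) throw new InvalidOperationException("native resource destroyed");
        return 42;
    }

    // A destroyed object compares equal to null.
    public static bool operator ==(FakeNativeObject a, FakeNativeObject b)
    {
        bool aNull = a is null || a._destroyed;
        bool bNull = b is null || b._destroyed;
        if (aNull || bNull) return aNull && bNull;
        return ReferenceEquals(a, b);
    }

    public static bool operator !=(FakeNativeObject a, FakeNativeObject b) => !(a == b);

    public override bool Equals(object obj) => obj is FakeNativeObject other && this == other;
    public override int GetHashCode() => base.GetHashCode();
}

public static class UnityLikeDemo
{
    // Uses the overloaded ==, so a destroyed object is treated as null.
    public static int? ViaEquality(FakeNativeObject obj)
    {
        if (obj == null) return null;
        return obj.GetInstanceID();
    }

    // ?. checks only the reference, so a destroyed object reaches GetInstanceID().
    public static int? ViaNullConditional(FakeNativeObject obj) => obj?.GetInstanceID();
}
```

After Destroy() is called, ViaEquality returns null while ViaNullConditional throws, because the null-conditional operator never consults the user-defined operator ==.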

Example 2: Expected<T>

A certain number of developers tend to avoid using null as an invalid value. One reason is that null doesn't tell us why a result became invalid. To solve this problem, they prefer a type similar to expected<T> in C++. It is a union type of T and Exception, as follows:

struct Expected<T>
{
    public T Value { get; }
    public Exception Exception { get; }
}

If C# had a type like this, ?. and ?? might be allowed on the type.

Expected<string> x = new Expected<string>(new Exception());
Expected<int> y = x?.Length;
string z = x ?? "";

Proposal

I propose that the ?. and ?? operators can be used on types that implement a certain pattern. I call these types "nullable-like types", named after task-like types. How about a pattern like:

struct NullableLike<T>
{
    public T Value { get; }
    public bool HasValue { get; }
    // propagate a valid value
    public NullableLike<U> Propagate<U>(U value);
    // propagate an invalid value
    public NullableLike<U> Propagate<U>();
}

For example, the Expected<T> could implement this pattern as follows:

struct Expected<T>
{
    public T Value { get; }
    public Exception Exception { get; }

    public Expected(T value)
    {
        Value = value;
        Exception = null;
    }
    public Expected(Exception exception)
    {
        Value = default(T);
        Exception = exception;
    }

    public bool HasValue => Exception == null;
    public Expected<U> Propagate<U>() => new Expected<U>(Exception);
    public Expected<U> Propagate<U>(U value) => new Expected<U>(value);
}

Now, the sample of Expected<T> described in the previous section is expanded as follows:

Expected<string> x = new Expected<string>(new Exception());
Expected<int> y = x.HasValue ? x.Propagate(x.Value.Length) : x.Propagate<int>();
string z = x.HasValue ? x.Value : "";

@gordanr

gordanr commented Nov 13, 2016

It would be great to have.

struct PositiveInt(int Value) requires Value > 0;

But it is not as simple as the following code.

struct PositiveInt
{
    public int Value { get; }

    public PositiveInt(int value)
    {
        if (value <= 0) throw new InvalidOperationException();
        Value = value;
    }
}

The business rule for validation should be encapsulated inside the type, somehow reflecting the requires condition. Validation should be offered as a public service of the type. Developers should check input from unsafe sources (keyboard, database, files) before it crosses the boundary into the safe domain type. Data has to be validated before calling the constructor and triggering an exception. Look at the great article "From Primitive Obsession to Domain Modelling" by Mark Seemann.

Something like:

struct PositiveInt
{
    public enum Code { OK, Bad }

    public int Value { get; }

    public PositiveInt(int value)
    {
        var validationCode = ValidationCode(value);
        if (validationCode != Code.OK) throw new InvalidOperationException();
        Value = value;
    }

    public static Code ValidationCode(int candidate)
    {
        if (candidate <= 0) return Code.Bad; // requires Value > 0

        return Code.OK;
    }
}

@ufcpp
Contributor Author

ufcpp commented Feb 19, 2017

Ported to dotnet/csharplang#146, dotnet/csharplang#147

@ufcpp ufcpp closed this as completed Feb 19, 2017
8 participants