Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow references to structs to be stored in fields of ref structs #1147

Open
lucasmeijer opened this issue Nov 25, 2017 · 69 comments
Open

allow references to structs to be stored in fields of ref structs #1147

lucasmeijer opened this issue Nov 25, 2017 · 69 comments
Assignees
Labels
Bottom Up Work Feature Request Implemented Needs ECMA Spec This feature has been implemented in C#, but still needs to be merged into the ECMA specification Proposal champion Proposal
Milestone

Comments

@lucasmeijer
Copy link

lucasmeijer commented Nov 25, 2017

Unity is a game engine that uses C# extensively.

We're in need of a way to have a struct that contains one or more pointers to other structs that we want to be able to read/write from/to. We do this with unsafe code today:

    struct Color { float r,g,b; }
    
    unsafe struct Group
    {
      Color* _color;
      ref Color color => ref *_color;
    }

But we would love to do this without unsafe code, and would like to suggest a feature where it's allowed to store references to structs in fields of ref structs, that can guarantee that these pointers don't escape onto the heap:

    struct Color { float r,g,b; }

    ref struct Group
    {
        ref Color color;
    }

EDIT link to spec for this feature https://github.com/dotnet/csharplang/blob/main/proposals/low-level-struct-improvements.md

@Joe4evr
Copy link
Contributor

Joe4evr commented Nov 26, 2017

I feel like you're trying too hard to use ref struct everywhere. Always keep in mind:

@jnm2
Copy link
Contributor

jnm2 commented Nov 26, 2017

@Joe4evr I don't think that comment is necessarily apropos.

@VSadov
Copy link
Member

VSadov commented Nov 27, 2017

ref fields are not directly expressible in IL.

You can very close to the desired behavior if there is a public type like: https://github.com/dotnet/corert/blob/796aeaa64ec09da3e05683111c864b529bcc17e8/src/System.Private.CoreLib/src/System/ByReference.cs

The bigger problem (and one of the reasons the type is not public) is tracking of life times.
It is assumed right now that that methods that take a ref and return a ref-like cannot return that same ref wrapped in a ref-like. Allowing/assuming such wrapping would make use of ref and ref-likes very difficult.

Such wrapping could be ok though, if wrapped ref is guaranteed to refer to a heap location. There are ideas how this could be allowed via some specialized syntax like:

ByReference<int> r = ref inArray[0];  // ok since array element is definitely on the heap

Just wondering, if the feature comes with "only on the heap" restriction, would it be acceptable limitation?

@joeante
Copy link

joeante commented Nov 27, 2017

You can very close to the desired behavior if there is a public type like:

That is possible but we intend to use this for simplifying our API's significantly for our upcoming new component system. We want the syntax to be straightforward, and typing .Value is quite annoying for what is a very common pattern in that Component system.

For our use case:

  • The referenced data is allocated from C++ via malloc
  • Group in the above example is a ref struct, thus can't be written to any classes or other persistent data

For our particular use case, we can live with any restriction on what you can assign to it. Internally we are patching a pointer to the data in by writing to memory at an offset relative to the adress of the Group struct.

But certainly for the feature to be useful in general beyond our use case it seems useful to be able to assign to the ref fields from the constructor

@gafter
Copy link
Member

gafter commented Nov 27, 2017

Stack-only structs were introduced in C# 7.2. See #666

@gafter
Copy link
Member

gafter commented Nov 27, 2017

This (a ref stored in a struct) would need to be supported by the CLR before/when it could be supported by C#.

@jaredpar
Copy link
Member

jaredpar commented Dec 1, 2017

Can you expand the sample to include a couple of other use cases? In particular:

  • How do you imagine such a struct being initialized? Sample code would be helpful here.
  • Do you need this feature for only unmanaged types (transitively contains no class or interface fields)?

@joeante
Copy link

joeante commented Dec 1, 2017

Our full example usage code looks like this:


struct BoidData : IComponentData
{
    public float3 Position;
}

struct MyTransform : IComponentData
{
    public float3 Position;
}

class MyComponentSystem : ComponentSystem
{
    ref struct MyGroup
    {
        public ref BoidData       Boid;
        public ref MyTransform    Transform;
    }

    ComponentGroup<MyGroup> m_Group = new ComponentGroup<MyGroup>();

    void OnUpdate()
    {
        foreach(var entity in m_Group)
        {
            entity.Transform.Position = entity.Boid.Position;
        }
    }
}

The expected behaviour would be to transform this:

ref struct MyGroup
{
        public ref BoidData       Boid;
        public ref MyTransform    Transform;
}

to this:

ref struct MyGroup
{
    	private Boid* 			        _boid;
    	public ref Boid                 Boid { { get ref *_boid;  }

    	private MyTransform* 	               _transform;
    	public ref MyTransform  Transform { { get ref *_transform;  }
}

How do you imagine such a struct being initialized? Sample code would be helpful here.

For our specific use case we do not require initializing ref values on the struct. We are writing to the pointers by offset using unsafe code inside the enumerator that GetEntities returns.

That is our use case. I can however imagine that being able to assign the ref in the constructor would cover other use cases for this.

Do you need this feature for only unmanaged types (transitively contains no class or interface >fields)?

Our use case is where the ref fields to Boid and MyTransform above are blittable. Thats our current requirements. We probably don't want to support interfaces / class types inside of Boid and MyTransform in the example above for other reasons. Not 100% on this, but if thats a hard requirement for some reason, we will find a way to live with it.

@joeante
Copy link

joeante commented Dec 1, 2017

I would imagine that for other use cases allowing assignment to ref fields from the constructor would be the sensible thing to do. This matches how it works in C++.

ref struct MyGroup
{
        MyGroup(ref MyTransform t) { Transform = t; }
    	private ref MyTransform 	     Transform;
}


@benaadams since you emoticon loved it, I imagine you have some use cases for this as well?

@benaadams
Copy link
Member

benaadams commented Dec 1, 2017

Just wondering, if the feature comes with "only on the heap" restriction, would it be acceptable limitation?

Yes

I imagine you have some use cases for this as well?

Gather ops (returning multiple refs); side car data (ref + info fields)

@jaredpar
Copy link
Member

jaredpar commented Dec 5, 2017

For our specific use case we do not require initializing ref values on the struct. We are writing to the pointers by offset using unsafe code inside the enumerator that GetEntities returns.

This feature request started with an ask that the feature be usable in safe code. This design seems to require there is an unsafe creator behind the scenes newing up these values.

@AJosephsen
Copy link

This was mainly what I was hoping for when I heard about 7.2 ref struct. I thought ref fields was the main reason to introduce ref structs in C# 7.2.

@omariom
Copy link

omariom commented Dec 9, 2017

This was mainly what I was hoping for when I heard about 7.2 ref struct. I thought ref fields was the main reason to introduce ref structs in C# 7.2.

The same for me. I planned to use them as wrappers for Spans.

@benaadams
Copy link
Member

benaadams commented Dec 9, 2017

The same for me. I planned to use them as wrappers for Spans.

You can do that as the ref field is hidden in the Span e.g.

Invalid

struct Thing
{
    Span<byte> span;
}

Valid

ref struct Thing
{
    Span<byte> span;
}

Is bit wasteful for a Span of count 1; also introduces variability to the field (as could now be count > 1)

@omariom
Copy link

omariom commented Dec 9, 2017

@benaadams

Valid

Great! 👍

@omariom
Copy link

omariom commented Dec 9, 2017

@jaredpar

The requirement to initialize such fields could make it safe:
Foe example,

ref struct MessageWrapper
{
    private ref Header header;
    private Span<byte> body; a

    public MessageWrapper(ref Header header, Span<byte> body)
    {
        this.header = ref header;
        this.body = body;
    }
}

So MessageWrappers must be initialized either via excplicit ctor call or if the ref fields are public
via initializer. Parameterless ctors are disallowed.

ref var wrapper = new MessageWrapper { Header = ref header, Body = body };

@bbarry
Copy link
Contributor

bbarry commented Jan 9, 2018

The requirement to initialize such fields could make it safe:

Not really:

MessageWrapper Create()
{
    var header = CreateHeader();
    return new MessageWrapper(ref header, CreateBody());
}

// call this method and watch bad things happen
void Crash() {
    var w = Create();
    // do something with w.header here
}

To make that safe the language/runtime would need to recognize that header needs to be in a pinned location with a lifetime at least as long as the MessageWrapper object. You can play with this sort of thing today via a Span<T> pointer trick:

ref struct MessageWrapper
{
    private Span<Header> header;
    private Span<byte> body;

    public MessageWrapper(ref Header header, Span<byte> body)
    {
        this.header = SpanEx.DangerousCreate(ref header);
        this.body = body;
    }
}

public static class SpanEx
{
    /// <summary>
    /// Creates a new span over the target. It is up to the caller to ensure
    /// the target is pinned and that the lifetime of the span does not escape
    /// the lifetime of the target.
    /// </summary>
    /// <param name="target">A reference to a blittable value type.</param>
    public static unsafe Span<T> DangerousCreate<T>(ref T target) where T : struct
    {
        return new Span<T>(Unsafe.AsPointer(ref target), 1);
    }
}

You can move the unsafe code to a second assembly and mark the one containing MessageWrapper safe, as long as Header is not itself a ref struct (otherwise you cannot use it as a generic parameter and this safety illusion breaks down). And then the lifetime of MessageWrapper must be carefully managed (it cannot for example be returned if Header is a local). Most of the relevant restrictions are enforced by the compiler by virtue of it being a ref struct.

Wrapping the "ref field" in a Span<T> is a tiny bit goofy (a ByReference type or actual support would be more natural and efficient), but there are some neat tricks you can do once you have a span such as reinterpret internal structure as a Vector<T>.

I entered https://github.com/dotnet/corefx/issues/26228 to make this a little nicer (don't expect much though, this sort of thing plays very loose with the safety boundaries of managed code). I sort of think requiring the unsafe flag might be reasonable for playing with Span<T> objects of blittable types.

@gafter
Copy link
Member

gafter commented Jan 9, 2018

To make that safe the language/runtime would need to recognize that header needs to be in a pinned location with a lifetime at least as long as the MessageWrapper object.

You mean like this: https://github.com/dotnet/csharplang/blob/master/proposals/csharp-7.2/span-safety.md ?

@bbarry
Copy link
Contributor

bbarry commented Jan 9, 2018

I am not sure if that is enough because Header here is not a ref struct (and ref struct types cannot be held inside a Span<T> as the generic argument, you have to trick it to see the underlying bytes and reinterpret as a not-ref struct). This program fails (every time for me with the above ) with the exception "Header not valid here":

public class Program {
    static MessageWrapper Create()
    {
        var header = CreateHeader();
        header.one = 1;
        var w = new MessageWrapper(ref header, CreateBody());
        if (w.header[0].one != 1) throw new InvalidOperationException("Header didn't make it");
        return w;
    }

    static Span<byte> CreateBody() => default;
    static Header CreateHeader() => default;

    public static void Main()
    {
        var w = Create();
        // anything that causes a GC event or sufficiently 
        // modifies the stack should cause breakage here?
        Thread.Sleep(1); 
        // do something with w.header here
        if (w.header[0].one != 1) throw new InvalidOperationException("Header not valid here");
    }
}

public ref struct MessageWrapper
{
    public Span<Header> header;
    public Span<byte> body;

    public MessageWrapper(ref Header header, Span<byte> body)
    {
        this.header = SpanEx.DangerousCreate(ref header);
        this.body = body;
    }
}

public static class SpanEx
{
    /// <summary>
    /// Creates a new span over the target. It is up to the caller to ensure
    /// the target is pinned and that the lifetime of the span does not escape
    /// the lifetime of the target.
    /// </summary>
    /// <param name="target">A reference to a blittable value type.</param>
    public static unsafe Span<T> DangerousCreate<T>(ref T target) where T : struct
    {
        return new Span<T>(Unsafe.AsPointer(ref target), 1);
    }
}

public struct Header {
    public byte one;
}

@aobatact
Copy link

aobatact commented Jan 10, 2018

How about creating a "RefWrapper" or a "SingleSpan" ref struct which is almost same as Span but doesn't have a length field?

@bbarry
Copy link
Contributor

bbarry commented Jan 10, 2018

The reason the Span<T> type is useful is because it is public and already has the associated apis for Span<T>. Any other name might as well be the ByReference type. This type already exists and represents the functionality desired in this issue, but is internal because it is very easy to use it to do things that break assumptions we depend on as well as a question of how much support it should get I assume.

I think most of us looking for ref fields would be satisfied if that type was made public and effectively ignored much like how TypedReference and related features are treated. That would still likely mean significant additional work for future changes to the runtime.

@AleckAgakhan
Copy link

@VSadov, what do you say on this?
There is no way to add ref-fields directly to ref struct, as the problem of the default instances arises: the default instances of such structs would contain refs to nowhere.
Thus not so much ref-fields itself need to be introduced in IL as much some sort of the monad Maybe for refs has to be implemented.
if ref-fields are introduced in IL:

public readonly ref struct NullableRef<T>
{
	public NullableRef(ref T r) => _ref = r;
	
	private readonly ref T _ref;
	
	public bool IsNull => Unsafe.AsPointer(ref _ref) == null;
	public ref T Ref => IsNull ? throw new NullReferenceException() : _ref;
}

otherwise:

public readonly ref struct NullableRef<T>
{
	public NullableRef(ref T r) => _ref = new ByReference<T>(ref r);
	
	private readonly ByReference<T> _ref;
	
	public bool IsNull => Unsafe.AsPointer(ref _ref.Value) == null;
	public ref T Ref => IsNull ? throw new NullReferenceException() : _ref.Value;
}

@verelpode
Copy link

@joeante wrote:

The referenced data is allocated from C++ via malloc.

If you're willing to switch over from malloc to allocating your structs in large C# arrays, then you could use the solution that I describe in #2417 in order to eliminate frame rate stalls/stutters caused by GC, without resorting to unsafe code.

An additional advantage: When you use malloc, then obviously you must later free the memory manually, but when you allocate your structs in C# arrays (call them "pages"), then each array ("page") is automatically garbage-collected when no more references to that "page" exist, as you know. Thus you can easily manage and balance the costs by merely changing the length of the arrays/pages that you create.

  • To run GC less often, create longer arrays/pages.
  • To save memory at the expense of running GC a little more often, create shorter arrays/pages.

In the case of a game where frame rates are important, you'd create big arrays/pages to keep GC to the minimum, or potentially never release the arrays/pages while the user is still playing the game and not between levels etc. You would have the control AND the necessary performance, without resorting to unsafe code.

When allocating a struct, should you search for a free slot in an array/page? If you want to, you can, but note that a simpler solution is satisfactory in many apps: Don't reuse any slots. When the page is full, simply allocate a new page, and the previous page will be automatically garbage-collected when all of its slots become unused. This GC cost is low and completely satisfactory for most apps, but if your game is very sensitive to frame rate stalls, then you could simply increase the page size (array length), and if this GC cost is still not low enough, then you could keep track of which slots are free and reuse slots.

You could start using the solution already today with the current version of C#, but in order to make it a comfortable and convenient solution, C# needs the syntactic sugar that I describe in #2417, which is mostly the same syntax as @lucasmeijer described ("ref Color color;") except designed for pointing to elements of C# arrays instead of unsafe C++ malloc. Also note that lucasmeijer wrote "ref struct Group {...}" but in my proposal, both "Group" and "Color" would be normal structs, whereas the field "ref Color color;" is special.

@DaZombieKiller
Copy link
Contributor

DaZombieKiller commented Dec 5, 2018

As I understand it, you would be required to supply the ref from elsewhere in order to initialize a ref field, correct?

Would it be possible to have a struct containing fields itself that are used by ref? For example, something like this, but expressible safely:

unsafe struct Example
{
    Vector3 position;

    Quaternion rotation;

    public ref Vector3 Position => ref *(Vector3*)Unsafe.AsPointer(ref position);

    public ref Quaternion Rotation => ref *(Quaternion*)Unsafe.AsPointer(ref rotation);
}

Which would allow for more convenient syntax, such as:

instance.Position.X += 1;

When using properties.

@jaredpar
Copy link
Member

@AlexRadch

From the [documentation] on MemoryMarshal.CreateSpan

This method should be used with caution. It is dangerous because the length argument is not checked. In addition, the lifetime of the returned span is not validated for safety by span-aware languages.

Essentially this method is incredibly unsafe. The returned ref has no safety rules associated with it. The idea of ref fields is to do this in a safe fashion where the compiler will provide the same rules around lifetime safety that it does for Span<T> today.

@AlexRadch
Copy link

Essentially this method is incredibly unsafe. The returned ref has no safety rules associated with it. The idea of ref fields is to do this in a safe fashion where the compiler will provide the same rules around lifetime safety that it does for Span<T> today.

It is unsafe only for lifetime. But if lifetime of ref is less than lifetime of variable then it is safe.

c# is not Rust it does not have lifetime safety. You try to add it.

For lifetime safety use array of length 1 and Span to it.

@AlexRadch
Copy link

Span is more safe than pointer * and it does not requare to pin memory.

@jaredpar
Copy link
Member

@AlexRadch

It is unsafe only for lifetime. But if lifetime of ref is less than lifetime of variable then it is safe.

Indeed. 100% of unsafe code is safe when used safely.

The point of unsafe code though is that the training wheels are off, the language is no longer providing the typical guarantees it provides (type and memory safety).

c# is not Rust it does not have lifetime safety. You try to add it.

Sorry this is incorrect. For ref and ref struct values the language has well defined and enforced safety rules. As long as users stay in safe code it's not possible to create issues like dangling refs. Once a user ventures into unsafe code, just as when they venture into unsafe in Rust, it is possible to create type and memory safety issues. That is one of the dangers posed by MemoryMarshal.CreateSpan.

@En3Tho
Copy link

En3Tho commented Oct 20, 2020

@AlexRadch Span itself uses a quirk to store a ref instead of using proper language/runtime feature. Now you propose using even more quirks instead of having a proper runtime/language feature. Ref field Color != field Color*, there is huge difference between references and pointers.
Also I do not get that assurance that Span[1] is the only and proper way to go. Spans are not a 100% safe construct, you break things easily if you're not careful with them.

@ufcpp
Copy link

ufcpp commented Oct 20, 2020

You try to add it.

Already. #3936

@jaredpar
Copy link
Member

@En3Tho

Indeed Span<T> uses an intrinsic today called ByReference<T>. That type is fundamentally unsafe as it's essentially a ref field with no language safety attached. The proposal to add ref fields would let us remove that type entirely and have Span<T> be defined entirely in C# terms

https://github.com/dotnet/csharplang/blob/master/proposals/low-level-struct-improvements.md#provide-ref-fields

@AlexRadch
Copy link

AlexRadch commented Oct 20, 2020

That is one of the dangers posed by MemoryMarshal.CreateSpan.

Yes. MemoryMarshal.CreateSpan is what you want but only without safety from compiler.

@En3Tho
Copy link

En3Tho commented Oct 20, 2020

@jaredpar I know about it, yes. I call it a quirk because it's internal, hard to use and has no language guarantees. I know of cases where people had to copy that stuff from runtime but there is a meaning in keeping stuff like that hidden from the general community. And thank you for sharing this proposal, it's a good read :)
@AlexRadch

Yes. But MemoryMarshal.CreateSpan is what you want but only without safety from compiler.

You mean with safety, not without
Also, do not forget that spans have a cost - it's length and I do not believe jit can elide it

@AlexRadch
Copy link

Indeed Span<T> uses an intrinsic today called ByReference<T>.

As I know ByReference<T> is not public and can not be used.

@AJosephsen
Copy link

I hate that I cannot implement functionality similar to Span in C# - it feels like I am being cheated.
Span feels like a workaround for missing ref fields.

@jaredpar
https://github.com/dotnet/csharplang/blob/master/proposals/low-level-struct-improvements.md#provide-ref-fields
Seems to solve all that. Why isn't this on the official rodamap?

@Sergio0694
Copy link

I hate that I cannot implement functionality similar to Span in C# - it feels like I am being cheated.
Span feels like a workaround for missing ref fields.

As you said, you can actually do that already, it's just not ideal. You can make it work and with very good codegen as well though. For instance, you can piggyback one 4 bytes value in your custom span-like type into the int length parameter of Span<T>, so that that space doesn't go wasted either. For instance, in the Span2D<T> type in the Microsoft.Toolkit.HighPerformance package (here) I'm using a single Span<T> field to store both a ref to the target memory area and the height of the target 2D memory area. This way you basically save one field, and the resulting code is fairly optimal. It's just definitely trickier and you need to be careful to ensure the APIs that publicly exposed from your types are safe to use, but it can technically be done already with no real downsides, other than the compiler not helping you out to ensure lifetime safety or bounds checks and having to always use manual GC-ref offsetting everywhere, which is arguably a bit error prone. You can make it work and make it safe too though 😄

Of course, being able to do this with ref T fields in C# 10 (hopefully?) will be infinitely better for a number of reasons and I'm not saying the two things are equivalent, just pointing out a workarounds for those absolutely in need of such a thing today.

@333fred
Copy link
Member

333fred commented Oct 20, 2020

Seems to solve all that. Why isn't this on the official rodamap?

What more are you expecting to see? This is a championed issue, triaged into the Working Set milestone, with a proposal checked in, that the LDM has met twice on within the last month. If that's not on the roadmap, I don't know what is.

@PathogenDavid
Copy link

triaged into the Working Set milestone

Maybe I'm missing something, but I don't think it is:

image

@333fred 333fred added this to the Working Set milestone Oct 20, 2020
@333fred
Copy link
Member

333fred commented Oct 20, 2020

I... Huh. Apparently we missed this when triaging things. Well, it's in the working set now.

@ilexp
Copy link

ilexp commented Feb 28, 2021

I'm confused about the state of this issue - the linked PR appears to add support for ref fields, it has been merged to master, but when checking the master branch via sharplab, it doesn't work. What am I missing? Is this already scheduled for a specific C# version release?

@jnm2
Copy link
Contributor

jnm2 commented Feb 28, 2021

@ilexp Is the linked PR you're referring to the one that merged spec updates into the dotnet/csharplang master branch? The master branch that Sharplab refers to is the dotnet/roslyn master branch. Here's how you can keep tabs on implementation status in dotnet/roslyn: https://github.com/dotnet/roslyn/blob/master/docs/Language%20Feature%20Status.md

@ilexp
Copy link

ilexp commented Feb 28, 2021

Oh - yes, that is the one I was referring to. I didn't actually check the modified files 🤦 Thanks for pointing it out 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bottom Up Work Feature Request Implemented Needs ECMA Spec This feature has been implemented in C#, but still needs to be merged into the ECMA specification Proposal champion Proposal
Projects
None yet
Development

No branches or pull requests