Recursive and unsized enums #1151

tomaka · 2015-06-05T13:06:38Z

It is currently forbidden to have this enum, because it is recursive:

enum Sequence<T> {
    Element { value: T, next: Sequence<T> },
    End
}

However in theory we could accept it by making Sequence unsized.

The text was updated successfully, but these errors were encountered:

glaebhoerl · 2015-06-06T10:43:44Z

I think this is a really cool idea, if it can work (I don't momentarily see any reason why it couldn't, but I haven't thought very hard). Reminds me a bit of serialization formats where you have a flat stream of bytes with tags to indicate what comes next.

ftxqxd · 2015-06-06T11:29:34Z

It could be a little tricky to define how to construct one of these. A similar instance of the same problem exists today: it’s valid to to define a struct whose last field is a non-type-parameter unsized type…

struct Dst {
    x: [u8],
}

…but it’s basically impossible to construct one of these, because there’s no Sized equivalent to coerce from. I believe this proposal suffers from the same problem. The obvious solution would be to somehow allow unsized types in struct initialisers, but I’m not sure about how that would work in practice.

Kimundi · 2015-06-07T16:08:38Z

@P1start: Afaik @eddyb has a few ideas how to make that work.

eddyb · 2015-06-07T17:08:29Z

@P1start you would only allow it in an in-place context (&expr, box expr, in place { expr }) and then drag around enough context to save the extra information after (while?) writing the data.
Though panics might be tricky, it should still be possible.

ticki · 2015-11-16T08:54:46Z

I think this is really a cool idea, and is needed for the sake of consistency.

serprex · 2016-11-08T13:11:34Z

A useful extension of this is that an unsized enum which is non recursive could also be useful for taking up less space (consider enum BigOrSmall { A([u8; 2000]), B([u16; 10])}). Perhaps a #[repr(unsized)]

tupshin · 2017-01-07T18:59:21Z

I'll just weigh in in favor of this given my internals question where I unexpectedly ran into this same limitation.

burdges · 2017-01-07T23:28:34Z

There are discussions elsewhere about doing match on "artificial" enums, so another idea might be a trait

trait Match {
    type MatchEnum;
    fn run_match(&self) -> MatchEnum;
    fn run_match_mut(&mut self) -> MatchEnum;
}

so that if impl Match for FakeEnum exists then match fake_enum { .. } becomes match run_match(&fake_enum) { .. } or the mut version depending upon the patterns.

Assuming the trait fields RFC, there is a version with nicer borrowing semantics

trait Match {
    type MatchEnum;
    type EnumField;
    target: EnumField;
    fn run_match(&EnumField) -> MatchEnum;
    fn run_match_mut(&mut EnumField) -> MatchEnum;
}

where match fake_enum { .. } becomes match run_match(&fake_enum.target) { .. }.

We can maybe do the recursive unwrapping these describe using this approach like

struct Sequence<T>([T]);

enum SequenceEnum<T> {
    Element(pub T, &Sequence<T>),
    End
}

impl<T> Match for Sequence<T> {
    type MatchEnum = SequenceEnum<T>;
    fn run_match(&EnumField) -> MatchEnum {
        if self.0.len() > 0 { Element(self.0[0],Sequence<T>(&self.0[1])) else { End }
    }
}

except with lifetimes and unsafety around the borrowing of self.0. And maybe even an explicit length field if [T] lacks that.

All this looks less ergonomic of course, but you could maybe fix that with macros somehow. And it's definitely something lower-level developers run into frequently.

I donno if this approach might place less nicely with future schemes for type-level numerics though.

RReverser · 2020-03-25T13:39:52Z

To bump this thread: now that we have enums with well-defined memory layout such as repr(u8), repr(C) (ping @eddyb), etc., could we allow at least such enums to contain unsized data as last item of enum payload?

Then it would be possible at least for an unsafe code to construct such enums using same techniques as for any other structs (which such enum is equivalent to), and for other safe code to use it.

eddyb · 2020-03-25T21:41:48Z

@RReverser It helps with the optimization problem, but you still have the issue of the metadata not always being valid, and you also need to read from behind the data pointer to determine the active variant to get e.g. the alignment of the enum, which is needed for an operation like computing the true field offset in a type having that enum as a field (e.g. &pair.1 on (u8, Enum)).

I forget what we did for unions, did we just ban unsized unions? (cc @RalfJung)

RustyYato · 2020-03-25T21:46:05Z

@eddyb Unsized unions are in fact banned, playground.

RReverser · 2020-03-25T22:30:04Z

you also need to read from behind the data pointer to determine the active variant to get e.g. the alignment of the enum

Hmm. Isn't that the case for any unsized data structures anyway? E.g. you need to read from behind the data pointer to get alignment of a struct ending with a dyn Trait.

eddyb · 2020-03-25T22:33:47Z

E.g. you need to read from behind the data pointer to get alignment of a struct ending with a dyn Trait.

No, the reference/pointer to the struct contains the data pointer and the vtable pointer, there's nothing behind the data pointer that's not in a version of the struct where the dyn Trait is replaced by the concrete type (this is how unsizing works after all).

RReverser · 2020-03-25T22:59:28Z

Ah, got it, makes sense... And, I suppose, same applies to actual variant vtables.

On a second thought, given that only one variant of the enum can be active and hold unsized data, couldn't &SomeEnum be encoded as a (data_ptr_for_the_entire_enum, vtable_just_for_active_variant)?

Then other parts of the type system could continue reading layout and vtable information as they normally do. The downside is that now you have to read from behind data pointer when taking a reference... but that's probably still fine, because creating a reference to invalid data is a declared UB anyway, and still better than the alternative.

RustyYato · 2020-03-25T23:14:17Z

On a second thought, given that only one variant of the enum can be active and hold unsized data, couldn't &SomeEnum be encoded as a (data_ptr_for_the_entire_enum, vtable_just_for_active_variant)?

No, counterexample

enum Foo {
    N,
    C(Foo)
}

Some examples of this type Foo::C(Foo::C(Foo::N)), and Foo::C(Foo::N). Both of these have the same active variant for the outermost wrap, but the inner Foo have different variants. How would that be encoded? (hint, it can't in general). We would have to load from the data pointer. That said, that in itself wouldn't kill this. It would just make this the very first thin pointer DST with language support.

Just to make sure we understand why this isn't possible in general, because the enum above is solvable by just encoding the "length" of the enum, but it only get's more complex from here:

Consider this, (a simple length won't suffice, we also need to know what the last element is, but that's just a bit more metadata that we could store)

enum Foo {
    A,
    B,
    C(Foo)
}

Or even this, (where a length is insufficient)

enum Foo {
    A,
    B(Foo),
    C(Foo),
}

Here is a case that would truly require reading from the data pointer, no other way around it.

Then other parts of the type system could continue reading layout and vtable information as they normally do. The downside is that now you have to read from behind data pointer when taking a reference... but that's probably still fine, because creating a reference to invalid data is a declared UB anyway, and still better than the alternative.

I think this is fine, we should be able to do that.

RReverser · 2020-03-25T23:16:20Z

Sorry, before reading in depth, I'm confused:

No, counterexample: ...

I think this is fine, we should be able to do that.

So which one is it? 😀 (as the comment you referenced had only a single suggestion)

RReverser · 2020-03-25T23:34:31Z

Reading through the counter-examples.

Both of these have the same active variant for the outermost wrap, but the inner Foo have different variants. How would that be encoded? (hint, it can't in general).

I don't see why not. Such enum

enum Foo {
    N,
    C(Foo)
}

would be equivalent to already existing and working struct definition (not exactly equivalent code, just an example for demonstration purposes)

pub struct Foo_N;

pub struct Foo_C<V: ?Sized>(Foo<V>);

pub struct Foo<V: ?Sized> {
    tag: u8,
    variant_data: V,
}

pub fn dyn_foo() -> Box<Foo<dyn std::any::Any>> {
    let foo = Foo {
        tag: 1,
        variant_data: Foo_C(Foo {
            tag: 1,
            variant_data: Foo_C(Foo {
                tag: 0,
                variant_data: Foo_N
            })
        })
    };
    Box::new(foo)
}

Its construction is already well-defined and works on stable Rust. Godbolt link

All I'm suggesting is for unsized enums to be encoded in the same way.

RustyYato · 2020-03-25T23:40:28Z

Sorry, I may have misunderstood what you were proposing, but I thought you meant that

It is possible to encode which variant the enum is in without reading from the data pointer (which isn't possible)
- I thought you realized this wasn't possible after I had already written out that portion, so I didn't really want to cut it out

After that realization I may have misinterpreted the second part of your comment to mean

We can allow reading from the data portion, and there wasn't anything wrong with that (this is fine, I think)

would be equivalent to already existing and working struct definition (not exactly equivalent code, just an example for demonstration purposes)

They sure do look the same, but they are very different in a single pivotal way. In the enum example you have just 1 type, but in the struct example you have 3 types FooN, FooC<FooN>, FooC<FooC<FooN>> each of these could have it's own v-table! This is relevant because we can't generate a v-table for every possible combination of the enum, but we can for the struct case. This is fundamental to how enums and traits work: traits offer a open polymorphism, and enum is closed polymorphism.

RReverser · 2020-03-25T23:45:59Z

This is relevant because we can't generate a v-table for every possible combination of the enum, but we can for the struct case.

Now that we're getting there... why not? Note that we wouldn't need to generate it for every possible combination, just for the ones actually constructed in code (and only for unsized enums), which are limited, just like in case with structs.

I believe there was even another long-term RFC going around about allowing to use separate enum variants as individual types, which seems like would intersect quite nicely with this proposal, as this one also suggests to treat enum variants as separate types.

RustyYato · 2020-03-26T00:21:31Z

So you are trying to blur the distinction between enums and traits? That sounds interesting. I could see this as a way of implementing a sort of sealed traits. One small problem, how do you construct the vtable (if we are going down that route)?. With traits we can never affect the bare trait object's implementation of the trait, so it is able to just call out using the v-table. But with enums, we are effectively writing that very implementation. What I'm trying to get at is this: We don't have Foo::C's implementation of some method, we only have Foo's implementation. So we can't generate a vtable.

amosonn · 2020-07-11T13:40:52Z

It might be possible to create a Foo::C "version" of Foo-s implementation, where in every destructuring (match, if let, etc.) only the Foo::C branch is taken - effectively encoding the discriminant into the implementation. This does indeed go further in the direction of making each variant a type of its own. But this also sounds like a lot of work in codegen.

nrc added the T-lang Relevant to the language team, which will review and decide on the RFC. label Aug 29, 2016

oli-obk mentioned this issue Nov 29, 2016

Allow recursive untagged unions #1804

Closed

taiki-e mentioned this issue Jul 15, 2022

Proposal: allow Fut: ?Sized in Shared<Fut> rust-lang/futures-rs#2623

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Recursive and unsized enums #1151

Recursive and unsized enums #1151

tomaka commented Jun 5, 2015

glaebhoerl commented Jun 6, 2015

ftxqxd commented Jun 6, 2015

Kimundi commented Jun 7, 2015

eddyb commented Jun 7, 2015

ticki commented Nov 16, 2015

serprex commented Nov 8, 2016

tupshin commented Jan 7, 2017

burdges commented Jan 7, 2017 •

edited

Loading

RReverser commented Mar 25, 2020

eddyb commented Mar 25, 2020

RustyYato commented Mar 25, 2020

RReverser commented Mar 25, 2020

eddyb commented Mar 25, 2020

RReverser commented Mar 25, 2020

RustyYato commented Mar 25, 2020 •

edited

Loading

RReverser commented Mar 25, 2020

RReverser commented Mar 25, 2020 •

edited

Loading

RustyYato commented Mar 25, 2020

RReverser commented Mar 25, 2020

RustyYato commented Mar 26, 2020

amosonn commented Jul 11, 2020

Recursive and unsized enums #1151

Recursive and unsized enums #1151

Comments

tomaka commented Jun 5, 2015

glaebhoerl commented Jun 6, 2015

ftxqxd commented Jun 6, 2015

Kimundi commented Jun 7, 2015

eddyb commented Jun 7, 2015

ticki commented Nov 16, 2015

serprex commented Nov 8, 2016

tupshin commented Jan 7, 2017

burdges commented Jan 7, 2017 • edited Loading

RReverser commented Mar 25, 2020

eddyb commented Mar 25, 2020

RustyYato commented Mar 25, 2020

RReverser commented Mar 25, 2020

eddyb commented Mar 25, 2020

RReverser commented Mar 25, 2020

RustyYato commented Mar 25, 2020 • edited Loading

RReverser commented Mar 25, 2020

RReverser commented Mar 25, 2020 • edited Loading

RustyYato commented Mar 25, 2020

RReverser commented Mar 25, 2020

RustyYato commented Mar 26, 2020

amosonn commented Jul 11, 2020

burdges commented Jan 7, 2017 •

edited

Loading

RustyYato commented Mar 25, 2020 •

edited

Loading

RReverser commented Mar 25, 2020 •

edited

Loading