Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed enum layout optimization with a NonZeroU64 + more space in an enum #101567

Open
RalfJung opened this issue Sep 8, 2022 · 10 comments
Open
Labels
A-layout Area: Memory layout of types T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Sep 8, 2022

The Provenance type defined here is 24 bytes in size:

use std::num::NonZeroU64;

pub enum Provenance {
    Concrete {
        alloc_id: NonZeroU64,
        sb: NonZeroU64,
    },
    Wildcard,
    None,
}

However, it should be possible to encode Provenance in 16 bytes: e.g. (0usize, 0suize) could encode None and (0usize, 1usize) could encode Wildcard.

In Miri, a slight variant of this would help reduce the size of a fairly common type from 32 bytes to 24 bytes.

Unfortunately even #94075 does not help here. The entire concept of reading a single field to determine the discriminant is not flexible enough to represent this layout.

@oxalica
Copy link
Contributor

oxalica commented Sep 8, 2022

Wouldn't it cause severe performance issue for enum with a number of variants? It makes a single-variant check matches!(enum, Variant) load and compare multiple fields (up to O(number_of_niches)). It also disables jump table optimization.

It's really a trade-off instead of an optimization.

enum Enum {
    _0000, // .0 == 0 && .1 == 0 && .2 == 0 && .3 == 0
    _0001, // .0 == 0 && .1 == 0 && .2 == 0 && .3 != 0
    _0010,
    _0011,
    // ...
    _1110,
    _1111(Box<()>, Box<()>, Box<()>, Box<()>),
}

@RalfJung
Copy link
Member Author

RalfJung commented Sep 8, 2022

It's really a trade-off instead of an optimization.

It optimizes space usage, but overall I agree it's a tradeoff. We might want attributes like repr(compact) or so.

@HKalbasi
Copy link
Member

HKalbasi commented Sep 9, 2022

It makes a single-variant check matches!(enum, Variant) load and compare multiple fields (up to O(number_of_niches)).

Why O(number_of_niches)? Isn't two field enough? Every value for the second field is a niche, so the example can be:

enum Enum {
    _0000, // .0 == 0 && .1 == 0
    _0001, // .0 == 0 && .1 == 1
    _0010, // .0 == 0 && .1 == 2
    _0011, // .0 == 0 && .1 == 3
    // ...
    _1110,  // .0 == 0 && .1 == 15
    _1111(Box<()>, Box<()>, Box<()>, Box<()>),
}

@mikebenfield
Copy link
Contributor

There's even more generality. You don't even need the second field to have any "invalid values" at all. Even if the original enum had only one NonZeroU64 it would work:

use std::num::NonZeroU64;

pub enum Provenance2 {
    Concrete {
        alloc_id: NonZeroU64,
        sb: u64,
    },
    Wildcard, // .0 = 0, .1 = 0
    None, // .0 = 0, .1 = 1
}

@RalfJung RalfJung changed the title Missed enum layout optimization with two NonZeroU64 in an enum Missed enum layout optimization with a NonZeroU64 + more space in an enum Sep 9, 2022
@cuviper
Copy link
Member

cuviper commented Sep 9, 2022

Related, if you have:

pub enum Provenance {
    Concrete {
        alloc_id: NonZeroU64,
        sb: NonZeroU64,
    },
    Wildcard,
}

This does niche to 16 bytes, but Option<Provenance> is still 24 bytes -- one could imagine finding the other niche for None. However, the compiler only ever tracks one largest niche AFAIK, and forgets about others.

@RalfJung
Copy link
Member Author

RalfJung commented Sep 10, 2022 via email

@cuviper
Copy link
Member

cuviper commented Sep 10, 2022

Hmm, actually the Wildcard variant would consider the unused niche to be just padding, so all bits are valid -- no longer niche for an Option wrapper.

@BGR360
Copy link
Contributor

BGR360 commented Sep 12, 2022

@rustbot label T-compiler A-layout

@rustbot rustbot added A-layout Area: Memory layout of types T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 12, 2022
@mikebenfield
Copy link
Contributor

I'm working on this. I'm introducing the concept of a Flag. A Flag is an extra scalar field that may be associated with a Niche. The flag must be set to a particular value to indicate we should find the discriminat in the Niche. Maybe I'll come up with a better name than Flag.

I've also got a commit to improve the flexibility of niche-optimized enums by allowing the fields of a tagged variant to go around the niche, with some appearing before and some after.

And I've got a commit to allow layouts to store more than just the largest_niche; this way multiple niches can be tried.

I should have a PR pretty soon.

mikebenfield pushed a commit to mikebenfield/rust that referenced this issue Oct 7, 2022
Also:
Try to fit other variants by moving fields around the niche.
Keep multiple niches, not just the largest one.
Look for multiple largest variants.
Introduce repr(flag).

Fixes rust-lang#101567
@kaya3
Copy link

kaya3 commented Dec 21, 2022

I found something similar to this today, but I am not sure if the "flag" method described in this thread works for it. I have an enum like below:

pub enum Foo {
    One(NonZeroU64, u64),
    Two(NonZeroU64),
    // Three
}

Foo can be represented using the first word as the discriminant, with a zero there indicating the variant Two; in principle there is still a niche when the second word is also zero. But uncommenting Three, or writing Option<Foo>, expands the size from 16 bytes to 24 bytes, meaning that Rust isn't using that niche.

That surprised me particularly in the case of Option<Foo>, because although Foo has a niche that requires comparing two words, Option<NonZeroU128> does make use of a niche that requires comparing two words.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-layout Area: Memory layout of types T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants