-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Option<char> should be represented as just one 32 bit value #5977
Comments
In the same vein as #6791. Is a reasonable optimization but probably not critical. We do similar optimizations for pointer types, I think. |
Okay, this would mean representing |
Once #8974 lands, we can go ahead with this. |
If the compiler knew about possible values a type could take perhaps it could make these optimizations automatically? E.g. an ASCII character can have values in the range 0-127, so enum T {
On,
Off
}
enum Repeats {
Zero
One(T),
Two(T,T),
Three(T,T,T)
Four(T,T,T,T)
Five(T,T,T,T,T)
} Because |
@Thiez what happens for: match repeats {
Five(ref x, _, _, _, _) => {}
_ => {}
} it'd have to be able to handle sub-word references. |
This will only work for |
I think it would work for any enum with up to 2047 nullary variants and one variant containing a |
If you set any of the bits in the |
Can this case be extended to |
@thestinger: Yes. But if you have:
You can freely use up to 11 bits to represent which variant it is, with the restriction that zero is reserved for |
We now have range asserts on |
Similar to #14540. |
If someone can mentor me through this bug, I'd be interested in giving it a try. |
…r> and Option<bool> r?luqmana
…r> and Option<bool> r?luqmana
For anyone wanting to see the status of this, here's code you can run: fn main() {
println!("{}", ::std::mem::size_of::<Option<char>>());
} it currently prints '8' (bytes) |
Refactor type memory layouts and ABIs, to be more general and easier to optimize. To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties: * `Variants` describes the plurality of sum types (where applicable) * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s * `Tagged` has its variants discriminated by an integer tag, including C `enum`s * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants * `FieldPlacement` describes the number and memory offsets of fields (if any) * `Union` has all its fields at offset `0` * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type * `Arbitrary` records all the field offsets, which can be out-of-order * `Abi` describes how values of the type should be passed around, including for FFI * `Uninhabited` corresponds to no values, associated with unreachable control-flow * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component" * `ScalarPair` has two "scalar components", but only applies to the Rust ABI * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s Size optimizations implemented so far: * ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.: * `Option<!>` is 0 bytes * `Result<T, !>` has the same size as `T` * using arbitrary niches, not just `0`, to represent a data-less variant, e.g.: * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte * `Option<char>` is 4 bytes * using a range of niches to represent *multiple* data-less variants, e.g.: * `enum E { A(bool), B, C, D }` is 1 byte Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here. Closes #44426, fixes #5977, fixes #14540, fixes #43278.
Refactor type memory layouts and ABIs, to be more general and easier to optimize. To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties: * `Variants` describes the plurality of sum types (where applicable) * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s * `Tagged` has its variants discriminated by an integer tag, including C `enum`s * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants * `FieldPlacement` describes the number and memory offsets of fields (if any) * `Union` has all its fields at offset `0` * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type * `Arbitrary` records all the field offsets, which can be out-of-order * `Abi` describes how values of the type should be passed around, including for FFI * `Uninhabited` corresponds to no values, associated with unreachable control-flow * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component" * `ScalarPair` has two "scalar components", but only applies to the Rust ABI * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s Size optimizations implemented so far: * ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.: * `Option<!>` is 0 bytes * `Result<T, !>` has the same size as `T` * using arbitrary niches, not just `0`, to represent a data-less variant, e.g.: * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte * `Option<char>` is 4 bytes * using a range of niches to represent *multiple* data-less variants, e.g.: * `enum E { A(bool), B, C, D }` is 1 byte Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here. Closes #44426, fixes #5977, fixes #14540, fixes #43278.
Refactor type memory layouts and ABIs, to be more general and easier to optimize. To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties: * `Variants` describes the plurality of sum types (where applicable) * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s * `Tagged` has its variants discriminated by an integer tag, including C `enum`s * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants * `FieldPlacement` describes the number and memory offsets of fields (if any) * `Union` has all its fields at offset `0` * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type * `Arbitrary` records all the field offsets, which can be out-of-order * `Abi` describes how values of the type should be passed around, including for FFI * `Uninhabited` corresponds to no values, associated with unreachable control-flow * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component" * `ScalarPair` has two "scalar components", but only applies to the Rust ABI * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s Size optimizations implemented so far: * ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.: * `Option<!>` is 0 bytes * `Result<T, !>` has the same size as `T` * using arbitrary niches, not just `0`, to represent a data-less variant, e.g.: * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte * `Option<char>` is 4 bytes * using a range of niches to represent *multiple* data-less variants, e.g.: * `enum E { A(bool), B, C, D }` is 1 byte Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here. Closes #44426, fixes #5977, fixes #14540, fixes #43278.
Refactor type memory layouts and ABIs, to be more general and easier to optimize. To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties: * `Variants` describes the plurality of sum types (where applicable) * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s * `Tagged` has its variants discriminated by an integer tag, including C `enum`s * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants * `FieldPlacement` describes the number and memory offsets of fields (if any) * `Union` has all its fields at offset `0` * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type * `Arbitrary` records all the field offsets, which can be out-of-order * `Abi` describes how values of the type should be passed around, including for FFI * `Uninhabited` corresponds to no values, associated with unreachable control-flow * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component" * `ScalarPair` has two "scalar components", but only applies to the Rust ABI * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s Size optimizations implemented so far: * ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.: * `Option<!>` is 0 bytes * `Result<T, !>` has the same size as `T` * using arbitrary niches, not just `0`, to represent a data-less variant, e.g.: * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte * `Option<char>` is 4 bytes * using a range of niches to represent *multiple* data-less variants, e.g.: * `enum E { A(bool), B, C, D }` is 1 byte Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here. Closes #44426, fixes #5977, fixes #14540, fixes #43278.
…tthiaskrgr Add lint panic in result ### Change Adding a new "restriction" lint that will emit a warning when using "panic", "unimplemented" or "unreachable" in a function of type option/result. ### Motivation Some codebases must avoid crashes at all costs, and hence functions of type option/result must return an error instead of crashing. ### Test plan Running: TESTNAME=panic_in_result cargo uitest --- changelog: none
A
char
is au32
, but Unicode is standardized to only take up 21bit, so there is more than enough space for that optimization.Might be useful for specifying code page conversion tables with holes in them.
The text was updated successfully, but these errors were encountered: