The size of (T, Infallible) is not zero #93032
Comments
(I don't claim to be authoritative on this topic) tl;dr: The Rustonomicon, which you linked, doesn't seem to guarantee anything about the size of Please correct me if I'm wrong! |
This is intentional; see #49298 |
Indeed, the size of So I think this issue should probably be closed. The one thing we could do here is improve documentation of this problem, but I am not sure where such docs would go. |
Perhaps the same page of the Rustonomicon that |
As @Ericson2314 pointed out, the example that made this unsound does not compile any more. NLL does not accept this code, so when NLL became mandatory on all editions, this stopped working. That decision was made here; there is an open issue at #54987 to re-accept such code. I will hence re-open this issue to track potentially doing the layout optimization again. That said, clearly that is in conflict with #54987, and there might be other soundness issues as well that I am missing. |
Also see the discussion starting at #54987 (comment) for other potential soundness concerns with making let x: (String, !) = (mk_string(), panic!()); and it needs to store the string somewhere (and drop it). Currently a temporary is created for this (and the tuple constructor is not even anywhere to be seen in the MIR), but I am not sure how intentional that is. |
What is the challenge in making -∞ the size of uninhabited types? It makes the math work:
|
And anyway computing the layout of a type involves a lot more than computing its size, so it's not like somehow changing the size of |
struct InhabitedLayout { size: usize, align: usize }
enum Layout {
    Uninhabited,
    Inhabited(InhabitedLayout),
} and thus a Per https://en.wikipedia.org/wiki/Tropical_geometry layout and the divisibility lattice, layout computation does have nice algebraic properties. |
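To make the "-∞ size" idea concrete, here is a minimal sketch of such a layout algebra. It is only an illustration under simplifying assumptions: fields are laid out in declaration order, sums are untagged overlays, and `product`, `sum`, `new`, and `round_up` are hypothetical names, not anything rustc actually uses. `Uninhabited` plays the role of -∞: it absorbs under products (sizes add) and is the identity under sums (sizes take the maximum).

```rust
/// Size and alignment of a type that has at least one value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InhabitedLayout {
    pub size: usize,
    /// Assumed to be a nonzero power of two.
    pub align: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layout {
    /// Stands in for "size = -infinity".
    Uninhabited,
    Inhabited(InhabitedLayout),
}

/// Round `n` up to the next multiple of `align` (hypothetical helper).
fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

impl Layout {
    pub fn new(size: usize, align: usize) -> Layout {
        Layout::Inhabited(InhabitedLayout { size, align })
    }

    /// Product (tuple-like, fields in order): sizes add up, and an
    /// uninhabited component makes the whole product uninhabited.
    pub fn product(self, other: Layout) -> Layout {
        match (self, other) {
            (Layout::Inhabited(a), Layout::Inhabited(b)) => {
                let align = a.align.max(b.align);
                let offset = round_up(a.size, b.align);
                Layout::new(round_up(offset + b.size, align), align)
            }
            _ => Layout::Uninhabited,
        }
    }

    /// Untagged sum (overlay of alternatives): sizes take the maximum, and
    /// an uninhabited alternative contributes nothing.
    pub fn sum(self, other: Layout) -> Layout {
        match (self, other) {
            (Layout::Uninhabited, l) | (l, Layout::Uninhabited) => l,
            (Layout::Inhabited(a), Layout::Inhabited(b)) => {
                let align = a.align.max(b.align);
                Layout::new(round_up(a.size.max(b.size), align), align)
            }
        }
    }
}
```

With this model, `Layout::new(4, 4).product(Layout::Uninhabited)` is `Uninhabited` rather than a 4-byte layout, which is the behavior the comment argues for; whether real layouts should behave this way is exactly what the thread disputes.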
Yeah, and they have different |
let x: (String, !) = (mk_string(), panic!());
My understanding is that we evaluate the right-hand side, it diverges, and so we don't care what the left-hand side looks like.
let (a, b): (String, !) = (mk_string(), panic!());
Splitting into separate lets doesn't change the behavior of the program, even if it does expose a new location where we didn't have one before.
#![feature(never_type)]
fn main() {
let x: !;
println!("asdf");
x = panic!();
}
Does work today, and cracks me up. Basically, for spooky reasons of polarity of the sort discussed in many papers at https://www.pauldownen.com/#publications , I think The problem with things like
|
I am suspicious of any calculus that ends up not being able to treat uninhabited types properly. |
The key word there is "properly". The point of the literature of the sort I linked is not to just ban all the scary stuff, but introduce a little bureaucracy once to get back nice things, like eta rules, de Morgan's laws, etc. etc. I am not proposing that The previous method of breaking the layout algebra by making |
If it sounds any better, I don't think there is problem with |
I have not disagreed with that. But I think nothing is wrong with fn main() {
let x: !;
println!("asdf");
x = panic!();
} and we should keep it working. This is not in contradiction with layout optimizations for IOW, partial initialization seems like such a niche feature that if we have to trade it off against making more layout optimizations, the layout optimizations win. We agree on that, even if we don't agree on the reasons for agreeing. :) However, there are other algebraic properties that are broken by these layout optimizations, so contrary to what you seem to claim it's not like algebra clearly favors one way over the other. Specifically, currently code working on generic I don't know if this is a concern in practice, but it is worrying since this affects unsafe code and we would not be able to easily detect when such code gets broken. |
It was funny to me, but fn main() {
let x: !;
println!("asdf");
x = panic!();
} I don't have a problem with. If With the recasting I wrote in the prior comment, it is just trying to project the "annihilated" locations (when they aren't also ZSTs) from the (uninitialized) location that is the issue.
fn main() {
let x: !;
println!("asdf");
*(&writeOnly x as &writeOnly ZST) = ZST {};
} This is even fine too! Only fn main() {
let x: (!, usize);
println!("asdf");
x.1 = 5;
} is a problem because of the |
Yes, that is worrisome. One option is
|
Ah! So one more point, I think it is a bit artificial to exclude sum types from partial initialization. Consider pub fn insert(&mut self, value: T) -> &mut T; If we had out pointers, we could likewise imagine pub fn set_some(&out self) -> &out T; This would, given an uninitialized However! Note that The partial initializations we have today I consider just a "second class" form of let x: Option<T> = Some(_);
x.0 = makeT(); // OK because we "know" it's `Some`. I bring this up to show that partial initialization is the enemy of empty types everywhere, not just in the special case of products where we used to allow. Basically, unrestricted partial initializing is saying that every type must admit a non-standard element called "uninitialized", and thus pessimizes the layout rules across the board. Because I indeed want all 3 of
The |
I don't think it does? The tag relies on valid bit-patterns of So as long as out references guarantee that some value is written before the lifetime ends, this is all fine even in the presence of layout optimizations and uninhabited types, I think? |
Sorry, I wrote that very misleadingly. The Very confusingly in the parenthetical, only if people want to read back what they wrote: let x: Option<T> = Some(_);
match x {
Some(_) /* must be _ */ => true,
None => false,
} do we run into issues. I think that is probably a logical thing to expect to work if we can also do let x: (bool, T);
x.0 = false;
if x.0 { "foo" } else { "bar" } but it is certainly not a practical killer feature the way |
And just to spell it out, with
// Magic type integrating with borrow checker's tracking initializedness
type Uninit<Ty: Inhabited>;
impl<T: Inhabited> Option<Uninit<T>> {
    fn is_some(&self) -> bool {
        match self {
            Some(_) /* must be _ */ => true,
            None => false,
        }
    }
}
impl<T, U: Inhabited> (T, Uninit<U>) {
    fn first(&mut self) -> &mut T {
        &mut self.0
    }
} |
With my lang hat on, I'll note that it's incredibly unlikely that such a proposal would be accepted. We're not looking to have more
Can you show an example where the size of a struct with an uninhabited field is a "performance" problem in practice? While I agree that it's conceptually-nice to say an uninhabited type has -∞ size, I've only seen theoretical cleanliness-based arguments for it, not anything showing it being an issue in real code. It seems fundamentally-unlikely to be a real problem, because any code that fully-initializes such a type is necessarily dead. And if it's not dead because it's only partially-initialized, then the product type not being a ZST is useful, not a problem at all. (It would also fall under "layout optimizations for repr rust" anyway, which are generally not guaranteed, so any code you write wouldn't be correct to depend on it even if it did happen.) |
I'm sorry I do not understand which issues you are referring to here. |
let x: Option<T> = Some(_);
match x {
Some(_) => println!("foo"),
None => println!("bar"),
} I mean if this is to be allowed, where the field of the |
Yes I am aware the Lang Team is dead set against more opt out traits. I think that is extremely unfortunate because I keep on finding problems where they would help (here, Advanced FFI, with the likes of Swift and C++ (see near https://twitter.com/ericson2314_/status/1504477340625084423,
(I have written code that used tons of I think the problem is less shrinking
It's nice to keep all that flexibility for our future selves! |
Oh I see... honestly I don't think we should allow this. The tag is sufficiently different from "normal fields" that treating it differently is justified, and I don't think this comes up often enough to warrant all that complexity. That said, this is quite similar to @eddyb's concern at #49298 (comment), "partial initialization of enum variants" -- except Eddy's comment only really applies to things like
I agree out pointers would be useful. But those would then probably entirely avoid |
@RalfJung Yes, the options go away, but the other enums were state machines, where I would incrementally set the next state before "yielding" in some fashion. Those would stay enums, and so it would be nice to set their tag bit first. I do not need to read their tag bits before they are fully initialized, however, so dropping the tag write in the case of an empty variant is perhaps OK. |
Yeah the #49298 (comment) is the one that I would make illegal to split into two separate loads with The original program might be fine if RHS eval first means there must be a temporary, but the optimization would not be able to project the non-temporary location to write the string to because the type of the containing outer location (the tuple) is not |
I imagine that partial initialization would be kind of like syntactic sugar. For
struct Foo { a: u8, b: String }
an additional "type" is defined
struct PartiallyInitializedFoo { a: MaybeUninit<u8>, b: MaybeUninit<String> }
and which fields are initialized is tracked separately. Therefore, if an uninhabited field zeroes the size of a struct, the partially initialized version of that struct would have the unoptimized size, with space to store all inhabited fields. |
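As a rough illustration of that desugaring, the sketch below writes the companion type by hand on stable Rust. Nothing here is generated by the compiler: `build_foo`, `PartialWithNever`, and `partial_with_never_size` are hypothetical names, and the "which fields are initialized" tracking is done by the programmer, not by the borrow checker.

```rust
use std::convert::Infallible;
use std::mem::{size_of, MaybeUninit};

pub struct Foo {
    pub a: u8,
    pub b: String,
}

/// Hand-written stand-in for the imagined compiler-generated companion type.
pub struct PartiallyInitializedFoo {
    pub a: MaybeUninit<u8>,
    pub b: MaybeUninit<String>,
}

/// Initializes the fields one at a time, then converts to the real type.
pub fn build_foo() -> Foo {
    let mut partial = PartiallyInitializedFoo {
        a: MaybeUninit::uninit(),
        b: MaybeUninit::uninit(),
    };
    partial.a.write(7);
    partial.b.write(String::from("hello"));
    // SAFETY: both fields were written just above.
    unsafe {
        Foo {
            a: partial.a.assume_init(),
            b: partial.b.assume_init(),
        }
    }
}

/// Companion of a struct with an uninhabited field. `MaybeUninit<Infallible>`
/// is inhabited (by the uninitialized state), so this type must still have
/// room for the `u32`, whatever happens to the layout of the "real" struct.
pub struct PartialWithNever {
    pub a: MaybeUninit<u32>,
    pub b: MaybeUninit<Infallible>,
}

pub fn partial_with_never_size() -> usize {
    size_of::<PartialWithNever>()
}
```

The point of `PartialWithNever` is that the partially initialized form keeps space for every inhabited field, independent of any size optimization applied to the fully initialized struct.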
If we actually had syntax to do it and that was the only way it could occur, sure, you could imagine it as syntactic sugar, and transform it appropriately (IIRC generators, which underpin But what happens is that every time you write anything like let tmp1 = foo();
let tmp2 = bar();
dest = (tmp1, tmp2); but given the right circumstances, the goal has ~always been to optimize that to: dest.0 = foo();
dest.1 = bar();
One relatively simple example is to make fn foobar<T, U>(foo: impl FnOnce() -> T, bar: impl FnOnce() -> U) -> (T, U) {
(foo(), bar())
} We want to always optimize functions like the above to something like this: return.0 = foo();
return.1 = bar();
return; (in more interesting situations it should be possible to even end up with e.g. a But the caller of |
If return type is uninhabited there's no reason to place partially initialized return value in return place, function diverges anyway. But I get the idea. union Foo {
a: u8,
b: (u32, !),
}
fn foo(foo: &mut Foo) {
foo.b.0 = 1;
assert!(ptr::eq(&*foo, &*foo.b.0));
} Then keeping I wouldn't think about it if you didn't mention |
The way to think about this is that the With more components/functions: Let's say you learn, during monomorphization/codegen, that If, say, And Either way, by the time you know everything about |
Hi! I was under the impression that this question was well settled in the direction of " |
I don't think that was ever officially decided. It just happens to be the case currently and MIR building relies on it so when If you want to get a T-lang decision, I suggest you write up the motivation and what you want to have decided, and nominate that for T-lang discussion. |
@Nadrieril As far as I know, the lang team thinks that an example like this https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=a7c3a6f3fce6347f33948a7f86e9761a use core::mem::MaybeUninit;
pub fn make_tuple<A, B>(a: impl FnOnce() -> A, b: impl FnOnce() -> B) -> (A, B) {
unsafe {
let mut m: MaybeUninit<(A, B)> = MaybeUninit::uninit();
std::ptr::addr_of_mut!((*m.as_mut_ptr()).0).write(a());
std::ptr::addr_of_mut!((*m.as_mut_ptr()).1).write(b());
m.assume_init()
}
}
enum Never {}
fn main() {
let x = make_tuple(|| "hello".to_string(), || "world".to_string());
dbg!(x);
let _y: (String, Never) = make_tuple(|| "yup".to_string(), || panic!());
} should not be UB, even though So long as that's the case -- basically meaning that field projecting inside a (Existing layout guarantees https://doc.rust-lang.org/std/mem/union.MaybeUninit.html#layout-1 prevent us from saying that So if you wanted to write that up in some appropriate document form (RFC, like the provenance one? Dunno) I think it could be accepted relatively easily. |
Note however that that example only decides the case for tuples. We could still make this a ZST struct S(String, !); You can't write code that is generic over structs. But possibly the lang team intends that example to also apply to non-generic cases, for an arbitrary struct. |
IMHO, it would be pretty weird if for struct St<T>(T, !);
struct StU8(u8, !);
|
We certainly explicitly reserve the right for differences like that.
|
Pattern types are specifically considering such a difference in fact |
Maybe a first step would be to add this to the lang team's "frequently requested changes" and explain there that this is not going to happen. That should be sufficient to close this issue. |
Since Infallible is an empty type, I expect (T, Infallible) to also be an empty type, thus they should have the same size. But I noticed that (T, Infallible) has the same size as T, so currently we have
mem::size_of::<Option<Infallible>>() == 0
mem::size_of::<Option<(u32, Infallible)>>() == 8
I guess this is because Rust treats an empty type as having size 0, so the size of (T, Infallible) is the size of T plus the size of Infallible, which equals the size of T. Maybe we can use something like Option<usize> to represent the size of a type internally, where None means the size of an empty type; in this way, we can distinguish empty types from zero-sized types. For example, internally, we can have a function like:
To keep the current behavior of mem::size_of, it can be implemented as: