-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Representation of tuples #12
Comments
TBH I think this is both the easiest and most expected answer. If #[repr(Rust)]
struct Foo(A, B) then something is seriously wrong. So, I would move that question about arrays to the |
I also lean towards "tuples are just structs", but there's a subtle consequence that I want to record because it has only now occurred to me: tuples have to have the same layout in all compilation units, so if tuples and structs are laid out identically that forces us to answer this question from #11 ...
... with the relatively restrictive choice (for the compiler) that all structs' layout must only depend on the ordered list of field types, not on anything else. That might very well be what we end up deciding anyway, but it's something to consider – we basically have to choose between struct-tuple layout compatibility and more advanced (e.g. PGO-based) layout optimizations for structs. |
I do not agree. I think of tuples as being sugar for using the following infinitely many types defined in libcore: struct Tuple0;
struct Tuple1<T0>(pub T0);
struct Tuple2<T0, T1>(pub T0, pub T1);
// and so on So, tuples are structs, but all tuples of the same length are instances of the same struct. |
Ok, good point, I think I misunderstood what you meant by " |
So I agree that tuples should align with some Rust struct, in the manner than @RalfJung defined here. It is plausible to imagine though that said struct might be (e.g.) In any case I think that the main job of this thread probably is to define precisely which Rust struct tuples map to. |
However, the array question becomes interesting then, because it'd mean that in struct Foo<T> {
field1: T,
field2: T,
} we guarantee that field1 is before field2. I do not see any harm in that, but it is something to be aware of. |
Accessing the first element of a struct through a pointer is (slightly) more efficient. I'd prefer to give that ordering option to the compiler, unless there is a clear need to put it under user control. |
Wouldn't this just mean that if we were to use PGO to optimize layout we would have to do so that the layout of layout-compatible structs and tuples match ? |
@ChrisJefferson Keep in mind that load and store instructions have generous immediate offsets, even in the most fantaically RISC architectures. Thus extra instructions or extra cycles are usually only required if you want to take the address of a field or if further accesses to the field require an addressing mode that the architecture doesn't support or which isn't as fast (e.g. @gnzlbg Access patterns tend to vary by struct, so different structurally-identical structs would probably have different layouts. But @RalfJung's reply resolved this anyway. |
One feature which would quasi-require tuples and structs to have different representations is rust-lang/rfcs#1582. That is, with tuples it is conceivable that given The logical justification for the different representations would be that tuple fields are ordered while struct fields are unordered, so just as it makes sense for 'dynamic' collections like (By "struct" here I mean "struct with named fields" throughout: tuple structs are an interesting in-between case which could conceivably go either way, but that question only becomes relevant as a follow-on.) |
The existence of tuple structs is a real thorn in the side of this reasoning, not just a follow-up: just as it would be extremely surprising for a tuple to behave differently from a tuple struct, it would be extremely surprising (and an efficency footgun) for a tuple struct to behave different from one with named fields. So far the difference is entirely stylistic, we even formalized the notion that tuple structs are just normal structs with fields names Combine that with other problems the representation proposed in that RFC has (the killer one for me is the need for a size/stride split which I don't believe can be introduced backwards compatibly) and I find it hard to take it seriously. |
Given two independent structs
If the answer is yes, then layout compatibility between structs, tuple structs, tuples, and arrays is not possible AFAICT. If the answer is no, then this is something that we are trading for being able to support layout compatibility across these constructs. |
For the record, there's a compromise solution: special casing structs and tuples with homogeneous contents (and making them equivalent to arrays) without guaranteeing much about the layout of either tuples or structs that have heterogeneous contents. |
Possibly silly question: Before specifying the layout of any tuple or struct, what are the usecases for wanting to be able to know the layout for an arbitary type? I can imagine some people want to control the layout for some specific type (to align with an existing in-memory layout of data). They can always use repr(C), or we could introduce other reprs. Once someone is fixed, it's fixed forever. I could imagine (for example) a profiler deciding if it should represent an (int8, int8, int8) as a 3 byte structure (which makes alignment trickier) or a 4 byte structure with padding (and then where put the padding?). |
You may not always control the type you want to do layout-dependent tricks on. Besides, in practice people make mistakes and transmute (see e.g. rust-lang/rust#50842) or type pun repr(Rust) types, all else being equal I'd rather provide a safety net for that than make it UB (especially since that's the sort of UB that miri generally can't identify). Reordering or padding tuples, especially those of bare primitives, based on profiling information seems extremely implausible to me. They are so widely used that you likely won't have a single dominant usage pattern. It's a more interesting question for nominal types, which may be used in very specific ways, but that belongs in #11 so let's move that discussion there (and once I have time I will elaborate over there on my other reasons for being skeptical of profile-guided layout optimizations). |
I had never thought of that. |
This is probably a better fit for discussion in #11 .. my guess is though that we will not =) |
It's notable that generics affect layouts here (due to unsized tuple coercion): // unsafe w.r.t. synchronization
unsafe fn dump<T: ?Sized>(p: &T) -> Vec<u8> {
use std::slice;
use std::mem;
let s = slice::from_raw_parts(p as *const T as *const u8, mem::size_of_val(p));
s.to_vec()
}
fn main() {
struct Foo(u16, u32);
println!("{:?}", unsafe { dump(&(1u16, 2u32)) }); // [1, 0, 0, 0, 2, 0, 0, 0]
println!("{:?}", unsafe { dump(&Foo(1u16, 2u32)) }); // [2, 0, 0, 0, 1, 0, 0, 0]
// Basic rule: large align to small align
// Exception 1: repr(C)
// Exception 2: enum tag
// Exception 3: the last element that can be a candidate for unsized coercion
} However, the difference seems removable. We can just apply the "last element rule" regardless of generics. |
This is a good point. I actually find this kind of appealing. I guess I just think that being able to "index" into a tuple is potentially quite useful, and I can't think of a strong motivation for laying out homogenous tuples in any other way (particularly if that layout must apply to all tuples). I suppose in theory one could imagine laying out some individual tuples differently than others (eg., if you saw from PGO that one particular tuple was used across threads and you wanted extra padding, and you knew somehow that references to this tuple didn't escape or get mixed with other tuples) but that seems quite over the top. I can't imagine us really ever doing that. |
I'm a fan of this. Bikeshed:
This will always fit under the as-if rule, I think. I do not think we should allow this to be observable by a conformant program. |
I'm still a little unclear why we want to give people the ability to specify how tuples and structs are laid out. If we want to specify it, I think something like There are numourous papers which show how profile-guided struct reordering can be a profitable optimisation. Here's one example: https://ieeexplore.ieee.org/abstract/document/7726757/ , here's another https://pdfs.semanticscholar.org/567e/a8bf671735f120faa615ea6313faaaf07084.pdf (which also does struct splitting, which is probably not acceptable to Rust) |
Follow-up bikeshed: Actually, perhaps we could use this for interop between two foreign types, in that two |
I'll echo @gnzlbg's statement in another thread: we're really not in the business of designing new language features here. |
I think we should not be in the business of designing new language features, but I think it's important to at least explore the viability of creating ones for addressing specific cases. We need to avoid a false dichotomy of "either we make this guarantee or this use case is unsupported". We can and should also consider "we can say that we don't make the guarantee, but suggest that if the use case is sufficiently important, it should have its own language feature". Adding a guarantee where none existed is a language feature, anyway. |
Maybe we should split those discussions somehow ? I find the threads get hard to follow once we mix them with exploratory work that goes in different directions :( |
I opened #31 which contains my attempt to summarize our tentative consensus here. Please give feedback! (In particular, let me know if there is anything you think is incorrect.) |
We've called final-comment-period on #31. |
Discussing how tuples are laid out.
Some things that might be useful if they were defined:
(T, T, ..., T)
tuple with a[T; N]
memory representation?The text was updated successfully, but these errors were encountered: