Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

question: why is Buf32 #[repr(packed)] #29

Closed
kontrafiktion opened this issue Nov 11, 2016 · 11 comments
Closed

question: why is Buf32 #[repr(packed)] #29

kontrafiktion opened this issue Nov 11, 2016 · 11 comments

Comments

@kontrafiktion
Copy link

kontrafiktion commented Nov 11, 2016

If I understand https://doc.rust-lang.org/nomicon/other-reprs.html#reprpacked correctly,
#[repr(packed)] forces byte alignment. And it sounds like this may be problematic:

most architectures strongly prefer values to be aligned.

I very new to Rust (and system level programming), but this sounds scary to me.

Could you explain, why the Buf32 should be packed?

#[repr(packed)]
pub struct Buf32<H> {
    pub ptr: *mut H,
    pub len: u32,
    pub cap: u32,
}
@SimonSapin
Copy link
Member

For Buf32 I think #[repr(packed)] may not be necessary. But it’s also not problematic, because the fields happen to not require any padding: laying them out in order without padding puts each field at an offset from the start of the struct that is a multiple of its size. (u32 fields are 32-bit-aligned.)

The Tendril struct also uses #[repr(packed)]. It is also OK there for the same reason. But it is critical there because, when a Tendril represents a string of 8 bytes or less, two consecutive 32-bit fields are treated like a single [u8; 8] buffer.

@kmcallister, could #[repr(packed)] be removed on Buf32?

@bluss
Copy link

bluss commented Nov 14, 2016

Using packed doesn't seem correct or necessary. It does technically allow you to run into UB in Rust (see rust issue 27060). It turns the struct into a minimum-alignment-1 struct, which is indeed incorrect.

It seems redundant, unless you have some very special use of Buf32? With its fields being of size 8, 4, 4 bytes (64-bit) and 4, 4, 4 bytes (32-bit), the struct already doesn't use any padding at all on common platforms.

@SimonSapin
Copy link
Member

@bluss Per rust-lang/rust#27060, packed can only cause UB when it forces fields to be unaligned which it doesn’t for these specific structs, right?

See my comment above: I think packed is not necessary on Buf32 but it is on Tendril.

@bluss
Copy link

bluss commented Nov 14, 2016

since align_of reports 1 for Buf32, it seems to me that it allows the fields to be unaligned at the whims of the compiler, it's instructed that the struct only needs byte alignment (which is not true).

@bluss
Copy link

bluss commented Nov 14, 2016

Even worse, generic custom code might ask align_of and place the value depending on that reported alignment, then it's very easy to produce UB (technically, but it's hard to demonstrate since x86 allows it anyway..).

@SimonSapin
Copy link
Member

Ok, I’m pretty sure we can safely remove packed from Buf32, but again I’m more worried about Tendril and its unsafe code linked above.

@bluss
Copy link

bluss commented Nov 14, 2016

I would use repr(C) instead, then fields are well aligned but you know that the fields are laid out in order, and then it's safe. I guess we are missing some representation alternatives here (fixed order but not C).

@SimonSapin
Copy link
Member

I suppose this unsafe code should probably use an union now that that exists, but we want this library to run on stable Rust.

So, yeah. In the meantime we should probably remove packed on Buf32 and replaced it with C on Tendril.

@bluss
Copy link

bluss commented Nov 23, 2016

I'm looking into fixing Tendril and Buf32 by using repr(C). @SimonSapin, do you know what the Header struct is for?

@SimonSapin
Copy link
Member

Non-inline tendrils (any tendril with more than 8 bytes of data) have a heap allocated chuck of memory that starts with a Header and continues with a bytes buffer that stores the string’s data. The header contains a refcount (which needs to be shared for reference-counted tendrils that share a buffer) and the capacity (size of the allocation, it’s there to keep size_of::<Tendril<_>>() smaller).

SimonSapin added a commit that referenced this issue Jun 29, 2017
SimonSapin added a commit that referenced this issue Jun 29, 2017
SimonSapin added a commit that referenced this issue Jun 29, 2017
@SimonSapin
Copy link
Member

#33 switches Tendril to #[repr(C)], but leaves #[repr(packed)] on Header to keep the C API working. (An to avoid wasting 4 bytes of padding between the header and data in heap allocations, though maybe that’s not significant.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants