Add a pod language item and marker trait. #1387

Closed
wants to merge 2 commits into from
195 changes: 195 additions & 0 deletions text/0000-pod.md
@@ -0,0 +1,195 @@
- Feature Name: pod
- Start Date: 2015-11-30
- RFC PR:
- Rust Issue:

# Summary
[summary]: #summary

Add a `pod` language item and marker trait. The trait can only be implemented by
a subset of all types and identifies objects that are valid when they contain
arbitrary bit patterns.

# Motivation
[motivation]: #motivation

Several fairly common FFI operations cannot be expressed in safe code even though
they are completely safe. As a result, users reimplement the necessary functions
for these operations themselves, often incorrectly. For example, the following
piece of code attempts to write a sequence of `u32` with the `Write` trait:

```rust
let to_write = unsafe { mem::transmute::<&[u32], &[u8]>(&self.data) };
self.file.write(to_write);
```

This code is incorrect: the transmute leaves the element count unchanged, so the
`to_write` slice has the same number of elements as the original slice instead
of four times as many.
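
For contrast, here is a minimal sketch of a conversion that scales the length correctly using today's standard library. The helper name `u32s_as_bytes` is hypothetical and is not part of this RFC. It only illustrates the manual size calculation that users currently have to get right themselves.

```rust
use std::mem;
use std::slice;

/// Reinterprets a `u32` slice as the bytes that back it.
/// `u32s_as_bytes` is a hypothetical helper, not an API proposed by the RFC.
fn u32s_as_bytes(data: &[u32]) -> &[u8] {
    // `size_of_val` yields the byte length of the whole slice (len * 4).
    let byte_len = mem::size_of_val(data);
    // SAFETY: `u8` has alignment 1, every byte of a `u32` is initialized,
    // and the result borrows `data`, so it cannot outlive the original.
    unsafe { slice::from_raw_parts(data.as_ptr() as *const u8, byte_len) }
}
```

With a `Pod` bound, a function of this shape could be offered once, as safe code, instead of being rewritten at each call site.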

Consider the following similar case:

```rust
pub struct linger {
    pub l_onoff: c_int,
    pub l_linger: c_int,
}

impl Socket {
    pub fn linger(&self) -> Result<linger> {
        let mut linger = mem::zeroed();
        try!(getsockopt(self.fd, SO_LINGER, mem::as_mut_bytes(&mut linger)));
        Ok(linger)
    }
}
```

This function uses the generic `getsockopt` system call to retrieve the *linger*
setting of a socket:

```rust
fn getsockopt(sockfd: c_int, optname: c_int, optval: &mut [u8]) -> Result;
```

The `linger` function cannot be written today because there is no
`mem::as_mut_bytes` function. Instead, the user would most likely create a slice
ad hoc with unsafe functions and manual size calculations.
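
The ad hoc approach might look roughly like the sketch below. `Linger`, `fake_getsockopt`, and `read_linger` are illustrative stand-ins (no real system call is made), and the struct is assumed to be `#[repr(C)]` with no padding.

```rust
use std::mem;
use std::os::raw::c_int;
use std::slice;

#[repr(C)]
pub struct Linger {
    pub l_onoff: c_int,
    pub l_linger: c_int,
}

/// Stand-in for a `getsockopt` wrapper: fills the buffer with two `c_int`
/// values (1 and 30) in native byte order. Purely illustrative.
fn fake_getsockopt(optval: &mut [u8]) {
    let n = mem::size_of::<c_int>();
    optval[..n].copy_from_slice(&(1 as c_int).to_ne_bytes());
    optval[n..2 * n].copy_from_slice(&(30 as c_int).to_ne_bytes());
}

fn read_linger() -> Linger {
    let mut linger = Linger { l_onoff: 0, l_linger: 0 };
    // Manual size calculation: easy to get wrong when written by hand.
    let len = mem::size_of::<Linger>();
    // SAFETY: `Linger` consists of two `c_int` fields without padding, so
    // every byte is initialized and any bit pattern is a valid value.
    let bytes = unsafe {
        slice::from_raw_parts_mut(&mut linger as *mut Linger as *mut u8, len)
    };
    fake_getsockopt(bytes);
    linger
}
```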

Consider the case of C unions:

Various proposals have been made for how to express C unions in Rust code. The
most recent proposal suggested restricting the types in such unions to `Copy` in
order to avoid various sources of unsafety. However, such unions would still
have to be `unsafe` since, for example, `&u8` is `Copy` but cannot contain
arbitrary data.

Note that C unions only ever contain plain old data (as defined below) since all
possible C types are plain old data. Hence, if the Rust equivalent of C unions
were restricted to plain old data, using it would be completely safe.†

(† Some questions regarding the existence of `undef` fields in large variants
remain.)

Lastly, a plain old data type allows for the following simple implementation of
a random value generator:

```rust
pub trait RandomValueGenerator {
    fn generate_random_bytes(&mut self, bytes: &mut [u8]);

    fn generate<T: Pod>(&mut self) -> T {
        let mut t = mem::zeroed();
        self.generate_random_bytes(mem::as_mut_bytes(&mut t));
        t
    }
}
```

## Various convenience functions using plain old data

This section contains a list of useful functions that can be written with a
plain old data trait.

```rust
/// Creates an object that has all bytes set to zero.
fn zeroed<T: Pod>() -> T;
```

```rust
/// Returns the mutable in-memory representation of an object.
fn as_mut_bytes<T: Pod>(val: &mut T) -> &mut [u8];
```

```rust
/// Turns a slice into a reference to a Pod type if it's suitable.
///
/// = Remarks
///
/// The buffer is suitable if it is large enough to hold the type and properly
/// aligned.
pub fn from_bytes<T: Pod>(buf: &[u8]) -> Option<&T>;

pub fn from_mut_bytes<T: Pod>(buf: &mut [u8]) -> Option<&mut T>;
```

```rust
impl<T> [T] {
    /// Returns a mutable byte slice covering the same range as the slice.
    pub fn as_mut_bytes(&mut self) -> &mut [u8] where T: Pod;
}
```
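
As a rough illustration, the signatures above could be backed by bodies like the following. This is a sketch, not the RFC's implementation: the `Pod` trait here is a local stand-in declared `unsafe` (there is no lang item to check implementors), and it is implemented only for two padding-free primitives.

```rust
use std::mem;
use std::slice;

/// Local stand-in for the proposed marker trait (an assumption of this sketch).
unsafe trait Pod: Sized {}
unsafe impl Pod for u8 {}
unsafe impl Pod for u32 {}

/// Creates an object that has all bytes set to zero.
fn zeroed<T: Pod>() -> T {
    // SAFETY: every bit pattern, including all zeros, is valid for a Pod type.
    unsafe { mem::zeroed() }
}

/// Returns the mutable in-memory representation of an object.
fn as_mut_bytes<T: Pod>(val: &mut T) -> &mut [u8] {
    // SAFETY: the implementors above have no padding, and any bytes written
    // through the slice still form a valid `T`.
    unsafe { slice::from_raw_parts_mut(val as *mut T as *mut u8, mem::size_of::<T>()) }
}

/// Turns a slice into a reference to a Pod type if it is large enough and
/// properly aligned.
fn from_bytes<T: Pod>(buf: &[u8]) -> Option<&T> {
    let ptr = buf.as_ptr();
    if buf.len() < mem::size_of::<T>() || (ptr as usize) % mem::align_of::<T>() != 0 {
        return None;
    }
    // SAFETY: size and alignment were checked, and any bit pattern is valid.
    Some(unsafe { &*(ptr as *const T) })
}
```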

# Detailed design
[design]: #detailed-design

Add a `pod` lang item and safe `Pod` trait to mark types which are valid when
they contain arbitrary bit patterns.

The following types are **Pod candidates**:

* `u8`, `u16`, `u32`, `u64`, `usize`,
* `i8`, `i16`, `i32`, `i64`, `isize`,
* `f32`, `f64`,
* raw pointers,
* arrays of `Pod` types,
* tuples of `Pod` types, and
* structs where all fields are public and `Pod`.

Only types that are Pod candidates can implement the `Pod` trait. Arrays
of `Pod` types and tuples of `Pod` types automatically implement the `Pod`
trait.
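
The candidate list can be approximated in library code as follows. This is only a sketch under stated assumptions: `Pod` is a local `unsafe` stand-in trait rather than a lang item, tuples are covered for arity two only, and `Point` and `is_pod` are hypothetical names used for illustration.

```rust
/// Local stand-in for the proposed marker trait (an assumption of this sketch).
unsafe trait Pod {}

// Integer and floating-point candidates.
macro_rules! impl_pod {
    ($($t:ty),*) => { $(unsafe impl Pod for $t {})* };
}
impl_pod!(u8, u16, u32, u64, usize, i8, i16, i32, i64, isize, f32, f64);

// Raw pointers.
unsafe impl<T> Pod for *const T {}
unsafe impl<T> Pod for *mut T {}

// Arrays and tuples of Pod types implement the trait automatically.
unsafe impl<T: Pod, const N: usize> Pod for [T; N] {}
unsafe impl<A: Pod, B: Pod> Pod for (A, B) {}

// A struct whose fields are all public and Pod may opt in.
pub struct Point {
    pub x: i32,
    pub y: i32,
}
unsafe impl Pod for Point {}

/// Compiles only when `T` is Pod; `is_pod::<&u8>()` would be rejected.
fn is_pod<T: Pod>() -> bool {
    true
}
```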

Structs with `Drop` implementations are not Pod candidates since the drop flag
is considered a private field.

Review comment:

Also because Pod implies Copy (just like Copy implies Clone), and Copy types are never Drop.

Reply (contributor, RFC author):

Whether `trait Pod: Copy { }` holds is not dictated by this RFC. In lrs, `Pod: Copy`, but `Copy` does not imply `Clone`.

## Justification of the set of Pod candidates

It is easy to see that the list above contains only plain old data types by
applying the following criterion recursively:

>If an object containing an arbitrary bit pattern can be constructed in safe
>code today, then the type of the object is a plain old data type.†

(† This excludes padding between struct fields.)

It is also clear that references (including slices) and enums are not plain old
data types. The case of structs with `Drop` implementations or hidden fields is
somewhat harder. First of all, consider the following type:

```rust
struct Slice<T> {
    ptr: *const T,
    len: usize,
}
```

The inherent methods of this type will likely make use of the fact that the user
cannot modify the `len` field in order to guarantee safety. Since the compiler
cannot know if the private fields must satisfy certain invariants, an
implementation of the safe (!) `Pod` trait cannot be allowed for such types.
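
To make the concern concrete, the following sketch shows byte-level access defeating an invariant that a module protects through a private field. `SmallIndex`, `LIMIT`, `raw_bytes_mut`, and `break_invariant` are hypothetical names. `raw_bytes_mut` is marked `unsafe` here and stands in for what a safe `as_mut_bytes` would allow if the type could implement `Pod`.

```rust
mod checked {
    pub const LIMIT: usize = 4;

    /// An index that its constructor guarantees to be below `LIMIT`.
    pub struct SmallIndex {
        value: usize, // private: callers cannot set it directly
    }

    impl SmallIndex {
        pub fn new(value: usize) -> Option<SmallIndex> {
            if value < LIMIT { Some(SmallIndex { value }) } else { None }
        }

        pub fn get(&self) -> usize {
            self.value
        }
    }
}

/// Byte view of any sized value. Callers must ensure `T` has no padding and
/// tolerates arbitrary bit patterns.
unsafe fn raw_bytes_mut<T>(val: &mut T) -> &mut [u8] {
    unsafe { std::slice::from_raw_parts_mut(val as *mut T as *mut u8, std::mem::size_of::<T>()) }
}

fn break_invariant() -> usize {
    let mut idx = checked::SmallIndex::new(0).unwrap();
    // With a safe `Pod` impl for `SmallIndex`, this would need no `unsafe`.
    unsafe { raw_bytes_mut(&mut idx) }.fill(0xff);
    idx.get() // no longer below LIMIT
}
```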

Review comment:

I don't agree with the reasoning with respect to allowing safe Pod for types with private internals. If a struct is declared Pod, then it's an error to assume that there are private invariants. (A struct might want to keep its internals private for other reasons, such as being able to rename the fields without breaking the API.)

Second of all, note that `Pod` implies `Copy` with the set of safe functions
suggested above:

```rust
let mut a: T = /* ... */;
let mut b: T = mem::zeroed();
memcpy(mem::as_mut_bytes(&mut b), mem::as_mut_bytes(&mut a));
```

Since `Copy` types cannot be `Drop`, the same restriction should be applied to
`Pod`, even if the drop flag is not stored directly in the object.
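
The duplication argument can be sketched as a single generic function. As before, `Pod` is a local `unsafe` stand-in trait and `duplicate` is a hypothetical name. The point is that the body never requires `T: Copy`, yet the caller ends up with two independent values.

```rust
use std::{mem, ptr};

/// Local stand-in for the proposed marker trait (an assumption of this sketch).
unsafe trait Pod: Sized {}
unsafe impl Pod for u64 {}

/// Produces a second value with the same bytes as `src`, without `T: Copy`.
fn duplicate<T: Pod>(src: &T) -> T {
    // SAFETY: all-zero bytes are a valid value of a Pod type.
    let mut dst: T = unsafe { mem::zeroed() };
    // SAFETY: both pointers are valid for `size_of::<T>()` bytes and refer
    // to distinct objects, so the ranges do not overlap.
    unsafe {
        ptr::copy_nonoverlapping(
            src as *const T as *const u8,
            &mut dst as *mut T as *mut u8,
            mem::size_of::<T>(),
        );
    }
    dst
}
```

If `T` had a `Drop` implementation, both the original and the duplicate would run it, which is why the restriction mirrors the one on `Copy`.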

# Drawbacks
[drawbacks]: #drawbacks

None known.

# Alternatives
[alternatives]: #alternatives

`Pod` could be unsafe to allow structs with private fields to be `Pod`.

# Unresolved questions
[unresolved]: #unresolved-questions

How LLVM treats uninitialized bytes in struct padding and unions.