Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amend Rc/Arc::from_raw() docs regarding unsafety #68099

Merged
merged 3 commits into from
Mar 22, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 18 additions & 5 deletions src/liballoc/rc.rs
Original file line number Diff line number Diff line change
Expand Up @@ -259,6 +259,10 @@ use crate::vec::Vec;
#[cfg(test)]
mod tests;

// This is repr(C) to future-proof against possible field-reordering, which
// would interfere with otherwise safe [into|from]_raw() of transmutable
// inner types.
#[repr(C)]
struct RcBox<T: ?Sized> {
strong: Cell<usize>,
weak: Cell<usize>,
Expand Down Expand Up @@ -570,15 +574,24 @@ impl<T: ?Sized> Rc<T> {
ptr
}

/// Constructs an `Rc` from a raw pointer.
/// Constructs an `Rc<T>` from a raw pointer.
///
/// The raw pointer must have been previously returned by a call to a
/// [`Rc::into_raw`][into_raw].
/// The raw pointer must have been previously returned by a call to
/// [`Rc<U>::into_raw`][into_raw] where `U` must have the same size
/// and alignment as `T`. This is trivially true if `U` is `T`.
Comment on lines +579 to +581
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is same size and align strong enough? What stops rustc from ordering the strong and weak fields of RcBox and ArcInner in a different order for different T?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, nothing, but we know today that rustc will not do so I believe. I continue to believe that it would be a good idea to add #[repr(C)] or something like it to RcBox and ArcInner, since there should be no reason to not guarantee (internally at least) that the field layout is consistent.

Notably, the data field must be last, due to unsizing, and reodering two usize fields is not currently ever going to happen anyway, I believe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reordering two usize fields is not currently ever going to happen anyway, I believe

Currently it doesn't, but rust-lang/unsafe-code-guidelines#36 is about whether to make this a guarantee, and I don't think it's decided yet. For example PGO-based field layout would absolutely reorder fields with the same size and align to keep frequently read data together and frequently written data in a different cache line.

I don't necessarily intend on blocking this PR on this, but I would feel more comfortable with #[repr(C)] on those types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be in favor of adding #[repr(C)] as well. Doing so in this PR makes sense to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended RcBox and ArcInner with repr(C)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add a comment to those repr(C) annotations that notes that they're not for C-compat, but strictly so that the layout is equivalent for transmutable types (or so)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

force-pushed

/// Note that if `U` is not `T` but has the same size and alignment, this is
/// basically like transmuting references of different types. See
/// [`mem::transmute`][transmute] for more information on what
/// restrictions apply in this case.
///
/// This function is unsafe because improper use may lead to memory problems. For example, a
/// double-free may occur if the function is called twice on the same raw pointer.
/// The user of `from_raw` has to make sure a specific value of `T` is only
/// dropped once.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean? Isn’t Rc::drop the one responsible for dropping T when the refcount reaches zero?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user is responsible for not calling from_raw (and allowing the returned value to drop) "too many times" is what this is getting at. I want to avoid saying something like "only once," since it can be incredibly useful to keep around a single pointer in FFI code and "clone out" Rc's from it, for example.

Copy link
Contributor

@SimonSapin SimonSapin Feb 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see what you’re getting at but I feel this one sentence does not express it well. Maybe we could talk about how each Rc or Arc “owns” one strong reference, and into_raw/from_raw transfer ownership? I think that’s a better mental model, and double-drop is just one of the possible (bad) consequences.

(By the way I think there should be an alternative to from_raw that does not transfer ownership of the strong reference as a less fragile alternative to ManuallyDrop::new(Rc::from_raw(ptr)), but that’s out of scope for this PR. Maybe &*mut T -> &Rc<T>?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on both points. I'm not sure of the exact wording, but maybe @lukaslueg can come up with some draft wording?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little hesitant on this point, for fear of convolution.

Calling into_raw() / from_raw() does not modify the internal reference counter, cloning and dropping the resulting Rc<T> does. One can think of these functions as temporarily lending exactly one "unit of ownership" to the raw pointer. The compiler can't check unique ownership through raw pointers, though, especially as all raw pointers are Copy. The user has to manually ensure T is always dropped exactly once, which is not the case if (among other things) T is dropped through the raw pointer or if from_raw() is called multiple times on inadvertently made copies of the raw pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mark-Simulacrum any thoughts on the previous comment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's certainly more detailed :)

However, I think it should clarify that even if you know T is safe to drop lots of times (e.g., () unit type), it is still not okay to let Rc drop more than once for the "last" time.

///
/// This function is unsafe because improper use may lead to memory unsafety,
/// even if the returned `Rc<T>` is never accessed.
///
/// [into_raw]: struct.Rc.html#method.into_raw
/// [transmute]: ../../std/mem/fn.transmute.html
///
/// # Examples
///
Expand Down
23 changes: 18 additions & 5 deletions src/liballoc/sync.rs
Original file line number Diff line number Diff line change
Expand Up @@ -270,6 +270,10 @@ impl<T: ?Sized + fmt::Debug> fmt::Debug for Weak<T> {
}
}

// This is repr(C) to future-proof against possible field-reordering, which
// would interfere with otherwise safe [into|from]_raw() of transmutable
// inner types.
#[repr(C)]
struct ArcInner<T: ?Sized> {
strong: atomic::AtomicUsize,

Expand Down Expand Up @@ -550,15 +554,24 @@ impl<T: ?Sized> Arc<T> {
ptr
}

/// Constructs an `Arc` from a raw pointer.
/// Constructs an `Arc<T>` from a raw pointer.
///
/// The raw pointer must have been previously returned by a call to a
/// [`Arc::into_raw`][into_raw].
/// The raw pointer must have been previously returned by a call to
/// [`Arc<U>::into_raw`][into_raw] where `U` must have the same size and
/// alignment as `T`. This is trivially true if `U` is `T`.
/// Note that if `U` is not `T` but has the same size and alignment, this is
/// basically like transmuting references of different types. See
/// [`mem::transmute`][transmute] for more information on what
/// restrictions apply in this case.
///
/// This function is unsafe because improper use may lead to memory problems. For example, a
/// double-free may occur if the function is called twice on the same raw pointer.
/// The user of `from_raw` has to make sure a specific value of `T` is only
/// dropped once.
///
/// This function is unsafe because improper use may lead to memory unsafety,
/// even if the returned `Arc<T>` is never accessed.
///
/// [into_raw]: struct.Arc.html#method.into_raw
/// [transmute]: ../../std/mem/fn.transmute.html
///
/// # Examples
///
Expand Down