Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify Box<T> representation and its use in FFI #62514

Merged
merged 8 commits into from
Dec 12, 2019

Conversation

stephaneyfx
Copy link
Contributor

This officializes what was only shown as a code example in the unsafe code guidelines and follows the discussion in the corresponding repository.

It is also related to the issue regarding marking Box<T> #[repr(transparent)].

If the statement this PR adds is incorrect or a more in-depth discussion is warranted, I apologize. Should it be the case, the example in the unsafe code guidelines should be amended and some document should make it clear that it is not sound/supported.

@rust-highfive
Copy link
Collaborator

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 9, 2019
@@ -63,6 +63,27 @@
//! T` obtained from `Box::<T>::into_raw` may be deallocated using the
//! [`Global`] allocator with `Layout::for_value(&*value)`.
//!
//! `Box<SomethingSized>` has the same representation as `*mut SomethingSized`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They would always have the same representation, not just if T is Sized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, sorry. I was just so focused on the FFI aspect of it that started this PR in the first place that I phrased that poorly. A fix is on the way. Thank you!

@stephaneyfx stephaneyfx changed the title State that Box<SomethingSized> and *mut SomethingSized have the same representation Clarify Box<T> representation and its use in FFI Jul 9, 2019
@Centril Centril added needs-fcp This change is insta-stable, so needs a completed FCP to proceed. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jul 9, 2019
Copy link
Contributor Author

@stephaneyfx stephaneyfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably change the parameter type of bar to Option<Box<Foo>> so that the Rust code can handle C code calling this function with a null pointer, which would lead to UB. I will update the PR tonight.

src/liballoc/boxed.rs Outdated Show resolved Hide resolved
This officializes what was only shown as a code example in [the unsafe code guidelines](https://rust-lang.github.io/unsafe-code-guidelines/layout/function-pointers.html?highlight=box#use) and follows [the discussion](rust-lang/unsafe-code-guidelines#157) in the corresponding repository.

It is also related to [the issue](rust-lang#52976) regarding marking `Box<T>` `#[repr(transparent)]`.
@dtolnay
Copy link
Member

dtolnay commented Jul 12, 2019

I am on board with documenting this as a guarantee. Let's get a set of lang team eyes as well:
r? @joshtriplett

@Alexendoo
Copy link
Member

Ping from triage @joshtriplett, any updates?

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 24, 2019

This officializes what was only shown as a code example in the unsafe code guidelines and follows the discussion in the corresponding repository.

There is a PR to propose guaranteeing this as part of the UCGs RFC, but that PR has not been merged yet: rust-lang/unsafe-code-guidelines#164

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 24, 2019

cc @rust-lang/lang

@stephaneyfx
Copy link
Contributor Author

rust-lang/unsafe-code-guidelines#164 is more general and does not address Box in particular. Given that the field of Box is private, other fields could theoretically be added (however unlikely it is) and so this UCG RFC is not enough for users to rely on the representation of Box being the same as a C pointer. I think that this PR makes a stronger and straightforward commitment FFI writers can rely on.

@joshtriplett
Copy link
Member

joshtriplett commented Jul 25, 2019

Sorry for the delayed review.

I have a concern here regarding semantics, rather than representation.

If you actually pass a Box<T> to an external function, that would pass ownership of it, so Rust won't know about it anymore and won't free it, so does that work for giving over ownership of freeing it to C? And conversely, if C allocates something and your Rust allocator can free C objects, can you accept a return value of Box<T> to take ownership of that heap object?

(Also, this should not work for any T with a destructor.)

This seems consistent with the language you've used here. I just want to raise the point explicitly and ask if this will actually work in all cases. If it will, then by all means let's specify this, and it seems like a very useful way to specify ownership transfer of a heap object.

@Centril Centril removed I-nominated needs-fcp This change is insta-stable, so needs a completed FCP to proceed. labels Jul 25, 2019
@joshtriplett
Copy link
Member

@rfcbot merge

Based on discussion in the lang team meeting, we're confident that this will work, and does indeed represent ownership transfer of a heap object across an FFI boundary.

@rfcbot
Copy link

rfcbot commented Jul 25, 2019

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 25, 2019
@nikomatsakis
Copy link
Contributor

(Tagging @pnkfelix and @withoutboats as they are on leave.)

Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from a grammatical nit, these changes look good! However, we may want to address the point that @gnzlbg raised, where using Rust types for functions defined in C can (in subtle cases where things are compiled with PGO etc) cause UB. Do you want me to take a stab at how to word that?

@@ -63,6 +63,50 @@
//! T` obtained from `Box::<T>::into_raw` may be deallocated using the
//! [`Global`] allocator with `Layout::for_value(&*value)`.
//!
//! So long as `T: Sized`, a `Box<T>` is guaranteed to be represented as a
//! single pointer and is also ABI-compatible with C pointers (i.e. the C type
//! `T*`). This means that you have Rust code which passes ownership of a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//! `T*`). This means that you have Rust code which passes ownership of a
//! `T*`). This means that you can have Rust code which passes ownership of a

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you!

//! single pointer and is also ABI-compatible with C pointers (i.e. the C type
//! `T*`). This means that you have Rust code which passes ownership of a
//! `Box<T>` to C code by using `Box<T>` as the type on the Rust side, and
//! `T*` as the corresponding type on the C side. As an example, consider this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So at this point, @gnzlbg has convinced me that there is one remaining issue that we may want to address here. In particular, it seems to me that it is only safe to use Rust types if all callers to the function obey the Rust restrictions (e.g., non-null, valid, etc). The easiest way to express that is to say that it is safe to use Rust in types in cases where the function is defined in Rust, but not in cases where the function is defined in C. I'm debating how much detail to go into here and where.

@stephaneyfx
Copy link
Contributor Author

@nikomatsakis Yes, if you don't mind. I am sure your wording will be more accurate than mine. Thank you.

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 11, 2019

@nikomatsakis I think you correctly understood the issue.

This implies to me that it is not sound to use a Rust prototype that is more narrow than the types at the definition site. So if the function is defined in C, in other words, you had best use Raw pointers and match the definition precisely.

However, I believe it is sound (if suboptimal) to use Rust types if the same types are used at the definition. So, in the example given, where the functions are defined in Rust, using types like Box, but the C headers are using raw pointers, everything is fine.

In C, the behavior of programs containing prototypes that do not exactly match the definition is undefined. When using Rust with xLTO, the only thing that should matter are the semantics of LLVM-IR, but AFAIK these semantics are unfortunately not written anywhere and LLVM-IR does not guarantee yet that declarations that are less narrow than definitions are ok. With features like xLTO it can be very hard for declarations to precisely match definitions at the LLVM-IR level, so I also think that allowing these use cases would make sense. Maybe these issues should be brought up with the LLVM community?

@nikomatsakis
Copy link
Contributor

@gnzlbg in my opinion, this feels like an LLVM bug, I do think it'd be good to raise with them. However, for the time being we can at least advise people to only use Rust types for functions that are defined in Rust.

@nikomatsakis
Copy link
Contributor

OK, I pushed some updates notes. I'm inclined to r+ now but I'll give it a bit to see if anyone has any suggestions on my wording.

Copy link
Contributor

@gnzlbg gnzlbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM!

@nikomatsakis
Copy link
Contributor

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Dec 11, 2019

📌 Commit fafa489 has been approved by nikomatsakis

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 11, 2019
@RalfJung
Copy link
Member

LGTM as well.

The concern about C prototypes also applies to &/&mut in FFI, so probably those should get the same warning?

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 11, 2019

The concern about C prototypes also applies to &/&mut in FFI, so probably those should get the same warning?

Yes, I think so, and NonNull should get this warning as well.

@RalfJung
Copy link
Member

@gnzlbg do you plan to write a PR, or should we open an issue to not forget about this?

@gnzlbg
Copy link
Contributor

gnzlbg commented Dec 11, 2019

@RalfJung I've opened #67234

JohnTitor added a commit to JohnTitor/rust that referenced this pull request Dec 12, 2019
Clarify `Box<T>` representation and its use in FFI

This officializes what was only shown as a code example in [the unsafe code guidelines](https://rust-lang.github.io/unsafe-code-guidelines/layout/function-pointers.html?highlight=box#use) and follows [the discussion](rust-lang/unsafe-code-guidelines#157) in the corresponding repository.

It is also related to [the issue](rust-lang#52976) regarding marking `Box<T>` `#[repr(transparent)]`.

If the statement this PR adds is incorrect or a more in-depth discussion is warranted, I apologize. Should it be the case, the example in the unsafe code guidelines should be amended and some document should make it clear that it is not sound/supported.
bors added a commit that referenced this pull request Dec 12, 2019
Rollup of 8 pull requests

Successful merges:

 - #62514 (Clarify `Box<T>` representation and its use in FFI)
 - #66983 (Fix `unused_parens` triggers on macro by example code)
 - #67215 (Fix `-Z print-type-sizes`'s handling of zero-sized fields.)
 - #67230 (Remove irelevant comment on `register_dtor`)
 - #67236 (resolve: Always resolve visibilities on impl items)
 - #67237 (Some small readability improvements)
 - #67238 (Small std::borrow::Cow improvements)
 - #67239 (Make TinyList::remove iterate instead of recurse)

Failed merges:

r? @ghost
@bors
Copy link
Contributor

bors commented Dec 12, 2019

☔ The latest upstream changes (presumably #67246) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Dec 12, 2019
@bors bors merged commit fafa489 into rust-lang:master Dec 12, 2019
//! pub struct Foo;
//!
//! #[no_mangle]
//! pub extern "C" fn foo_new() -> Box<Foo> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually a function like foo_new() requires that the returned pointer be destroyed by calling the foo_delete() function. Returning a Box is almost always the wrong thing to do here since Box doesn't enforce that foo_delete is called when the Box is dropped. So this example is documenting a pattern that is usually the wrong thing to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@briansmith this is a long-merged PR, so new comments here are bound to get lost... please open an issue if you think this needs to be tracked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a Box is almost always the wrong thing to do here since Box doesn't enforce that foo_delete is called when the Box is dropped.

It's no different than "regular" FFI code where one returns raw pointer and requires the caller to call the corresponding _delete function with the same pointer, but returning & accepting Box better documents semantics.

If you open a new issue, please cc me, this is a pattern close to my heart :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.