Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added From<Vec<NonZeroU8>> for CString #64069

Conversation

danielhenrymantilla
Copy link
Contributor

@danielhenrymantilla danielhenrymantilla commented Sep 1, 2019

Added a From<Vec<NonZeroU8>> impl for CString

Rationale

  • CString::from_vec_unchecked is a subtle function, that makes unsafe code harder to audit when the generated Vec's creation is non-trivial. This impl allows to write safer unsafe code thanks to the very explicit semantics of the Vec<NonZeroU8> type.

  • One such situation is when trying to .read() a CString, see issue Reading a CString safely without overhead from Read #59229.

    • this lead to a PR: Implement CString::from_reader #59314, that was closed for being too specific / narrow (it only targetted being able to .read() a CString, when this pattern could have been generalized).

    • the issue suggested another route, based on From<Vec<NonZeroU8>>, which is indeed a less general and more concise code pattern.

  • quoting @Shnatsel:

    • For me the main thing about making this safe is simplifying auditing - people have spent like an hour looking at just this one unsafe block in libflate because it's not clear what exactly is unchecked, so you have to look it up when auditing anyway. This has distracted us from much more serious memory safety issues the library had.
      Having this trivial impl in stdlib would turn this into safe code with compiler more or less guaranteeing that it's fine, and save anyone auditing the code a whole lot of time.

@rust-highfive
Copy link
Collaborator

r? @withoutboats

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 1, 2019
src/libstd/ffi/c_str.rs Outdated Show resolved Hide resolved
@rust-highfive
Copy link
Collaborator

The job mingw-check of your PR failed (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2019-09-01T17:30:24.9862540Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-09-01T17:30:25.6788628Z ##[command]git config gc.auto 0
2019-09-01T17:30:25.6791906Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-09-01T17:30:25.6795190Z ##[command]git config --get-all http.proxy
2019-09-01T17:30:25.6799980Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/64069/merge:refs/remotes/pull/64069/merge
---
2019-09-01T17:36:38.0473105Z     Checking backtrace v0.3.35
2019-09-01T17:36:43.2265818Z error: impl has missing stability attribute
2019-09-01T17:36:43.2267576Z    --> src/libstd/ffi/c_str.rs:722:1
2019-09-01T17:36:43.2268155Z     |
2019-09-01T17:36:43.2268657Z 722 | / impl From<Vec<NonZeroU8>> for CString {
2019-09-01T17:36:43.2269186Z 723 | |     /// Converts a [`Vec`]`<`[`NonZeroU8`]`>` into a [`CString`] without
2019-09-01T17:36:43.2269717Z 724 | |     /// copying nor checking for inner null bytes.
2019-09-01T17:36:43.2270910Z ...   |
2019-09-01T17:36:43.2271338Z 793 | |     }
2019-09-01T17:36:43.2271800Z 794 | | }
2019-09-01T17:36:43.2272242Z     | |_^
---
2019-09-01T17:36:43.4492110Z == clock drift check ==
2019-09-01T17:36:43.4512422Z   local time: Sun Sep  1 17:36:43 UTC 2019
2019-09-01T17:36:43.6012744Z   network time: Sun, 01 Sep 2019 17:36:43 GMT
2019-09-01T17:36:43.6017002Z == end clock drift check ==
2019-09-01T17:36:50.5865241Z ##[error]Bash exited with code '1'.
2019-09-01T17:36:50.5899521Z ##[section]Starting: Checkout
2019-09-01T17:36:50.5902055Z ==============================================================================
2019-09-01T17:36:50.5902124Z Task         : Get sources
2019-09-01T17:36:50.5902167Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@scottmcm scottmcm added the needs-fcp This change is insta-stable, so needs a completed FCP to proceed. label Sep 2, 2019
@Centril Centril added relnotes Marks issues that should be documented in the release notes of the next release. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Sep 3, 2019
@Centril Centril added this to the 1.39 milestone Sep 3, 2019
@clarfonthey
Copy link
Contributor

Perhaps there could be a method to return the nonzero bytes as a slice as well?

@Shnatsel
Copy link
Member

Shnatsel commented Sep 3, 2019

Perhaps there could be a method to return the nonzero bytes as a slice as well?

Sounds reasonable to me. That should be its own PR though, as that can be feature-gated for some time while trait implementations are instantly stable.

@JohnCSimon
Copy link
Member

Ping from triage
@Centril @withoutboats This PR is still waiting on review.
CC @danielhenrymantilla

Thanks.

Copy link
Contributor

@Centril Centril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just some nits...

/// - for any `cap: usize`, `Layout<[T; cap]>` needs to be equal to
/// `Layout<[U; cap]>` (for the allocator)
#[inline]
unsafe
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's with the line break here?

Copy link
Contributor Author

@danielhenrymantilla danielhenrymantilla Sep 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops, my own Rust style leaked here (Out of topic: I see pub and unsafe, among others, as markers, hence the logic of them being on an extra line rather than on the same line, in the same vein as other #[meta] attributes)

#[inline]
unsafe
fn transmute_vec<T, U>(v: Vec<T>) -> Vec<U> {
// necessary conditions for `Layout<[T; N]> == Layout<[U; N]>`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not check Layout::array::<T / U>().unwrap() against each other?

/// `[T; length]` and `[U; length]`.
///
/// - for any `cap: usize`, `Layout<[T; cap]>` needs to be equal to
/// `Layout<[U; cap]>` (for the allocator)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// `Layout<[U; cap]>` (for the allocator)
/// `Layout<[U; cap]>` (for the allocator).

src/libstd/ffi/c_str.rs Outdated Show resolved Hide resolved
src/libstd/ffi/c_str.rs Outdated Show resolved Hide resolved
src/libstd/ffi/c_str.rs Outdated Show resolved Hide resolved
src/libstd/ffi/c_str.rs Outdated Show resolved Hide resolved
src/libstd/ffi/c_str.rs Outdated Show resolved Hide resolved
src/libstd/ffi/c_str.rs Outdated Show resolved Hide resolved
@Centril
Copy link
Contributor

Centril commented Sep 14, 2019

r? @RalfJung for checking the safety argument more in-depth. Also cc @rkruppe.

(Do ping T-libs once that review is done...)

Copy link
Contributor

@hanna-kruppe hanna-kruppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Soundness argument seems fine to me, just some nits.

//
// - `v` cannot contain null bytes, given the type-level
// invariant of `NonZeroU8` (this would still apply even if
// this invariant was a safety invariant and not a validity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: strike the hypothetical. I am reasonably sure you do need the safety invariant here. Most operations on vectors don't assert validity of any elements except perhaps the ones they move into or out of the vec. Of course, a Vec<T> is only safe if elements 0..len are safe at T, so the soundness argument can rest on that plus the safety/validity (same thing) of the NonZeroU8.

(Insert standard disclaimer about all of these details being not yet ratified here.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I will remove the part between brackets then (I'm still glad I put it, it is good to clearly think about these things before merging).

Vec::<U>::from_raw_parts(ptr as *mut U, length, capacity)
}

let v = unsafe {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A type annotation won't hurt here IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

crate::alloc::Layout::new::<U>(),
);
// The previous assert should imply the following one:
debug_assert_eq!(
Copy link
Contributor Author

@danielhenrymantilla danielhenrymantilla Sep 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first check is an assert since it is expected to be easily optimized out (only const / compile-time information here).
On the other hand, the check involving Layout::array cannot be const, since it involves the runtime value of the vector capacity.
That being said, I expect Layout<T> == Layout<U> to imply that for all N: usize, Layout<[T; N]> == Layout<[U; N]>.
If this were not to be correct / guaranteed by the language, this debug assertion could be upgraded to a normal assert!; in this case however, the function is only used with the monomorphisation to T = NonZeroU8 and V = u8, which does not require such check at runtime.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a serious bug in Layout::array if this is not given... its comparison test would fail to take something into account, or so.

Not sure if this check is worth the clutter it introduces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I will remove the debug_assert! involving the Layout::array check

@danielhenrymantilla danielhenrymantilla force-pushed the feature/cstring_from_vec_of_nonzerou8 branch 2 times, most recently from 755fe39 to b32aec7 Compare September 14, 2019 15:26
@RalfJung
Copy link
Member

I'm afraid my Rust inbox is swamped, I won't be able to review this -- I can take a glance, but don't have time to do a full review.

Any takers?

(Also is there some way to tell bors "sorry, not me", if I don't know whom else to assign?)

// End-of-scope `mem::forget` but without aliasing problems.
let mut v = mem::ManuallyDrop::<Vec<T>>::new(v);
let v: &mut Vec<T> = &mut *v;
v.as_mut_ptr() // Cannot be aliased.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ManuallyDrop implements Deref; why is the intermediate let v: &mut Vec<T> = ... needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just me being (overly?) cautious with type inference / method resolution within unsafe code , so I wanted to have that Deref be very explicit.

@Mark-Simulacrum
Copy link
Member

Going to re-assign to @rkruppe as they appear to have done some review here but feel free to reassign to someone else if you don't feel confident reviewing this.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Sep 16, 2019

I'm fine with reviewing this. Will take another close look later for due digilence but from what I recall this should be in good shape.

I think some @rust-lang/libs sign-off is also required since this is a new insta-stable API addition.

@Mark-Simulacrum
Copy link
Member

I've nominated to get someone from T-libs to kick off an FCP merge here

@Dylan-DPC-zz Dylan-DPC-zz added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 12, 2019
@Centril Centril modified the milestones: 1.41, 1.42 Dec 20, 2019
@dtolnay
Copy link
Member

dtolnay commented Jan 10, 2020

I checked Kimundi's box per rust-lang/team@bfbf48f.

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Jan 10, 2020
@rfcbot
Copy link

rfcbot commented Jan 10, 2020

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jan 20, 2020
@rfcbot
Copy link

rfcbot commented Jan 20, 2020

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@KodrAus
Copy link
Contributor

KodrAus commented Jan 27, 2020

Did we come to a decision on the multiple unsafe blocks? I don't imagine it needs to block merging, but am also of the opinion that we should combine them into one, especially now that they live alongside one another with no safe code in between. They're logically a single operation so I don't think their granularity is necessarily better here.

But thanks for working on this @danielhenrymantilla! Would you like to squash some of your commits down before merging?

@Centril Centril modified the milestones: 1.42, 1.43 Jan 31, 2020
Updated tracking issue number

Added safeguards for transmute_vec potentially being factored out elsewhere

Clarified comment about avoiding mem::forget

Removed unneeded unstable guard

Added back a stability annotation for CI

Minor documentation improvements

Thanks to @Centril's code review

Co-Authored-By: Mazdak Farrokhzad <twingoow@gmail.com>

Improved layout checks, type annotations and removed unaccurate comment

Removed unnecessary check on array layout

Adapt the stability annotation to the new 1.41 milestone

Co-Authored-By: Mazdak Farrokhzad <twingoow@gmail.com>

Simplify the implementation.

Use `Vec::into_raw_parts` instead of a manual implementation of
`Vec::transmute`.

If `Vec::into_raw_parts` uses `NonNull` instead, then the code here
will need to be adjusted to take it into account (issue rust-lang#65816)

Reduce the whitespace of safety comments
@danielhenrymantilla danielhenrymantilla force-pushed the feature/cstring_from_vec_of_nonzerou8 branch from 8a9725a to 60274a9 Compare February 4, 2020 16:21
@danielhenrymantilla
Copy link
Contributor Author

I did have a preference for the split blocks, but I seem to be at a minority here, so I have merged the unsafe blocks (and updated the stability annotation and squashed and rebased everything).

There should be nothing blocking the merge, now 🙂

@Dylan-DPC-zz
Copy link

r? @KodrAus

@rust-highfive rust-highfive assigned KodrAus and unassigned sfackler Feb 12, 2020
@KodrAus
Copy link
Contributor

KodrAus commented Feb 15, 2020

@bors r+

@bors
Copy link
Contributor

bors commented Feb 15, 2020

📌 Commit 60274a9 has been approved by KodrAus

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 15, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Feb 15, 2020
…_from_vec_of_nonzerou8, r=KodrAus

Added From<Vec<NonZeroU8>> for CString

Added a `From<Vec<NonZeroU8>>` `impl` for `CString`

# Rationale

  - `CString::from_vec_unchecked` is a subtle function, that makes `unsafe` code harder to audit when the generated `Vec`'s creation is non-trivial. This `impl` allows to write safer `unsafe` code thanks to the very explicit semantics of the `Vec<NonZeroU8>` type.

  - One such situation is when trying to `.read()` a `CString`, see issue rust-lang#59229.

      - this lead to a PR: rust-lang#59314, that was closed for being too specific / narrow (it only targetted being able to `.read()` a `CString`, when this pattern could have been generalized).

     - the issue suggested another route, based on `From<Vec<NonZeroU8>>`, which is indeed a less general and more concise code pattern.

  - quoting @Shnatsel:

      - >  For me the main thing about making this safe is simplifying auditing - people have spent like an hour looking at just this one unsafe block in libflate because it's not clear what exactly is unchecked, so you have to look it up when auditing anyway. This has distracted us from much more serious memory safety issues the library had.
Having this trivial impl in stdlib would turn this into safe code with compiler more or less guaranteeing that it's fine, and save anyone auditing the code a whole lot of time.
bors added a commit that referenced this pull request Feb 15, 2020
Rollup of 6 pull requests

Successful merges:

 - #64069 (Added From<Vec<NonZeroU8>> for CString)
 - #66721 (implement LowerExp and UpperExp for integers)
 - #69106 (Fix std::fs::copy on WASI target)
 - #69154 (Avoid calling `fn_sig` on closures)
 - #69166 (Check `has_typeck_tables` before calling `typeck_tables_of`)
 - #69180 (Suggest a comma if a struct initializer field fails to parse)

Failed merges:

r? @ghost
@bors
Copy link
Contributor

bors commented Feb 15, 2020

⌛ Testing commit 60274a9 with merge dbef353...

@bors bors merged commit 60274a9 into rust-lang:master Feb 15, 2020
@danielhenrymantilla danielhenrymantilla deleted the feature/cstring_from_vec_of_nonzerou8 branch March 2, 2020 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. needs-fcp This change is insta-stable, so needs a completed FCP to proceed. relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.