Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soundness: clap_lex makes incorrect assumption about OsStr #5280

Closed
Manishearth opened this issue Jan 3, 2024 · 6 comments · Fixed by #5344
Closed

Soundness: clap_lex makes incorrect assumption about OsStr #5280

Manishearth opened this issue Jan 3, 2024 · 6 comments · Fixed by #5344

Comments

@Manishearth
Copy link
Contributor

clap/clap_lex/src/ext.rs

Lines 248 to 256 in 48d28aa

// SAFETY:
// - Lifetimes are the same
// - Types are compatible (`OsStr` is effectively a transparent wrapper for `[u8]`)
// - The primary contract is that the encoding for invalid surrogate code points is not
// guaranteed which isn't a problem here
//
// There is a proposal to support this natively (https://github.com/rust-lang/rust/pull/95290)
// but its in limbo
unsafe { std::mem::transmute(s) }

It's not true that OsStr is stably a transparent wrapper around [u8], this code is technically unsound. It's fine as long as the Rust stdlib avoids making this change.

Fortunately the soon-to-be-stabilized as_os_str_bytes() will fix this. Opening this issue to track it.

@epage
Copy link
Member

epage commented Jan 4, 2024

While I generally feel I'm not creative enough to play these games, when I analyzed this, I could not find a way that the language or standard library can change that would violate this assumption because of the roundtripping involved with the conversions: "ref to DST" (&[u8]) -> "ref to DST" (&str) -> "ref to DST" (&OsStr) -> "ref to DST" (&str) -> "ref to DST" (&[u8]). Is there something I'm missing?

I guess if the language allowed refs to DSTs to choose their representation ((ptr, ptr) vs (ptr, length)) and they chose different representations, then this could happen but (again, maybe not being creative enough) I cannot think of a situation this would actually happen.

As you said, as_os_str_bytes will soon be in our MSRV and we can switch to it. I'm more concerned about the perception of risk from people seeing this issue if the perception of risk doesn't match the reality of the risk.

@Manishearth
Copy link
Contributor Author

Is there something I'm missing?

Yes, that it's still not a documented invariant; I'm in general not comfortable relying on those in unsafe code and usually note them when performing unsafe review. I mostly filed this issue so I have something to point to in the published unsafe audit, it's not a huge deal and this will be a non-problem in the future anyway.

DSTs changing their representation is indeed a real worry. In general it's not a huge one: so much code relies on DST repr that it's unlikely to change without major warning, but that still does make it important to track. Fortunately in this case it will be fixed before any such change can come down the pipeline.

epage added a commit to epage/clap that referenced this issue Feb 8, 2024
@epage
Copy link
Member

epage commented Feb 8, 2024

DSTs changing their representation is indeed a real worry. In general it's not a huge one: so much code relies on DST repr that it's unlikely to change without major warning, but that still does make it important to track. Fortunately in this case it will be fixed before any such change can come down the pipeline.

While I am fixing this (#4344), I'm wanting to understand this further as I find other needs for transmuting DSTs into each other (const functions). Maybe I don't have a creative enough imagination but in what ways can the DST representation change where we can't transmute them between each other when the roundtripping guarantees exist like with &str / &OsStr / &[u8].

@Manishearth
Copy link
Contributor Author

Maybe I don't have a creative enough imagination but in what ways can the DST representation change where we can't transmute them between each other when the roundtripping guarantees exist like with &str / &OsStr / &[u8].

The metadata order could swap. And the provenance could be weird.

@epage
Copy link
Member

epage commented Feb 8, 2024

So two DSTs in the same rustc version could have metadata in different order? That seems odd.

@Manishearth
Copy link
Contributor Author

Rust has randomize-type-layout today and in the long run the hope seems to be that correctly written Rust code will always succeed under randomize-type-layout. I'm not sure if that option currently randomizes DST representations, but I don't think there's any reason for it not to in the long run.

So yes, that is a potential reality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants