Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start collection of basic data types #1

Merged
merged 3 commits into from
Aug 31, 2023
Merged

Start collection of basic data types #1

merged 3 commits into from
Aug 31, 2023

Conversation

djc
Copy link
Member

@djc djc commented Aug 5, 2023

The idea here is to share a long-term stable (as in, no semver-breaking updates) collection of types that can be used as shared context between rustls, webpki, webpki-roots, rustls-native-certs and rustls-pemfile. Right now, there are a number of pain points where the conversion from rustls-native-certs to rustls OwnedTrustAnchor is pretty manual.

Some of these crates (like rustls-native-certs) previously had a dependency on rustls to share its basic Certificate and PrivateKey dependencies, however rustls evolves much more quickly than rustls-native-certs so that also generated some unwanted churn.

The API here is optimized for long-term stability and doesn't need std so it can be used with webpki in no_std mode. Its dependency on alloc is optional. We also use an inner Cow-like type to abstract over slices and Vecs with a lifetime, so that a TrustAnchor using &'static [u8] field values could be used directly in a rustls verifier.

There has been some discussion of this in rustls issues/PRs but I can't find it now. At the time, Brian was not a fan of this idea.

Related PRs:

@djc djc requested review from cpu and ctz August 5, 2023 19:33
@djc djc force-pushed the start branch 5 times, most recently from 9398fb9 to a8153e0 Compare August 5, 2023 20:34
Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea here that rustls, webpki, webpki-roots and rustls-pemfile will all share a dependency on this crate for interoperable types? Could you update the PR description with a bit of the higher-level rationale? (I think some of it was discussed in another issue/PR but I can't find it off-hand)

I left a couple comments about clippy settings, but otherwise the code in here seems reasonable for my understanding of what we're trying to achieve with the separate repo.

src/lib.rs Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@djc
Copy link
Member Author

djc commented Aug 7, 2023

Would also like feedback on this from @jsha (concerning FFIability) and @complexspaces (regarding usability within rustls-platform-verifier).

@ctz
Copy link
Member

ctz commented Aug 7, 2023

I think this is looking good so far -- my main comment is that we should be making clear somewhere that these are relatively behaviour-less newtypes, so nobody should end up with an expectation that (for example) CertificateDer contains any particular encoding.

@jsha
Copy link
Contributor

jsha commented Aug 8, 2023

For convenient reference I've pushed docs from this PR to https://rustdoc.crud.net/jsha/pki-types/rustls_pki_types/.

One thing I'm wondering: are these newtypes even necessary? Can we pass primitive slices like &[u8] in their place? In general newtypes are nice for preventing mistakes (like passing a key instead of a certificate). But I think for these people just convert bytes to the newtype and gain no real safety. For instance, look at the example for Certificate:

fn load_certificates_from_pem(path: &str) -> std::io::Result<Vec<Certificate>> {
    let file = File::open(path)?;
    let mut reader = BufReader::new(file);
    let certs = rustls_pemfile::certs(&mut reader)?;

    Ok(certs.into_iter().map(Certificate).collect())
}

The map(Certificate) here doesn't seem to add very much safety or clarity. Though I suppose the idea is that rustls_pemfile::certs would return Vec<pki_types::CertificateDer<'a>> instead of Vec<Vec<u8>> as it does now?

Also to make sure I understand correctly: The Cow-alike implementation here (DerInner::Slice and DerInner::Vec) is not a performance optimization, it's to allow operation in a no_std environment that also does not have the alloc library, because Vec is not available in that environment? And in that environment only the DerInner::Slice variant would be available? As a concrete example: on an embedded system, someone might know that some specific 4kB region of memory is available, make a slice out of it (unsafely but with caution), read some bytes into it, and then make a CertificateDer that references that slice?

Assuming all the above is correct, I feel like I've talked myself into understanding why these are necessary, but I'll still include the ruminating since it may be useful to have it written out (and I could be wrong).

From an FFI standpoint I think this should be fine. We currently expose the Certificate type as rustls_certificate: https://github.com/rustls/rustls-ffi/blob/db40a7b8cee883b131d41741135e440a98a0b0eb/src/cipher.rs#L26-L62

That's a simple struct that has one accessor, to get a pointer and length to the contained data. It would probably keep the same interface. It's returned from rustls_connection_get_peer_certificate and rustls_certified_key_get_certificate. These both document the returned pointer's lifetime to be limited by some parent object, so they should be okay with the slice variant.

The other place I would expect to see the rustls_certificate type is rustls_verify_server_cert_params, passed into certificate verification callbacks. But actually the current version just contains a rustls_slice_bytes, which seems reasonable; in the callback, there's always a bounding lifetime (that is, the duration of the callback).

rustls-ffi doesn't current expose TrustAnchor at all. Instead it exposes RootCertStore as rustls_root_cert_store, and offers rustls_root_cert_store_add_pem, which calls add_parseable_certificates.

One slight reservation I have on the FFI front is: will there ever be a case when we want non-Rust code to be able to generate the Slice / borrowed variant of CertificateDer or others? I worry about that scenario because it requires the non-Rust code to carefully manage lifetimes. But I think if it does become necessary we can always document the lifetime requirements, which are not really any more onerous than what C code deals with all the time. And if we want to eventually support no_std + no alloc crate + FFI, it would be necessary.

Oh and one thing to be aware of on the FFI front: &[u8] is nice and easy because we don't worry about #[repr(C)]. We break it down into a pointer and len and give those to the C side as needed. &[Vec<u8>] and &[CertificateDer<'a>] are challenging because we can't just pass a pointer and len; the representation of Vec and CertificateDer are not guaranteed. For rustls_verify_server_cert_callback in rustls-ffi we handled this in a hacky way by collecting a &[&[u8]] in the Rust function before calling the callback. However, I think a more general solution is possible, and preferred: any time we have, e.g. &[CertificateDer<'a>], we need to treat that as its own FFI type. For instance *const rustls_slice_certificate_der. And then to get any given element from that slice, you would call an accessor, e.g. rustls_slice_certificate_der_get_ith(i). That would return *rustls_certificate_der, which would have its own accessor rustls_certificate_der_as_slice() (returning rustls_slice_bytes).

So, overall after chewing it over for a bit: 👍🏻

@djc
Copy link
Member Author

djc commented Aug 8, 2023

Thanks for the thorough review!

The map(Certificate) here doesn't seem to add very much safety or clarity. Though I suppose the idea is that rustls_pemfile::certs would return Vec<pki_types::CertificateDer<'a>> instead of Vec<Vec<u8>> as it does now?

Yes, it would return a Vec<CertificateDer<'static>> -- the same would go for rustls_native_certs::load_native_certs().

Also to make sure I understand correctly: The Cow-alike implementation here (DerInner::Slice and DerInner::Vec) is not a performance optimization, it's to allow operation in a no_std environment that also does not have the alloc library, because Vec is not available in that environment? And in that environment only the DerInner::Slice variant would be available? As a concrete example: on an embedded system, someone might know that some specific 4kB region of memory is available, make a slice out of it (unsafely but with caution), read some bytes into it, and then make a CertificateDer that references that slice?

I think there are two goals:

  • Make it possible to make these data types work in a no-alloc environment
  • Abstract over the use of the data type (that is, the difference between Vec<u8> and &'static [u8]

For the latter, I'm thinking of scenarios where a TrustAnchor or Certificate might be embedded in a library or application as a const (as in webpki-roots), and allocating for it is actually unnecessary.

rustls-ffi doesn't current expose TrustAnchor at all. Instead it exposes RootCertStore as rustls_root_cert_store, and offers rustls_root_cert_store_add_pem, which calls add_parseable_certificates.

So one potential future direction that I think might be interesting is removing the RootCertStore abstraction in favor of just Vec<TrustAnchor<'static>> -- because I think the RootCertStore abstraction isn't providing a whole lot of value.

@jsha
Copy link
Contributor

jsha commented Aug 8, 2023

Abstract over the use of the data type (that is, the difference between Vec and &'static [u8]

For the latter, I'm thinking of scenarios where a TrustAnchor or Certificate might be embedded in a library or application as a const (as in webpki-roots), and allocating for it is actually unnecessary.

Ah, nice. I think the case of embedding a CertificateDer in an application would be quite unusual for a couple of reasons. It means rebuilding and redeploying is required to rotate certificates. Also, the CertificateDer is not much good without a corresponding private key, and it would be bad practice to embed a private key in an application.

The TrustAnchor case is of course sensible and we have a current example of embedding a list of TrustAnchors in a library (webpki-roots). I increasingly worry about the reliability hazard posed by applications that bake in a list of TrustAnchors and don't get updated, but that's not really an issue for this thread (and hopefully we'll solve it by promoting rustls-platform-verifier as the best way).

@djc
Copy link
Member Author

djc commented Aug 9, 2023

I increasingly worry about the reliability hazard posed by applications that bake in a list of TrustAnchors and don't get updated, but that's not really an issue for this thread (and hopefully we'll solve it by promoting rustls-platform-verifier as the best way).

Yup, I'm still on board with promoting rustls-platform-verifier as the better verifier, but I also think there are use cases where webpki-roots are quite nice. In general, I would argue rustls-platform-verifier is especially useful on Windows and macOS as (a) on Linux it falls back to the rustls-native-certs approach anyway which (b) depends on distribution trust store management, which might well be less reliable than the Mozilla roots. Obviously that still depends on timely upgrades of webpki-roots and regular build + deployments, but I think that scenario still covers a lot of ground in SaaS applications.

@djc djc force-pushed the start branch 2 times, most recently from 1a66c47 to fbbcdf0 Compare August 12, 2023 12:11
@djc
Copy link
Member Author

djc commented Aug 12, 2023

Update:

  • Added Debug implementations for the private key types (which do not print the private key bytes)
  • Derived PartialEq for all the types
  • Added From and Deref impls for CertificateRevocationListDer

Of course the PartialEq implementation for private key types effectively also leaks the private key, but in a less bad way than accidentally leaking it through a derived Debug impl of a downstream type containing it. The PartialEq requirement comes from rustls_pemfile::Item -- it was added in the Initial commit so not sure how load-bearing that is.

Copy link
Member

@ctz ctz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good to me.

One of the things I took from the rustls no-std RFC was whether a unix timestamp newtype would be good, to hoist up that type from webpki into here. Do you think that's useful? I suppose don't need to make that decision now in any case.

src/lib.rs Outdated Show resolved Hide resolved
@djc
Copy link
Member Author

djc commented Aug 30, 2023

One of the things I took from the rustls no-std RFC was whether a unix timestamp newtype would be good, to hoist up that type from webpki into here. Do you think that's useful? I suppose don't need to make that decision now in any case.

Yes, that's definitely something that I think would make sense, but trying to deal with the initial push here first, which is complicated enough when working across 6 different repos.

@djc djc marked this pull request as ready for review August 30, 2023 14:52
Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

Would you be willing to take some of the context you added to the PR description and using it to create a README.md? I think it would be useful to capture the purpose of the crate there.

Otherwise my only feedback was related to comments you were porting as-is from existing code. If you don't want to adjust those as part of this work that's fine with me. Re-reading the comments in this branch prompted some thoughts for potential updates but I don't think they need to be blocking.

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@djc djc force-pushed the start branch 2 times, most recently from 841dedb to 963fb11 Compare August 31, 2023 08:44
@djc
Copy link
Member Author

djc commented Aug 31, 2023

Addressed a bunch of documentation issues, added a README and crate root documentation, added #![deny(missing_docs)] to make sure all public API is documented and #![cfg_attr(docsrs, feature(doc_cfg, doc_auto_cfg))] to make sure the alloc-based API are documented as such. I think this is ready to publish as a 0.1.0 release (for iteration, then we should probably bump it to 1.0 before we release rustls 0.22?). Once that is published, we can move forward with the webkpi and pemfile PRs.

@djc djc force-pushed the start branch 2 times, most recently from 5890321 to c7100fc Compare August 31, 2023 08:55
@djc djc added this pull request to the merge queue Aug 31, 2023
Merged via the queue into main with commit b2acb92 Aug 31, 2023
20 of 22 checks passed
@cpu cpu deleted the start branch August 31, 2023 13:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants