Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Direct support for PEM-decoding of this crate's types #53

Merged
merged 12 commits into from
Sep 27, 2024
Merged

Direct support for PEM-decoding of this crate's types #53

merged 12 commits into from
Sep 27, 2024

Conversation

ctz
Copy link
Member

@ctz ctz commented Sep 3, 2024

This incorporates the contents of the pemfile crate, including rustls/pemfile#53

rustls/pemfile#55 is a draft over there, to rebase that crate on top of this.

Copy link
Member

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

src/pem.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like this 👍 I didn't have too much to add over what Djc has already commented.

One meta-question: what sort of testing have you done for the constant-time properties of the new base64 decoder for the secret data paths? I know this is famously very difficult to nail down concretely when the compiler has so much power to optimize the code differently based on a number of properties, but is there some sanity checking we could do at this stage to gain more confidence? Maybe you already have?

src/pem.rs Outdated Show resolved Hide resolved
@ctz
Copy link
Member Author

ctz commented Sep 4, 2024

One meta-question: what sort of testing have you done for the constant-time properties of the new base64 decoder for the secret data paths? I know this is famously very difficult to nail down concretely when the compiler has so much power to optimize the code differently based on a number of properties, but is there some sanity checking we could do at this stage to gain more confidence? Maybe you already have?

Good question! During development I did look at the disassembly of CodePoint::decode_secret to see there were no branches, but that is a bit unsatisfying with a new stable rustc every 6 weeks.

The other thing I tried (just now, in fact) was a test like:

    #[test]
    fn decode_secret_does_not_branch_or_index_on_secret_input() {
        // this is using the same theory as <https://github.com/agl/ctgrind>
        use crabgrind as cg;

        if matches!(cg::run_mode(), cg::RunMode::Native) {
            std::println!("SKIPPED: must be run under valgrind");
            return;
        }

        let input = b"aGVsbG8gd29ybGQh".to_vec();
        cg::monitor_command(format!(
            "make_memory undefined {:p} {}",
            input.as_ptr(),
            input.len()
        ))
        .unwrap();

        let mut output = [0u8; 32];
        let output = decode_secret(&input, &mut output).unwrap();
        cg::monitor_command(format!(
            "make_memory defined {:p} {}",
            output.as_ptr(),
            output.len()
        ))
        .unwrap();

        std::println!("{output:?}");
    }

But this is messy: it confirms there are no branches inside CodePoint::decode_secret, but then fires on the match that immediately follows it (to skip over whitespace, or error on invalid characters, or count pad characters).

Verifying just CodePoint::decode_secret is a bit cleaner:

    #[test]
    fn codepoint_decode_secret_does_not_branch_or_index_on_secret_input() {
        // this is using the same theory as <https://github.com/agl/ctgrind>
        use crabgrind as cg;

        let input = [b'a'];
        cg::monitor_command(format!(
            "make_memory undefined {:p} {}",
            input.as_ptr(),
            input.len()
        ))
        .unwrap();

        let out = CodePoint::decode_secret(input[0]);
    }

And that passes. If you println out, then it fails (in the depths of trying to print the secret value), and if you replace CodePoint::decode_secret with CodePoint::decode_public, you get:

==2921557== Thread 2 base64::tests:::
==2921557== Conditional jump or move depends on uninitialised value(s)
==2921557==    at 0x1204ED: rustls_pki_types::base64::CodePoint::decode_public (base64.rs:482)
==2921557==    by 0x13767D: rustls_pki_types::base64::tests::codepoint_decode_secret_does_not_branch_or_index_on_secret_input (base64.rs:748)

(As expected, since that function definitely will contain a computed jump or array index on the secret input.)

@ctz ctz force-pushed the jbp-pem branch 5 times, most recently from 9c06637 to a76ad42 Compare September 6, 2024 09:23
Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's some more feedback (almost universally small nits, sorry!) from reviewing this branch to remind myself where we left off.

src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@ctz ctz force-pushed the jbp-pem branch 2 times, most recently from 24a4f6a to 97ef98d Compare September 18, 2024 11:20
@ctz ctz marked this pull request as ready for review September 18, 2024 11:39
Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

src/pem.rs Outdated Show resolved Hide resolved
tests/pem.rs Outdated Show resolved Hide resolved
@cpu
Copy link
Member

cpu commented Sep 18, 2024

Also noticed that rustls/pemfile#55 is taking this as a dep w/ a git patch but didn't need to change the targetted version.

I think this branch could use a Cargo.toml version bump?

src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Outdated Show resolved Hide resolved
src/base64.rs Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
@ctz ctz force-pushed the jbp-pem branch 4 times, most recently from 4c04e90 to 9b34632 Compare September 23, 2024 16:11
@djc
Copy link
Member

djc commented Sep 24, 2024

@ctz I pushed a commit that replaces the macro with a trait. Have a look and see what you think?

@ctz
Copy link
Member Author

ctz commented Sep 24, 2024

That looks good, but I've been trying to avoid relying on the private key type autodetection -- I think it should be reserved for cases where someone has some opaque bytes containing a private key (but no type information), and need to determine the type to make a PrivateKeyDer.

It seems odd for =====BEGIN RSA PRIVATE KEY=====\n{pkcs8 key}\n=====END RSA PRIVATE KEY===== to produce a PrivateKeyDer::Pkcs8, though I can see the argument it is a more useful behaviour in the postel's law sense (and would have covered up/avoided rustls/rustls#2038)

With that said, I think you've avoided the earlier issue I had with the trait (by publicising the iterator types) so there's certainly a happy medium to be found here.

@djc
Copy link
Member

djc commented Sep 24, 2024

That looks good, but I've been trying to avoid relying on the private key type autodetection -- I think it should be reserved for cases where someone has some opaque bytes containing a private key (but no type information), and need to determine the type to make a PrivateKeyDer.

It seems odd for =====BEGIN RSA PRIVATE KEY=====\n{pkcs8 key}\n=====END RSA PRIVATE KEY===== to produce a PrivateKeyDer::Pkcs8, though I can see the argument it is a more useful behaviour in the postel's law sense (and would have covered up/avoided rustls/rustls#2038)

Since PemObject::from_pem() is still a trait method, we could just override the default implementation for PrivateKeyDer to do something else:

  • Yield based on the SectionType without parsing DER
  • Parsing DER and erroring out if the SectionType doesn't match the DER found

Copy link
Member

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good direction!

fuzz/fuzz_targets/pem.rs Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping an eye on this branch as it evolves. Just had a few small things to surface from my catch-up pass:

src/lib.rs Outdated Show resolved Hide resolved
src/pem.rs Show resolved Hide resolved
@ctz
Copy link
Member Author

ctz commented Sep 24, 2024

addressed those comments in the latest commit. i'll give some space for any extra comments, and then work on the commit history (which should address the remaining comments)

@djc
Copy link
Member

djc commented Sep 24, 2024

I think I'm satisfied with the direction here, would like to do a last pass after commit history cleanup.

Would maybe suggest trying to rebase the downstream PR in rustls-pemfile before cleaning up commits here.

@ctz
Copy link
Member Author

ctz commented Sep 25, 2024

Would maybe suggest trying to rebase the downstream PR in rustls-pemfile before cleaning up commits here.

That is done, and I've addressed the comments over there.

@ctz ctz force-pushed the jbp-pem branch 2 times, most recently from 6cd7872 to dddfb8d Compare September 25, 2024 15:52
@ctz
Copy link
Member Author

ctz commented Sep 25, 2024

OK, I think the commit history is a bit more sensible.

I did a docs and doctest pass and left that separate for now, but would squash that into the "directly support PEM decoding" commit. But that probably deserves a careful look in case I a word.

Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tidied up commit history & the updated rustdoc comments all look good to me. I think where this API landed is a really nice end-state. Great work 🌠

Copy link
Member

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do we now end up with two different Item types across this crate and rustls-pemfile? Why is that necessary/helpful? I thought it might be because of inherent methods on Item, but it looks like Item has no public inherent methods in 2.1.3?

(I also wonder if we should not have Item in this crate at all, or maybe make it #[doc(hidden)] for now in favor of the newer API.)

src/pem.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
#[cfg(feature = "alloc")]
impl PemObject for PrivateKeyDer<'static> {
fn from_pem(kind: SectionKind, value: Vec<u8>) -> Option<Self> {
match kind {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to call this out specifically: should we validate here that the DER is in the correct format? If we don't, I suppose rustls is going to error out at a later stage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is maintaining the status-quo. We could have a check_format() or something on PrivateKeyDer (extracts the data, guesses the format, and checks that is consistent with the used variant) but it would come short of any sort of key validation or thorough validation of the format -- I think these things need to happen at the level that fully supports DER decoding and cryptography.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would come short of any sort of key validation or thorough validation of the format -- I think these things need to happen at the level that fully supports DER decoding and cryptography.

Not sure I agree? The earlier we can signal errors, the more context might be available to the user, especially in this case of format confusion that I imagine would be tricky for inexperienced users. But, it is the status quo so no need to block this PR on it.

src/pem.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
src/pem.rs Outdated Show resolved Hide resolved
- base64.rs: build, msrv & clippy fixes
- pem.rs: fix location of types
Unfortunately existing test cases weren't expanded to
cover this new API, so the preexisting `different_line_endings`
test case didn't work at all.

A later commit instates test coverage for this change.
The `pem::PemObject` trait is the high-level entry point
into this, and allows (eg) an iterator of `CertificateDer`s
to be created from a filename (amongst other options).
This caters to types that are typically plural (certificates,
CRLs) and ones that are singular (keys) -- the later with
APIs that select the first matching object or error if there
are none.

At a lower level, there are iterator-based APIs for consuming
all the supported PEM sections in a file, stream or slice in
one pass.

Simple types who have a one-to-one mapping between PEM section
(ie, `CertificateDer` is only ever from a PEM `BEGIN CERTIFICATE`
section) impl `pem::PemObject` via `pem::PemObjectFilter` which
minimises boilerplate.

`PrivateKeyDer` is more complex, and impls `pem::PemObject` itself
so it can dispatch the three source PEM section types
(`PRIVATE KEY`, `RSA PRIVATE KEY`, & `EC PRIVATE KEY`) to the
right enum variant.
Extend zen.pem to include:

- add an unknown section that is base64
- add an unknown section that is not base64
- add an SPKI

Delete unused zen2.pem (which was two copies of zen.pem
originally, but fell out of sync and is now unreferenced).
@ctz ctz added this pull request to the merge queue Sep 27, 2024
Merged via the queue into main with commit fb2d9a8 Sep 27, 2024
28 checks passed
@ctz ctz deleted the jbp-pem branch September 27, 2024 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants