Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add x509 Certificate Validation #2381

Closed
danetrain opened this issue Sep 28, 2015 · 54 comments · Fixed by #8873
Closed

Add x509 Certificate Validation #2381

danetrain opened this issue Sep 28, 2015 · 54 comments · Fixed by #8873
Assignees

Comments

@danetrain
Copy link

Currently, cryptography contains no functionality to validate a certificate chain against a trusted root certificate. This is a fairly standard operation; it is described in detail by RFC 5280. I would like to implement this by adding a x509Validator interface to the cryptography.x509 module.

@reaperhulk
Copy link
Member

references #1957 and #1660 (although that issue overlaps but is not the same as this).

If the scope of this issue is limited to validation of a cert by an already provided chain (rather than trying to build a chain from an arbitrary set of certificates) we can solve that and then move on to chain building.

OpenSSL obviously implements chain building, but most versions still in use have limitations that can be problematic (see: https://lukasa.co.uk/2015/04/Certifi_State_Of_Union/ and certifi/python-certifi#26).

I thought we had an issue discussing what an API for this might look like, but I can't find it. Do you have a proposal @danetrain?

@danetrain
Copy link
Author

I'm thinking the API should expose pretty limited functionality. Basically I think the user will just initialize the validating object from a list of trusted certs and call a validate method with the cert chain of an untrusted cert. Like signature verification, it'll either return true or throw exceptions.

I'm thinking it'll go something like this from the user end:

from cryptography.x509 import x509Validator

trusted_certs = [ ] #array of cryptography.x509 objects which are self-signed trusted roots#
cert_chain = [ ] #array of cryptography.x509 objects which is the cert chain for an untrusted cert#
validator = x509Validator(trusted_certs)

try:
validator.validate(cert_chain)
except:
#throws exceptions for invalid signature chain, if root of cert chain is not trusted#
#or if root of cert chain is not signed by a trusted root#

@schlenk
Copy link

schlenk commented Sep 29, 2015

pyopenssl has the code for basic cert verification. Would be trivial to port to cryptography.

#1660 has some basic code to do it and allow callbacks too.

It is not perfect though, as it does not set all the possible flags for verification of timestamps (do you check at the time of issue, current time, time of signature?), check CRLs, OCSP etc.

@etrauschke
Copy link
Contributor

I'm too, gonna need this. I proposed a way in my first pull requests for CRLs (which also need validation methods). OpenSSL does offer the X509_*_verify() method which allows to verify the object against a public key. This requires that the chain would have to be validated manually but the code I'm trying to port already does that anyway.

I'm willing to do the work for this if someone decides this is a viable solution to the problem.

@danetrain
Copy link
Author

Here's a code sample of what I'm working on so far. The only user-facing methods are verify_certificate_chain and verify_certificate_signature. Note that the verify_certificate_signature method is only a stub for the moment.

I'd like to request that we expose the signature bytes through the cryptography.x509.Certificate API. On a general level, it fits with the overall purpose of the x509 Certificate object, since the function of this object is generally just to parse the underlying certificate bytes. In the context of my feature, it'd simplify the implementation of the verify_certificate_signature method since I could rely solely on functionality and data exposed by the cryptography API, and wouldn't have to directly invoke any bound OpenSSL methods. I could retrieve the hash method, retrieve the signature bytes, and generate the cert's fingerprint through the x509 Certificate interface. Then, I could retrieve the backend object and the public key through the signing cert and use cryptography's asymmetric modules to verify the signature.

class CertificateValidator(object):

    def __init__(self, root_certs):
        if not all(
            isinstance(cert, Certificate) for cert in root_certs
        ):
            raise TypeError(
                "root_certs must be a list of "
                "cryptography.x509.Certificate objects"
            )
        if not all(
            self._is_root_cert(cert) for cert in root_certs
        ):
            raise ValueError(
                "root_certs must be a list of "
                "root certificates"
            )
        if not all(
            self._is_valid_at_current_time(cert) for cert in root_certs
        ):
            raise ValueError(
                "At least one root certificate is not "
                "valid at this time"
            )
        self._root_certs = root_certs

    def _is_root_cert(self, cert):
        return cert.subject == cert.issuer

    def _is_valid_at_current_time(self, cert):
        now = datetime.datetime.utcnow()
        return (now > cert.not_valid_before) and (now < cert.not_valid_after)

    def _get_issuing_root_cert(self, cert):
        for root_cert in self._root_certs:
            root_name = root_cert.subject
            if (root_name == cert.issuer or root_name == cert.subject):
                return root_cert
        return None

    '''When implemented, this will return True or raise an error'''
    def verify_certificate_signature(self, issuing_cert, subject_cert):
        return True

    '''Returns True or throws error'''
    def validate_certificate_chain(self, cert_chain):
        if not all(
            isinstance(cert, Certificate) for cert in cert_chain
        ):
            raise TypeError(
                "cert_chain must be a list of "
                "cryptography.x509.Certificate objects"
            )
        if not all(
            self._is_valid_at_current_time(cert) for cert in cert_chain
        ):
            raise ValueError(
                "cert_chain must consist of cryptography.x509.Certificate "
                "objects which are valid at the current time"
            )

        '''Start by checking that the chain issues from a trusted root'''
        issuing_cert = cert_chain[0]
        issuing_root = self._get_issuing_root_cert(issuing_cert)

        if not issuing_root:
            raise ValueError(
                "No matching root certificate found for certificate chain"
            )
        self.validate_certificate_signature(issuing_root, issuing_cert)

        '''Verify certificate signatures down the chain'''
        for i in range(len(cert_chain) - 1):
            self.validate_certificate_signature(cert_chain[i],
                                                cert_chain[i + 1])

        return True

@etrauschke
Copy link
Contributor

Yes, that was basically what I had in mind too, however, I'd make this part of the X509 object itself. So to verify a signature against a CA or just its public key you would just do that:
cert.verify(pub_key)

For verifying the whole chain, I think it's a good idea to have this as part of a certstore. So you just throw all the certs in there and then check if you have a valid chain.

Why would the helper functions like _is_root_cert() be "protected"?

@schlenk
Copy link

schlenk commented Sep 29, 2015

The code is probably too simple to handle all the tricky details in RFC 5280. Especially the _is_root_cert() and _get_issuing_root_cert() code is naive and will probably fail miserably for the cases @reaperhulk mentioned in his two links about certifi (cross certification).

If you have want to have any hope of getting verification right, assume you have the proper certificate chain completely and in the correct order upfront. Do not try to do path building in a first step.

I'm not sure if it is better to have a rather limited API for cert validation on the python layer or if a wrapper for the openssl verify code is better. It should be possible to write it with the cryptography API alone.

@danetrain
Copy link
Author

@etrauschke : The decision to build a Validator class instead of adding additional functionality to the x509 object is that it presents a more convenient way for the user to validate things against a root(s) of trust. If we just have a cert.verify(pub_key), then we pass the work of testing the candidate cert against all the trusted public keys on to the user. The Validator approach provides a pretty literal representation of roots of trust to the user. They can just add in root certs for CAs they trust, and verify candidate certs against them with a single line of python.

@schlenk : I'm confused by your comments. This code is meant primarily to illustrate a potential API change, not represent a complete implementation. Of course it will 'fail miserably', one of the most important methods is a stub that returns True. I'm also not attempting path building; the code assumes that the given certificate chain is more or less complete (either begins with a trusted root or begins with a cert signed by a trusted root) and assumes that it is ordered correctly (i.e. that each cert is signed by the previous cert in the chain).

@schlenk
Copy link

schlenk commented Sep 29, 2015

@danetrain Sorry for confusing.

Basically if you pass a 'certificate chain', why do you need to search for the root CA with _get_issuing_root_cert() ? cert_chain[0] should be the root CA to use. If it is not, your certificate chain is incomplete and you need (a very limited form of) path building to chain it up to your root certificates. There should be no ambiguity in the API more or less complete is a bit too vague for my taste. Example: If you have two CA certificates with different Key Identifiers but identical subjects your API randomly picks one of those and misvalidates.

A bit similar issues for the timestamp checks. You verify the Root CAs are valid right now. That is great if you want to verify a SSL/TLS certificate for a web browser. But it is wrong if you want to verify a digital signature with a policy like 'chain-model' for validation where the timestamp of the signature is the one to check. And if there are timestamp checks, the chain certificates should also be checked if they are in the valid range.

@danetrain
Copy link
Author

@schlenk: "cert_chain[0] should be the root CA to use" is inconsistent with RFC 5280. Section 6.1 indicates that the first certificate should be issued by a trust anchor. These trust anchors are represented in my code by _root_certs attribute of the Validator class. That being said, I will make the code stricter so that it will not accept cert chains which include the actual trust anchor. I'll also clean up the language of the class so it maps more directly to the specification in RFC 5280.

I see your point about the temporal checks needing to be based on the signing times rather than the current time. I'll make this change as well.

@reaperhulk
Copy link
Member

Added signature #2387, but we're going to need something that outputs tbsCertificate as well if you want to verify a signature.

@danetrain
Copy link
Author

Thanks! Yeah I realized that this would also be an issue yesterday. Unless the fingerprint() method only hashes the tbsCertificate bytes then we have to do one of the following in order to get the proper data to verify the signature:

  • Add another API method to return the cert bytes without the signature
  • Manually remove the signature from the end of the certificate bytes in the Validator code

I'd prefer to go the former route since it seems cleaner and less error-prone, but if there are arguments for the latter I'd be open to hearing them.

moreati added a commit to moreati/python-u2flib-server that referenced this issue Oct 5, 2015
Use cryptography + pyasn1 to implement MetadataResolver._verify()

This replaces the last use of M2Crypto. However it's rough and probably misses
varisou corner cases. It's probably better to wait for pyca/cryptography#2381
@etrauschke
Copy link
Contributor

Any update yet on which path we want to go down with this?

@PeterHamilton
Copy link
Contributor

Hi all, I'm going to be stepping in to assist @danetrain with the implementation for the CertificateValidator. I'm hoping to get a pull request up by next week.

@etrauschke As far as I know for Option 1, @reaperhulk hasn't updated #2387 with an API call that provides access to the tbs_certificate bytes. I'm currently going forward with Option 2 (manually obtaining the tbs_certificate bytes by "subtracting out" the signature bytes from the certificate) but I'm definitely open to using the API if it gets updated.

@etrauschke
Copy link
Contributor

I think that was because the usage of an interface like this was fairly cumbersome. Imo we should use the functions the OpenSSL library already provides for this. You can still add all the other magic like assembling the whole cert chain, retrieving CRLs, etc..

But as was mentioned before (can't find it anymore, though) you don't want to make network connections to retrieve CRLs or test revocation via OCSP by default. I'm not sure if it forcing the user to provide a whole cert chain actually buys you that much. It should be up to the user if you just want to verify a sub-chain, the whole chain but without revocations or the whole chain with revocations.

@reaperhulk
Copy link
Member

If we end up going down this path we'll either need a tbs_certificate property so you can use the hazmat asym verification primitives or else we'll need to define new backend methods (e.g. verify_x509_signature(self, cert, ca)) that can abstract it away. I'm ambivalent right now as to which way we should go. The advantage of the backend method would be that we can leverage backend specific behaviors (like X509_verify in the case of OpenSSL).

@PeterHamilton
Copy link
Contributor

@etrauschke I agree that making additional network connections when doing lookups should be avoided, at least for this first cut. Right now, the implementation we're working on requires the full chain from the user. I think once we have that working and in place, we can build off of it to support the other options you listed.

@reaperhulk If I had to choose, I'd pick adding the tbs_certificate property, to make working with the different parts of the certificate bytes consistent.

@rowland-smith
Copy link

Hi all. First off, thanks for 'cryptography', I've tried all the different python crypto libraries over many years and this is the best design and API I've seen yet. I have written about 80% of an extensible certificate path validation package, but I need access to the tbs_certficate property you mentioned above as well as the signatureValue from the outer body of the certificate so I can verify the signature on the certificate. I just started looking at the openssl backend to see if I could expose this information myself until you decide on how you will make signature verificative possible officially. Any pointers on how to get to the toBeSigned DER encoded bytes and the DER encoded signatureValue would be greatly appreciated. Also I would like to share my Validation module with you if you're interested.

@PeterHamilton
Copy link
Contributor

@rowland-smith Check out #2387. It adds a signature property to the Certificate API that allows you to get the signature bytes. I'm hoping that'll get merged soon. We're still debating on the right approach for getting the tbs_certificate bytes (pinging @reaperhulk :).

I'm also in the process of adding certificate chain validation to cryptography, see #2460. Any comments you might have on what I have so far are definitely appreciated.

@rowland-smith
Copy link

Hi Peter, I'll check that out today. FYI my code is now on GitHub: https://github.com/river2sea/X509Validation

Thanks,
Rowland

On Nov 2, 2015, at 8:16 AM, Peter Hamilton notifications@github.com wrote:

@rowland-smith Check out #2387. It adds a signature property to the Certificate API that allows you to get the signature bytes. I'm hoping that'll get merged soon. We're still debating on the right approach for getting the tbs_certificate bytes (pinging @reaperhulk :).

I'm also in the process of adding certificate chain validation to cryptography, see #2460. Any comments you might on what I have so far are definitely appreciated.


Reply to this email directly or view it on GitHub.

@rowland-smith
Copy link

Peter, I was able to integrate the changes (made by reaperhulk?) that I found in issue #2387 with a clone of the latest cryptography master. I can get the signature from the Certificate now. Any word on getting the tbs_bytes?

@PeterHamilton
Copy link
Contributor

@rowland-smith No word yet on the tbs_certificate bytes. It definitely needs to be added to cryptography, either as an update to the Certificate API or to the backend API. @etrauschke and @reaperhulk were still debating it last I heard. I'll try pinging them again on it if they don't pop in to discuss soon.

@rowland-smith
Copy link

OK, Thanks, I'll keep checking back periodically. I'm trying a workaround using pyasn1 for the tbs_certificate bytes, we'll see how that goes.

@PeterHamilton
Copy link
Contributor

@rowland-smith FYI, @reaperhulk updated #2387 to include a tbs_certificate_bytes property in the Certificate API. Hopefully this will merge soon.

@rowland-smith
Copy link

Great, thanks for the heads up!

Sent from my iPhone

On Nov 3, 2015, at 12:02 PM, Peter Hamilton notifications@github.com wrote:

@rowland-smith FYI, @reaperhulk updated #2387 to include a tbs_certificate_bytes property in the Certificate API. Hopefully this will merge soon.


Reply to this email directly or view it on GitHub.

@rowland-smith
Copy link

@PeterHamilton If you're interested in talking about certificate path validation, my email address is on my GitHub profile page. As far as I can tell you can't send someone a direct email from GitHub(?). Apologies to all for using the issues forum as an email client ;)

@alex
Copy link
Member

alex commented Nov 4, 2015

tbs_certificate_bytes is now on master

KonstantinShemyak added a commit to KonstantinShemyak/cryptography that referenced this issue Jun 8, 2021
Tests missing, documentation incomplete.
KonstantinShemyak added a commit to KonstantinShemyak/cryptography that referenced this issue Jun 8, 2021
Tests missing, documentation incomplete.
KonstantinShemyak added a commit to KonstantinShemyak/cryptography that referenced this issue Jun 10, 2021
This method is a prerequisite for Certificate.is_issued_by(). It only
checks the validity of the cryptographic signature.
KonstantinShemyak added a commit to KonstantinShemyak/cryptography that referenced this issue Jul 25, 2021
This method is a prerequisite for Certificate.is_issued_by(). It only
checks the validity of the cryptographic signature.
KonstantinShemyak added a commit to KonstantinShemyak/cryptography that referenced this issue Jul 25, 2021
Tests missing, documentation incomplete.
@pprindeville
Copy link

I don’t see any progress here. I checked the three projects in #2381 (comment); two of them are dead, and Cursive is undocumented.

If I really need it (maybe?), I’ll fork x509-validator.

I'm also wondering why this seems to be in limbo. Hard to understand why, after 5 years, a process which is well defined and well documented can't be implemented, reviewed, and merged...

@pprindeville
Copy link

pprindeville commented Aug 3, 2021

I see there's been plenty of discussion on adding validation - is there any progress? I started work on a solution, only to find there's no validation. At the least a verify() method should do all checks up to the root. Not too worried about date-range checks and key-usage etc, as they can be done with ease. thanks

@x509cert I agree with most of that.

Key-usage tends to be application specific, so that's hard to do generically.

But the date-range check isn't. You can pass in an optional time (as datetime.datetime?), which if not present, defaults to the current time, e.g. .now().

If the certificate being evaluated contains AIA attributes for CRL and OCSP URL's, would we do validation that way too? And do we want to pass in a boolean to control online vs. offline mode to disable the former?

@socketpair
Copy link
Contributor

If one pass keyusage -- check match. If do not pass -- do not check.

KonstantinShemyak added a commit to KonstantinShemyak/cryptography that referenced this issue Aug 8, 2021
This method is a prerequisite for Certificate.is_issued_by(). It only
checks the validity of the cryptographic signature.
KonstantinShemyak added a commit to KonstantinShemyak/cryptography that referenced this issue Aug 8, 2021
Tests missing, documentation incomplete.
@kislyuk
Copy link

kislyuk commented Dec 15, 2021

@reaperhulk I appreciate your time and effort - for people's reference, is there a way for commercial organizations to sponsor the introduction of certificate chain validation to cryptorgraphy? It seems like validating certificates without a SSL connection context is a core capability for which there is currently no good solution in the Python ecosystem.

@reaperhulk
Copy link
Member

reaperhulk commented Dec 16, 2021

@kislyuk Unfortunately, there's no system in place where an org can supply money + feature request and have that feature appear in the future. This is mostly because any system like that assumes there's someone in place with the expertise and willingness to take on contract work.

In the absence of core contributors willing to take contract work that the way forward is probably to identify a feature, find a person (internal to your org or external through community contacts) and fund that person to build the feature (perhaps directly, perhaps through PSF?). Since landing a feature in this project does require the buy-in of Alex and myself there would naturally need to be dialogue around design prior to implementation so that we know this is something we could actually land in the project.

x509 certificate validation is a good example of something we're willing to have in the project, but have very high requirements for. e.g. for me to properly evaluate a feature like this I'd want to see proposed APIs, discussion of what is and is not in scope (e.g. AIA chasing, revocation handling, etc), as well as plans for how to ensure this is extensively tested. A survey of past chain building and validation bugs in other libraries, pulling in every test vector suite we can find, exhaustive testing of subcomponents that are amenable to it, property-based testing, etc. If we're going to land a validation system I want it to be industry leading in its test regime. This ultimately starts as a design doc that leads to a project plan and a series of PRs culminating in the feature being fully supported. Anyone who wants to sponsor this work should consider how long it would take an engineer to complete this work to a production quality with an assumption of significant modifications to the approach through numerous rounds of feedback.

Edit: To expand even more on the validation APIs: I believe that x509 validation in cryptography effectively requires that we build a differential testing framework that allows us to determine the behavior of multiple libraries for any given chain and have that be part of our test suite. I don't believe such a tool exists at this time, but that just means the state of the art is insufficient. A well-written test suite for this feature is likely to shake out bugs in other libraries. This would be in keeping with tradition as cryptography has found quite a few bugs in OpenSSL in the past 8 years. :)

@codedust
Copy link

Regarding testing: The German Federal Office for Information Security (BSI) build a tool / test suite for exactly this purpose:

CharString added a commit to maykinmedia/django-simple-certmanager that referenced this issue Oct 13, 2022
Uses pyOpenSSL primitives to approximately check the validity of a
chain. And instead of using pem's in the repo that will expire, it
creates temporary fixture certificates.

Someday there might be a correct abstraction in the cryptography
library. See:
pyca/cryptography#6229
pyca/cryptography#2381
@alex
Copy link
Member

alex commented Jan 14, 2023

In 40.0 we'll have https://cryptography.io/en/latest/x509/reference/#cryptography.x509.Certificate.verify_directly_issued_by -- this is not complete cert verification, but may be a useful building block.

@woodruffw
Copy link
Contributor

As a PSA, to hopefully deduplicate effort: I'm working with a team at Trail of Bits on this; we'll have more to share publicly shortly!

(Writing as a comment because I can't assign myself.)

@reaperhulk
Copy link
Member

@alex and I are aware of and supportive of this work. @woodruffw I’ve assigned this issue to you 🎉

@woodruffw
Copy link
Contributor

#8873 is rapidly approaching a ready state, so I'm going to use this issue to track some potential follow-ups:

  • The MVP in rust: add crate skeleton for X.509 path validation #8873 doesn't expose client certificate verification APIs, only server verification.
  • The MVP hard-bakes CABF policy enforcement, rather than RFC 5280 or another certificate profile. Future Python-side API changes will be necessarily to allow users to loosen the policy here (e.g. by configuring their own set of allowed algorithms).
  • There are a few places where the MVP chooses not to be strictly CABF compliant, in terms of accepting certificates that a strict read of CABF would forbid. We've limited these to cases that other Web PKI validators also accept:
    • The MVP doesn't currently enforce the "whole EKU" model, i.e. where each CA cert in the also has the same EKU(s) as the leaf cert (even if only the leaf is actually used for e.g. TLS). This is consistent with Go's crypto/x509 and Rust's webpki, which also don't enforce this.
    • The MVP doesn't currently enforce AKIs on intermediate CA certificates. This is also consistent with the implementations we're testing against.

@chrisdlangton
Copy link

Love the update
Amazing work

However be cautious when using the excuse "they're not doing it" because clearly anyone could point at our repo and say that too

Doesn't make a point, doesn't give confidence that we know what we're doing, or evidence of doing it correctly

Food for thought

@alex alex linked a pull request Dec 21, 2023 that will close this issue
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.