-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Add support for OCSP on Linux, overhaul X509Chain processing #35367
Conversation
The vast majority of cert handles don't cross the P/Invoke boundary, reducing the GC load of creating and releasing SafeHandle objects.
...ryptography.X509Certificates/src/Internal/Cryptography/Pal.Unix/OpenSslX509ChainProcessor.cs
Show resolved
Hide resolved
{ | ||
// While the IntPtrs aren't secret, clearing them helps prevent | ||
// accidental use-after-free because of pooling. | ||
ArrayPool<IntPtr>.Shared.Return(tempChainRent, clearArray: true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tempChain.Clear();
ArrayPool<IntPtr>.Shared.Return(tempChainRent);
?
{ | ||
st.Item1.AsSpan().CopyTo(buf); | ||
st.Item2.Span.CopyTo(buf.Slice(st.Item1.Length)); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you could instead write this as:
return string.Create(count, (baseUri, resource), (buf, st) =>
{
st.baseUri.AsSpan().CopyTo(buf);
st.resource.Span.CopyTo(buf.Slice(st.baseUri.Length));
});
avoiding both the direct use of ValueTuple as well as using Item1/Item2.
private unsafe struct ErrorCollection | ||
{ | ||
// As of OpenSSL 1.1.1 there are 74 defined X509_V_ERR values. | ||
private const int BucketCount = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does 74 relate to 3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, 32 bits per int * 3 == 96, which is > 74.
private static int FindBucket(Interop.Crypto.X509VerifyStatusCode statusCode, out int bitValue) | ||
{ | ||
int val = (int)statusCode; | ||
Debug.Assert(val < 32 * BucketCount); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if OpenSSL releases a new version that we bind to and it ends up having > 96 errors defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an overflow state, and verified behavior by having the shim report 94 / 95 / 96 as the value instead of CERT_EXPIRED
I was really hoping / expecting based on previous profiling that this would make a dent in https://github.com/dotnet/corefx/issues/35086. Good news and bad news... The good news is it makes a great impact on allocation / GCs. Creating a thousand SslStream client/server pairs over a NetworkStream and doing an AuthenticateAsClient/ServerAsync for each, before this change I would see Gen0/1/2 GC numbers like this:
and with this PR I see numbers like this:
That's excellent. Unfortunately, for my test, this barely makes an impact on throughput, maybe around a 5% improvement, e.g. before I get numbers like:
and after I get numbers like:
For comparison, on Windows I get numbers for the same test like:
|
Are you comparing apples to apples? e.g. are both sides making connections to trusted endpoints? |
Here's the code. If it's not apples-to-apples, what would I need to change to make it so?
Based on previous conversations, I tried a variety of things, like adding the certs into my local store. Didn't seem to make a difference. |
Regarding the perf only getting a little better: Based on some quick instrumentation (System.Diagnostics.Stopwatch) it looks like a good amount of time is spent reading the user stores as live operations (~60%+ of the time of my builds). There's a fine balance between reading them once ever and reading them hot that I think we can attain to not lose significant fidelity against Windows' fully-hot(-but-fast); but it seems like that will involve some tweaking and tuning that I think makes sense to defer to a followup change. |
That's fine. Thanks. |
Any other feedback @stephentoub (or @krwq)? |
LGTM. Thanks. |
{ | ||
X509* leaf = X509_STORE_CTX_get0_cert(ctx); | ||
X509Stack* untrusted = X509_STORE_CTX_get0_untrusted(ctx); | ||
X509_STORE* store = X509_STORE_CTX_get0_store(ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this change broke our RedHat6 official build:
/__w/1/s/src/Native/Unix/System.Security.Cryptography.Native/pal_x509.c(311,25): error GC97D5F25: implicit declaration of function 'X509_STORE_CTX_get0_store' is invalid in C99 [-Werror,-Wimplicit-function-declaration] [/__w/1/s/src/Native/build-native.proj]
X509_STORE* store = X509_STORE_CTX_get0_store(ctx);
^
/__w/1/s/src/Native/Unix/System.Security.Cryptography.Native/pal_x509.c(311,17): error G5F2A7227: incompatible integer to pointer conversion initializing 'X509_STORE *' (aka 'struct x509_store_st *') with an expression of type 'int' [-Werror,-Wint-conversion] [/__w/1/s/src/Native/build-native.proj]
X509_STORE* store = X509_STORE_CTX_get0_store(ctx);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/__w/1/s/src/Native/Unix/System.Security.Cryptography.Native/pal_x509.c(758,21): error G5F2A7227: incompatible integer to pointer conversion initializing 'X509_STORE *' (aka 'struct x509_store_st *') with an expression of type 'int' [-Werror,-Wint-conversion] [/__w/1/s/src/Native/build-native.proj]
X509_STORE* store = X509_STORE_CTX_get0_store(storeCtx);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 errors generated.
make[2]: *** [System.Security.Cryptography.Native/CMakeFiles/objlib.dir/pal_x509.c.o] Error 1
make[1]: *** [System.Security.Cryptography.Native/CMakeFiles/objlib.dir/all] Error 2
make: *** [all] Error 2
Failed to build corefx native components.
/__w/1/s/src/Native/build-native.proj(37,5): error MSB3073: The command ""/__w/1/s/src/Native/build-native.sh" x64 Release Linux outconfig netcoreapp-Linux-Release-x64 stripsymbols" exited with code 1.
Build FAILED.
/__w/1/s/src/Native/Unix/System.Security.Cryptography.Native/pal_x509.c(311,25): error GC97D5F25: implicit declaration of function 'X509_STORE_CTX_get0_store' is invalid in C99 [-Werror,-Wimplicit-function-declaration] [/__w/1/s/src/Native/build-native.proj]
/__w/1/s/src/Native/Unix/System.Security.Cryptography.Native/pal_x509.c(311,17): error G5F2A7227: incompatible integer to pointer conversion initializing 'X509_STORE *' (aka 'struct x509_store_st *') with an expression of type 'int' [-Werror,-Wint-conversion] [/__w/1/s/src/Native/build-native.proj]
/__w/1/s/src/Native/Unix/System.Security.Cryptography.Native/pal_x509.c(758,21): error G5F2A7227: incompatible integer to pointer conversion initializing 'X509_STORE *' (aka 'struct x509_store_st *') with an expression of type 'int' [-Werror,-Wint-conversion] [/__w/1/s/src/Native/build-native.proj]
/__w/1/s/src/Native/build-native.proj(37,5): error MSB3073: The command ""/__w/1/s/src/Native/build-native.sh" x64 Release Linux outconfig netcoreapp-Linux-Release-x64 stripsymbols" exited with code 1.
0 Warning(s)
4 Error(s)
Hi, @bartonjs |
@heng-liu Yes, in .NET Core 3.0 (because of this PR). |
@bartonjs Great! |
@heng-liu If a CA only offers one of the two then that's what's used. Otherwise it'll take whichever answer works first. If you're seeing some sort of problem, please file an issue with details. |
@bartonjs I'm not sure if it's an issue, because I'm not clear if there is any other restriction on Linux, which may disable OCSP even if NET Core 3.0 support OCSP.. Do you have any suggestions? Thanks! |
@heng-liu The CA not offering it, network errors, et cetera. OCSP should "just work" when needed. If you're trying to watch for the network request, it won't be made if the response is already cached. I just verified against a Let's Encrypt issued certificate with .NET Core 3.0 on a new Linux machine, and it processed the revocation (they're OCSP-only). So the feature works on something other than my dev machine 😄. |
Thanks for your help! @bartonjs |
The primary motivator for this change is to enable support for validating certificates with the On-line Certificate Status Protocol (OCSP) in addition to Certificate Revocation Lists (CRLs).
OCSP requests can't be done speculatively, as the issuer public key is used as part of the request payload; so the chain building can't just be done as a big data dump.
While tinkering with the chain building, move a lot of the processing to the native side to cut down on the number of P/Invokes that are required. In particular, loading CurrentUser\My no longer has 3 P/Invokes per entry (File.Open, interpret the contents, add to the chain build context), but just one call for the whole store (the fopen/interpret/add are all done in the same iterated function).
Additionally, move some allocating things to rent from the array pool, or do one-time lookups and save the result in statics.
Fixes #3034.