Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Otlp grpc tls certificates #4731

Closed
wants to merge 15 commits into from

Conversation

dhhoang
Copy link

@dhhoang dhhoang commented Jul 29, 2023

Fixes #2009
Design discussion issue #2009

Changes

Implement the configurations according to specs:

  • Certificate file. OTEL_EXPORTER_OTLP_CERTIFICATE
  • Client key file. OTEL_EXPORTER_OTLP_CLIENT_KEY
  • Client certificate file. OTEL_EXPORTER_OTLP_CLIENT_CERTIFICATE

Notes

Some known issues:

  • When using HTTP/Protobuf protocol, the options only work for NET6_0_OR_GREATER targets. Older .NET lacks the required APIs to load PEM certificates. Implementing this would require essentially porting those APIs back from .NET 6. Given that HttpClientFactory can be used for HTTP/Protobuf I don't think it's worth the effort.
  • Testing: I'm not currently satisfied especially regarding GRPC flow. I'd love to have some end-to-end tests where the test case can communicate with a GRPC server to verify the mTLS functionalities. That's the only way to know whether the code works, because some of the options here passed to native C bindings (for GRPC in older .NET) and hence not possible to do any meaningful unit test. I'd like to hear recommendations on how these kinds of test could be done.

Merge requirement checklist

  • CONTRIBUTING guidelines followed (nullable enabled, static analysis, etc.)
  • Unit tests added/updated
  • Appropriate CHANGELOG.md files updated for non-trivial changes
  • Changes in public API reviewed (if applicable)

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jul 29, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@dhhoang dhhoang force-pushed the otlp-grpc-tls-certificates branch from 53e4c43 to 3e773a1 Compare July 29, 2023 04:02
@dhhoang dhhoang marked this pull request as ready for review July 31, 2023 09:43
@dhhoang dhhoang requested a review from a team July 31, 2023 09:43
@@ -84,6 +88,21 @@ public OtlpExporterOptions()
this.TimeoutMilliseconds = timeout;
}

if (configuration.TryGetStringValue(ClientCertificateFileEnvVarName, out var clientCertificateFile))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to add a TryGetFilePath method. Similar to the TryGetUriValue it could validate that the string is a file path for an existing file and log a message if it isn't.

if (!Uri.TryCreate(stringValue, UriKind.Absolute, out value))
{
LogInvalidEnvironmentVariable?.Invoke(key, stringValue!);
return false;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After trying this I think a TryGetFilePath() method doesn't make much sense here. The TryGet... methods simply check if the value is syntactically valid and not whether the resource actually exists (e.g. URI).
If the certificate file with specified path doesn't exist then TryGetFilePath() would return false and the option would be set to null and treated as if no option has been set by the developer.
I think from a developer PoV, if the file doesn't exist I'd prefer to be notified with a FileNotFoundException.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntactic validation is what I had in mind - using something like Path.GetInvalidPathChars or something. Though file paths may be tricky to validate syntactically in a meaningful/helpful way, so I'm fine with not doing anything more in this PR.

@@ -22,6 +22,12 @@
<PackageReference Include="Microsoft.Extensions.Http" />
</ItemGroup>

<ItemGroup>
<Content Include="Certificates\**">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #4732 I tried to avoid adding test certs/keys to the repo and generated them on the fly. See: https://github.com/open-telemetry/opentelemetry-dotnet/pull/4732/files#diff-5b358d0b47462dc9c3919270476736612be64b9db3362961d5df3d8b6defc257R4-R6.

The tests you have in this PR aren't run in docker of course, but I'm wondering if we can do something similar to avoid adding certs and keys.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to do it I guess would be to add an MSBuild Task to run the scripts to generate the certs (sh on Linux, Powershell on Windows).
Gonna take a crack at this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @dhhoang, apologies, I haven't had a chance yet to circle back and give this PR another pass. I will look at it more closely again tomorrow.

Though, quick thought about generating these certs, another idea may be to perform these tests using the integration test framework I landed in #4732. It's a minor bummer to have the tests you have so far depend on docker-compose to run them, but I figure you'll want to expand the integration tests anyways to test mTLS end to end.

Copy link
Member

@alanwest alanwest Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just occurred to me a readme for how to run the integration test locally might be useful. See #4741.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving this a try right now, I'll mark the PR as draft in the mean time.

@alanwest
Copy link
Member

  • I'd love to have some end-to-end tests where the test case can communicate with a GRPC server to verify the mTLS functionalities.

💯 yes this would be great!

Once we can get #4732 landed, then I think you could expand on it and test mTLS. You could probably add an additional OTLP receiver with the necessary mTLS config. See: https://github.com/open-telemetry/opentelemetry-dotnet/pull/4732/files#diff-836987be2117c46a9d22eb1a86fe30cee3eee98299b46abceb7fcc6e9e84e1c9R14

@codecov
Copy link

codecov bot commented Aug 3, 2023

Codecov Report

Merging #4731 (3250ab0) into main (e294312) will decrease coverage by 0.16%.
Report is 11 commits behind head on main.
The diff coverage is 86.66%.

❗ Current head 3250ab0 differs from pull request most recent head ef6e2b8. Consider uploading reports for the commit ef6e2b8 to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4731      +/-   ##
==========================================
- Coverage   82.13%   81.97%   -0.16%     
==========================================
  Files         313      313              
  Lines       12783    12773      -10     
==========================================
- Hits        10499    10471      -28     
- Misses       2284     2302      +18     
Files Changed Coverage Δ
...TelemetryProtocol/OtlpExporterOptionsExtensions.cs 92.52% <60.00%> (-4.29%) ⬇️
...orter.OpenTelemetryProtocol/OtlpExporterOptions.cs 97.61% <100.00%> (+1.25%) ⬆️

... and 33 files with indirect coverage changes

@dhhoang dhhoang marked this pull request as draft August 4, 2023 15:16
@dhhoang
Copy link
Author

dhhoang commented Aug 12, 2023

Finally got the time to work on this! I've finished the test cases:

  • Added scripts to generate test certificates for unit tests on Windows (pwsh) and Unix (bash)
  • Added the logic needed for integration test (Linux/Docker):
    • Trusted certificates with TLS/mTLS.
    • Untrusted certificates with custom trust store.
    • Negative validation for cases above.
  • Manual testing on Windows for net6.0 net7.0 and net462 (i.e. gRPC using C binding).

Marking this PR as ready for review.

@dhhoang dhhoang marked this pull request as ready for review August 12, 2023 10:26
Comment on lines +280 to +286
handler.ServerCertificateCustomValidationCallback = (message, serverCert, chain, policyErrs) =>
{
chain.ChainPolicy.TrustMode = X509ChainTrustMode.CustomRootTrust;
chain.ChainPolicy.CustomTrustStore.Add(trustedCA);
var buildResult = chain.Build(serverCert);
return buildResult;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an expert on these things, so still need to do some of my own research to understand if this is sufficient.

One thing that leaves me a bit unsure here is that policyErrs is ignored. My understanding is the "out-of-the-box" validations would be reflected in policyErrs (e.g., is the cert signed by a trusted CA, is the cert expired, etc). I suspect we want the custom validation here to be in addition to the out-of-the-box validations.

Copy link
Author

@dhhoang dhhoang Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct that policyErrs would contain the error that the "default" validation method returns. The default validator would trust the certificates in the root trust store (aka certificates "installed" on the machine).
If user is manually trusting a certificate, it means the certificate might not be in the machine root store and we need to at least ignore that error.

The other validation criteria (e.g. expiration, revocation) is encoded in the X509Chain parameter. In this implementation I keep them as default settings (other than TrustMode of course). This means that other failure criteria would still work (e.g. the trusted cert still fails if it's expired).

@alanwest
Copy link
Member

Hey @dhhoang, thanks for your patience. I've had limited time to give this PR a thorough review, so I appreciate you bearing with me 😄 - I am looking forward to getting this support landed!

Since this is your first PR to this repo, would you mind landing it in a few iterations? If you land a small PR first, then your "first-time contributor" status will go away and then on subsequent PRs the CI will just run automatically.

Here's a suggestion for breaking this PR up into multiple PRs:

  1. First PR I really like what you're doing by generating the necessary certificates/keys via MSBuild. First PR could be to improve the existing integration test to use this new method. I like that it stores the certs in the build output directory. We should just be able to configure the docker compose volume to point to the output directory. Also, I think we could then do away with the create-cert container that spins up via docker compose since now you have the cert generated at build time.
  2. Second PR mTLS support. This seems to be the more straight-forward piece to implement from the standpoint of the APIs used to add client certificates to the handler.
  3. Third PR Server certificate support. I think the validation you're doing is pretty close to what we need, but I'm keen on splitting this into its own PR because I think we'll be able to solicit a review from a security expert at Microsoft. I've researched the validation a bit and found a few examples that I believe are closer to what we'll want. Here's just one example Add TrustedRootCertificates property to HttpClientHandler dotnet/runtime#39835 (comment).

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.

@github-actions github-actions bot added the Stale Issues and pull requests which have been flagged for closing due to inactivity label Aug 29, 2023
@alanwest alanwest removed the Stale Issues and pull requests which have been flagged for closing due to inactivity label Aug 30, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Sep 6, 2023

This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.

@github-actions github-actions bot added the Stale Issues and pull requests which have been flagged for closing due to inactivity label Sep 6, 2023
@github-actions
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@sandy2008
Copy link
Contributor

@alanwest Hi! #5818 We made the first PR accordingly, would you mind taking a look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Stale Issues and pull requests which have been flagged for closing due to inactivity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add OTLP Exporter TLS/mTLS configuration options
3 participants