make the OTLP exporters non exclusive #147
Great job! I think this will reduce CI build times by a huge amount, and it will allow for more flexibility. It might even fix one of @prasek's concerns from when he set up federation-demo with grpc. Huge +1 from me, as soon as the cargo xtask commands are up to date.
with 9180bf4 the total CI time (with cached cargo dir) is now 12mn44s (down from 16mn) https://app.circleci.com/pipelines/github/apollographql/router/204/workflows/18c48a92-70b6-4769-97b0-d572d5e87e8d
this allows users to choose which exporter they want without recompiling the router, at the price of a larger binary (unstripped binary goes from 59MB to 66MB)
updates the opentelemetry dependency to get access to SpanExporterBuilder that was not public before
since the features are not mutually exclusive now, we can do all tests in one pass, and reduce the CI time
- pre build the xtask binary so it can be cached
- deactivate circleci/rust cache since we already have a global cache step
- run the main compilation and tests with the same feature as xtask test to avoid a full rebuild
this is far too slow on other platforms than linux
I hereby declare this PR open for review and will avoid messing with the CI for a while now 😬
circleci tries to find a cache for one of the keys, in order, by prefix matching. In order:
- we look for a cache for the same branch (or PR) and the same Cargo.lock. This is the key under which we will save a cache if it is not present, to have an up to date cache for the current PR and make checks faster
- we look for a cache from the main branch with the same Cargo.lock, which means it will have all the dependencies already downloaded and compiled
- if Cargo.lock changed, we start from the latest cache generated on main, it is likely it will have some dependencies in common
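The lookup order described in this commit message can be sketched roughly as follows. This is only an illustration of ordered prefix matching, not CircleCI's implementation: the `rust-` key layout and the function names are hypothetical, and `stored` is assumed to be listed newest first.

```rust
/// Build the ordered list of candidate key prefixes.
/// The "rust-<branch>-<checksum>" layout is a made-up example format.
fn candidate_keys(branch: &str, lock_checksum: &str) -> Vec<String> {
    vec![
        // 1. same branch (or PR) and same Cargo.lock
        format!("rust-{}-{}", branch, lock_checksum),
        // 2. main branch with the same Cargo.lock
        format!("rust-main-{}", lock_checksum),
        // 3. any cache generated on main
        "rust-main-".to_string(),
    ]
}

/// Try each candidate prefix in order and return the first stored key
/// that starts with it. `stored` is assumed to be sorted newest first.
fn restore_cache<'a>(candidates: &[String], stored: &'a [&'a str]) -> Option<&'a str> {
    candidates.iter().find_map(|prefix| {
        stored
            .iter()
            .copied()
            .find(|key| key.starts_with(prefix.as_str()))
    })
}
```

With stored keys `rust-main-abc` and `rust-main-old`, a PR branch whose Cargo.lock checksum is `abc` would fall through the first candidate and restore `rust-main-abc` from the second one.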
I have a few questions here so please bear with me 🙏
HTTP is using OTLP/HTTP which is not the same protocol as OTLP/gRPC. It can support HTTPS since it's based on reqwest. If we want to pass specific parameters (like certificate authority or client cert) we can do that by creating a reqwest client manually and using the with_http_client method
It does not make sense to tell users that to use OTLP/gRPC they can download the released binary, but for OTLP/HTTP they need to build from source. We could make different binaries for each feature set. Or make them all available at once as is done in this PR.
I want to reduce the number of paths. Right now, we have otlp-tonic, otlp-http, otlp-grpcio and tls, which modifies otlp-tonic. We already have a case where building a feature separately from the others results in a compilation error: building only with tls, because it should depend on otlp-tonic. Increasing the number of features is increasing the risk that some of them are not properly tested. What I want is:
We're only testing that configuration files are deserialized properly; the tests are already not 100% complete.
My goal here is exactly that: one simple build path. The difference is that for me that build path should exercise all the features. Also we are not in a case with multiple features modifying graphql interpretation in potentially incompatible ways, where I'd want the CI to test various combinations. We're in a case where we have multiple incompletely tested telemetry options that will not conflict because we choose only one at runtime. And I trust that if we change that behaviour later it will be obvious in code reviews.
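As a rough illustration of "we choose only one at runtime": with every exporter compiled in, the choice can come down to a configuration value. The sketch below is not the router's actual code; `OtlpProtocol`, `parse_protocol` and `default_endpoint` are hypothetical names, and the ports are the standard OTLP defaults (4317 for gRPC, 4318 for HTTP).

```rust
/// Exporter protocols assumed to be compiled into the same binary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OtlpProtocol {
    Grpc,
    Http,
}

/// Pick the protocol from a configuration value (hypothetical format).
fn parse_protocol(value: &str) -> Result<OtlpProtocol, String> {
    match value.to_ascii_lowercase().as_str() {
        "grpc" => Ok(OtlpProtocol::Grpc),
        "http" => Ok(OtlpProtocol::Http),
        other => Err(format!("unknown OTLP protocol: {}", other)),
    }
}

/// Default collector endpoint for each protocol (standard OTLP ports).
fn default_endpoint(protocol: OtlpProtocol) -> &'static str {
    match protocol {
        OtlpProtocol::Grpc => "http://localhost:4317",
        OtlpProtocol::Http => "http://localhost:4318",
    }
}
```

Because only one variant is selected per run, the exporters never interact, which is the argument made above for not testing every feature combination separately.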
The test coverage stays the same: all of the tests are executed, and I even added more. I get that you don't feel affected too much by the build times, but in my experience, improving build times locally and in CI has a great impact on development velocity. That means we can run a lot more tests, have a quicker feedback loop between a change and tests passing, and switch faster between branches, without having to wait.
I guess I'm not too impacted either way so... if @BrynCooke agrees with this PR, let's go for it
There is a way for us to fan out in CI and run each variant of the features in parallel. This would allow us to check each valid permutation of features and keep a steady build time. However, this doesn't seem to be our concern here, I would have expected Would it be worth opening an issue or a discussion? I feel like this is more of a business decision.
Multiple points to address here:
The point of xtask is exactly to make it work right out of the box. Right now the repository builds properly locally when you select the set of features manually using
xtask is doing too much or not enough right now. It can be the source of truth for feature permutations, and have an option to build them all in sequence, but it should allow a way to parallelize for CI.
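To make the "source of truth for feature permutations" idea concrete, here is a hedged sketch of how a tool could enumerate the valid feature sets so that CI can run one job per set. The feature names come from this thread; the single constraint (`tls` requires `otlp-tonic`) is the one mentioned earlier, and whether it is the only one is an assumption. The function names are hypothetical and this is not the existing xtask code.

```rust
/// Feature names mentioned in this discussion.
const FEATURES: [&str; 4] = ["otlp-tonic", "otlp-http", "otlp-grpcio", "tls"];

/// Assumed constraint: `tls` only makes sense together with `otlp-tonic`.
fn is_valid(set: &[&str]) -> bool {
    !set.contains(&"tls") || set.contains(&"otlp-tonic")
}

/// Enumerate every subset of FEATURES (via a bitmask) and keep the valid ones.
fn valid_feature_sets() -> Vec<Vec<&'static str>> {
    (0u32..(1u32 << FEATURES.len()))
        .map(|mask| {
            FEATURES
                .iter()
                .enumerate()
                .filter(|(i, _)| mask & (1u32 << *i) != 0)
                .map(|(_, feature)| *feature)
                .collect::<Vec<_>>()
        })
        .filter(|set| is_valid(set))
        .collect()
}

/// Render the cargo invocation a parallel CI job could run for one set.
fn cargo_test_command(set: &[&str]) -> String {
    format!(
        "cargo test --no-default-features --features \"{}\"",
        set.join(",")
    )
}
```

Each entry of `valid_feature_sets()` could then be handed to a separate CI job, while a local run could simply loop over them in sequence.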
@Geal yes, there is. I've opened #160 so we can go back to the requirements / problem space and define the expected role for xtask and the ci script, which I can then implement. This will hopefully allow us to keep this pull request scoped on merging the features, and land it in a reasonable amount of time, while getting buy in on what xtask and the script should do. Edit: Although I'm all for merging the features, we might wanna make sure everyone else also is (cc @abernix @BrynCooke)
Ok I have answered in the discussion then
a proposal: I shuffle my tree and make a separate PR with only the changes to configuration/OTLP and #153 (which is based on this PR), and we put off the decision about what to do or not with features, CI and caches to @o0Ignition0o's PR on CI that goes into much more detail than what I did.
Oh sorry I thought I had replied before, this sounds good to me!
closed in favor of #153
this allows users to choose which exporter they want without recompiling the router, at the price of a larger binary (unstripped binary goes from 59MB to 66MB)
I started on this to see if I could reduce the time spent in xtask test in CI, but I think it makes more sense anyway to have all the exporters available in the released version
This PR also includes CI tuning to reduce the time spent, mainly by:
The result:
Cargo.lock changes): 18mn https://app.circleci.com/pipelines/github/apollographql/router/237/workflows/c3e173d9-ff51-40a5-b816-b5d500115eca