-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add caching, tracing, logging, and batching to the plugin #413
Add caching, tracing, logging, and batching to the plugin #413
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How extensively have you tested this? Have you deployed it to your instance with the large number of results?
My main concern is that the optimised database calls for only reading e.g the fail or skipped count are now being in Java so the full amount needs to be read into memory and filtered client side.
Its possible that most of the methods get called when this is used anyway so it may not impact much, or possibly there could either be more caches.
I assume you found a cache was needed because retrieval was slow? do you have any info on this?
The batch changes make full sense
src/test/java/io/jenkins/plugins/junit/storage/database/DatabaseTestResultStorageTest.java
Outdated
Show resolved
Hide resolved
src/test/java/io/jenkins/plugins/junit/storage/database/DatabaseTestResultStorageTest.java
Outdated
Show resolved
Hide resolved
As far as testing goes, the unit-tests at our company are quite large, over 29,000 test results in a single build. We're looking into using a plugin like this, but the performance was quite poor when first tried it (which is why I looked at improving the performance). At first I deployed it using mysql instead of postgres because I could not connect the database to Jenkins. Once I was able to deploy it using and connect it to a dockerized mysql database, I ran the following pipeline:
Basically, I downloaded the zip file with the large number of test results from our production Jenkins server and then
Then I ran the pipeline and it took a long time to store the test results (nearly 3 minutes to store the results using
So to answer your comment, yes I did a lot of testing to verify that my changes improved the performance. The telemetry that I added was also very useful in figuring out the bottle necks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting close
I think I've resolved all of your comments, please let me know if there is any thing else. I really appreciate all of the feedback. I'm going to do a bit more testing, especially with junit attachments plugin. |
- Update DatabaseTestResultStorage: * add caseResultsCache * add packageResultsCache * update publish() method to use batch updates to improve performance * Add manual trace creation (annotation based tracing is only supported using an open telemetry agent) * add withSpan(), createSpan(), addSqlAttributes(), addPackageAttributes(), and getTracer() helper methods for creating manual traces to the code * add logger - Update DatabaseTestResultStorageTest: * add basic unit-test with a mocked database * move duplicate code segments to constants and methods * print tables with proper column widths (and truncate values that are too large) * use LoggerRule for configuring the log level of the DatabaseTestResultStorage's logger - Update pom.xml: * add dependency on io.jenkins.plugins:opentelemetry plugin * update dependencyManagements: fix kotlin and errorprone dependency versions cause build to fail due to multiple version being pulled in by transitive dependencies * add io.jenkins.plugins:caffine dependency (for caching) * add dependencyManagement for `kotlin-std-lib-jkd*` to fix dependency resolution issues - update docker-compose.yaml: * add jaeger server * add jenkins server: skip setup wizard and add otel exporter (to jaeger) - Add Dockerfile: * creates an instance of the Jenkins docker image (used as the 'jenkins' image in the docker-compose.yaml file) * install plugins using jenkins-plugin-cli: configuration-as-code, database-postgresql, and opentelemetry * install plugin junit-sql-plugin from the target folder and configure jenkins using jenkins-config.yaml file - Add deploy.sh: bash script for compiling the plugin, building the 'jenkins' docker image, and deploying the docker-compose.yaml file - Add jenkins-config.yaml: * add settings for postgresql plugin * configure the openTelemetry plugin * configure the junit to store results to database - Update README.md: * Add examples on how to build plugin * Add example on how to deploy docker compose swarm * Add example on how to install the compiled plugin * Add example on how to configure jenkins using configuration-as-code plugin * Add note on Accessing jaeger * Add note on how to access the db inside of docker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good, I've tested it out, fixed up the deploy script to work with build and a couple of other minor things.
I've found one issue though with the open telemetry integration, I would expect the spans to be linked to the build but they aren't:
I'm wondering if the junit
step needs a StepHandler
?
see also:
jenkinsci/opentelemetry-plugin#850
I couldn't see any TRACEPARENT
env values being set
cc @cyrille-leclerc or @kuisathaverat if you can provide any advice
statement.setNull(12, Types.VARCHAR); | ||
@Override | ||
public void publish(TestResult result, TaskListener listener) throws IOException { | ||
var publishSpan = createSpan("DatabaseTestResultStorage.RemotePublisherImpl.publish"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell this only works for agents on the controller not remote agents
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting -Dotel.java.global-autoconfigure.enabled=true
on the agent seems to make it work
I don't think its |
Continued at #414 |
Shoudn't the opentelemetry plugin be optional ? What about jenkins instance using SQL storage but not connected to an Opentelemetry collector ? From what I remember few month ago is that having OpenTelemetry plugin installed will display some deak links on job/build page when the plugin is not configured |
Possibly, would be better to remove those dead links I think if the plugin isn't configured at all. Plugins should be able to integrate without it adding a bunch of things to the UI |
Thanks for your interest in the opentelemetry plugin. |
just raised a PR jenkinsci/opentelemetry-plugin#868 |
you have to enable the export of the environment variables that define the TRACEPARENT, but I not sure how this plugin get the Opentelemetry context, it probably has to retrieve it from the job not front and environment variable. We store the context in actions in the job. |
I was able to get it in #414 |
@craigtmoore @timja the way you integrate OpenTelemetry is great. I saw @timja' PR, Ill review it it asap, I have to understand the challenge you faced :-) |
Anything running on the controller I think would be quite straightforward, but running on agents is a bit more complicated, its not the nicest / cleanest implementation in #414 |
Adds caching using Caffine cache, logging with java.util.logging, tracing using opentelemtry, and inserts the data into the database using batching. We have some builds on our Jenkins instance that have over 29,000 test results so we added these changes to optimize performance.
Fixes #412
kotlin-std-lib-jkd8
to fix dependency resolution issueshpi
goal to thecompile
phaseTesting done
I re-used the existing tests to verify that I've introduced no new regressions, but also addeed a new unit-test to the
getCaseResults_mockDatabase()
method and verify that it loads the data correctly.Submitter checklist