fix: don't necessarily include all artifacts from templates in node outputs #13066
Check failures on line 410 in test/e2e/artifacts_test.go: GitHub Actions / Lint; GitHub Actions / E2E Tests (test-executor, minimal); GitHub Actions / E2E Tests (test-executor, v1.26.15+k3s1, minimal)
Instead of updating this directly, create a new list of the artifacts that we actually saved and return that.
LGTM. So before, we were using `we.Template.Outputs.Artifacts` to pass to `ReportOutputs`, but now we are creating a fresh `wfv1.Artifacts` object that will only include the successfully saved artifacts. This makes sense.
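For illustration, the resulting shape might look roughly like the sketch below; the `save` callback and the signatures here are assumptions on my part, not the literal diff:

```go
import (
	"context"

	wfv1 "github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1"
)

// Illustrative sketch only: build and return a fresh list containing just
// the output artifacts that were actually saved, instead of mutating the
// template's artifact list in place. The save callback stands in for the
// executor's per-artifact save logic; its signature is assumed.
func collectSavedArtifacts(ctx context.Context, templateArts wfv1.Artifacts,
	save func(ctx context.Context, art *wfv1.Artifact) (bool, error)) (wfv1.Artifacts, error) {

	saved := wfv1.Artifacts{}
	for _, art := range templateArts {
		ok, err := save(ctx, &art) // ok=false: skipped, e.g. an optional artifact that wasn't produced
		if err != nil {
			return saved, err
		}
		if ok {
			// Only artifacts that were actually saved end up in the reported outputs.
			saved = append(saved, art)
		}
	}
	return saved, nil
}
```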
Hmmm, I do see a small issue here, but I think this already existed beforehand: if the `wait` container is interrupted during this loop, some artifacts may be saved to S3 (etc.) but not reported. But since adding artifacts to `Outputs.Artifacts` wouldn't report them until the very end anyway, I think this doesn't change any existing behavior.
How do you mean it wouldn't be reported? Since the background context is passed into both `SaveArtifacts()` and `ReportOutputs()` below, I believe all of that logic is still supposed to be executed. Is there something different you're referring to besides the context being cancelled?
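The pattern I'm referring to is roughly this (a sketch with assumed names and signatures, not the actual argoexec code):

```go
// Sketch: use a background context for saving and reporting outputs so that
// cancellation of the request/signal-scoped context doesn't cut them short.
// A non-graceful kill of the wait container still ends this mid-flight.
func finalizeOutputs(we *WorkflowExecutor) error {
	bgCtx := context.Background() // deliberately not derived from the cancellable context

	arts, err := we.SaveArtifacts(bgCtx)
	if err != nil {
		return err
	}
	return we.ReportOutputs(bgCtx, arts)
}
```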
If the loop is long or the container is otherwise non-gracefully terminated, the request context won't matter.
Also realized that saving artifacts can be parallelized here to reduce that chance and improve throughput (related: #12442). Similarly pre-existing logic though.
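Roughly what I have in mind, as a sketch (bounded concurrency via `golang.org/x/sync/errgroup`); whether the per-artifact save path is safe to call concurrently is an assumption that would need checking:

```go
import (
	"context"
	"sync"

	"golang.org/x/sync/errgroup"

	wfv1 "github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1"
)

// saveArtifactsParallel uploads output artifacts concurrently (bounded) so a
// long sequential loop is less likely to be cut off part-way, and returns
// only the artifacts that were actually saved. Sketch only; the save callback
// stands in for the executor's per-artifact save logic.
func saveArtifactsParallel(ctx context.Context, arts wfv1.Artifacts,
	save func(context.Context, *wfv1.Artifact) error) (wfv1.Artifacts, error) {

	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(4) // cap concurrent uploads; the right limit is a tuning question

	var mu sync.Mutex
	saved := wfv1.Artifacts{}

	for i := range arts {
		art := arts[i] // each goroutine works on its own copy
		g.Go(func() error {
			if err := save(ctx, &art); err != nil {
				return err
			}
			mu.Lock()
			saved = append(saved, art)
			mu.Unlock()
			return nil
		})
	}
	// Wait for all uploads before reading saved; it holds what completed.
	err := g.Wait()
	return saved, err
}
```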
So, this is basically if this Pod receives a SIGKILL? I don't think there's anything that can be done in that case, is there? (at least from the Workflow Pod side)
More or less yes, but there are things we can do to mitigate that or detect that scenario, such as parallel uploads and writing pieces of the report as each artifact completes uploading (although k8s does not have streaming requests afaik, so that would put more strain on the api server when there are many artifacts).
I also recently remembered we had #12413 merged for 3.6 that does help prevent that too.
(To be clear, this is pre-existing logic / an incidental finding, so I'm not going to block review on it, but there are perhaps actions we should take around this case. It might be better discussed in #12993, although that's a specific case.)
sounds good