fix: controlling issues #5756
Conversation
```diff
-if nodeName != prevNodeName || podIP != prevPodIP || prevStatus != status || prevCurrent != current {
+// TODO: the final status should always have the finishedAt too,
+// there should be no need for checking isFinished diff
+if nodeName != prevNodeName || isFinished != prevIsFinished || podIP != prevPodIP || prevStatus != status || prevCurrent != current {
```
Such code will not be simple to maintain.
I fully agree, although this code is rather not meant to be maintained. This PR, along with the actual bugfixes, is meant to "enable" the new orchestration with a similar watching system.
WatchInstrumentedPod and TestWorkflowResult (mainly) have so many edge cases handled, and so much clock calibration and auto-healing implemented, that at this point it's probably better to just rewrite them based on the observations we have from the last few months of running Test Workflows. After that, we will likely not need conditions like these at all.
I'm guessing that half of the healing mechanisms and edge-case handlers are no longer needed, considering the iterations of orchestration improvements.
pkg/testworkflows/testworkflowcontroller/watchinstrumentedpod.go
* fix: continue paused container when the abort is requested
* fix: ensure the lightweight container watcher will get `finishedAt` timestamp
* chore: add minor todos
* fix: configure no preemption policy by default for Test Workflows
* fix: allow Test Workflow status notifier to update "Aborted" status with details
* fix: ensure the parallel workers will not end without result
* fix: properly build timestamps and detect finished result in the TestWorkflowResult model
* fix: use Pod/Job StatusConditions for detecting the status, make watching more resilient to external problems, expose more Kubernetes error details
* chore: do not require job/pod events when fetching logs of parallel workers and services
* fixup unit tests
* fix: delete preemption policy setup
* fixup unit tests
* fix: adjust resume time to avoid negative duration
* fix: calibrate clocks
* chore: use consts
* fixup unit tests
Pull request description
Checklist (choose what's happened)
Breaking changes
Changes
Fixes