Account for finally TaskRun retries in PR timeouts #4508
[APPROVALNOTIFIER] This PR is NOT APPROVED.
The following is the coverage report on the affected files.
thanks @lbernick!
could we add some documentation updates for this? possibly in the pipelinerun timeouts section: https://github.com/tektoncd/pipeline/blob/main/docs/pipelineruns.md#configuring-a-failure-timeout
Done!
Prior to this commit, the PipelineRun reconciler did not account for time elapsed during a finally TaskRun's retries when setting its timeout. This resulted in `pipelinerun.timeouts.finally` not being respected when a finally TaskRun was retried. This commit updates the finally TaskRun's timeout to account for time elapsed during retries.
Co-authored-by: Jerop Kipruto <jerop@google.com>
    if t.IsCustomTask() {
        r := t.Run
        if r == nil {
            return nil
        }
        startTime = r.Status.StartTime
        if startTime.IsZero() {
            if len(r.Status.RetriesStatus) == 0 {
                return startTime
            }
            startTime = &metav1.Time{Time: c.Now()}
        }
        for _, retry := range r.Status.RetriesStatus {
            if retry.StartTime.Time.Before(startTime.Time) {
                startTime = retry.StartTime
            }
        }
        return startTime
    }
    tr := t.TaskRun
    if tr == nil {
        return nil
    }
    startTime = tr.Status.StartTime
    if startTime.IsZero() {
        if len(tr.Status.RetriesStatus) == 0 {
            return startTime
        }
        startTime = &metav1.Time{Time: c.Now()}
    }
    for _, retry := range tr.Status.RetriesStatus {
        if retry.StartTime.Time.Before(startTime.Time) {
            startTime = retry.StartTime
        }
    }
    return startTime
I wonder if there's a way we can reduce the duplication here with an interface? These types seem very similar (e.g. they are the same base data with extra data layered on top), so it might make sense for them to depend on a common base with an interface that can extract out the common values.
        startTime = &metav1.Time{Time: c.Now()}
    }
    for _, retry := range tr.Status.RetriesStatus {
        if retry.StartTime.Time.Before(startTime.Time) {
Will this actually work? 🤔 The API documentation says: "All TaskRunStatus stored in RetriesStatus will have no date within the RetriesStatus as is redundant."
This might just be out of date documentation, but we should double check this and ideally add a test that goes through a full reconcile loop if we can - IIUC the current reconciler tests just mock this by injecting a simulated resolved Task with the retry values set.
You're right, this actually does not work, but not for this reason (docs are out of date; retries status does have a timestamp). I think #4409 might be affecting how this works; it's possible we need to address this bug before addressing this TODO.
@@ -174,6 +176,49 @@ func (t ResolvedPipelineRunTask) IsStarted() bool {
    return t.TaskRun != nil && t.TaskRun.Status.GetCondition(apis.ConditionSucceeded) != nil
}

// FirstAttemptStartTime returns the start time of the first time the ResolvedPipelineRunTask was attempted.
// Returns nil if no attempt has been started.
func (t *ResolvedPipelineRunTask) FirstAttemptStartTime(c clock.Clock) *metav1.Time {
Do we want clock based funcs to be exposed to clients, or is this an implementation detail that should be unexported?
/hold
@lbernick given the hold, may I move this to the next milestone? hoping to release 0.34 today
yup sounds good!
@lbernick: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
releasing 0.35 today, moving this to the next milestone! 🤞
Closing since this is not being actively worked on.
Changes
Prior to this commit, the PipelineRun reconciler did not account for time elapsed during a finally TaskRun's retries when setting its timeout. This resulted in `pipelinerun.timeouts.finally` not being respected when a finally TaskRun was retried.
This commit updates the finally TaskRun's timeout to account for time elapsed during retries.
Closes #4071.
Co-authored-by: Jerop Kipruto <jerop@google.com> @jerop
/kind bug