PipelineRun timeouts delete TaskRun related pods #4035

Open
rporres opened this issue Jun 11, 2021 · 8 comments
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@rporres
rporres commented Jun 11, 2021

PipelineRun timeouts are currently implemented by deleting the pods associated with the TaskRuns. While this is an understandable implementation, since there is no easy way to stop a pod, it makes debugging a failed PipelineRun impossible. It also took me a while to understand why the related pods were being deleted.

In its current form, the PipelineRun timeout may not be very helpful for people who need to access the logs of the failed TaskRuns. It would help to add a note to the docs about how timeouts are implemented, so that people can make an informed choice about whether to use the feature.

Since pipeline timeouts are a very desirable feature, they should be implemented in a way that keeps the underlying pods available for debugging purposes.

@vdemeester
Member

/kind feature

One possible avenue to explore here would be to make the entrypoint handle the timeout somehow. That way, we could stop the execution of the pod/containers without deleting it.

@tekton-robot added the kind/feature label Jun 11, 2021
@afrittoli
Member

Related to the behaviour of "cancel" - see #3238

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot added the lifecycle/stale label Oct 15, 2021
@rporres
Author

rporres commented Oct 22, 2021

/remove-lifecycle stale

I doubt we want to close this. It is an important feature to have, as it will make the timeouts really useful.

@tekton-robot removed the lifecycle/stale label Oct 22, 2021
@tekton-robot added the lifecycle/stale label Jan 20, 2022
@rporres
Author

rporres commented Jan 20, 2022

/remove-lifecycle stale

I doubt we want to close this. It is an important feature to have, as it will make the timeouts really useful.

@tekton-robot removed the lifecycle/stale label Jan 20, 2022
@tekton-robot added the lifecycle/stale label Apr 20, 2022
@vdemeester
Member

/lifecycle frozen
This should be more or less taken care of by #4618, as the timeout cancels the Task. Today cancel means delete, but with that proposed feature we would no longer delete the pod.

@tekton-robot added the lifecycle/frozen label and removed the lifecycle/stale label Apr 20, 2022