Proposing a TEP to ignore step errors and provide an option to continue after capturing the non-zero exit code. Also document the container termination state so that it can be accessed after the pipeline execution finishes.
1 parent 97f1064, commit 9c3ae5f
Showing 2 changed files with 173 additions and 0 deletions.
@@ -0,0 +1,172 @@
---
status: proposed
title: 'Ignore Step Errors'
creation-date: '2021-01-06'
last-updated: '2021-02-03'
authors:
- '@pritidesai'
- '@afrittoli'
- '@skaegi'
---

# TEP-0040: Ignore Step Errors

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Requirements](#requirements)
  - [Use Cases](#use-cases)
- [References](#references)
<!-- /toc -->

## Summary

A Tekton task is defined as a collection of steps, each of which specifies a container image to run.
Steps are executed in the order in which they are specified. A single step failure results in a task failure,
i.e. once a step fails, the rest of the steps are not executed. When a container exits with a
non-zero exit code, the step results in an error:

```yaml
$ kubectl get tr failing-taskrun-hw5xj -o json | jq .status.steps
[
  {
    "container": "step-failing-step",
    "imageID": "...",
    "name": "failing-step",
    "terminated": {
      "containerID": "...",
      "exitCode": 244,
      "finishedAt": "2021-02-02T18:27:46Z",
      "reason": "Error",
      "startedAt": "2021-02-02T18:27:46Z"
    }
  }
]
```

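The exit code of a failed step is recorded under `terminated.exitCode` in this status. As a minimal sketch, assuming
`jq` is installed, a filter like the following could list only the steps that terminated with a non-zero exit code.
The `failed_steps` helper name is hypothetical, and the sample document is trimmed to the fields shown above:

```shell
#!/bin/sh
# Hypothetical helper: read TaskRun JSON on stdin and print
# "<step name> <exit code>" for each step that terminated with a non-zero code.
failed_steps() {
  jq -r '.status.steps[]
         | select(.terminated != null and .terminated.exitCode != 0)
         | "\(.name) \(.terminated.exitCode)"'
}

# Sample input trimmed to the relevant fields of the status shown above.
# Against a cluster, the input would instead come from:
#   kubectl get tr failing-taskrun-hw5xj -o json | failed_steps
sample_taskrun='{"status":{"steps":[{"name":"failing-step","terminated":{"exitCode":244,"reason":"Error"}}]}}'

# prints: failing-step 244
printf '%s\n' "$sample_taskrun" | failed_steps
```
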
A `TaskRun` with such a step error stops executing subsequent steps and results in a failure:

```yaml
$ kubectl get tr failing-taskrun-hw5xj -o json | jq .status.conditions
[
  {
    "lastTransitionTime": "2021-02-02T18:27:47Z",
    "message": "\"step-failing-step\" exited with code 244 (image: \"...\"); for logs run: kubectl -n default logs failing-taskrun-hw5xj-pod-wj6vn -c step-failing-step\n",
    "reason": "Failed",
    "status": "False",
    "type": "Succeeded"
  }
]
```

If a task with a failing step is part of a pipeline, the subsequent steps in that task are not executed
(as with a `TaskRun`), and the `PipelineRun` stops executing any other task in the pipeline, which results in a
pipeline failure:

```yaml
$ kubectl get pr pipelinerun-with-failing-step-csmjr -o json | jq .status.conditions
[
  {
    "lastTransitionTime": "2021-02-02T18:51:15Z",
    "message": "Tasks Completed: 1 (Failed: 1, Cancelled 0), Skipped: 3",
    "reason": "Failed",
    "status": "False",
    "type": "Succeeded"
  }
]
```

Many common tasks require that a step failure not stop the rest of the steps from executing.
To continue executing subsequent steps, task authors can wrap the image's command in a script and
exit that step with success. This turns the failing step into a success and does not block further
execution. However, this is a workaround and only works with images that can be wrapped:

```yaml
steps:
  - image: docker.io/library/golang:latest
    name: ignore-unit-test-failure
    script: |
      # `||` captures the failure even when the shell runs with `set -e`.
      TEST_EXIT_CODE=0
      go test . || TEST_EXIT_CODE=$?
      if [ "$TEST_EXIT_CODE" != 0 ]; then
        # Report success so that the remaining steps still run.
        exit 0
      fi
```

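The wrapper above always exits with success, so the original exit code of `go test` is lost. A minimal sketch of a
variant that records the code before reporting success, assuming the image has a shell and the file is written
somewhere later steps can read (for example a workspace). The `run_and_record` helper and the `exit_code` file name
are hypothetical and not part of Tekton; a later step could read the file to decide how to report the result:

```shell
#!/bin/sh
# Hypothetical helper, not part of Tekton: run a command, record its exit
# code in a file, and always return success so that later steps still run.
run_and_record() {
  record_file=$1
  shift
  # Running the command as an `if` condition keeps `set -e` from aborting.
  if "$@"; then
    code=0
  else
    code=$?
  fi
  printf '%s\n' "$code" > "$record_file"
  return 0
}

# Example: a command that fails with exit code 3 (a stand-in for `go test .`).
record_dir=$(mktemp -d)
run_and_record "$record_dir/exit_code" sh -c 'exit 3'
echo "step exit status: $?"                                 # prints: step exit status: 0
echo "recorded exit code: $(cat "$record_dir/exit_code")"   # prints: recorded exit code: 3
```
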
This workaround does not apply to off-the-shelf container images.

Similarly, many pipelines need to continue executing the rest of their tasks by preventing such a step failure
from failing the task.

As a pipeline execution engine, we want to support off-the-shelf container images as steps and provide
an option to ignore such step errors. The task author can choose to continue execution, capture the original non-zero
exit code, and make it available to the rest of the steps in that task. This also gives a pipeline
author the option to continue executing the rest of the tasks by ignoring a step failure, while still allowing access
to the original non-zero exit code of that step.

Issue: [tektoncd/pipeline#2800](https://github.com/tektoncd/pipeline/issues/2800)

## Motivation

It should be possible to easily use off-the-shelf (OTS) images as steps in Tekton tasks. A task author has no
control over such an image but may want to ignore an error and continue executing the rest of the steps.

Another motivation for this proposal is to expose step-level failure at the `pipelineTask` level to support arbitrary
tasks from the catalog. For example, allowing step-level failure handling to be configured at pipeline authoring time
lets the pipeline author use catalog tasks even when the author has no control over the task
definition.

**Note:** These two motivations might lead to separate API changes (the former at the task level, the latter at the
pipeline level), but the changes should ideally be consistent.

### Goals

Design a step failure strategy so that the task author can control the behaviour of the underlying step and decide
whether to continue executing the rest of the steps in the event of a failure.

Store the step container's termination state and make it accessible to the rest of the steps in a task.

Be applicable to any container image, including custom and off-the-shelf images.

### Non-Goals

This design is limited to a step within a task and does not try to address the `pipelineTask`-level failure case.

## Requirements

* Users should be able to use prebuilt images as-is, without having to find out whether a shell or similar capability
  exists in an image and then altering the entrypoint to capture errors.

* It should be possible to know that a step failed and that subsequent steps were allowed to continue by observing the
  status of the `TaskRun` (and `PipelineRun` if applicable).

* When a step is allowed to fail, the exit code of the process that failed should not be lost and should at a minimum be
  available in the status of the `TaskRun` (and `PipelineRun` if applicable).

### Use Cases

* As a task author, I would like to design a task in which one or more steps running unit tests might fail
  but the task still succeeds, so that a later task can analyze and report the results.

* As a new Tekton user, I want to migrate existing scripts and automations from other CI/CD systems that supported a
  similar step-level unit of failure.

* A [platform team](https://github.com/tektoncd/community/blob/master/user-profiles.md#1-pipeline-and-task-authors)
  wants to share a `Task` with their team which runs the following steps in sequence:
  * Run unit tests (which may fail)
  * Apply a transformation to the test results (e.g. convert them to a certain format such as JUnit)
  * Upload the results to a central location used by all the teams

* As a pipeline author, I would like to use a shared `Task` (which may result in a step error) and configure the
  `pipelineTask` to ignore such step errors.

## References

* [Capture Exit Code, tektoncd/pipeline#2800](https://github.com/tektoncd/pipeline/issues/2800)
* [Add a field to Step that allows it to ignore failed prior Steps *within the same Task*, tektoncd/pipeline#1559](https://github.com/tektoncd/pipeline/issues/1559)
* [Scott's Changes to allow steps to run regardless of previous step errors](https://github.com/tektoncd/pipeline/pull/1573)
* [Christie's Notes](https://docs.google.com/document/d/11wygsRe2d4G-wTJMddIdBgSOB5TpsWCqGGACSXusy_U/edit?resourcekey=0-skOAYQiz0xIktxYxCm-SFg) - Thank You, Christie!