TEP-0011: Redirecting Step Output Streams #182

Merged: 1 commit into tektoncd:master on Nov 2, 2020

Conversation

@chhsia0 (Contributor) commented on Aug 17, 2020:

TEP for pipeline#2925 and pipeline#3103.

Consuming the output of one step in another step is a common pattern when writing Tasks. However, this is currently tedious to do: Task authors have to overwrite the image entrypoint, using either `sh -c` or the `script` field, to wrap the command with an explicit output-stream redirection. This is not even possible if the image does not come with a shell, e.g., images built by ko.

To support output-stream redirection between steps and even Tasks, this TEP proposes adding new fields to steps that let Task authors specify the paths where stdout/stderr should be stored.
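As a rough illustration, a Task using this idea might look like the hedged sketch below. The `stdout` field name and its exact shape are precisely what this TEP is meant to settle, and `/data` is assumed to be a mount shared by both steps:

```yaml
# Hedged sketch only: "stdout" is the kind of field this TEP proposes (exact name/shape TBD),
# and /data is assumed to be a volume or workspace mounted into both steps.
steps:
  - name: boskosctl
    image: gcr.io/k8s-staging-boskos/boskosctl
    # args omitted; the point is that the image's own entrypoint runs unmodified
    stdout: /data/boskosctl-stdout      # proposed: duplicate stdout here while it still reaches the Pod log
  - name: parse-project-id
    image: imega/jq
    args: ["-r", ".name", "/data/boskosctl-stdout"]
```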

@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 17, 2020

### Risks and Mitigations

* Users might mistakenly specify paths not shared among steps for redirection. Potentially Tekton could implement a validation to warn users about such misuses.
Member:

Indeed, that might be a problem. We may want to restrict it to workspaces but it's gonna be tricky to validate

Member:

I think this is an argument for just always writing stdout/stderr to some Tekton-owned path and documenting where that is. If a later step wants to read it, they can.

@chhsia0 (Contributor, Author), Aug 18, 2020:

Yeah this is why I explicitly call this out in this section so we can discuss if "canonical paths" as you suggest is a better way.

@chhsia0 (Contributor, Author):

On the other hand, if we make it explicit that this is a path, most users would naturally be aware that if they want to use the same path in different containers, they have to make a shared mount point.


* Run image `gcr.io/k8s-staging-boskos/boskosctl` with args in a step and process its output through another `jq` image in a subsequent step to acquire the project name, without overwriting the image entrypoint, building a custom image with both utilities, or looking into container logs ([pipeline#2925](https://github.com/tektoncd/pipeline/issues/2925)).

* Run image `alpine/git` with args `["describe", "--tags"]` in a step and extract the stdout as a Task result, to be used by subsequent Tasks, without overwriting the image entrypoint.
Member:

FWIW, this should be doable with a script, instead of requiring a whole new image entrypoint.

script: git describe --tags > /tekton/results/tags

@chhsia0 (Contributor, Author), Aug 18, 2020:

script is essentially overwriting the entrypoint ;)

But yeah this is generally not a big problem, just a bit tedious. The only problem, again, is when the image doesn't have a shell and the command to run doesn't have a flag to write output to a file.

Member:

That is true. Maybe I'm not understanding what the downside is in that case. Using script mode in this way is pretty straightforward and doesn't require maintaining another image.

@ghost left a comment:

My main concern would be to document and push users towards using workspaces for the stdout / stderr files so that they can be shared around without accidentally exploding the size of results and the termination message.

I was wondering if you considered any of the alternatives described in the comments on tektoncd/pipeline#2925 - in particular inverting the control a little bit and allowing a subsequent step to provide a filter to run over a previous step's stdout, limiting the amount of data shuttled from step to step. Even if those alternatives are rejected it's probably still worth adding them to this doc to capture them, I think?

@chhsia0 (Contributor, Author) commented on Aug 18, 2020:

@sbwsg Sure, I'll go through pipeline#2925 and add the discussions here. As for promoting workspaces, I feel it's better to give users options but document the pros and cons and the suggested ways, as well as a validation and/or runtime check (which we already have, as you pointed out). I can imagine the most common use cases for sharing data through results are sending just a couple of bytes, and using workspaces (backed by PVC, I believe?) for such a purpose is too heavyweight.

@ghost commented on Aug 18, 2020:

I can imagine the most common use cases for sharing data through results are sending just a couple of bytes, and using workspaces (backed by PVC, I believe?) for such a purpose is too heavyweight.

Having read your example yaml in the TEP I think I understand a bit better where you're coming from. It looks like you're talking about the sort of approach where one step writes its stdout to a shared location on disk, and another step then filters through the data with jq and writes only a tiny portion of it to a result. That's great and I totally understand that use case and I think this is the perfect place for results. Similarly to @imjasonh I'm not actually sure this needs to be explicitly called out as a goal - Tekton will already work like this today so long as the user writes the stdout to a location shared by multiple steps.

However, I do want to be careful that when we come to document / promote the feature we have to be super clear to users that they cannot slam all of stdout/stderr from every step into a task result and expect it to "just work". I think that's where I'm getting stuck a bit - if a user sees this feature and adds stdout: $(results.stepXstdout.path) to every single Step they are going to have a very bad time. I think we could safely expect a number of very similar github bug reports to be generated around it very quickly and ideally I'd like to avoid that.

@ghost commented on Aug 18, 2020:

Also, just to add - PVCs are only required if the data needs to be shared between Tasks of a Pipeline. An emptyDir workspace is shared by all steps in a task and is as close to zero-configuration as a volume definition can get.
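For reference, a minimal sketch of the near-zero-configuration emptyDir workspace binding mentioned above, supplied at TaskRun time (resource names are illustrative):

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: example-run          # illustrative name
spec:
  taskRef:
    name: example-task       # illustrative name
  workspaces:
    - name: data             # matches a workspace declared by the Task
      emptyDir: {}           # shared by every step in the Task, discarded with the Pod
```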

@chhsia0 (Contributor, Author) commented on Aug 18, 2020:

For sharing things between steps within a task, workspaces aren't even needed. An emptyDir volume in the task is sufficient. See my example between the 1st and 2nd steps.

@chhsia0 (Contributor, Author) commented on Aug 18, 2020:

I call it out as a goal in the sense that I'd like to enable users to configure the paths, instead of using some built-in paths, in which case you wouldn't be able to make it work with Task results without an additional step. If this is not needed as a goal I could remove it.

@ghost commented on Aug 18, 2020:

I call it out as a goal in the sense that I'd like to enable users to configure the paths

Gotcha! Makes sense. I don't think you have to remove it in that case, but my suggestion would be to rephrase the goal to be "Enable a user to configure the path where the stream is written"? I got caught up on the result part and I missed the broader idea.

@chhsia0 chhsia0 requested review from vdemeester, imjasonh and a user August 18, 2020 21:11
@chhsia0 chhsia0 changed the title TEP-0011: Redirecting Step Inputs and Outputs TEP-0011: Redirecting Step Output Streams Aug 18, 2020
## Requirements

* Add new fields for a step to specify paths to redirect stdout/stderr to.
* Users should be able to observe the output streams through Pod logs even if `StdoutPath` or `StderrPath` is specified. In other words, output streams should be duplicated instead of simply redirected.
Reviewer:

Do you think this might become configurable one day? i.e. a user may decide they don't want the stdout logged to terminal, only the file?

I am wondering if that would change the desirable shape of the struct if so:

type Step struct {
  // ...
  StdoutConfig struct {
    ToPath string
    ToTerminal bool
  }
}

@chhsia0 (Contributor, Author):

It probably makes sense to have an additional layer of structure. Do you have anything else in mind that stdout could be redirected to? About the ToTerminal option, it should at least be turned on by default, mainly from a debuggability point of view: it would be impossible to retrieve the stdout from, e.g., an emptyDir volume if the task fails.

This extra layer, on the other hand, would make the config structure more verbose.

Reviewer:

Having thought about this a bit more - I wonder if "hiding" or limiting the stdout log would be more a runtime concern. Something to configure at the taskrun level instead of task. I'm o.k. to move ahead with the existing proposal of just the path string!


Once `StdoutPath` or `StderrPath` is specified, the corresponding output stream will be duplicated to both the given file and the standard output stream of the container, so users can still view the output through the Pod log API. If both `StdoutPath` and `StderrPath` are set to the same value, the two streams will be written to two file descriptors referring to the same file description, so outputs from both streams will be interleaved in the same file, but there will be no ordering guarantee on the data. If multiple steps' `StdoutPath` fields are set to the same value, the file content will be overwritten by the last outputting step.
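To make the duplication and interleaving concrete, a hedged sketch using the proposed fields (image and workspace names are illustrative), with both streams pointed at one shared file:

```yaml
steps:
  - name: build-and-test
    image: example.com/builder                  # illustrative image
    stdout: $(workspaces.data.path)/build.log   # still visible via the Pod log API
    stderr: $(workspaces.data.path)/build.log   # same file: lines interleave with no ordering guarantee
```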
Reviewer:

i'm curious about the interleaving - would it be line-buffered or would the user end up with halves of lines from each stream spliced together?

Reviewer:

I also wonder if we'd want a "prefix: " in the interleaved scenario like:

stdout: tests 3 / 6 completed
stdout: tests 4 / 6 completed
stderr: warning: foo bar
stdout: tests 5 / 6 completed

and, if so, whether that'd also be something the user wants to configure?

@chhsia0 (Contributor, Author), Aug 19, 2020:

The "interleaving" behaves similar to redirecting both stdout and stderr to the same stream in shell, e.g,

./command > log 2>&1

IIUC the Go runtime doesn't implement a buffer and writes go directly to the write syscall, and thus line buffering only happens if the *os.File is associated with a TTY. You can find some implementation details about how we "redirect" stdout in our entrypoint customization here:
https://github.com/tektoncd/pipeline/pull/3103/files#diff-18256282f9059ed3c2b641b9c2d8ac6fR29
https://github.com/tektoncd/pipeline/pull/3103/files#diff-8c0170fa2830f4fdfc8067ba745cf734R49

We could use a bufio to implement a line buffer before writing to the file, but I'd imagine there would still be corner cases (e.g., super long lines) where the interleaving happens within a line. And most of the time the interleaving is not a big deal so I'm not sure if it is worth complicating the code to have line buffering.

An alternative is to forbid users from specifying the same path for stdout and stderr. If the user really wants that, they can use the Pod logs API.

Prefixing is a fancy idea, but IMO it's incremental and can be added when there's an actual need. If users really want separation, they should put the streams in different files. I'm also afraid that we'd be opening a can of worms, e.g., one may want to add timestamps or other information. Timestamping in particular is something we should not touch; we should leave it to the Pod logs API :)

Reviewer:

OK, understood!

- name: project-id
steps:
- name: boskosctl
image: gcr.io/k8s-staging-boskos/boskosctl
Member:

Is the issue here that boskosctl doesn't include a shell, and so script is not usable, and therefore you'd have to jump through hoops to get its output? Would it be possible to get a shell added to the boskosctl image, perhaps by simply basing it on alpine?

@chhsia0 (Contributor, Author):

I'm not the one using boskosctl. @bobcatfish is ;) And it seems to have a shell.

IMO the problem here is that Tekton shouldn't limit how users build and use images. In particular, some images are built to act like command-line tools (i.e., they come with decent default entrypoints so running them has the same UX as running utility programs), and right now if you want to compose multiple utility images to achieve a complex task, you have to ensure that every image has a shell, or that every image entrypoint has an option to write its output to a file.

@chhsia0 (Contributor, Author), Aug 20, 2020:

The actual use case I have is that I'm working on a DSL and a tool for users to run any image, where I can redirect the stdout to either Task results or somewhere else without needing to know the details of the image. How to run the image is provided by the user and is completely transparent to my tool. In other words, I don't have control over the image, and thus it is impossible to ensure that there is always a shell and to synthesize a script for that shell.

@chhsia0 (Contributor, Author):

It seems common for users to prefer some third-party “official” images instead of maintaining their own fork to include a shell.

Contributor:

boskosctl is actually a bad example b/c it DOES have a shell, but otherwise: what @chhsia0 said :D

1. Write stdout of a step into `.docker/config.json` and consume the generated auth configuration in a subsequent step.
1. Extract stdout into a Task result.

If the step image provides a shell, users can specify the `Script` field to overwrite the image entrypoint to invoke the original entrypoint command and redirect the stdout into the target file. There are two main drawbacks: 1) information encapsulated by the image (e.g., the entrypoint) will be leaked into the Task specification, creating a tight coupling between the Task and the image; 2) users will lose the ability to see stdout through the Pod log API.
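For comparison, a hedged sketch of that Script-based workaround (illustrative image, result name, and binary path, the latter following the ko layout mentioned later in this thread): the Task hardcodes the image-internal entrypoint path and the image must ship a shell.

```yaml
steps:
  - name: run-tool
    image: example.com/tool        # illustrative image
    script: |
      # entrypoint knowledge leaks into the Task; this breaks if the binary moves
      /ko-app/tool --flag=value > $(results.output.path)
```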
Member:

I'm not sure I understand the first point: the entrypoint doesn't leak into the Task definition if it uses script -- if anything, the presence of a shell is assumed when using script, but the original entrypoint value is ignored completely.

For the second point, the Task author could tee to stdout and wherever else they want that data to end up.

@chhsia0 (Contributor, Author):

By "leaking" I mean that users cannot treat any third-party utility image as a black box and just pass in appropriate arguments. For example, if you use ko to build an image with shell, you have to know that the binary sits in /ko-app/... and bake this information into your shell.

If I later update my image to move the entrypoint binary around, steps passing args can still run, but steps with scripts will fail.

@pritidesai (Member):

/kind tep

@tekton-robot tekton-robot added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Aug 20, 2020
@ghost left a comment:

minor nit but otherwise LGTM

@chhsia0 chhsia0 requested a review from a user August 25, 2020 22:09
@chhsia0 (Contributor, Author) commented on Aug 25, 2020:

Attaching some great discussions related to this change:

https://tektoncd.slack.com/archives/CLCCEBUMU/p1597969084015700

Pierre Tasci, 5:18 PM
FWIW it feels a bit odd to make this change at the Tekton level to alleviate something related to the internals of the image. I totally see how users might not want to author their own images, but to me at least, it seems like the gap there is in tooling around the image rather than Tekton's tooling in how it is executed, if that makes sense.

chhsiao 5:24 PM
IMO Tekton should be a platform to run any image without assumptions, i.e., treat them as black boxes, like the k8s Pod API. (It seems we both agree on this 🙂) But now there are Tekton features (say, task results) that some images cannot use, so it seems like an incompleteness/limitation in the API.

Jason Hall 5:37 AM
That's a reasonable argument. I do think that I've tended to think of Tasks as the unit of pluggability/reusability, and the specifics of the images inside those Tasks is up to the Task author, and should generally be a black box to end users and even operators. I'd love to hear from other folks who are generating TaskRuns/PipelineRuns from some other configuration though, since that's not a use case I have as much first-hand experience with.

https://tektoncd.slack.com/archives/CLCCEBUMU/p1598024780034800?thread_ts=1598023428.029200&cid=CLCCEBUMU

Pierre Tasci, 4 days ago
I think you make some excellent points @Christie Wilson. If I am hearing it correctly (let me know if I am not), the core of the assumption here is that small, focused images are better in some way than larger images.
I only have my own experience to go on here so pardon the naïveté, but I haven't found that to be particularly true. It often hurts debuggability and testability to have bare-bones images. I think of Kaniko as a prime example. Great tool, but because it is built off a scratch base, everyone I know who uses it actually uses the debug version, because it includes a busybox shell. I will take ease of development and familiarity over additional complexity most times.
I do think there is a point here. The way steps are hooked up together right now is minimal at best. Even passing a file between steps requires a hardcoded, explicit handoff. For that reason, I often end up writing larger scripts than I would otherwise desire. Rather than Unix pipe redirection, perhaps there is a larger story about inter-step interaction here?

https://tektoncd.slack.com/archives/CLCCEBUMU/p1598040443039700?thread_ts=1598023428.029200&cid=CLCCEBUMU

Cory Cross 4 days ago
Pipes are untyped, hard to test, and full of edge cases. I think PowerShell would be the better inspiration if something's needed intra-Task.

Jason Hall 4 days ago
Good point! Now you're making an even stronger argument that there's lots of complexity down this path, and lots of prior art we can learn from before we jump in and implement the next easy enhancement. 🙃

@vdemeester (Member) left a comment:

Overall looks good to me. I am definitely voting for Option 1 as it is way more flexible.
For Option 2, what happens if we refer to a step's stdoutPath when that step doesn't have stdout: true? Should it fail, or should we default to empty? Having $(step.<name>.stdoutPath) gives the user (at least me) the impression that it's going to be available for all steps (and hence that we capture the stdout on all of them).

@chhsia0 (Contributor, Author) commented on Aug 26, 2020:

Good point. I can think of two options:

* Option 2.1: Validate the task definition and reject the task if users refer to a non-existent step stdout path.
* Option 2.2: No explicit Stdout/Stderr flags at all. If $(steps.<name>.stdoutPath) exists in the task definition, capture the stdout stream of the corresponding step.

In general I also like Option 1 (allowing users to specify paths) better. Anyway, we should settle on one option and then I'll move the other to "alternatives."

@bobcatfish (Contributor) left a comment:

Thanks @chhsia0 !

I have 3 main pieces of feedback:

  1. Is it possible and reasonable to do stdout only, at least at first? I'm deeply suspicious that stderr is mostly only relevant at runtime
  2. I'd like to revisit the variable interpolation and not provide a way to automatically expose stdout/stderr as results
  3. Can we support both an "auto" mode AND allow the user to specify an explicit path?


* Parsing stdout/stderr into a structured format ([example](https://github.com/tektoncd/pipeline/issues/2925#issue-654319361)): This approach requires the step image to produce JSON output, which limits what images can be used. It also hides the parsing magic in Tekton, which can be hard to debug if the output is malformed.

* [Allowing subsequent steps to specify a filter expression to apply to step outputs](https://github.com/tektoncd/pipeline/issues/2925#issuecomment-657529820): If there are multiple subsequent "consumer" steps, then all consumers must use the same filter to save disk space. Also, the magic of filtering will be hidden by Tekton from users, creating unnecessary complexity. It is not hard to use `Script` or add an extra step to perform filtering to achieve the same result with more transparency.
Contributor:

another alternative would be to do nothing in particular in tekton and leave this up to the Task and image authors, which effectively means if you want to communicate data between steps, or you want to capture data for results, you need a shell or you need your binaries to support outputting to a file


Once `StdoutPath` or `StderrPath` is specified, the corresponding output stream will be duplicated to both the given file and the standard output stream of the container, so users can still view the output through the Pod log API. If both `StdoutPath` and `StderrPath` are set to the same value, the two streams will be written to two file descriptors referring to the same file description, so outputs from both streams will be interleaved in the same file, but there will be no ordering guarantee on the data. If multiple steps' `StdoutPath` fields are set to the same value, the file content will be overwritten by the last outputting step.

Variable substitution will be applied to the new fields, so one could specify `$(results.<name>.path)` to the `StdoutPath` field to extract the stdout of a step into a Task result. No new variable substitution for accessing the values of `StdoutPath` and `StderrPath` fields will be provided so variable substitution can remain single-pass.
Contributor:

$(results.<name>.path) to the StdoutPath field to extract the stdout of a step into a Task result

I'm a bit confused by this, you've got an example below: stdout: $(results.project-id.path)

in this case, what path does stdout actually get written to? is this magically writing stdout to a location on disk and then also exposing it as a result?

I don't think we should automatically wire stdout/stderr to any results. If we do, I think we could do it in a subsequent proposal after this one.

my alternative suggestion for the example below is that if you want to expose stdout as a result, you'd need to add a step after parse-project-id that writes stdout as a result

No new variable substitution for accessing the values of StdoutPath and StderrPath fields will be provided so variable substitution can remain single-pass.

Can you explain a bit more? I think it would be very convenient to use $(steps.foo.stdout.path) or similar, and that's exactly how I'd want to use this feature; I wouldn't want to have to hardcode the path in multiple places.

Member:

in this case, what path does stdout actually get written to? is this magically writing stdout to a location on disk and then also exposing it as a result?

$(results.project-id.path) would be interpolated to the path of the result (so /tekton/results/project-id or something), and thus stdout would be written there. Just like if you use $(workspaces.some.path)/foo.

can you explain a bit more? i think it would be very convenient to use $(steps.foo.stdout.path) or similar, and that's exactly how id want to use this feature, i wouldnt want to have to hardcode the path in multiple places

So one of the problems with $(steps.foo.stdout.path) is that it may be used without the previous task having stdout captured. You don't necessarily hardcode the path, or at least the full path; most likely you would use $(workspace.foo.path)/somewhere/afile as a path, and re-use the same later on.
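A minimal sketch of the interpolation described above, reusing the project-id result from the TEP's example (`stdout` is the proposed field):

```yaml
results:
  - name: project-id
steps:
  - name: boskosctl
    image: gcr.io/k8s-staging-boskos/boskosctl
    # $(results.project-id.path) resolves to the result's file path (e.g. /tekton/results/project-id),
    # so this step's stdout lands directly in the Task result
    stdout: $(results.project-id.path)
```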

Contributor:

$(results.project-id.path) would be interpolated to the path of the result (so /tekton/results/project-id or something), and thus stdout would be written there. Just like if you use $(workspaces.some.path)/foo.

Oh, that makes sense! I didn't realize you could do that and I've been hardcoding the path everywhere. Just ignore that comment from me then @chhsia0!

So one of the problems with $(steps.foo.stdout.path) is that it may be used without the previous task having stdout captured. You don't necessarily hardcode the path, or at least the full path; most likely you would use $(workspace.foo.path)/somewhere/afile as a path, and re-use the same later on.

Could you explain a bit more about this? We could check if folks are using $(steps.foo.stdout.path) for a step that doesn't have stdout capturing configured. It still seems really useful to me.

Contributor:

we could check if folks are using $(steps.foo.stdout.path) for a step that doesn't have stdout capturing configured

actually without this feature, we wouldn't be able to help out anyone making this mistake

@chhsia0 (Contributor, Author), Aug 28, 2020:

This is one of the main differences between Option 1 and Option 2 if we also want the variable substitution mechanism to remain single-pass. With the single-pass assumption, Option 1 gives us a way to capture stdout into results, workspaces, volumes, ... but it's harder to validate whether the specified path exists (and IMO that's the Task authors' responsibility: they're responsible for ensuring any path used in their Tasks is valid); Option 2 provides more guarantees and validations, but is more limited.

- name: data
mountPath: /data
volumes:
- name: data
Contributor:

Curious why you decided to use volume mounts here - workspaces are intended to replace volumes and volumeMounts as much as possible (I would have liked to remove them entirely, tektoncd/pipeline#2058) b/c they allow volumes to be supplied at runtime vs. task authoring time.

@chhsia0 (Contributor, Author):

Using volumes in Tasks instead of workspaces gives us a way to encapsulate storage information that's internal to a Task. IIUC, if you use a workspace in a Task, you have to surface it in your Pipeline definition, even if you just want to provide temporary storage for steps within a Task.

So if the storage is not used to pass data between Tasks in a Pipeline (e.g., emptyDir, configMap, and secret), I prefer using volumes and not exposing them in the Pipeline definition.

Also, IMO, using emptyDir for a workspace might give people the illusion that the emptyDir is shared.
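A hedged sketch of the pattern being described (illustrative images, proposed `stdout` field): an emptyDir volume declared inside the Task, mounted only where needed, with nothing surfaced in the Pipeline definition.

```yaml
steps:
  - name: producer
    image: example.com/producer      # illustrative image
    stdout: /data/out                # proposed field
    volumeMounts:
      - name: data
        mountPath: /data
  - name: consumer
    image: example.com/consumer      # illustrative image
    args: ["/data/out"]
    volumeMounts:
      - name: data
        mountPath: /data
volumes:
  - name: data
    emptyDir: {}
```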


* If the stdout/stderr of a step is set to the path of a Task result and the step prints too much data, the result manifest would become too large. Currently the entrypoint binary would [fail if that happens](https://github.com/tektoncd/pipeline/blob/v0.15.2/cmd/entrypoint/main.go#L86). We could enhance the error message to provide more information about which step is to blame for the termination message bloat, to hint users to fix the problem.

### Option 2: Redirecting to Canonical Conventional Paths
Contributor:

Is there a reasonable option that provides both, e.g. the ability to have the output written to an automatic location AND the ability to specify an explicit path? e.g.

type Step struct {
    Stdout PathConfig `json:"stdout,omitempty"`
}

type PathConfig struct{
   Auto bool
   Path string
}

So if you don't care where stdout ends up and you just want to pass it between steps, you can use:

  - name: boskosctl
    image: gcr.io/k8s-staging-boskos/boskosctl
    ...
    stdout:
      auto: true
  - name: parse-project-id
    image: imega/jq
    args:
    - -r
    - .name
    - $(steps.boskosctl.stdout.path)

But if you do care you can use:

  - name: boskosctl
    image: gcr.io/k8s-staging-boskos/boskosctl
    ...
    stdout:
      path: /data/boskosctl-stdout
  - name: parse-project-id
    image: imega/jq
    args:
    - -r
    - .name
    - $(steps.boskosctl.stdout.path)

@chhsia0 (Contributor, Author):

This boils down to whether we allow people to use $(workspaces.<name>.path) or $(results.<name>.path) or $(resources.outputs.<name>.path) in the specified path. If yes and if we want to keep a single-pass variable substitution, then we cannot provide $(steps.<name>.stdout.path).
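A sketch of that tension, assuming the proposed fields (the `$(steps.*)` variable here is hypothetical): resolving it for the second step would require the first step's stdout path, which itself contains a variable, to have been resolved already, i.e. a second substitution pass.

```yaml
steps:
  - name: boskosctl
    image: gcr.io/k8s-staging-boskos/boskosctl
    stdout: $(workspaces.data.path)/boskosctl-stdout         # pass 1 resolves the workspace path
  - name: parse-project-id
    image: imega/jq
    args: ["-r", ".name", "$(steps.boskosctl.stdout.path)"]  # would need a second pass over the resolved value
```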

bobcatfish added a commit to bobcatfish/pipeline that referenced this pull request Aug 27, 2020
I didn't realize this variable existed until @chhsia0 used it in
tektoncd/community#182
All this time ive been hardcoding this path XD hoping to help some other
folks realize this is possible as well.
@vdemeester (Member):

See #182 (comment), I don't see why we would only add stdout and not stderr. From my point of view, it's not up to tekton to know which one is valuable/used by the user.

@bobcatfish (Contributor):

From my point of view, it's not up to tekton to know which one is valuable/used by the user.

I don't think we've had any users say they actually need stderr so it feels like we don't need it, and if we don't need it, why not wait to add it until we do.

Sounds like you feel strongly @vdemeester tho

@chmouel (Member) commented on Sep 14, 2020:

I believe there is a use case for having stderr passed along.

For example, as a user I would like to write a Tekton pipeline for Python-based software to do some linting of my Python code.

I look around and find this container provided by the pylint authors, https://hub.docker.com/r/eeacms/pylint/, which I can reuse directly in my Tekton task.

I would like, in a following task, to be able to capture the output (stdout) or the errors (stderr) of the task so I can, for example, post it back to GitHub or somewhere else...

Being able to do this directly, without having to write a custom script based on eeacms/pylint/ to redirect stderr to stdout or store the stderr in my workspace, makes it more useful and easier for me.
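A hedged sketch of that pylint use case with the proposed `stderr` field (the reporter image and workspace name are illustrative): capture stderr into a workspace file and post it in a later step.

```yaml
steps:
  - name: lint
    image: eeacms/pylint
    stderr: $(workspaces.output.path)/pylint-errors.txt     # still visible in the Pod log
  - name: report
    image: example.com/github-reporter                      # illustrative image
    args: ["$(workspaces.output.path)/pylint-errors.txt"]
```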

bobcatfish added a commit to bobcatfish/pipeline that referenced this pull request Sep 21, 2020
I didn't realize this variable existed until @chhsia0 used it in
tektoncd/community#182
All this time ive been hardcoding this path XD hoping to help some other
folks realize this is possible as well.
tekton-robot pushed a commit to tektoncd/pipeline that referenced this pull request Sep 21, 2020
I didn't realize this variable existed until @chhsia0 used it in
tektoncd/community#182
All this time ive been hardcoding this path XD hoping to help some other
folks realize this is possible as well.
@tekton-robot tekton-robot removed the lgtm Indicates that a PR is ready to be merged. label Sep 28, 2020
@vdemeester (Member) left a comment:

/cc @tektoncd/core-maintainers

@tekton-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 28, 2020
@ghost commented on Oct 2, 2020:

All my comments are addressed and I'd like to see this move ahead. Adding hold to give @bobcatfish time to respond to the remaining open threads of conversation.

/lgtm

/hold

@tekton-robot tekton-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 2, 2020
@tekton-robot tekton-robot assigned ghost Oct 2, 2020
@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 2, 2020
@bobcatfish (Contributor) left a comment:

I've added a few comments but I think they're all nitpicks :D Generally looks great! Thanks for your patience with the back and forth @chhsia0 !

I'm leaving the hold just because I think one of those nitpicks is in an example that isn't quite right, feel free to remove the hold!


Once `StdoutConfig.Path` or `StderrConfig.Path` is specified, the corresponding output stream will be duplicated to both the given file and the standard output stream of the container, so users can still view the output through the Pod log API. If both `StdoutConfig.Path` and `StderrConfig.Path` are set to the same value, the two streams will be written to two file descriptors referring to the same file description, so outputs from both streams will be interleaved in the same file, but there will be no ordering guarantee on the data. If multiple steps' `StdoutConfig.Path` fields are set to the same value, the file content will be overwritten by the last outputting step.

Variable substitution will be applied to the new fields, so one could specify `$(results.<name>.path)` to the `StdoutConfig.Path` field to extract the stdout of a step into a Task result. No new variable substitution for accessing the values of `StdoutConfig.Path` and `StderrConfig.Path` fields will be provided so variable substitution can remain single-pass.
Contributor:

yet! :D

- name: project-id
steps:
- name: boskosctl
image: gcr.io/k8s-staging-boskos/boskosctl
Contributor:

boskosctl is actually a bad example b/c it DOES have a shell, but otherwise: what @chhsia0 said :D

@tekton-robot tekton-robot removed the lgtm Indicates that a PR is ready to be merged. label Nov 2, 2020
@bobcatfish (Contributor):

@chhsia0 one last nitpick, it looks like this PR has 14 commits right now, can those be squashed into 1?

Feel free to remove:
/hold

@chhsia0 (Contributor, Author) commented on Nov 2, 2020:

/hold cancel

@tekton-robot tekton-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 2, 2020
@chhsia0 (Contributor, Author) commented on Nov 2, 2020:

I usually don't squash the commits in a PR because it allows reviewers to see the diff since the last review, and the squash could be done through "Squash and merge." But sure, I'm happy to do that now 👍

@vdemeester (Member):

@chhsia0 😝 but it's a bot that merges, and only through rebase, so it would be nice, now that it's validated, to squash the commits before merging 😛

@vdemeester (Member):

… and as I commented, you did it @chhsia0 🤗 😻
/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 2, 2020
@tekton-robot tekton-robot merged commit f70e525 into tektoncd:master Nov 2, 2020