Add credentials filter to entrypoint logger #4837
Conversation
Hi @Useurmind. Thanks for your PR. I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Any interest in this feature?
/ok-to-test
Thanks @Useurmind for your PR! /cc @imjasonh @vdemeester
Then I will start implementing a feature toggle for it and will make sure to add some tests.
So I started looking into this feature and implementing a feature flag for it. I am currently not sure how best to implement a feature flag for this. I have found the following points:
Points 1 and 2 could be solved by handing down an argument or setting an environment variable on the entrypoint container from the controller. Point 3 probably has to be solved by documenting the requirements for the credentials filter, so that pipeline creators can pay attention to them. Does that approach make sense, or is there an easier way?
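For the environment-variable variant, here is a minimal sketch of the entrypoint side; the variable name TEKTON_FILTER_CREDENTIALS is a hypothetical placeholder for this sketch, not something defined in this PR:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// credentialsFilterEnabled reports whether the controller asked the
// entrypoint to filter credentials from the step log. The controller
// would set this variable on the step container when the flag is on.
func credentialsFilterEnabled() bool {
	v, ok := os.LookupEnv("TEKTON_FILTER_CREDENTIALS")
	if !ok {
		return false // feature stays off unless the controller opts in
	}
	enabled, err := strconv.ParseBool(v)
	return err == nil && enabled // malformed values fall back to the safe default
}

func main() {
	fmt.Println("credentials filter enabled:", credentialsFilterEnabled())
}
```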
Force-pushed from ffc4013 to 80aea82.
This part worries me a bit. Does this feature require that all TaskRun pods now have more permissions than before, to be able to tell what to redact? That seems valuable, but it also may mean that TaskRuns that, e.g., execute arbitrary user code could now read secrets they never needed before.

Could this somehow query "all accessible secrets" instead, and redact whatever it can find, instead of requiring new access to all secrets in its namespace?

I wonder if this could also lead to more load on the API server, since every TaskRun would start by fetching all secrets in its namespace, which for clusters running lots of TaskRuns could cause problems.

Not saying we shouldn't do this -- redacting secrets would be great -- just curious about some of the implications around limiting access and being mindful to reduce API traffic.
You are correct in your understanding, Jason. The pods can only redact secrets that they know about and have access to, and they query the API server for this. The current implementation uses the Go client to do so. I am not entirely sure, but I suspect that the client will return only the secrets that the service account has access to. If not, we can implement it that way. That would mean that if the service account has no access to secrets, it will not be able to redact them properly.

The performance implications are as you say. But as this is hidden behind a feature flag, it should not harm existing installations. If the feature is not enabled, no communication with the API server takes place.

By the way, as far as I understand the permission concept in Kubernetes, a user can either see all secrets in a namespace or none. If you want to hide only some resources from a user, you need to put them in different namespaces. Access can be limited to listing or reading them, but in the end it is all or none.
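For illustration, a rough sketch of what that lookup could look like with client-go, assuming an in-cluster config and the pod's namespace passed in; the function name collectSecretValues is made up for this sketch, and whether the List call succeeds is purely an RBAC question:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// collectSecretValues lists all secrets in the namespace and returns
// their raw values, which the log filter would then redact.
func collectSecretValues(ctx context.Context, namespace string) ([][]byte, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	list, err := clientset.CoreV1().Secrets(namespace).List(ctx, metav1.ListOptions{})
	if err != nil {
		// Without list permission on secrets this call fails and the
		// filter has nothing to redact.
		return nil, err
	}
	var values [][]byte
	for _, s := range list.Items {
		for _, v := range s.Data {
			values = append(values, v)
		}
	}
	return values, nil
}

func main() {
	values, err := collectSecretValues(context.Background(), "default")
	if err != nil {
		fmt.Println("could not read secrets:", err)
		return
	}
	fmt.Printf("collected %d secret values to redact\n", len(values))
}
```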
Some more things to consider: I would advise extending this in the future so that multiple secret detection algorithms can be used. Secrets now come from so many sources that it becomes hard to configure one algorithm that works for everyone. This one is a rather simple approach that reads all secrets from the namespace. Others could be:
Should we consider implementing one of those? Then the pod would only need access to its own resource and could detect the secrets from files and environment variables. But I am not sure it is as complete as the approach above, and it is also more complicated to implement.
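A sketch of that variant, under the assumption that the entrypoint is told which mount paths hold secrets; the path /tekton/creds-secrets below is a hypothetical example. Note that collecting every environment value is deliberately over-broad: narrowing it to secret-backed variables would require reading the pod's own spec, as discussed above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// secretsFromEnvAndFiles gathers candidate secret values from the
// container's own environment variables and from files under the given
// mount roots, without talking to the API server at all.
func secretsFromEnvAndFiles(mountRoots []string) []string {
	var values []string
	for _, kv := range os.Environ() {
		if _, v, ok := strings.Cut(kv, "="); ok && v != "" {
			values = append(values, v) // over-broad: every env value is a candidate
		}
	}
	for _, root := range mountRoots {
		_ = filepath.WalkDir(root, func(path string, d os.DirEntry, err error) error {
			if err != nil || d.IsDir() {
				return nil // skip unreadable entries and directories
			}
			if data, err := os.ReadFile(path); err == nil && len(data) > 0 {
				values = append(values, string(data))
			}
			return nil
		})
	}
	return values
}

func main() {
	values := secretsFromEnvAndFiles([]string{"/tekton/creds-secrets"})
	fmt.Printf("found %d candidate values\n", len(values))
}
```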
I thought about the implementation again over the weekend. If there is currently no contact between Tekton pipeline run pods and the API server, it would be a grave error to require this in the future. It makes the whole design of the pipelines more complicated and the behaviour less stable and harder to predict. Therefore, I would suggest the following approach, similar to my previous post but not the same:
This is much more complicated than the current implementation, but more in line with the current design and easier to reason about with regard to permissions, because the pipeline pod will not need any new permissions.
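To make the hand-off concrete, the controller could write a small JSON file that the entrypoint reads at startup. A sketch with illustrative field names and file path, not the ones used in this PR:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// secretLocations describes where the controller found secrets on the
// step pod; the entrypoint resolves these to values at runtime, so the
// pod never needs API server access.
type secretLocations struct {
	EnvVars []string `json:"envVars"` // env vars whose values come from secrets
	Files   []string `json:"files"`   // files mounted from secret volumes
}

func loadSecretLocations(path string) (*secretLocations, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var loc secretLocations
	if err := json.Unmarshal(data, &loc); err != nil {
		return nil, err
	}
	return &loc, nil
}

func main() {
	// Hypothetical path where the controller drops the file.
	loc, err := loadSecretLocations("/tekton/secret-locations.json")
	if err != nil {
		fmt.Println("no secret locations:", err)
		return
	}
	fmt.Printf("%d env vars and %d files to resolve\n", len(loc.EnvVars), len(loc.Files))
}
```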
Force-pushed from dcb56c9 to 9cc2fdd.
Force-pushed from 9cc2fdd to 00435fe.
Force-pushed from 00435fe to a05fdc7.
Force-pushed from a05fdc7 to c00af54.
Force-pushed from c00af54 to 0907ab8.
```go
}

// NewRingBuffer creates a RingBuffer with the given length.
func NewRingBuffer(length int) *RingBuffer {
```
I agree that we don't want to add a new dependency like the one you listed, but this still adds complexity to an already large PR. Would it be possible to either use a plain slice of bytes as a buffer, or move the ring buffer implementation into its own PR?
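For reference, the overwrite-on-wrap behaviour being weighed here against a plain slice fits in a few dozen lines. A self-contained sketch, not the PR's implementation; only the constructor signature is taken from the quoted hunk:

```go
package main

import "fmt"

// RingBuffer is a fixed-length byte buffer that overwrites its oldest
// contents once full.
type RingBuffer struct {
	data  []byte
	start int // index of the oldest byte
	size  int // number of valid bytes
}

// NewRingBuffer creates a RingBuffer with the given length.
func NewRingBuffer(length int) *RingBuffer {
	return &RingBuffer{data: make([]byte, length)}
}

// WriteByte stores b, evicting the oldest byte when the buffer is full.
func (r *RingBuffer) WriteByte(b byte) error {
	r.data[(r.start+r.size)%len(r.data)] = b
	if r.size < len(r.data) {
		r.size++
	} else {
		r.start = (r.start + 1) % len(r.data)
	}
	return nil
}

// Bytes returns the buffered bytes from oldest to newest.
func (r *RingBuffer) Bytes() []byte {
	out := make([]byte, r.size)
	for i := 0; i < r.size; i++ {
		out[i] = r.data[(r.start+i)%len(r.data)]
	}
	return out
}

func main() {
	r := NewRingBuffer(4)
	for _, b := range []byte("abcdef") {
		r.WriteByte(b)
	}
	fmt.Printf("%s\n", r.Bytes()) // prints: cdef
}
```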
```go
}

// Close will write remaining bytes from the internal buffer to the output stream.
// The remaining bytes will not be filtered.
```
why not filter the bytes remaining in the buffer? Couldn't the buffer still contain a secret that is shorter than the longest one?
I had the misconception that there could be no more secrets in the remaining bytes, but that is wrong. I implemented a test that proved exactly that and added filtering to the Close method.
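A sketch of the corrected behaviour, not the PR's code: the writer holds back a tail one byte shorter than the longest secret between writes, and Close runs one final redaction pass over that tail before flushing, since a shorter secret could sit entirely inside it. The marker and sizes are illustrative.

```go
package main

import (
	"bytes"
	"io"
	"os"
)

// filterWriter redacts secrets from a stream, holding back a tail of
// bytes between writes so that a secret split across writes is caught.
type filterWriter struct {
	out     io.Writer
	secrets [][]byte
	buf     []byte
	hold    int // length of the longest secret, minus one
}

func (w *filterWriter) redact(p []byte) []byte {
	for _, s := range w.secrets {
		p = bytes.ReplaceAll(p, s, []byte("[REDACTED]"))
	}
	return p
}

func (w *filterWriter) Write(p []byte) (int, error) {
	// Redact the whole buffer first; any partial secret left at the
	// end is at most hold bytes long and stays in the held-back tail.
	w.buf = w.redact(append(w.buf, p...))
	if cut := len(w.buf) - w.hold; cut > 0 {
		if _, err := w.out.Write(w.buf[:cut]); err != nil {
			return 0, err
		}
		w.buf = append(w.buf[:0:0], w.buf[cut:]...)
	}
	return len(p), nil
}

// Close filters the held-back tail too: a secret shorter than the
// hold-back window could otherwise leak through unredacted.
func (w *filterWriter) Close() error {
	_, err := w.out.Write(w.redact(w.buf))
	w.buf = nil
	return err
}

func main() {
	w := &filterWriter{out: os.Stdout, secrets: [][]byte{[]byte("hunter2")}, hold: 6}
	io.WriteString(w, "password is hun")
	io.WriteString(w, "ter2 end")
	w.Close() // prints: password is [REDACTED] end
}
```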
cmd/entrypoint/runner.go (outdated)

```diff
@@ -168,12 +210,19 @@ func (rr *realRunner) Run(ctx context.Context, args ...string) error {
 // be used for multiple streams if desired.
 // The behavior of the Reader is the same as io.TeeReader - reads from the pipe
 // will be written to the file.
-func newTeeReader(pipe func() (io.ReadCloser, error), path string) (*namedReader, error) {
+func newTeeReader(pipe func() (io.ReadCloser, error), path string, writer io.WriteCloser) (*namedReader, error) {
```
This function docstring needs some updates; some of the commentary now belongs on getFileWriter.
updated the doc strings
Force-pushed from 0907ab8 to 3b0345c.
This adds a filter for secret values to the logging mechanism in the pipeline entrypoint. The locations of secrets, in the form of environment variables and files, are detected by the controller creating the pipeline step pod. The information about the secrets attached to that pod is given to the entrypoint as a JSON file. From there, the entrypoint reads the secret values from the given locations and redacts all occurrences from the output log.
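Stitched together, the entrypoint's remaining job is to resolve those locations to concrete values before wiring them into the log filter. A sketch; the env var name GIT_TOKEN and the file path in main are hypothetical examples:

```go
package main

import (
	"fmt"
	"os"
)

// resolveSecretValues turns the secret locations handed over by the
// controller into the concrete values the log filter must redact.
// envVars holds variable names, files holds paths to mounted secret
// files; both would come from the JSON hand-off file.
func resolveSecretValues(envVars, files []string) [][]byte {
	var values [][]byte
	for _, name := range envVars {
		if v, ok := os.LookupEnv(name); ok && v != "" {
			values = append(values, []byte(v))
		}
	}
	for _, path := range files {
		if data, err := os.ReadFile(path); err == nil && len(data) > 0 {
			values = append(values, data)
		}
	}
	return values
}

func main() {
	values := resolveSecretValues([]string{"GIT_TOKEN"}, []string{"/tekton/creds-secrets/password"})
	fmt.Printf("resolved %d secret values to redact\n", len(values))
}
```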
Force-pushed from 3b0345c to 82e897c.
@Useurmind: The following test failed, say /retest to rerun all failed tests.

Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@Useurmind: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Closing old pull requests; please feel free to reopen if you pick up this work!
Changes
This PR is related to #3373. The issue was closed by your bot, but the topic is still relevant to us.

It solves the problem that secret values are printed to the output log. All secrets contained in the namespace of the pod will be redacted. The pod needs a service account token with permission to read all secrets in the namespace from the API server.

I added a credential filter to the runner app of Tekton steps. It reads all secrets from the namespace of the pod and redacts them from the output log.

This is currently a draft; I would be happy to polish it if you have interest in adopting this approach.
/kind feature
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
- Meets the Tekton contributor standards (including functionality, content, code)
- Release notes block below has been updated with any user facing changes (if there are no user facing changes, use release note "NONE")
Release Notes