Handle sidecar execution signalling with entrypoint injection #1569
Comments
I think we could do something crazy like subscribe to changes to the pod we created, and inject AFTER Istio injects - but I'm not sure if Istio will catch that and try to undo it, causing a sweet k8s mutating webhook injection bomb 💣 which I have to say would be kind of cool to see in action.
We can't inject after the Pod is created, because we can only update containers' `image` field. @sbwsg and I had also talked about adding a mutating webhook for Pods, which could rewrite them, but it would only work if it ran after Istio's mutating webhook. And you can't declare any kind of ordering. 😕 Either way, it seems simplest to just state "Istio might not work as expected unless you're explicit about what you want" and not rely on any more deeply terrible hacks. 😄
@imjasonh
/kind feature
@vincent-pli It is very nice, thanks for sharing, but unfortunately that would be a big issue for us (on OpenShift, which runs as secure as possible by default), since it would require elevated capabilities to run the tasks.
I've thought about this a bit more, and I think entrypoint injection won't be a magic bullet for us after all, at least not without expanding the scope of this work. 😢 Entrypoint injection will tell us when a sidecar is running, but not when it's ready. A sidecar container might pull and start, then take an indeterminate amount of time to actually report as ready. The entrypoint binary would have to take over responsibility for running the user's configured health check against itself before signalling step 0 to start. That's sufficiently complex that I don't know if I want us to take it on. As @bobcatfish and others have also pointed out, any injected sidecars (e.g., Istio) would also have to be accounted for, probably by doing the READY annotation Downward API dance like we do today, so we're not really able to drop any complexity. I'm going to think about this a bit more and come back to it. In the meantime I'll add an e2e test that checks that a running-but-unready sidecar blocks step 0, to make future changes easier to verify.
@imjasonh
The file
@vincent-pli Won't that cause the Pod to report as failed, even if all steps completed successfully? That also has the same problem as entrypoint injection, since it only covers sidecars explicitly stated in the TaskRun, and misses injected sidecars (e.g., Istio). I think we need a stronger e2e test suite around injected sidecars, but I'm not sure of an easy way to add that without defining a test-only sidecar injector, which could be cumbersome. 🤔
@imjasonh
Reading through this... darn. I was noticing that the Downward API ready check was adding a bonus 2 or 3s to every TaskRun. If we want to support being "ready" in the face of an auto-injected sidecar then I really don't see a good alternative until real sidecar support lands. That said, I have exactly zero cases that look like this and resent having to pay a cost here. Could we remove this overhead when the TaskRun Sidecars array is empty, and then use an annotation to opt in for the auto-inject special case?
We could make it a configmap option somewhere. I'm not excited that this means two code paths to handle both cases (especially while testing support for injected sidecars is so scant: #1687), but it could be a nice performance win if you as an operator know there won't ever be any injected sidecars.
The configmap version is kind-of yuck too but might be an option, especially if we don't feel confident about our e2e testing. I guess another approach that would be transparent is to have a mutating webhook scan for non-step containers at TaskRun Pod creation and then inject the downwardAPI ready check into the first step.
There's no guarantee that our mutating webhook would run after the other mutating webhook that injects the sidecar. We effectively do what you're proposing already (if I'm understanding correctly): we create a Pod, then watch for changes to it -- if we notice there are non-step containers, we wait for them to become ready before setting the READY annotation. We can't know at Pod creation time whether sidecars will be injected or not, so we have to wait to see the Pod get created (possibly mutated), which adds the latency.
Crazy as it might sound, mutating webhooks "can" get called twice to prevent this ordering race. And yes, the current approach is pretty robust; I just want the Downward API mechanism used only when we need it. Sigh... so 1.15 for webhook re-invocation... but 1.18 (hopefully) for sidecar... sigh. I may as well just sit on my hands for this...
That's useful information (and somewhat surprising!). So it's possible we could inject the entrypoint binary even into injected sidecar containers, but that only gets us about half way. The other half is having the entrypoint binary start the user's process and poll it for readiness. The kubelet does this already, using the user's defined readiness probe. I'm inclined to just call out "injected sidecars are not well supported, please be explicit" and sit around and wait for Sidecars to become a real supported feature, whenever that'll be.
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
Stale issues rot after 30d of inactivity. /lifecycle rotten Send feedback to tektoncd/plumbing.
Rotten issues close after 30d of inactivity. /close Send feedback to tektoncd/plumbing.
@tekton-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-lifecycle rotten
@vdemeester: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The Istio team has moved to a slightly improved mechanism for sidecar support (while we wait and see what happens in K8s). I'm adding this comment for awareness, as the approach is elegant if a bit hacky -- https://medium.com/@marko.luksa/delaying-application-start-until-sidecar-is-ready-2ec2d21a7b74
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
Stale issues rot after 30d of inactivity. /lifecycle rotten Send feedback to tektoncd/plumbing.
Closing since the caveat is pretty limiting; it means injected sidecars (e.g., by Istio) break the model. When/if a KEP makes sidecars a first-class concept, we can revisit. |
Background
Today, Tekton injects an `entrypoint` binary into running steps, to control the order in which those steps execute. Broadly, this is done by having the binary wait for files passed via the `--wait_file` flag, and having the binary write a file when it completes, specified by the `--post_file` flag. To run steps A, B, C, the TaskRun controller creates a Pod in which each step's entrypoint is configured to wait for the previous step's post-file before executing.
Also: The entrypoint binary has been capable of waiting for multiple files since #1430 -- all specified files must exist before the actual step execution begins.
Problem
Supporting Sidecars introduced a couple of problems, which required additional ~~terrible hacks~~ creative workarounds.

First, stepA shouldn't start until all sidecar containers have started. This was accounted for by having `stepA`'s container mount a Downward-API-backed volume containing the value of a READY annotation on the Pod. When the controller sees that sidecar containers are up, it applies this annotation, triggering `stepA` to start (and so on).

Second, sidecars should be gracefully stopped once all TaskRun steps have completed successfully. This was accounted for by having the TaskRun controller update sidecar containers' `image:` field to a container image that exits successfully immediately, regardless of what args are passed (since K8s doesn't allow us to update `args` or `command`, etc., only `image`).

These "creative workarounds" work fine, but they require the TaskRun controller to have more control over and awareness of the underlying Pod than it should -- it would be nice to just have the controller create and watch Pods, and never have to update them (to add annotations, or update container `image` fields).
Proposal
When converting a TaskRun spec to a Pod, the TaskRun controller should also inject the entrypoint binary into specified sidecar images, just as it does for steps. This gives it more control over sidecar execution ordering and signalling, and the signalling runs entirely within the Pod, requiring no updates from outside.
To block `stepA` from starting until all sidecars are up, we can augment the entrypoint binary to take a new flag, `--start_file`, which writes a file when the container starts. `stepA` would then be configured to wait for all sidecar start-files, and would begin executing only once all of them exist (and so on). (Steps can already wait for multiple files to exist since #1430.)

Next, to gracefully exit sidecars when step containers are finished, the entrypoint binary can take a new flag, `--kill_files`, which specifies a file to watch. Once that file exists, the entrypoint binary stops executing its specified command and exits gracefully. The final step container will be configured to write that file.

In this way, Pod updates from outside the Pod are eliminated, and all start/stop signalling is done via the entrypoint binary.
Caveats
This process only works when we can inject the entrypoint binary into containers. Pod containers injected automatically by mutating webhooks (e.g., Istio sidecars) will not have the entrypoint binary injected, and a) might not start before `stepA` starts, and b) might not exit gracefully (or ever!) when the last step completes. This is a fundamental limitation of the Istio sidecar, since it generally (rightfully) expects to run alongside long-running serving Pods, and expects consumers to restart the Pod if the sidecar isn't available yet. If this is problematic, users can explicitly specify the necessary Istio sidecar in the TaskRun's `sidecars:`, in which case it will be entrypoint-wrapped and signalled correctly.
Future
Once all signalling is done via the entrypoint binary, we can decide to change how signalling is done, e.g., to remove the need for file-watching. We could add a super-sidecar that all entrypoint binaries connect to in order to receive start/stop signals. Having one consistent signalling implementation allows us to change it over time more easily.
Far Far Future
kubernetes/enhancements#753 (alpha in K8s 1.17) describes a new `lifecycle:` stanza inside the Pod type, with a `type: Sidecar` option that lets the Pod executor handle the general case of both a) waiting until sidecars are up before running non-`Sidecar`-lifecycled containers, and b) gracefully exiting `Sidecar`-lifecycled containers when all other containers exit. This would let us hand sidecar lifecycle control back to K8s, and should let us drop some of the entrypoint binary features described above.

Longer-term, we might consider proposing a KEP for a new `lifecycle` type, e.g., `type: Ordered`, whereby so-lifecycled containers are run in their specified order, one after another, until all complete. This would be similar to `initContainers` today, except that it would also play more nicely with `Sidecar`-lifecycled containers, or possible future lifecycle types. It's unclear whether other projects using K8s would benefit from an `Ordered`-type container.