KEP-3085: Add KEP for Pod network ready condition #3087
Conversation
I think this would be useful since it can be hard to determine why pods are not running if there are errors during CreatePodSandbox - kubernetes/kubernetes#104635

/assign

some more info here: kubernetes/kubernetes#105984
It sounds reasonable and useful; a few minor nitpicks, though.
/assign @derekwaynecarr
We might also want to define how to represent this condition when using a CNI that implements NetworkPolicy. That might be better left to a separate condition, though, to avoid holding up this enhancement?
Setting up the network concludes the pod "sandbox" creation process coordinated between Kubelet => CRI runtime => CNI plugins. Thus, in the context of this KEP, the new `PodHasNetwork` condition marks the completion of that coordinated process.

If we want more granular network-related conditions (which is not within the scope of this KEP), individual CNI plugins (with an API server client) can mark any necessary condition on the pod using extended conditions.

Note that until all CNI plugins in the configured chain succeed, the CRI runtime won't return an "in-process"/intermediate outcome to Kubelet. Either all networking configuration and IP allocation succeeded (i.e. networking is fully up for the pod) OR network configuration failed and the entire CRI sandbox creation + network configuration has to be retried.
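As an illustration of that extended-conditions route, here is a minimal sketch, assuming client-go and a hypothetical condition type `example.com/network-policy-ready`, of how an out-of-tree agent with an API server client could set such a condition:

```go
package main

import (
	"context"
	"encoding/json"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// patchExtendedCondition sketches how a CNI agent with an API server
// client could set a custom (extended) condition on a pod's status via a
// strategic merge patch; pod conditions merge by their "type" key, so
// only the matching entry is added or updated. The condition type and
// reason string below are hypothetical examples.
func patchExtendedCondition(ctx context.Context, cs kubernetes.Interface, ns, pod string) error {
	patch, err := json.Marshal(map[string]interface{}{
		"status": map[string]interface{}{
			"conditions": []corev1.PodCondition{{
				Type:               corev1.PodConditionType("example.com/network-policy-ready"),
				Status:             corev1.ConditionTrue,
				LastTransitionTime: metav1.Now(),
				Reason:             "PolicyProgrammed", // hypothetical reason
			}},
		},
	})
	if err != nil {
		return err
	}
	// Patch the "status" subresource so only status fields change.
	_, err = cs.CoreV1().Pods(ns).Patch(ctx, pod, types.StrategicMergePatchType, patch, metav1.PatchOptions{}, "status")
	return err
}
```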
@derekwaynecarr I have updated the KEP with the suggested changes.
/retitle KEP-3085: Add KEP for Pod network ready condition

Hope that's OK.
@ddebroy consider omitting “draft” from the PR description at this point.
PRR looks fine, I will wait until there is SIG approval though, because I am root in this repo.
> becomes especially relevant in multi-tenant clusters where individual tenants own the pod specs (including the set of init containers) while the cluster administrators are in charge of storage plugins, networking plugins and container runtime handlers.
Thanks for pointing out the mismatched interpretation of `Initialized` between pods with init containers and without. But the issue isn't obvious to users, since we hide the pod sandbox creation completely in the implementation. To the end user, it looks more like all init containers have run successfully and it is now time to start the app containers. With the newly proposed PodCondition `PodHasNetwork`, shouldn't the issue surface to the end users now?
With init containers: PodHasNetwork -> Initialized -> ... -> Ready
Without init containers: Initialized -> PodHasNetwork -> ... -> Ready
Can we consolidate the above flows? We could move Initialized after PodHasNetwork to indicate that Kubelet can now start any app containers.
> But the issue isn't obvious to users, since we hide the pod sandbox creation completely in the implementation. To the end user, it looks more like all init containers have run successfully and it is now time to start the app containers. With the newly proposed PodCondition `PodHasNetwork`, shouldn't the issue surface to the end users now?
The inconsistency already surfaces with the way things are today when CSI/CRI/CNI is unable to mount volumes or launch the pod sandbox upon encountering errors: without init containers, today, the pod status reports the Initialized condition to be true, but Kubelet also reports error events like `FailedCreatePodSandBox`, `FailedMount`, etc. against the pod (which are surfaced as part of `kubectl describe`, etc.).
> Can we consolidate the above flows? We could move Initialized after PodHasNetwork to indicate that Kubelet can now start any app containers.
We can certainly consider that. I will add that to the KEP post-merge of this PR.
We had a discussion on this in sig-node on July 5th. The suggestion was to carry on with the `PodHasNetwork` condition as described in this KEP, since the ordering of transitions of pod conditions should not be important. We can address the situation with the `Initialized` condition for pods without init containers (mentioned above) in a separate KEP, as that feels like an independent problem.
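Since the ordering of condition transitions carries no guarantee, consumers should look conditions up by type rather than relying on their position in `status.conditions`. A minimal Go sketch of such a lookup (the helper names are illustrative; `PodHasNetwork` is the condition type proposed in this KEP):

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
)

// getPodCondition looks a condition up by type; the order of entries in
// status.conditions (and the order in which they transitioned) carries
// no meaning, so positional access would be fragile.
func getPodCondition(status *corev1.PodStatus, condType corev1.PodConditionType) *corev1.PodCondition {
	for i := range status.Conditions {
		if status.Conditions[i].Type == condType {
			return &status.Conditions[i]
		}
	}
	return nil
}

// podHasNetwork reports whether the proposed PodHasNetwork condition is
// present and true on the pod.
func podHasNetwork(pod *corev1.Pod) bool {
	c := getPodCondition(&pod.Status, corev1.PodConditionType("PodHasNetwork"))
	return c != nil && c.Status == corev1.ConditionTrue
}
```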
> - Modify the meaning of the existing `Initialized` condition
Please see my above comment.
Added some more comments, but overall the KEP is well written and lgtm. To unblock progress on such a useful feature, I will approve this to meet the deadline. /lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull request has been approved by: dchen1107, ddebroy, ehashman, johnbelamaric, qiutongs
/unhold
@ddebroy - Nice feature - I just added two minor comments. If you could adjust them in a follow-up PR it would be great.
> No changes to any default behavior should result from enabling the feature.
I don't fully agree with this - by default each pod will get a new condition set.
So if someone is doing equality comparisons, they will start seeing changes.
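To make that concrete, here is a hedged Go sketch (the `knownTypes` allow-list and helper names are illustrative, not an existing API) of how a controller doing strict `reflect.DeepEqual` comparisons would start seeing diffs once every pod carries the new condition, and one way to compare only the condition types it knows about:

```go
package main

import (
	"reflect"

	corev1 "k8s.io/api/core/v1"
)

// knownTypes is a hypothetical allow-list for a controller that compares
// pod conditions for equality. Without filtering, a brand-new condition
// type appearing on every pod makes reflect.DeepEqual report a change
// even though nothing the controller cares about differs.
var knownTypes = map[corev1.PodConditionType]bool{
	corev1.PodScheduled:    true,
	corev1.PodInitialized:  true,
	corev1.ContainersReady: true,
	corev1.PodReady:        true,
}

// filterKnown keeps only the condition types the controller understands.
func filterKnown(conds []corev1.PodCondition) []corev1.PodCondition {
	var out []corev1.PodCondition
	for _, c := range conds {
		if knownTypes[c.Type] {
			out = append(out, c)
		}
	}
	return out
}

// conditionsEquivalent ignores condition types the controller does not
// know about, so the addition of PodHasNetwork does not register as a
// spurious change.
func conditionsEquivalent(a, b []corev1.PodCondition) bool {
	return reflect.DeepEqual(filterKnown(a), filterKnown(b))
}
```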
> No
It may potentially increase pod-startup-time (which includes reporting the state) if kubelet is starved on QPS limits.
[This is fine - I'm just pointing this out for completeness...]
One thing to note is that the Kubelet Status Manager caches all status updates, queues them, and sends them to the API server in an async fashion. So the updates to the pod conditions (existing ones as well as the new one added in this KEP) do not happen synchronously with the actual pod creation and initialization activities. Given this, do you think the above may still be a concern, @wojtek-t?
Yeah - I knew that. But still - if the conditions are not changing fast enough and individual phases take a bit of time, we may need to do more API calls. And in that case, it may increase pod startup if kubelet is cpu-starved.
As I mentioned - I don't treat it as a concern or anything blocking - I just wanted it to be added for completeness.
KEP to surface a `PodHasNetwork` condition in pods to indicate when pod sandbox creation and network configuration of the pod through the CRI runtime is complete.
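For illustration only, a minimal Go sketch of the shape the proposed condition could take in a pod's status (the field values are assumptions, not taken verbatim from the KEP):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Illustrative shape of the condition as the Kubelet could report it
	// once the CRI runtime finishes sandbox creation and the CNI plugins
	// finish network configuration for the pod.
	cond := corev1.PodCondition{
		Type:               corev1.PodConditionType("PodHasNetwork"),
		Status:             corev1.ConditionTrue,
		LastTransitionTime: metav1.Now(),
	}
	fmt.Printf("%s=%s since %s\n", cond.Type, cond.Status,
		cond.LastTransitionTime.Format("2006-01-02T15:04:05Z07:00"))
}
```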