Promote taint based evictions to GA #1450
/cc @ahg-g @Huang-Wei
2cef2aa to b12fab8
## Summary

Taint Based Evictions was introduced as an alpha feature in Kubernetes 1.6 and was promoted to
beta in 1.13. Taint Based Evictions evicts pods from a node based on taints applied to the node
It's the tricky part...
A long time ago, when the feature was proposed, it was supposed to evict pods based on node conditions. But over the course of development, it was finalized with a much smaller scope:
Taint nodes with a `NoExecute` effect automatically when the nodes become not ready or unreachable. The TaintNodesByCondition feature, by contrast, applies taints with a `NoSchedule` effect. People sometimes confuse the two.
Other functionality, such as evicting pods and setting tolerationSeconds, is outside TaintBasedEviction's scope (it is handled by the taint manager and the default toleration seconds admission controller).
So the name is very confusing, and it's worth highlighting the tricky parts.
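To make the scope concrete, here is a minimal Go sketch of the only mapping this feature is described as owning: the status of a node's Ready condition to a system-managed `NoExecute` taint. The taint keys are the well-known `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable`; the `Taint` type and the `taintForReadyStatus` function are hypothetical, simplified stand-ins and not the node lifecycle controller's actual code.

```go
package main

import "fmt"

// Taint is a simplified, hypothetical stand-in for the core/v1 Taint type.
type Taint struct {
	Key    string
	Effect string
}

// taintForReadyStatus returns the NoExecute taint that corresponds to the
// status of a node's Ready condition ("True", "False", or "Unknown").
// It returns nil when the node is ready and no taint is needed.
func taintForReadyStatus(status string) *Taint {
	switch status {
	case "False":
		// The kubelet reports the node as not ready.
		return &Taint{Key: "node.kubernetes.io/not-ready", Effect: "NoExecute"}
	case "Unknown":
		// The control plane has stopped hearing from the node.
		return &Taint{Key: "node.kubernetes.io/unreachable", Effect: "NoExecute"}
	default:
		return nil
	}
}

func main() {
	for _, status := range []string{"True", "False", "Unknown"} {
		if t := taintForReadyStatus(status); t != nil {
			fmt.Printf("Ready=%s -> %s:%s\n", status, t.Key, t.Effect)
		} else {
			fmt.Printf("Ready=%s -> no taint\n", status)
		}
	}
}
```

Note that nothing in this sketch evicts a pod or sets tolerationSeconds; it only decides which taint a node should carry.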
If we disable TaintBasedEviction, the eviction behavior and the setting of default toleration seconds still work, which also proves that those are not in the scope of TaintBasedEviction.
And maybe we should emphasize the word "automatically": the NoExecute effect has been part of taints for a long time, but the system-managed NoExecute taints are exactly what this feature manages.

### Goals

Promote Taint Based Evictions to GA.
We should say "Ensure nodes are tainted properly with a NoExecute effect when they are not ready or unreachable, so that the scheduler can use taints to make scheduling decisions consistently."

## Motivation

Taint Based Evictions has been in beta since 1.13 and has functioned well since then.
The Motivation section should apply to the feature in general, not specifically to promoting it to GA.
How about: TaintNodesByCondition ensures that nodes are tainted with a NoSchedule effect based on their node conditions. However, nodes also need to be tainted with a NoExecute effect automatically under some node conditions, such as when a node becomes not ready or unreachable.

### Non-Goals

Make any change of functionality to the feature.
Ditto. Non-Goals should apply to the feature itself.
b12fab8 to 68bcf4b
@Huang-Wei updated with your feedback
title: Promote Taint Based Evictions to GA
authors:
  - "@damemi"
owning-sig: sig-scheduling
sig-scheduling is not the owning SIG of this feature; the scheduler doesn't include any code specifically for this feature.
@liggitt which SIG does the node lifecycle controller fall under?
would that be sig-node?
probably sig-node
or sig-cloud-provider for the "not reachable nodes" aspect
* [Node lifecycle controller eviction tests](https://github.com/kubernetes/kubernetes/blob/47d5c3ef8d/pkg/controller/nodelifecycle/node_lifecycle_controller_test.go#L196)

### Integration tests
* [Taint based evictions integration test](https://github.com/kubernetes/kubernetes/blob/47d5c3ef8df2b1b26da739aec0ada15d41f20cf3/test/integration/scheduler/taint_test.go#L580) (note that prior to 1.17, this test existed as an [end-to-end test](https://github.com/kubernetes/kubernetes/blob/001f2cd2b553d06028c8542c8817820ee05d657f/test/e2e/scheduling/taint_based_evictions.go))
before graduating to GA, we should make sure that the tests are moved under the proper sig.
reviewers:
  - "@Huang-Wei"
approvers:
  - "@liggitt"
I would recommend sig-node/sig-cloud-provider for approvers
68bcf4b to fa0f1ab
/lgtm
/hold
Just checking, is there anything else this needs? Waiting on this to merge to start breaking out the issues in kubernetes/kubernetes#87161
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ahg-g, damemi, Huang-Wei
/hold cancel
Based on this comment in the TBE issue (#166 (comment)), we are creating an enhancement to officially track the promotion of Taint Based Evictions to GA.