Taint Based Eviction #166

Closed
davidopp opened this issue Jan 20, 2017 · 111 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team
Milestone

Comments

@davidopp
Member

davidopp commented Jan 20, 2017

Feature Description

  • One-line feature description (can be used as a release note):
  • Primary contact (assignee): @gmarek
  • Responsible SIGs: @kubernetes/sig-scheduling-feature-requests
  • KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/20200127-taint-based-evictions.md
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred:
  • Approver (likely from SIG/area to which feature belongs):
  • Feature target (which target equals to which milestone):
    • Alpha release target (x.y)
    • Beta release target (x.y) 1.13
    • Stable release target (x.y)
@davidopp davidopp added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Jan 20, 2017
@davidopp davidopp added this to the v1.6 milestone Jan 20, 2017
@davidopp davidopp mentioned this issue Jan 30, 2017
22 tasks
@davidopp
Member Author

davidopp commented Feb 28, 2017

This is finished except for documentation.

The NoExecute taint effect is now in Beta (as part of moving taints/tolerations to Beta), and taint-based eviction for node problems is in Alpha.
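
For readers unfamiliar with the NoExecute effect mentioned above, here is a minimal sketch of how a NoExecute taint interacts with a pod's tolerations. It is an illustration under stated assumptions, not the Kubernetes implementation: the `Taint` and `Toleration` classes and the `eviction_delay` helper are hypothetical names, and the sketch uses the first matching toleration per taint. A pod with no matching toleration is evicted right away, a toleration with `tolerationSeconds` delays the eviction, and a toleration without it keeps the pod bound indefinitely.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str          # "NoSchedule", "PreferNoSchedule" or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""        # empty key plus operator "Exists" matches any key
    operator: str = "Equal"
    value: str = ""
    effect: str = ""     # empty effect matches any effect
    toleration_seconds: Optional[int] = None  # None: tolerate indefinitely


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.operator == "Equal" and tol.value == taint.value


def eviction_delay(taints: Sequence[Taint],
                   tolerations: Sequence[Toleration]) -> Optional[int]:
    """Seconds until eviction: 0 means evict now, None means never evict."""
    delays = []
    for taint in taints:
        if taint.effect != "NoExecute":
            continue  # only NoExecute taints evict pods that are already running
        # Assumption of this sketch: the first matching toleration wins.
        match = next((t for t in tolerations if tolerates(t, taint)), None)
        if match is None:
            return 0  # an untolerated NoExecute taint evicts immediately
        if match.toleration_seconds is not None:
            delays.append(max(0, match.toleration_seconds))
    return min(delays) if delays else None
```

With several NoExecute taints on a node, the shortest bounded toleration decides when the pod is evicted, which is why the helper returns the minimum.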

The PRs involved were:

@idvoretskyi
Member

idvoretskyi commented Mar 6, 2017

@davidopp @gmarek @kevin-wangzefeng please provide the release notes and documentation PR (or links) in the features spreadsheet.

@grodrigues3 grodrigues3 added the stage/beta Denotes an issue tracking an enhancement targeted for Beta status label Mar 6, 2017
@davidopp davidopp added stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status and removed stage/beta Denotes an issue tracking an enhancement targeted for Beta status labels Mar 11, 2017
@davidopp
Member Author

Documentation PR out for review at https://github.com/kubernetes/kubernetes.github.io/pull/2774/files

@davidopp
Member Author

@davidopp
Member Author

Regarding "taint-based eviction for node problems is in Alpha": nobody is available to move it to beta in 1.7, so it will stay in alpha in 1.7.

@gyliu513

@davidopp I can help with this and target 1.7. One question: I see that kubernetes/kubernetes#40355 is already enabled by default, so can you please explain what you mean by moving this to beta in 1.7?

@gmarek

gmarek commented Apr 27, 2017

@gyliu513 - the taint controller is enabled by default, but using taints instead of direct evictions in case of node problems isn't. To move it to beta there's at least one thing to be done (besides renaming stuff from alpha to beta), which is to rewrite or write new NodeController unit tests, as they currently assume direct evictions. There are some very basic tests for taint-based evictions, but they should be drastically extended (e.g. to cover all master-disruption logic).
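
As a rough illustration of "using taints instead of direct evictions": in the taint-based mode, the node controller does not delete pods itself when a node has problems. It marks the node with a NoExecute taint and leaves the eviction to the taint controller. The sketch below shows only the marking step. The function names are hypothetical, and the taint keys are the ones in current Kubernetes documentation, which is an assumption here since the Alpha-era keys discussed in this thread may have differed.

```python
from typing import Optional, Set

# Taint keys as documented for current Kubernetes (assumed, see the note above).
NOT_READY_TAINT = "node.kubernetes.io/not-ready"
UNREACHABLE_TAINT = "node.kubernetes.io/unreachable"


def no_execute_taint_for(ready_status: str) -> Optional[str]:
    """Map the node's Ready condition status to the taint key it should carry."""
    return {"False": NOT_READY_TAINT, "Unknown": UNREACHABLE_TAINT}.get(ready_status)


def reconcile_taints(current: Set[str], ready_status: str) -> Set[str]:
    """Return the taint keys the node should have after observing ready_status."""
    wanted = no_execute_taint_for(ready_status)
    # Drop any stale node-problem taint, keep unrelated taints untouched.
    kept = current - {NOT_READY_TAINT, UNREACHABLE_TAINT}
    return kept | ({wanted} if wanted else set())
```

Unit tests for this mode would then assert on the node's taints rather than on deleted pods, which is roughly the rewrite of the NodeController tests described above.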

@davidopp
Member Author

I think we can only do this in 1.7 if @gmarek has the bandwidth to do all the reviews and define what we need to do to move it to beta. (Sounds like he's already done most of the second thing above.)

@gmarek do you have time?

@gmarek

gmarek commented Apr 27, 2017

Yes, I can find time for reviews and I can find time to think through what needs to be done, if there's someone willing to work on it.

@gyliu513

@davidopp @gmarek I will volunteer for this; can you please assign it to me? @gmarek, I will go through this feature and propose something to you.

@davidopp
Member Author

assigned

@gmarek

gmarek commented Apr 28, 2017

@gyliu513 OK - let me know if you need some directions.

@davidopp davidopp modified the milestones: v1.7, v1.6 Apr 29, 2017
@idvoretskyi idvoretskyi added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label May 3, 2017
@idvoretskyi
Member

@davidopp @gmarek I've updated the feature description to fit the new template. Please fill in the empty fields in the new template (their actual state was unclear).

@davidopp
Member Author

davidopp commented May 5, 2017

Moving to Beta is the goal for 1.7.

@gyliu513

gyliu513 commented May 5, 2017

@gmarek I want to split the work into two tasks:

  1. Rename from alpha to beta.
  2. Update the unit tests for the node controller to cover more cases.

Comments?

@Ye-Tian-Zero
Contributor

@damemi Hi, is the work to move #1450 from scheduling to node still in progress? Will it be done before the v1.18 KEP freeze?

@damemi
Contributor

damemi commented Jan 27, 2020

@skilxn-go I opened a PR to move the KEP here: #1510

@Ye-Tian-Zero
Contributor

Thanks, got it

@palnabarun
Member

Hi @damemi, just a friendly reminder that the Code Freeze will go into effect on Thursday 5th March.

Can you please link all the k/k PRs or any other PRs which should be tracked for this enhancement?

Thank You :)

@damemi
Contributor

damemi commented Feb 5, 2020

Hi @palnabarun, we have an umbrella issue which links to the issues/PRs that are in the works for this: kubernetes/kubernetes#87161

@palnabarun
Member

Thank you @damemi for updating this. :)

@sethmccombs

Hey @damemi -

Seth here, Docs shadow on the 1.18 release team.

Does the enhancement work planned for 1.18 require any new docs or modifications to existing docs?

If not, can you please update the 1.18 Enhancement Tracker Sheet (or let me know and I'll do so)?

If doc updates are required, reminder that the placeholder PRs against k/website (branch dev-1.18) are due by Friday, Feb 28th.

Let me know if you have any questions!

@ingvagabund
Contributor

ingvagabund commented Feb 26, 2020

@sethmccombs IIUC, given that I have the doc PR open against the dev-1.18 branch (kubernetes/website#19302), there's no need to update any sheet, right?

@sethmccombs

@ingvagabund you got it, I'll update the Enhancement tracking sheet!

@palnabarun
Member

Hi @damemi, this is a reminder that we are just two days away from Code Freeze on 5th March.

By the Code Freeze, all the relevant PRs should be merged; otherwise you would need to file an exception request.

@damemi
Contributor

damemi commented Mar 3, 2020

@damemi
Contributor

damemi commented Mar 3, 2020

@palnabarun actually before those 3 can merge, we need to get the KEP move approved: #1510

@palnabarun
Member

@damemi I see that the PRs are blocked on approvals at the moment. Do you think they will make it before the deadline?

Code Freeze is at EOD today.

Please file an exception if you think the PRs might slip the deadline.

@damemi
Contributor

damemi commented Mar 5, 2020

I think we will need more time to get the approvals. What's the process to file an exception?

@jeremyrickard
Contributor

@damemi
Contributor

damemi commented Mar 5, 2020

@jeremyrickard thanks, exception filed

@palnabarun
Member

@damemi The exception request was approved. :)

@palnabarun
Member

Hi @damemi, since this enhancement graduated to Stable this release 🚀, the status can now be set to be Implemented.

Can you please update the status? After that, we will close this issue.

@damemi
Contributor

damemi commented Mar 23, 2020

@palnabarun sure, opened that here: #1625

@palnabarun
Member

Thank you @damemi :)

ingvagabund pushed a commit to ingvagabund/enhancements that referenced this issue Apr 2, 2020
enhancements: support real-time kernels
@palnabarun
Member

The corresponding enhancement has graduated to Stable. 🥳

Closing this issue.

/close

@k8s-ci-robot
Contributor

@palnabarun: Closing this issue.

In response to this:

The corresponding enhancement has graduated to Stable. 🥳

Closing this issue.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@palnabarun
Member

Edit: The issue description here had a 404 link. Pointed it to the correct one.

@palnabarun palnabarun added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels May 3, 2020