Rescheduler #109
Comments
Removing from 1.5 milestone.
milestone 1.7?
@aveshagarwal Will you be working on it for 1.7? If so, then yes we should set 1.7 milestone.
@davidopp yes.
done
@davidopp @aveshagarwal I've updated the feature description to fit the new template. Please fill in the empty fields in the new template (their actual state was unclear).
@aveshagarwal I assume we won't have any code for this in 1.7, probably just a design at most, so we should move it to next-milestone?
@davidopp I am prototyping the utilization-based use case, as per the existing design doc, in the current rescheduler code in contrib. So I am planning to have that by 1.7. But since it will be in the contrib repo outside the kube repo, I am not sure it would impact kube 1.7.
@davidopp The one thing that I am looking into is a new priority function based on node utilization that might be needed as part of the existing scheduler, so that when a rescheduler moves a pod off an over-utilized node, the existing scheduler can schedule that pod onto a less/under-utilized node, in alignment with the rescheduler's decision. So that is the thing that might be needed for kube 1.7, as per my current understanding, for the first version of the rescheduler.
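(Editorial illustration, not from the thread: a minimal sketch of what such a utilization-based priority function could look like, using hypothetical types rather than the real scheduler API. The idea is simply to score nodes higher the lower their observed utilization, e.g. as reported by something like Heapster, so that an evicted pod tends to land on a less utilized node.)

```go
package main

import "fmt"

// NodeMetrics is a hypothetical stand-in for the utilization data that a
// metrics source such as Heapster might report for one node.
type NodeMetrics struct {
	Name             string
	CPUUsedMilli     int64 // observed CPU usage, in millicores
	CPUCapacityMilli int64 // allocatable CPU, in millicores
}

// utilizationScore maps a node's observed CPU utilization to a 0-10
// priority score: an idle node scores 10, a fully used node scores 0.
// A real implementation would also look at memory and other resources.
func utilizationScore(m NodeMetrics) int64 {
	if m.CPUCapacityMilli == 0 {
		return 0
	}
	used := float64(m.CPUUsedMilli) / float64(m.CPUCapacityMilli)
	if used > 1 {
		used = 1
	}
	return int64((1 - used) * 10)
}

func main() {
	nodes := []NodeMetrics{
		{Name: "node-a", CPUUsedMilli: 3500, CPUCapacityMilli: 4000}, // heavily used
		{Name: "node-b", CPUUsedMilli: 500, CPUCapacityMilli: 4000},  // mostly idle
	}
	for _, n := range nodes {
		fmt.Printf("%s score=%d\n", n.Name, utilizationScore(n))
	}
}
```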
@aveshagarwal Does the rescheduler still only work for pods under
This issue is referring to a different rescheduler than the one we currently have. The naming is unfortunate. The current rescheduler will go away once #268 is implemented.
Good to know, thanks @davidopp
Shouldn't we just use existing priority functions in the scheduler for the rescheduler, instead of adding more? I think the first thing we should focus on is actually the spreading function for services, replica sets, deployments etc.
-- Filip
Yes, I am focusing on the spreading use case based on nodes' resource utilization.
Why not reschedule when pods suffer from performance issues instead of rescheduling whenever nodes have high utilization? What's the benefit of the latter?
I think a benefit of the latter is to have a balanced cluster after events like the following, for example:
1. a node comes back from maintenance
2. auto scaling
3. over time, pods' first scheduling decision might turn out to be a sub-optimal one

To reschedule a pod experiencing a performance issue or poor service is also a use case that we would like to handle eventually, but not as a first step. Moreover, I think if we act proactively, a pod probably would not experience poor service in the first place. So there are various triggers that can cause a rescheduler to act, like poor service as you mentioned, node utilization, and many others. But I think, as per the discussion, spreading based on node utilization seems to be the first step most users might be interested in.
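(A rough, hypothetical sketch of the utilization-based spreading idea described above, not the actual rescheduler code: treat a node above a target utilization as over-utilized and pick evictable pods from it until it drops back under the target, leaving placement of the evicted pods to the default scheduler.)

```go
package main

import "fmt"

// Pod and Node are hypothetical, simplified views of the real objects.
type Pod struct {
	Name     string
	CPUMilli int64 // CPU requested by the pod, in millicores
}

type Node struct {
	Name             string
	CPUUsedMilli     int64
	CPUCapacityMilli int64
	Pods             []Pod // pods considered safe to evict
}

// podsToEvict greedily picks pods (in list order) from an over-utilized node
// until its utilization falls back under the target fraction. A real
// rescheduler would also respect PodDisruptionBudgets, controllers, pod
// priorities and the eviction API; none of that is modeled here.
func podsToEvict(n Node, target float64) []Pod {
	var evict []Pod
	used := n.CPUUsedMilli
	limit := int64(target * float64(n.CPUCapacityMilli))
	for _, p := range n.Pods {
		if used <= limit {
			break
		}
		evict = append(evict, p)
		used -= p.CPUMilli
	}
	return evict
}

func main() {
	n := Node{
		Name:             "node-a",
		CPUUsedMilli:     3600,
		CPUCapacityMilli: 4000,
		Pods:             []Pod{{"web-1", 500}, {"batch-1", 800}, {"web-2", 400}},
	}
	for _, p := range podsToEvict(n, 0.7) { // target: 70% CPU utilization
		fmt.Println("evict", p.Name)
	}
}
```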
What is the rescheduling optimizing for then in the short term? Improved bin packing?
I'd say to optimize (specifically minimize) the number of over-utilized nodes (x) in a cluster, such that x approaches 0.
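(Read as a formula, with a hypothetical utilization threshold T that is not specified in the thread, the objective would be roughly:)

```latex
% Hypothetical formalization; the threshold T (e.g. 80% of allocatable resources) is an assumption.
x = \bigl|\{\, n \in \mathrm{Nodes} : \mathrm{util}(n) > T \,\}\bigr|,
\qquad \text{objective: } \min x \quad (\text{ideally } x \to 0)
```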
The features repo should not be used for technical discussions. Please move the discussion to kubernetes/kubernetes#12140. BTW @aveshagarwal it would probably be good if you were to write a short design doc for what you're doing.
@davidopp Yeah sure, planning to have something by next week.
@davidopp @aveshagarwal have you agreed to have this feature for 1.7? If yes, please fill in the features template to reflect the actual status.
@aveshagarwal mentioned just one change that he might want in 1.7 (#109 (comment)). But Avesh, what you described sounds like the current default scheduling policy (try to spread based on resources). So maybe you don't need a new priority function in 1.7?
@davidopp Yeah, that sounds good. In that case there don't seem to be any changes needed in kube for the initial version, so it should not impact kube 1.7. Though, I was thinking of a priority function based on actual resource utilization (for example, by obtaining metrics from something like Heapster), which is different from how the existing spreading function works.
We've talked about doing usage-based scheduling for best-effort pods (kubernetes/kubernetes#18438), but don't have it yet.
@davidopp @aveshagarwal so, any update on the status, gentlemen?
@davidopp @aveshagarwal is this feature going to land in 1.7? If not, I'll remove the 1.7 association.
@idvoretskyi No.
@davidopp @aveshagarwal @kubernetes/sig-scheduling-feature-requests any plans to continue the feature development for 1.9?
There will be development in the future, but I'm not sure about 1.9. @aveshagarwal are you planning to do more work on this for 1.9?
@davidopp @idvoretskyi Yes, there will be ongoing development for adding new features/functionality and regular releases. Here is the repo: https://github.com/kubernetes-incubator/descheduler. After every Kubernetes release, it will be rebased onto the latest kube release.
@aveshagarwal @davidopp Cool, I'll add this item to the 1.9 features track.
Issues go stale after 90d of inactivity. Prevent issues from auto-closing with an /lifecycle frozen comment. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Feature Description

Description
A component that evicts pods (that are managed by a controller) to achieve some set of objectives.
This feature needs a detailed design doc; an initial design proposal is here.

Progress Tracker

- Before Alpha
  - Write and maintain draft quality doc
    - During development keep a doc up-to-date about the desired experience of the feature and how someone can try the feature in its current state. Think of it as the README of your new feature and a skeleton for the docs to be written before the Kubernetes release. Paste link to Google Doc: DOC-LINK
  - Design Approval
    - Design Proposal. This goes under docs/proposals. Doing a proposal as a PR allows line-by-line commenting from community, and creates the basis for later design documentation. Paste link to merged design proposal here: PROPOSAL-NUMBER
    - Decide which repo this feature's code will be checked into. Not everything needs to land in the core kubernetes repo. REPO-NAME
    - Initial API review (if API). Maybe same PR as design doc. PR-NUMBER
      - Any code that changes an API (/pkg/apis/...)
      - cc @kubernetes/api
    - Identify shepherd (your SIG lead and/or kubernetes-pm@googlegroups.com will be able to help you). My Shepherd is: replace.me@replaceme.com (and/or GH Handle)
      - A shepherd is an individual who will help acquaint you with the process of getting your feature into the repo, identify reviewers and provide feedback on the feature. They are not (necessarily) the code reviewer of the feature, or tech lead for the area.
      - The shepherd is not responsible for showing up to Kubernetes-PM meetings and/or communicating if the feature is on-track to make the release goals. That is still your responsibility.
    - Identify secondary/backup contact point. My Secondary Contact Point is: replace.me@replaceme.com (and/or GH Handle)
  - Write (code + tests + docs) then get them merged. ALL-PR-NUMBERS
    - Code needs to be disabled by default. Verified by code OWNERS
    - Minimal testing
    - Minimal docs
      - cc @kubernetes/docs on docs PR
      - cc @kubernetes/feature-reviewers on this issue to get approval before checking this off
    - New apis: Glossary Section Item in the docs repo: kubernetes/kubernetes.github.io
    - Update release notes
- Before Beta
  - Testing is sufficient for beta
  - User docs with tutorials
    - Updated walkthrough / tutorial in the docs repo: kubernetes/kubernetes.github.io
    - cc @kubernetes/docs on docs PR
    - cc @kubernetes/feature-reviewers on this issue to get approval before checking this off
  - Thorough API review
    - cc @kubernetes/api
- Before Stable
  - docs/proposals/foo.md moved to docs/design/foo.md
    - cc @kubernetes/feature-reviewers on this issue to get approval before checking this off
  - Soak, load testing
  - Detailed user docs and examples
    - cc @kubernetes/docs
    - cc @kubernetes/feature-reviewers on this issue to get approval before checking this off

FEATURE_STATUS is used for feature tracking and to be updated by @kubernetes/feature-reviewers.
FEATURE_STATUS: IN_DEVELOPMENT

More advice:

Design
- Once you get LGTM from a @kubernetes/feature-reviewers member, you can check this checkbox, and the reviewer will apply the "design-complete" label.

Coding
- Use as many PRs as you need. Write tests in the same or different PRs, as is convenient for you.
- As each PR is merged, add a comment to this issue referencing the PRs. Code goes in the http://github.com/kubernetes/kubernetes repository, and sometimes http://github.com/kubernetes/contrib, or other repos.
- When you are done with the code, apply the "code-complete" label.
- When the feature has user docs, please add a comment mentioning @kubernetes/feature-reviewers and they will check that the code matches the proposed feature and design, that everything is done, and that there is adequate testing. They won't do detailed code review: that already happened when your PRs were reviewed. When that is done, you can check this box and the reviewer will apply the "code-complete" label.

Docs
- Write user docs and get them merged in.
- User docs go into http://github.com/kubernetes/kubernetes.github.io.
- When the feature has user docs, please add a comment mentioning @kubernetes/docs.
- When you get LGTM, you can check this checkbox, and the reviewer will apply the "docs-complete" label.