This repository has been archived by the owner on Feb 5, 2020. It is now read-only.
This is somewhat of an unknown -- behavior that should be validated before running a rescheduler, and possibly grounds for removing the critical annotation from the scheduler/controller-manager.
If we mark the scheduler as "critical", for example, and the cluster then gets into a state where that pod is unschedulable because all schedulers are down (this has happened when all schedulers ended up co-located on the same node and that node died) -- will the rescheduler start evicting other workloads?
This would be pretty dangerous, as the rescheduler would likely keep evicting workloads even though the critical pod can never be scheduled (there is no scheduler left to schedule it).
Ideally the rescheduler would only evict based on "known" scheduling failures (e.g. no more allocatable CPU), not just on "can't schedule" -- but I'm unsure of the exact behavior.
We don't currently deploy the rescheduler, but we should evaluate this behavior before doing so (and possibly remove the critical-pod annotation from the controller-manager / scheduler manifests).
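For context, the annotation in question is the (since-deprecated) `scheduler.alpha.kubernetes.io/critical-pod` alpha annotation. A minimal sketch of where it would sit in a scheduler pod template -- this is illustrative, not the exact bootkube manifest:

```yaml
# Illustrative fragment of a kube-scheduler pod template in kube-system
# (not the actual bootkube manifest).
metadata:
  annotations:
    # Marks the pod as critical, allowing the rescheduler to evict other
    # workloads to make room for it -- the behavior this issue questions.
    scheduler.alpha.kubernetes.io/critical-pod: ""
```

Removing that annotation line from the scheduler and controller-manager manifests is what the issue title proposes, so a rescheduler cannot loop evicting workloads for a pod that nothing is left to schedule.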
aaronlevy changed the title from "Using rescheduler and marking controller-manager/scheduler as critical might be dangerous" to "Remove critical-pod annotation from scheduler and controller-manager" on Aug 16, 2017
xref: kubernetes-retired/bootkube#519