Enable PDB based gang scheduling with discrete queues. #465
…-batch into amarek-pdb-queue
@@ -49,6 +61,7 @@ func (s *ServerOption) AddFlags(fs *pflag.FlagSet) {
fs.StringVar(&s.SchedulerName, "scheduler-name", "kube-batch", "kube-batch will handle pods with the scheduler-name")
fs.StringVar(&s.SchedulerConf, "scheduler-conf", "", "The namespace and name of ConfigMap for scheduler configuration")
fs.StringVar(&s.SchedulePeriod, "schedule-period", "1s", "The period between each scheduling cycle")
fs.StringVar(&s.PdbQueue, "pdb-queue", "", "The name of the Queue object to be used with PDBs instead of their namespace name")
Rather than adding a parameter, I'd prefer adding an annotation to the PDB for the queue.
For example, scheduling.k8s.io/queue-name.
With an annotation, you can create multiple queues for different PDBs :)
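A minimal sketch of what the annotation-based lookup suggested above might look like. The key scheduling.k8s.io/queue-name is only the example floated in this thread, not an established kube-batch API; the helper name queueForPDB, the plain map standing in for the PDB's annotations, and the fallback to the namespace name are all assumptions made for illustration.

```go
package main

import "fmt"

// queueAnnotation is the example key proposed in the review thread.
// It is an assumption here, not a confirmed kube-batch annotation.
const queueAnnotation = "scheduling.k8s.io/queue-name"

// queueForPDB is a hypothetical helper: it returns the queue named by the
// annotation, or the namespace name when the annotation is absent or empty.
// The annotations map stands in for a PDB's metadata.annotations.
func queueForPDB(annotations map[string]string, namespace string) string {
	if q := annotations[queueAnnotation]; q != "" {
		return q
	}
	return namespace
}

func main() {
	annotated := map[string]string{queueAnnotation: "training"}
	fmt.Println(queueForPDB(annotated, "team-a")) // annotation takes precedence
	fmt.Println(queueForPDB(nil, "team-a"))       // no annotation: namespace is used
}
```

Because each PDB carries its own annotation, different PDBs can point at different queues, which a single command-line flag cannot express.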
The specific use case I'm looking at is co-scheduling tf-operator jobs with other multi-node workloads (PodGroup/Queue based). I fully agree that annotations are more flexible; however, if we go that route, tf-operator needs to be extended to add those annotations to its PDBs (either directly or via a mutating admission controller), which is problematic for existing deployments.
So we could consider this PR a stopgap for environments where PDBs cannot easily be equipped with additional annotations (for whatever reason).
Separately, the longer-term strategy should be decided: either introduce the annotations and extend tf-operator's PDBs to include them, or switch tf-operator to PodGroups entirely (since with annotations we would introduce kube-batch specific concepts anyway)?
Oh, for the tf-operator case, I'd like to switch it to PodGroups; but we need to align with the tf-operator team and sig-scheduling on that. How about opening an issue in kubeflow/tf-operator for it first?
btw, you did not use multiple queues, right?
No, my use case is based on a single queue.
I agree that switching to PodGroups is the way to go for tf-operator, and I can definitely open an issue there. However, as things currently stand, tf-operator gang scheduling doesn't work if queue as namespace is disabled.
If we switch to PodGroup, we still need to provide a default queue name for your case (similar to this PR); otherwise, tf-operator needs to have a field for Queue.
Queue is an experimental feature right now, which means you can try it out, but other communities should not depend on it.
For these reasons, maybe your PR is better as a temporary solution; WDYT?
BTW, I'd prefer DefaultQueue instead of PdbQueue, so that when we remove PDB, it also works for PodGroup.
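To illustrate why a source-neutral name helps, here is a rough sketch of a single default queue applied regardless of whether a job came from a PDB or a PodGroup. The job type, its Queue field, the queueFor helper, and the precedence order are all hypothetical simplifications, not kube-batch's actual data model.

```go
package main

import "fmt"

// job is a hypothetical, simplified stand-in for a scheduler job.
// It is not the kube-batch JobInfo type.
type job struct {
	Namespace string
	Queue     string // explicit queue, if the job's source object names one (assumed)
}

// queueFor picks a queue in an assumed order of precedence:
// an explicit queue on the job, then the configured default queue,
// then the job's namespace name.
func queueFor(j job, defaultQueue string) string {
	switch {
	case j.Queue != "":
		return j.Queue
	case defaultQueue != "":
		return defaultQueue
	default:
		return j.Namespace
	}
}

func main() {
	pdbJob := job{Namespace: "team-a"}                   // e.g. a PDB based gang, no explicit queue
	groupJob := job{Namespace: "team-b", Queue: "batch"} // e.g. a PodGroup naming its queue

	fmt.Println(queueFor(pdbJob, "shared"))   // default queue applies
	fmt.Println(queueFor(groupJob, "shared")) // explicit queue wins
	fmt.Println(queueFor(pdbJob, ""))         // nothing configured: namespace
}
```

Nothing in queueFor depends on the job's origin, so the same option keeps working if PDB support is later removed.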
makes sense, will amend
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: adam-marek, k82cn
…queue Enable PDB based gang scheduling with discrete queues.
What this PR does / why we need it:
Currently it's not possible to schedule PDB based gangs when the enable-namespace-as-queue option is disabled. This PR introduces an option to set the name of the queue to be used with PDB based jobs via the command line (if not specified, the name of the namespace is still used).
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
Special notes for your reviewer:
Release note: