Considering best-effort pods when calculating ready task number #647
pkg/scheduler/framework/session.go
Outdated
@@ -297,6 +297,15 @@ func (ssn *Session) Allocate(task *api.TaskInfo, hostname string) error {
return err
}
}
} else {
// For a best-effort pod, it is not necessary to consider the gang constraint.
if task.InitResreq.IsEmpty() {
What happens if there are not enough resources for the other pods?
Suppose a job has six pods and its minA is 6; two of the six pods are best-effort pods and the other four are non-best-effort pods. If the cluster does not have enough resources to satisfy the job's resource requests, the best-effort pods will be dispatched successfully and move to running, while the non-best-effort pods will remain pending.
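The scenario described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the scheduler's code: `Task`, its `CPU` field, and `simulate` are hypothetical names, and `CPU == 0` stands in for `task.InitResreq.IsEmpty()`.

```go
package main

import "fmt"

// Task is a hypothetical stand-in for a scheduler task. CPU == 0 models a
// best-effort pod, i.e. one whose resource request is empty.
type Task struct {
	Name string
	CPU  int
}

// simulate tentatively allocates tasks while capacity lasts. Non-best-effort
// tasks are dispatched only when the gang constraint (minAvailable) is met;
// best-effort tasks are dispatched regardless, as in the diff under review.
func simulate(tasks []Task, minAvailable, freeCPU int) (running, pending []string) {
	var allocated []Task
	for _, t := range tasks {
		if t.CPU <= freeCPU {
			freeCPU -= t.CPU
			allocated = append(allocated, t)
		} else {
			pending = append(pending, t.Name)
		}
	}

	gangReady := len(allocated) >= minAvailable
	for _, t := range allocated {
		if gangReady || t.CPU == 0 {
			running = append(running, t.Name)
		} else {
			// Allocated in this cycle but held back by the gang constraint.
			pending = append(pending, t.Name)
		}
	}
	return running, pending
}

func main() {
	tasks := []Task{
		{"p1", 2}, {"p2", 2}, {"p3", 2}, {"p4", 2},
		{"be1", 0}, {"be2", 0},
	}
	// Only 4 CPUs are free, so the four non-best-effort pods cannot all fit.
	running, pending := simulate(tasks, 6, 4)
	fmt.Println("running:", running)
	fmt.Println("pending:", pending)
}
```

With enough free capacity for all four non-best-effort pods, the same function reports every pod as running, which is the case where the gang constraint is satisfied.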
@@ -231,6 +231,10 @@ func (alloc *allocateAction) Execute(ssn *framework.Session) {

if ssn.JobReady(job) {
	stmt.Commit()
} else if ssn.JobCandidateReady(job) {
I don't think it's necessary to add a new callback; it should be OK to check best-effort pods in JobReadyFn.
If the best-effort check is merged into the JobReady function, then as soon as JobReady is satisfied, dispatch of the non-best-effort pods will start, so the job's non-best-effort pods will be bound to nodes first. But at that point the gang constraint of the job is not yet satisfied: the job's remaining best-effort pods may fail to be scheduled during the backfill action.
IMO, we should dispatch all pods of a job in the allocate action, including best-effort pods :)
pkg/scheduler/api/job_info.go
Outdated
for _, task := range ji.Tasks {
	if task.InitResreq.IsEmpty() {
		occupied++
	} else if AllocatedStatus(task.Status) ||
If so, we're going to go through all pending tasks for this. Instead, use TaskStatusIndex to check allocated tasks, and to find best-effort pods in pending status.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: k82cn, sivanzcw
#646