Fix scale up with no assigned node #467

Merged
12 commits into PaddlePaddle:develop, Nov 3, 2017

Conversation

Yancey1989 (Collaborator):

Fixed #465

@Yancey1989 changed the title from "Assign node resource" to "Fix scale up with no assigned node" on Nov 2, 2017
func syncNodesIdleResource(podList *v1.PodList, nodesCPUIdleMilli map[string]int64, nodesMemoryIdleMega map[string]int64) (err error) {
	for _, pod := range podList.Items {
		nodeName := pod.Spec.NodeName
		// if nodeName is empty, the pod is not assigned.
Collaborator:
This comment does not seem necessary; the code below is clear enough.

Yancey1989 (Author):
Done.

@@ -282,6 +309,11 @@ func scaleDryRun(r *ClusterResource, j job, curDiff int, maxLoadDesired float64,
return
}

	if !searchAssignableNodeByCPU(r, j) || !searchAssignableNodeByMem(r, j) {
		// can not find assignable node, do not scale
		additional = 0
Collaborator:
You need to return immediately from here; otherwise the code below may still be executed.
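
In other words, the suggested change is roughly the following (a sketch, assuming additional is a named return value of scaleDryRun, as the bare return earlier in the hunk suggests):

	if !searchAssignableNodeByCPU(r, j) || !searchAssignableNodeByMem(r, j) {
		// No node has enough idle CPU or free memory for one more
		// trainer pod: do not scale this job, and stop here so the
		// code below cannot overwrite additional.
		additional = 0
		return
	}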

Yancey1989 (Author):
Done.

	assignable = false
	for _, idle := range r.NodesCPUIdleMilli {
		if j.TrainerCPURequestMilli() <= idle {
			assignable = true
Collaborator:
In Go, if a return value is declared with a variable name, a bare return returns that variable directly. This is useful when the return value needs to be declared at the beginning of the function, like var ret bool.

In your case, just declaring the return type as bool and writing return false or return true will do the job.
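
For illustration, the two styles look like this (a sketch reusing the types from the diff; the Named suffix is hypothetical, added only so both variants can coexist):

	// Variant 1: named result. A bare return returns assignable implicitly.
	func searchAssignableNodeByCPUNamed(r *ClusterResource, j job) (assignable bool) {
		for _, idle := range r.NodesCPUIdleMilli {
			if j.TrainerCPURequestMilli() <= idle {
				assignable = true
				return
			}
		}
		return
	}

	// Variant 2 (suggested): plain bool result with explicit returns.
	func searchAssignableNodeByCPU(r *ClusterResource, j job) bool {
		for _, idle := range r.NodesCPUIdleMilli {
			if j.TrainerCPURequestMilli() <= idle {
				return true
			}
		}
		return false
	}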

Yancey1989 (Author):
Thanks, got it. Done.


		NodesCPUIdleMilli:   nodesCPUIdleMilli,
		NodesMemoryIdleMega: nodesMemoryIdleMega,
Collaborator:
We say "idle CPU" and "free memory", not "idle memory".
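
For example, the struct literal in this hunk might then read (a sketch of the suggested naming only; the Free-based names are assumptions, not necessarily what the PR finally uses):

	r := ClusterResource{
		NodesCPUIdleMilli:   nodesCPUIdleMilli,
		NodesMemoryFreeMega: nodesMemoryFreeMega,
	}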

Yancey1989 (Author):
Done.

		if nodeName == "" {
			continue
		}
		for _, container := range pod.Spec.Containers {
Collaborator:
Should we check only pending and running pods?

Yancey1989 (Author):
Only Terminating and Running Pods take up resources on a node.

Collaborator:
Oh OK, so maybe we need to check only "Terminating and Running" pods? :)

Yancey1989 (Author), Nov 2, 2017:

I think so. Following the comment at https://github.com/kubernetes/kubernetes/blob/v1.6.2/pkg/api/v1/types.go#L2331, if NodeName is non-empty, the Pod has been assigned to that Node, which means it takes up resources on that Node.

Collaborator:
I see, so it's not necessary to check the pod status: as long as pod.Spec.NodeName is not empty, the pod is assigned to a node and occupying its resources.
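
Putting the thread together, a sketch of the sync loop that skips unassigned pods purely by NodeName (the bookkeeping inside the inner loop is illustrative rather than the PR's exact code, and assumes the Kubernetes v1 and apimachinery resource packages are imported):

	func syncNodesIdleResource(podList *v1.PodList, nodesCPUIdleMilli map[string]int64, nodesMemoryIdleMega map[string]int64) (err error) {
		for _, pod := range podList.Items {
			nodeName := pod.Spec.NodeName
			// An unscheduled pod has an empty NodeName and occupies no
			// node resources, so skip it.
			if nodeName == "" {
				continue
			}
			for _, container := range pod.Spec.Containers {
				// Subtract each container's requests from the node's
				// idle CPU (milli-cores) and free memory (megabytes).
				nodesCPUIdleMilli[nodeName] -= container.Resources.Requests.Cpu().MilliValue()
				nodesMemoryIdleMega[nodeName] -= container.Resources.Requests.Memory().ScaledValue(resource.Mega)
			}
		}
		return
	}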

helinwang (Collaborator):

LGTM after the test passes.

typhoonzero (Collaborator) left a comment:

LGTM++

@helinwang merged commit 7f6645c into PaddlePaddle:develop on Nov 3, 2017
@@ -505,7 +498,9 @@ func (a *Autoscaler) Monitor() {
			continue
		}
		j.TrainerJob = tj
		a.jobs[key] = j
Collaborator:
You need to write the job back to the map; j.TrainerJob = tj on its own is not sufficient.
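
A self-contained illustration of why the write-back matters when the map stores job values rather than pointers (simplified, hypothetical types):

	package main

	import "fmt"

	type trainerJob struct{ name string }

	type job struct {
		TrainerJob *trainerJob
	}

	func main() {
		jobs := map[string]job{"k": {}}

		j := jobs["k"] // j is a copy of the map value
		j.TrainerJob = &trainerJob{name: "tj"}
		fmt.Println(jobs["k"].TrainerJob) // <nil>: only the copy was modified

		jobs["k"] = j // write the modified copy back
		fmt.Println(jobs["k"].TrainerJob.name) // tj
	}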

typhoonzero (Collaborator), Nov 3, 2017:

Refining the code later is fine. Also, the scaling algorithm should be updated along these lines:

  1. ScaleDryRun searches per node for one with sufficient resources to scale up by 1 pod, and returns the diff.
  2. The cluster's total resource is not used for scale-up; only per-node resources are used.
  3. Find a way to ensure max_load_desired for each node.

I'll add a design doc for details.
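
A rough sketch of step 1, searching per node for room to add one more trainer pod (dryRunScaleUpOne and TrainerMemRequestMega are hypothetical names; only ClusterResource, job, and the fields shown in the diff come from the PR):

	// Returns a diff of 1 if some node can host one more trainer pod
	// for job j, and 0 otherwise.
	func dryRunScaleUpOne(r *ClusterResource, j job) int {
		for nodeName, idleMilli := range r.NodesCPUIdleMilli {
			if j.TrainerCPURequestMilli() <= idleMilli &&
				j.TrainerMemRequestMega() <= r.NodesMemoryIdleMega[nodeName] {
				return 1
			}
		}
		return 0
	}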

helinwang (Collaborator), Nov 3, 2017:

Thanks @typhoonzero!

Btw, maybe max_load_desired should apply to the entire cluster? Otherwise, if it applies to each node with max_load_desired=0.97, each node would only keep 0.03*12=0.36 CPUs free (assuming 12-CPU nodes), and many pods probably could not be scheduled in that headroom anyway.
