Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add network-topology-aware plugin and hyperNode score callback #3894

Open
wants to merge 3 commits into
base: network-topology
Choose a base branch
from

Conversation

ecosysbin
Copy link

No description provided.

@volcano-sh-bot
Copy link
Contributor

Welcome @ecosysbin!

It looks like this is your first PR to volcano-sh/volcano.

Thank you, and welcome to Volcano. 😃

@volcano-sh-bot volcano-sh-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 16, 2024
@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 16, 2024
@ecosysbin ecosysbin changed the title [WIP]Add network-topology-aware plugin and hyperNode score callback #3885 Add network-topology-aware plugin and hyperNode score callback #3885 Dec 16, 2024
@volcano-sh-bot volcano-sh-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 16, 2024
@ecosysbin
Copy link
Author

/assign @Monokaix

@ecosysbin ecosysbin force-pushed the network-topology branch 2 times, most recently from 2f15078 to 947107e Compare December 16, 2024 11:41

// The RootHypernode property of a job is the hypernode that serves as the smallest root in the hypernode tree.
// A job has multiple tasks, each belonging to a hypernode. This RootHypernode is the topmost and lowest common ancestor among the hypernodes of all tasks within the job.
RootHyperNode string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is RootHyperNode assigned? I didn't find any codes assign the value to RootHyperNode

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when the task of job commit, the hyperNode of the task will assigned to job. it will add the code later.

@@ -300,6 +301,9 @@ type NodeReduceFn func(*TaskInfo, k8sframework.NodeScoreList) error
// NodeOrderMapFn is the func declaration used to get priority score of all plugins for a node for a particular task.
type NodeOrderMapFn func(*TaskInfo, *NodeInfo) (map[string]float64, float64, error)

// HyperNodeOrderFn is the func declaration used to score hyperNodes for job.
type HyperNodeOrderFn func(*JobInfo, map[string][]*NodeInfo, []int, map[int][]string, map[string]sets.Set[string]) (map[string]float64, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may need to rebase this pr and factor your code:https://github.com/volcano-sh/volcano/pull/3874/files, the HyperNodeOrderFn structure has changed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. l will do it later.

// hyperNode can gain a better performance, the lower the tier of hyperNode, the better performance.
HyperNodesListByTier map[int][]string

HyperNodesTiers []int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, there are no so many fields now, may need to rebase and refactor: https://github.com/volcano-sh/volcano/pull/3874/files

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. According to the discussion, there will have a hyperNode tree here. it will do in another PR.


// Goals:
// - The tier to which the rootHypernode of a job belongs should be as low as possible.
func networkTopologyAwareScore(hyperNode string, job *api.JobInfo, hyperNodesTiers []int, hyperNodesListByTier map[int][]string, hyperNodesMap map[string]sets.Set[string]) float64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn’t get the point for this score func. Does this have a design plan or a picture? After refactoring based on this PR: https://github.com/volcano-sh/volcano/pull/3874/files, could you add more details or description about how to score hypernodes in your pr?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explanation:
The RootHypernode property of a job is the hypernode that serves as the smallest root in the hypernode
tree.
A job has multiple tasks, each belonging to a hypernode. This RootHypernode is the topmost and lowest common ancestor among the hypernodes of all tasks within the job.

Goals:
The tier to which the rootHypernode of a job belongs should be as low as possible.

I will change the name 'RootHypernode' to 'LCAHyperNode' later.

@volcano-sh-bot volcano-sh-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 18, 2024
@ecosysbin ecosysbin changed the title Add network-topology-aware plugin and hyperNode score callback #3885 Add network-topology-aware plugin and hyperNode score callback Dec 18, 2024
@ecosysbin ecosysbin force-pushed the network-topology branch 6 times, most recently from d1da751 to 414e547 Compare December 25, 2024 02:40
@volcano-sh-bot volcano-sh-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 27, 2024
Signed-off-by: Monokaix <changxuzheng@huawei.com>
Signed-off-by: Monokaix <changxuzheng@huawei.com>
@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign monokaix
You can assign the PR to them by writing /assign @monokaix in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 27, 2024
@ecosysbin ecosysbin force-pushed the network-topology branch 2 times, most recently from 889afed to b9a291b Compare December 27, 2024 06:49
// - Tasks under a job should be scheduled to one hyperNode as much as possible.
func networkTopologyAwareScore(hyperNodeName string, job *api.JobInfo, hyperNodeTree []map[string][]string) float64 {
jobHyperNode := job.PodGroup.GetAnnotations()[api.TopologyAllocateLCAHyperNode]
// job fist first scheduler.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> //job is scheduled for the first time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// HyperNodeTree is the hypernode tree of all hypernodes in the cluster.
// currentJobLCAHyperNode is the hypernode of the job's LCAHyperNode.
var (
HyperNodeTree []map[string][]string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to maintain a separate global variable here? And this data structure does not seem to be called HyperNodeTree, it does not reflect hierarchical relationships.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l have move the hyperNodeTree to session, According to the discussion, this variable will be replaced by Gu Peng's HyperNodeTree later.

return taskCount
}

// FindOutRootHyperNode find out the root hypernode of the job when the hypernode join the job.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> //FindLCAHyperNode finds out the common ancestor of the current hypernode and the hypernode where the job is scheduled

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}

func TestFindLCAHyperNode(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the use cases specify currentJobLCAHyperNode?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

hyperNodesMap := make(map[string]sets.Set[string])
for i := 0; i < len(revertHyperNodeTree); i++ {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract a public function, has the same code in FindLCAHyperNode

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// PluginName indicates name of volcano scheduler plugin.
PluginName = "networktopologyaware"
BaseScore = 100
TaskBaseScore = 10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 10 points reasonable? need to discuss /cc @Monokaix

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To ensure that the scores for the task num are distributed between 0 and 10, so that the scores for the tier level will be higher than those for the task num.

@@ -127,8 +126,10 @@ func openSession(cache cache.Cache) *Session {

TotalResource: api.EmptyResource(),
TotalGuarantee: api.EmptyResource(),
podGroupStatus: map[api.JobID]scheduling.PodGroupStatus{},

PodGroupCache: api.PodGroupCache{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to rename as PodGroupOldState, which is used to record some old PodGroup data.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

job.PodGroup = pg
}

sc.Jobs[job.UID] = job
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to update here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1536,9 +1537,12 @@ func (sc *SchedulerCache) UpdateJobStatus(job *schedulingapi.JobInfo, updatePG b
if err != nil {
return nil, err
}
sc.Mutex.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is no need to add lock? session --> UpdateJobStatus is still in the same goroutine, and there is no other goroutine will read or write this map

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the discussion, locks need to be added for all operations on the cache.

@@ -245,6 +247,14 @@ func (alloc *Action) allocateResourceForTasksWithTopology(tasks *util.PriorityQu
break
}
for _, hyperNodeName := range ssn.HyperNodesListByTier[tier] {
if jobHyperNode == "" {
// job first scheduler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> //job is scheduled for the first time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@ecosysbin ecosysbin force-pushed the network-topology branch 9 times, most recently from 6c6b2b7 to bbccaee Compare December 28, 2024 04:27
Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>

combine PodGroupStatus and PodGroupAnnotations

Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>

rebase with pr:8704

Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>

rebase with pr:8704

Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>

update networktopologyaware score

Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants