Skip to content
This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Split higher level cell when allocated bad cells #27

Merged
merged 11 commits into from
Jul 15, 2020

Conversation

abuccts
Copy link
Member

@abuccts abuccts commented Jul 8, 2020

When buddy allocation failed due to bad cells, try to split a higher level cell to get current level cells.

When buddy allocation failed due to bad cells,
try to split a higher level cell to get current level cells.
@abuccts abuccts requested review from zhypku and yqwang-ms July 8, 2020 07:37
@fanyangCS fanyangCS mentioned this pull request Jul 8, 2020
4 tasks
pkg/algorithm/cell.go Outdated Show resolved Hide resolved
Copy link
Contributor

@zhypku zhypku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add UT?

@fanyangCS fanyangCS requested a review from hzhua July 8, 2020 08:37
Fix deletion errors.
@codecov-commenter
Copy link

codecov-commenter commented Jul 8, 2020

Codecov Report

Merging #27 into master will increase coverage by 0.13%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #27      +/-   ##
==========================================
+ Coverage   88.92%   89.06%   +0.13%     
==========================================
  Files           8        8              
  Lines        2177     2231      +54     
==========================================
+ Hits         1936     1987      +51     
- Misses        190      192       +2     
- Partials       51       52       +1     
Flag Coverage Δ
#unittests 89.06% <ø> (+0.13%) ⬆️
Impacted Files Coverage Δ
...ft/hivedscheduler/pkg/algorithm/hived_algorithm.go 85.25% <0.00%> (+0.06%) ⬆️
...cheduler/pkg/algorithm/topology_aware_scheduler.go 91.51% <0.00%> (+0.07%) ⬆️
...ft/hivedscheduler/pkg/algorithm/cell_allocation.go 93.27% <0.00%> (+0.11%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4372c8c...fa1912d. Read the comment docs.

Copy link
Contributor

@hzhua hzhua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable "l" might be confusing. I suggest to use "checkingLevel".

Comment on lines 94 to 95
// check whether it is safe to split a higher level cell to get current level cells.
func checkSplitSafety(freeList ChainCellList, freeCellNum map[CellLevel]int32, l CellLevel) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// check whether it is safe to split a higher level cell to get current level cells.
func checkSplitSafety(freeList ChainCellList, freeCellNum map[CellLevel]int32, l CellLevel) bool {
// check whether it is safe to split a higher level cell to get a cell at the checking level.
func checkSplitSafety(freeList ChainCellList, freeCellNum map[CellLevel]int32, checkingLevel CellLevel) bool {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to currentLevel as the same in buddyAlloc

}
}
// if there exists a higher level cell with splitable num > 0
for l++; l <= CellLevel(len(freeList)); l++ {
Copy link
Contributor

@hzhua hzhua Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for l++; l <= CellLevel(len(freeList)); l++ {
for l := CellLevel(checkingLevel)+1 ; l <= CellLevel(len(freeList)); l++ {

// check safety
if splitableNum[i] < 0 {
return false
}
Copy link
Member

@yqwang-ms yqwang-ms Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will splitableNum[i] < 0?
Seems we already safety break, and the whole scheduler should crash. #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in a test case, will change to panic later

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

print more debug info when VC Safety Broken


In reply to: 451561941 [](ancestors = 451561941)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

var splitableCell Cell
splitableNum := map[CellLevel]int32{}
for i := CellLevel(len(freeList)); i >= CellLevel(1); i-- {
// calculate splitable number
Copy link
Member

@yqwang-ms yqwang-ms Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calculate splitable number [](start = 5, length = 26)

Seems we can early stop the "calculate splitable number" to l+1 #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check safety to panic

Copy link
Member

@yqwang-ms yqwang-ms Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Background: The panic will only fail current pod schedule, instead of exit the whole process)
Lower level safety problems seems does not necessary to fail current allocation, do we need to panic here?
@zhypku any suggestion? #Closed

Copy link
Member

@yqwang-ms yqwang-ms Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems current behaiour is: even if there are safety broken, but if the broken does not impact current pod scheduling, the pod can still be scheduled.
If so, we can insist this behaiour. #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

early stop at current level + 1

// when buddyAlloc allocates bad cells,
// check whether it is safe to split a higher level cell to get current level cells.
func checkSplitSafety(freeList ChainCellList, freeCellNum map[CellLevel]int32, l CellLevel) bool {
var splitableCell Cell
Copy link
Member

@yqwang-ms yqwang-ms Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitable [](start = 5, length = 9)

typo: splitable -> splittable #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

freeList, c.cell.GetLevel()), suggestedNodes, ignoreSuggestedNodes, bindings) {
return false
l := getLowestFreeCellLevel(freeList, c.cell.GetLevel())
Copy link
Member

@yqwang-ms yqwang-ms Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getLowestFreeCellLevel(freeList, c.cell.GetLevel()) [](start = 8, length = 51)

As buddyAlloc may modify the freeList too.
Should we reuse the getLowestFreeCellLevel instead of calling here again?
Such as:

l := getLowestFreeCellLevel(freeList, c.cell.GetLevel())
buddyAlloc(..., l, ...)
Infof(...l)
freeList[l] = CellList{}
``` #Closed

freeList, c.cell.GetLevel()), suggestedNodes, ignoreSuggestedNodes, bindings) {
return false
l := getLowestFreeCellLevel(freeList, c.cell.GetLevel())
klog.Infof("Buddy allocation failed due to bad cells, removing level %v free list: %v", l, freeList[l])
Copy link
Member

@yqwang-ms yqwang-ms Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buddy allocation failed due to bad cells, removing level %v free list: %v [](start = 15, length = 73)

Buddy allocation failed due to bad cells for cell level %v with free list %v, skip the level and try to split higher level cells #Closed

@@ -89,21 +90,59 @@ func getLowestFreeCellLevel(freeList ChainCellList, l CellLevel) CellLevel {
"even split to the highest level %v", l-1))
}

// when buddyAlloc allocates bad cells,
Copy link
Member

@yqwang-ms yqwang-ms Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when buddyAlloc allocates bad cells, [](start = 3, length = 36)

after buddyAlloc failed to allocate cells, ... #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

abuccts added 3 commits July 9, 2020 17:03
Add allocate function to split higher level cells.
Fix test.
Add test case.
var splittableCell Cell
splittableNum := map[CellLevel]int32{}
for i := CellLevel(len(freeList)); i >= CellLevel(1); i-- {
// calculate splitable number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitable -> splittable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

pkg/algorithm/cell_allocation.go Show resolved Hide resolved
@@ -78,6 +79,70 @@ func buddyAlloc(
return false
}

// after buddyAlloc failed to allocate cells,
// try to split a higher level cell to get current level cells.
Copy link
Contributor

@zhypku zhypku Jul 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to split a higher level cell to get current level cells when it won't break safety guarantee?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used Fan's comment

}
// check safety
if splittableNum[i] < 0 {
panic(fmt.Sprintf("VC Safety Broken: level %v cell is unsplittable", i))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should print cell address in the panic log as well, as we cannot see cell chain in this function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

freeList[l] = freeList[l][cellNum:]
splittableNum[l] -= cellNum
for sl := l; sl > currentLevel; sl-- {
for _, sc := range splitList {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length of splitList is changed during the iteration. Will this impact the correctness of the iteration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, range uses copy of the array, here's an explanation

Copy link
Member

@yqwang-ms yqwang-ms Jul 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your example will not change the slice itself.
current operation is dangerous, pls use index instead #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -913,10 +914,15 @@ func (h *HivedAlgorithm) scheduleGuaranteedAffinityGroup(
common.SortInt32(gpuNums)
lazyPreemptedGroups := h.tryLazyPreempt(virtualPlacement, gpuNums, sr.affinityGroupName)
preassignedCells, nonPreassignedCells := virtualPlacement.toBindingPaths(gpuNums, bindings)
freeCellNum := map[CellLevel]int32{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

freeCellNumCopy?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and maybe comment why we need a copy (we will make temporary changes to the data structure in the below function call)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -78,6 +79,70 @@ func buddyAlloc(
return false
}

// after buddyAlloc failed to allocate cells,
// try to split a higher level cell to get current level cells.
func splitAlloc(
Copy link
Contributor

@zhypku zhypku Jul 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we need a clearer and easier-to-understand name for this function...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't come up with a good name, any suggestion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hzhua @yqwang-ms any suggestion? 😂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "badCellAvoidanceAlloc"?

Copy link
Member

@yqwang-ms yqwang-ms Jul 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we also aware bad nodes in buddyAlloc. The only difference is that splitAlloc can try more. How about relaxedBuddyAlloc? #Closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the perspective of the rule of the algorithm, it indeed is a relaxed version of BuddyAlloc, as it allows to split a free cell higher than the lowest level. Maybe safeRelaxedBuddyAlloc?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhypku, sounds good to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to safeRelaxedBuddyAlloc

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that a safe split will trigger the merge operation of buddyAlloc? e.g., suppose a L2 cell's L1 cells are all bad cells. the safeRelaxedBuddyAlloc will split the L2 cell (named A), and go ahead to split another L2 cell (B). Since all the L1 cells of A are free cells (albeit bad), the BuddyAlloc will merge them into a L2 cell (i.e., A).
My question is, when will buddyAlloc do the merging?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The free list for cell allocation is not persistent. It is just a copy of the ground truth free list. So even safeRelaxedBuddyAlloc splits a cell, it is not effective immediately. It is effective when we confirm this allocation. In your example, what we truly allocate is an L1 cell from B. So we only split B in the ground truth free list, and won't split A.

} else {
compareSchedulingResult(t, pod, psr)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about adding a case where it could not split but it does?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

}
if cellNum > 0 {
splitList := freeList[l][:cellNum]
freeList[l] = freeList[l][cellNum:]
Copy link
Member

@yqwang-ms yqwang-ms Jul 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

freeList[l] = freeList[l][cellNum:] [](start = 3, length = 35)

when expand splitList, freeList may be overwritten #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

splittableNum[l] -= cellNum
for sl := l; sl > currentLevel; sl-- {
for _, sc := range splitList {
splitList = append(splitList[1:], sc.GetChildren()...)
Copy link
Member

@yqwang-ms yqwang-ms Jul 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitList[1:] [](start = 24, length = 13)

Be aware subslice mem leak #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

abuccts added 2 commits July 13, 2020 16:59
Resolve comments.
Add free list in panic log when safety is broken.
copy(splitList[:], splitList[1:])
splitList[len(splitList)-1] = nil
splitList = splitList[:len(splitList)-1]
}
Copy link
Member

@yqwang-ms yqwang-ms Jul 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too complex, why not use a new slice to store children? Then do 1 swap. #Closed

cellNum = splittableNum[l]
}
if cellNum > 0 {
splitList := CellList{}
Copy link
Member

@yqwang-ms yqwang-ms Jul 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitList [](start = 3, length = 9)

splittableList? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the list is used to really split, not splittable only...

for i := len(splitList); i > 0; i-- {
splitList = append(splitList, splitList[0].GetChildren()...)
copy(splitList[:], splitList[1:])
splitList[len(splitList)-1] = nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why copy splitList[1:]? This is equivalent:
splitList[0] = nil
splitList = splitList[1:]

@@ -589,6 +614,7 @@ func TestHivedAlgorithm(t *testing.T) {
testSuggestedNodes(t, configFilePath)
testStatefulPreemption(t, configFilePath)
testBadNodes(t, configFilePath)
testSplitAlloc(t, configFilePath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change name accordingly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

abuccts added 2 commits July 13, 2020 19:03
Add test case when unable to split due to safety guarantee.
Update.
@yqwang-ms
Copy link
Member

yqwang-ms commented Jul 13, 2020

@zhypku will below leak?:

func removePickedGpus(gpus CellList, indices []int32) CellList {
	for i, index := range indices {
		offset := int32(i)
		if i < len(indices)-1 {
			nextIndex := indices[i+1]
			copy(gpus[index-offset:nextIndex-offset-1], gpus[index+1:nextIndex])
		} else {
			copy(gpus[index-offset:], gpus[index+1:])
		}
	}
	return gpus[:len(gpus)-len(indices)]
}
``` #Closed

@zhypku
Copy link
Contributor

zhypku commented Jul 13, 2020

@zhypku will below leak?:

func removePickedGpus(gpus CellList, indices []int32) CellList {
	for i, index := range indices {
		offset := int32(i)
		if i < len(indices)-1 {
			nextIndex := indices[i+1]
			copy(gpus[index-offset:nextIndex-offset-1], gpus[index+1:nextIndex])
		} else {
			copy(gpus[index-offset:], gpus[index+1:])
		}
	}
	return gpus[:len(gpus)-len(indices)]
}

Ah, seems it will. Can @abuccts help fix it? Just insert the code before the return statement:

for i := len(gpus) - len(indices); i < len(gpus); i++ {
	gpus[i] = nil
}

abuccts added 2 commits July 14, 2020 10:33
Fix memory leak in `removePickedGpus`.
Early stop safety check at current level.
Copy link
Member

@yqwang-ms yqwang-ms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Copy link
Member

@yqwang-ms yqwang-ms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@abuccts abuccts merged commit 76ed604 into master Jul 15, 2020
@abuccts abuccts deleted the xiongyf/handle-bad-cell branch July 15, 2020 05:05
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants