Split higher level cell when allocated bad cells #27
When buddy allocation fails due to bad cells, try to split a higher-level cell to get cells at the current level.
Add UT?
Fix deletion errors.
Codecov Report
@@ Coverage Diff @@
## master #27 +/- ##
==========================================
+ Coverage 88.92% 89.06% +0.13%
==========================================
Files 8 8
Lines 2177 2231 +54
==========================================
+ Hits 1936 1987 +51
- Misses 190 192 +2
- Partials 51 52 +1
The variable "l" might be confusing. I suggest using "checkingLevel".
pkg/algorithm/cell_allocation.go
Outdated
// check whether it is safe to split a higher level cell to get current level cells.
func checkSplitSafety(freeList ChainCellList, freeCellNum map[CellLevel]int32, l CellLevel) bool {
- // check whether it is safe to split a higher level cell to get current level cells.
- func checkSplitSafety(freeList ChainCellList, freeCellNum map[CellLevel]int32, l CellLevel) bool {
+ // check whether it is safe to split a higher level cell to get a cell at the checking level.
+ func checkSplitSafety(freeList ChainCellList, freeCellNum map[CellLevel]int32, checkingLevel CellLevel) bool {
Changed to currentLevel, the same name as in buddyAlloc.
pkg/algorithm/cell_allocation.go
Outdated
}
}
// if there exists a higher level cell with splitable num > 0
for l++; l <= CellLevel(len(freeList)); l++ {
- for l++; l <= CellLevel(len(freeList)); l++ {
+ for l := CellLevel(checkingLevel) + 1; l <= CellLevel(len(freeList)); l++ {
// check safety
if splitableNum[i] < 0 {
return false
}
When will splitableNum[i] < 0 happen?
It seems safety is already broken in that case, and the whole scheduler should crash. #Closed
in a test case, will change to panic later
updated
added
pkg/algorithm/cell_allocation.go
Outdated
var splitableCell Cell
splitableNum := map[CellLevel]int32{}
for i := CellLevel(len(freeList)); i >= CellLevel(1); i-- {
// calculate splitable number
calculate splitable number
It seems we can stop the "calculate splitable number" loop early, at l+1. #Closed
check safety to panic
(Background: the panic only fails the current pod's scheduling; it does not exit the whole process.)
A safety problem at a lower level does not necessarily have to fail the current allocation. Do we need to panic here?
@zhypku any suggestion? #Closed
It seems the current behaviour is: even if safety is broken somewhere, the pod can still be scheduled as long as the breakage does not affect the current pod's scheduling.
If so, we can keep this behaviour. #Closed
early stop at current level + 1
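The background note in this thread says a panic fails only the current pod's scheduling. A minimal sketch of how that kind of isolation is commonly achieved in Go, assuming each scheduling attempt is wrapped in a deferred recover. `checkSafety` and `scheduleOnce` are hypothetical names used for illustration, not functions from this repository:

```go
package main

import "fmt"

// checkSafety is a hypothetical stand-in for a safety check that panics
// when the splittable number of a level drops below zero.
func checkSafety(splittableNum int32, level int) {
	if splittableNum < 0 {
		panic(fmt.Sprintf("VC Safety Broken: level %v cell is unsplittable", level))
	}
}

// scheduleOnce runs one scheduling attempt and converts a panic into an error,
// so a failed attempt does not take down the whole process.
func scheduleOnce(splittableNum int32, level int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("scheduling failed: %v", r)
		}
	}()
	checkSafety(splittableNum, level)
	return nil
}

func main() {
	fmt.Println(scheduleOnce(1, 2))  // no panic, nil error
	fmt.Println(scheduleOnce(-1, 2)) // panic recovered and returned as an error
}
```

With this shape, stopping the check early at the current level + 1 means a broken lower level never reaches the panic for this pod at all.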
pkg/algorithm/cell_allocation.go
Outdated
// when buddyAlloc allocates bad cells,
// check whether it is safe to split a higher level cell to get current level cells.
func checkSplitSafety(freeList ChainCellList, freeCellNum map[CellLevel]int32, l CellLevel) bool {
var splitableCell Cell
splitable
typo: splitable -> splittable #Closed
updated
pkg/algorithm/cell_allocation.go
Outdated
freeList, c.cell.GetLevel()), suggestedNodes, ignoreSuggestedNodes, bindings) {
return false
l := getLowestFreeCellLevel(freeList, c.cell.GetLevel())
getLowestFreeCellLevel(freeList, c.cell.GetLevel())
buddyAlloc may modify the freeList too.
Should we reuse the result of getLowestFreeCellLevel instead of calling it here again?
Such as:
```
l := getLowestFreeCellLevel(freeList, c.cell.GetLevel())
buddyAlloc(..., l, ...)
Infof(...l)
freeList[l] = CellList{}
```
#Closed
pkg/algorithm/cell_allocation.go
Outdated
freeList, c.cell.GetLevel()), suggestedNodes, ignoreSuggestedNodes, bindings) {
return false
l := getLowestFreeCellLevel(freeList, c.cell.GetLevel())
klog.Infof("Buddy allocation failed due to bad cells, removing level %v free list: %v", l, freeList[l])
Buddy allocation failed due to bad cells, removing level %v free list: %v
Buddy allocation failed due to bad cells for cell level %v with free list %v, skip the level and try to split higher level cells #Closed
pkg/algorithm/cell_allocation.go
Outdated
@@ -89,21 +90,59 @@ func getLowestFreeCellLevel(freeList ChainCellList, l CellLevel) CellLevel {
"even split to the highest level %v", l-1))
}

// when buddyAlloc allocates bad cells,
when buddyAlloc allocates bad cells,
after buddyAlloc failed to allocate cells, ... #Closed
updated
Add allocate function to split higher level cells.
Add test case.
pkg/algorithm/cell_allocation.go
Outdated
var splittableCell Cell
splittableNum := map[CellLevel]int32{}
for i := CellLevel(len(freeList)); i >= CellLevel(1); i-- {
// calculate splitable number
splitable -> splittable
updated
pkg/algorithm/cell_allocation.go
Outdated
@@ -78,6 +79,70 @@ func buddyAlloc(
return false
}

// after buddyAlloc failed to allocate cells,
// try to split a higher level cell to get current level cells.
try to split a higher level cell to get current level cells when it won't break safety guarantee?
used Fan's comment
pkg/algorithm/cell_allocation.go
Outdated
}
// check safety
if splittableNum[i] < 0 {
panic(fmt.Sprintf("VC Safety Broken: level %v cell is unsplittable", i))
maybe we should print cell address in the panic log as well, as we cannot see cell chain in this function
added
freeList[l] = freeList[l][cellNum:]
splittableNum[l] -= cellNum
for sl := l; sl > currentLevel; sl-- {
for _, sc := range splitList {
length of splitList is changed during the iteration. Will this impact the correctness of the iteration?
no, range uses a copy of the array, here's an explanation
Your example does not change the slice itself.
The current operation is dangerous, please use an index instead. #Closed
updated
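The disagreement in this thread comes down to what `range` copies. A small sketch of the relevant Go semantics, using plain `int` slices instead of the scheduler's cell types (all function names here are illustrative): the range expression is evaluated once, so the iteration count is fixed, but for a slice only the slice header is copied, so writes to the shared backing array are still observed.

```go
package main

import "fmt"

// appendWhileRanging reassigns the slice variable inside a range loop.
// The range expression is evaluated once, so the loop still runs for the
// original length only.
func appendWhileRanging() (iterations, finalLen int) {
	s := []int{1, 2, 3}
	for range s {
		s = append(s, 0)
		iterations++
	}
	return iterations, len(s)
}

// writeWhileRanging overwrites an element that has not been visited yet.
// Ranging over a slice copies only the slice header, not the elements,
// so the loop observes the new value.
func writeWhileRanging() int {
	s := []int{1, 2, 3}
	sum := 0
	for i, v := range s {
		if i == 0 {
			s[2] = 30
		}
		sum += v
	}
	return sum
}

// drainByIndex processes a work queue with an explicit index, so that growth
// of the queue during the loop is intentional and visible.
func drainByIndex(queue []int) int {
	visited := 0
	for i := 0; i < len(queue); i++ {
		if queue[i] > 1 {
			queue = append(queue, queue[i]-1) // enqueue follow-up work
		}
		visited++
	}
	return visited
}

func main() {
	fmt.Println(appendWhileRanging())     // 3 6
	fmt.Println(writeWhileRanging())      // 33
	fmt.Println(drainByIndex([]int{3}))   // 3
}
```

The index-based loop makes the dependence on the changing length explicit, which is the reason it is the safer form when the slice is modified inside the loop.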
pkg/algorithm/hived_algorithm.go
Outdated
@@ -913,10 +914,15 @@ func (h *HivedAlgorithm) scheduleGuaranteedAffinityGroup(
common.SortInt32(gpuNums)
lazyPreemptedGroups := h.tryLazyPreempt(virtualPlacement, gpuNums, sr.affinityGroupName)
preassignedCells, nonPreassignedCells := virtualPlacement.toBindingPaths(gpuNums, bindings)
freeCellNum := map[CellLevel]int32{}
freeCellNumCopy?
and maybe comment why we need a copy (we will make temporary changes to the data structure in the below function call)
updated
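The reason for the copy can be illustrated with a short sketch. It assumes the counter is a plain map keyed by level; `copyCounts` is a hypothetical helper and `int` stands in for the real `CellLevel` type:

```go
package main

import "fmt"

// copyCounts returns an independent copy of a per-level free-cell counter.
// int is used for the level here; the real code uses its own CellLevel type.
func copyCounts(src map[int]int32) map[int]int32 {
	dst := make(map[int]int32, len(src))
	for level, n := range src {
		dst[level] = n
	}
	return dst
}

func main() {
	freeCellNum := map[int]int32{1: 4, 2: 2}
	freeCellNumCopy := copyCounts(freeCellNum)
	freeCellNumCopy[1]-- // temporary change during a trial allocation
	fmt.Println(freeCellNum[1], freeCellNumCopy[1]) // 4 3
}
```

Because Go maps are reference types, assigning the map to a new variable would not be enough; the entries have to be copied for the temporary changes to stay local.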
pkg/algorithm/cell_allocation.go
Outdated
@@ -78,6 +79,70 @@ func buddyAlloc(
return false
}

// after buddyAlloc failed to allocate cells,
// try to split a higher level cell to get current level cells.
func splitAlloc(
maybe we need a clearer and easier-to-understand name for this function...
I can't come up with a good name, any suggestion?
@hzhua @yqwang-ms any suggestion? 😂
How about "badCellAvoidanceAlloc"?
But buddyAlloc is also aware of bad nodes. The only difference is that splitAlloc can try more. How about relaxedBuddyAlloc? #Closed
From the perspective of the algorithm's rules, it is indeed a relaxed version of buddyAlloc, as it allows splitting a free cell at a level higher than the lowest one. Maybe safeRelaxedBuddyAlloc?
@zhypku, sounds good to me.
updated to safeRelaxedBuddyAlloc
Is it possible that a safe split will trigger the merge operation of buddyAlloc? For example, suppose all the L1 cells of an L2 cell (named A) are bad. safeRelaxedBuddyAlloc will split A, and then go ahead and split another L2 cell (B). Since all the L1 cells of A are free (albeit bad), buddyAlloc will merge them back into an L2 cell (i.e., A).
My question is, when will buddyAlloc do the merging?
The free list used for cell allocation is not persistent. It is just a copy of the ground-truth free list. So even if safeRelaxedBuddyAlloc splits a cell, the split does not take effect immediately; it takes effect only when we confirm the allocation. In your example, what we actually allocate is an L1 cell from B. So we only split B in the ground-truth free list, and won't split A.
} else {
compareSchedulingResult(t, pod, psr)
}
}
how about adding a case where it could not split but it does?
added
pkg/algorithm/cell_allocation.go
Outdated
}
if cellNum > 0 {
splitList := freeList[l][:cellNum]
freeList[l] = freeList[l][cellNum:]
freeList[l] = freeList[l][cellNum:]
When splitList is expanded, freeList may be overwritten. #Closed
updated
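The overwrite described in this thread is ordinary slice aliasing. A minimal sketch with `int` slices (the function names are illustrative, not from the PR): a subslice taken from the head keeps the capacity of the original, so `append` on it writes into the tail that the remaining free list still points at.

```go
package main

import "fmt"

// aliasedSplit takes the head of free as a plain subslice. The subslice keeps
// the capacity of the original, so append writes into the tail of free.
func aliasedSplit() (rest []int) {
	free := []int{1, 2, 3, 4}
	split := free[:2]
	rest = free[2:]
	_ = append(split, 99) // overwrites free[2], which rest still points at
	return rest
}

// copiedSplit copies the head into a fresh slice first, so later appends
// cannot touch the remaining free list.
func copiedSplit() (rest []int) {
	free := []int{1, 2, 3, 4}
	split := append([]int{}, free[:2]...)
	rest = free[2:]
	_ = append(split, 99) // writes only into split's own backing array
	return rest
}

func main() {
	fmt.Println(aliasedSplit()) // [99 4]
	fmt.Println(copiedSplit())  // [3 4]
}
```

An alternative to copying is the full slice expression `free[:2:2]`, which caps the capacity so that the first `append` is forced to allocate a new backing array.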
pkg/algorithm/cell_allocation.go
Outdated
splittableNum[l] -= cellNum
for sl := l; sl > currentLevel; sl-- {
for _, sc := range splitList {
splitList = append(splitList[1:], sc.GetChildren()...)
splitList[1:]
Be aware of the subslice memory leak. #Closed
updated
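For context on the leak mentioned here: reslicing with `s[1:]` drops the first element from the slice's view, but the backing array still holds the pointer, so the element stays reachable. A small sketch of the usual remedy, with a hypothetical `cell` type and `popFront` helper:

```go
package main

import "fmt"

// cell is a hypothetical stand-in for the scheduler's cell type.
type cell struct{ id int }

// popFront removes the first element of a slice of pointers.
// Clearing the slot before reslicing drops the backing array's reference to
// the removed element, so the garbage collector can reclaim it.
func popFront(q []*cell) (*cell, []*cell) {
	head := q[0]
	q[0] = nil
	return head, q[1:]
}

func main() {
	backing := []*cell{{id: 1}, {id: 2}}
	head, rest := popFront(backing)
	fmt.Println(head.id, len(rest), backing[0] == nil) // 1 1 true
}
```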
Resolve comments.
Add free list in panic log when safety is broken.
copy(splitList[:], splitList[1:])
splitList[len(splitList)-1] = nil
splitList = splitList[:len(splitList)-1]
}
Too complex, why not use a new slice to store children? Then do 1 swap. #Closed
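One way to read this suggestion, sketched with a simplified, hypothetical `Cell` type (the real code uses its own `Cell` interface and `GetChildren()`): collect the children of every cell at one level into a fresh slice, then swap it in as the list for the next level.

```go
package main

import "fmt"

// Cell is a hypothetical, simplified cell: only a level and its children.
type Cell struct {
	Level    int
	Children []*Cell
}

// splitDown expands cells level by level, from level `from` down to level `to`.
func splitDown(cells []*Cell, from, to int) []*Cell {
	current := cells
	for l := from; l > to; l-- {
		next := make([]*Cell, 0, len(current))
		for _, c := range current {
			next = append(next, c.Children...)
		}
		current = next // the single swap per level
	}
	return current
}

func main() {
	leaf := func() *Cell { return &Cell{Level: 1} }
	mid := func() *Cell { return &Cell{Level: 2, Children: []*Cell{leaf(), leaf()}} }
	root := &Cell{Level: 3, Children: []*Cell{mid(), mid()}}
	fmt.Println(len(splitDown([]*Cell{root}, 3, 1))) // 4
}
```

Since the loop ranges over `current` and appends only to `next`, no slice is modified while it is being iterated, and the old list becomes garbage as soon as the swap happens.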
cellNum = splittableNum[l]
}
if cellNum > 0 {
splitList := CellList{}
splitList
splittableList? #Closed
The list holds the cells that are actually split, not just the splittable ones...
pkg/algorithm/cell_allocation.go
Outdated
for i := len(splitList); i > 0; i-- {
splitList = append(splitList, splitList[0].GetChildren()...)
copy(splitList[:], splitList[1:])
splitList[len(splitList)-1] = nil
Why copy splitList[1:]? This is equivalent:
splitList[0] = nil
splitList = splitList[1:]
@@ -589,6 +614,7 @@ func TestHivedAlgorithm(t *testing.T) {
testSuggestedNodes(t, configFilePath)
testStatefulPreemption(t, configFilePath)
testBadNodes(t, configFilePath)
testSplitAlloc(t, configFilePath)
change name accordingly?
updated
Add test case when unable to split due to safety guarantee.
@zhypku will below leak?:
|
Ah, seems it will. Can @abuccts help fix it? Just insert the code before the return statement:
|
Fix memory leak in `removePickedGpus`.
Early stop safety check at current level.