scheduler: use pending amp in hot region scheduler #3926
Codecov Report
@@ Coverage Diff @@
## master #3926 +/- ##
==========================================
+ Coverage 74.64% 74.67% +0.03%
==========================================
Files 247 247
Lines 25372 25408 +36
==========================================
+ Hits 18939 18974 +35
+ Misses 4759 4757 -2
- Partials 1674 1677 +3
basically lgtm
Signed-off-by: lhy1024 <admin@liudos.us>
@rleungx PTAL
/merge
This pull request has been accepted and is ready to merge. Commit hash: 0356cce
What problem does this PR solve?
Previously, when two stores differed greatly, the hot region scheduler tended to schedule redundantly: as the gap between the two stores narrowed, the scheduling speed might not come down in time. This PR limits the speed with a pending amplification factor (pending amp), so that the closer the two stores are, the fewer regions may be scheduled at the same time.
Before
After
What is changed and how it works?
isTolerance compares the source store and the target store by checking the difference between them against pendingAmpFactor * pendingPeer. As a result, hot region scheduling slows down, and may even run serially, when the pending influences of the two stores are close.
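To make the check concrete, here is a minimal sketch of the idea, not the code from this PR. The `storeLoad` type, its fields, the value of `pendingAmpFactor`, and the exact inequality are all assumptions made for illustration; only the names `isTolerance` and `pendingAmpFactor` come from the description above.

```go
package main

// pendingAmpFactor is an assumed tuning constant for this sketch;
// the real value used by the scheduler is not stated in the description.
const pendingAmpFactor = 2.0

// storeLoad is a hypothetical, simplified view of one store.
type storeLoad struct {
	rate    float64 // current hot load on the store
	pending float64 // load carried by operators that are still in flight
}

// isTolerance reports whether one more hot region may be moved from src to dst.
// The move is allowed only while the load gap between the two stores is larger
// than the amplified pending influence, so a small gap tolerates little
// in-flight scheduling.
func isTolerance(src, dst storeLoad) bool {
	diff := src.rate - dst.rate
	if diff <= 0 {
		// The source is not hotter than the target: nothing to balance.
		return false
	}
	return diff > pendingAmpFactor*(src.pending+dst.pending)
}
```

Under these assumptions, two stores with a wide gap can have several moves in flight at once, while two stores that are nearly balanced are refused further moves until the pending ones finish, which is the serialized behavior described above.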
Check List
Tests
Release note