
proposal: add limit-aware scheduling #1495

Closed

Conversation

ZiMengSheng
Contributor

Ⅰ. Describe what this PR does

This PR proposes a resource limit-aware plugin to mitigate the risk of overcommitment.

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

V. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

@koordinator-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign hormes after the PR has been reviewed.
You can assign the PR to them by writing /assign @hormes in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@eahydra eahydra changed the title docs: add limit aware scheduling proposal: add limit aware scheduling Jul 25, 2023
@ZiMengSheng
Contributor Author

/cc @eahydra

@koordinator-bot koordinator-bot bot requested a review from eahydra July 25, 2023 09:46
@codecov

codecov bot commented Jul 25, 2023

Codecov Report

Patch coverage has no change; project coverage changed by +0.67% 🎉

Comparison is base (667b3ff) 64.09% compared to head (516e2bf) 64.76%.
Report is 37 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1495      +/-   ##
==========================================
+ Coverage   64.09%   64.76%   +0.67%     
==========================================
  Files         341      347       +6     
  Lines       34943    35376     +433     
==========================================
+ Hits        22396    22913     +517     
+ Misses      10899    10780     -119     
- Partials     1648     1683      +35     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 64.76% <ø> (+0.67%) ⬆️ |

Flags with carried forward coverage won't be shown.

see 102 files with indirect coverage changes

☔ View full report in Codecov by Sentry.

@ZiMengSheng ZiMengSheng force-pushed the limitawareproposal branch 3 times, most recently from a4b552b to 123276d Compare July 25, 2023 09:49
@eahydra eahydra changed the title proposal: add limit aware scheduling proposal: add limit-aware scheduling Aug 8, 2023
@ZiMengSheng ZiMengSheng force-pushed the limitawareproposal branch 2 times, most recently from 0c01996 to 23f65ab Compare August 8, 2023 03:30
@eahydra
Member

eahydra commented Aug 8, 2023

There are several issues to consider:

  1. Whether DaemonSets should be handled specially during the Filter stage. A DaemonSet may set a Limit without setting a Request, and DaemonSets are rather special: they generally run basic components, and it is common to extend a node's functionality through DaemonSets.
  2. In the scoring stage, BestEffort Pods also need to be considered; a default limit can be assumed for these Pods (see the sketch below).
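
As a rough sketch of the second point (the helper name and default values below are assumptions for illustration, not the plugin's actual API), scoring could fall back to an assumed default limit for pods that declare no limits at all:

```go
// Illustrative sketch only: fall back to an assumed default limit when a pod
// declares no container limits (i.e. BestEffort pods).
package limitaware

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// defaultBestEffortLimit is a hypothetical, configurable fallback limit.
var defaultBestEffortLimit = corev1.ResourceList{
	corev1.ResourceCPU:    resource.MustParse("1"),
	corev1.ResourceMemory: resource.MustParse("2Gi"),
}

// effectiveLimits sums the declared container limits of a pod; if the pod
// declares none, the assumed default is used for scoring instead.
func effectiveLimits(pod *corev1.Pod) corev1.ResourceList {
	total := corev1.ResourceList{}
	for _, c := range pod.Spec.Containers {
		for name, qty := range c.Resources.Limits {
			sum := total[name]
			sum.Add(qty)
			total[name] = sum
		}
	}
	if len(total) == 0 {
		return defaultBestEffortLimit
	}
	return total
}
```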

@ZiMengSheng ZiMengSheng force-pushed the limitawareproposal branch 13 times, most recently from 0564dfc to 1846c64 Compare August 9, 2023 12:06
Signed-off-by: wangjianyu.wjy <wangjianyu.wjy@alibaba-inc.com>

To schedule a new pod pod5 (request: 1, limit: 4): although node1's total request/allocatable ratio is smaller than node2's, our `LimitAware` plugin considers node1's resource limits already oversubscribed and therefore places pod5 on node2 instead.
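
As a minimal illustration of the comparison Story 1 describes (the node figures below are made up, not taken from the proposal), the decision can be sketched as:

```go
// Made-up numbers illustrating Story 1: node1 wins on the request ratio but
// is already oversubscribed on limits, so a limit-aware score prefers node2.
package main

import "fmt"

type nodeState struct {
	name                       string
	allocatable                float64 // e.g. CPU cores
	totalRequested, totalLimit float64
}

func main() {
	pod5 := struct{ request, limit float64 }{request: 1, limit: 4}
	nodes := []nodeState{
		{name: "node1", allocatable: 10, totalRequested: 2, totalLimit: 12},
		{name: "node2", allocatable: 10, totalRequested: 5, totalLimit: 4},
	}
	for _, n := range nodes {
		reqRatio := (n.totalRequested + pod5.request) / n.allocatable
		limitRatio := (n.totalLimit + pod5.limit) / n.allocatable
		fmt.Printf("%s: request ratio %.2f, limit ratio %.2f\n", n.name, reqRatio, limitRatio)
	}
	// Prints:
	// node1: request ratio 0.30, limit ratio 1.60  (limits oversubscribed)
	// node2: request ratio 0.60, limit ratio 0.80  (preferred for pod5)
}
```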

#### Story 2: Use limit aware filter plugin to filter out nodes with high risk of resource over-subscription

@Huang-Wei Huang-Wei Aug 18, 2023


I'd say we should be extra cautious on introducing a Filter plugin. Several reasons:

  • Users, in my experience, may prefer that their pods get scheduled (and possibly get OOMed / CPU throttled afterwards) rather than not being schedulable in the first place.
  • It'd introduce new semantics on top of KRM, which has a large blast radius: CA has to be aware of this, and so do other integrators.

Member


@Huang-Wei Thanks for reviewing this PR.
Indeed, as you said, introducing a new Filter would cause many problems.
Specific to the limit-aware scheduling scenario: considering a node's runtime stability, it is risky if the node's total limit is too high. IMO there are two ways to control this risk:

  1. Lower the score of such nodes through the scoring algorithm;
  2. Introduce a Filter to strictly prevent the total Limit from exceeding a threshold.

The first method is a must. The second can be offered to users as an option, so that they can enable it when necessary; a rough sketch of both is below.
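
A rough sketch of what both options could look like (function names, the scoring curve, and the threshold semantics are assumptions for illustration, not the actual plugin design):

```go
// Illustrative sketch of the two risk controls discussed above.
package limitaware

const maxNodeScore = 100

// scoreByLimit (option 1, always on): lower a node's score as its total
// limit approaches and exceeds its allocatable capacity.
func scoreByLimit(totalLimit, allocatable float64) int64 {
	if allocatable <= 0 {
		return 0
	}
	ratio := totalLimit / allocatable
	if ratio >= 2 { // assumed cap: heavily oversubscribed nodes score 0
		return 0
	}
	return int64((2 - ratio) / 2 * maxNodeScore)
}

// filterByLimit (option 2, disabled unless the user sets a threshold):
// reject nodes whose total limit would exceed allocatable * threshold%.
func filterByLimit(totalLimit, allocatable, thresholdPercent float64) bool {
	if thresholdPercent <= 0 { // threshold unset: filter disabled
		return true
	}
	return totalLimit <= allocatable*thresholdPercent/100
}
```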

WDYT? @Huang-Wei


The 2nd option is not necessarily a big no; it depends on your use-case. It's just that when offering it, it breaks KRM (the Kubernetes resource model), so you have to document the restrictions clearly, e.g., that vanilla CA is unable to handle this kind of filter error (unless it gets re-compiled), etc.


stale bot commented Nov 28, 2023

This issue has been automatically marked as stale because it has not had recent activity.
This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the issue is closed

You can:
  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Close this issue or PR with /close

Thank you for your contributions.


stale bot commented Dec 28, 2023

This issue has been automatically closed because it has not had recent activity.
This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the issue is closed

You can:
  • Reopen this PR with /reopen

Thank you for your contributions.

@stale stale bot closed this Dec 28, 2023