multi goroutine deal taskUnschedulable #3921

Open: wants to merge 1 commit into base: master

lishangyuzi

I ran into a problem when scheduling large-scale jobs. When a job fails to be scheduled, the PodCondition of every pod in that job is updated. Each update requires communicating with the apiserver, so this step takes a long time. Could we consider using multiple goroutines to handle this part of the logic?

@volcano-sh-bot
Contributor

Welcome @lishangyuzi!

It looks like this is your first PR to volcano-sh/volcano.

Thank you, and welcome to Volcano. 😃

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign lowang-bh
You can assign the PR to them by writing /assign @lowang-bh in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot added the size/S label (denotes a PR that changes 10-29 lines, ignoring generated files) on Dec 24, 2024
@lishangyuzi
Author

/assign @lowang-bh

@lowang-bh
Member

Have you increased the QPS of the kubeclient in the Volcano scheduler?

@lishangyuzi
Author

lishangyuzi commented Dec 25, 2024

Have you increased the QPS of the kubeclient in the Volcano scheduler?

The default QPS of the kubeclient already meets my expectations. It takes approximately 200 seconds for a job with 5000 pods to complete this stage.

fs.Float32Var(&s.KubeClientOptions.QPS, "kube-api-qps", defaultQPS, "QPS to use while talking with kubernetes apiserver")
fs.IntVar(&s.KubeClientOptions.Burst, "kube-api-burst", defaultBurst, "Burst to use while talking with kubernetes apiserver")

defaultQPS = 2000.0
defaultBurst = 2000

The parameters related to my API server QPS are as follows:

--max-mutating-requests-inflight=4000
--max-requests-inflight=2000
--watch-cache-sizes=node#2000,pod#10000
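
The figures quoted above (5000 pods, roughly 200 seconds, `defaultQPS = 2000.0`) can be put side by side with a little arithmetic. The sketch below assumes the updates run one after another, which is what the PR description implies but which the thread does not state outright. The function names are hypothetical.

```go
package main

import "fmt"

// perUpdateLatencyMs returns the average time per pod update in milliseconds,
// assuming the updates are issued sequentially.
func perUpdateLatencyMs(pods int, totalSeconds float64) float64 {
	return totalSeconds * 1000 / float64(pods)
}

// rateLimitFloorSeconds returns the minimum duration that a client-side QPS
// limit alone would impose on `pods` requests.
func rateLimitFloorSeconds(pods int, qps float64) float64 {
	return float64(pods) / qps
}

func main() {
	fmt.Printf("%.0f ms per update\n", perUpdateLatencyMs(5000, 200))
	fmt.Printf("%.1f s rate-limit floor\n", rateLimitFloorSeconds(5000, 2000))
	// prints:
	// 40 ms per update
	// 2.5 s rate-limit floor
}
```

Under that assumption, the observed throughput is about 25 updates per second, far below the 2000 QPS the client allows, which is consistent with per-request latency rather than the rate limit dominating this stage.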
