[EKS] [Feature]: Allow Kube Scheduler Customization #1468

Open
Kausheel opened this issue Aug 6, 2021 · 31 comments
Labels
EKS (Amazon Elastic Kubernetes Service) · Proposed (Community submitted issue)

Comments

@Kausheel

Kausheel commented Aug 6, 2021

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
What do you want us to build?

It would be great if EKS allowed users to configure the Kube Scheduler parameters. This is a Control Plane component, so users don't have access to this by default. Exposing the Kube Scheduler configuration either via AWS APIs or via the KubeSchedulerConfiguration resource type would be a significant advantage for EKS users.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Use cases for this might include switching from equal Pod distribution to a bin-packing approach, which optimizes cost effectiveness. There are many other scheduler parameters that users might want to tweak themselves.

Are you currently working around this issue?
Implementing a custom Kube Scheduler. This is not ideal, since it adds the operational overhead of maintaining and updating the custom scheduler. It may also require using tools like OPA to inject a custom schedulerName field into the target Pods, which is yet another burden on the user.
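For illustration, this is what targeting a custom scheduler looks like on the Pod side; the schedulerName value below is a placeholder that must match whatever name the custom scheduler registers:

apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  # placeholder; must match the custom scheduler's profile name
  schedulerName: my-custom-scheduler
  containers:
    - name: app
      image: nginx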

Thanks!

@Kausheel Kausheel added the Proposed (Community submitted issue) label Aug 6, 2021
@mikestef9 mikestef9 added the EKS (Amazon Elastic Kubernetes Service) label Aug 6, 2021
@ashishapy

ashishapy commented Oct 21, 2021

Another use case is defining cluster-level default constraints for PodTopologySpread in the scheduler, as per the docs: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/#cluster-level-default-constraints

AWS should make this the default behaviour in EKS clusters:

apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
profiles:
  - pluginConfig:
      - name: PodTopologySpread
        args:
          defaultConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: ScheduleAnyway
          defaultingType: List
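Roughly speaking, a default constraint like the one above behaves as if every Pod that declares no constraints of its own carried the following spec fragment (a sketch; per the Kubernetes docs, default constraints must leave labelSelector empty, and the scheduler computes it from the Pod's workload membership):

spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
      # labelSelector is computed automatically from the Pod's
      # Service / ReplicationController / ReplicaSet / StatefulSet membership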

@stijndehaes

I would love to use this to enable bin packing, as explained here:
https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/

@sherifabdlnaby

Upvote.

Trying to achieve bin packing on EKS is hard without changing the scheduler's behavior to favor MostAllocated.

@logyball

Note that this feature is supported to some extent in Azure, and the MostAllocated scoring-strategy use case is covered in GKE via the autoscaling profile (that is an assumption on my part; GKE does not explicitly document what this setting does under the hood). Adding this ability would help EKS users gain parity in that sense.

@stijndehaes

I would be fine with having a setting like GKE's; that would solve my use case. It probably does not solve every use case out there, but I can understand if the AWS EKS team feels reluctant to allow changing the whole configuration.

@boblee0717

boblee0717 commented Mar 21, 2023

Imagine if this feature were opened up to all EKS users; that would save them a lot of time. Assume the custom kube-scheduler workaround takes one week per person: if 1,000 users need this, that is 7,000 person-days, roughly 19 years of one person's time.

@alex-berger

With Kubernetes v1.24, the DefaultPodTopologySpread feature graduated to GA (kubernetes/kubernetes#108278). Without scheduler customization, we have no way to use (or configure) it on EKS clusters.

@AnhQKatalon

Same here. We need this feature to enable resource bin packing for cost savings:
https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/

@Art3mK

Art3mK commented Jul 21, 2023

@AnhQKatalon, run the scheduler yourself with the settings you need and patch Pods to use it, with Kyverno for example :) Could be done in a couple of hours.
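For anyone going this route, a minimal Kyverno sketch might look like the following. The policy and scheduler names are placeholders, and you would likely want to scope the match more narrowly than all Pods:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: use-custom-scheduler   # placeholder name
spec:
  rules:
    - name: set-scheduler-name
      match:
        any:
          - resources:
              kinds:
                - Pod
      mutate:
        patchStrategicMerge:
          spec:
            # +( ) anchor only sets the field if the Pod doesn't already
            # specify one; the value must match the custom scheduler's name
            +(schedulerName): my-custom-scheduler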

@AnhQKatalon

AnhQKatalon commented Jul 21, 2023

@AnhQKatalon, run the scheduler yourself with the settings you need and patch Pods to use it, with Kyverno for example :) Could be done in a couple of hours.

Yeah, I am doing the workaround this way. Appreciate your help. But it would be great if EKS officially supported changing the scheduler configuration.

@babinos87

As others mentioned, this is required to set default pod topology spread constraints on the cluster, as per: https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#cluster-level-default-constraints. There would be other use cases, I am sure of it.

There are workarounds, of course, but this seems like a core thing to do in order to make the life of EKS users easier. I think this is a MUST.

@fernandesnikhil

fernandesnikhil commented Sep 16, 2023

This would be very helpful for the same reasons mentioned by others above:

  • set default topology spreads for all pods in one central place
  • tweak bin-packing by changing NodeResourcesFit

The suggestion of rolling your own scheduler is not appealing because EKS might have bolted on their own tweaks/modifications to get the scheduler to work right in AWS, and we'd lose all of that. And then there's maintaining it. I get that modifying the EKS-blessed configuration can lead to instability, but if I want to modify just a few settings I should be allowed to do that, with the understanding that it could break scheduling on my cluster. Upstream Kubernetes allows it, and it's useful.

@subhranil05

If it's not possible to add customization to kube-scheduler, can we consider a feature like GKE's, where node groups have the option to scale with a MostAllocated-like strategy, similar to GKE's optimize-utilization autoscaling profile?

@sherifabdlnaby

@subhranil05 This is not an alternative solution. Scaling node groups can only achieve bin-packing during scale-up. Kube Scheduler customization is necessary for in-place, proactive bin-packing.

@m00lecule

m00lecule commented Oct 3, 2023

Can somebody take a look and consider adding this issue to the kanban board? The demand is clearly still there in 2023, as the issue has been active for more than two years. Of course we can self-manage an additional kube-scheduler, but it's counterintuitive to subscribe to an AWS-managed EKS control plane and still run self-managed control plane components (an additional kube-scheduler).

CC @tabern @mikestef9

@paulchambers

This would be very useful for my EKS clusters. I want to be able to set sensible defaults without having to run my own scheduler.

@cskinfill

I would love to see this as well, to support bin packing at scheduling time.

@sherifabdlnaby

Do it for the environment, folks!

@Legion2

Legion2 commented Dec 4, 2023

I want to use bin packing with Karpenter for job workloads, so Karpenter can scale down empty nodes after a scale-up. Instead of spreading the pods across many nearly empty nodes, they should be packed onto a few full nodes, enabling Karpenter to remove a node once the last job running on it completes.

@onelapahead

onelapahead commented Dec 13, 2023

Assuming AWS may not prioritize this for a while at the current rate, I think an example deployment of a custom scheduler with MostAllocated enabled for bin packing would benefit everyone here (as suggested in #1468 (comment)), despite the burden it puts on 1) cluster admins, to maintain control plane infrastructure in step with EKS versions, and 2) Pod creators, to ensure the custom scheduler is used. Kyverno / Gatekeeper / custom webhooks could potentially help with the latter.

https://kubernetes.io/docs/tasks/extend-kubernetes/configure-multiple-schedulers/

That's a starting point, but if anyone has manifest samples that have been tested for the bin-packing configuration everyone wants, that would be appreciated. If I get to this at some point, I will share.

In some clusters, I've seen something like this provided:

apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /var/lib/kube-scheduler/kubeconfig
profiles:
- schedulerName: default-scheduler
  pluginConfig:
    - args:
        scoringStrategy:
          type: MostAllocated
      name: NodeResourcesFit
  plugins:
    score:
      disabled:
      - name: "NodeResourcesBalancedAllocation"
      enabled:
      - name: "NodeResourcesFit"
        weight: 5

@vinay92-ch

vinay92-ch commented Feb 12, 2024

We ran into this same issue and had to set up a custom scheduler to implement bin-packing. It's the same kube-scheduler image with a MostAllocated scoring policy, as suggested above. The blog has more details about how we dealt with overprovisioning, system workloads, and rollout to all pods. This section has the specific scheduler config.

We were able to achieve this in GCP by using the optimize-utilization setting in GKE, but on Azure AKS we still have to use this secondary scheduler with a custom scoring policy.

@MattLJoslin

How is this API still not supported? Is there any plan to support it soon? It's part of the standard Kubernetes surface, yet there's no way to use it on EKS. This really doesn't make EKS very usable in our case; all of the major packages assume the standard APIs are available.

@eliran-zada-zesty

Same as @MattLJoslin said... we really need it as well

@stevehipwell

I think being able to run the scheduler in MostAllocated mode would make the Karpenter use case even more compelling.

@stevehipwell

https://www.cncf.io/blog/2024/06/03/tackling-gpu-underutilization-in-kubernetes-runtimes/

@jukie

jukie commented Oct 1, 2024

Any updates on this?

@nikimanoledaki

+1 for being able to add a pluginConfig about PodTopologySpread as well as a MostAllocated scoring policy.

Do it for the environment, folks!

And this!

@woehrl01

woehrl01 commented Oct 2, 2024

Hint in the meantime: you can use the following AWS-managed image to provision the scheduler yourself, without needing to maintain your own image: https://gallery.ecr.aws/eks-distro/kubernetes/kube-scheduler
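A minimal sketch of running that image as a second scheduler (the names, image tag, and ConfigMap are assumptions; you also need a ServiceAccount with RBAC equivalent to the system:kube-scheduler ClusterRole, plus a ConfigMap holding a KubeSchedulerConfiguration like the one shown earlier):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-scheduler            # placeholder name
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-scheduler
  template:
    metadata:
      labels:
        app: custom-scheduler
    spec:
      serviceAccountName: custom-scheduler   # assumed to carry scheduler RBAC
      containers:
        - name: kube-scheduler
          # tag is a placeholder; pick one matching your cluster version
          image: public.ecr.aws/eks-distro/kubernetes/kube-scheduler:v1.28.0-eks-1-28-latest
          command:
            - kube-scheduler
            - --config=/etc/kubernetes/scheduler-config.yaml
          volumeMounts:
            - name: scheduler-config
              mountPath: /etc/kubernetes
      volumes:
        - name: scheduler-config
          configMap:
            # assumed to contain the KubeSchedulerConfiguration
            name: custom-scheduler-config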

@jukie

jukie commented Oct 2, 2024

@woehrl01 That's a viable workaround, but then users have to manage the scheduler component themselves and update every workload to target it in the pod spec. For a managed Kubernetes service, it would be ideal if these options were exposed as user-facing configuration instead.

@woehrl01

woehrl01 commented Oct 2, 2024

@jukie I'm not arguing that this wouldn't be a nice addition. I just wanted to mention a solution that won't require you to wait more than three years, and to share it with newcomers who may be having trouble self-maintaining this image, etc.

@jukie

jukie commented Oct 2, 2024

Totally agree, and thanks for sharing!
