[EKS/Kubernetes] [request]: Cluster autoscaler support for Multiple Instance Type ASGs/Spot fleets #144

Closed
gjtempleton opened this issue Jan 31, 2019 · 7 comments
Labels
EKS Amazon Elastic Kubernetes Service Proposed Community submitted issue

Comments

@gjtempleton

Tell us about your request
Support in the cluster autoscaler for multi instance type ASGs to make utilisation of spot instances and administration of spot instance based clusters simpler.

Which service(s) is this request for?
EKS/Kubernetes on AWS in general.

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Reliably run a cluster where the worker nodes run on spot instances (with very few exceptions). We're currently running 6 spot ASGs across 6 different instance families to reduce the chance of losing a significant proportion of our compute at once. Unfortunately, the cluster autoscaler currently can't make use of spot fleets or multi instance type ASGs, because it cannot predict exactly which instance type will come up next.

The cluster autoscaler also isn't aware of spot bids. It looks only at actual instances and at the difference between the desired and running counts of ASGs, which leads to issues when spot bids cannot be fulfilled.
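To illustrate the gap described above, here is a minimal Python sketch. It is not cluster autoscaler code: the `GroupState` type, its fields, and the group names are all hypothetical, and the sketch only assumes that a desired count and a running count are available per ASG.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Snapshot of one ASG; field names are illustrative, not an AWS API."""
    name: str
    desired: int  # capacity the ASG has been asked to provide
    running: int  # instances that actually exist


def unfulfilled(groups):
    """Return {group name: missing instances} for groups below desired capacity."""
    return {g.name: g.desired - g.running for g in groups if g.desired > g.running}


groups = [
    GroupState("spot-m5", desired=4, running=4),
    GroupState("spot-c5", desired=6, running=2),  # e.g. spot bids not fulfilled
]
print(unfulfilled(groups))  # {'spot-c5': 4}
```

A gap like the one on `spot-c5` says nothing about why the instances are missing, which is the problem when the cause is an unfulfilled spot bid.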

Using a multiple instance type ASG would make administration of the clusters significantly simpler, as well as allowing the cluster to automatically switch between spot and on-demand.

Are you currently working around this issue?
Multiple ASGs with multiple instance types, all using spot bids, along with backporting of some patches by Zalando to the cluster autoscaler version being used by us. (The latest 2 commits here: https://github.com/aermakov-zalando/autoscaler/commits/autoscaler-ignore-existing-nodes-upstream )

Additional context
There is an ongoing PR to add the ability to use the price expander with AWS spot here: kubernetes/autoscaler#486

One PR discussing the same sort of functionality (ability to use MixedInstancesPolicy): kubernetes/autoscaler#1473

WIP RFC discussing the desire for multi instance ASG support here: https://docs.google.com/document/d/1m-2lQCOwxMrlv1rCz1JqUyBrZAjLOZAELswjuljJVjg/edit

Attachments
None

@gjtempleton gjtempleton added the Proposed Community submitted issue label Jan 31, 2019
@runningman84

We are in the same boat and use mixed autoscaling groups of instance types with almost identical specs, such as m5a.large, m5.large, m5d.large, m4.large and t3.large (unlimited). This solution seems to work.
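One way to check that the types in a mixed group really do match is to compare their vCPU and memory figures. The sketch below is only an illustration: the `SPECS` table is hard-coded by hand rather than fetched from AWS, and `same_size` is a hypothetical helper, not part of any tool mentioned in this thread.

```python
# Illustrative spec table: instance type -> (vCPU, memory in GiB).
# Hard-coded for this sketch; a real check would look these up from AWS.
SPECS = {
    "m5a.large": (2, 8),
    "m5.large": (2, 8),
    "m5d.large": (2, 8),
    "m4.large": (2, 8),
    "t3.large": (2, 8),
    "m5.xlarge": (4, 16),
}


def same_size(instance_types):
    """True if every listed type has identical vCPU and memory in SPECS."""
    return len({SPECS[t] for t in instance_types}) <= 1


mixed_group = ["m5a.large", "m5.large", "m5d.large", "m4.large", "t3.large"]
print(same_size(mixed_group))                 # True
print(same_size(["m5.large", "m5.xlarge"]))   # False
```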

@alfredkrohmer

Isn't this basically a feature request for the cluster-autoscaler or for EC2 in general? Not sure what exactly this has to do with EKS.

@Jeffwan

Jeffwan commented Feb 4, 2019

@devkid I think the EKS team may be able to help with this case. We're also evaluating this feature and trying to bring it into the cluster autoscaler. The challenge right now is that the node group abstraction level may need to change to support this feature in CA, since other cloud providers still work at the node group level. Supporting it means going one level down to manage different node flavors within one ASG.

There are other challenges too, for example bringing up different nodes (or OnDemand together with Spot) in one scale-up request. We want to make sure everything stays compatible within the CA community.

@bigpg

bigpg commented Feb 11, 2019

This feature would also make it considerably easier to utilise a mix of OnDemand instances and Spot in the same K8S cluster where your OnDemand usage is covered by RIs without the need for managing multiple ASGs and node affinity.
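As a rough sketch of what such a mix could look like, the Python below builds a `MixedInstancesPolicy` structure using the field names from the EC2 Auto Scaling API. The helper `mixed_instances_policy`, the template name `workers-template`, and the capacity numbers are assumptions made up for illustration; nothing here comes from the cluster autoscaler itself.

```python
def mixed_instances_policy(template_name, instance_types, on_demand_base, on_demand_pct):
    """Build a MixedInstancesPolicy dict with an on-demand base and spot on top."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,  # hypothetical template name
                "Version": "$Latest",
            },
            # One override per instance type the ASG is allowed to launch.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            # Instances always launched on-demand (e.g. the part covered by RIs).
            "OnDemandBaseCapacity": on_demand_base,
            # Share of capacity above the base that is on-demand; 0 means all spot.
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct,
            "SpotAllocationStrategy": "lowest-price",
        },
    }


policy = mixed_instances_policy(
    "workers-template", ["m5.large", "m5a.large", "m4.large"],
    on_demand_base=2, on_demand_pct=0,
)
print(policy["InstancesDistribution"])
```

A dict of this shape could be passed as the `MixedInstancesPolicy` argument of boto3's `create_auto_scaling_group`, though that call is left out here so the sketch runs without AWS credentials.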

@jaypipes

kubernetes/autoscaler#1886 has merged and been cherry-picked at least back to 1.14. There is also now documentation going over how to use mixed instance types in the ASG with cluster autoscaler. There are known limitations, of course, including the big one, which is that the instance types should be the same "size".

I'm guessing that kubernetes/autoscaler#486 will not be merged any time soon. It's too big to reasonably review and needs to be rebased anyway. My suggestion would be to close this particular GitHub issue, since I believe the original feature request has now been completed. We can create a separate issue around removing specific limitations such as the same-size instance types restriction, if that is amenable to you, @gjtempleton?

@gjtempleton
Author

Hey @jaypipes, yep, that seems reasonable to me.

I think with the merging of kubernetes/autoscaler#2235 we're unlikely to switch to mixed instance types with the current limitations, but I equally don't think that should be a blocker for closing this as the same PR makes it reasonable for us to continue using multiple ASGs of single types.

Will close as resolved for now, thanks for all your help in getting those PRs over the line!

@Jeffwan

Jeffwan commented Aug 28, 2019

@gjtempleton Great! Also check kubernetes/autoscaler#2248, which fixed some issues with MixedInstancesPolicy. I will also help cherry-pick #2235 back to previous versions. I hope you can share more of your use cases for them so we can keep improving it.


7 participants