Stuck while scaling AWS spot instances in a "full" zone #2391

xrl · 2019-09-26T17:15:54Z

We spent 45 minutes today waiting for spot instances to come online. The AZ was fully committed and would not give out spot reservations. I manually changed the scaling order to a different ASG (i.e., different AZ) and we got our servers right away.

Here is what it looked like when the ASG was trying to scale in a fully committed AZ:

I do not have any volume/taint restrictions for these workloads. It's safe to schedule them on any host. Could we make it so ASGs are knocked to the back of the list after a few failed attempts at scaling up?
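To make the request concrete, here is a minimal sketch of the "knock it to the back of the list" idea. It is not how the cluster autoscaler is implemented; the class `ScaleUpOrder`, its methods, and the `max_failures` threshold are all hypothetical names chosen for illustration, and the ASG names are made up.

```python
from collections import defaultdict


class ScaleUpOrder:
    """Hypothetical helper: orders node groups, demoting ones that keep failing."""

    def __init__(self, groups, max_failures=3):
        self.groups = list(groups)        # preferred order, e.g. ASG names
        self.max_failures = max_failures  # assumed threshold, not a real CA flag
        self.failures = defaultdict(int)  # consecutive failed scale-ups per group

    def record_failure(self, group):
        # Call when a scale-up request for this group did not yield instances.
        self.failures[group] += 1

    def record_success(self, group):
        # A successful scale-up clears the group's failure streak.
        self.failures[group] = 0

    def candidates(self):
        # Stable sort on a boolean key: healthy groups keep their preferred
        # order, groups at or over the threshold move to the back.
        return sorted(
            self.groups,
            key=lambda g: self.failures[g] >= self.max_failures,
        )


order = ScaleUpOrder(["asg-a", "asg-b", "asg-c"], max_failures=2)
order.record_failure("asg-a")
order.record_failure("asg-a")
print(order.candidates())  # asg-a is now tried last
```

A demoted group is still tried if every other group fails too, and it returns to its preferred position after one successful scale-up.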

seh · 2019-10-02T12:02:31Z

Does your version of the cluster autoscaler have #2235 included?

Jeffwan · 2019-10-08T22:29:40Z

@xrl Please provide your CA version; it would help us debug the issue.

xrl · 2019-10-10T16:14:54Z

I just built master and deployed it. The cluster is EKS 1.13, but the autoscaler is built from master... hopefully that isn't too problematic!

I'm going to close this ticket while I wait for another failure situation in our eu-central-1 region.

xrl closed this as completed Oct 10, 2019
