AWS Ondemand not scaled up if Spot requests remain "Open" #1795
Comments
Second that. We're facing the same issue, but we're running spot instances only: same taints, multiple different instance groups across 3 AZs. We expected failover to the next ASG if a spot request couldn't be fulfilled.
AWS spot instances are not supported in CA, and I have not tested multiple node groups with onDemand and spot together. The behavior is unpredictable. I can help check.
#1133 describes this problem
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
It seems my issue is closely related to this one.
/label aws
I believe the underlying issue in this bug has been addressed by #2235, which is now merged. Can we close this issue out?
/remove-lifecycle stale
/area provider/aws
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
As Jay mentioned, #2235 addressed this issue. Please find the right release version with this improvement. (All releases after September, starting from 1.12.x, have this change.)
/close
@Jeffwan: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Greetings,
I'm running cluster-autoscaler v1.3.8 in a Kubernetes v1.11.8 cluster in AWS, created using Kops. I'm not sure if this falls under a feature request, a bug, or a misconfiguration on my part, so apologies in advance if it's a misconfiguration. Basically, for our pre-production cluster, I want to run spot instances as much as possible. I have the following node groups (and thus ASGs), and they look like the following when the cluster is running normally:
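A representative layout (the group names, counts, and sizes here are illustrative, not the actual values from this cluster) would be something like:

```
nodes-spot-a       spot       min 0  max 10  us-east-1a
nodes-spot-b       spot       min 0  max 10  us-east-1b
nodes-ondemand-a   on-demand  min 0  max 10  us-east-1a
nodes-ondemand-b   on-demand  min 0  max 10  us-east-1b
```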
I also have k8s-spot-rescheduler running to take pods from any `ondemand` nodes that are provisioned and move them to spot instances, so that the `ondemand` nodes can be removed. However, lately there always seem to be spot requests that are open but not yet fulfilled. This is normal behavior in AWS, but the problem is that the cluster autoscaler does not use the `ondemand` node groups at all. From the log output I see during this situation, pods remain unscheduled because there's no capacity, yet the cluster-autoscaler treats the "open" spot requests as having taken care of the scaling.
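The stuck requests are easy to confirm from outside the cluster; the standard AWS CLI can list spot requests that are still in the `open` state (a generic check, not something cluster-autoscaler itself does):

```sh
# List spot instance requests that have been submitted but not yet fulfilled.
aws ec2 describe-spot-instance-requests \
  --filters Name=state,Values=open \
  --query 'SpotInstanceRequests[].[SpotInstanceRequestId,Status.Code]' \
  --output table
```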
Any ideas on how to achieve both (prefer spot, but fall back to the on-demand groups when spot requests stay open)? I'm using the following configuration flags:
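For reference, a typical cluster-autoscaler invocation for a setup like this registers each ASG explicitly; this is a rough sketch with placeholder ASG names and ranges, not the actual flags from this cluster:

```sh
# Sketch: cluster-autoscaler on AWS with explicit spot and on-demand node groups.
cluster-autoscaler \
  --cloud-provider=aws \
  --nodes=0:10:nodes-spot-a \
  --nodes=0:10:nodes-spot-b \
  --nodes=0:10:nodes-ondemand-a \
  --nodes=0:10:nodes-ondemand-b \
  --expander=least-waste
```

With the groups registered this way the autoscaler does know about the on-demand ASGs; the behavior described above is that it still counts the open spot requests as capacity on the way, rather than falling back to them.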