Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skipping scale-up if MachineTypeNotAvailableAnnotation is present on MachineDeployment #37

Conversation

hardikdr
Copy link
Member

@hardikdr hardikdr commented May 3, 2020

What this PR does / why we need it: This PR enables autoscaler to skip the scale-up if MachineTypeNotAvailable annotation is present on the machine-deployment.
This annotation is set by the MCM, when certain machine-types are not out of capacity and can't be further scaled-up.

As an outcome of this change, autoscaler will pick up alternate node-group if one is not capable of scaling.

Related to : #35
Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Release note:

Autoscaler avoids scaling-up the MachineDeployment if MachineTypeNotAvailableAnnotation is present

@hardikdr hardikdr requested a review from prashanth26 as a code owner May 3, 2020 09:59
@gardener-robot-ci-2 gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 3, 2020
@prashanth26
Copy link

Hi @hardikdr ,

Were these changes tested with a real setup?

@hardikdr
Copy link
Member Author

hardikdr commented May 4, 2020

Yes, I have tested this with the related changes at MCM #454, works fine.

Copy link

@prashanth26 prashanth26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hold

@CLAassistant
Copy link

CLAassistant commented Sep 16, 2020

CLA assistant check
All committers have signed the CLA.

@jsravn
Copy link

jsravn commented Sep 28, 2020

Is this the correct approach? From looking at it, upstream autoscaler already handles this but you have to return unregistered instances in the autoscaler cloudprovider interface, which this fork isn't doing properly.

@hardikdr
Copy link
Member Author

@jsravn That's true, and also one of the reasons why I stopped working on this approach for now.
The interface in this version of CA, isn't mature enough, and I am rebasing the autoscaler to the latest version here: https://github.com/gardener/autoscaler/tree/machine-controller-manager-provider. I'll then be looking at this topic again and find a more natural way to fix it.

@hardikdr
Copy link
Member Author

/hold

@gardener-robot gardener-robot added the reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies label Sep 28, 2020
@jsravn
Copy link

jsravn commented Sep 28, 2020

@hardikdr Ah cool. I was also going to try rebasing. I should have a PR soon which fixes it to return unregistered instances - so at least this functionality will work as expected in the existing code (if it fails to scale up the machinedeployment, autoscaler will give up after max-node-provision-time and try the next one).

@hardikdr
Copy link
Member Author

hardikdr commented Sep 28, 2020

Sounds great. We are discussing this issue at length here: #35.

Also, you might want to make sure, your changes are on top of the machine-controller-manager-provider, which is also a default branch now.

@jsravn
Copy link

jsravn commented Oct 2, 2020

@hardikdr FYI I opened a PR at #58.

@prashanth26
Copy link

We no longer need this PR right? Thanks to this - #58?

@hardikdr
Copy link
Member Author

hardikdr commented Oct 9, 2020

No, we don't need this anymore.
/close

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/stale Nobody worked on this for 6 months (will further age) needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/do-not-merge Has no approval for merging as it may break things, be of poor quality or have (ext.) dependencies
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants