Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maintain the right number of ENIs and its IP addresses in WARM-IP pool #169

Merged
merged 1 commit into from
Sep 6, 2018

Conversation

liwenwu-amazon
Copy link
Contributor

@liwenwu-amazon liwenwu-amazon commented Sep 5, 2018

*Issue #154, #155 *

Description of changes:
Today, curMaxAddrsPerENI is set to 1 initially and it is only set to the correct value after allocating the 1st ENI. This can cause nodeIPPoolLow() and nodeIPPoolTooHIgh() returns incorrect value.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@liwenwu-amazon liwenwu-amazon changed the title Make sure maxAddrsPerENI is set correctly. Maintain the right number of ENIs and its IP addresses in WARM-IP pool Sep 5, 2018
ipamd/ipamd.go Outdated
c.currentMaxAddrsPerENI, err = c.awsClient.GetENIipLimit()

if err != nil {
c.currentMaxAddrsPerENI = int64(len(ec2Addrs))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it better to use the result from c.awsClient.GetENIipLimit() then int64(len(ec2Addrs))? Can you add a comment why we only do one when the other fails? Does anything not work in that case?

Copy link
Contributor

@nckturner nckturner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@edmorley
Copy link
Contributor

edmorley commented Jul 1, 2019

Hi!

The description of this PR at first glance makes it seem like a small bug fix, however it in fact appears to be the cause of CNI plugin v1.2.0+ having double the IP address usage requirements of the prior v1.1.0 (under our workflow at least), which caused us to run out of free IP addresses until we adjusted our subnets accordingly (which in itself required an EKS cluster rebuild/migration).

Specifically after this change, our worker nodes went from being allocated one ENI (which was only partly used, since we use smaller instances with fewer pods per instance) to always having two ENIs allocated (with the second being entirely unused), with both ENIs having the full 10 IP addresses allocated, doubling usage. (Whilst we plan on trying out WARM_IP_TARGET in the future, it sounds like it will need some testing/tuning, so we had stuck with the out of the box WARM_ENI_TARGET for now.)

As such, please could significant changes like these (that have the chance of being breaking) be called out more clearly in the PR description and changelog in the future? :-)

@mogren
Copy link
Contributor

mogren commented Jul 1, 2019

@edmorley Thanks, good call-out. And like you said, WARM_IP_TARGET is another quite significant change, and it will only work well if the pod-churn is not too high. If pods are scheduled and terminated at a high and variable rate, the calls to EC2 will get throttled, and attaching new IPs and ENIs will take a long time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants