-
Notifications
You must be signed in to change notification settings - Fork 742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add 30 sec cooling for IP address reassignment #62
Conversation
ipamd/datastore/data_store.go
Outdated
@@ -213,7 +214,7 @@ func (ds *DataStore) assignPodIPv4AddressUnsafe(k8sPod *k8sapi.K8SPodInfo) (stri | |||
addr.address, k8sPod.Name, k8sPod.Namespace) | |||
return addr.address, eni.DeviceNumber, nil | |||
} | |||
if !addr.Assigned && k8sPod.IP == "" { | |||
if !addr.Assigned && k8sPod.IP == "" && curTime.Sub(addr.unAssignedTime) > 30*time.Second { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this time related to the addressCoolingPeriod
const?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on Matt's question. The values seem related, though the defined constant seems to be used to guard when it's ok to free an ENI. If they are coupled, reuse the defined constant. If not, make another named constant instead of the inline 30*time.Second
.
I note that addr.unAssignedTime
may be nil. In this case Sub
appears to compute a duration from epoch, but this is accidental, not documented
behavior. It would be safer for code to handle the nil case explicitly, to protect for this undocumented behavior changing. I also note the time package has a Since
function that would simplify the expression to Since(addr.unAssignedTime) > addressCoolingPeriod
and save having to set curTime above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon deeper look, time.Time is struct, only one element of which can be nil. The Zero value of time is defined as the unix epoch, so the behavior is not undocumented. I still advocate a named constant, commentary, and use of Since in preference to now.Sub(prior).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would also be nice to explain in a comment why ipaddrs and enis need a cooling period, preferably near where you define the cooling duration constant(s). The comment can be brief and link to a design doc or issue for depth.
ipamd/datastore/data_store.go
Outdated
@@ -213,7 +214,7 @@ func (ds *DataStore) assignPodIPv4AddressUnsafe(k8sPod *k8sapi.K8SPodInfo) (stri | |||
addr.address, k8sPod.Name, k8sPod.Namespace) | |||
return addr.address, eni.DeviceNumber, nil | |||
} | |||
if !addr.Assigned && k8sPod.IP == "" { | |||
if !addr.Assigned && k8sPod.IP == "" && curTime.Sub(addr.unAssignedTime) > 30*time.Second { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on Matt's question. The values seem related, though the defined constant seems to be used to guard when it's ok to free an ENI. If they are coupled, reuse the defined constant. If not, make another named constant instead of the inline 30*time.Second
.
I note that addr.unAssignedTime
may be nil. In this case Sub
appears to compute a duration from epoch, but this is accidental, not documented
behavior. It would be safer for code to handle the nil case explicitly, to protect for this undocumented behavior changing. I also note the time package has a Since
function that would simplify the expression to Since(addr.unAssignedTime) > addressCoolingPeriod
and save having to set curTime above.
1890d05
to
3ccfef9
Compare
The cooling period (design-proposal ) is used to avoid an IP address being reassigned to a different Pod within cooling period. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to ship. But I'm confused why ENI cool period should be different from address cool period. I'd prefer the more concise time differential expression I suggested.
|
||
// addressCoolingPeriod is used to ensure an IP not get assigned to a Pod if this IP is used by a different Pod | ||
// in addressCoolingPeriod | ||
addressCoolingPeriod = 30 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these two cooling delays different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressCoolingPeriod is used to ensure an IP not get assigned to another Pod within addressCoolingPeriod. It impacts how soon the new Pods assigned to the node can re-use those IP addresses. For examples, a replica scale down which deletes all Pods on the node, and then immediately followed by a replica scale up that allocates all new Pods to the node. The addressCoolingPeriod can impact how soon these new Pods get their addresses. addressENICoolingPeriod is used to ensure ENI will not get freed back to EC2 control plane within addressENICoolingPeriod. ENI allocation/free are expensive operations. So, the timer is longer and we planning to add this a configurable parameter in future. For example, in some cases, ENI are never freed.
@@ -213,7 +220,7 @@ func (ds *DataStore) assignPodIPv4AddressUnsafe(k8sPod *k8sapi.K8SPodInfo) (stri | |||
addr.address, k8sPod.Name, k8sPod.Namespace) | |||
return addr.address, eni.DeviceNumber, nil | |||
} | |||
if !addr.Assigned && k8sPod.IP == "" { | |||
if !addr.Assigned && k8sPod.IP == "" && curTime.Sub(addr.unAssignedTime) > addressCoolingPeriod { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since
is more concise and simpler to understand than Sub(priorTime)
@@ -248,7 +255,7 @@ func (ds *DataStore) getDeletableENI() *ENIIPPool { | |||
continue | |||
} | |||
|
|||
if time.Now().Sub(eni.lastUnAssignedTime) < addressCoolingPeriod { | |||
if time.Now().Sub(eni.lastUnAssignedTime) < addressENICoolingPeriod { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto. Since
is more concise and simpler to understand than Sub(priorTime)
Issue #59:
Pods stuck in ContainerCreating due to CNI Failing to Assing IP to Container Until aws-node is deleted*
Description of changes:
When an IP is returned to IP WarmPool, keep it there for 30 sec before reassigning to a new Pod. This was planned in the original proposal.
Tests Performed:
performs back-to-back pods create and pods delete continuously. Without fix, the issue can be reproduced easily uses "heavy pods"(e.g. from StatefulSet). With the fix, the issue is no-longer reproducible
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.