-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add IPAllocated and IPAssigned conditions to Egress status #5282
Conversation
8b85374
to
1270a43
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IPAssigned's handling almost looks good to me. We could do something similar for IPAllocated to be more robust.
@@ -766,3 +766,81 @@ func checkExternalIPPoolUsed(t *testing.T, controller *egressController, poolNam | |||
}) | |||
assert.NoError(t, err) | |||
} | |||
|
|||
func TestEgressStatus(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After making the above change, we could just have an unit test for updateEgressAllocatedCondition
, which just focus on validating the condition based on given egress and error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but let's see if all tests pass
/test-all |
unit test is failing |
I’ll take a look and try to resolve the issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM, tests could be simpler and more focused.
assert.NoError(t, err) | ||
|
||
if tt.expectedUpdate { | ||
assert.Eventually(t, func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syncEgress
is synchronous, there is no need to perform eventual check.
@@ -766,3 +766,100 @@ func checkExternalIPPoolUsed(t *testing.T, controller *egressController, poolNam | |||
}) | |||
assert.NoError(t, err) | |||
} | |||
|
|||
func TestEgressAllocatedStatus(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little duplicate to call other functions addEgress
, syncEgress
in this test. Since we add updateEgressAllocatedCondition
, the test could focus on it:
func TestUpdateEgressAllocatedCondition(t *testing.T) {
tests := []struct {
name string
inputEgress *v1beta1.Egress
inputErr error
expectedStatus v1beta1.EgressStatus
}{
...
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
stopCh := make(chan struct{})
defer close(stopCh)
controller := newController(nil, []runtime.Object{tt.inputEgress})
controller.updateEgressAllocatedCondition(tt.inputEgress, tt.inputErr)
gotEgress, err := controller.crdClient.CrdV1beta1().Egresses().Get(context.TODO(), tt.inputEgress.Name, metav1.GetOptions{})
require.NoError(t, err)
assert.True(t, semanticIgnoringTime.DeepEqual(tt.expectedStatus, gotEgress.Status), "Expected %v, got %v", tt.expectedStatus, gotEgress.Status)
})
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AJPL88, please try to squash 12 commits into one. Thanks.
type: string | ||
status: | ||
type: string | ||
lastTransitionTime: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add format: date-time
for this field. You can check here https://speakeasyapi.dev/post/openapi-tips-data-type-formats/ if you'd like to learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the existing APIs don't have it, I think we can keep them consistent for now and update them separately together.
type EgressConditionType string | ||
|
||
const ( | ||
IPAllocated EgressConditionType = "IPAllocated" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to add comments about what does these types mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
335b354
to
d93924d
Compare
When EgressIP is successfully allocated by antrea-controller, IPAllocated condition in Egress status is updated. When EgressIP is assigned to a Node by antrea-agent, IPAssigned condition in Egress status is updated. Signed-off-by: Alan Jiang <accelerator5460@gmail.com>
d93924d
to
df8a2f7
Compare
df8a2f7
to
5c0319c
Compare
@@ -23,6 +23,7 @@ import ( | |||
"sync" | |||
"time" | |||
|
|||
v1 "k8s.io/api/core/v1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd prefer corev1
as the name for consistency with other imports (coreinformers
, metav1
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
} else if egressIP == "" { | ||
// Select one Node to update false status among all Nodes. | ||
nodeToUpdateStatus, err := c.cluster.SelectNodeForIP(egress.Spec.EgressIP, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a guarantee here that egress.Spec.EgressIP
is not nil
(I know that addEgress
has a check for that)? If so, would be nice to have a comment to that effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not guaranteed, but we don't care the value of egress.Spec.EgressIP, just use it to reach consensus among all agents about which one should do the update.
Added a comment for it.
} | ||
} | ||
} else { | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to have a comment here, to explain which case this is (static Egress IP non-assigned to this Node?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added "The Egress IP is assigned to a Node (egressIP != "") but it's not this Node (isLocal == false), do nothing."
desiredStatus.EgressIP = "" | ||
// If the error is nil, it means the Egress hasn't been processed yet. Therefore, we only set IPAssigned | ||
// condition to false when there is an error. | ||
if scheduleErr != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if scheduleErr
is nil
shouldn't we have the condition anyway with a message explaining that the Egress hasn't been processed yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interval will just be a few ms. And there could be some bad side effects, so I intentionally add the check:
- One intermediate state will be generated and overwriten very soon.
- Two agents may try to update it and cause some backoff retries, which may delay the update of the right condition. One agent is selected due to fallback mechanism (when the Egress is not scheduled to any node because it's not processed yet), one agent is the right owner (after it's processed).
The state an Egress is not processed will last only a very short time. It's because the scheduler may receive the event at the same time as the controller. So controller may process it before the scheduler gets a result. But as long as the Egress is there, the scheduler will eventually have a result for it very soon.
Also added comments for it.
@@ -1026,3 +1072,19 @@ func (c *EgressController) GetEgress(ns, podName string) (string, string, error) | |||
func isEgressSchedulable(egress *crdv1b1.Egress) bool { | |||
return egress.Spec.EgressIP != "" && egress.Spec.ExternalIPPool != "" | |||
} | |||
|
|||
// compareEgressStatus compares two Egress Statuses, ignoring LastTransitionTime and conditions other than IPAssigned, returns true if they are equal. | |||
func compareEgressStatus(currentStatus, desiredStatus crdv1b1.EgressStatus) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: any reason not to use pointer parameters for this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to pointer
if compareEgressStatus(toUpdate.Status, *desiredStatus) { | ||
return nil | ||
} | ||
statusToUpdate := desiredStatus.DeepCopy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we make yet another copy here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment:
// Must make a copy here as we will append more conditions. If it's appended to desiredStatus directly, there
// would be duplicate conditions when the function retries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense
@@ -205,6 +205,27 @@ type EgressStatus struct { | |||
// EgressIP indicates the effective Egress IP for the selected workloads. It could be empty if the Egress IP in spec | |||
// is not assigned to any Node. It's also useful when there are more than one Egress IP specified in spec. | |||
EgressIP string `json:"egressIP"` | |||
|
|||
Conditions []EgressCondition `json:"conditions,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tnqn is it required to also add this to the v1alpha2 version of the API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there is no drawback in doing it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I noticed it's added to v1alpha2, I thought it's harmless so just leave it as is.
5c0319c
to
c1c2928
Compare
@tnqn It doesn't look like you pushed your latest changes? |
c1c2928
to
ce92b04
Compare
done, sorry for the confusion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
One follow-up question: if the actual IP assignment fails (call to c.ipAssigner.AssignIP
in the syncEgress
function), will this ever be reflected in the status?
} | ||
} else if egressIP == "" { | ||
// Select one Node to update false status among all Nodes. | ||
// We don't care the value of egress.Spec.EgressIP, just use it to reach a consensus among all agents about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't care about
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, hope it’s the last time I made the same mistake :)
if compareEgressStatus(toUpdate.Status, *desiredStatus) { | ||
return nil | ||
} | ||
statusToUpdate := desiredStatus.DeepCopy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense
Not yet, the error shouldn't happen in practice. As of now I've never seen it. We can add it when it turns out to be possible. |
ce92b04
to
3a03a93
Compare
/test-all |
1. Avoid generating a transient IPAssigned failure by differentiating scheduling failure from unprocessed case. 2. Fix duplicate IPAllocated conditions. 3. Add/update unit tests and e2e tests. Signed-off-by: Quan Tian <qtian@vmware.com>
3a03a93
to
b88f129
Compare
/test-all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
When EgressIP is successfully allocated by antrea-controller, IPAllocated condition in Egress status is updated
When Egress is assigned to a Node by antrea-agent, IPAssigned condition in Egress status is updated
For #4614