Optimize processing of egress rules that have no named port #1100
Conversation
/test-all
Thanks for your PR. The following commands are available:
/test-all
@tnqn, could I bother you to trigger the hw-offload CI again? There was a misconfiguration that caused the 404 error.
@AbdYsn done. /test-hw-offload
@tnqn Thanks.
```diff
 allPodsGroupUID := n.createAddressGroupForCRD(matchAllPodsPeerCrd, cnp)
 podsPeer := matchAllPeer
-addressGroups = append(addressGroups, allPodsGroupUID)
-podsPeer.AddressGroups = addressGroups
+podsPeer.AddressGroups = append(addressGroups, allPodsGroupUID)
```
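The net effect of the change, per the PR description, is that the "matchAllPods" group is only attached to the "To" peer when the rule actually contains a named port. Below is a minimal, self-contained sketch of that logic, using simplified stand-in types and a hypothetical hasNamedPort helper rather than the real Antrea controlplane structs:

```go
package main

import "fmt"

// Simplified stand-ins for illustration only; the real Antrea types differ.
type Port struct {
	Name   string // set for named ports, e.g. "http"
	Number int32  // set for numeric ports, e.g. 53
}

type Peer struct {
	AddressGroups []string // e.g. the UID of the "matchAllPods" AddressGroup
	IPBlocks      []string
}

// hasNamedPort reports whether the rule references any port by name.
func hasNamedPort(ports []Port) bool {
	for _, p := range ports {
		if p.Name != "" {
			return true
		}
	}
	return false
}

// buildMatchAllPeer builds the "to" peer for a rule with an empty "to" field.
// The matchAllPods group is only needed when a named port must be resolved
// against destination Pods; otherwise the ipBlock "0.0.0.0/0" alone is enough.
func buildMatchAllPeer(ports []Port, allPodsGroupUID string) Peer {
	peer := Peer{IPBlocks: []string{"0.0.0.0/0"}}
	if hasNamedPort(ports) {
		peer.AddressGroups = append(peer.AddressGroups, allPodsGroupUID)
	}
	return peer
}

func main() {
	fmt.Println(buildMatchAllPeer([]Port{{Number: 53}}, "uid-1"))   // {[] [0.0.0.0/0]}
	fmt.Println(buildMatchAllPeer([]Port{{Name: "http"}}, "uid-1")) // {[uid-1] [0.0.0.0/0]}
}
```

Running the sketch shows that a rule with only numeric ports ends up with just the ipBlock, which is what removes the rule's dependency on cluster-wide Pod events.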
Do you think we can further optimize to include only Pods with the specified namedPorts, or at least not include Pods without any namedPorts?
+1 to this comment. Although maybe there is a trade-off here, as we may end up with multiple large address groups instead of one large one?
Thanks for the suggestion. I thought about it but didn't see a clear benefit.
The trade-off is generic AddressGroups vs. AddressGroups specific to namedPorts. If we include only Pods with the specified namedPorts, or exclude Pods without any namedPorts, the AddressGroup becomes associated with a specific rule or a specific kind of rule, which has some drawbacks:
- More information needs to be added to AddressGroup to indicate its usage pertaining to namedPorts.
- Multiple groups with the same selectors need to be created and maintained.
- This works only for the "match all addresses" case, which would need to be handled specially.
For instance, when a rule like the one below is created, we would create the matchAllPods AddressGroup and need to add a field like namedPorts to the AddressGroup struct, set to []{"http"}, so that only Pods that have a containerPort named http are included.
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: http
      protocol: TCP
```
When another rule like the one below is created, which allows access to all Pods but not to other IP addresses, it cannot share the "matchAllPods" AddressGroup, because that group doesn't contain Pods without namedPorts.
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http-and-22
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: http
      protocol: TCP
    - port: 22
      protocol: TCP
    to:
    - namespaceSelector: {}
```
Similarly, for normal AddressGroups, a rule may allow other numbered ports even if a Pod doesn't have named ports, so the optimization doesn't apply to them. We would need to handle the "matchAllPods" AddressGroup and normal AddressGroups differently, which means yet another flag to indicate it.
Besides, from a use-case perspective, I think it's unusual for a rule to allow access to Pods that don't have the named port it specifies; otherwise the rule wouldn't achieve its purpose. So in most cases there shouldn't be many Pods that could be filtered out.
If we want to include only Pods with the specified namedPorts, one proposal is to add a capability in addressGroups to select Pods with a certain namedPort. Right now an addressGroup selector is defined only by namespaceSelector and podSelector (externalEntitySelector soon). With the new 'namedPort selector', we could do the namedPort resolution in filterAddressGroupsForPod(). However, I believe this fundamentally changes how namedPort resolution is done right now and would require quite some effort? @tnqn
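To make the proposal concrete, here is a rough sketch (an illustration only, not an existing Antrea API) of how a namedPort-aware membership check might look when deciding whether a Pod belongs to such an AddressGroup:

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// groupSelector sketches a hypothetical selector extended with a NamedPorts
// field; today's AddressGroup selector only carries namespace/pod selectors.
type groupSelector struct {
	NamedPorts []string // e.g. []string{"http"}; empty means no named-port constraint
}

// podMatchesNamedPorts reports whether the Pod exposes every named port that
// the hypothetical selector requires as a containerPort name.
func podMatchesNamedPorts(sel groupSelector, pod *corev1.Pod) bool {
	for _, want := range sel.NamedPorts {
		found := false
		for _, c := range pod.Spec.Containers {
			for _, p := range c.Ports {
				if p.Name == want {
					found = true
					break
				}
			}
			if found {
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}
```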
Another thing that confuses me: if there is a named port in an egress rule, why do we append 0.0.0.0/24? This essentially says that for out-of-cluster traffic we simply ignore the namedPort rather than interpreting it as resolving to nothing. Is this the expected semantics?
Looks like GitHub sorts comments based on "drafted time" instead of published time. (Mine is actually later than @Dyanngg's comment.)
@Dyanngg #1100 (comment) should answer your first question; at the moment I don't feel it's worth making this change, given the listed reasons.
For the second question, I think you mean "0.0.0.0/0".
```go
// List of destinations for outgoing traffic of pods selected for this rule.
// Items in this list are combined using a logical OR operation. If this field is
// empty or missing, this rule matches all destinations (traffic not restricted by
// destination). If this field is present and contains at least one item, this rule
// allows traffic only if the traffic matches at least one item in the to list.
// +optional
To []NetworkPolicyPeer `json:"to,omitempty" protobuf:"bytes,2,rep,name=to"`
```
When "To" is not specified, it means all destinations are allowed, so "0.0.0.0/0" is a must. We prepend the "matchAllPods" group to support the named port case; see this test case: https://github.com/kubernetes/kubernetes/blob/master/test/e2e/network/network_policy.go#L691.
If we didn't do that, the named port would simply be ignored, as "0.0.0.0/0" cannot resolve any port name, but apparently it's expected that the name is converted to a number whenever a Pod makes that possible.
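In other words, a named port can only be turned into a number by consulting a concrete Pod's containerPorts; roughly as in this sketch (illustrative code, not Antrea's actual resolution logic):

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// resolveNamedPort returns the numeric containerPort a Pod exposes under the
// given name, or false if the Pod has no such named port. An ipBlock such as
// "0.0.0.0/0" has no Pod spec to consult, which is why the name cannot be
// resolved for out-of-cluster destinations.
func resolveNamedPort(pod *corev1.Pod, name string) (int32, bool) {
	for _, container := range pod.Spec.Containers {
		for _, port := range container.Ports {
			if port.Name == name {
				return port.ContainerPort, true
			}
		}
	}
	return 0, false
}
```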
We discussed including NamedPorts in the group definition before. I know it would add a lot of complexity and probably needs more thought.
For a global group that includes all Pods without any NamedPort, I still feel it is doable, though yes, we would have to handle all the different cases and judge which of them can use this group. I think it also depends on what kinds of rules are more common (e.g. whether rules typically contain both NamedPorts and un-named ports). If we are not clear on this for now, let's see if we can get more feedback later.
In general I feel that named ports in network policies are not a heavily used feature, so adding the extra complexity to handle them is probably not worth it. But as Jianjun says, we need to see how people using Antrea use network policies, so we can decide on a better approach when we have more data.
LGTM
I wonder if we should consider adding a benchmark test every time we run into an "edge" case like this one and fix it. This could beef up our test suite and help prevent performance regressions in the future.
If an egress rule allows all addresses but has no named port, it doesn't need to have the "matchAllPodsPeer" in its "To" Peer which is used for named port resolving. This could avoid the transmission of the "matchAllPods" AddressGroup and the reconciliation of the relevant rules.
416e543
Thanks for suggesting it. I'd love to add benchmark tests, but this one is a little different because it's a logical optimization rather than a physical one, so we can't really see the improvement by benchmarking a function. The change is made to antrea-controller, but it takes effect on antrea-agent, which should no longer receive the global AddressGroup and reconcile the rules when Pods are created/deleted in the cluster, thus saving CPU cycles and memory. Even if we set up a cluster to verify it, the overhead of receiving the AddressGroup and reconciling the rules can only be seen with tens of thousands of Pods and tens of such rules applied to a Node, which is not feasible at the moment. In the future maybe we could leverage kubemark to simulate a large-scale cluster in CI. For this particular case, I think the unit tests are able to ensure that such policy rules are no longer converted to internal rules that include the "matchAllPods" group.
/test-all
/test-hw-offload
If an egress rule allows all addresses but has no named port, it doesn't need to have the "matchAllPodsPeer" (which is used for named port resolution) in its "To" Peer. This avoids the transmission of the "matchAllPods" AddressGroup and the reconciliation of the relevant rules.
For instance, consider a rule that allows all Pods in its Namespace to access port 53 of all IP addresses.
Before applying this patch, a global AddressGroup containing all Pods in all Namespaces would be created, and the converted rule's "to" field would contain both that "matchAllPods" AddressGroup and the ipBlock "0.0.0.0/0".
This means the rule has to be reconciled whenever there is a Pod update event in the cluster, regardless of the Namespace and Node the Pod belongs to. The overhead of handling such updates is considerable in a large-scale cluster (more than 30k Pods and 10k such rules).
After applying this patch, the "to" field of the converted rule will have only the ipBlock "0.0.0.0/0", so the rule won't be affected by other Pod events in the cluster.
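Concretely, the change to the converted rule's "to" field looks roughly like this (a sketch with simplified stand-in types, not the literal Antrea controlplane objects):

```go
package sketch

// Peer is a simplified stand-in for the "to" field of an internal rule.
type Peer struct {
	AddressGroups []string
	IPBlocks      []string
}

// Before the patch: the rule carries the global "matchAllPods" AddressGroup,
// so every Pod add/update/delete in the cluster can trigger a reconciliation.
var toBefore = Peer{
	AddressGroups: []string{"<matchAllPods-AddressGroup-UID>"},
	IPBlocks:      []string{"0.0.0.0/0"},
}

// After the patch: no named port in the rule, so only the ipBlock remains and
// unrelated Pod events no longer touch this rule.
var toAfter = Peer{
	IPBlocks: []string{"0.0.0.0/0"},
}
```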