Optimize processing of egress rules that have no named port #1100

Merged
merged 1 commit into antrea-io:master on Aug 20, 2020

Conversation

@tnqn (Member) commented Aug 18, 2020

If an egress rule allows all addresses but has no named port, it doesn't need the "matchAllPodsPeer" in its "To" Peer, which is used for named port resolution. Dropping it avoids the transmission of the "matchAllPods" AddressGroup and the reconciliation of the relevant rules.

For instance, the following rule allows all Pods in this namespace to access port 53 of all IP addresses.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: 53
      protocol: TCP
    - port: 53
      protocol: UDP

Before applying this patch, a global AddressGroup that contains all Pods in all Namespaces would be created, and the rule would be converted to:

- appliedToGroups:
  - c4c59cfe-9160-5de5-a85b-01a58d11963e
  name: allow-dns
  namespace: default
  rules:
  - direction: Out
    from: {}
    services:
    - port: "53"
      protocol: TCP
    - port: "53"
      protocol: UDP
    to:
      addressGroups:
      - eebc0ede-1d20-50a8-a9bd-eaa41b5954d5 # The AddressGroup that contains all Pods of the cluster.
      ipBlocks:
      - cidr: 0.0.0.0/0

As a result, the rule has to be reconciled whenever there's a Pod update event in the cluster, regardless of the Namespace and the Node the Pod belongs to. The overhead of handling such updates is considerable in a large-scale cluster (more than 30k Pods and 10k such rules).

After applying this patch, the "to" field of the converted rule contains only the ipBlock "0.0.0.0/0", so the rule is no longer affected by Pod events elsewhere in the cluster.

- appliedToGroups:
  - c4c59cfe-9160-5de5-a85b-01a58d11963e
  name: allow-dns
  namespace: default
  rules:
  - direction: Out
    from: {}
    services:
    - port: "53"
      protocol: TCP
    - port: "53"
      protocol: UDP
    to:
      ipBlocks:
      - cidr: 0.0.0.0/0
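
In code terms, the idea is roughly the following. This is a simplified, self-contained sketch; the type and helper names (Peer, Port, egressPeerSketch) are illustrative, not the actual Antrea identifiers:

package main

import "fmt"

// Simplified stand-ins for the controller's internal types; field and helper
// names are illustrative, not the actual Antrea API.
type Peer struct {
    AddressGroups []string
    IPBlocks      []string
}

type Port struct {
    Name   string // set when the policy references a port by name
    Number int32  // set when the policy references a port by number
}

// egressPeerSketch captures the idea of this change: when a rule's "to" list
// is empty (match all destinations), the "matchAllPods" AddressGroup is
// needed only if the rule uses a named port, because that group exists to
// resolve port names against Pods.
func egressPeerSketch(toIsEmpty bool, ports []Port, matchAllPodsGroupUID string) Peer {
    if !toIsEmpty {
        // Peers are explicitly specified; regular AddressGroup creation
        // happens elsewhere and is out of scope for this sketch.
        return Peer{}
    }
    peer := Peer{IPBlocks: []string{"0.0.0.0/0"}} // allow all destinations
    for _, p := range ports {
        if p.Name != "" {
            // A named port can only be resolved against Pods, so the
            // group containing all Pods must stay in the peer.
            peer.AddressGroups = append(peer.AddressGroups, matchAllPodsGroupUID)
            break
        }
    }
    return peer
}

func main() {
    // Numbered ports only: just the ipBlock, no matchAllPods group.
    fmt.Println(egressPeerSketch(true, []Port{{Number: 53}}, "all-pods-uid"))
    // Named port: the matchAllPods group is still required.
    fmt.Println(egressPeerSketch(true, []Port{{Name: "http"}}, "all-pods-uid"))
}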

@tnqn (Member, Author) commented Aug 18, 2020

/test-all

@antrea-bot (Collaborator) commented:
Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@tnqn (Member, Author) commented Aug 18, 2020

/test-all

@abdallahyas (Contributor) commented:
@tnqn, could I bother you to trigger the hw-offload CI again? There was a misconfiguration that caused the 404 error.

@tnqn (Member, Author) commented Aug 18, 2020

@AbdYsn done.

/test-hw-offload

@abdallahyas (Contributor) commented:
@tnqn Thanks.

allPodsGroupUID := n.createAddressGroupForCRD(matchAllPodsPeerCrd, cnp)
podsPeer := matchAllPeer
addressGroups = append(addressGroups, allPodsGroupUID)
podsPeer.AddressGroups = addressGroups
podsPeer.AddressGroups = append(addressGroups, allPodsGroupUID)
Contributor commented on the snippet above:

Do you think we can further optimize to include only Pods with the specified namedPorts, or at least not include Pods without any namedPorts?

Contributor commented:

+1 to this comment. Although maybe there is a trade off here, as we may end up with multiple large address groups instead of one large one?

Member (Author) commented:

Thanks for the suggestion. I thought about it but didn't see a clear benefit.
The trade-off is generic AddressGroups vs. AddressGroups specific to namedPorts. If we include only Pods with the specified namedPorts, or exclude Pods without any namedPorts, the AddressGroup becomes associated with a specific rule or a specific kind of rule, which has some drawbacks:

  1. More information needs to be added to AddressGroup to indicate its usage pertaining to namedPorts.
  2. Multiple groups with the same selectors need to be created and maintained.
  3. This works only for the match-all-addresses case, so it needs to be handled specially.

For instance, when a rule like the one below is created, we would create the matchAllPods AddressGroup and need to add a field like namedPorts to the AddressGroup struct, setting its value to []{"http"} so that only Pods that have a containerPort named http are included.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: http
      protocol: TCP
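
To make drawback 1 concrete, the internal group type would need something like the hypothetical NamedPorts field in the fragment below (the types are simplified stand-ins, not the actual Antrea structs, and no such field exists today):

// Simplified stand-ins for the internal group types; the NamedPorts field is
// hypothetical and only illustrates the extra state drawback 1 refers to.
type GroupSelector struct {
    Namespace         string
    PodSelector       map[string]string
    NamespaceSelector map[string]string
}

type AddressGroup struct {
    UID      string
    Name     string
    Selector GroupSelector
    // NamedPorts would restrict membership to Pods exposing container ports
    // with these names, tying the group to one specific kind of rule.
    NamedPorts []string // e.g. []string{"http"}
}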

When another rule like the one below is created, which allows access to all Pods but no other IP addresses, it cannot share the "matchAllPods" AddressGroup because that group doesn't contain Pods without namedPorts.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http-and-22
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: http
      protocol: TCP
    - port: 22
      protocol: TCP
    to:
    - namespaceSelector: {}

Similarly, for normal AddressGroups, a rule may allow other numbered ports even if the Pod doesn't have named ports, so the optimization doesn't apply there. We would need to handle the "matchAllPods" AddressGroup and normal AddressGroups differently, which means yet another flag to indicate it.

Besides, from the perspective of use cases, I think it's unusual to have a rule that allows access to Pods that don't have the named port it specifies; otherwise the rule wouldn't achieve its purpose. So in most cases there shouldn't be many Pods that could be filtered out.

Contributor commented:

If we want to include only Pods with the specified namedPorts, one proposal is to add the capability for addressGroups to select Pods with a certain namedPort. Right now an addressGroup selector is only defined by namespaceSelector and podSelector (externalEntitySelector soon). With the new 'namedPort selector', we could do the namedPort resolution in filterAddressGroupsForPod(). However, I believe this fundamentally changes how namedPort resolution is done right now and would require quite some effort? @tnqn

Another thing that confuses me: if there is a named port in an egress rule, why do we append 0.0.0.0/24? This essentially says that, for out-of-cluster traffic, we simply ignore the namedPort rather than interpreting it as resolving to nothing. Is this the expected semantics?

Member (Author) commented:

Looks like GitHub sorts comments based on "drafted time" instead of published time (mine is actually later than @Dyanngg's comment).

@Dyanngg #1100 (comment) should answer your first question; at the moment I don't feel it's worth making this change, given the reasons listed there.

For the second question, I think you mean "0.0.0.0/0".

	// List of destinations for outgoing traffic of pods selected for this rule.
	// Items in this list are combined using a logical OR operation. If this field is
	// empty or missing, this rule matches all destinations (traffic not restricted by
	// destination). If this field is present and contains at least one item, this rule
	// allows traffic only if the traffic matches at least one item in the to list.
	// +optional
	To []NetworkPolicyPeer `json:"to,omitempty" protobuf:"bytes,2,rep,name=to"`

When "To" is not specified, it means allowing all destinations, so "0.0.0.0/0" is a must, we prepend the "matchAllPods" Group is to support named port case, see this test case https://github.com/kubernetes/kubernetes/blob/master/test/e2e/network/network_policy.go#L691.
If we don't do it, the named port would be ignored as "0.0.0.0/0" cannot resolve any port name, but apparently it's expected that the name should be converted number if a Pod is able to.
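
For illustration, the allow-http example earlier in this thread (no "to", only a named port) would be converted to an internal rule shaped roughly like the "before" example in the PR description; the AddressGroup identifier below is just a placeholder:

- direction: Out
  from: {}
  services:
  - port: http
    protocol: TCP
  to:
    addressGroups:
    - <UID of the AddressGroup containing all Pods> # needed to resolve "http" per Pod
    ipBlocks:
    - cidr: 0.0.0.0/0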

Contributor commented:

We discussed including NamedPorts in the group definition before. I know it would add much complexity and probably needs more thinking.

For a global group to include all Pods without any NamedPort, I still feel it is doable, though yes, we would have to handle all the different cases and judge which cases can use this group. I think it also depends on what kinds of rules are more common (e.g. whether rules likely contain both NamedPorts and un-named ports). If we are not clear on these for now, let's see if we can get more feedback later.

Contributor commented:

In general I feel that named ports in network policies are not a highly used feature, so adding the additional complexity to handle them is probably not worth doing. But as Jianjun says, we need to see how people using Antrea use network policies, so we can decide on a better approach when we have more data.

Member (Author) commented:

@jianjuns @abhiraut sure, let's consider this when there are cases proving it's worth it.

jianjuns previously approved these changes Aug 18, 2020
abhiraut previously approved these changes Aug 18, 2020
antoninbas previously approved these changes Aug 18, 2020
@antoninbas (Contributor) left a comment:

LGTM

I wonder if we should consider adding a benchmark test every time we run into an "edge" case like this one and we fix it. This could beef up our test suite and help prevent performance regressions in the future.

pkg/controller/networkpolicy/clusternetworkpolicy.go (outdated review thread, resolved)

pkg/controller/networkpolicy/networkpolicy_controller.go (outdated review thread, resolved)
If an egress rule allows all addresses but has no named port, it doesn't
need to have the "matchAllPodsPeer" in its "To" Peer which is used for
named port resolving. This could avoid the transmission of the
"matchAllPods" AddressGroup and the reconciliation of the relevant
rules.
@tnqn tnqn dismissed stale reviews from antoninbas, abhiraut, and jianjuns via 416e543 August 19, 2020 07:52
@tnqn (Member, Author) commented Aug 19, 2020

I wonder if we should consider adding a benchmark test every time we run into an "edge" case like this one and we fix it. This could beef up our test suite and help prevent performance regressions in the future.

Thanks for suggesting it. I'd love to add benchmark tests, but this one is a little different because it's a logic optimization instead of a physical one: we cannot really see the improvement by benchmarking a function. The change is made to antrea-controller, but it takes effect on antrea-agent, which should no longer receive the global AddressGroup and reconcile the rules when Pods are created or deleted in the cluster, thus saving CPU cycles and memory. Even if we set up a cluster to verify it, the overhead of receiving the AddressGroup and reconciling the rules can only be seen with tens of thousands of Pods and tens of such rules applied to a Node, which is not feasible at the moment. In the future maybe we could leverage kubemark to simulate a large-scale cluster in CI.

For this particular case, I think the unit tests are able to ensure that such policy rules are no longer converted to internal rules that include the "matchAllPods" group.

@tnqn (Member, Author) commented Aug 19, 2020

/test-all

@tnqn (Member, Author) commented Aug 19, 2020

/test-hw-offload
/test-windows-conformance

@tnqn tnqn merged commit f8acc2b into antrea-io:master Aug 20, 2020
antoninbas pushed a commit to antoninbas/antrea that referenced this pull request Aug 21, 2020
antoninbas pushed a commit that referenced this pull request Aug 21, 2020
GraysonWu pushed a commit to GraysonWu/antrea that referenced this pull request Sep 22, 2020