Optimize processing of egress rules that have no named port #1100

Merged
merged 1 commit into antrea-io:master on Aug 20, 2020

Conversation

@tnqn (Member) commented Aug 18, 2020

If an egress rule allows all addresses but has no named port, it doesn't need the "matchAllPodsPeer" in its "To" Peer, which is used for named port resolution. Dropping it avoids the transmission of the "matchAllPods" AddressGroup and the reconciliation of the relevant rules.

For instance, the following rule allows all Pods in this namespace to access port 53 of all IP addresses.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: 53
      protocol: TCP
    - port: 53
      protocol: UDP

Before applying this patch, a global AddressGroup that contains all Pods in all Namespaces would be created, and the rule would be converted to:

- appliedToGroups:
  - c4c59cfe-9160-5de5-a85b-01a58d11963e
  name: allow-dns
  namespace: default
  rules:
  - direction: Out
    from: {}
    services:
    - port: "53"
      protocol: TCP
    - port: "53"
      protocol: UDP
    to:
      addressGroups:
      - eebc0ede-1d20-50a8-a9bd-eaa41b5954d5 # The AddressGroup that contains all Pods of the cluster.
      ipBlocks:
      - cidr: 0.0.0.0/0

As a result, the rule has to be reconciled whenever there's a Pod update event in the cluster, regardless of the Namespace and the Node the Pod belongs to. The overhead of handling such updates is considerable in a large-scale cluster (more than 30k Pods and 10k such rules).

After applying this patch, the "to" field of the converted rule contains only the ipBlock "0.0.0.0/0", so the rule is no longer affected by Pod events elsewhere in the cluster.

- appliedToGroups:
  - c4c59cfe-9160-5de5-a85b-01a58d11963e
  name: allow-dns
  namespace: default
  rules:
  - direction: Out
    from: {}
    services:
    - port: "53"
      protocol: TCP
    - port: "53"
      protocol: UDP
    to:
      ipBlocks:
      - cidr: 0.0.0.0/0
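
In code terms, the idea is roughly the following. This is a simplified, self-contained sketch; the type and helper names (Peer, Port, egressPeerSketch) are illustrative, not the actual Antrea identifiers:

package main

import "fmt"

// Simplified stand-ins for the controller's internal types; field and helper
// names are illustrative, not the actual Antrea API.
type Peer struct {
    AddressGroups []string
    IPBlocks      []string
}

type Port struct {
    Name   string // set when the policy references a port by name
    Number int32  // set when the policy references a port by number
}

// egressPeerSketch captures the idea of this change: when a rule's "to" list
// is empty (match all destinations), the "matchAllPods" AddressGroup is
// needed only if the rule uses a named port, because that group exists to
// resolve port names against Pods.
func egressPeerSketch(toIsEmpty bool, ports []Port, matchAllPodsGroupUID string) Peer {
    if !toIsEmpty {
        // Peers are explicitly specified; regular AddressGroup creation
        // happens elsewhere and is out of scope for this sketch.
        return Peer{}
    }
    peer := Peer{IPBlocks: []string{"0.0.0.0/0"}} // allow all destinations
    for _, p := range ports {
        if p.Name != "" {
            // A named port can only be resolved against Pods, so the
            // group containing all Pods must stay in the peer.
            peer.AddressGroups = append(peer.AddressGroups, matchAllPodsGroupUID)
            break
        }
    }
    return peer
}

func main() {
    // Numbered ports only: just the ipBlock, no matchAllPods group.
    fmt.Println(egressPeerSketch(true, []Port{{Number: 53}}, "all-pods-uid"))
    // Named port: the matchAllPods group is still required.
    fmt.Println(egressPeerSketch(true, []Port{{Name: "http"}}, "all-pods-uid"))
}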

@tnqn (Member, Author) commented Aug 18, 2020

/test-all

@antrea-bot (Collaborator) commented:
Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@tnqn (Member, Author) commented Aug 18, 2020

/test-all

@abdallahyas (Contributor) commented:
@tnqn, could I bother you to trigger the hw-offload CI again? There was a misconfiguration that caused the 404 error.

@tnqn (Member, Author) commented Aug 18, 2020

@AbdYsn done.

/test-hw-offload

@abdallahyas (Contributor) commented:
@tnqn Thanks.

allPodsGroupUID := n.createAddressGroupForCRD(matchAllPodsPeerCrd, cnp)
podsPeer := matchAllPeer
addressGroups = append(addressGroups, allPodsGroupUID)
podsPeer.AddressGroups = addressGroups
podsPeer.AddressGroups = append(addressGroups, allPodsGroupUID)
Contributor commented on the snippet above:

Do you think we can further optimize to include only Pods with the specified namedPorts, or at least not include Pods without any namedPorts?

Contributor commented:

+1 to this comment. Although maybe there is a trade off here, as we may end up with multiple large address groups instead of one large one?

Member (Author) commented:

Thanks for the suggestion. I thought about it but didn't see a clear benefit.
The trade-off is generic AddressGroups vs. AddressGroups specific to namedPorts. If we include only Pods with the specified namedPorts, or exclude Pods without any namedPorts, the AddressGroup becomes associated with a specific rule or a specific kind of rule, which has some drawbacks:

  1. More information needs to be added to AddressGroup to indicate its usage pertaining to namedPorts.
  2. Multiple groups with the same selectors need to be created and maintained.
  3. This works only for the match-all-addresses case, so it needs to be handled specially.

For instance, when a rule like the one below is created, we would create the matchAllPods AddressGroup and need to add a field like namedPorts to the AddressGroup struct, setting its value to []{"http"} so that only Pods that have a containerPort named http are included.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: http
      protocol: TCP
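
To make drawback 1 concrete, the internal group type would need something like the hypothetical NamedPorts field in the fragment below (the types are simplified stand-ins, not the actual Antrea structs, and no such field exists today):

// Simplified stand-ins for the internal group types; the NamedPorts field is
// hypothetical and only illustrates the extra state drawback 1 refers to.
type GroupSelector struct {
    Namespace         string
    PodSelector       map[string]string
    NamespaceSelector map[string]string
}

type AddressGroup struct {
    UID      string
    Name     string
    Selector GroupSelector
    // NamedPorts would restrict membership to Pods exposing container ports
    // with these names, tying the group to one specific kind of rule.
    NamedPorts []string // e.g. []string{"http"}
}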

When another rule like the one below is created, which allows access to all Pods but no other IP addresses, it cannot share the "matchAllPods" AddressGroup because that group doesn't contain Pods without namedPorts.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http-and-22
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: http
      protocol: TCP
    - port: 22
      protocol: TCP
    to:
    - namespaceSelector: {}

Similarly, for normal AddressGroups, a rule may allow other numbered ports even if the Pod doesn't have named ports, so the optimization doesn't apply there. We would need to handle the "matchAllPods" AddressGroup and normal AddressGroups differently, which means yet another flag to indicate it.

Besides, from the perspective of use cases, I think it's unusual to have a rule that allows access to Pods that don't have the named port it specifies; otherwise the rule wouldn't achieve its purpose. So in most cases there shouldn't be many Pods that could be filtered out.

Contributor commented:

If we want to include only Pods with the specified namedPorts, one proposal is to add the capability for addressGroups to select Pods with a certain namedPort. Right now an addressGroup selector is only defined by namespaceSelector and podSelector (externalEntitySelector soon). With the new 'namedPort selector', we could do the namedPort resolution in filterAddressGroupsForPod(). However, I believe this fundamentally changes how namedPort resolution is done right now and would require quite some effort? @tnqn

Another thing that confuses me: if there is a named port in an egress rule, why do we append 0.0.0.0/24? This essentially says that, for out-of-cluster traffic, we simply ignore the namedPort rather than interpreting it as resolving to nothing. Is this the expected semantics?

Member (Author) commented:

Looks like GitHub sorts comments based on "drafted time" instead of published time (mine is actually later than @Dyanngg's comment).

@Dyanngg #1100 (comment) should answer your first question; at the moment I don't feel it's worth making this change, given the reasons listed there.

For the second question, I think you mean "0.0.0.0/0".

	// List of destinations for outgoing traffic of pods selected for this rule.
	// Items in this list are combined using a logical OR operation. If this field is
	// empty or missing, this rule matches all destinations (traffic not restricted by
	// destination). If this field is present and contains at least one item, this rule
	// allows traffic only if the traffic matches at least one item in the to list.
	// +optional
	To []NetworkPolicyPeer `json:"to,omitempty" protobuf:"bytes,2,rep,name=to"`

When "To" is not specified, it means allowing all destinations, so "0.0.0.0/0" is a must, we prepend the "matchAllPods" Group is to support named port case, see this test case https://github.com/kubernetes/kubernetes/blob/master/test/e2e/network/network_policy.go#L691.
If we don't do it, the named port would be ignored as "0.0.0.0/0" cannot resolve any port name, but apparently it's expected that the name should be converted number if a Pod is able to.
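
For illustration, the allow-http example earlier in this thread (no "to", only a named port) would be converted to an internal rule shaped roughly like the "before" example in the PR description; the AddressGroup identifier below is just a placeholder:

- direction: Out
  from: {}
  services:
  - port: http
    protocol: TCP
  to:
    addressGroups:
    - <UID of the AddressGroup containing all Pods> # needed to resolve "http" per Pod
    ipBlocks:
    - cidr: 0.0.0.0/0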

Contributor commented:

We discussed including NamedPorts in the group definition before. I know it would add much complexity and probably needs more thinking.

For a global group to include all Pods without any NamedPort, I still feel it is doable, though yes, we would have to handle all the different cases and judge which cases can use this group. I think it also depends on what kinds of rules are more common (e.g. whether rules likely contain both NamedPorts and un-named ports). If we are not clear on these for now, let's see if we can get more feedback later.

Contributor commented:

In general I feel that named ports in network policies are not a highly used feature, so adding the additional complexity to handle them is probably not worth doing. But as Jianjun says, we need to see how people using Antrea use network policies, so we can decide on a better approach when we have more data.

Member (Author) commented:

@jianjuns @abhiraut sure, let's consider this when there are cases proving it's worth it.

jianjuns previously approved these changes Aug 18, 2020
abhiraut previously approved these changes Aug 18, 2020
antoninbas previously approved these changes Aug 18, 2020
@antoninbas (Contributor) left a comment:

LGTM

I wonder if we should consider adding a benchmark test every time we run into an "edge" case like this one and we fix it. This could beef up our test suite and help prevent performance regressions in the future.

pkg/controller/networkpolicy/clusternetworkpolicy.go (outdated review thread, resolved)

pkg/controller/networkpolicy/networkpolicy_controller.go (outdated review thread, resolved)
If an egress rule allows all addresses but has no named port, it doesn't
need to have the "matchAllPodsPeer" in its "To" Peer which is used for
named port resolving. This could avoid the transmission of the
"matchAllPods" AddressGroup and the reconciliation of the relevant
rules.
@tnqn tnqn dismissed stale reviews from antoninbas, abhiraut, and jianjuns via 416e543 August 19, 2020 07:52
@tnqn (Member, Author) commented Aug 19, 2020

I wonder if we should consider adding a benchmark test every time we run into an "edge" case like this one and we fix it. This could beef up our test suite and help prevent performance regressions in the future.

Thanks for suggesting it. I'd love to add benchmark tests, but this one is a little different because it's a logic optimization instead of a physical one: we cannot really see the improvement by benchmarking a function. The change is made to antrea-controller, but it takes effect on antrea-agent, which should no longer receive the global AddressGroup and reconcile the rules when Pods are created or deleted in the cluster, thus saving CPU cycles and memory. Even if we set up a cluster to verify it, the overhead of receiving the AddressGroup and reconciling the rules can only be seen with tens of thousands of Pods and tens of such rules applied to a Node, which is not feasible at the moment. In the future maybe we could leverage kubemark to simulate a large-scale cluster in CI.

For this particular case, I think the unit tests are able to ensure that such policy rules are no longer converted to internal rules that include the "matchAllPods" group.

@tnqn (Member, Author) commented Aug 19, 2020

/test-all

@tnqn (Member, Author) commented Aug 19, 2020

/test-hw-offload
/test-windows-conformance

@tnqn tnqn merged commit f8acc2b into antrea-io:master Aug 20, 2020
antoninbas pushed a commit to antoninbas/antrea that referenced this pull request Aug 21, 2020
antoninbas pushed a commit that referenced this pull request Aug 21, 2020
GraysonWu pushed a commit to GraysonWu/antrea that referenced this pull request Sep 22, 2020