Automated cherry pick of #5739: Store NetworkPolicy in filesystem as fallback data source #5777: Enable Pod network after realizing initial NetworkPolicies #5795: Support Local ExternalTrafficPolicy for Services with #5850


tnqn
Member

@tnqn tnqn commented Jan 8, 2024

Cherry pick of #5739 #5777 #5795 on release-1.12.

#5739: Store NetworkPolicy in filesystem as fallback data source
#5777: Enable Pod network after realizing initial NetworkPolicies
#5795: Support Local ExternalTrafficPolicy for Services with

For details on the cherry pick process, see the cherry pick requests page.

@tnqn tnqn added the kind/cherry-pick Categorizes issue or PR as related to the cherry-pick of a bug fix from the main branch to a release label Jan 8, 2024
tnqn added 2 commits January 8, 2024 16:02
In the previous implementation, traffic from/to a Pod may bypass
NetworkPolicies applied to the Pod in a time window when the agent
restarts because realizing NetworkPolicies and enabling forwarding are
asynchronous.

This patch stores NetworkPolicy data in files when they are received,
and makes antrea-agent fall back to using the files as its data source
if it can't connect to antrea-controller on startup. This prevents a
security regression: a NetworkPolicy that has been realized on a Node
will continue to work even if antrea-controller is not available after
antrea-agent restarts.

The benchmark results for the file store's operations are as follows:

BenchmarkFileStoreAddNetworkPolicy-40              70383             16102 ns/op             520 B/op          9 allocs/op
BenchmarkFileStoreAddAppliedToGroup-40             45382             25880 ns/op            3019 B/op          9 allocs/op
BenchmarkFileStoreAddAddressGroup-40                7400            180000 ns/op           49538 B/op          9 allocs/op
BenchmarkFileStoreReplaceAll-40                       10         127088004 ns/op        17815943 B/op      33099 allocs/op

The disk usage when storing the 1k NetworkPolicies, AddressGroups, and
AppliedToGroups created by BenchmarkFileStoreReplaceAll is as follows:

16M     /var/run/antrea-test/file-store/address-groups
4.0M    /var/run/antrea-test/file-store/applied-to-groups
4.0M    /var/run/antrea-test/file-store/network-policies

Signed-off-by: Quan Tian <qtian@vmware.com>
Pod network should only be enabled after realizing initial
NetworkPolicies, otherwise traffic from/to Pods could bypass
NetworkPolicy when antrea-agent restarts.

After commit f9fc979 ("Store NetworkPolicy in filesystem as
fallback data source"), antrea-agent can realize either the latest
NetworkPolicies received from antrea-controller or the ones loaded from
the filesystem as a fallback. Therefore, waiting for NetworkPolicies to
be realized should not add a noticeable delay or make antrea-controller
a point of failure for the Pod network.

This commit adds an implementation of a wait group capable of waiting
with a timeout, and uses it to wait for common initialization and
NetworkPolicy realization before installing any flows for Pods. More
preconditions can be added via the wait group if needed in the future.

Signed-off-by: Quan Tian <qtian@vmware.com>
@tnqn tnqn force-pushed the automated-cherry-pick-of-#5739-#5777-#5795-upstream-release-1.12 branch from 798c1c3 to d1a97db Compare January 8, 2024 08:03
tnqn added 2 commits January 8, 2024 16:35
Since K8s 1.29, setting Local ExternalTrafficPolicy for ClusterIP
Services with ExternalIPs is supported.

Signed-off-by: Quan Tian <qtian@vmware.com>
cniServer.reconcile() now installs flows asynchronously.

Signed-off-by: Quan Tian <qtian@vmware.com>
@tnqn tnqn force-pushed the automated-cherry-pick-of-#5739-#5777-#5795-upstream-release-1.12 branch from d1a97db to f978fc4 Compare January 8, 2024 08:38
@tnqn
Member Author

tnqn commented Jan 8, 2024

/test-all
/test-ipv6-all
/test-ipv6-only-all
/test-windows-all

@tnqn tnqn requested a review from luolanzone January 9, 2024 03:10
@tnqn tnqn merged commit 301153b into antrea-io:release-1.12 Jan 10, 2024
52 of 58 checks passed
@tnqn tnqn deleted the automated-cherry-pick-of-#5739-#5777-#5795-upstream-release-1.12 branch January 10, 2024 06:22
Labels
kind/cherry-pick Categorizes issue or PR as related to the cherry-pick of a bug fix from the main branch to a release
2 participants