Add single stack IPv6 support in Flow aggregator and ELK flow collector #1819
Conversation
/test-all
/test-ipv6-only-e2e
return fmt.Errorf("resolved Flow Aggregator address %v is not supported", hostIPs[0])
}
// Update the collector address with resolved IP of flow aggregator
collectorAddr = net.JoinHostPort(hostIPs[0].String(), defaultIPFIXPort)
In the case of dual-stack, it seems we can pick either the IPv4 or the IPv6 address depending on the DNS lookup output. What if we pick the IPv6 address and the node network is IPv4? In that scenario, the flow exporter will not be able to connect to the flow aggregator. I believe IPv4-in-IPv6 and IPv6-in-IPv4 are possible in dual stack. @lzhecheng any input here?
Ideally, we should check the node network address format and pick the correct resolved address of the flow aggregator.
Thanks, makes sense to me. I will check the node network address format to determine whether to use the IPv4 or IPv6 address of the flow aggregator.
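For illustration, a minimal sketch of that approach in Go (the helper name and sample values are assumptions for this comment, not code from the PR): resolve the aggregator name, then keep only the resolved address whose family matches the node IP.

```go
// Hypothetical sketch: pick the resolved Flow Aggregator IP whose family
// (IPv4 vs IPv6) matches the node IP, so the exporter only dials an
// address it can actually reach.
package main

import (
	"fmt"
	"net"
)

func pickCollectorIP(hostIPs []net.IP, nodeIP net.IP) (net.IP, error) {
	nodeIsV4 := nodeIP.To4() != nil
	for _, ip := range hostIPs {
		if (ip.To4() != nil) == nodeIsV4 {
			return ip, nil
		}
	}
	return nil, fmt.Errorf("no resolved Flow Aggregator address matches the node IP family")
}

func main() {
	hostIPs := []net.IP{net.ParseIP("2001:db8::1"), net.ParseIP("10.0.0.5")}
	nodeIP := net.ParseIP("192.168.1.10") // IPv4 node network
	ip, err := pickCollectorIP(hostIPs, nodeIP)
	if err != nil {
		panic(err)
	}
	fmt.Println(net.JoinHostPort(ip.String(), "4739")) // 4739 is the standard IPFIX port
}
```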
Yes, they are possible in a dual-stack setup.
Checking both addresses and picking the "correct" one should be ok. However, I am not sure the case you describe is possible. I think in order to have dual-stack Services, the Nodes need to have dual-stack networking themselves.
BTW, it's still a bit weird to me that:
- we have a special case for `strings.Contains(collectorAddr, flowAggregatorDNSName)`. I thought we had removed that special case.
- the DNS lookup is not handled by the go-ipfix library
test/e2e/fixtures.go
Outdated
return nil, err, isIPv6
func setupTestWithIPFIXCollector(tb testing.TB) (*TestData, bool, error) {
var isIPv6 bool
if clusterInfo.podV6NetworkCIDR == "" {
networkPolicyOnly mode does not support dual stack. That has to be handled differently.
See this: https://github.com/vmware-tanzu/antrea/blob/e92ee627c2a218150935e41a62d43ed5a03e06b4/pkg/agent/config/node_config.go#L120
Thanks.
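A minimal sketch of how the fixture could make that decision, assuming it can see both Pod CIDRs (only podV6NetworkCIDR appears in the diff above; the podV4NetworkCIDR parameter is an assumption), with networkPolicyOnly mode left to the separate handling linked above:

```go
package e2esketch

// clusterIsIPv6Only treats the cluster as single-stack IPv6 only when an
// IPv6 Pod CIDR is configured and no IPv4 Pod CIDR is. In networkPolicyOnly
// mode neither CIDR is set, so that mode must be handled differently.
func clusterIsIPv6Only(podV4NetworkCIDR, podV6NetworkCIDR string) bool {
	return podV4NetworkCIDR == "" && podV6NetworkCIDR != ""
}
```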
Please make sure that the following CI tests pass:
Force-pushed from 1c8979d to 74cf81e
Codecov Report

@@            Coverage Diff             @@
##             main    #1819      +/-   ##
==========================================
+ Coverage   65.10%   67.57%   +2.46%
==========================================
  Files         195      197       +2
  Lines       16875    17217     +342
==========================================
+ Hits        10987    11634     +647
+ Misses       4725     4384     -341
- Partials     1163     1199      +36

Flags with carried forward coverage won't be shown.
test/e2e/fixtures.go
Outdated
if err := ensureAntreaRunning(tb, testData); err != nil {
return nil, err, isIPv6
func setupTestWithIPFIXCollector(tb testing.TB) (*TestData, bool, error) {
var isIPv6 bool
nit: Maybe you can set `isIPv6` to false and turn it to true in the if statement.
Thanks, addressed.
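The nit applied, as a standalone sketch (the fixture context is abbreviated into a single hypothetical helper):

```go
package e2esketch

// isIPv6Cluster starts from false and only flips to true inside the if
// statement, per the review suggestion above.
func isIPv6Cluster(podV6NetworkCIDR string) bool {
	isIPv6 := false
	if podV6NetworkCIDR != "" {
		isIPv6 = true
	}
	return isIPv6
}
```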
test/e2e/framework.go
Outdated
@@ -462,7 +462,8 @@ func (data *TestData) deployFlowAggregator(ipfixCollector string) (string, error
 }
 if rc, _, _, err = provider.RunCommandOnNode(controlPlaneNodeName(), fmt.Sprintf("kubectl -n %s rollout status deployment/%s --timeout=%v", flowAggregatorNamespace, flowAggregatorDeployment, 2*defaultTimeout)); err != nil || rc != 0 {
 	_, stdout, _, _ := provider.RunCommandOnNode(controlPlaneNodeName(), fmt.Sprintf("kubectl -n %s describe pod", flowAggregatorNamespace))
-	return stdout, fmt.Errorf("error when waiting for flow aggregator rollout to complete. kubectl describe output: %v", stdout)
+	_, logStdout, _, _ := provider.RunCommandOnNode(controlPlaneNodeName(), fmt.Sprintf("kubectl -n %s logs -l app=flow-aggregator", flowAggregatorNamespace))
+	return stdout, fmt.Errorf("error when waiting for flow aggregator rollout to complete. kubectl describe output: %v, logs: %v", stdout, logStdout)
Why use `%v` rather than `%s`, any specific purpose?
Thanks, addressed.
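For context on that question, a tiny runnable sketch: for string arguments both verbs print the same output, but `%s` states the intent that the argument is a string, which is presumably why it was preferred here.

```go
package main

import "fmt"

func main() {
	out := "flow aggregator rollout failed"
	fmt.Printf("%v\n", out) // prints: flow aggregator rollout failed
	fmt.Printf("%s\n", out) // prints the same; %s signals the argument is a string
}
```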
Force-pushed from d3c513c to 0ac0ab6
/test-ipv6-only-e2e
Force-pushed from 0ac0ab6 to 9c3b9c1
/test-ipv6-only-e2e
What is the impact for the collector / dashboards of using 2 different template sets (one for IPv4 and one for IPv6)?
@@ -289,6 +289,7 @@ function deliver_antrea {
 	docker tag "${DOCKER_REGISTRY}/antrea/sonobuoy-systemd-logs:v0.3" "sonobuoy/systemd-logs:v0.3"
 fi
 DOCKER_REGISTRY=${DOCKER_REGISTRY} make
+DOCKER_REGISTRY=${DOCKER_REGISTRY} make flow-aggregator-ubuntu
So until now the flow-aggregator image was pulled from the Docker registry (instead of building the image from the checked-out repo and using that one)?
For Jenkins, yes. I found this issue when debugging the Jenkins tests.
pkg/flowaggregator/flowaggregator.go
Outdated
if err != nil {
	fa.exportingProcess.CloseConnToCollector()
	fa.exportingProcess = nil
-	fa.templateID = 0
+	fa.templateIDv4 = 0
	return fmt.Errorf("sending template set failed, err: %v", err)
You may specify which template set failed in the error message (IPv4 / IPv6); same comment below.
Hi Antonin, thanks for your comments. Yes, in this PR we send two templates (one for IPv4, one for IPv6) regardless of the cluster configuration. This is the same method used when adding IPv6 support to the flow exporter, and it would be fine for the collector. However, it has issues in the e2e test on a dual-stack cluster, for example when the source IP is IPv6 and the destination is IPv4. I have talked with @lzhecheng and found that the current method doesn't support dual stack for the flow exporter either (the old e2e test for the flow exporter didn't cover that scenario, so the IPv6 flow exporter PR passed all tests). So in this PR we are focusing on IPv6-only support for the flow aggregator to catch up with the flow exporter, and will add dual-stack support for the flow exporter and aggregator in the future. How does this sound to you?
If source and destination are a mixture of IPv4 and IPv6, one template including both IP families should be used. It may introduce some extra effort to implement. So far, with 2 templates used, all-IPv4 or all-IPv6 traffic is supported in a dual-stack setup.
@dreamtalen @lzhecheng why don't we just use a single template from the get-go? We already have fields, such as […]. Also speaking of dual-stack, how easy will it be to create dashboards to show traffic distribution regardless of whether the traffic is v4 or v6?
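To make the two-template design concrete, here is a hedged sketch (all identifiers are illustrative stand-ins, not the go-ipfix API or the PR's actual code): the aggregator keeps one template ID per IP family and selects the matching one when preparing a data set for a record.

```go
package main

import (
	"fmt"
	"net"
)

// flowAggregator is a stand-in for the real struct; it keeps one template
// ID per IP family, mirroring the templateIDv4 field visible in the diff.
type flowAggregator struct {
	templateIDv4 uint16
	templateIDv6 uint16
}

// templateIDFor picks the template matching the record's source IP family.
func (fa *flowAggregator) templateIDFor(srcIP net.IP) uint16 {
	if srcIP.To4() != nil {
		return fa.templateIDv4
	}
	return fa.templateIDv6
}

func main() {
	fa := &flowAggregator{templateIDv4: 256, templateIDv6: 257}
	fmt.Println(fa.templateIDFor(net.ParseIP("10.0.0.1")))    // 256
	fmt.Println(fa.templateIDFor(net.ParseIP("2001:db8::1"))) // 257
}
```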
Force-pushed from f85671c to c1bc648
Latest code LGTM. Just want to make sure that @srikartati and / or @zyiou review the Kibana changes.
It would be better if you can add some screenshots of the changes to the Kibana dashboard (e.g. the overview and flow records dashboards).
Force-pushed from c71848f to 0014815
Sure, please take a look at the PR description.
LGTM.
With dual-stack support, we may have to add ipFamily as another filter along with the podname, podnamespace filters etc. in the Kibana dashboard to enable users to pick one family; otherwise both will be shown.
/test-all
/test-all
/test-ipv6-only-e2e
/test-ipv6-only-e2e
/test-ipv6-only-e2e
Force-pushed from 8426f87 to 66e6b5d
/test-all
/test-ipv6-only-e2e
/test-ipv6-only-e2e
LGTM.
Let's wait for PR #1959 to go in as it is a critical bug fix; that will also help in passing the e2e tests.
@@ -304,8 +319,10 @@ func exportLogs(tb testing.TB, data *TestData, logsSubDir string, writeNodeLogs
 }

 func teardownFlowAggregator(tb testing.TB, data *TestData) {
 	if err := testData.gracefulExitFlowAggregator(testOptions.coverageDir); err != nil {
 		tb.Fatalf("Error when gracefully exiting Flow Aggregator: %v", err)
+	if testOptions.enableCoverage {
Thanks for adding this missing code as suggested, so we do not fail for tests that don't use the flow aggregator coverage binary, like the IPv6 tests.
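As a sketch, the guard pattern being discussed (gracefulExit stands in for testData.gracefulExitFlowAggregator; the surrounding harness is abbreviated): the graceful-exit path only exists for the coverage binary, so it runs only when coverage is enabled.

```go
package e2esketch

import "testing"

// teardownFlowAggregatorSketch skips the coverage-only graceful exit when
// the tests run against the regular (non-coverage) binary, e.g. IPv6 tests.
func teardownFlowAggregatorSketch(tb testing.TB, enableCoverage bool, gracefulExit func() error) {
	if enableCoverage {
		if err := gracefulExit(); err != nil {
			tb.Fatalf("Error when gracefully exiting Flow Aggregator: %v", err)
		}
	}
}
```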
Force-pushed from 66e6b5d to 411316e
/test-all
Force-pushed from 411316e to 338e110
/test-all
/test-e2e
In this commit, we add single-stack IPv6 support in the flow aggregator and ELK flow collector. For dual stack, there are some issues in service name retrieval at the flow exporter; support will be added in a new PR.
Force-pushed from 338e110 to 003a619
/test-e2e
/test-ipv6-only-e2e
/test-ipv6-e2e
All the main tests seem to be passing. Will wait for the dual-stack tests as well, because they involve the IPv6 family.
In this commit, we add single-stack IPv6 support in the flow aggregator and ELK flow collector.
Some screenshots of the updated Kibana dashboard in an IPv6 cluster:
For dual stack, there are some issues in service name retrieval at the flow exporter; support will be added in a new PR.