Decode flowType in Kibana dashboard #2102

Merged (1 commit) on May 3, 2021

zyiou
Contributor

@zyiou zyiou commented Apr 16, 2021

This commit decodes flowType from uint8 to string and adds flowType
as a filter in the Kibana dashboard. It adds a Pod-to-External dashboard
to visualize flows with type 'To External' and supports the flowType
filter. It also extends the number of options shown in the filter and
updates the corresponding visibility doc.

fixes #2056
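
As a rough illustration of the decoding described above, the sketch below maps the numeric flowType to a display string. The names `FLOW_TYPE_NAMES` and `decode_flow_type` are hypothetical, and the values 1 and 2 are assumptions; only `3` for 'To External' is backed by the flow record shown later in this thread. It is not the dashboard's actual implementation.

```python
# Hypothetical lookup table. Only 3 -> "To External" is confirmed by the
# flow record in this conversation; 1 and 2 are assumed values.
FLOW_TYPE_NAMES = {
    1: "Intra Node",   # assumed value for FlowTypeIntraNode
    2: "Inter Node",   # assumed value for FlowTypeInterNode
    3: "To External",  # matches "flowType: 3" on a Pod-to-External record
}


def decode_flow_type(value: int) -> str:
    """Return a readable name for a numeric flowType, keeping unknown values visible."""
    return FLOW_TYPE_NAMES.get(value, f"Unknown ({value})")


print(decode_flow_type(3))
```

Falling back to an "Unknown (n)" string rather than raising keeps records with unexpected values filterable in a dashboard.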

Changes in Pod-to-Pod Flow dashboard:
flow-visualization-pod-to-pod-1

Pod-to-External Flow dashboard:
flow-visualization-pod-to-external-1
flow-visualization-pod-to-external-2

Changes in Flow Record dashboard:
flow-visualization-flow-record

@codecov-commenter

codecov-commenter commented Apr 16, 2021

Codecov Report

Merging #2102 (60d9490) into main (d258bbc) will increase coverage by 19.81%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #2102       +/-   ##
===========================================
+ Coverage   41.41%   61.22%   +19.81%     
===========================================
  Files         131      269      +138     
  Lines       16502    20453     +3951     
===========================================
+ Hits         6834    12523     +5689     
+ Misses       9084     6634     -2450     
- Partials      584     1296      +712     
Flag Coverage Δ
kind-e2e-tests 52.07% <ø> (?)
unit-tests 41.39% <ø> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/apiserver/handlers/endpoint/handler.go 58.82% <0.00%> (-11.77%) ⬇️
pkg/apiserver/handlers/webhook/mutation_labels.go 24.71% <0.00%> (ø)
...versions/security/v1alpha1/clusternetworkpolicy.go 64.28% <0.00%> (ø)
...et/versioned/typed/system/v1beta1/system_client.go 45.45% <0.00%> (ø)
pkg/agent/flowexporter/flowrecords/flow_records.go 78.43% <0.00%> (ø)
pkg/legacyapis/core/v1alpha2/register.go 80.00% <0.00%> (ø)
pkg/controller/types/networkpolicy.go 100.00% <0.00%> (ø)
...ormers/externalversions/core/v1alpha2/interface.go 100.00% <0.00%> (ø)
pkg/apiserver/registry/networkpolicy/util.go 100.00% <0.00%> (ø)
pkg/agent/cniserver/ipam/ipam_delegator.go 48.83% <0.00%> (ø)
... and 226 more

Member

@srikartati srikartati left a comment

Thanks for the PR. Will there be a new dashboard called Pod-To-External flows like Pod-To-Pod flows and Pod-to-Service flows?

@zyiou
Contributor Author

zyiou commented Apr 20, 2021

Thanks for the PR. Will there be a new dashboard called Pod-To-External flows like Pod-To-Pod flows and Pod-to-Service flows?

Do we need an extra dashboard? Currently, I have put a flow type filter in the Pod-To-Pod flows and Pod-to-Service flows dashboards so that users can view the Pod-to-External flows from the diagrams in these two dashboards.

@srikartati
Member

Do we need an extra dashboard? Currently, I have put a flow type filter in the Pod-To-Pod flows and Pod-to-Service flows dashboards so that users can view the Pod-to-External flows from the diagrams in these two dashboards.

I prefer a separate dashboard for Pod-To-External flows as they are different from Pod-To-Pod flows and Pod-To-Service flows. Any opinion here? @antoninbas @jianjuns

@antoninbas
Contributor

If we are going to keep the name "Pod-to-Pod ..." for the diagram, then it makes sense to not include the Pod-to-External flows in it.
I'm ok with keeping a single diagram if we rename it to "Pod traffic cumulative bytes" or something along these lines. That may be the best solution actually if we want to be able to visualize all the traffic from a given Pod in a single diagram. And of course one can always filter based on flow types (and even combine several flow types as the filter).

BTW, can someone validate Flow Export when using the Egress feature and using a SNAT IP assigned to a different Node for Pod-to-External traffic?

@srikartati
Member

If we are going to keep the name "Pod-to-Pod ..." for the diagram, then it makes sense to not include the Pod-to-External flows in it.
I'm ok with keeping a single diagram if we rename it to "Pod traffic cumulative bytes" or something along these lines. That may be the best solution actually if we want to be able to visualize all the traffic from a given Pod in a single diagram. And of course one can always filter based on flow types (and even combine several flow types as the filter).

I agree with the latter option, but we would probably need to merge the existing Pod-To-Pod flows and Pod-To-Service flows dashboards and then capture everything in one dashboard through new filters. I feel it is probably easier to create an extra dashboard for Pod-To-External flows. @zyiou would be the best judge of this.

BTW, can someone validate Flow Export when using the Egress feature and using a SNAT IP assigned to a different Node for Pod-to-External traffic?

I tried with the following egress policy.

apiVersion: crd.antrea.io/v1alpha2
kind: Egress
metadata:
  name: egress-kafka
spec:
  appliedTo:
    namespaceSelector:
      matchLabels:
        app: flow-aggregator
  egressIP: 192.168.77.100

I can see traffic being tunneled to a different node before leaving the cluster, but we do not see all the connections being exported by the Antrea Flow Exporter, as we consider only the ones in Antrea zones. The following connection is exported to the flow aggregator, and then to the flow collector. The flow type is correctly tagged, but there is no network policy info because there are no conntrack labels in the polled connection. We need to check whether we are adding labels for the egress policy or whether those are available in a different connection.

DATA RECORD-0:
   flowStartSeconds: 1618963295 
   flowEndSeconds: 1618963301 
   flowEndReason: 3 
   sourceTransportPort: 47470 
   destinationTransportPort: 9092 
   protocolIdentifier: 6 
   packetTotalCount: 4 
   octetTotalCount: 180 
   packetDeltaCount: 4 
   octetDeltaCount: 180 
   sourceIPv4Address: 10.10.1.20 
   destinationIPv4Address: 10.186.149.178 
   reversePacketTotalCount: 3 
   reverseOctetTotalCount: 124 
   reversePacketDeltaCount: 3 
   reverseOctetDeltaCount: 124 
   sourcePodName: flow-aggregator-6b6c55cb9c-cql55 
   sourcePodNamespace: flow-aggregator 
   sourceNodeName: k8s-node-worker-1 
   destinationPodName:  
   destinationPodNamespace:  
   destinationNodeName:  
   destinationServicePort: 0 
   destinationServicePortName:  
   ingressNetworkPolicyName:  
   ingressNetworkPolicyNamespace:  
   egressNetworkPolicyName:  
   egressNetworkPolicyNamespace:  
   tcpState: TIME_WAIT 
   flowType: 3 
   destinationClusterIPv4: 0.0.0.0 
   originalExporterIPv4Address: 10.10.1.1 
   originalObservationDomainId: 2294339213 
   octetDeltaCountFromSourceNode: 180 
   octetTotalCountFromSourceNode: 180 
   packetDeltaCountFromSourceNode: 4 
   packetTotalCountFromSourceNode: 4 
   reverseOctetDeltaCountFromSourceNode: 124 
   reverseOctetTotalCountFromSourceNode: 124 
   reversePacketDeltaCountFromSourceNode: 3 
   reversePacketTotalCountFromSourceNode: 3 
   octetDeltaCountFromDestinationNode: 180 
   octetTotalCountFromDestinationNode: 180 
   packetDeltaCountFromDestinationNode: 4 
   packetTotalCountFromDestinationNode: 4 
   reverseOctetDeltaCountFromDestinationNode: 124 
   reverseOctetTotalCountFromDestinationNode: 124 
   reversePacketDeltaCountFromDestinationNode: 3 
   reversePacketTotalCountFromDestinationNode: 3 
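
To check a dump like the one above programmatically, a minimal sketch could parse the `name: value` lines into a dictionary and test whether the policy fields are empty. The helpers `parse_record` and `missing_policy_info` are hypothetical names introduced for illustration, and the sample is a shortened copy of the record above.

```python
def parse_record(dump: str) -> dict:
    """Parse 'name: value' lines from a collector dump into a dict of strings."""
    fields = {}
    for line in dump.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Skip blank lines and the "DATA RECORD-n:" header line.
        if not sep or not key or key.startswith("DATA RECORD"):
            continue
        fields[key] = value.strip()
    return fields


def missing_policy_info(fields: dict) -> bool:
    """True when neither ingress nor egress NetworkPolicy names are populated."""
    policy_keys = ("ingressNetworkPolicyName", "egressNetworkPolicyName")
    return all(not fields.get(key) for key in policy_keys)


# Shortened copy of the record above.
sample = """DATA RECORD-0:
   sourcePodName: flow-aggregator-6b6c55cb9c-cql55
   destinationPodName:
   ingressNetworkPolicyName:
   egressNetworkPolicyName:
   flowType: 3
"""

record = parse_record(sample)
print(record["flowType"], missing_policy_info(record))
```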

@antoninbas
Contributor

@srikartati For Egress here is what I would expect:

  1. the Source Node sends a flow record (Pod-to-External) to the Flow Aggregator
  2. the Egress Node does not send a flow record to the Flow Aggregator - or if it does, the Flow Aggregator may do de-dup
  3. we may want to add a new IE with the Egress information (Egress IP?)
  4. the Egress NetworkPolicy information should be populated correctly. I am really surprised that this is not the case already. Did you have an Antrea-native policy with an egress rule applied on the source Pod and allowing traffic?
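
The de-dup mentioned in point 2 could be sketched as follows, assuming records from the Source Node and the Egress Node share the same 5-tuple. This is only an illustration under that assumption, not the Flow Aggregator's actual logic; `FLOW_KEY_FIELDS`, `flow_key`, and `dedup_records` are hypothetical names, and the field names are taken from the record dump earlier in this thread.

```python
# Assumed flow key: the 5-tuple, using field names from the record dump above.
FLOW_KEY_FIELDS = (
    "sourceIPv4Address",
    "destinationIPv4Address",
    "protocolIdentifier",
    "sourceTransportPort",
    "destinationTransportPort",
)


def flow_key(record: dict) -> tuple:
    """Build a hashable key from the 5-tuple fields of a flow record."""
    return tuple(record[name] for name in FLOW_KEY_FIELDS)


def dedup_records(records: list) -> list:
    """Keep the first record seen for each flow key, preserving arrival order."""
    first_seen = {}
    for record in records:
        first_seen.setdefault(flow_key(record), record)
    return list(first_seen.values())


base = {
    "sourceIPv4Address": "10.10.1.20",
    "destinationIPv4Address": "10.186.149.178",
    "protocolIdentifier": 6,
    "sourceTransportPort": 47470,
    "destinationTransportPort": 9092,
}
from_source_node = {**base, "sourceNodeName": "k8s-node-worker-1"}
from_egress_node = {**base, "sourceNodeName": ""}
print(len(dedup_records([from_source_node, from_egress_node])))
```

Keeping the first record is an arbitrary choice here; a real aggregator would more likely merge fields from both records.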

@zyiou
Contributor Author

zyiou commented Apr 27, 2021

I agree with the latter option, but we would probably need to merge the existing Pod-To-Pod flows and Pod-To-Service flows dashboards and then capture everything in one dashboard through new filters. I feel it is probably easier to create an extra dashboard for Pod-To-External flows. @zyiou would be the best judge of this.

The workload of these two options seems similar to me. I have a question about the first option.
(1) merge Pod-to-Pod and Pod-to-Service dashboards
In this case, we need to move all the diagrams and graphs from the Pod-to-Service dashboard to the Pod-to-Pod dashboard, right? Would it be too crowded (about 16 graphs in one dashboard)?

@antoninbas
Contributor

I think Pod-to-External and Pod-to-Service are orthogonal:

  • A Service can have an external endpoint
  • Pod-to-Service is not one of the flow types (FlowTypeIntraNode, FlowTypeInterNode, FlowTypeToExternal)
    So it makes sense to me that we have a dedicated Pod-to-Service graph.

@zyiou
Contributor Author

zyiou commented Apr 27, 2021

Got it. Then combining Pod-to-Pod and Pod-to-External traffic into one dashboard makes sense to me.
Updated the dashboard with the following changes (see screenshot in PR description):

  1. Pod-to-Pod Flow dashboard -> Pod-to-Pod/External Flow dashboard
  2. diagram names are changed to: Pod Traffic Cumulative Bytes and Pod Traffic Reverse Cumulative Bytes
  3. destination pod name of Pod-to-External traffic is replaced by destination IP
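
The substitution in item 3 can be pictured with a small sketch: use the destination Pod name when it is set, otherwise fall back to the destination IP. The function name `destination_label` and the final "N/A" fallback are assumptions for illustration; the field names come from the record dump earlier in this thread, and the dashboard itself may implement this differently.

```python
def destination_label(record: dict) -> str:
    """Pick a label for the destination side of a flow in a diagram."""
    pod_name = record.get("destinationPodName", "")
    if pod_name:
        return pod_name
    # Pod-to-External flows carry no destination Pod, so use the IP instead.
    return record.get("destinationIPv4Address") or "N/A"


external_flow = {
    "destinationPodName": "",
    "destinationIPv4Address": "10.186.149.178",
}
print(destination_label(external_flow))
```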

@srikartati
Member

@srikartati For Egress here is what I would expect:

  1. the Source Node sends a flow record (Pod-to-External) to the Flow Aggregator
  2. the Egress Node does not send a flow record to the Flow Aggregator - or if it does, the Flow Aggregator may do de-dup
  3. we may want to add a new IE with the Egress information (Egress IP?)
  4. the Egress NetworkPolicy information should be populated correctly. I am really surprised that this is not the case already. Did you have an Antrea-native policy with an egress rule applied on the source Pod and allowing traffic?

Thanks for the clarification. I re-tested it and made sure the egress/SNAT policy is installed properly. Here, the egress node IP is 192.168.77.100.
I see the following conntrack flow on the source Node, which gets exported to Flow Aggregator.

vagrant@k8s-node-worker-1:~$ sudo conntrack -L  | grep 9092
tcp      6 86393 ESTABLISHED src=10.10.1.79 dst=10.186.149.122 sport=53090 dport=9092 packets=86 bytes=9345 src=10.186.149.122 dst=10.10.1.79 sport=9092 dport=53090 packets=94 bytes=13313 [ASSURED] mark=0 zone=65520 delta-time=546 use=1

And the following conntrack flows on the egress Node:

vagrant@k8s-node-control-plane:~$ sudo conntrack -L | grep 9092
tcp      6 86397 ESTABLISHED src=10.10.1.79 dst=10.186.149.122 sport=53090 dport=9092 packets=74 bytes=8045 src=10.186.149.122 dst=192.168.77.100 sport=9092 dport=53090 packets=78 bytes=12233 [ASSURED] mark=0 delta-time=453 use=1
tcp      6 86397 ESTABLISHED src=10.10.1.79 dst=10.186.149.122 sport=53090 dport=9092 packets=74 bytes=8045 src=10.186.149.122 dst=10.10.1.79 sport=9092 dport=53090 packets=78 bytes=12233 [ASSURED] mark=0 zone=65520 delta-time=453 use=1

Here, one of them is in the Antrea conntrack zone and the other one is in the default zone, which is ignored by the flow exporter.
The one in the Antrea conntrack zone gets exported. It is the same flow record, and this needs to be correlated and aggregated in the flow aggregator. I will open an issue for this.

Regarding adding the egress IP to the flow record as a new IE: we need the ignored conntrack flow to do that. We will consider this enhancement when supporting External-To-Pod flows in the flow exporter.
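
The zone-based selection described above can be sketched by reading the `zone=` field of each `conntrack -L` line. This is a minimal illustration, not the flow exporter's code: `ANTREA_CT_ZONE` is taken from the entries shown above, lines without a `zone=` field are assumed to be in the default zone (treated as 0), and the helper names are hypothetical. The reply-direction `dst=` of the default-zone entry is where the SNAT'd egress IP shows up.

```python
import re

ANTREA_CT_ZONE = 65520  # zone value seen in the conntrack entries above


def conntrack_zone(entry: str) -> int:
    """Return the zone of a conntrack line, or 0 when no zone= field is printed."""
    match = re.search(r"\bzone=(\d+)", entry)
    return int(match.group(1)) if match else 0


def reply_destination(entry: str) -> str:
    """Return the dst= of the reply direction (the second dst= on the line)."""
    return re.findall(r"\bdst=(\S+)", entry)[1]


# The two entries from the egress Node, copied from the output above.
entries = [
    "tcp 6 86397 ESTABLISHED src=10.10.1.79 dst=10.186.149.122 sport=53090 "
    "dport=9092 packets=74 bytes=8045 src=10.186.149.122 dst=192.168.77.100 "
    "sport=9092 dport=53090 packets=78 bytes=12233 [ASSURED] mark=0 "
    "delta-time=453 use=1",
    "tcp 6 86397 ESTABLISHED src=10.10.1.79 dst=10.186.149.122 sport=53090 "
    "dport=9092 packets=74 bytes=8045 src=10.186.149.122 dst=10.10.1.79 "
    "sport=9092 dport=53090 packets=78 bytes=12233 [ASSURED] mark=0 "
    "zone=65520 delta-time=453 use=1",
]

exported = [e for e in entries if conntrack_zone(e) == ANTREA_CT_ZONE]
ignored = [e for e in entries if conntrack_zone(e) != ANTREA_CT_ZONE]
print(len(exported), reply_destination(ignored[0]))
```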

@srikartati
Member

  1. Pod-to-Pod Flow dashboard -> Pod-to-Pod/External Flow dashboard

Hi @zyiou,
I looked at the new dashboard. Two questions:

  1. What do filters destinationPodname, destinationPodnamespace, and destinationPodNode signify in the case of Pod-To-External flows?
  2. Why don't we support flow key filter for Pod-To-External flows?

@zyiou
Contributor Author

zyiou commented Apr 29, 2021

  1. Pod-to-Pod Flow dashboard -> Pod-to-Pod/External Flow dashboard

Hi @zyiou,
I looked at the new dashboard. Two questions:

  1. What do filters destinationPodname, destinationPodnamespace, and destinationPodNode signify in the case of Pod-To-External flows?
  2. Why don't we support flow key filter for Pod-To-External flows?

For Pod-To-External flows, destinationPodname, destinationPodnamespace, and destinationPodNode will be empty strings. Basically, this indicates that the information is not available. We can replace the empty string with N/A if you think that is clearer.
We do support flowKey for Pod-to-External flows; I forgot to change the names of the filter. If you are not seeing the expected flow key, that is because by default the number of options shown in the option list is limited to 10. Fixed that in the latest commit.

@srikartati
Member

  1. Pod-to-Pod Flow dashboard -> Pod-to-Pod/External Flow dashboard

Hi @zyiou,
I looked at the new dashboard. Two questions:

  1. What do filters destinationPodname, destinationPodnamespace, and destinationPodNode signify in the case of Pod-To-External flows?
  2. Why don't we support flow key filter for Pod-To-External flows?

For Pod-To-External flows, destinationPodname, destinationPodnamespace, and destinationPodNode will be empty strings. Basically, this indicates that the information is not available. We can replace the empty string with N/A if you think that is clearer.
We do support flowKey for Pod-to-External flows; I forgot to change the names of the filter. If you are not seeing the expected flow key, that is because by default the number of options shown in the option list is limited to 10. Fixed that in the latest commit.

As a user, I feel the dashboard looks complicated, since some filters apply to one flow type and others do not. I think the primary goal for a UI is that it should be simple to navigate and understand. If it is not much work, I prefer a separate dashboard, as it is much more straightforward.

@zyiou
Contributor Author

zyiou commented Apr 30, 2021

As a user, I feel the dashboard looks complicated, since some filters apply to one flow type and others do not. I think the primary goal for a UI is that it should be simple to navigate and understand. If it is not much work, I prefer a separate dashboard, as it is much more straightforward.

Got it. Added a separate Pod-to-External dashboard. See the updated screenshots in the PR description. Thanks!

@srikartati
Member

Got it. Added a separate Pod-to-External dashboard. See the updated screenshots in the PR description. Thanks!

LGTM. Thanks for making the changes.

@zyiou zyiou requested a review from antoninbas April 30, 2021 22:55
Contributor

@antoninbas antoninbas left a comment

So it seems there is no support to check all the traffic leaving a specific Pod (Pod-to-Pod and Pod-to-External) in a single graph. Is that correct? Not that I feel strongly about this, we can always adjust in the future based on feedback...

@zyiou
Contributor Author

zyiou commented Apr 30, 2021

So it seems there is no support to check all the traffic leaving a specific Pod (Pod-to-Pod and Pod-to-External) in a single graph. Is that correct? Not that I feel strongly about this, we can always adjust in the future based on feedback...

For checking Pod aggregated traffic, we have two graphs (check the 3rd screenshot of the Pod-to-Pod dashboard): https://github.com/vmware-tanzu/antrea/blob/main/docs/network-flow-visibility.md#pod-to-pod-flows. I put these two graphs in both the Pod-to-Pod flow dashboard and the Pod-to-External flow dashboard for users to check. There is no Sankey diagram support for source Pod aggregated traffic.

@srikartati
Member

This is a Kibana dashboard change, so it does not affect any e2e tests.
/skip-all

@srikartati srikartati merged commit 2dda55c into antrea-io:main May 3, 2021
@zyiou zyiou added the area/flow-visibility (Issues or PRs related to flow visibility support in Antrea) and area/flow-visibility/elk (Issues or PRs related to the reference ELK configuration for flow visualization) labels on Jun 9, 2021
Successfully merging this pull request may close these issues.

Can the Kibana dashboard decode the flowType to a string?
5 participants