Tracking "not allowed"/"denied" connections at Antrea Agent #1004
Labels
area/monitoring/traffic-analysis
kind/design
Describe what you are trying to solve
The connection tracking table only holds "allowed" connections, i.e. those matched through network policy rules. We do not know which connections are dropped or denied, nor the statistics associated with them. Tracking these connections would be useful for the flow visibility feature, which could then give the user an account of denied connections. Specifically, the flow exporter could export these flows along with the ones in the conntrack table.
In addition, this can be used by network policy metrics (Issue #985) to track metrics for "not allowed"/"denied" connections. Without such tracking, even with a specific "deny" rule in ClusterNetworkPolicy, the number of denied connections/sessions would be counted as equal to the number of dropped packets, which is inaccurate.
Describe the solution you have in mind
We will add the "packetIn" action to the OVS flow rules where packets are dropped. For K8s network policy, this action will not be tied to any specific rule, whereas for cluster network policy, it will be specific to the rule when a deny rule is present. packetIn infra: https://github.com/vmware-tanzu/antrea/blob/a858c9ad25c215418003e4cb92a68ec62c7fddbf/pkg/agent/openflow/packetin.go
With the help of packetIn, we maintain a map/store to keep track of the "not allowed"/"denied" connections.
More details are below.
Describe how your solution impacts user flows
This solution does not add any user-facing feature. The flow exporter and other features could benefit from it.
Describe the main design/architecture of your solution
We will have a map/store to hold these "not allowed"/"denied" connections. We will create a new handler function in packetIn that adds a new connection or updates an existing one. The key in the map will be the 5-tuple of the connection. We will make use of the Connection and ConnectionKey constructs defined as part of the flow exporter feature:
https://github.com/vmware-tanzu/antrea/blob/a858c9ad25c215418003e4cb92a68ec62c7fddbf/pkg/agent/flowexporter/types.go
We have to flush these connections periodically and have a limit on how many "not allowed"/"denied" connections are stored.
Test plan
Unit tests. Enhance the integration and e2e tests where applicable.
Additional context
Open questions:
Initial thoughts are that we should have the same timeout period for all connections, and that it should be short: these "not allowed"/"denied" connections are typically short-lived, and a short timeout will let us track more connections.
Or should we base the timeout on the last updated time of each connection?
Is 10s good to begin with?