Add sflow dropped packet notifications feature #1477

Open
wants to merge 3 commits into
base: master
101 changes: 96 additions & 5 deletions doc/sflow/sflow_hld.md
Expand Up @@ -14,6 +14,8 @@ Rev | Rev Date | Author | Change Description
|v1.1 |10/23/2019 |Padmanabhan Narayanan | Update SAI section to use SAI_HOSTIF_ATTR_GENETLINK_MCGRP_NAME instead of ID. Note on genetlink creation. Change admin_state values to up/down instead of enable/disable to be consistent with management framework's sonic-common.yang.
|v1.2 |03/07/2021 | Garrick He | Add VRF support and fix interface admin-status output.
|v1.3 |01/24/2023 | Rajkumar (Marvell) | Add Egress Sflow support.
|v1.4 |06/20/2023 | | Add dropped packet notification support.

## 2. Scope
This document describes the high level design of sFlow in SONiC

Expand All @@ -32,9 +34,11 @@ sFlow (defined in https://sflow.org/sflow_version_5.txt) is a standard-based sam
* Statistical packet-based sampling of switched or routed packet flows to provide visibility into network usage and active routes
* Time-based sampling of interface counters.

The sFlow Dropped Packet Notification Structures extension (defined in https://sflow.org/sflow_drops.txt) adds a third type of measurement, reporting the packet header, ingress port, and drop reason for each packet dropped by the network device.

The sFlow monitoring system consists of:

* sFlow Agents that reside in network equipment which gather network traffic and port counters and combines the flow samples and interface counters into sFlow datagrams and forwards them to the sFlow collector at regular intervals over a UDP socket. The datagrams consist of information on, but not limited to, packet header, ingress and egress interfaces, sampling parameters, and interface counters. A single sFlow datagram may contain samples from many flows.
* sFlow Agents, which reside in network equipment, gather network traffic and port counters, combine the flow samples, interface counters, and dropped packet notifications into sFlow datagrams, and forward them to the sFlow collector at regular intervals over a UDP socket. The datagrams include, but are not limited to, the packet header, ingress and egress interfaces, sampling parameters, and interface counters. A single sFlow datagram may contain samples from many flows.
* sFlow collectors which receive and analyze the sFlow data.

sFlow is an industry standard, low cost and scalable technique that enables a single analyzer to provide a network wide view.
Expand Down Expand Up @@ -93,6 +97,7 @@ The syncd container is enhanced to support the SAI SAMPLEPACKET APIs.
The ASIC drivers need to be enhanced to:
* Associate the SAI_HOSTIF_TRAP_TYPE_SAMPLEPACKET to a specific genetlink channel and multicast group.
* Punt trapped samples to this genetlink group
* Associate the SAI_HOSTIF_TRAP_TYPE_PIPELINE_DISCARD_EGRESS_BUFFER, SAI_HOSTIF_TRAP_TYPE_PIPELINE_DISCARD_WRED, and SAI_HOSTIF_TRAP_TYPE_PIPELINE_DISCARD_ROUTER trap types to a specific genetlink channel and multicast group.

The sflow container and changes to the existing components to support sflow are described in the following sections.

Expand Down Expand Up @@ -366,6 +371,13 @@ When sflow is disabled globally, sampling is stopped on all relevant interfaces
* If port speed changes, this setting will be used to determine the updated sample-rate for the interface.
* The config sflow interface sample-rate {interface-name} {value} setting can still be used to override the speed based setting for specific interfaces.

* **sflow drop-monitor-limit** *{value}*

Enable and rate limit dropped packet notifications.

* Valid range: 0-500 notifications per second.
* Set drop-monitor-limit to 0 to disable dropped packet notifications.


#### Show commands

Expand Down Expand Up @@ -413,8 +425,9 @@ The configDB objects for the above CLI is given below:
"global": {
"admin_state": "up",
"polling_interval": "20",
"agent_id": "loopback0"
"sample_direction": "both"
"agent_id": "loopback0",
"sample_direction": "both",
"drop_monitor_limit": "0"
}
}
"SFLOW_SESSION": {
Expand All @@ -435,13 +448,13 @@ sFlow Global Information:
sFlow Admin State: up
sFlow Sample Direction: both
sFlow Polling Interval: 0
sFlow Drop Notification Limit: 0
sFlow AgentID: default

2 Collectors configured:
Name: prod IP addr: fe80::6e82:6aff:fe1e:cd8e UDP port: 6343 VRF: mgmt
Name: ser5 IP addr: 172.21.35.15 UDP port: 6343 VRF: default


```

# show sflow interface
Expand All @@ -456,7 +469,6 @@ Ethernet2 up 40000 both
Ethernet3 up 40000 both
Ethernet4 up 40000 rx
Ethernet5 up 40000 rx

```

### 6.6 **DB and Schema changes**
Expand Down Expand Up @@ -489,6 +501,7 @@ ADMIN_STATE = "up" / "down"
POLLING_INTERVAL = 1*3DIGIT ; counter polling interval
AGENT_ID = ifname ; Interface name
SAMPLE_DIRECTION = "rx"/"tx"/"both" ; Sampling direction
DROP_MONITOR_LIMIT = 1*3DIGIT ; rate limit for packet drop notifications
```
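As a minimal sketch of how a consumer of this field might validate it, the C helper below checks the `1*3DIGIT` form from the schema together with the 0-500 range and the "0 disables" rule from the CLI section. The function and macro names are hypothetical and are not taken from SONiC or hsflowd.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Upper bound taken from the CLI description (0-500 notifications/second). */
#define DROP_MONITOR_LIMIT_MAX 500

/* Parse a DROP_MONITOR_LIMIT string: 1 to 3 decimal digits, value 0-500.
 * Returns true and stores the value on success, false otherwise.
 * Hypothetical helper, for illustration only. */
bool parse_drop_monitor_limit(const char *text, unsigned *limit_out)
{
    if (text == NULL || limit_out == NULL)
        return false;

    size_t len = strlen(text);
    if (len < 1 || len > 3)              /* schema: 1*3DIGIT */
        return false;

    unsigned value = 0;
    for (size_t i = 0; i < len; i++) {
        if (!isdigit((unsigned char)text[i]))
            return false;
        value = value * 10 + (unsigned)(text[i] - '0');
    }
    if (value > DROP_MONITOR_LIMIT_MAX)
        return false;

    *limit_out = value;
    return true;
}

/* A limit of 0 means dropped packet notifications are disabled. */
bool drop_monitor_enabled(unsigned limit)
{
    return limit != 0;
}
```

Rejecting out-of-range values at parse time keeps the "0 means disabled" convention unambiguous for whatever component reads the table.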

A new SFLOW_SESSION table would be added.
Expand Down Expand Up @@ -618,6 +631,7 @@ collector vrf| collector.vrf
agent ip-address | agentIP
max-datagram-size | datagramBytes
sample-rate | sampling
drop-monitor-limit | dropmon.limit

The master list of supported host-sflow tokens is found in host-sflow/src/Linux/hsflowtokens.h

Expand All @@ -637,6 +651,12 @@ hsflowd bus/events|SONiC callback actions

Refer to host-sflow/src/Linux/hsflowd.h for a list of events.

#### mod_dropmon

Configuring a non-zero sflow drop-monitor-limit enables the hsflowd mod_dropmon module. The module uses the generic netlink drop_monitor interface (https://github.com/torvalds/linux/blob/master/include/uapi/linux/net_dropmon.h) to register for and receive notifications of software and hardware packet drops, which are then exported as sFlow Dropped Packet Notification Structures (https://sflow.org/sflow_drops.txt).

The netlink drop_monitor interface is used by the Linux kernel NET_DM module to report packets dropped within the software stack. Integrating software and hardware dropped packet monitoring simplifies the driver. Some trapped packets must be delivered to the network stack; for example, a TTL expired packet is delivered to the Linux stack, which drops it and generates an ICMP TTL expired message. Such packets don't need to be copied and reported as hardware drops, because the kernel reports the software drop.
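To illustrate what a per-second notification limit means in practice, here is a minimal fixed-window limiter in C. It is only a sketch under stated assumptions: this document does not describe how hsflowd enforces drop-monitor-limit, and the structure and function names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state; not taken from hsflowd. */
struct drop_limiter {
    uint32_t limit;        /* notifications allowed per second, 0 = disabled */
    uint64_t window_start; /* start of the current one-second window */
    uint32_t sent;         /* notifications passed in the current window */
};

void drop_limiter_init(struct drop_limiter *dl, uint32_t limit)
{
    dl->limit = limit;
    dl->window_start = 0;
    dl->sent = 0;
}

/* Returns true if a notification observed at time now_sec (in seconds)
 * may be exported, false if it should be suppressed. */
bool drop_limiter_allow(struct drop_limiter *dl, uint64_t now_sec)
{
    if (dl->limit == 0)
        return false;                  /* feature disabled */
    if (now_sec != dl->window_start) {
        dl->window_start = now_sec;    /* a new window resets the budget */
        dl->sent = 0;
    }
    if (dl->sent >= dl->limit)
        return false;                  /* budget for this second is spent */
    dl->sent++;
    return true;
}
```

A fixed window is the simplest choice. A token bucket would smooth bursts at window boundaries at the cost of slightly more state.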

### 6.8 **SWSS and syncd changes**

### sFlowOrch
Expand All @@ -645,10 +665,14 @@ An sFlowOrch is introduced in the Orchagent to handle configuration requests. Th

Also, it monitors the SFLOW_SESSIONS_TABLE and PORT state to determine sampling rate / speed changes to derive and set the sampling rate for all the interfaces. Ingress/Egress Sampling is enabled on the interfaces based on direction setting. It uses the SAI samplepacket APIs to set each port's sampling rate.

Finally, sFlowOrch facilitates the creation/deletion of dropped packet sessions as well as get/set of session specific attributes. sFlowOrch sets the genetlink host interface that is to be used by the SAI driver to deliver dropped packet notifications and reason codes.

### Rate limiting

Because the sFlow backoff mechanism is not implemented, users should consider rate limiting sFlow samples using the existing COPP mechanism. The COPP config (e.g. src/sonic-swss/swssconfig/sample/00-copp.config.json) can include appropriate settings for the samplepacket trap and is initialised using swssconfig.

Similarly, dropped packet notifications should be rate limited using the existing COPP mechanism.

### 6.9 **SAI changes**

Creating sFlow sessions and setting attributes (e.g. sampling rate) is described in SAI proposal : https://github.com/opencomputeproject/SAI/tree/master/doc/Samplepacket
Expand Down Expand Up @@ -786,6 +810,73 @@ sai_create_hostif_table_entry_fn(&host_table_entry, 4, sai_host_table_attr);

It is assumed that the trap group and the trap itself have been defined using sai_create_hostif_trap_group_fn() and sai_create_hostif_trap_fn().

### 6.10 Mapping dropped packet traps to a GENETLINK host interface multicast group

Below is an example code snippet that shows how a GENETLINK based host interface is created.

```
sai_object_id_t hostif_id;
sai_attribute_t hostif_attr[3];

// Host interface of type generic netlink
hostif_attr[0].id=SAI_HOSTIF_ATTR_TYPE;
hostif_attr[0].value=SAI_HOSTIF_TYPE_GENETLINK;

// Generic netlink family name
hostif_attr[1].id=SAI_HOSTIF_ATTR_NAME;
hostif_attr[1].value="NET_DM";

// Multicast group on which dropped packet notifications are delivered
hostif_attr[2].id=SAI_HOSTIF_ATTR_GENETLINK_MCGRP_NAME;
hostif_attr[2].value="packets";

sai_create_hostif_fn(&hostif_id, 3, hostif_attr);
```

Below is a code snippet that outlines how dropped packet notifications are mapped to the GENETLINK host interface.

```
sai_object_id_t table_entry;
sai_attribute_t table_attr[4];

table_attr[0].id=SAI_HOSTIF_TABLE_ENTRY_ATTR_TYPE;
table_attr[0].value=SAI_HOSTIF_TABLE_ENTRY_TYPE_TRAP_ID;

table_attr[1].id=SAI_HOSTIF_TABLE_ENTRY_ATTR_TRAP_ID;
table_attr[1].value=discard_packet_trap_id; // Object referencing DISCARD packet traps

table_attr[2].id=SAI_HOSTIF_TABLE_ENTRY_ATTR_CHANNEL;
table_attr[2].value=SAI_HOSTIF_TABLE_ENTRY_CHANNEL_TYPE_GENETLINK;

table_attr[3].id=SAI_HOSTIF_TABLE_ENTRY_ATTR_HOST_IF;
table_attr[3].value=hostif_id;

sai_create_hostif_table_entry_fn(&table_entry, 4, table_attr);
```

It is assumed that the trap group and the trap itself have been defined using sai_create_hostif_trap_group_fn() and sai_create_hostif_trap_fn().

As with sampled packet notifications, the fields of a dropped packet notification are carried in the following NET_DM attributes (see https://github.com/torvalds/linux/blob/master/include/uapi/linux/net_dropmon.h):

```
NET_DM_ATTR_IN_PORT
The input interface index of the packet, if there is one,
identified by NET_DM_ATTR_PORT_NETDEV_IFINDEX.

NET_DM_ATTR_ORIG_LEN
The size of the original packet (before truncation)

NET_DM_ATTR_TRUNC_LEN
Number of packet header bytes in message

NET_DM_ATTR_PAYLOAD
Packet header bytes

NET_DM_ATTR_HW_TRAP_GROUP_NAME
String describing drop group, e.g. "l2_drops"

NET_DM_ATTR_HW_TRAP_NAME
String describing specific drop reason, e.g. "ingress_vlan_filter"
```
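The attributes listed above travel as netlink type-length-value records, so a receiver has to walk the message and pick out the ones it needs. The C sketch below shows that walk on a plain byte buffer. It is illustrative only: the `EX_ATTR_*` numbers are placeholders (real code should take the `NET_DM_ATTR_*` values from `<linux/net_dropmon.h>`), every `ex_` name is hypothetical, and nested attributes such as NET_DM_ATTR_IN_PORT are left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder attribute types for illustration only. */
enum {
    EX_ATTR_TRUNC_LEN = 1,
    EX_ATTR_ORIG_LEN = 2,
    EX_ATTR_HW_TRAP_GROUP_NAME = 3,
    EX_ATTR_HW_TRAP_NAME = 4
};

/* Same layout as a netlink attribute header: len includes the header. */
struct ex_attr_hdr {
    uint16_t len;
    uint16_t type;
};

#define EX_ALIGN(n) (((n) + 3u) & ~3u) /* attributes are padded to 4 bytes */

struct ex_drop_info {
    uint32_t orig_len;
    uint32_t trunc_len;
    char group[32];
    char trap[48];
};

/* Append one attribute at offset off (buffer assumed large enough and
 * zeroed by the caller); returns the offset of the next attribute. */
size_t ex_put_attr(uint8_t *buf, size_t off, uint16_t type,
                   const void *data, uint16_t data_len)
{
    struct ex_attr_hdr hdr = { (uint16_t)(sizeof hdr + data_len), type };
    memcpy(buf + off, &hdr, sizeof hdr);
    memcpy(buf + off + sizeof hdr, data, data_len);
    return off + EX_ALIGN(hdr.len);
}

/* Copy at most dst_size - 1 bytes and always NUL-terminate. */
static void ex_copy_str(char *dst, size_t dst_size, const uint8_t *src, size_t n)
{
    if (n >= dst_size)
        n = dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Walk all attributes in buf; returns 0 on success, -1 if malformed. */
int ex_parse_drop(const uint8_t *buf, size_t buf_len, struct ex_drop_info *info)
{
    size_t off = 0;
    memset(info, 0, sizeof *info);
    while (off + sizeof(struct ex_attr_hdr) <= buf_len) {
        struct ex_attr_hdr hdr;
        memcpy(&hdr, buf + off, sizeof hdr);
        if (hdr.len < sizeof hdr || hdr.len > buf_len - off)
            return -1;                      /* length field is inconsistent */
        const uint8_t *data = buf + off + sizeof hdr;
        size_t data_len = hdr.len - sizeof hdr;
        switch (hdr.type) {
        case EX_ATTR_ORIG_LEN:
            if (data_len != sizeof info->orig_len) return -1;
            memcpy(&info->orig_len, data, data_len);
            break;
        case EX_ATTR_TRUNC_LEN:
            if (data_len != sizeof info->trunc_len) return -1;
            memcpy(&info->trunc_len, data, data_len);
            break;
        case EX_ATTR_HW_TRAP_GROUP_NAME:
            ex_copy_str(info->group, sizeof info->group, data, data_len);
            break;
        case EX_ATTR_HW_TRAP_NAME:
            ex_copy_str(info->trap, sizeof info->trap, data, data_len);
            break;
        default:
            break;                          /* skip attributes we do not use */
        }
        off += EX_ALIGN(hdr.len);
    }
    return 0;
}
```

Unknown attribute types are skipped rather than rejected, which lets a receiver keep working when the sender adds new attributes.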

Published group and trap names should be used where possible (see https://www.kernel.org/doc/html/latest/networking/devlink/devlink-trap.html).
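One way a driver could follow this guidance is a small lookup table from the discard trap types named in this document to devlink group and trap names. The sketch below is an assumption, not part of the design: the enum is a local stand-in for the SAI trap types so that the code compiles without SAI headers, the pairings are a guess at the closest published names, and the router entry leaves the trap name unset because the specific layer 3 drop reason is left to the driver.

```c
#include <stddef.h>

/* Local stand-ins for the SAI discard trap types named in this document. */
enum ex_discard_trap {
    EX_TRAP_DISCARD_EGRESS_BUFFER,
    EX_TRAP_DISCARD_WRED,
    EX_TRAP_DISCARD_ROUTER
};

struct ex_trap_names {
    const char *group; /* value for NET_DM_ATTR_HW_TRAP_GROUP_NAME */
    const char *name;  /* value for NET_DM_ATTR_HW_TRAP_NAME, or NULL when
                          the driver must supply a more specific reason */
};

/* Assumed pairings; a vendor driver may map these differently. */
const struct ex_trap_names *ex_lookup_trap_names(enum ex_discard_trap trap)
{
    static const struct ex_trap_names table[] = {
        [EX_TRAP_DISCARD_EGRESS_BUFFER] = { "buffer_drops", "tail_drop" },
        [EX_TRAP_DISCARD_WRED]          = { "buffer_drops", "early_drop" },
        [EX_TRAP_DISCARD_ROUTER]        = { "l3_drops", NULL },
    };
    if ((size_t)trap >= sizeof table / sizeof table[0])
        return NULL;                    /* unknown trap type */
    return &table[trap];
}
```

Keeping the mapping in one table makes it easy to audit against the devlink-trap documentation when names are added or changed.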

#### SAI capability query for Sflow
```
sai_attr_capability_t capability;
Expand Down