Logical port CONTROLLER for Tofino switch connected to ONOS via Stratum #529

Open
endrigoshi opened this issue Nov 20, 2022 · 17 comments

@endrigoshi

Hi all,

I am using a Tofino switch to deploy my own P4 program using ONOS and Stratum.
First, I have started with the deployment of the fabric-tna program so that I understand how everything works.

However, after building the fabric profile and following all the other steps to create the pipeconf, I get the following error after adding the netcfg:

2022-11-20T17:49:47,390 | WARN  | GroupDriverProvider-0 | P4RuntimeReplicationGroupProgrammable | 221 - org.onosproject.onos-protocols-grpc-utils - 2.7.0 | Unable to translate replication group, aborting ADD operation: interpreter cannot map logical port CONTROLLER [DefaultGroup{description=DefaultGroup{deviceId=device:milantofino, type=CLONE, buckets=GroupBuckets{buckets=[DefaultGroupBucket{type=CLONE, treatment=DefaultTrafficTreatment{immediate=[OUTPUT:CONTROLLER], deferred=[], transition=None, meter=[], cleared=false, StatTrigger=null, metadata=null}, packets=0, bytes=0}]}, appId=DefaultApplicationId{id=174, name=org.stratumproject.fabric-tna}, appCookie=0x02FE07, givenGroupId=511}, groupid=GroupId{id=0x1ff}, state=PENDING_ADD_RETRY, age=0}]

All the other flow rules are added correctly and this is the only one that is stuck in PENDING_ADD_RETRY.

I also have another question regarding the way the port numbers are shown in fabric-tna. For example, I see these types of match criteria: IN_PORT:4294967040, vlan_is_valid=0x0, or IN_PORT:4294967040, ip_eth_type=0x800. What do the values for IN_PORT represent?

Lastly, do you have any information on how I can forward a packet to ONOS from Tofino? What should be the CPU_PORT value defined in the P4 program?

Thank you in advance!

@Dscano

Dscano commented Nov 20, 2022

Hi @endrigoshi

  1. Your flow rule being in the pending add state means that you are writing something that the P4 pipeline cannot handle. Indeed, you wrote port CONTROLLER, when I think you have to set the CPU_PORT value. You should read the Tofino/P4 documentation provided by Intel to see the exact value.
  2. You can leverage Packet_in and Packet_out messages to exchange control messages between the controller and the switch; here you can see how packet I/O is handled by the P4 pipeline: https://github.com/stratum/fabric-tna/blob/main/p4src/tna/include/control/packetio.p4 (a small ONOS-side sketch follows below).
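For reference, on the ONOS side a packet-out is emitted through the PacketService. The snippet below is only a minimal sketch (not fabric-tna code): deviceId, the Ethernet frame and the output port are placeholders you would fill in from your own app.

import java.nio.ByteBuffer;

import org.onlab.packet.Ethernet;
import org.onosproject.net.DeviceId;
import org.onosproject.net.PortNumber;
import org.onosproject.net.flow.DefaultTrafficTreatment;
import org.onosproject.net.flow.TrafficTreatment;
import org.onosproject.net.packet.DefaultOutboundPacket;
import org.onosproject.net.packet.OutboundPacket;
import org.onosproject.net.packet.PacketService;

public class PacketOutSketch {
    // Emit a packet-out asking the device to send the frame out of a
    // (hypothetical) front-panel port 1.
    public static void emitExample(PacketService packetService, DeviceId deviceId, Ethernet frame) {
        TrafficTreatment treatment = DefaultTrafficTreatment.builder()
                .setOutput(PortNumber.portNumber(1))
                .build();
        OutboundPacket packetOut = new DefaultOutboundPacket(
                deviceId, treatment, ByteBuffer.wrap(frame.serialize()));
        packetService.emit(packetOut);
    }
}

Roughly speaking, Stratum carries that frame to the switch inside a P4Runtime PacketOut message, and the packet_out handling in packetio.p4 then takes care of the actual egress.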

@endrigoshi
Author

Hi @Dscano

Thank you very much for the prompt response!

  1. I am just using the code provided by this repository and I have only modified the build.sh file so that it points to my local p4c compiler (v9.7.3). So I don't think the error is happening because of something that I am doing wrong. But, since you suggest reading the Tofino documentation, does that mean that fabric-tna is not meant to work out of the box with Tofino switches?
  2. Regarding the packetio.p4 file, I see that there is an action declared as set_switch_info(FabricPortId_t cpu_port). When I check the rules installed by ONOS, the following is shown: imm[FabricEgress.pkt_io_egress.set_switch_info(cpu_port=0xfffffffd)], cleared:false. Do you have any idea what exactly this action does (when is it called)? And how should I understand the cpu_port=0xfffffffd field?

Lastly, I would appreciate your input in understanding the IN_PORT values that I mentioned in the first comment.

Thank you again for your help! :)

@Dscano

Dscano commented Nov 20, 2022

  1. fabric-tna works perfectly. I'm saying that the exact CPU_PORT number depends on the Tofino model that you have, so in order to retrieve this value you have to read the Intel documentation. The developers of fabric-tna used a workaround to avoid specifying it. So your problem seems related to the way the flow rule is written, because if the flow rule is not compliant with the table and/or action in the pipeline it remains in pending add status.

  2. packetio handles the exchange of packets between the control plane and the data plane. For example, it handles the LLDP packet forwarding from the control plane in order to provide the controller with the network view.

@endrigoshi
Author

Hi @Dscano

Maybe I am failing to understand your point, but I will try to express myself again:

  • As I said, I am not making any changes to the fabric-tna code, except in the /p4src/tna/build.sh file, where I modify the compile command to point to my local compiler.
  • Then, I run the following commands in order: make fabric-tna, make pipeconf, make pipeconf-install ONOS_HOST=<...> and make netcfg ONOS_HOST=<...>.
  • After this point, I get the error in ONOS saying it is unable to map the logical port CONTROLLER.

Because, if the flow rule is not compliant with the table and/or action in the pipeline it remains in pending add status.

Therefore, regarding your comment above, how can it be that the flow rule is not compliant with the pipeline when I have not made any changes to the pipeline or the control plane code? This is also what I meant when I asked if fabric-tna is meant to work out of the box with Tofino switches. Or are there any steps that I forgot to follow?

Now, regarding the second point, I believe that the workaround that you mention refers to the definition of cpu_port=0xfffffffd, correct?

Thank you again for your help.

Best,
Endri

@Dscano

Dscano commented Nov 21, 2022

Hi @endrigoshi,

did you also install the application https://github.com/opennetworkinglab/trellis-control on the ONOS controller, according to the guide https://docs.sd-fabric.org/master/release/1.2.0.html?

Therefore, regarding your comment above, how can it be that the flow rule is not compliant with the pipeline when I have not made any changes to the pipeline or the control plane code? This is also what I meant when I asked if fabric-tna is meant to work out of the box with Tofino switches. Or are there any steps that I forgot to follow?

I thought you had pushed a custom flow rule. Just to give you an idea, if you push a flow rule with a wrong action name or with an extra match field, the flow rule in the controller view remains in pending_add status.
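Just as an illustration, this is roughly how a PI flow rule is built in an ONOS app (a hedged sketch: the table, match field and action names below are hypothetical and must be replaced with the exact IDs from your P4Info, otherwise the rule never leaves PENDING_ADD):

import org.onosproject.core.ApplicationId;
import org.onosproject.net.DeviceId;
import org.onosproject.net.flow.DefaultFlowRule;
import org.onosproject.net.flow.DefaultTrafficSelector;
import org.onosproject.net.flow.DefaultTrafficTreatment;
import org.onosproject.net.flow.FlowRule;
import org.onosproject.net.flow.FlowRuleService;
import org.onosproject.net.flow.criteria.PiCriterion;
import org.onosproject.net.pi.model.PiActionId;
import org.onosproject.net.pi.model.PiMatchFieldId;
import org.onosproject.net.pi.model.PiTableId;
import org.onosproject.net.pi.runtime.PiAction;

public class PiFlowRuleSketch {
    public static void install(FlowRuleService flowRuleService, DeviceId deviceId, ApplicationId appId) {
        // Match field name must be spelled exactly as in the P4Info.
        PiCriterion match = PiCriterion.builder()
                .matchExact(PiMatchFieldId.of("hdr.ethernet.src_addr"),
                            new byte[]{0x1b, 0x1b, 0x21, (byte) 0xc0, 0x51, (byte) 0xf9})
                .build();
        // Same for the action name (hypothetical here).
        PiAction action = PiAction.builder()
                .withId(PiActionId.of("IngressPipeImpl.send_to_cpu"))
                .build();
        FlowRule rule = DefaultFlowRule.builder()
                .forDevice(deviceId)
                .forTable(PiTableId.of("IngressPipeImpl.acl_table")) // hypothetical table name
                .fromApp(appId)
                .withPriority(10)
                .makePermanent()
                .withSelector(DefaultTrafficSelector.builder().matchPi(match).build())
                .withTreatment(DefaultTrafficTreatment.builder().piTableAction(action).build())
                .build();
        flowRuleService.applyFlowRules(rule);
    }
}

Any mismatch between those names and the loaded pipeline (or an extra/unsupported match field) is exactly what keeps a rule in PENDING_ADD.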

Now, regarding the second point, I believe that the workaround that you mention refers to the definition of cpu_port=0xfffffffd, correct?

yes, exactly.

Best
Davide

@daniele-moro
Collaborator

Hi @endrigoshi,
the IN_PORT:4294967040 is a recirculation port. The pipeliner installs some flow rules in order to admit packets into the fabric-tna pipeline when they are recirculated, see the snippet below (and the quick hex check after it):

// Set up recirculation ports as untagged (used for INT reports and
// UE-to-UE in UPF pipe).
List<Long> recircPorts = capabilities.isArchTna() ? RECIRC_PORTS : V1MODEL_RECIRC_PORT;
recircPorts.forEach(port -> {
    flowRuleService.applyFlowRules(
            ingressVlanRule(port, false, DEFAULT_VLAN, PORT_TYPE_INTERNAL),
            egressVlanRule(port, DEFAULT_VLAN, false),
            fwdClassifierRule(port, null, Ethernet.TYPE_IPV4, FWD_IPV4_ROUTING,
                    DEFAULT_FLOW_PRIORITY),
            // Use higher priority for MPLS rule since the one for IPv4
            // matches all IPv4 traffic independently of the eth_type.
            fwdClassifierRule(port, Ethernet.MPLS_UNICAST, Ethernet.TYPE_IPV4, FWD_MPLS,
                    DEFAULT_FLOW_PRIORITY + 10));
});
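As a side note, the long decimal values are just the unsigned 32-bit rendering of hex logical-port constants; here is a trivial check (plain Java, nothing ONOS-specific assumed; the comments only restate what was said above):

public class PortIds {
    public static void main(String[] args) {
        // 4294967040 is the IN_PORT value from the flow rules quoted above.
        System.out.println(Long.toHexString(4294967040L)); // prints "ffffff00"
        // 4294967293 is the decimal form of the cpu_port=0xfffffffd parameter
        // seen in the set_switch_info action earlier in this thread.
        System.out.println(Long.toHexString(4294967293L)); // prints "fffffffd"
    }
}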

About the output port issue: which version of the SD-Fabric components are you using? Are you using a released SD-Fabric ONOS? The issue might be related to this change: https://gerrit.onosproject.org/c/onos/+/25232; please verify that you have it in your code base.

Thanks,
Daniele

@endrigoshi
Author

Hi,

Thank you both for your help. I am using ONOS 2.7.0, so I believe the issue with the long port numbers is already fixed there.
About Trellis, I have not installed it. I will do that and see if it fixes the issue.

Thanks,
Endri

@endrigoshi
Author

I also wanted to ask you another question regarding forwarding to the CPU_PORT. Please let me know if it's better to open another issue in the Stratum repo.

I am using ONOS and Stratum (in a container) to control a Tofino switch. I have created a very simple P4 program and successfully inserted a rule to forward the matching packets to the CPU_PORT (64 for my switch). From the ONOS GUI or from bf-sde.pm> show I can see that the packets are indeed matched and sent out of port 64. But other than that, I don't see those packets anywhere: I don't see them when I run tcpdump on the switch or in the Stratum container, and nothing is triggered on the ONOS side either.

As far as the setup goes, Tofino is connected through the management port to the same network as the PC where ONOS is running. Then, another PC is connected to Tofino and serves as a packet generator.

Do you have any idea if I am doing anything wrong? I have tried many things, but I am completely lost at this point.

@Dscano

Dscano commented Nov 22, 2022

Hi @endrigoshi

This is strange. Have you tried to sniff the packets at the controller side?
Are you sure that the packets are correctly formed? I mean, maybe they are silently dropped because they exceed the MTU size or they are not recognized by the interface at the physical level.

@endrigoshi
Author

Hi @Dscano ,

Yes, I tried running tcpdump on the ONOS side as well. I see packets flowing from the network, but not the one I expect (unless it is wrapped in headers, which makes it difficult to spot). If I send the packet back out of the ingress_port instead of forwarding it out of port 64, everything works fine and I can see the packet coming back.

I am using scapy to craft a very simple packet, which looks like this:
x = Ether(src='1b:1b:21:c0:51:f9', dst='ab:ab:ab:ab:ab:ab')/IP(dst="8.8.8.8"), and then I match on the src MAC address. When sending to the controller, does Stratum somehow wrap the packet with the correct headers? I believe so, right?

The send_to_cpu() action is defined as follows:

    action send_to_cpu() {
        ig_tm_md.ucast_egress_port = CPU_PORT;
        hdr.cpu_in.setValid();
        hdr.cpu_in.ingress_port = ig_intr_md.ingress_port;
    }

So I am not doing any special processing to the packet.

The management interface of Tofino is connected to a switch, which in turn is connected to the network. Can it be that the packet is dropped at this switch?

As far as Stratum goes, I am using the stratumproject/stratum-bfrt:latest-9.7.2 image.

@Dscano

Dscano commented Nov 22, 2022

When sending to the controller, does Stratum somehow wrap the packet with the correct headers? I believe so, right?

Yes, I believe that the forwarding is handled by the IP protocol in a traditional way.

The management interface of Tofino is connected to a switch, which in turn is connected to the network. Can it be that the packet is dropped at this switch?

This could be the problem, because you are generating traffic that does not belong to / is not handled by your management network, while the packet_in and packet_out messages sent by the controller to the data plane have the IP address and MAC of the machine where the SDN controller is running. You should try to remove the switch, or generate traffic with the same IP and MAC as those of the machine where the controller runs.

@endrigoshi
Author

But only the management interface of the Tofino is connected to this "normal" switch, while the PC that generates traffic is connected directly to Tofino's forwarding ports.

What puzzles me is the fact that if Stratum is taking care of wrapping the PacketIns with the correct headers (meaning the dst IP/MAC of ONOS), the packet should have no problem being forwarded in a non-programmable switched network. Otherwise it would basically defeat the purpose of having remote controllers. I tried crafting new packets as per your suggestion, but the behavior was the same.

I will also try to connect the management port of Tofino directly to ONOS, but while I hope it works I am a bit skeptical.

@Dscano

Dscano commented Nov 22, 2022

Sorry @endrigoshi, let me explain better. When ONOS sends LLDP packets to discover the network, it leverages the packetio mechanism to send and receive the LLDP packets. In this case the header fields are consistent from the traditional IP forwarding point of view, and those packets are forwarded in the management network; I experienced that in my experiments. So Stratum does not apply any header field modification: it exposes the APIs (P4Runtime, gNMI, gNOI) to the controller, and the packets are managed by the P4 data plane, so there you have to properly fill the packet header fields in order for them to be handled by a traditional network.

@Dscano

Dscano commented Nov 22, 2022

@endrigoshi
I was thinking about your issue. If you see the counter associated with the flow rule increase in the ONOS GUI and/or in P4, but you don't see any packet leave the switch from the CPU port, maybe the value that you are using is wrong. I claim that because if you use other ports in the data plane you see packets leaving the switch, but you don't see anything at the management port of the switch and/or at the control plane.

@endrigoshi
Author

endrigoshi commented Nov 28, 2022

Hi @Dscano ,
First, I would like to say that I really appreciate your assistance with my issues.

My understanding of the setup with Tofino, Stratum and ONOS is as follows (based on the knowledge I have of OpenFlow networks and other P4 networks):

  • Stratum runs on top of the Tofino switch. A gRPC server instance is initialized and ONOS connects to this server using gRPC communication (they use P4Runtime, which uses gRPC for the communication).
  • When ONOS configures the switch, it sends its P4Runtime messages to Stratum, which acts completely transparently and simply forwards them to the Tofino switch.
  • The situation changes when it comes to PacketIn messages. If the data plane decides to forward a packet to the control plane, the out-port is set to CPU_PORT and a "packet_in" header is added to the packet. In many examples, this header is defined as below, and it is also the one I am using in my code:
// Packet-in header. Prepended to packets sent to the CPU_PORT and used by the
// P4Runtime server (Stratum) to populate the PacketIn message metadata fields.
// Here we use it to carry the original ingress port where the packet was
// received.
@controller_header("packet_in")
header cpu_in_header_t {
    port_num_t  ingress_port;
    bit<7>      _pad;
}
  • After the packet is sent to the CPU_PORT, it arrives at Stratum, which is listening for these packets. As also written in the piece of code above, Stratum uses the information in this header to create a gRPC "Packet-In" message to send to ONOS. From my understanding of the code, this is done in bfrt_packetio_manager.cc.
  • In the example that you mention with LLDP packets, they are crafted with the correct headers in the controller + the "packet_out" application header and sent to Stratum via P4Runtime/gRPC. Stratum uses the "packet_out" information to build the "packet_out" header and forwards the packet to the switch. Then it's the task of the switch to remove the packet_out header and forward the packet to the correct out_port.

Now, to get back to my problem. The ONOS -> Stratum -> Tofino part of the communication works fine, at least the parts that take care of loading the pipeline and deploying the flow rules (so the transparent part of the communication). I am not sure if sending PacketOuts would work, because I would have to modify the ONOS code to be able to test that. However, I am sure that the "uplink" part (PacketIn sent from Tofino -> Stratum -> ONOS) does not work, since nothing is transmitted to my ONOS application.
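To be concrete about what I mean by "nothing is transmitted to my ONOS application": the check on the ONOS side is essentially a PacketProcessor that logs every packet-in it receives. Below is only a minimal sketch (not my exact application code):

import org.onosproject.net.ConnectPoint;
import org.onosproject.net.packet.PacketContext;
import org.onosproject.net.packet.PacketProcessor;
import org.onosproject.net.packet.PacketService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PacketInLogger implements PacketProcessor {

    private static final Logger log = LoggerFactory.getLogger(PacketInLogger.class);

    @Override
    public void process(PacketContext context) {
        // Log where the punted packet came from and how many bytes it carries.
        ConnectPoint from = context.inPacket().receivedFrom();
        log.info("Packet-in from {} port {} ({} bytes)",
                 from.deviceId(), from.port(),
                 context.inPacket().unparsed().remaining());
    }

    // Registration, e.g. from the app's activate() method.
    public static void register(PacketService packetService, PacketInLogger processor) {
        packetService.addProcessor(processor, PacketProcessor.director(2));
    }
}

With the pipeline and flow rules loaded as described, I would expect this log line to fire for every packet hitting send_to_cpu, but nothing shows up.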

Below, I will explain some details about my setup and what I have learned from debugging this issue.

The OS running on top of Tofino is as follows:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 9.13 (stretch)
Release:	9.13
Codename:	stretch

$ uname -r
4.14.49-OpenNetworkLinux

I have installed bf-sde-9.7.2, and I am running the Stratum container with the image stratumproject/stratum-bfrt:latest-9.7.2, as shown in the guidelines. I also tried building it myself and running it, but I was having some issues with that. Additionally, I tried installing the provided .deb file, and the same thing happened as with the container.

Now, in Tofino I can see two interfaces, enp4s0f0 and enp4s0f1, which are the management interfaces that Tofino exposes to the host. I have put them in the UP state with the following command: sudo ip link set enp4s0f0 up.
When packets in the data plane hit send_to_cpu, I can see the FRAMES TX counter in Tofino increase (running pm and show in bf-sde):
[screenshot: bf-sde pm show output with the FRAMES TX counter increasing]
After setting the interfaces to the UP state, I can also see packets on enp4s0f0 and enp4s0f1 (depending on whether they are sent out of CPU_PORT 64 or 66). However, nothing else happens after this point.

The output from Stratum container is as follows:

└[~]> ./start_stratum_container.sh -bf_switchd_background=false             
/home/lkn/chassis_config.pb.txt
This is the location of chassis_config file:
/home/lkn/chassis_config.pb.txt
------------------------------------------------------------------------------------
++ uname -r
++ uname -r
+ docker run -it --rm --privileged -v /dev:/dev -v /sys:/sys -v /lib/modules/4.14.49-OpenNetworkLinux:/lib/modules/4.14.49-OpenNetworkLinux --env PLATFORM=x86-64-accton-wedge100bf-32qs-r0 --network host -v /lib/x86_64-linux-gnu/libonlp-platform-defaults.so:/lib/x86_64-linux-gnu/libonlp-platform-defaults.so -v /lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1:/lib/x86_64-linux-gnu/libonlp-platform-defaults.so.1 -v /lib/x86_64-linux-gnu/libonlp-platform.so:/lib/x86_64-linux-gnu/libonlp-platform.so -v /lib/x86_64-linux-gnu/libonlp-platform.so.1:/lib/x86_64-linux-gnu/libonlp-platform.so.1 -v /lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1:/lib/x86_64-linux-gnu/libonlp-x86-64-accton-wedge100bf-32qs.so.1 -v /lib/x86_64-linux-gnu/libonlp.so:/lib/x86_64-linux-gnu/libonlp.so -v /lib/x86_64-linux-gnu/libonlp.so.1:/lib/x86_64-linux-gnu/libonlp.so.1 -v /lib/platform-config:/lib/platform-config -v /etc/onl:/etc/onl -v /home/lkn/chassis_config.pb.txt:/etc/stratum/x86-64-accton-wedge100bf-32qs-r0/chassis_config.pb.txt -v /var/log:/var/log/stratum stratumproject/stratum-bfrt:latest-9.7.2 -bf_switchd_background=false
Mounting hugepages...
bf_kdrv_mod found! Unloading first...
loading bf_kdrv_mod...
I20221128 17:32:02.524595     1 logging.cc:64] Stratum version: 88d02d5b02c502f16f2d5c50217a068eb8485458 built at 2022-07-06T00:05:12+00:00 on host 51ce28cde5b1 by user stratum.
I20221128 17:32:02.525056     1 bf_sde_wrapper.cc:1757] bf_sysfs_fname: /sys/class/bf/bf0/device/dev_add
Install dir: /usr (0x559ca1e12020)
bf_switchd: system services initialized
bf_switchd: loading conf_file /usr/share/stratum/tofino_skip_p4.conf...
bf_switchd: processing device configuration...
Configuration for dev_id 0
  Family        : Tofino
  pci_sysfs_str : /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
  pci_domain    : 0
  pci_bus       : 5
  pci_fn        : 0
  pci_dev       : 0
  pci_int_mode  : 1
  sbus_master_fw: /usr/
  pcie_fw       : /usr/
  serdes_fw     : /usr/
  sds_fw_path   : /usr/share/tofino_sds_fw/avago/firmware
  microp_fw_path:
bf_switchd: processing P4 configuration...
P4 profile for dev_id 0
  p4_name: dummy
    libpd:
    libpdthrift:
    context:
    config:
  Agent[0]: /usr/lib/libpltfm_mgr.so
  diag:
  accton diag:
  non_default_port_ppgs: 0
  SAI default initialize: 1
bf_switchd: library /usr/lib/libpltfm_mgr.so loaded
bf_switchd: agent[0] initialized
Health monitor started
Operational mode set to ASIC
Initialized the device types using platforms infra API
ASIC detected at PCI /sys/class/bf/bf0/device
ASIC pci device id is 16
bf_switchd: drivers initialized
Skipping P4 program load for dev_id 0
Setting core_pll_ctrl0=cd44cbfe

bf_switchd: dev_id 0 initialized

bf_switchd: initialized 1 devices
Skip p4 lib init
Skip mav diag lib init
bf_switchd: spawning cli server thread
bf_switchd: spawning driver shell
bf_switchd: server started - listening on port 9999
I20221128 17:32:23.266234     1 bf_sde_wrapper.cc:1767] switchd started successfully
W20221128 17:32:23.266402     1 credentials_manager.cc:59] No key files provided, using insecure server credentials!
Cannot read termcap database;
using dumb terminal settings.
W20221128 17:32:23.266842     1 credentials_manager.cc:78] No key files provided, using insecure client credentials!
bf-sde> I20221128 17:32:23.267074     1 hal.cc:127] Setting up HAL in COLDBOOT mode...
I20221128 17:32:23.267143     1 config_monitoring_service.cc:94] Pushing the saved chassis config read from /etc/stratum/x86-64-accton-wedge100bf-32qs-r0/chassis_config.pb.txt...
I20221128 17:32:23.269819     1 bfrt_switch.cc:321] Chassis config verified successfully.
I20221128 17:32:23.270242     1 phal.cc:94] No phal_config_file specified and no switch configurator found! PHAL will start without any data source backend. You can specify '--define phal_with_tai=true' while building to enable TAI support, or '-enable_onlp' at runtime to enable the ONLP plugin.
I20221128 17:32:23.270751     1 attribute_database.cc:207] PhalDB service is listening to localhost:28003...
I20221128 17:32:23.270787     1 bf_chassis_manager.cc:1404] Successfully registered port status notification callback.
I20221128 17:32:23.281705     1 bf_chassis_manager.cc:111] Added port 7 in node 1 (SDK Port 308).
I20221128 17:32:23.285529     1 bf_chassis_manager.cc:147] Enabled port 7 in node 1 (SDK Port 308).
I20221128 17:32:23.301673     1 bf_chassis_manager.cc:111] Added port 16 in node 1 (SDK Port 0).
I20221128 17:32:23.305547     1 bf_chassis_manager.cc:147] Enabled port 16 in node 1 (SDK Port 0).
I20221128 17:32:23.315902     1 bf_chassis_manager.cc:111] Added port 3300 in node 1 (SDK Port 64).
I20221128 17:32:23.316090     1 bf_chassis_manager.cc:147] Enabled port 3300 in node 1 (SDK Port 64).
I20221128 17:32:23.326566     1 bf_chassis_manager.cc:111] Added port 3302 in node 1 (SDK Port 66).
I20221128 17:32:23.326906     1 bf_chassis_manager.cc:147] Enabled port 3302 in node 1 (SDK Port 66).
I20221128 17:32:23.327006     1 bfrt_switch.cc:60] Chassis config pushed successfully.
I20221128 17:32:23.329669     1 p4_service.cc:121] Pushing the saved forwarding pipeline configs read from /etc/stratum/pipeline_cfg.pb.txt...
W20221128 17:32:23.329874     1 p4_service.cc:142] Empty forwarding pipeline configs file: /etc/stratum/pipeline_cfg.pb.txt.
E20221128 17:32:23.330509     1 hal.cc:220] Stratum external facing services are listening to 0.0.0.0:9339, 0.0.0.0:9559, localhost:9559...
I20221128 17:32:25.776901    51 bf_chassis_manager.cc:1288] State of port 3302 in node 1 (SDK port 66): UP.
I20221128 17:32:25.780974    51 bf_chassis_manager.cc:1288] State of port 3300 in node 1 (SDK port 64): UP.
I20221128 17:34:19.373461    54 sdn_controller_manager.cc:148] New SDN connection (role_name: <default>, election_id: { low: 20 }, uri: ipv4:10.162.149.228:40442): device_id: 1 election_id { low: 20 }
I20221128 17:34:19.373623    54 sdn_controller_manager.cc:199] New primary connection for role <default> with election ID { low: 20 }.
I20221128 17:34:19.373656    54 p4_service.cc:590] Controller (role_name: <default>, election_id: { low: 20 }, uri: ipv4:10.162.149.228:40442) is connected as MASTER for node (aka device) with ID 1.
I20221128 17:34:19.373706    54 sdn_controller_manager.cc:178] Update SDN connection ((role_name: <default>, election_id: { low: 20 }, uri: ipv4:10.162.149.228:40442)): device_id: 1 election_id { low: 20 }
I20221128 17:34:19.373811    54 sdn_controller_manager.cc:199] Old and new primary connection for role <default> with election ID { low: 20 }.
I20221128 17:34:19.373836    54 p4_service.cc:590] Controller (role_name: <default>, election_id: { low: 20 }, uri: ipv4:10.162.149.228:40442) is connected as MASTER for node (aka device) with ID 1.
I20221128 17:34:19.456538    55 bfrt_switch.cc:277] P4-based forwarding pipeline config verified successfully for node with ID 1.
bf_switchd: starting warm init for dev_id 0 mode 1 serdes_upgrade 0
bf_switchd: agent[0] library unloaded for dev_id 0
bf_switchd: library /usr/lib/libpltfm_mgr.so loaded
bf_switchd: agent[0] initialized
Health monitor started
Skip diag lib deinit
|I20221128 17:34:29.577813    55 bf_chassis_manager.cc:1171] Replayed chassis config for node 1.
I20221128 17:34:29.577847    55 bfrt_switch.cc:78] P4-based forwarding pipeline config pushed successfully to node with ID 1.
I20221128 17:34:29.678843    54 sdn_controller_manager.cc:178] Update SDN connection ((role_name: <default>, election_id: { low: 20 }, uri: ipv4:10.162.149.228:40442)): device_id: 1 election_id { low: 20 }
I20221128 17:34:29.678990    54 sdn_controller_manager.cc:199] Old and new primary connection for role <default> with election ID { low: 20 }.
I20221128 17:34:29.679025    54 p4_service.cc:590] Controller (role_name: <default>, election_id: { low: 20 }, uri: ipv4:10.162.149.228:40442) is connected as MASTER for node (aka device) with ID 1.
I20221128 17:34:29.686756    57 config_monitoring_service.cc:420] Initial Subscribe request from ipv4:10.162.149.228:57508 over stream 0x7fa237fda7d0.
I20221128 17:34:29.788952    76 bfrt_node.cc:262] P4-based forwarding entities written successfully to node with ID 1.
I20221128 17:34:31.732391    51 bf_chassis_manager.cc:1288] State of port 7 in node 1 (SDK port 308): UP.
I20221128 17:34:31.966368    51 bf_chassis_manager.cc:1288] State of port 16 in node 1 (SDK port 0): UP.
I20221128 17:34:32.018694    51 bf_chassis_manager.cc:1288] State of port 3300 in node 1 (SDK port 64): UP.
I20221128 17:34:32.022469    51 bf_chassis_manager.cc:1288] State of port 3302 in node 1 (SDK port 66): UP.

I have also run it with the flags -experimental_enable_p4runtime_translation -incompatible_enable_bfrt_legacy_bytestring_responses as I saw some people had problems because of that, but no luck in resolving my issue.

I would really appreciate it if you could point me to the differences between your setup (which I believe works as intended) and mine.
Thanks a lot in advance!

Best,
Endri

Edit: Updated the stratum output to show the full logs, including the output of adding the pipeline and rules from ONOS.

@endrigoshi
Author

In case it helps, the contents of netcfg_tofino.json are:

{
  "devices": {
    "device:Milan-Tofino": {
      "basic": {
        "managementAddress": "grpc://10.162.148.152:9559?device_id=1",
        "driver": "stratum-tofino",
        "pipeconf": "org.onosproject.ngsdn-tutorial"
      }
    }
  }
}

and chassis_config.pb.txt:

description: "Default Chassis Config for Edgecore Wedge100BF-32QS"
chassis {
  platform: PLT_GENERIC_BAREFOOT_TOFINO
  name: "Edgecore Wedge100BF-32qs"
}
nodes {
  id: 1
  slot: 1
  index: 1
}
singleton_ports {
  id: 7
  name: "7/0"
  slot: 1
  port: 7
  speed_bps: 10000000000
  config_params {
    admin_state: ADMIN_STATE_ENABLED
  }
  node: 1
}
singleton_ports {
  id: 16
  name: "16/0"
  slot: 1
  port: 16
  speed_bps: 10000000000
  config_params {
    admin_state: ADMIN_STATE_ENABLED
  }
  node: 1
}
singleton_ports {
  id: 3300
  name: "33/0"
  slot: 1
  port: 33
  channel: 1
  speed_bps: 10000000000
  config_params {
    admin_state: ADMIN_STATE_ENABLED
  }
  node: 1
}
singleton_ports {
  id: 3302
  name: "33/2"
  slot: 1
  port: 33
  channel: 3
  speed_bps: 10000000000
  config_params {
    admin_state: ADMIN_STATE_ENABLED
  }
  node: 1
}

@Dscano

Dscano commented Nov 28, 2022

Hi @endrigoshi

  • After the packet is sent to the CPU_PORT, it arrives at Stratum, which is listening for these packets. As also written in the piece of code above, Stratum uses the information in this header to create a gRPC "Packet-In" message to send to ONOS. From my understanding of the code, this is done in bfrt_packetio_manager.cc.
  • In the example that you mention with LLDP packets, they are crafted with the correct headers in the controller + the "packet_out" application header and sent to Stratum via P4Runtime/gRPC. Stratum uses the "packet_out" information to build the "packet_out" header and forwards the packet to the switch. Then it's the task of the switch to remove the packet_out header and forward the packet to the correct out_port.
  1. Packet-in and Packet-out should work as in OpenFlow; Stratum is an "agent" that exposes standard interfaces (P4Runtime, gNOI and gNMI).

  2. Is the "pipeconf": "org.onosproject.ngsdn-tutorial" compiled for Tofino? Was the ONOS application properly modified?
    I developed this simple application that implements a "basic pipeline", which simply handles LLDP, ARP and forwarding up to layer 4 as provided by the ONOS Reactive Forwarding application: https://github.com/Dscano/Basic-tna. Maybe it could help you.

  3. My setup is an APS switch with Ubuntu server 20.0 + SDE 9.7.0 + Stratum v22.03. I start Stratum by running this command: sudo stratum_bfrt -bf_sde_install /home/tofino/SDE-9.7.0/bf-sde-9.7.0/install -bf_switchd_cfg /usr/share/stratum/tofino_skip_p4.conf -bf_switchd_background=false -enable_onlp=false -chassis_config_file ./chassis_config.pb.txt -experimental_enable_p4runtime_translation -incompatible_enable_bfrt_legacy_bytestring_responses

  4. From your log, Stratum starts properly. But, if I remember well, the CPU_PORT is not displayed in the pm show output. Indeed, this port should be the PCIe port used to connect the ASIC to the motherboard/CPU, i.e., always on. So I strongly believe that your CPU_PORT value is wrong. Moreover, the RX counter increases because the interface receives packets, but that does not mean that the packets correctly hit the flow rules configured in the tables within the pipeline. Indeed, if everything is working, you should receive a packet on port A and forward it on port B, and you should see the RX counter on port A increase and the TX counter on port B increase. In your case this happens only for port 64.
