[TWAMP] Add TWAMP Light feature HLD #1320


huseratgithub
Contributor

@huseratgithub huseratgithub commented Apr 10, 2023

This document provides general information about the TWAMP Light feature implementation in SONiC.
This feature is tracked in #1192.

| Repo | PR title |
| --- | --- |
| sonic-swss | TWAMP Light orchagent implementation |
| sonic-swss-common | Added TWAMP Light table to schema |
| sonic-sairedis | Support TWAMP Light notification in syncd |
| sonic-utilities | Added CLIs to support TWAMP Light |
| SAI | Added TWAMP Light API |

Signed-off-by: Xiaodong Hu <huxd@centec.com>
@guxianghong guxianghong force-pushed the devpr_huxd_sonic-net_twamp_20230410 branch from b7a2e5b to a86b85a Compare May 18, 2023 09:38
@zhangyanzhao
Collaborator

@huseratgithub can you please update the HLD by using community HLD template https://github.com/sonic-net/SONiC/blob/master/doc/hld_template.md, there are mandatory sections to cover. Thanks.

@huseratgithub
Contributor Author

@zhangyanzhao I am updating it and will submit it over the weekend.

@huseratgithub huseratgithub reopened this Jun 8, 2023
@huseratgithub huseratgithub force-pushed the devpr_huxd_sonic-net_twamp_20230410 branch 2 times, most recently from 56402b3 to 14df416 Compare June 10, 2023 12:51
@huseratgithub
Contributor Author

> @huseratgithub can you please update the HLD by using community HLD template https://github.com/sonic-net/SONiC/blob/master/doc/hld_template.md, there are mandatory sections to cover. Thanks.

@zhangyanzhao I have updated the HLD to follow the template. Please help to check and review it, thanks.

@huseratgithub huseratgithub force-pushed the devpr_huxd_sonic-net_twamp_20230410 branch from 345d3cd to ce693d3 Compare June 13, 2023 01:42
@huseratgithub huseratgithub force-pushed the devpr_huxd_sonic-net_twamp_20230410 branch from ce693d3 to d2dff4b Compare June 13, 2023 06:53
@zhangyanzhao
Collaborator

This was reviewed in the SONiC community. Please let me know if you want to be a reviewer for this feature. Thanks.

Signed-off-by: huxd <huxd@centec.com>
@huseratgithub
Contributor Author

huseratgithub commented Jun 21, 2023

Answer questions raised during the review meeting:

  1. How do you determine whether to use the HW or SW TWAMP solution? >> Determined based on ASIC capability
    In phase 1, it is based on the ASIC capability. Twamporch can query the TWAMP Light capability through the SAI API during system initialization.

  2. Do you have CoPP configuration support for the SW-based TWAMP solution? >> No
    No, we will add a new trap for the SW solution, which is planned for phase 2.

  3. In SONiC's HW-based mechanism, all counters are updated through flex counters before they are written to the counters DB. Does the design follow the same mechanism?
    No, the ASIC actively reports the measurement data at the end of each statistical period. The benefit is that the performance measurement data is reported in time, and the software does not need to synchronize its statistical cycle with the hardware.

  4. Can the TWAMP polling interval be configurable? How can it be done? Let's update the HLD section.
    Yes, when creating a TWAMP Light session, configure it with the parameter statistics_interval. The polling interval determines how frequently the counters DB is updated. It is described in the configuration commands section.

  5. Who will be the consumers of the TWAMP session state updated in STATE_DB? >> Can you share some reference applications that consume this state?
    The show command reads the TWAMP session state from STATE_DB. The state indicates the running state of the TWAMP Light session.

  6. Since this feature depends on ASIC support, let's control it with a compile-time flag, as with sFlow.
    OK, we will add a compilation flag to control the compilation of the TWAMP Light feature.

  7. Can you share a few sample use cases for the TWAMP feature?
    High latency, jitter, packet loss, and a worsening user experience are just a few of the issues to be expected when links are not performing as expected or have been configured incorrectly.

  8. How do you trap the TWAMP session packets to the CPU? Do you have CoPP rules enforced or not? Let's describe this in detail in the HLD.
    An ACL entry matching TWAMP-Test packet fields such as ip and udp_port will be installed to the ASIC when creating a TWAMP Light session. We will also add a new host interface trap for TWAMP-Test packets, and this trap can be used in CoPP rules.

  9. Let's have a CLI or user control for selecting the HW- or SW-based TWAMP solution. For example, the HW solution doesn't need a SW TWAMP container at all.
    OK, we will add a new CLI to choose between the HW and SW solutions in phase 2.

  10. How are you offloading the TWAMP session to hardware? Doesn't it use an APP DB?
    It does not use APP_DB, because we think all TWAMP Light session attributes can be configured in CONFIG_DB by CLI, and we want the TWAMP Light module to remain independent.

  11. Does it support configuring a wide-range interval for TWAMP counters?
    We will add a wide-range interval for TWAMP Light counters.
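
As a rough illustration of answers 1, 4, and 10, the sketch below shows how a session carrying a statistics_interval might be flattened into a CONFIG_DB-style entry, and how the phase 1 backend choice could follow from an ASIC capability flag. Every name here is an assumption made for illustration (the `TwampLightSession` class, the `TWAMP_SESSION` key prefix, the field names, and the boolean capability flag); the HLD and the code PRs define the real schema and the real SAI query.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TwampLightSession:
    """Hypothetical session attributes; not the schema from the HLD."""
    name: str
    src_ip: str
    dst_ip: str
    udp_port: int
    statistics_interval: int  # unit is an assumption; the HLD defines the real one

    def validate(self) -> None:
        if not 1 <= self.udp_port <= 65535:
            raise ValueError(f"udp_port out of range: {self.udp_port}")
        if self.statistics_interval <= 0:
            raise ValueError("statistics_interval must be positive")


def to_config_db_entry(session: TwampLightSession) -> Tuple[str, Dict[str, str]]:
    """Flatten a session into a (key, fields) pair of strings.

    The "TWAMP_SESSION|<name>" key format is assumed for illustration.
    """
    session.validate()
    fields = {k: str(v) for k, v in asdict(session).items() if k != "name"}
    return f"TWAMP_SESSION|{session.name}", fields


def select_backend(asic_supports_twamp_light: bool) -> str:
    """Phase 1 rule from answer 1: use hardware when the ASIC reports support.

    The boolean stands in for the result of the SAI capability query.
    """
    return "hardware" if asic_supports_twamp_light else "software"
```

For example, `to_config_db_entry(TwampLightSession("s1", "10.0.0.1", "10.0.0.2", 862, 5000))` yields a key and a string-valued field map, while a session with an invalid port or a non-positive interval is rejected before anything is written.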

Community review recording: https://zoom.us/rec/share/KMpGzdM5XUr3cfO7jfIWmgtgONfyYKA_FZXK_nIB_jZFWj2GtodlReuVH73W0tNs.K32lR8TWbGckHUFO
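
The push model described in answer 3 can be pictured with a small sketch: instead of software polling counters on its own timer, a handler is invoked once per statistical period with that period's results and folds them into running totals. The class, method, and counter names below are hypothetical and only illustrate the idea; they are not taken from twamporch or the SAI notification.

```python
from typing import Dict


class PushedSessionStats:
    """Accumulates per-period reports pushed by the ASIC (illustrative only)."""

    def __init__(self) -> None:
        self._totals: Dict[str, Dict[str, int]] = {}

    def on_period_report(self, session: str, tx_packets: int, rx_packets: int) -> None:
        """Handle one report delivered at the end of a statistical period."""
        totals = self._totals.setdefault(
            session, {"tx_packets": 0, "rx_packets": 0, "periods": 0}
        )
        totals["tx_packets"] += tx_packets
        totals["rx_packets"] += rx_packets
        totals["periods"] += 1

    def snapshot(self, session: str) -> Dict[str, int]:
        """Return a copy of the totals plus the derived packet loss."""
        totals = dict(self._totals[session])
        totals["lost_packets"] = totals["tx_packets"] - totals["rx_packets"]
        return totals
```

Because each report already covers exactly one hardware period, the software side needs no timer of its own, which is the synchronization benefit the answer mentions.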

@guxianghong
Collaborator

@zhangyanzhao We have invited @clarklee-guizhao to review this PR.
Could you please add @clarklee-guizhao to the reviewer list? Thanks very much.

@huseratgithub
Contributor Author

@zhangyanzhao @yxieca The comments have been addressed. Can you please help to approve/merge?

@zhangyanzhao
Collaborator

Can you please help to add the code PR into this HLD by referring to #806 ? That is required before merging this HLD PR. Thanks.

@huseratgithub
Contributor Author

huseratgithub commented Oct 12, 2023

> Can you please help to add the code PR into this HLD by referring to #806 ? That is required before merging this HLD PR. Thanks.

The code PRs have been added. Could you please help to review/merge? Thanks

Collaborator

@clarklee-guizhao clarklee-guizhao left a comment


LGTM

@huseratgithub
Contributor Author

@zhangyanzhao @yxieca Could you please help to approve/merge? Thanks.

@huseratgithub
Contributor Author

@clarklee-guizhao Thanks a lot for your review.

prsunny pushed a commit to sonic-net/sonic-swss-common that referenced this pull request Nov 17, 2023
[schema][twamp] Add twamp light table to schema
[TWAMP] Add TWAMP Light feature HLD sonic-net/SONiC#1320
@zhangyanzhao zhangyanzhao merged commit 4abfa5f into sonic-net:master Feb 5, 2024
@zhangyanzhao
Collaborator

Reviewer approved; merged.

@prsunny
Contributor

prsunny commented Feb 8, 2024

@huseratgithub , where is the sonic-mgmt test for this feature?

@huseratgithub
Contributor Author

> @huseratgithub , where is the sonic-mgmt test for this feature?

Hi @prsunny, the sonic-mgmt test is an enhancement; let's add it in a later phase.

prsunny pushed a commit to sonic-net/sonic-swss that referenced this pull request Apr 1, 2024
* [orchagent] TWAMP Light orchagent implementation. (#2927)
* What I did
Implemented the TWAMP Light feature according to the SONiC TWAMP Light HLD(sonic-net/SONiC#1320).
cscarpitta pushed a commit to cscarpitta/sonic-swss that referenced this pull request Apr 5, 2024
* [orchagent] TWAMP Light orchagent implementation. (sonic-net#2927)
* What I did
Implemented the TWAMP Light feature according to the SONiC TWAMP Light HLD(sonic-net/SONiC#1320).
superchild pushed a commit to superchild/sonic-swss that referenced this pull request Apr 24, 2024
* Fixes mock test failure

* Fixes mock test run failure

fixes pipeline run failure

FAIL: p4orch_tests_usan
=======================

../../../orchagent/vrforch.cpp:113:41: runtime error: member call on
null pointer of type 'struct RouteOrch'
../../../orchagent/vrforch.cpp:113:41: runtime error: member access
within null pointer of type 'struct RouteOrch'
FAIL p4orch_tests_usan (exit status: 139)

* Fixed orchagent crash in VM with the Qos BUFFER_QUEUE|system-port|Queue-id-range config (sonic-net#3050)

* Fixed orchagent crash in VM with the Qos BUFFER_QUEUE|system-port|Queue-id-range config

* [intfsorch] Enable ipv6 proxy ndp along with proxy arp (sonic-net#3045)

* [intfsorch] Enable ipv6 proxy ndp along with proxy arp

setting SAI_VLAN_ATTR_UNKNOWN_MULTICAST_FLOOD_CONTROL_TYPE to
SAI_VLAN_FLOOD_CONTROL_TYPE_NONE when proxy arp is enabled. This fixes a
bug where ipv6 NS packets were flooding ports with duplicate packets. We
now set multicast flood type to none.

* Fix multi VLAN neighbor learning (sonic-net#3049)

What I did

When adding a new neighbor, check if the neighbor IP has already been learned on a different VLAN. If it has, remove the old neighbor entry before adding the new one.

Why I did it
On Gemini devices, if a neighbor IP moves from an active port in one VLAN to a second VLAN, then back to the first VLAN (with 3 different MAC addresses), orchagent will crash. Even though the MAC address of the last move is different from the first MAC address, orchagent believes the last MAC address to already be programmed in the hardware and tries to set an attribute of the entry which doesn't exist.

* [asan] Disable the "maybe-uninitialized" warning when compiled with ASAN enabled.

* Set HOST_TX_READY_NOTIFY attribute only after query capabilities(sonic-net#3070)

*Set HOST_TX_READY_NOTIFY attribute only after query capabilities

* [EVPN] Skip EVPN routes with invalid VNI or router mac field (sonic-net#3073)

* Skip EVPN routes with invalid VNI or router mac field

* Add port flap count and last flap timestamp to APPL_DB (sonic-net#3052)

* Add port flap count and last flap timestamp

* Add basic fabric link monitoring counters and states handling. (sonic-net#2988)

* Add basic fabric link monitoring counters and states handling.

* [Mellanox] Fix inconsistence in the shared headroom pool initialization (sonic-net#3057)

* Fix inconsistence in the shared headroom pool initialization

* Why I did it

During initialization, if SHP is enabled:

  1. the buffer pool sizes and xoff are initialized to 0, which means SHP is disabled
  2. but the buffer profiles already indicate SHP
  3. later on, the buffer pool sizes are updated with xoff being non-zero

If orchagent starts handling buffer configuration between steps 2 and 3, the buffer pools and profiles are inconsistent, which fails the Mellanox SAI sanity check.
To avoid this, SHP is indicated as enabled by setting very small buffer pool and SHP sizes.

* [acl] Add IN_PORTS qualifier for L3 table (sonic-net#3078)

* Apply IN_PORTS qualifier for L3 table

Why I did it
IN_PORTS qualifier was allowed for L3 table in 202012 release and below. Changes in sonic-net#1982 removed that support leading to regression in some of our testcases. The following error was observed
ERR swss#orchagent: :- validateAclRuleMatch: Match SAI_ACL_ENTRY_ATTR_FIELD_IN_PORTS in rule RULE_1 is not supported by table DATAACL

* [bulker] add support for neighbor bulking (sonic-net#2768)

Adding support for sai_neighbor_api_t bulking in bulker.h

* [buffermgrd] Move switch-statement outside of if-statement in BufferMgr::doTask (sonic-net#3055)

* [buffermgr] Moved switch statement outside of if-statement in Buffermgr::doTask

The switch statement which would normally erase buffer events was moved
to be inside the if-statement which would only enter if the event is a
SET event. This was introduced in commit e5329c39.

This would cause an infinite loop, since non-set events would never be
erased.

The switch statement has now been moved to occur outside the if,
allowing for non-set commands to be processed.

* [portsorch] process only updated APP_DB fields when port is already   created (sonic-net#3025)

* [portsorch] process only updated APP_DB fields when port is already created

What I did

Fixing an issue when setting some port attribute in APPL_DB triggers serdes parameters to be re-programmed with port toggling. Made portsorch to handle only those attributes that were pushed to APPL_DB, so that serdes programming happens only by xcvrd's request to do so.

* [Copp]Refactor coppmgr tests (sonic-net#3093)

What I did
Refactoring coppmgr mock tests

Why I did it
After migration to bookworm, coppmgr tests started failing due to the use of sudo commands.

* Revert "[acl] Add IN_PORTS qualifier for L3 table (sonic-net#3078)" (sonic-net#3092)

This reverts commit 9d4a3ad.
*Revert "[acl] Add IN_PORTS qualifier for L3 table"

* [orchagent] TWAMP Light orchagent implementation (sonic-net#2927)

* [orchagent] TWAMP Light orchagent implementation. (sonic-net#2927)
* What I did
Implemented the TWAMP Light feature according to the SONiC TWAMP Light HLD(sonic-net/SONiC#1320).

* Clang format change. (sonic-net#3080)

What I did
This PR has no real code change. It is purely clang formatting. It only applies to the P4Orch codes.
Commands that I run:
find orchagent/p4orch -name '*.h' -o -name '*.cpp' | xargs clang-format -i -style="{BasedOnStyle: Microsoft, DerivePointerAlignment: false}"

find orchagent -name response_publisher -o -name return_code.h | xargs clang-format -i -style="{BasedOnStyle: Microsoft, DerivePointerAlignment: false}"

* T2-VOQ-VS: Fix iBGP bringup issue  (sonic-net#3053)

* Fix iBGP bringup issue T2-vswitch
* On T2-VOQ chassis emulation with multi-asic linecards, iBGP sessions don't come up. Related Issue: sonic-net/sonic-buildimage#18129

* [Fdbsyncd] Adding extern_learn flag with fdb entry so Kernel doesn't age out (sonic-net#2985)

* Adding extern_learn flag with fdb entry so that Kernel doesn't age out the MAC

* [Fdbsyncd] Adding extern_learn flag with fdb entry so Kernel doesn't age out

What I did
extern_learn flag is added while programming the fdb entry into the Kernel. This will make sure that kernel doesn't age out the fdb entry. (#15004)

How I did it
A flag extern_learn will be passed while programming the fdb entry. (#15004)

How to verify it
Tested MAC add/del to the Kernel from the local FDB entry. (#15004)

Signed-off-by: kishore.kunal@broadcom.com

---------

Signed-off-by: kishore.kunal@broadcom.com
Co-authored-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>

* Fix oper FEC retrieval after warmboot (sonic-net#3100)

Updating oper FEC status in state_db after warm-reboot as part of refresh port status call

* [EVPN]Fix fpmsyncd crash when EVPN type5 is received with bgp fib suppression enabled (sonic-net#3101)

* [EVPN]Fix fpmsyncd crash when EVPN type5 is received with bgp fib suppression enabled

* [portsorch] Handle TRANSCEIVER_INFO table on warm boot (sonic-net#3087)

* Add existing data from TRANSCEIVER_INFO table

* Introduce a new role for DPU-NPU Interconnect

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Co-authored-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>

* [p4orch] Clang format change. (sonic-net#3096)

What I did
[p4orch]  This PR has no real code change. It is purely clang formatting. 
It does the same as sonic-net#3080.

* [dash] fix ENI admin state update (sonic-net#3081)

* [dash] fix ENI admin state update

* Add force option for fabric port unisolate command (sonic-net#3089)

What I did
Add force option to the unisolate link command, so users can make the links not isolate if they want.
depends on sonic-net/sonic-buildimage#18447

* [twamporch] Explicitly initialize local variable (sonic-net#3115)

What I did
Explicitly initialized local variable.

Why I did it
We met below error message in sonic-buildimage armhf build (sonic-net/sonic-buildimage#18334)

* Add bookworm build to the PR checkers (sonic-net#3114)

What I did
Add a Bookworm build to the PR checkers. Also fix some Bookworm build errors that crept in.

Why I did it
Buildimage now builds swss for Bookworm, so the build needs to succeed.

* [ACL] Remove flex counter when updating ACL rule (sonic-net#3118)

What I did
This PR is to fix sonic-net/sonic-buildimage#18719

When ACL rule is created for the first time, a flex counter is created and registered. When the same ACL rule is being updated, the FlexCounter created before is not removed, and another FlexCounter is created and registered.

Why I did it
Fix the issue that FlexCounter is duplicated when updating existing ACL rule.

---------

Signed-off-by: kishore.kunal@broadcom.com
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Co-authored-by: saksarav-nokia <sakthivadivu.saravanaraj@nokia.com>
Co-authored-by: Nikola Dancejic <26731235+Ndancejic@users.noreply.github.com>
Co-authored-by: Lawrence Lee <lawlee@microsoft.com>
Co-authored-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Co-authored-by: noaOrMlnx <58519608+noaOrMlnx@users.noreply.github.com>
Co-authored-by: Lior Avramov <73036155+liorghub@users.noreply.github.com>
Co-authored-by: Prince George <45705344+prgeor@users.noreply.github.com>
Co-authored-by: jfeng-arista <98421150+jfeng-arista@users.noreply.github.com>
Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
Co-authored-by: Neetha John <nejo@microsoft.com>
Co-authored-by: Amir <mazora@marvell.com>
Co-authored-by: Stepan Blyshchak <38952541+stepanblyschak@users.noreply.github.com>
Co-authored-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
Co-authored-by: xiaodong hu <32903206+huseratgithub@users.noreply.github.com>
Co-authored-by: mint570 <70396898+mint570@users.noreply.github.com>
Co-authored-by: Deepak Singhal <115033986+deepak-singhal0408@users.noreply.github.com>
Co-authored-by: KISHORE KUNAL <64033340+kishorekunal01@users.noreply.github.com>
Co-authored-by: Vivek <vivekreddykarri98@gmail.com>
Co-authored-by: Yakiv Huryk <62013282+Yakiv-Huryk@users.noreply.github.com>
Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
Co-authored-by: bingwang-ms <66248323+bingwang-ms@users.noreply.github.com>
@zhangyanzhao
Collaborator

@eddieruan-alibaba can you please help to find someone in Alibaba to complete the code PR review? Thanks.

@goomadao

Hi @huseratgithub, great feature for us. But do you have any plan for the software solution of TWAMP Light? There are few ASICs supporting TWAMP offload AFAIK, and we're considering offloading only timestamping to the ASIC.

@guxianghong
Collaborator

> Hi @huseratgithub, great feature for us. But do you have any plan for the software solution of TWAMP Light? There are few ASICs supporting TWAMP offload AFAIK, and we're considering offloading only timestamping to the ASIC.

Yes, after completing the phase 1 hardware-based TWAMP Light, we will immediately start the phase 2 software-based TWAMP Light, which has already been considered in the TWAMP Light HLD.

@zhangyanzhao
Collaborator

One code PR is still open; moving to backlog.
