forked from sonic-net/sonic-buildimage
Update master #1
Merged
Signed-off-by: Guohan Lu <lguohan@gmail.com>
platform.json and hwsku.json files are required for a feature called Dynamic Port Breakout.

Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
* 3b330db4a 2021-01-18 | [build]: Fix build error when compiling for armhf (32-bit) (#30) (HEAD, origin/master, origin/HEAD, master) [dflynn-Nokia]
* 56aaa225b 2021-01-16 | [ci]: add pipeline for armhf and arm64 (#29) [lguohan]
* 90da6141c 2021-01-12 | [ci]: propagate the correct error code the next step (#27) [lguohan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
[ci]: download artifacts from master branch (#768)
Do not create fabric port if mapping is not available (#769)
[syncd] Comparison logic log also current attr value on set operation (#763)
Add fabric port test to vslib (#737)
[ci]: use sonicbld pool (#766)
[tests] Remove exit command blocking all tests to run (#765)
[vslib]: adapt macsec sai 1.7.1 (#755)
Add support for SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY needed by CRM (#756)

Signed-off-by: Danny Allen <daall@microsoft.com>
- [route_check.py] - update includes checks on subscriptions (sonic-net/sonic-utilities#1344)
- Validations checks while adding a member to PortChannel and removing a member from a Portchannel (sonic-net/sonic-utilities#1328)
- [show] Add subcommand to show midplane status for modular chassis (sonic-net/sonic-utilities#1267)
- [pytest][qos][config] Added pytests for "config qos reload" commands" (sonic-net/sonic-utilities#1346)
- Drop explict 3 seconds pause between two object updates/deletes. (sonic-net/sonic-utilities#1359)
- [show]fix for show muxcable status by replacing "hostname" to "peer_switch" for deriving tor ipv4_address (sonic-net/sonic-utilities#1360)
- [PFCWD] Fix 'start' pfcwd command (sonic-net/sonic-utilities#1345)

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
**- Why I did it** swsssdk will be deprecated. Migrate sonic-config-engine to use libswsscommon library instead **- How to verify it** Unit test
…th prefix to fpmsyncd. (#4806)

Currently FRR sends the prefix with VNI information to fpmsyncd. This PR allows FRR to send the RMAC with the EVPN Type5 prefix to fpmsyncd. This is a temporary fix: the patch will be removed once neighorch is ready to handle the prefix and ARP (containing the RMAC) separately.
* support reproducible build for git clone

Signed-off-by: shilongliu <shilongliu@microsoft.com>

* fix

Co-authored-by: shilongliu <shilongliu@microsoft.com>
Changes in this update:
37695c8 [show]: Use TCP Connection For Muxcable Commands (#1371)
8119ba2 Validations checks while creating and deleting a Portchannel (#1326)
3df267e [config] Fix Breakout mode option and BREAKOUT_CFG table check method (#1270)
9bd709b [show] Fix show arp in case with FDB entries, linked to default VLAN (#1357)
bc2d27e [generate_dump]: fix syntax error

Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Including below commits:
36f7332 2021-01-14 | modified ERR log to NOTICE log for FDB notification failure after VLAN delete (#1595) [madhanmellanox]
c21c883 2021-01-12 | [ci]: download artifacts from master branch build (#1597) [lguohan]
a1d03a4 2021-01-12 | [fgnhgorch] Match mode changes for Fine Grained ECMP (#1565) [anish-n]
1b65f3d 2021-01-12 | [ci]: use sonicbld pool (#1594) [lguohan]
48ae866 2021-01-12 | [pfcwd] Update PFC storm detection logic for Mellanox platforms (#1586) [Volodymyr Samotiy]
850001f 2021-01-12 | [FPMSYNCD] Evpn/Vxlan related changes to support FRR7.5(#1585) [KISHORE KUNAL]
64ca9bb 2021-01-12 | [ci]: only copy artifacts when build is successful (#1590) [lguohan]
17d0dae 2021-01-11 | [Fdborch] Fix for arm compilation (#1592) [Prince Sunny]
693a02c 2021-01-08 | [gearbox] Add support for "hwinfo" field (#1547) [Baptiste Covolato]
7e3b2c6 2021-01-09 | [Evpn Warmreboot] Added Dependancy check logic in VrfMgr (#1466) [nkelapur]
a960e2e 2021-01-09 | [Orchagent]: FdbOrch changes for EVPN VXLAN (#1275) [Pankaj Jain]
097cfda 2021-01-08 | [swss test] update setup guide for swss tests (#1582) [Ying Xie]
b42253a 2021-01-05 | Fix for armhf build (#1580) [Qi Luo]
d8c1465 2021-01-05 | [dvs] Update/disable DVS tests with new FRR 7.5 behavior (#1579) [Danny Allen]
f6c7422 2021-01-05 | ASIC internal temperature sensors support (#1517) [Santhosh Kumar T]
0aa9ef2 2021-01-01 | Simply by auto iterator type, because we will tune the return types of library functions (#1577) [Qi Luo]
773238b 2020-12-31 | [build]: Fix format string for size_t (#1576) [Qi Luo]
7ba4e43 2020-12-30 | [fgnhgorch] add warm reboot support for fgnhg (#1538) [weixchen1215]
4cf6617 2020-12-30 | [ci]: add build for arm64 and armhf (#1572) [lguohan]
6ebc0ed 2020-12-29 | [ci]: add azure-pipeline for amd64 (#1571) [lguohan]
e32b9d0 2020-12-29 | [FDBSYNCD] Added pytest for fdbsyncd (#1560) [KISHORE KUNAL]
a43f6be 2020-12-30 | [crm] Add support for snat, dnat and ipmc crm resources (#1511) [Prabhu Sreenivasan]
7fc3888 2020-12-29 | PY Test script for EVPN L3 VxLAN (#1330) [Tapash Das]
6eb36d9 2020-12-27 | vlanmgr changes related to EVPN VxLan warmboot (#1460) [anilkpan]

Signed-off-by: Guohan Lu <lguohan@gmail.com>
Update in this change: 640a218 [packaging]: Add Support For Libboost v1.71.0 (#449) signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
)
- Why I did it
Initially, we used Monit to monitor critical processes in each container. If one of the critical processes was not running or had crashed for some reason, Monit wrote an alerting message into syslog periodically. Whenever a new process was added to a container, the corresponding Monit configuration file also had to be updated, which made maintenance harder. We now use the event listener of Supervisord to do this monitoring. Since the processes in each container are already managed by Supervisord, we only need to focus on the monitoring logic.
- How I did it
We use the event listener of Supervisord to monitor critical processes in containers. When it is notified that one of the critical processes exited unexpectedly, the event listener takes the following steps:
The event listener first checks whether the auto-restart mechanism is enabled for this container.
If the auto-restart mechanism is enabled, the event listener kills the Supervisord process, which should cause the container to exit and subsequently get restarted.
If the auto-restart mechanism is not enabled for this container, the event listener enters a loop which first sleeps 1 minute and then checks whether the process is running. If yes, the event listener exits. If no, an alerting message is written into syslog.
- How to verify it
First, check whether the auto-restart mechanism of a container is enabled by running the command show feature status. If it is enabled, select one critical process and kill it manually, then check whether the container is restarted.
Second, disable the auto-restart mechanism, if it was enabled at step 1, by running the command sudo config feature autorestart <container_name> disabled. Then select one critical process and kill it. After that, the alerting message should appear in syslog every minute.
- Which release branch to backport (provide reason below if selected) 201811 201911 [x ] 202006
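The decision flow described above can be sketched in Python. This is a minimal illustration, not the listener shipped in the PR: `handle_critical_exit`, `ListenerActions`, the `is_running` callback and the `max_checks` bound are all hypothetical names introduced here, and the real side effects (killing Supervisord, writing to syslog) are only recorded, not performed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ListenerActions:
    """Records what the listener decided to do (stand-in for real side effects)."""
    killed_supervisord: bool = False
    alerts: List[str] = field(default_factory=list)


def handle_critical_exit(
    process_name: str,
    container_name: str,
    autorestart_enabled: bool,
    is_running: Callable[[str], bool],
    sleep: Callable[[float], None] = lambda seconds: None,
    max_checks: int = 3,
) -> ListenerActions:
    """Hypothetical sketch of the steps taken after a critical process exits."""
    actions = ListenerActions()

    # Auto-restart enabled: kill Supervisord so the container exits and restarts.
    if autorestart_enabled:
        actions.killed_supervisord = True
        return actions

    # Auto-restart disabled: sleep one minute, then check the process.
    # The description implies an unbounded loop; max_checks bounds it for the sketch.
    for _ in range(max_checks):
        sleep(60)
        if is_running(process_name):
            break  # process is back, the listener exits
        actions.alerts.append(
            f"Process '{process_name}' is not running in container '{container_name}'"
        )
    return actions
```

Passing `time.sleep` as `sleep` gives the one-minute cadence; the no-op default only keeps the sketch fast to exercise.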
Meet the requirement for the MUX_CABLE table that IPv6 loopbacks have a /128 prefix.

Note that this change only affects the MUX_CABLE table; all other tables continue to use the loopback address provided in minigraph.

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
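As a rough illustration of the /128 requirement, the normalization can be written with Python's standard `ipaddress` module. The helper name `mux_cable_loopback` and the sample addresses are assumptions made for this sketch; the actual change lives in the minigraph parsing code and may look different.

```python
import ipaddress


def mux_cable_loopback(addr: str) -> str:
    """Hypothetical helper: force a /128 prefix on IPv6 loopbacks only."""
    iface = ipaddress.ip_interface(addr)
    if iface.version == 6:
        # Keep the address, replace whatever prefix length was given with /128.
        return f"{iface.ip}/128"
    # IPv4 addresses are returned unchanged in this sketch.
    return addr
```

For example, `mux_cable_loopback("fc00:1::32/64")` yields `fc00:1::32/128`, while an IPv4 value is passed through as given.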
**- Why I did it** Ledd is the last daemon that is not enabled to run in python3. Even though there is a plan to deprecate this daemon and replace it with something else, this is one simple step toward python2 deprecation. **- How I did it** Changed the `command=` line for `ledd` in the `supervisord` configuration of `pmon`. Copied what was done for other daemons. **- How to verify it** Booting a product that has a `led_control.py` should now show ledd running in python3. I ran `python3 -m pylint` on all `led_control.py` plugins, which means that most of them should be python3 compliant. There is however still a risk that some might not work.
Configure BBR for a peer-group if its name starts with (is prefixed by) the string defined in constants.yml, instead of requiring an exact string match.
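The difference between the two matching rules can be shown in a few lines of Python. The function names and the peer-group names used here are hypothetical; only the idea of prefix matching against names configured in constants.yml comes from the commit message.

```python
from typing import Iterable


def bbr_applies_exact(peer_group: str, configured: Iterable[str]) -> bool:
    """Old behaviour (sketch): the peer-group name must equal a configured name."""
    return peer_group in set(configured)


def bbr_applies_prefix(peer_group: str, configured: Iterable[str]) -> bool:
    """New behaviour (sketch): the peer-group name must start with a configured name."""
    return any(peer_group.startswith(prefix) for prefix in configured)
```

With a configured name of `PEER_V4`, a peer-group called `PEER_V4_INT` is rejected by the exact match but accepted by the prefix match.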
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan arlakshm@microsoft.com
- Why I did it
This PR adds support for a separate swss.rec and sairedis.rec for each ASIC. The logrotate script is updated as well.
- How I did it
Updated the orchagent.sh script to use the logfile name options introduced in these PRs (sonic-net/sonic-swss#1546 and sonic-net/sonic-sairedis#747). On multi-ASIC platforms the record files are different for each ASIC, with the format swss.asic{x}.rec and sairedis.asic{x}.rec.
Updated the logrotate script for multi-ASIC platforms.
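The naming scheme can be sketched as a small Python helper. The real logic is in the orchagent.sh shell script; `record_file_names` and its `asic_id` parameter are hypothetical and only mirror the file name formats quoted above.

```python
from typing import Optional, Tuple


def record_file_names(asic_id: Optional[int] = None) -> Tuple[str, str]:
    """Return (swss record file, sairedis record file) for one ASIC.

    asic_id=None stands for a single-ASIC platform in this sketch.
    """
    if asic_id is None:
        return ("swss.rec", "sairedis.rec")
    return (f"swss.asic{asic_id}.rec", f"sairedis.asic{asic_id}.rec")
```

For ASIC 1 this gives `swss.asic1.rec` and `sairedis.asic1.rec`, so the logrotate configuration has to match those per-ASIC names as well.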
Fixes #6445 Because the ipmihelper.py script in the 9332 folder is slightly different than the common one (due to LGTM fixes), when the common one gets copied during build time it causes the workspace/build to become dirty. Signed-off-by: Danny Allen <daall@microsoft.com>
Submodule changes to be committed:
* src/sonic-platform-daemons 81318f7...e72f6cd (3):
  > [ledd] Minor refactor; add unit tests (#143)
  > [thermalctld] Report unit test coverage (#141)
  > [psud] Increase unit test coverage (#140)
process-reboot-cause script should be executable.
Installing the newest buster version of libboost (v1.71) in the build docker.

Signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
To view unit test coverage of sonic-host-services package upon build
…annels. (#6537)

The Portchannels were not getting cleaned up because the cleanup took more than 10 seconds, which is the default docker timeout after which a SIGKILL is sent.

Fixes #6199
To check if it works out for this issue in 201911 ? #6503

This issue is seen significantly more often in the master branch than in 201911 because the Portchannel cleanup takes more time in master.

Test on a DUT with 8 Port Channels.

master
admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
real 0m15.599s
user 0m0.061s
sys 0m0.038s

Sonic 201911.v58
admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
real 0m5.541s
user 0m0.020s
sys 0m0.028s
There was a mismatch between the Eeprom class method names and the methods called from the Eeprom class.

Signed-off-by: Antonina Melnyk antoninax.melnyk@intel.com
platform.json and hwsku.json files do not have a full set of speeds for split modes. Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
platform.json and hwsku.json files do not have a full set of speeds for split modes. Signed-off-by: Vadym Hlushko <vadymh@nvidia.com>
Signed-off-by: Guohan Lu <lguohan@gmail.com>
The Accton util uses lsmod to check whether drivers are installed. However, lsmod may return an error on startup, which causes module installation to be skipped.

Signed-off-by: roy_lee <roy_lee@edge-core.com>
Signed-off-by: Guohan Lu <lguohan@gmail.com>
- [ci]: add default values to build template
- [ci]: add marvel armhf official build

Signed-off-by: Guohan Lu <lguohan@gmail.com>
* [device] add platform.json hwsku.json for Montara
* [device] add autoneg, fec fields to hwsku
Signed-off-by: Guohan Lu <lguohan@gmail.com>
**- Why I did it** After migrating to python3, the operator '/' always returns a float result, whereas in python2 it returns an integer result for integer operands. This needs to be fixed in thermal_conditions. **- How I did it** 1. Cast the float value to int 2. Changed the unit test case to cover this situation **- How to verify it** Manual test and regression test
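The behaviour change is easy to reproduce in plain Python 3. The snippet below is a generic illustration; `average_temperature` is a hypothetical helper and not the code in thermal_conditions.

```python
# Python 3: "/" is true division and returns a float, even for two ints.
ratio = 7 / 2             # float
floored = 7 // 2          # floor division keeps an int
truncated = int(7 / 2)    # casting the float truncates toward zero


def average_temperature(readings):
    """Hypothetical helper: integer average, using the cast-to-int approach."""
    return int(sum(readings) / len(readings))
```

Note that `int(x / y)` truncates toward zero while `x // y` rounds down, so the two differ for negative results.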
proper bool comparison Signed-off-by: Guohan Lu <lguohan@gmail.com>
Signed-off-by: Guohan Lu <lguohan@gmail.com>
restructure repo clean up to make it robust Signed-off-by: Guohan Lu <lguohan@gmail.com>
… IN_PORTS for TD3 (#6718)

Setting an attribute on an ACL entry updates all the entries in the table. The correct behavior is to set the attribute on a single entry.
- How I did it
The current SDK code, while setting the new attribute, goes through all the entries and updates them. Added logic to check for the requested entry and only apply the update to that ACL entry. A case has been filed with BRCM. Once an official fix is provided by BRCM, we will remove this in-house fix and apply the official fix.
- Why I did it
To move 'sonic-host-service', which is currently built as a separate package, to the 'sonic-host-services' package.
- How I did it
- Moved 'sonic-host-server' to 'src/sonic-host-services' and included it as part of the python3 wheel.
- Other files were moved to 'src/sonic-host-services-data' and included as part of the deb package.
- Changed build option 'INCLUDE_HOST_SERVICE' to 'ENABLE_HOST_SERVICE_ON_START' for enabling sonic-hostservice at boot-up by default.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
DavidZagury pushed a commit that referenced this pull request Feb 23, 2021
…ebian (sonic-net#6114)

Sonic devices advertise meaningful system description along with Debian package information.

before the fix:
-------------
admin@sonic:~$ show lldp neighbors
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: Ethernet0, via: LLDP, RID: 3, Time: 0 day, 16:36:30
SysName: sonic
SysDescr: Debian GNU/Linux 9 (stretch) Linux 4.9.0-11-2-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64
-------------------------------------------------------------------------------

After the fix:
root@sonic:~# show lldp neighbors Ethernet16
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: Ethernet16, via: LLDP, RID: 10, Time: 0 day, 00:01:00
SysName: sonic
SysDescr: SONiC Software Version: SONiC.sonic_upstream_1.0_daily_201130_1501_62-dirty-20201130.203529 - HwSku: Accton-AS7816-64X - Distribution: Debian 10.6 - Kernel: 4.19.0-9-2-amd64
-------------------------------------------------------------------------------

Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
DavidZagury pushed a commit that referenced this pull request Dec 8, 2021
Allow mellanox platform to build and successfully switch packets in Debian 11

Upgraded
* Mellanox SDK
* Mellanox Hardware Management
* Mellanox Firmware
* Mellanox Kernel Patches

Adjusted build system to support host system running bullseye and dockers running buster.
DavidZagury pushed a commit that referenced this pull request Dec 8, 2021
* Make necessary changes to mellanox platform code to build on Debian 11
* Revert use of backported kernel to build mft and elect to only build kernel module under bullseye
DavidZagury pushed a commit that referenced this pull request Dec 8, 2021
Submodule update for sonic-linkmgrd

Incorporates:
c11a576 (2021-11-22 09:38:46) [ci]: show code coverage in azure pipeline (#4)
4ceb01d (2021-11-18 20:24:20) Fix MUX toggling issue (#1)
d640527 (2021-11-12 22:31:44) [ci]: fix artifact download
b9f247d (2021-11-12 22:31:44) [ci]: use native arm64/armhf build
3059122 (2021-09-27 11:32:23) [linkgrd] Add Missing Apache License Header
DavidZagury pushed a commit that referenced this pull request Jun 23, 2022
…net#10291)

#### Why I did it
Fix issue: Non compliant leaf list in config_db schema: sonic-net#9801

#### How I did it
The basic flow of DPB is:
1. Transfer the config DB JSON value to a YANG JSON value, named “yangIn”
2. Validate “yangIn” with libyang
3. Generate a YANG JSON value representing the target configuration, named “yangTarget”
4. Diff “yangIn” against “yangTarget”
5. Apply the diff to the CONFIG DB JSON and save it back to the DB

The fix:
• For step 1, if the value of a leaf-list field is a string, convert it to a list by splitting it on “,” so that it passes the validation in step 2. Also save <table_name>.<key>.<field_name> to a set named “leaf_list_with_string_value_set”.
• For step 5, loop over “leaf_list_with_string_value_set” and change those fields back to strings.

#### How to verify it
1. Manual test
2. Changed sample config DB and unit test passed
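The two halves of the fix (split before validation, join again before saving) can be sketched as a round trip in Python. This is an illustration under stated assumptions, not the code from the PR: the function names are invented, the set of leaf-list fields is passed in explicitly rather than derived from the YANG schema, and tuples are used where the description mentions dotted strings.

```python
import copy
from typing import Dict, Set, Tuple

ConfigDb = Dict[str, Dict[str, Dict[str, object]]]  # table -> key -> field -> value


def strings_to_leaf_lists(config_db: ConfigDb, leaf_list_fields: Set[Tuple[str, str]]):
    """Step 1 (sketch): split string values of leaf-list fields on ','.

    leaf_list_fields holds hypothetical (table, field) pairs known to be leaf-lists.
    Returns the converted copy and the set of (table, key, field) that were changed.
    """
    converted = copy.deepcopy(config_db)
    touched = set()
    for table, entries in converted.items():
        for key, fields in entries.items():
            for name, value in fields.items():
                if (table, name) in leaf_list_fields and isinstance(value, str):
                    fields[name] = value.split(",")
                    touched.add((table, key, name))
    return converted, touched


def leaf_lists_to_strings(config_db: ConfigDb, touched) -> ConfigDb:
    """Step 5 (sketch): turn the recorded fields back into comma-joined strings."""
    restored = copy.deepcopy(config_db)
    for table, key, name in touched:
        value = restored.get(table, {}).get(key, {}).get(name)
        if isinstance(value, list):
            restored[table][key][name] = ",".join(value)
    return restored
```

Recording which fields were converted is what keeps the saved CONFIG DB in its original string form, so only the in-memory copy handed to the validator ever sees the list representation.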
DavidZagury added a commit that referenced this pull request Feb 7, 2023
- Why I did it
To improve ASIC FW upgrade logging and have information about the cause of a FW update failure in the log.
- How I did it
Added syslog logger support.
If the FW update fails, the update tool gives the cause of the failure in the last line of its output, starting with "Fail". When running the tool, in case of a failed update, we parse the output to retrieve the cause and log it.

Device #1:
----------
  Device Type: ConnectX6DX
  Part Number: MCX623106AN-CDA_Ax
  Description: ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0/3.0 x16;
  PSID: MT_0000000359
  PCI Device Name: /dev/mst/mt4125_pciconf0
  Base GUID: 0c42a103007d22d4
  Base MAC: 0c42a17d22d4
  Versions: Current Available
    FW 22.32.0498 22.32.0498
    PXE 3.6.0500 3.6.0500
    UEFI 14.25.0015 14.25.0015
  Status: Forced update required
---------
Found 1 device(s) requiring firmware update...
Device #1: Updating FW ...
FSMST_INITIALIZE - OK
Writing Boot image component - OK
Fail : The Digest in the signature is wrong
- How to verify it
mlnx-fw-upgrade.sh --upgrade
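The parsing step can be sketched as follows. The actual change is in the mlnx-fw-upgrade.sh shell script; this Python version, including the name `extract_failure_cause`, is only an assumption-laden illustration of "take the last line if it starts with Fail".

```python
from typing import Optional


def extract_failure_cause(tool_output: str) -> Optional[str]:
    """Return the failure cause from the update tool output, or None.

    Assumes, as described above, that a failed update ends with a line
    starting with "Fail".
    """
    lines = [line.strip() for line in tool_output.splitlines() if line.strip()]
    if not lines or not lines[-1].startswith("Fail"):
        return None
    # Keep the text after the first ':'; fall back to the whole line if empty.
    _, _, cause = lines[-1].partition(":")
    return cause.strip() or lines[-1]
```

The returned string is what would be handed to the syslog logger; a `None` result means the last line did not report a failure.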
DavidZagury pushed a commit that referenced this pull request Oct 1, 2023
…bors over iBGP Session (sonic-net#16705)

What I did:
Enable Sending BGP Community over internal neighbors over iBGP Session
Microsoft ADO: 25268695

Why I did:
Without this change, BGP communities sent by e-BGP peers are not carried forward to other e-BGP peers.

str2-xxxx-lc1-2# show bgp ipv6 20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52141
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65500
    2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      Last update: Tue Sep 26 16:08:26 2023
str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52688
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65502
    3.3.3.6 from 3.3.3.6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      Last update: Tue Sep 26 15:45:51 2023

After the change
str2-xxxx-lc2-2(config)# router bgp 65100
str2-xxxx-lc2-2(config-router)# address-family ipv4
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V4 send-community
str2-xxxx-lc2-2(config-router-af)# exit
str2-xxxx-lc2-2(config-router)# address-family ipv6
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V6 send-community

str2-xxxx-lc1-2# show bgp ipv6 20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52400
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65500
    2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      **Community: 1111:1111**
      Last update: Tue Sep 26 16:10:19 2023
str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52947
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65502
    3.3.3.6 from 3.3.3.6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      **Community: 1111:1111**
      Last update: Tue Sep 26 16:10:09 2023

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
DavidZagury pushed a commit that referenced this pull request Nov 12, 2023
…kernel 6.1 and bookworm (sonic-net#16954)

* sonic-platform-modules-cel: broadcom: adapt for kernel 6.1 and bookworm

The i2c_driver->remove API declaration has been updated to return void instead of int, as part of cleanup patches in 6.1. More details can be found here: [1]. Update the remove API definition in the modules accordingly and clean up variables that go unused from the remove API.

Update python build commands for bookworm. Packaging based on calling setup.py is deprecated, and using the build module/pip utility is the recommended method for python packaging/installation. Further details can be found here: [2], [3]. The build module is picky about the package information file, which needs to be either setup.py or pyproject.toml.

Additionally, fix formatting inconsistencies in debian/changelog reported by `dh_installchangelogs` during the build.

Tested the changes by compiling as below:

make sonic-slave-bash NOBUSTER=1 NOBULLSEYE=1
sudo dpkg -i target/debs/bookworm/linux-headers-6.1.0-11-2-*.deb
cd platform/broadcom/sonic-platform-modules-cel
KVERSION=6.1.0-11-2-amd64 dpkg-buildpackage

Also verified the python scripts under sonic-platform-modules-cel with pyflakes to ensure no new errors are flagged (with the exception of unused modules).

References:
[1] - torvalds/linux@ed5c2f5f
[2] - https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.htm
[3] - 0b20a48 (Update Python build commands for Bookworm, 2023-09-07)

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

* platform/pddf: i2c: adapt for kernel 6.1 and bookworm

* Fixup i2c_driver->remove API due to changes in the function prototype (ref: [1]).
* Cleanup `MODULE_SUPPORTED_DEVICE` macros that were cleaned up in the upstream (ref: [2]).
* Sanitize python packaging and installation using the `build` module instead of calling the setup.py directly (ref: [3], [4]).

Tested the changes by compiling the pddf module as below:

make sonic-slave-bash NOBUSTER=1 NOBULLSEYE=1
sudo dpkg -i target/debs/bookworm/linux-headers-6.1.0-11-2-*.deb
cd platform/pddf/i2c
KVERSION=6.1.0-11-2-amd64 dpkg-buildpackage

References:
[1] - torvalds/linux@ed5c2f5f
[2] - torvalds/linux@6417f031
[3] - https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.htm
[4] - 0b20a48 (Update Python build commands for Bookworm, 2023-09-07)

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

* platform/broadcom: include platform-modules-cel in builds

With pddf modules patched for 6.1, platform-modules-cel can be compiled and included in the final image. Tested by building sonic-broadcom.bin/sonic-broadcom-dnx.bin.

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

* pddf/i2c: revert correct rootdir for pip install

The pip install directory was set to test-pkg1/ for testing the build and was incorrectly retained. Revert this to the correct path $(PACKAGE_PRE_NAME).

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

* platform/broadcom: include pddf/modules-cel in the base package

Without this change, the modules were built but not packaged in the final .bin. The final sonic-broadcom.bin has been tested for bootup on Celestica's Silverstone platform.

admin@sonic:~$ uname -a
Linux sonic 6.1.0-11-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-4 (2023-08-08) x86_64 GNU/Linux
admin@sonic:~$ show platform summary
Platform: x86_64-cel_silverstone-r0
HwSKU: Silverstone
ASIC: broadcom
ASIC Count: 1
Serial Number: R4009B2F062504LK200024
Model Number: N/A
Hardware Revision: N/A
admin@sonic:~$ show version | head
SONiC Software Version: SONiC.g0aad6c67c-rachandr
SONiC OS Version: 12
Distribution: Debian 12.2
Kernel: 6.1.0-11-2-amd64
Build commit: 0aad6c67c
Build date: Thu Oct 26 07:13:47 UTC 2023
Built by: rachandr@AZUHPS14
Platform: x86_64-cel_silverstone-r0

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>

---------

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
DavidZagury pushed a commit that referenced this pull request Nov 22, 2023
…kernel 6.1 and bookworm (sonic-net#16954)
DavidZagury pushed a commit that referenced this pull request Nov 22, 2023
…bors over iBGP Session (sonic-net#16705)
DavidZagury pushed a commit that referenced this pull request Feb 7, 2024
…bors over iBGP Session (sonic-net#16705)

What I did: Enable sending BGP communities to internal neighbors over iBGP sessions.

Microsoft ADO: 25268695

Why I did: Without this change, BGP communities sent by e-BGP peers are not carried forward to other e-BGP peers.

str2-xxxx-lc1-2# show bgp ipv6 20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52141
Paths: (1 available, best #1, table default)
Not advertised to any peer
65000 65500
2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
Origin IGP, localpref 100, valid, internal, best (First path received)
Last update: Tue Sep 26 16:08:26 2023

str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52688
Paths: (1 available, best #1, table default)
Not advertised to any peer
65000 65502
3.3.3.6 from 3.3.3.6 (3.3.3.6)
Origin IGP, localpref 100, valid, internal, best (First path received)
Last update: Tue Sep 26 15:45:51 2023

After the change:

str2-xxxx-lc2-2(config)# router bgp 65100
str2-xxxx-lc2-2(config-router)# address-family ipv4
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V4 send-community
str2-xxxx-lc2-2(config-router-af)# exit
str2-xxxx-lc2-2(config-router)# address-family ipv6
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V6 send-community

str2-xxxx-lc1-2# show bgp ipv6 20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52400
Paths: (1 available, best #1, table default)
Not advertised to any peer
65000 65500
2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
Origin IGP, localpref 100, valid, internal, best (First path received)
**Community: 1111:1111**
Last update: Tue Sep 26 16:10:19 2023

str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52947
Paths: (1 available, best #1, table default)
Not advertised to any peer
65000 65502
3.3.3.6 from 3.3.3.6 (3.3.3.6)
Origin IGP, localpref 100, valid, internal, best (First path received)
**Community: 1111:1111**
Last update: Tue Sep 26 16:10:09 2023

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
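The before/after comparison in this commit message comes down to whether a `Community:` line appears in the `show bgp` entry. A minimal Python sketch of that check, assuming the CLI output has already been captured as text; the helper name `communities_in_bgp_entry` and the regular expression are illustrative assumptions, not part of SONiC or FRR.

```python
import re
from typing import List

# Sample entries adapted from the output quoted in the commit message.
BEFORE = """BGP routing table entry for 192.168.35.128/25, version 52688
Origin IGP, localpref 100, valid, internal, best (First path received)
Last update: Tue Sep 26 15:45:51 2023"""

AFTER = """BGP routing table entry for 192.168.35.128/25, version 52947
Origin IGP, localpref 100, valid, internal, best (First path received)
Community: 1111:1111
Last update: Tue Sep 26 16:10:09 2023"""


def communities_in_bgp_entry(show_output: str) -> List[str]:
    """Return the community values listed in one route entry (hypothetical helper)."""
    # Match "Community:" followed by one or more ASN:value pairs.
    match = re.search(r"Community:\s*((?:\d+:\d+\s*)+)", show_output)
    if match is None:
        return []
    return match.group(1).split()


if __name__ == "__main__":
    print("before:", communities_in_bgp_entry(BEFORE))
    print("after: ", communities_in_bgp_entry(AFTER))
```

An empty list for the "before" entry and a non-empty list for the "after" entry would correspond to the behavior the commit describes.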
DavidZagury pushed a commit that referenced this pull request on Sep 9, 2024
#### Why I did it

Dropping the control character (the message sent when XSUB connects to XPUB as part of the ZMQ proxy setup, to notify that a subscription has been made) in do_capture has been flaky, because the control character is not guaranteed to be the first message sent if events (like event-down-ctr) are being published to XSUB.

Scenarios:

1) The control character is sent and is the first message when starting the capture service

`eventd#eventd#eventd: :- heartbeat_ctrl: Set heartbeat_ctrl pause=1`
`eventd#eventd#eventd: :- do_capture: Received subscription message when XSUB connects to XPUB`

2) Events like event-down-ctr are sent before the control character

`eventd#eventd#eventd: :- run: Dropping Message: 22 serialization::archive 18 17 sonic-events-host`
`eventd#eventd#eventd: :- run: Dropping Message: 22 serialization::archive 18 0 0 4 0 0 0 1 d 103 {"sonic-events-host:event-stopped-ctr":{"ctr_name":"EVENTD","timestamp":"2024-08-27T00:02:51.407518Z"}} 1 r 36 3357542f-bae1-458f-a804-660e620d21f5 1 s 1 9 1 t 19 1724716971407591080`
`heartbeat_ctrl: Set heartbeat_ctrl pause=1`
`do_capture: Received subscription message when XSUB connects to XPUB`

3) The control character is not sent at all

`eventd#eventd#eventd: :- heartbeat_ctrl: Set heartbeat_ctrl pause=1`

4) The control character is delayed and not caught when starting the capture service, but is then caught after causing a deserialize error

`do_capture: Receiving event from source: 22 serialization::archive 18 17 sonic-events-host, will read second part of event`
`deserialize: deserialize Failed: input stream errorstr[0:64]:(#1) data type: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&`
`zmq_read_part: Failed to deserialize part rc=-2`
`zmq_read_part: last:errno=11`
`zmq_message_read: Failure to read part1 rc=-2`
`zmq_message_read: last:errno=11`

We can cover all of these scenarios by dropping the control character inside zmq_message_read, as part of events_common in swsscommon (different PR). In this PR we remove the handling logic from eventd and make sure that the empty events resulting from the control character are ignored.

##### Work item tracking
- Microsoft ADO **(number only)**: 28728116

#### How I did it
Remove the logic for handling the control character.

#### How to verify it
UT and sonic-mgmt test cases.
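The idea of dropping the control message wherever it appears in the stream, rather than only at the start of capture, can be sketched as a simple filter. This is a minimal Python illustration, not the eventd or swsscommon code (which is C++). It assumes the control message is the ZeroMQ subscription frame (a single byte `0x01` for a subscribe with an empty topic) and that an "empty event" is a zero-length frame; `drop_control_frames` and `SUBSCRIBE_MARKER` are hypothetical names.

```python
from typing import Iterable, List

# Assumption: a ZeroMQ subscribe message with an empty topic is the single byte 0x01.
SUBSCRIBE_MARKER = b"\x01"


def drop_control_frames(frames: Iterable[bytes]) -> List[bytes]:
    """Keep only frames that could carry event data (hypothetical helper)."""
    kept = []
    for frame in frames:
        if frame == SUBSCRIBE_MARKER:
            # Control message: skip it regardless of its position in the stream.
            continue
        if not frame:
            # Empty event left over after the control message was stripped.
            continue
        kept.append(frame)
    return kept


if __name__ == "__main__":
    # Scenario 2 from the description: an event arrives before the control message.
    stream = [b"sonic-events-host", SUBSCRIBE_MARKER, b"", b"payload"]
    print(drop_control_frames(stream))
```

Because the filter does not depend on ordering, it handles the case where the control message is first, late, or absent in the same way.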
DavidZagury pushed a commit that referenced this pull request on Oct 6, 2024
DavidZagury pushed a commit that referenced this pull request on Dec 7, 2024
Fix sonic-net#6866

Unset CONFIG_THERMAL_STATISTICS.

Reason: Binding kernel thermal zones to the cooling device with CONFIG_THERMAL_STATISTICS=y causes an out-of-bounds access that crashes the kernel. trans_table is a two-dimensional table allocated per max cooling state (10). If statistics are configured, thermal_cooling_device_stats_update() is called and tries to update an entry outside the table:

stats->trans_table[stats->state * stats->max_states + new_state]++

The kernel crashes with the following stack trace:

```
[ 269.474092] watchdog: watchdog1: watchdog did not stop!
[ 269.533625] list_del corruption. prev->next should be ffff9e136bd57418, but was 677ac660ffffffff
[ 269.543482] kernel BUG at lib/list_debug.c:53!
[ 269.548458] invalid opcode: 0000 [#1] SMP PTI
[ 269.553326] CPU: 1 PID: 8890 Comm: kexec Tainted: G OE 4.19.0-9-2-amd64 #1 Debian 4.19.118-2+deb10u1
[ 269.564891] Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 11/03/2020
[ 269.574323] RIP: 0010:__list_del_entry_valid.cold.1+0x34/0x4c
[ 269.580740] Code: 9f 29 a5 e8 68 7a d0 ff 0f 0b 48 c7 c7 20 a0 29 a5 e8 5a 7a d0 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 e0 9f 29 a5 e8 46 7a d0 ff <0f> 0b 48 89 fe 48 c7 c7 a8 9f 29 a5 e8 35 7a d0 ff 0f 0b 90 90 90
[ 269.601726] RSP: 0018:ffffaddb83b5fdc0 EFLAGS: 00010246
[ 269.607561] RAX: 0000000000000054 RBX: ffff9e136bd57418 RCX: 0000000000000000
[ 269.615531] RDX: 0000000000000000 RSI: ffff9e136fa566b8 RDI: ffff9e136fa566b8
[ 269.623500] RBP: ffff9e1364bd5070 R08: 00000000000005ce R09: 0000000000000004
[ 269.631470] R10: 0000000000000766 R11: ffffffffa59f66ad R12: ffff9e136bd57400
[ 269.639440] R13: ffffffffa52c6a12 R14: ffff9e1364bd30d0 R15: 0000000000000000
[ 269.647410] FS: 00007f97227af740(0000) GS:ffff9e136fa40000(0000) knlGS:0000000000000000
[ 269.656441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 269.662857] CR2: 000055cfdb69e158 CR3: 00000004677f6001 CR4: 00000000003606e0
[ 269.670820] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 269.678790] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 269.686760] Call Trace:
[ 269.689489] device_shutdown+0xc1/0x210
[ 269.693773] kernel_kexec+0x51/0x96
[ 269.697666] __do_sys_reboot+0x1be/0x210
[ 269.702045] ? kmem_cache_free+0x1aa/0x1d0
[ 269.706618] ? __dentry_kill+0x121/0x170
[ 269.710998] ? _cond_resched+0x15/0x30
[ 269.715181] ? dentry_kill+0x4d/0x190
[ 269.719260] ? _cond_resched+0x15/0x30
[ 269.723444] do_syscall_64+0x53/0x110
[ 269.727531] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 269.733172] RIP: 0033:0x7f97228a3373
[ 269.737161] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 e9 9a 0c 00 f7 d8
[ 269.758147] RSP: 002b:00007ffe11d30fa8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a9
[ 269.766602] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f97228a3373
[ 269.774572] RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead
[ 269.782541] RBP: 0000000000000002 R08: 0000000000000004 R09: 000055cfdb69e160
[ 269.790511] R10: fffffffffffffb8e R11: 0000000000000202 R12: 00007ffe11d31238
[ 269.798482] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ffffffff
[ 269.806443] Modules linked in: nft_chain_route_ipv4(E) xt_TCPMSS(E) sx_bfd(OE) sx_netdev(OE) psample(E) dummy(E) sx_core(OE) 8021q(E) garp(E) mrp(E) mst_pciconf(OE) mst_pci(OE) xt_hl(E) xt_tcpudp(E) ip6_tables(E) nft_compat(E) nft_counter(E) xt_conntrack(E) nf_nat(E) nf_conntrack_netlink(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) xfrm_user(E) xfrm_algo(E) intel_rapl(E) mlxsw_minimal(E) sb_edac(E) mlxsw_i2c(E) x86_pkg_temp_thermal(E) mlxsw_core(E) intel_powerclamp(E) devlink(E) kvm_intel(E) bonding(E) kvm(E) i2c_mux_reg(E) i2c_mux(E) mlxreg_hotplug(E) mlxreg_io(E) leds_mlxreg(E) i2c_mlxcpld(E) mlxreg_fan(E) mxm_wmi(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) evdev(E) mlx_platform(E) ghash_clmulni_intel(E) intel_cstate(E) sg(E) intel_uncore(E) iTCO_wdt(E) pcspkr(E)
[ 269.885239] intel_rapl_perf(E) ioatdma(E) iTCO_vendor_support(E) pcc_cpufreq(E) wmi(E) ebt_vlan(E) ebtable_broute(E) bridge(E) stp(E) llc(E) ebtable_nat(E) nf_tables(E) button(E) nfnetlink(E) ebtable_filter(E) ebtables(E) xdpe12284(E) at24(E) ledtrig_timer(E) tmp102(E) lm75(E) coretemp(E) max1363(E) industrialio_triggered_buffer(E) kfifo_buf(E) industrialio(E) tps53679(E) pmbus(E) pmbus_core(E) i2c_dev(E) ip_tables(E) x_tables(E) autofs4(E) loop(E) ext4(E) crc16(E) mbcache(E) jbd2(E) crc32c_generic(E) fscrypto(E) ecb(E) sd_mod(E) nvme(E) nvme_core(E) nls_utf8(E) nls_cp437(E) nls_ascii(E) vfat(E) fat(E) overlay(E) squashfs(E) zstd_decompress(E) xxhash(E) crc32c_intel(E) gpio_ich(E) ahci(E) aesni_intel(E) libahci(E) aes_x86_64(E) crypto_simd(E) xhci_pci(E) ehci_pci(E) libata(E) igb(E) ehci_hcd(E)
[ 269.964036] xhci_hcd(E) cryptd(E) glue_helper(E) scsi_mod(E) i2c_algo_bit(E) i2c_i801(E) lpc_ich(E) dca(E) mfd_core(E) usbcore(E) usb_common(E)
[ 269.978536] ---[ end trace 8f56c678b52f9aee ]---
[ 269.983698] RIP: 0010:__list_del_entry_valid.cold.1+0x34/0x4c
[ 269.990123] Code: 9f 29 a5 e8 68 7a d0 ff 0f 0b 48 c7 c7 20 a0 29 a5 e8 5a 7a d0 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 e0 9f 29 a5 e8 46 7a d0 ff <0f> 0b 48 89 fe 48 c7 c7 a8 9f 29 a5 e8 35 7a d0 ff 0f 0b 90 90 90
[ 270.011117] RSP: 0018:ffffaddb83b5fdc0 EFLAGS: 00010246
[ 270.016958] RAX: 0000000000000054 RBX: ffff9e136bd57418 RCX: 0000000000000000
[ 270.024935] RDX: 0000000000000000 RSI: ffff9e136fa566b8 RDI: ffff9e136fa566b8
[ 270.032912] RBP: ffff9e1364bd5070 R08: 00000000000005ce R09: 0000000000000004
[ 270.040890] R10: 0000000000000766 R11: ffffffffa59f66ad R12: ffff9e136bd57400
[ 270.048866] R13: ffffffffa52c6a12 R14: ffff9e1364bd30d0 R15: 0000000000000000
[ 270.056844] FS: 00007f97227af740(0000) GS:ffff9e136fa40000(0000) knlGS:0000000000000000
[ 270.065889] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 270.072312] CR2: 000055cfdb69e158 CR3: 00000004677f6001 CR4: 00000000003606e0
[ 270.080289] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 270.088268] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
```

A temporary solution is to disable this config and work with the Linux community on fixing it. The fix requires a fan driver update, which is not trivial and will take some time to become available in net-next before it can be backported to the SONiC linux-kernel.

It was tested on:
HwSKU: ACS-MSN2410
HwSKU: Mellanox-SN2700
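The out-of-bounds update described in the reason above can be illustrated with the flat-index arithmetic from the quoted expression. A minimal Python sketch, assuming the table holds `max_states * max_states` counters; the function names and the example state values are hypothetical and are not taken from the kernel source.

```python
def trans_table_index(state: int, new_state: int, max_states: int) -> int:
    """Flat index computed as in the quoted expression: state * max_states + new_state."""
    return state * max_states + new_state


def fits_table(state: int, new_state: int, max_states: int) -> bool:
    """True when both states are valid rows/columns of a max_states x max_states table."""
    return 0 <= state < max_states and 0 <= new_state < max_states


if __name__ == "__main__":
    max_states = 10  # value mentioned in the commit message
    # Hypothetical transitions: the second one asks for a state one past the maximum.
    for state, new_state in [(9, 9), (9, 10)]:
        idx = trans_table_index(state, new_state, max_states)
        ok = fits_table(state, new_state, max_states)
        print(f"state={state} new_state={new_state} index={idx} in_bounds={ok}")
```

With these assumptions, a request for a state equal to `max_states` produces an index at or beyond the end of the allocation, which is the kind of write the commit message attributes the crash to.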
DavidZagury pushed a commit that referenced this pull request on Dec 10, 2024
DavidZagury pushed a commit that referenced this pull request on Dec 10, 2024
To fix a statistical issue. The original fix was done in FRRouting/frr#17297. However, to accommodate 8.5.4, the patch in the PR was added.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020") at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
- Why I did it
- How I did it
- How to verify it
- Which release branch to backport (provide reason below if selected)
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)