forked from sonic-net/sonic-buildimage
Updating SONiC for OOPT #1
Open: haris-khan1596 wants to merge 6,872 commits into Telecominfraproject:master from sonic-net:master
Fixed switch role check for IDF isolation configuration
Add GNMI client cert cname list to yang model.
#### Why I did it
Allow the gnmi service to authenticate client certs by cname.
#### How I did it
Add GNMI client cert cname list to yang model.
#### How to verify it
Pass all UT.
#### Description for the changelog
Add GNMI client cert cname list to yang model.
…atically (#19971) #### Why I did it src/sonic-utilities ``` * 9a3f359e - (HEAD -> master, origin/master, origin/HEAD) Add timeout for rexec's get_password (#3484) (29 hours ago) [Changrong Wu] * 4372ced5 - Add lock to config reload/load_minigraph (#3475) (2 days ago) [Longxiang Lyu] ``` #### How I did it #### How to verify it #### Description for the changelog
Why I did it
Branches mix docker-ptf images that carry Python 2 plus Python 3 (in a virtual env) with Python 3-only images, and some sonic-mgmt test scripts in older branches may not yet be migrated to Python 3. As a result, the paths to the Python and PTF binaries differ between images, which makes backporting a hassle. To prevent this, the PR makes the virtual-environment path (/root/env-python3) resolve to the real Python 3 and PTF binaries, so the paths stay the same across docker-ptf images.
How I did it
Created symlinks for python and PTF from /root/env-python3/bin to /usr/bin.
How to verify it
The image built successfully, and the existence of the links was verified manually.
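The symlink step can be sketched as follows. This is a minimal illustration, not the Dockerfile change from the PR: the helper name `link_binaries` and the binary names passed to it are assumptions, and the real change may create the links differently.

```python
import os


def link_binaries(real_bin_dir, venv_bin_dir, names):
    """Create venv_bin_dir/<name> symlinks that point at real_bin_dir/<name>.

    Hypothetical helper for illustration only.
    """
    os.makedirs(venv_bin_dir, exist_ok=True)
    created = []
    for name in names:
        target = os.path.join(real_bin_dir, name)
        link = os.path.join(venv_bin_dir, name)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link or file
        os.symlink(target, link)
        created.append(link)
    return created


if __name__ == "__main__":
    # Paths taken from the commit message; the binary names are assumed.
    print(link_binaries("/usr/bin", "/root/env-python3/bin", ["python3", "ptf"]))
```

With links like these in place, a test script that refers to /root/env-python3/bin finds the same binaries whether or not the image ships a real virtual environment.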
* Support for event persistence in redis-db.
* Updates
* Updates including fix to eventdb in test environment.
* Add sonic yang to model event and alarm table. Remove ack, noack from sonic-common-event yang.
* Add event/alarm persistence related test cases
* Remove file eventdb_ut.cpp.
* Updates to eventdb testsuite.
* Revert changes to existing eventd UT. Set eventdb testcases as separate test binary. Skip testcase execution if DB connection failure.
* Commit test related config files.
Updating sai debian version to 1.14.0-1 Signed-off-by: Keshav Gupta <keshavg@marvell.com>
Previously the 2700a1 sensors.conf was missing the bus section for the PSU, so add it and update psu_sensors.json with the info. Fix psu_sensors_conf_updater to treat a PSR PSU the same as a PSF one. Signed-off-by: Yuanzhe, Liu <yualiu@nvidia.com>
…in vrf (#19587)
Why I did it
When a vrf is configured, 'ip nht resolve-via-default' is configured inside the vrf config by mistake instead of in the global setting, so BGP cannot be established.
Work item tracking
Microsoft ADO (number only): 28726407
How I did it
Add the missing 'exit' line in the FRR zebra.interfaces.conf.j2 file to exit the vrf config block and ensure that 'ip nht resolve-via-default' is configured in the global setting.
Add a new HwSku for a different port speed layout Arista-7060DX5-32-100Gx48-400Gx8 Add HwSKU Arista-7060DX5-32-100Gx48-400Gx8. Add file media_settings.json
- Why I did it To have a dedicated rshim instance per DPU instead of a single instance for all - How I did it 1. add multi-instance systemd service to run rshim per DPU 2. build rshim w/o default systemd service 3. patch rshim to have the "dpu" interface name 4. update rshim version to 2.0.29 - How to verify it Manual test on SN4280 Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
…n t1 (#20000) Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
* commit changes under sdklt/ folder Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* commit changes under saibcm-modules/ except sdklt/ Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* upgrade saibcm-modules version to 10.1 Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* update changelog Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* add dcb folder Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* move dcb to systems/linux/kernel/modules/ Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* fix path Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* fix path Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* fix path Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* try fix include error Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* remove all generated files Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* add sdklt/bcmlrd/include/bcmlrd/chip/generated/ folder Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* add bcmltd/ Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* Add missing folder Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* add missing folder Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* Add missing folders Signed-off-by: zitingguo <zitingguo@microsoft.com>
* Make sure 'genl-packet' is built before 'bcmgenl' Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* sending original pkt size to psample module Signed-off-by: zitingguo <zitingguo@microsoft.com>
* try pass semgrep check Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
* try bypass semgrep check Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
---------
Signed-off-by: zitingguo-ms <zitingguo@microsoft.com>
Signed-off-by: zitingguo <zitingguo@microsoft.com>
[BGP] Fix TCP MD5 authentication problem in VRF Signed-off-by: Julian Chang - TW <julianc@supermicro.com.tw>
Fixes #19380 To enable idf_isolation route-map towards BGP peer
Why I did it Upgrade xgs SAI version to 10.1.38.0 10.1.38.0: [CSP CS00012359404] Port SONIC-93179 to SAI 10.1 branch Work item tracking Microsoft ADO (number only): 29094720 How I did it Upgrade the SAI version in the sai.mk file. How to verify it https://dev.azure.com/mssonic/internal/_build/results?buildId=620112&view=results
…atforms (#19739) * [Nokia][Device] Update SUP device to set the dpp_db_path * [Nokia Device] Set the programmability ucode relative path for Nokia DNX platforms. Added Programmability path and dpp_path for Nokia DNX platforms to work with SAI 11.x --------- Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>
Update EZB files to version 1.08 to support SAI 1.14.0.1
Updating hwsku config file to EZB1.08 for AC5X RD board
#### Why I did it Dropping control character (message sent when XSUB connects to XPUB as part of ZMQ Proxy setup to notify that subscription has been made) in do capture has been flaky since control character is not guaranteed to be the first message sent if there are events (like event-down-ctr) being published to XSUB. Scenarios 1) Control character is sent and is first message when starting capture service `eventd#eventd#eventd: :- heartbeat_ctrl: Set heartbeat_ctrl pause=1` `eventd#eventd#eventd: :- do_capture: Received subscription message when XSUB connects to XPUB` 2) Events like event-down ctr is sent before control character `eventd#eventd#eventd: :- run: Dropping Message: 22 serialization::archive 18 17 sonic-events-host` `eventd#eventd#eventd: :- run: Dropping Message: 22 serialization::archive 18 0 0 4 0 0 0 1 d 103 {"sonic-events-host:event-stopped-ctr":{"ctr_name":"EVENTD","timestamp":"2024-08-27T00:02:51.407518Z"}} 1 r 36 3357542f-bae1-458f-a804-660e620d21f5 1 s 1 9 1 t 19 1724716971407591080` `heartbeat_ctrl: Set heartbeat_ctrl pause=1` `do_capture: Received subscription message when XSUB connects to XPUB` 3) Control character is not sent at all `eventd#eventd#eventd: :- heartbeat_ctrl: Set heartbeat_ctrl pause=1` 4) Control character is delayed and not caught when starting capture service, but is then caught after causing deserialize error. `do_capture: Receiving event from source: 22 serialization::archive 18 17 sonic-events-host, will read second part of event` `deserialize: deserialize Failed: input stream errorstr[0:64]:(#1) data type: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&` `zmq_read_part: Failed to deserialize part rc=-2` `zmq_read_part: last:errno=11` `zmq_message_read: Failure to read part1 rc=-2` `zmq_message_read: last:errno=11` We can cover these scenarios by just dropping the control character inside zmq_message_read as part of events_common in swsscommon (different PR). 
In this PR we will remove such handling logic and make sure that empty events that will be sent by control character are ignored. ##### Work item tracking - Microsoft ADO **(number only)**:28728116 #### How I did it Remove logic for handling control character #### How to verify it UT and sonic-mgmt test cases.
Signed-off-by: anamehra anamehra@cisco.com Cisco platform 202405.0.3 release Why I did it Cisco platform 202405.0.3 release Work item tracking Microsoft ADO (number only):
Why I did it
This is a bug fix, and this change should go into the 202305 branch and onwards. Added route map support for the BGP profile FROM_SDN_APPLIANCE_ROUTES. With this change, the following route map should be added if the BGP profile is added:
route-map FROM_SDN_APPLIANCE_ROUTES_RM permit 100
 set as-path prepend <Prepend value of FROM_SDN_SLB_DEPLOYMENT_ID>
 set community <community id>
 set origin incomplete
Work item tracking
Microsoft ADO (number only): 28896695
How I did it
How to verify it
Added a test to verify the route map creation.
…utomatically (#20022) #### Why I did it src/sonic-host-services ``` * e21db16 - (HEAD -> master, origin/master, origin/HEAD) [BFD]Fix BFD blackout issue (#150) (26 hours ago) [Sudharsan Dhamal Gopalarathnam] * 39834f2 - Password Hardening: Add support to disable expiration date (#93) (4 days ago) [davidpil2002] ``` #### How I did it #### How to verify it #### Description for the changelog
… control (#19476) - Why I did it On Mellanox platforms, currently only CMIS active ports can be controlled by the SW, and all copper modules are controlled by FW. We want to let Sonic control passive copper modules as well, for CMIS and SFF (sff8636 and sff8436). - How I did it I updated the module detection flow to tag CMIS and SFF passive modules as SW control. - How to verify it Manual tests.
HLD link: sonic-net/SONiC#1522
- Why I did it
SONiC provides two Python logger implementations: sonic_py_common.logger.Logger and sonic_py_common.syslogger.SysLogger. Neither of them can change the log level at run time. To get more debug information, a developer sometimes has to manually change the log level in code on a running switch and restart the Python daemon, which is inconvenient.
SONiC also provides a C/C++ logger implementation in sonic-platform-common.common.logger.cpp. It is also a wrapper around the standard Linux syslog and is widely used by swss/syncd. It can set the log level on the fly by starting a thread that listens for changes to the CONFIG DB LOGGER table. The SONiC infrastructure also provides a Python wrapper for sonic-platform-common.common.logger.cpp, which is swsscommon.Logger. However, this logger implementation has some drawbacks:
1. swsscommon.Logger assumes the redis DB is ready to connect. This is a valid assumption for swss/syncd, but not for a Python logger implementation, because some Python scripts may be called before the redis server starts.
2. swsscommon.Logger wraps Linux syslog, which supports only a single log identifier per daemon.
So swsscommon.Logger is not an option either. This PR is a Python logger enhancement that lets the user set the log level at run time.
- How I did it
swsscommon.Logger depends on a thread that listens for changes to the CONFIG DB LOGGER table. It refreshes the log level of every logger instance once the thread detects a DB entry change. A thread is considered heavy for a Python script, especially since many short and simple Python scripts also use the logger. To keep the Python logger lightweight, it uses a different design from swsscommon.Logger:
1. A class-level logger registry shall be added to the SysLogger class.
2. Each logger instance shall register itself with the logger registry if it enables runtime configuration.
3. Logger configuration shall be refreshed by a CLI command that sends a SIGHUP signal to the daemon.
- How to verify it
Manual test
New unit test cases
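The registry design described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the sonic_py_common code: the class name `RegistryLogger`, the `load_levels` callback (standing in for a read of the CONFIG DB LOGGER table), and `install_sighup_refresh` are all hypothetical.

```python
import logging
import signal


class RegistryLogger:
    """Hypothetical logger with a class-level registry (not the SONiC API)."""

    # Class-level registry: log identifier -> logger instance.
    _registry = {}

    def __init__(self, identifier, level=logging.WARNING, runtime_config=False):
        self.identifier = identifier
        self.logger = logging.getLogger(identifier)
        self.logger.setLevel(level)
        # Only instances that opt in to runtime configuration are registered.
        if runtime_config:
            RegistryLogger._registry[identifier] = self

    @classmethod
    def refresh_all(cls, load_levels):
        """Re-read the configured levels and apply them to registered loggers.

        load_levels is an assumed callback returning {identifier: level}.
        """
        levels = load_levels()
        # Iterate over a snapshot so a registration during refresh is harmless.
        for identifier, instance in list(cls._registry.items()):
            if identifier in levels:
                instance.logger.setLevel(levels[identifier])


def install_sighup_refresh(load_levels):
    """Refresh all registered loggers whenever the process receives SIGHUP."""
    signal.signal(
        signal.SIGHUP,
        lambda signum, frame: RegistryLogger.refresh_all(load_levels),
    )
```

A daemon would call `install_sighup_refresh` once from its main thread at start-up. No listener thread is needed, which is the point of the design: short scripts that never receive SIGHUP pay only the cost of a dictionary entry.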
…C2/v32.42.1000, BFSoC to 4.9.0 (#20565) - Why I did it To include latest fixes and new functionality - How I did it SDK_VERSION 24.7-RC4 -> 24.10-RC2 FW_VERSION 32.41.1000 -> 32.42.1000 SAI_VERSION SAIBuild0.0.32.0 -> SAIBuild0.0.36.0 BFSOC_VERSION: 4.7.0 -> 4.9.0 - How to verify it Build an image and run tests from "sonic-mgmt".
…20580) * sonic-buildimage: rename qsp 128x400g to o128s2 In keeping with normative convention, renaming the hwsku folders for qsp/qspr from 128x400G to O128S2. * sonic-buildimage: fix qsp-o128s2 port_config typo There is a typo in the lanes used for Ethernet356 within port_config.ini, where lanes 381 and 382 appear twice instead of being followed by the intended 383 and 384. This change fixes that typo. This exact typo is not present in the other hwskus under x86_64-arista_7060x6_64pe or x86_64-arista_7060x6_64de.
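A typo of this kind can be caught mechanically. The sketch below assumes the usual whitespace-separated port_config.ini layout, with the port name in the first column and a comma-separated lane list in the second. The function name `find_duplicate_lanes` and the sample rows are made up for illustration and are not taken from the affected hwsku.

```python
def find_duplicate_lanes(port_config_text):
    """Return {lane: [ports]} for every lane that is listed more than once."""
    owners = {}
    for raw in port_config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        fields = line.split()
        if len(fields) < 2:
            continue  # not a port row
        port, lanes = fields[0], fields[1].split(",")
        for lane in lanes:
            owners.setdefault(lane, []).append(port)
    return {lane: ports for lane, ports in owners.items() if len(ports) > 1}


# Illustrative rows only: the second port repeats lanes 5 and 6.
SAMPLE = """\
# name      lanes     alias
Ethernet0   1,2,3,4   etp1
Ethernet4   5,6,5,6   etp2
"""

if __name__ == "__main__":
    print(find_duplicate_lanes(SAMPLE))
```

A lane repeated within one port shows up with that port listed twice, and a lane shared by two ports shows up with both port names.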
#### Why I did it Adding yang model for CONFIG_DB table XCVRD_LOG|Y_CABLE. Introduced by https://github.com/sonic-net/sonic-utilities/blob/master/config/muxcable.py#L1230-L1235 #### How I did it Added the changes in sonic-yang-models #### How to verify it UT test ``` ==================================================================================== test session starts ==================================================================================== platform linux -- Python 3.9.2, pytest-6.0.2, py-1.10.0, pluggy-0.13.0 rootdir: /sonic/src/sonic-yang-models plugins: pyfakefs-5.2.3, cov-2.10.1 collected 3 items tests/test_sonic_yang_models.py .. [ 66%] tests/yang_model_tests/test_yang_model.py . [100%] ===================================================================================== 3 passed in 2.06s ===================================================================================== ```
* MAB common header files for generic files
* Addressed review comments
…#20074) - Why I did it Extend Nvidia Bluefield SONiC infrastructure to support DPU NIC FW auto upgrade. - How I did it Extend the build system and init scripts to support the FW upgrade. - How to verify it Compile an image with the new FW version. Run image installation. Verify that the running FW is upgraded after the image installation.
…ge installation (#19910) - Why I did it The DPU reset after the image installation is required to boot the DPU with the new NIC FW - How I did it Trigger DPU reset with the dpuctl utility after the image installation - How to verify it Build and install the image
Why I did it This PR is to add a patch to fix potential fd leak issue in AsyncSniffer in scapy python library. There are two fd leak scenarios. When starting worker thread _run, if an interface is down, an OSError is thrown, and the sockets that have been created will be leaked as it never got a chance to be closed. When stopping the worker thread, same error can happen when calling close. The sockets not closed will be leaked. How I did it Catch OSError when creating sockets, and catch any exception when closing socket to ensure all sockets are closed. How to verify it Verified by the testing code above. No fd leak happened.
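The two leak scenarios and the fix can be sketched generically. This is not the scapy patch itself: `open_all`, `close_all`, and the factory callables are hypothetical names, and re-raising the OSError after cleanup is an assumption about the desired behaviour.

```python
def close_all(sockets):
    """Close every socket, continuing even if an individual close() fails."""
    for sock in sockets:
        try:
            sock.close()
        except Exception:
            # Keep going so the remaining sockets are still closed.
            pass


def open_all(factories):
    """Open one socket per factory callable.

    If any factory raises OSError (for example because an interface is down),
    close the sockets that were already opened before re-raising.
    """
    opened = []
    try:
        for make_socket in factories:
            opened.append(make_socket())
    except OSError:
        close_all(opened)
        raise
    return opened
```

The key property is that every socket created before the failure is closed, and that one failing `close()` does not stop the others from being closed.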
- Why I did it Setting the KV attribute for WECMP normalization - How I did it Update common sai.profile - How to verify it Running basic WECMP tests.
…D automatically (#20660) #### Why I did it src/sonic-platform-daemons ``` * fc557a1 - (HEAD -> master, origin/master, origin/HEAD) [SmartSwitch] Add implementation for the DPU chassis daemon. (#554) (12 hours ago) [Oleksandr Ivantsiv] ``` #### How I did it #### How to verify it #### Description for the changelog
… automatically (#20630) #### Why I did it src/sonic-platform-common ``` * 4668bdc - (HEAD -> master, origin/master, origin/HEAD) Enhanced NVMe disk support, added limited eUSB disk support (#493) (3 days ago) [Ashwin Srinivasan] ``` #### How I did it #### How to verify it #### Description for the changelog
…tically (#20540) #### Why I did it src/sonic-sairedis ``` * e394ced7 - (HEAD -> master, origin/master, origin/HEAD) Fix compilation on Buster (#1449) (11 hours ago) [Saikrishna Arcot] * 4d504ff8 - Rename file name to fit case insensitive file system. (#1444) (2 days ago) [Liu Shilong] * fe650bb7 - [syncd] Add workaround for port error status notification (#1430) (6 days ago) [Kamil Cudnik] * cd2773a3 - [syncd] Fix inspect asic command (#1434) (7 days ago) [Kamil Cudnik] * 2d873766 - [syncd] Make sure notification queue release memory when drained (#1427) (8 days ago) [Kamil Cudnik] * b8a8856a - Fix adding flex counter to wrong context (#1421) (8 days ago) [byu343] * 40979e0b - [fastboot] Notify SAI that fastboot is done (#1396) (8 days ago) [Junchao-Mellanox] * 952ee406 - [codeql] Change pull_request_target to pull_request (#1442) (9 days ago) [Kamil Cudnik] * 697d86b5 - [syncd] Create neighbor entries before next hop (#1432) (9 days ago) [Kamil Cudnik] * fa76ca13 - [codeql] Remove git ancestry (#1441) (10 days ago) [Kamil Cudnik] * 3838d7ee - [codeql] Show git ancestry graph (#1440) (10 days ago) [Kamil Cudnik] * 2e7d946b - [codeql] Show gcc version before compile (#1438) (10 days ago) [Kamil Cudnik] * a1e93f58 - [submodule] Update SAI to latest master (#1431) (2 weeks ago) [Kamil Cudnik] ``` #### How I did it #### How to verify it #### Description for the changelog
- Why I did it Implement the interface required to run DPU chassisd on the Nvidia Smart Switch. Implement the get_dpu_id API, which deduces the DPU ID from the midplane interface IP address. - How I did it Implement platform API - How to verify it The implementation is covered by the UT.
…lly (#20610) #### Why I did it src/sonic-swss ``` * 93f7c150 - (HEAD -> master, origin/master, origin/HEAD) Fix State Db LAG_MEMBER_TABLE removal not happening. (#3347) (10 hours ago) [abdosi] * d76c34e4 - fix error in rif_rates.lua (#3218) (31 hours ago) [InspurSDN] * a3aaa398 - Add suppport for SAI DASH appliance object (#3284) (32 hours ago) [Mukesh Moopath Velayudhan] * 064f2e3d - Fix the tlm_teamd deleting STATE_DB LAG_TABLE entry. (6 days ago) [abdosi] ``` #### How I did it #### How to verify it #### Description for the changelog
… automatically (#20670) #### Why I did it src/sonic-platform-common ``` * 59babf5 - (HEAD -> master, origin/master, origin/HEAD) Add/modify VDM and Status related cmis fields for onboarding xcvr diagnostic features (#510) (3 hours ago) [mihirpat1] ``` #### How I did it #### How to verify it #### Description for the changelog
…utomatically (#20668) #### Why I did it src/sonic-host-services ``` * 13a5419 - (HEAD -> master, origin/master, origin/HEAD) Correct real time CPU Utilization calculation (#173) (3 hours ago) [Feng-msft] * f95b7cd - Optimize state_db update into batch way. (#176) (3 hours ago) [Feng-msft] ``` #### How I did it #### How to verify it #### Description for the changelog
…lly (#20675) #### Why I did it src/sonic-gnmi ``` * e844925 - (HEAD -> master, origin/master, origin/HEAD) Add support for RebootMethod_HALT for Reboot API (#286) (28 hours ago) [Vasundhara Volam] * e5125cb - Merge pull request #316 from hdwhdw/ignore (31 hours ago) [Dawei Huang] * c702d31 - Add build artifacts to gitignore. (2 days ago) [Dawei Huang] ``` #### How I did it #### How to verify it #### Description for the changelog
…9927) - Why I did it Support new psu model and align to hw definition - How I did it Add psu model MTEF-AC-I data to psu_sensors.json Add place holder for model MTEF-AC-G-DELTA Fix 4700/4700a1 inverted psu designation - How to verify it check the sensors command output on system with the psu model MTEF-AC-I Signed-off-by: Yuanzhe, Liu <yualiu@nvidia.com>
Resolves: #20419 Why I did it To clean up syncd environment How I did it Removed python development packages from a Dockerfile template How to verify it make configure PLATFORM=mellanox make target/sonic-mellanox.bin
#### Why I did it Set cpufreq.default_governor to *performance* for faster boot time. We observe consistent 1 sec improvement across several devices. The change in finalize-warmboot.sh restores the default governor after fast or warm boot is finished. **NOTE**: This will apply to upgrades starting from 202405 since this is set in shutdown path to avoid any extra scripts running at boot time. Upgrade from older versions/branches will require a runtime patch to fast-reboot and warm-reboot script. #### How I did it After fast or warm boot is finished restore to default governor. #### How to verify it Run fast-reboot or warm-reboot. Check: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance ``` After boot is finalized check that it is reset back to default: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor schedutil ``` Tested with sonic-net/sonic-utilities#3435
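The manual check above can also be scripted. The sketch below reads the same sysfs file as the `cat` commands. The function names and the `sysfs_root` parameter (added so the helper can be pointed at a test directory) are assumptions for illustration.

```python
from pathlib import Path


def read_governor(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the current scaling governor of the given CPU as a string."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq" / "scaling_governor"
    return path.read_text().strip()


def governor_matches(expected, cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return True if the CPU's current governor equals the expected name."""
    return read_governor(cpu, sysfs_root) == expected
```

During fast or warm boot, `governor_matches("performance")` would be expected to hold. After boot is finalized, the value should be back to the platform default, which is `schedutil` in the example output above.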
Add VRF parameter to orchagent and GNMI start script.
Why I did it
Orchagent failed to bind to ZMQ when mgmt VRF is enabled: #19638
How I did it
Add the VRF parameter to the orchagent and GNMI start scripts when the mgmt-vrf feature is enabled.
How to verify it
Pass all UT.
Why I did it
Baseline implementation for C224O8 support in the Arista-7060X6-64PE (DCS-7060X6-64PE).
How I did it
Added the necessary files for the baseline C224O8 implementation for Quicksilver OSFP.
How to verify it
Load the DUT with these changes and confirm that the relevant interfaces are up.
Which release branch to backport (provide reason below if selected)
202405
Tested branch (Please provide the tested image version)
202405
Description for the changelog
Baseline support for the Arista-7060X6-64PE-C224O8 platform variant.
Signed-off-by: Anand Mehra anamehra@cisco.com
Update cisco-8000.ini to 202405.0.8 release
Release Content
Cisco-8102, Cisco-8800
* SDK HEALTH event notification causing orchagent crash disabled
* Support for multiple rconsole sessions to different LCs from Chassis Supervisor
* Fixed FC asic init failure on Chassis Supervisor reboot
* Added new CLI to display LC Back Plane port to Sup/RP port mapping: 'show platform npu bp-interface-map -n '
* Fixed Log Analyzer error “counter read timeout”
* Fixed Log Analyzer ERROR avago_sbm_spico_int_read
* Enabled enable_mbist_repair across platforms
* Syncd Rpc docker python version migrated to python3
- Why I did it Upgrade MFT tool to 4.30.0-136 - How I did it Change the relevant make file to pick up the new MFT version - How to verify it Run full sonic-mgmt regression on Mellanox platforms.
- What I did
Updated SONiC for OOPT
- How I did it
- How to verify it
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)