Adding SKU Mellanox-SN3800-D100C12S2 #16

Open
wants to merge 123 commits into
base: master

madhanmellanox
Owner

Why I did it

To create a new SKU Mellanox-SN3800-D100C12S2

How I did it

I derived the SKU configuration values from the following SKU template, port mapping, and number of uplinks and downlinks.

SKU template:
Port configuration
• Breakout mode for each port: defined in port mapping
• Speed of the port: defined in port mapping
• Auto-negotiation enable/disable: no setting required
• FEC mode: no setting required
• Type of transceiver used: not needed
Buffer configuration
• Shared headroom: enable
• If shared headroom is enabled, what is the over-subscription ratio? As in SN3800
• Dynamic buffer: disable
• In the static buffer scenario, how many uplinks and downlinks? As in SN3800
• 2km cable support required? No
Switch configuration
• Warmboot enabled? Yes
• Should warmboot be added to SAI profile when enabled? Yes
• Is VxLAN source port range set? Yes
• Should VxLAN source port range be added to SAI profile when set? As in SN3800
• Is Static Policy Based Hashing enabled? No

Port Mapping
etp1 to etp37 are split into 50G
etp38 and etp40 are 10G
etp39 is split into 50G
etp41 to etp52 are split into 50G
etp53 to etp64 are 100G
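
As a sanity check, the mapping above can be tallied into logical port counts. The sketch below assumes that "split into 50G" means a 2x50G breakout (consistent with the "2x50G[40G,25G,10G]" default breakout mode quoted in the review of this PR) and that the SKU name follows the convention D = 50G, C = 100G, S = 10G. Both are assumptions, and PORT_MAP and count_logical_ports are illustrative names, not part of the change.

```python
from collections import Counter

# (first_port, last_port, logical_ports_per_physical_port, speed_gbps)
# Transcribed from the port mapping above; the 2x split is an assumption.
PORT_MAP = [
    (1, 37, 2, 50),
    (38, 38, 1, 10),
    (39, 39, 2, 50),
    (40, 40, 1, 10),
    (41, 52, 2, 50),
    (53, 64, 1, 100),
]


def count_logical_ports(port_map):
    """Return a Counter mapping speed (Gbps) to the number of logical ports."""
    counts = Counter()
    for first, last, split, speed in port_map:
        counts[speed] += (last - first + 1) * split
    return counts


counts = count_logical_ports(PORT_MAP)
for speed in sorted(counts):
    print(f"{counts[speed]} x {speed}G")
```

Under these assumptions the tally is 100 ports at 50G, 12 at 100G, and 2 at 10G, which lines up with the D100C12S2 suffix in the SKU name.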

Number of Uplinks / Downlinks:
T0 topology: 12 100G uplinks; all remaining ports are downlinks.
T1 topology: the SKU will not be used in a T1 topology, so the same split (12 100G uplinks, all remaining ports downlinks) is used to derive the buffer configuration values.

How to verify it

Build the image, install it on the 3800 switch, set the SKU, and verify that the ports come up with the proper speeds.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106

Description for the changelog

Changes are in sonic-buildimage/device/mellanox/x86_64-mlnx_msn3800-r0/Mellanox-SN3800-D100C12S2/ folder.

A picture of a cute animal (not mandatory but encouraged)

Madhan Babu and others added 4 commits July 13, 2021 06:18
#### Why I did it
Update submodule pointer for swss to include recent changes

4f1d726 [portsorch] fix errors when moving port from one lag to another. (sonic-net#1797)
ae44701 [orchagent] Put port configuration to APPL_DB according to autoneg mode (sonic-net#1769)
5295f91 Add failure handling for SAI get operations (sonic-net#1768)
7c7c451 Revert recirc port change (sonic-net#1813)
5528ebf Cleanup code (sonic-net#1814)
Update sonic-snmpagent submodule to pick up new commits:

21d7d97 2021-07-12 Fix: SonicV2Connector behavior change: get_all will return empty dict if (sonic-net#226)
0813b42 2021-07-12 Entries under .1.3.6.1.2.1.31.1.1.1.18 OID should return the "description" field of PORT_TABLE entries in APPL_DB or CONFIG_DB. (sonic-net#224)
7a78703 2021-07-08 Install dotnet core to fix python gcov warning for code covery color bar showing (sonic-net#215)
e0f36a5 2021-06-30 [multi-asic]: Udpate to use SonicDBConfig from swsscommon (sonic-net#219)
266bd15 2021-06-10 Restored snmp vlan support per RFC1213 and added the missing support for RFC2863 (sonic-net#218)

@alexrallen alexrallen left a comment


Reviewed port configs, please see my comment.

"default_brkout_mode": "2x50G[40G,25G,10G]"
},
"Ethernet148": {
"default_brkout_mode": "1x10G[100G,50G,40G,25G]"


Due to sonic-net#7958 these modes will need to be added to the list of supported breakout modes of the platform.json file for the 3800.


I believe the 1x10G is the only missing one but please double check.
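
The check requested in this review can be sketched as a comparison between each port's default_brkout_mode in hwsku.json and the breakout modes listed for that port in platform.json. This is only a sketch: the "interfaces" and "breakout_modes" layout of platform.json is assumed here, the sample data is made up for illustration, and missing_breakout_modes is a hypothetical helper, not an existing SONiC function.

```python
def missing_breakout_modes(hwsku, platform):
    """Return {port: mode} for default breakout modes absent from platform.json.

    Both arguments are already-parsed JSON documents (plain dicts).
    """
    missing = {}
    for port, cfg in hwsku["interfaces"].items():
        mode = cfg["default_brkout_mode"]
        # Assumed layout: platform["interfaces"][port]["breakout_modes"]
        supported = platform["interfaces"].get(port, {}).get("breakout_modes", {})
        if mode not in supported:
            missing[port] = mode
    return missing


# Illustrative sample data only; not the real SN3800 files.
hwsku = {
    "interfaces": {
        "Ethernet144": {"default_brkout_mode": "2x50G[40G,25G,10G]"},
        "Ethernet148": {"default_brkout_mode": "1x10G[100G,50G,40G,25G]"},
    }
}
platform = {
    "interfaces": {
        "Ethernet144": {"breakout_modes": {"2x50G[40G,25G,10G]": ["etp37a", "etp37b"]}},
        "Ethernet148": {"breakout_modes": {"1x100G[50G,40G,25G,10G]": ["etp38"]}},
    }
}

missing = missing_breakout_modes(hwsku, platform)
print(missing)
```

With real files, the two dicts would come from json.load on the SKU's hwsku.json and the platform's platform.json; an empty result would mean every default mode is listed as supported.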

Owner Author


@alexrallen 1x10G support has been added to platform.json. You can see it in the changed files.


Looks good to me.

vivekrnv and others added 25 commits July 13, 2021 18:59
…d of 8 for 4600/4600C platforms (sonic-net#8155)

*Edited platform.json for 4600 & 4600C
*Edited hwsku.json and port_config.ini files for all the SKU's present under these platforms
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
The latest Arista submodule contains pmbus resiliency fixes.

Co-authored-by: Zhi Yuan (Carl) Zhao <zyzhao@arista.com>
…ies (sonic-net#8134)

Why I did it
Allows users to host their own local docker registries and utilize them via the REGISTRY_SERVER and REGISTRY_PORT environmental variables

How I did it
Only set REGISTRY_SERVER and REGISTRY_PORT in rules/config if they are unset.

How to verify it
Export environmental variables REGISTRY_SERVER and REGISTRY_PORT to an alternative docker registry. Export the environmental variable ENABLE_DOCKER_BASE_PULL to y.
Ensure the required sonic-slave docker images are not present locally, but are available in the docker registry
Execute make init and make configure
Confirm that the appropriate docker images were pulled from the appropriate docker registry, and not built locally
Modify accton_as4630_54pe_util.py to support Python3 and remove unneeded code.

Signed-off-by: Jostar Yang <jostar_yang@accton.com.tw>
Fix issue with critical process in the restapi docker restarting immediately after getting killed
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
To include:
> 66e7817 2021-07-13 [pcied] Fix pcied failure to load due to 'pcied NameError: name 'self' is not defined' (sonic-net/sonic-platform-daemons#198) 
> 3df6757 2021-07-08 [ci] fix result color bar in the code coverage report (sonic-net/sonic-platform-daemons#196)
14a4212 [Marvell] CPU1 failure on continuous reboot  (sonic-net#228)
53e75e5 hwmon: (pmbus_core) Do not enable PEC if adapter doesn't (sonic-net#215)

Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
This will be used to build our image as well as tools that need to go
into this image.

Notable changes from Buster:
* Python 2/pip2 module installations have been removed, since nothing
besides the main Python 2 binary (and virtualenv support) is now
available through Bullseye.
* In the cases where both the main library package and the development
package are being installed, now, only the dev package is specified. The
main library is typically marked as a dependency of the dev package.
This reduces the number of changes we have to make as SONAMEs change.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
This adds the Makefile changes to use the Bullseye slave image, but
doesn't use it by default. There should be no functional changes with
this change (Buster will still be used for now).

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
…cm-modules. (sonic-net#8160)

Why I did it
Update XGS and DNX SAI to 5.0.0.4 and additional flags needed in saibcm-modules

The following CSP's are merged in 5.0.0.4

CS00012182148 [4.3] Rate Limit Parity error message to syncd/sonic.
CS00012178692 [4.3] ACL drops counted as interface drops
CS00012183901 [4.3][WARMBOOT] WARMReboot with active traffic causes port flap reported during warm reboot
CS00012070713 [SAI 4.3 , DNX, 8690] Everflow ACL creation fails - brcm_sai_dnx_create_acl_table API fails, with unknown attribute error.
CS00012023263 [4.4] TD3/TH2 : Support 4 lossless queues(2 SW PFCWD and 2 HW PFCWD)
CS00012019578 [4.4] Pre FEC bit-error rate (BER) - DNX and XGS (TD and TH 50/100G)

How I did it
Changes the various make files to include the new SAI release + update the opennsl-modules.
- Why I did it
Make DHCP relay docker an extension. DHCP relay now carries dhcp relay commands CLI plugin and has a complete manifest.
It is installed as an extension if INCLUDE_DHCP_RELAY is set to y.

DEPENDS on sonic-net#5939

- How I did it
Modify DHCP relay docker makefile and dockerfile. Make changes to sonic_debian_extension.j2 to install sonic packages.
I moved DHCP related CLI tests from sonic-utilities to DHCP relay docker.
This PR introduces a way to write a plugin as part of docker image and run the tests from cli-plugin-tests directory under docker directory.
The test result is available in target/docker-dhcp-relay.gz.log:

[ REASON ] :      target/docker-dhcp-relay.gz does not exist   NON-EXISTENT PREREQUISITES: docker-start target/docker-config-engine-buster.gz-load target/python-wheels/sonic_utilities-1.2-py3-none-any.whl-install target/debs/buster/python3-swsscommon_1.0.0_amd64.deb-install
[ FLAGS  FILE    ] : []
[ FLAGS  DEPENDS ] : []
[ FLAGS  DIFF    ] : []
============================= test session starts ==============================
platform linux -- Python 3.7.3, pytest-3.10.1, py-1.7.0, pluggy-0.8.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /sonic/dockers/docker-dhcp-relay/cli-plugin-tests, inifile:
plugins: cov-2.6.0
collecting ... collected 10 items

test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_plugin_registration PASSED [ 10%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_nonexist_vlanid PASSED [ 20%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_invalid_vlanid PASSED [ 30%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_invalid_ip PASSED [ 40%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_dhcp_relay_with_exist_ip PASSED [ 50%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_add_del_dhcp_relay_dest PASSED [ 60%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_remove_nonexist_dhcp_relay_dest PASSED [ 70%]
test_config_dhcp_relay.py::TestConfigVlanDhcpRelay::test_config_vlan_remove_dhcp_relay_dest_with_nonexist_vlanid PASSED [ 80%]
test_show_dhcp_relay.py::TestVlanDhcpRelay::test_plugin_registration PASSED [ 90%]
test_show_dhcp_relay.py::TestVlanDhcpRelay::test_dhcp_relay_column_output PASSED [100%]

=============================== warnings summary ===============================
/usr/local/lib/python3.7/dist-packages/tabulate.py:7
  /usr/local/lib/python3.7/dist-packages/tabulate.py:7: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    from collections import namedtuple, Iterable

-- Docs: https://docs.pytest.org/en/latest/warnings.html
==================== 10 passed, 1 warnings in 0.35 seconds =====================
Why I did it
Currently SONiC uses the 'isc-dhcp-relay' package, which provides DHCP relay functionality on IPv4 networks only.
This change adds IPv6 relay functionality alongside IPv4.

How I did it
Edit supervisord template to start DHCPv6 instances when configured to do so on Config DB.
Align cfg unit test to the new change.
Add DHCPv6 relay minigraph parsing support and a suitable t0 topology xml file for UT.

How to verify it
Configure DHCPv6 agents as described on the feature HLD: sonic-net/SONiC#765
Test it with real client/server with IPv6 or use the dedicated automatic test: sonic-net/sonic-mgmt#3565
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>

* Split docker-dhcp-relay.supervisord.conf.j2 template into several files for easier code maintenance
…8187)

Backlinks is None when there are 0 backlinks. Without this change, backlinks.number()
throws an exception when backlinks == None.
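
A minimal sketch of the kind of guard this commit message describes, assuming backlinks is either None or an object exposing number(). The Backlinks class and the backlink_count helper are hypothetical stand-ins used only to make the example runnable; the actual code in the change may look different.

```python
class Backlinks:
    """Hypothetical stand-in for the real backlinks object."""

    def __init__(self, links):
        self._links = list(links)

    def number(self):
        return len(self._links)


def backlink_count(backlinks):
    """Return the number of backlinks, treating None as zero backlinks."""
    if backlinks is None:
        # Calling backlinks.number() here would raise AttributeError.
        return 0
    return backlinks.number()


print(backlink_count(None))
print(backlink_count(Backlinks(["a", "b"])))
```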
Modify micro code of custom_led.bin to support LED 1G
Co-authored-by: Jostar Yang <jostar_yang@accton.com.tw>
…-net#8188)

*Jun 23, 2021 - Fix config prompt question issue (sonic-net#500) b65e257
*Jul 14, 2021 - [pbh]: Add PBH DB schema. (sonic-net#495) 1d067ca

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
…mplementation in system-health (sonic-net#8186)

swsssdk will be deprecated. Use swsscommon instead.
Signed-off-by: Guohan Lu <lguohan@gmail.com>
List of commits (newest first):

sonic-net/sonic-utilities@0efd297 (origin/master, origin/HEAD) mclag enhancements as per HLD at sonic-net/SONiC#596 (sonic-net#1138)
sonic-net/sonic-utilities@e98bbb6 Reworked IP validation in "config interface ip add/remove" command (sonic-net#1709)
sonic-net/sonic-utilities@866d1d7 [minigraph][port_config] Consume port_config.json while reloading minigraph (sonic-net#1705)
sonic-net/sonic-utilities@9ae6f6b [debug dump util] Match Infrastructure (sonic-net#1666)
sonic-net/sonic-utilities@8fe7e26 Coverage uses top level directory as source (sonic-net#1711)
sonic-net/sonic-utilities@3f0b690 [MPLS][CLI] added config/show CLI for MPLS interface, MPLS CRM threshold config, updated CLI reference manual
sonic-net/sonic-utilities@e8b6c5c [ci] Fix python coverage color bar (sonic-net#1692)
sonic-net/sonic-utilities@888701b [Mellanox] Remove mstdump from Mellanoxs collect dump script (sonic-net#1706)
sonic-net/sonic-utilities@4818360 [sonic-package-manager] support warm/fast reboot for extension packages (sonic-net#1554)
sonic-net/sonic-utilities@793b847 [show priority-group drop counters] Remove backup with cached PG drop counters after 'config reload' (sonic-net#1679)
sonic-net/sonic-utilities@24fe1ac [show][config] support for interface alias for muxcable commands (sonic-net#1699)
sonic-net/sonic-utilities@186d851 Pcieutil to load the platform api first instead of using common api (sonic-net#1672)
sonic-net/sonic-utilities@7a82c06 [Mellanox] Update mellanox dump generation to include SDK dumps (sonic-net#1640)
sonic-net/sonic-utilities@38f8c06 [sfputil] Expose error status fetched from STATE_DB or platform API to CLI (sonic-net#1658)
sonic-net/sonic-utilities@c5d00ae [pfcwd] Fix the return code in invalid case (sonic-net#1691)
sonic-net/sonic-utilities@57dc403 [ci]: Fix config prompt question issue (sonic-net#1693)
sonic-net/sonic-utilities@5708497 [show] fix show version (sonic-net#1686)
sonic-net/sonic-utilities@9041ba0 [config] Adding sanity checks for config reload (sonic-net#1664)
sonic-net/sonic-utilities@2cdadb5 [config]: Create portchannel with LACP key (sonic-net#1473)
sonic-net/sonic-utilities@6f74ba5 [vnet_route_check] Fix logic for getting VNET routes from ASIC DB (sonic-net#1653)
sonic-net/sonic-utilities@54fee0f Add range check on portchannel min-links (sonic-net#1630)
*Added new SKU for SN4600C Platform: Mellanox-SN4600C-D48C40
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
…t command (sonic-net#8132)

- Why I did it
To prevent a Python exception when executing the warm-reboot command on the Mellanox simulator platform.

- How I did it
Return None from the watchdog Python script when the watchdog file does not exist.

- How to verify it
warm-reboot runs without the Python error. An error message will still appear in the log in these cases.
To avoid this error message, the watchdog can be simulated on the Mellanox simulator platform.
ArthiSivanantham and others added 21 commits August 6, 2021 10:32
* SONiC YANG model support for sFlow feature.
Signed-off-by: Arthi Sivanantham <arthi_sivanantham@dell.com>
)

* Update default cable len to 0m for TD2 (sonic-net#8298)
* Update sonic-cfggen tests with the correct cable len

Signed-off-by: Neetha John <nejo@microsoft.com>

As part of the buffer reclamation efforts for TD2, setting the default cable len to 0m which means unused ports will have a cable len of 0m.

Why I did it
To align with the changes in sonic-net/sonic-swss#1830

How to verify it
- With the default cable len set to 0m and the associated changes in swss, CABLE_LENGTH table had '0m' set for unused ports and accordingly more space was reserved for the shared pool
- Cfggen tests passed with the cable len update
…t#8363)

#### Why I did it
* `arp_update` fails to ping those neighbors over vlan sub interfaces.

#### How I did it
* modify `arp_update_vars.j2` to get vlan sub interfaces with ipv6 addresses assigned.
* modify `arp_update` to send ipv6 pings over those retrieved vlan sub interfaces.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
armhf build uses native dockerd

Signed-off-by: Guohan Lu <lguohan@gmail.com>
Advance utilities submodule head to include:
* b540f5f 2021-08-05 | [fast-reboot] revert the change of disabling counter polling before fast-reboot (sonic-net#1744) (HEAD -> master, github/master) [Ying Xie]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* monitor mux_cable_table in state_db to update dhcp acl
Print out the process that holds the dpkg frontend lock.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
…6356)

branch refers to the name of the branch that the commit is in,
for example master, 202012, 201911, ...
If there is no branch, the name will be HEAD.

release is encoded in the /etc/sonic/sonic_release file.
The file is only available on a release branch.
It is not available on the master branch.

example for master branch
```
build_version: 'master.602-6efc0a88'
debian_version: '10.7'
kernel_version: '4.19.0-9-2-amd64'
asic_type: vs
commit_id: '6efc0a88'
branch: 'master'
release: 'none'
build_date: Tue Dec 29 06:54:02 UTC 2020
build_number: 602
built_by: johnar@jenkins-worker-23
```

example for 202012 release branch
```
build_version: '202012.602-6efc0a88'
debian_version: '10.7'
kernel_version: '4.19.0-9-2-amd64'
asic_type: vs
commit_id: '6efc0a88'
branch: '202012'
release: '202012'
build_date: Tue Dec 29 06:54:02 UTC 2020
build_number: 602
built_by: johnar@jenkins-worker-23
```
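
The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the build system's actual implementation: the '<branch>.<build_number>-<commit_id>' pattern for build_version is inferred from the two examples, the 'none' fallback is taken from the master example, and read_release and build_version are hypothetical helper names.

```python
from pathlib import Path


def read_release(release_file="/etc/sonic/sonic_release"):
    """Return the release name, or 'none' when the release file is absent."""
    path = Path(release_file)
    if path.is_file():
        return path.read_text().strip()
    # The file only exists on release branches, so master falls through here.
    return "none"


def build_version(branch, build_number, commit_id):
    """Compose a version string in the shape shown in the examples above."""
    return f"{branch}.{build_number}-{commit_id}"


print(build_version("master", 602, "6efc0a88"))
print(build_version("202012", 602, "6efc0a88"))
```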

Signed-off-by: Guohan Lu <lguohan@gmail.com>
…et#8381)

install build dep causes dpkg lock issue in parallel build

Signed-off-by: Guohan Lu <lguohan@gmail.com>
* DellEMC: Change PG values for S5232f,Z9264f

* change-v1
…#8330)

#### Why I did it
1. Add version control for debian* docker image to white list.
2. Always record docker image sha256 value, regardless of white list.
This PR updates the following commits

cd3cca7 [Y-Cable][Credo] Credo implementation of YCable class which inherits from YCableBase required for Y-Cable API's in sonic-platform-daemons (sonic-net#203)
bd694b2 Load interval from thermal_policy.json (sonic-net#178)
c43dc17 [sonic_y_cable] add abstract class YCableBase required for Y-cable API support for multiple vendors (sonic-net#186)

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
This is the continuation of PR 8381 and is needed for debian 11 build.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
…et#7965)

#### Why I did it
hostcfgd starts at the same time as the 'create_switch' method is called in the orchagent process.
This slows down the execution of that function, which in turn makes the fast-boot flow, and boot in general, run slower (~6 seconds).
This change delays the start of this daemon.
aaastatsd is delayed as well since it depends on hostcfgd, so both must be delayed.
The delay is 90 seconds, which was determined to be the maximum allowed downtime for the control plane to come back up in the fast-boot flow.

#### How I did it
Add two timers for hostcfgd and aaastatsd  services in order to delay the startup of these services.

#### How to verify it
Install an image with this change and observe the daemons start 90 seconds after the system boot.
…#8391)

Fix warning shown during compilation

[ DPKG ] Cache is not enabled for opennsl-modules-dnx_5.0.0.4_amd64.deb package
…e it to connect to dockerd (sonic-net#8398)

Use DOCKER_HOST. Every client including docker command and python docker API uses this environment variable to connect to dockerd.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
…et#8210)

Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>

Why I did it
Install sonic image from ONIE. Once system is up, execute "config reload" command.

Root cause is that "determine-reboot-cause.service" was in failed state.
root@sonic:/host/reboot-cause# systemctl list-units --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● determine-reboot-cause.service loaded failed failed Reboot cause determination service

How I did it
Fixed the issue by setting default reason to "REBOOT_CAUSE_UNKNOWN" instead of "None".

How to verify it
Check that "determine-reboot-cause.service" loaded successfully after image installation from ONIE.
Verify "reboot-cause.txt" file is created and config reload succeeds.
…c-net#8382)

* BRCM Disable ACL Drop counted towards interface RX_DRP counters
arunlk-dell and others added 7 commits August 10, 2021 20:35
Co-authored-by: Arun LK <Arun_L_K@dell.com>
Why I did it
Support for show system-health command in s5232f

How I did it
Added the configuration, API changes to support system health

How to verify it
Execute "show system-health summary/detail/monitor-list" CLI.
…-net#8370)

Enable automated test suites to selectively run (or skip) relevant tests based on a new port_type identifier in hwsku.json

How I did it
Modified the valid optional fields in validity check for hwsku.json per recommendation from Joe in
https://github.com/Azure/sonic-mgmt/pull/2654/files

Co-authored-by: Carl Keene <keene@nokia.com>
… DellEMC-Z9332f-O32) (sonic-net#8420)

Updated pg_profile_lookup.ini for both HWSKUs to match the BRCM recommendation