201811 #157

Merged
merged 93 commits into from
Apr 30, 2020

Conversation

bbinxie
Collaborator

@bbinxie bbinxie commented Apr 30, 2020

- What I did

- How I did it

- How to verify it

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

Alex-Dai and others added 30 commits October 17, 2019 15:19
Submodule src/sonic-utilities ae274e5..8237848:
  > [fast/warm reboot] ignore errors after shutting down critical service(s) (sonic-net#761)
  > [neighbor advertiser] raise exception when http endpoint return failure (sonic-net#758)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Submodule src/sonic-swss 8ef513c..f6bfe77:
  > [aclorch] Enable DSCP rules on IPv6 mirror tables (sonic-net#1146)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
…-net#3877)

* Add watchdog-control service to disable watchdog during bootup

Disable only if it's applicable and the watchdog is enabled.

* Address the review comment

* Correct the watchdog start script name

* Change to call common watchdog api instead of platform specific

* Start watchdog control service after swss starts

* advance sonic-utility submodule
* lldpctl: put a lock around some commands to avoid race conditions

* Read all notifications in lldpctl_recv

* lib: fix memory leak

* lib: fix memory leak when handling I/O

* Update series
In-place editing (sed -i) seems to have some issues with filesystem
interaction. It could leave a zero-size or corrupted file behind.

It would be safer to sed the file contents into a new file and then swap
the new file in for the old one.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
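
The write-then-swap approach described in this commit can be sketched in Python as below. This is only an illustration, not the code from the commit: the function name `rewrite_file_atomically` and its parameters are hypothetical, and it assumes the temporary file can be created in the same directory as the target.

```python
import os
import re
import tempfile


def rewrite_file_atomically(path, pattern, replacement):
    """Apply a regex substitution to `path` without editing it in place."""
    with open(path, "r") as src:
        new_text = re.sub(pattern, replacement, src.read())

    # Create the temporary file next to the target so the final rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as dst:
            dst.write(new_text)
            dst.flush()
            os.fsync(dst.fileno())  # push the data to disk before the swap
        os.replace(tmp_path, path)  # swap the new file in for the old one
    except BaseException:
        # On any failure, leave the original untouched and clean up.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Note that `tempfile.mkstemp` creates the file with owner-only permissions, so a real script would likely need to copy the original file's mode onto the new file before the swap.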
…#3906)

* Correct the watch-control service to call the right script

* make watchdog-control.sh executable (chmod +x)
…ned when warm reboot follows a hardware caused reboot (sonic-net#3880)

* [process-reboot-cause]Address the issue: Incorrect reboot cause returned when warm reboot follows a hardware caused reboot
1. check whether /proc/cmdline indicates warm/fast reboot.
   if yes the software reboot cause file will be treated as the reboot cause.
   finish
2. check whether platform api returns a reboot cause.
   if yes it is treated as the reboot cause.
   finish.
3. check whether /hosts/reboot-cause contains a cause.
   if yes it is treated as the cause otherwise return unknown.

* [process-reboot-cause]Fix review comments

* [process-reboot-cause]address comments
1. use "with" statement
2. update fast/warm reboot BOOT_ARG

* [process-reboot-cause]address comments

* refactor the code flow

* Remove escape

* Remove extra ':'
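
The three-step decision order listed in the first commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the process-reboot-cause code itself: the function name, the `BOOT_MARKERS` tokens, and the fallback to "Unknown" when the software cause is empty in step 1 are all hypothetical.

```python
UNKNOWN = "Unknown"

# Hypothetical placeholder tokens: the real kernel boot arguments that mark
# a warm/fast reboot are not spelled out in the commit message.
BOOT_MARKERS = ("boot_type=warm", "boot_type=fast")


def determine_reboot_cause(cmdline, software_cause, platform_cause):
    """Pick the reboot cause in the order described by the commit.

    cmdline        -- contents of /proc/cmdline
    software_cause -- text of the software reboot-cause file, or None
    platform_cause -- cause reported by the platform API, or None
    """
    tokens = cmdline.split()

    # 1. A warm/fast reboot marker means the software cause wins.
    if any(marker in tokens for marker in BOOT_MARKERS):
        return software_cause or UNKNOWN  # fallback is an assumption

    # 2. Otherwise a cause reported by the platform API wins.
    if platform_cause:
        return platform_cause

    # 3. Otherwise use the stored software cause, or report unknown.
    return software_cause or UNKNOWN
```

Checking the kernel command line first is what keeps a stale hardware cause from being reported after a warm reboot that follows a hardware-caused reboot.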
…ic-net#3908)

If we need to stop swss during the fast-reboot procedure on the boot-up path,
something has gone wrong (for example, syncd/orchagent has already crashed)
and we are stopping and restarting swss/syncd to re-initialize. In this case,
we should proceed as if it were a cold reboot.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Rely on platform= and sid= on the command line to detect the platform rather than the eeprom
The platform will now properly initialize even if the system eeprom died or is unreachable.

Add support for the 7260CX3-64E
This is a variant of the 7260CX3-64 with no real difference for software.
…s6100 (sonic-net#3065)

- What I did
Added a daemon to log LPC bus degradation in Intel C2000 processors. Intel Rangeley C2000 processors with revision less than or equal to 2 have an issue where the LPC bus degrades over time in some processors. To identify the problem and report it, a daemon has been added that logs a message when it encounters the issue.

- How I did it
Added a daemon which validates the CPLD scratch(0x102) and SMF scratch(0x202) registers by writing and reading values on regular polling intervals (300 seconds). If there is a discrepancy between read and write, a critical log will be thrown.

- How to verify it
The infra is verified by simulating the issue: the value in the register is modified between the write and the read, and the log output is checked.

- Description for the changelog

Added Daemon to identify LPC bus degradation issue and notify using syslog in Dell S6100 and Z9100 platforms. This daemon will only run on processors with revision less than or equal to 2.
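
The write/read-back check described under "How I did it" can be sketched as below. This is a hedged illustration only: the register addresses and the 300-second interval come from the description, while `check_scratch_register`, `poll_once`, and the `write_reg`/`read_reg` callbacks are hypothetical names standing in for the platform's real register access.

```python
import logging

CPLD_SCRATCH = 0x102          # CPLD scratch register (from the description)
SMF_SCRATCH = 0x202           # SMF scratch register (from the description)
POLL_INTERVAL_SECONDS = 300   # polling interval (from the description)


def check_scratch_register(address, value, write_reg, read_reg):
    """Write `value` to a scratch register and confirm it reads back."""
    write_reg(address, value)
    return read_reg(address) == value


def poll_once(value, write_reg, read_reg, log=logging.getLogger("lpc-monitor")):
    """Check both scratch registers once; log critically on any mismatch."""
    healthy = True
    for name, address in (("CPLD", CPLD_SCRATCH), ("SMF", SMF_SCRATCH)):
        if not check_scratch_register(address, value, write_reg, read_reg):
            log.critical("%s scratch register 0x%x read-back mismatch",
                         name, address)
            healthy = False
    return healthy
```

A daemon built on this would call `poll_once` in a loop and sleep for `POLL_INTERVAL_SECONDS` between iterations. Passing the register accessors in as callbacks also makes it easy to simulate the fault, as in the verification step above.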
* Corefile uploader service

1) A service is added to watch /var/core and upload to Azure storage
2) The service is disabled on boot. One may enable explicitly.
3) The .rc file to be updated with acct credentials and http proxy to use.
4) If service is enabled with no credentials, it would sleep, with periodic log messages
5) For any update in .rc, the service has to be restarted to take effect.

* Remove rw permission for .rc file for group & others.

* Changes per review comments.
Re-ordered .rc file per JSON.dump order.
Added a script to enable partial update of .rc, which HWProxy would use to add acct key.

* Azure storage upload requires python module futures, hence added it to install list.

* Removed trailing spaces.

* A mistake in name corrected.
Copy the .rc updater script to /usr/bin.
* Updates per review comments
1) core_uploader service waits for syslog.service
2) core_uploader service enabled for restart on failure
3) Use mtime instead of file size + ample time to be robust.

* Avoid reloading already uploaded file, by marking the names with a prefix.

* Updated failing path.
1) If the rc file is missing or required data is missing, it periodically logs an error in a forever loop.
2) If the upload fails, it retries every hour with an error log, forever.

* Fix few bugs

* The binary update_json.py will come from sonic-utilities.
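
Two of the ideas in the core uploader commits above, waiting until a core file's mtime is old enough and marking uploaded files with a name prefix, can be sketched like this. The prefix string, the age threshold, and both function names are hypothetical; the actual service may use different values and logic.

```python
import os
import time

UPLOADED_PREFIX = "UPLOADED_"  # hypothetical marker prefix
MIN_AGE_SECONDS = 60           # hypothetical "ample time" since last write


def is_ready_for_upload(path, now=None, min_age=MIN_AGE_SECONDS):
    """Return True if the file is not yet marked and has stopped changing."""
    if os.path.basename(path).startswith(UPLOADED_PREFIX):
        return False  # already uploaded, skip it
    now = time.time() if now is None else now
    # Use mtime rather than file size to decide that writing has finished.
    return now - os.path.getmtime(path) >= min_age


def mark_uploaded(path):
    """Rename the file with the marker prefix and return the new path."""
    directory, name = os.path.split(path)
    new_path = os.path.join(directory, UPLOADED_PREFIX + name)
    os.rename(path, new_path)
    return new_path
```

Renaming in place keeps the core file available locally while ensuring a later scan of the directory does not upload it again.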
…et#3982)

* [201811][monit] address build issue: hard code ARCH to amd64

- also hard code the debian package path as in 201811 branch.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Backport fan driver fix for 201811 release branch.
Original issue leads to invalid RPM readings on a few devices.
Submodule src/sonic-utilities 792df20..7a265b8:
  > A generic JSON file updater, which can add/update-existing attributes. (sonic-net#770)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
…onic-net#3974)

* [Monit] Change the monitoring period of monit from 120 seconds to 60
seconds and, at the same time, double the interval in the existing sonic
monit config file on the host.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
yozhao101 and others added 22 commits March 19, 2020 22:49
…4161)

* Add restart configuration of fancontrol for pmon.

* Clean up the default value setting for exitcodes

* Remove the default setting of stopwaitsecs
Submodule src/sonic-utilities e9747899a..f431510ae:
  > [201811][intfutil] set speed to 0 when interface speed is not available (sonic-net#840)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Signed-off-by: Wenda Ni <wenni@microsoft.com>
This new FW version includes the following fixes:

SFP thermal shutdown issue

Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
admin@sonic:~$ sudo hw-management-wd.sh
Usage: hw-management-wd.sh start [timeout] | stop | tleft | check_reset | help
start - start watchdog
        timeout is optional. Default value will be used in case if it's omitted
        timeout provided in seconds
stop - stop watchdog
tleft - check watchdog timeout left
check_reset - check if previous reset was caused by watchdog
        Prints only in case of watchdog reset
help -this help

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
New CPLD includes support for watchdog type 3 with maximum timeout 65536 sec.

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
… VLAN interfaces (sonic-net#4229)

- Support parsing egress ACLs from minigraph file specified by the "OutAcl" element
- Support attaching ACLs to VLAN interfaces
…ic-net#4263)

- What I did
Add configuration to prevent ntpd from panicking and exiting if the drift between the new time and the current system time is large.

- How I did it
Added "tinker panic 0" in ntp.conf file.

- How to verify it
[this assumes that there is a valid NTP server IP in config_db/ntp.conf]

1. Change the current system time to a bad time with a large drift from the time on the ntp server; the drift should be greater than 1000s.
2. Reboot the device.

Before the fix:
3. Upon reboot, the ntp-config service comes up fine, but the ntp service goes to the active (exited) state without any error message. Because the offset between the new time (from the ntp server) and the current system time is very large, ntpd goes into panic mode and exits. The system continues to show the bad time.

After the fix:
3. Upon reboot, ntp-config comes up fine, and the ntp service comes up fine and stays in the active (running) state. The system clock gets synced with the ntp server time.
Submodule src/sonic-utilities f431510ae..d7e8f84cf:
Fix issue of fields overwritten before display (sonic-net#863)

Signed-off-by: Guohan Lu <lguohan@gmail.com>
Co-authored-by: Guohan Lu <lguohan@gmail.com>
* DellEMC: S6100 CPLD upgrade

* DellEMC: S6100 CPLD upgrade - Retry on failure
* Fixes an issue where, when /host/warmboot/issu_bank.txt is empty or
corrupted, the switch is not able to overcome this and enters a continuous
reload/reboot failure.

Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
* [mellanox]: Add SSD FW update tool.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>

* [mellanox]: Update SSD tool.

Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
…nic-net#4337)

* Include platform info in name.
Get SONiC Version as parameter and use
Make additional tag as optional.
Avoid repetitions by using function.

* Per review comments, make SONIC_VERSION optional and added some comments.

* 1) Additional params are optional
2) Handle DOCKER_IMAGE_TAG only if given
3) Use BUILD_NUMBER only if SONIC_VERSION not given
4) Tag with SONIC_VERSION if given.

Current behavior is not changed, unless SONIC_VERSION is given.

* Update per review comments
1) Added new args with options
2) Handle PORT possibly being empty
3) Exhibit new behavior only if both version & platform are given.

* Drop redundant quotes
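
The tagging rules listed in the commit messages above can be sketched as a small decision function. This is an assumption-heavy illustration: the function `plan_image_refs`, its parameter names, and the `name-platform` naming format are hypothetical, and only the rule ordering is taken from the text (new behavior only when both version and platform are given, BUILD_NUMBER otherwise, optional extra tag).

```python
def plan_image_refs(image, build_number, sonic_version=None,
                    platform=None, extra_tag=None):
    """Return the list of image references that would be tagged."""
    # New behavior applies only if both version and platform are given.
    if sonic_version and platform:
        name = "{}-{}".format(image, platform)  # hypothetical naming format
        tags = [sonic_version]
    else:
        # Existing behavior: tag with the build number.
        name = image
        tags = [build_number]

    # The additional tag is handled only if one was supplied.
    if extra_tag:
        tags.append(extra_tag)

    return ["{}:{}".format(name, tag) for tag in tags]
```

Keeping the old path as the default branch reflects the statement that current behavior is unchanged unless SONIC_VERSION is given.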
Signed-off-by: Danny Allen <daall@microsoft.com>
* [dhcpmon] Filter DHCP O/A Messages of Neighboring Vlans

This code fixes a bug that occurs when two or more vlans exist. Cross contamination
happens for DHCP Offer/Ack packets when they are received on shared northbound links.
The code filters out those packets based on whether the dst IP equals the Vlan loopback IP.

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
Submodule src/sonic-utilities d7e8f84cf..8c21fc151:
  > [utility] Filter FDB entries (sonic-net#890)
  > Fix the warm-reboot script to support FRR based warm-reboot (sonic-net#842)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
update from azure 201811
@bbinxie bbinxie requested a review from tiantianlv April 30, 2020 03:07
@bbinxie bbinxie merged commit 2a2bde5 into 201811_cel_wb Apr 30, 2020
mudsut4ke pushed a commit that referenced this pull request Jan 25, 2021
pick up the following commits:

  > hwmon-lm75: backport support for PCT2075 thermal sensor (#165)
  > [arista] Reassign prefetch memory per platform (#163)
  > Add .gitignore file (#157)
  > Enable PCA9541 I2C mux module (#160)
mudsut4ke pushed a commit that referenced this pull request Jan 25, 2021
…nic-net#6352)

src/sonic-platform-common 9935fca...8664efc (2):

Make sonic_sfp Python2 and Python3 compatible (#157)
[sffbase.py] Fix to make Python 3-compatible (#156)

src/sonic-platform-daemons e6c786b...81318f7 (1):

[psud] Fix issue where PSU Fan info is not updated in State DB (#137)

Fixes sonic-net#6341
mudsut4ke pushed a commit that referenced this pull request Apr 2, 2021
this PR updates the following commits in sonic-platform-daemons
260cf2d [xcvrd] change firmware information fields name inside MUX_CABLE_INFO table for Y cable (#165)
cfa600f [thermalctld] Initialize fan led in thermalctld for the first run (#167)
8509f43 [thermalctld] Refactor to allow for greater unit test coverage; Add more unit tests (#157)
70f4e7b [syseepromd] Update warning message to be more informative (#160)

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
jerseyang pushed a commit that referenced this pull request Jun 4, 2021
…nic-net#6352)

src/sonic-platform-common 9935fca...8664efc (2):

Make sonic_sfp Python2 and Python3 compatible (#157)
[sffbase.py] Fix to make Python 3-compatible (#156)

src/sonic-platform-daemons e6c786b...81318f7 (1):

[psud] Fix issue where PSU Fan info is not updated in State DB (#137)

Fixes sonic-net#6341
jerseyang pushed a commit that referenced this pull request Jun 4, 2021
this PR updates the following commits in sonic-platform-daemons
260cf2d [xcvrd] change firmware information fields name inside MUX_CABLE_INFO table for Y cable (#165)
cfa600f [thermalctld] Initialize fan led in thermalctld for the first run (#167)
8509f43 [thermalctld] Refactor to allow for greater unit test coverage; Add more unit tests (#157)
70f4e7b [syseepromd] Update warning message to be more informative (#160)

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>