[Celestica-E1031] Enable CPU watchdog #16083
@lizhijianrd how did you test this? Did the system reboot if watchdog is not ticked? Can you issue |
@lizhijianrd after reboot of system due to watchdog timeout, is the "show reboot-cause" showing "watchdog" ? |
Hi @prgeor I tried to set the watchdog time to 60 seconds and keepalive interval to 120 seconds, the watchdog was not ticked on time and triggered reboot successfully. After the reboot, I can see the reboot-cause is Per your suggestion, I also tried |
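The timing in the reply above (a 60-second watchdog timeout with a 120-second keep-alive interval) can be checked with simple arithmetic. The following is a minimal sketch under a simplified model in which every keep-alive restarts the full timeout; the function name `watchdog_fires` is hypothetical and is not part of this PR.

```python
def watchdog_fires(timeout_s, keepalive_interval_s):
    """Return True if the timer can run out before the next keep-alive.

    Simplified model (an assumption, not the platform's actual logic):
    each keep-alive restarts a countdown of timeout_s seconds, so the
    watchdog expires whenever the gap between keep-alives is not
    strictly shorter than the timeout.
    """
    return keepalive_interval_s >= timeout_s


# Values from the test described in this thread: the watchdog is ticked too late.
print(watchdog_fires(60, 120))   # True
# Hypothetical healthy configuration: keep-alive well inside the timeout.
print(watchdog_fires(180, 60))   # False
```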
We request to backport 202012 only because:
Enable CPU watchdog on Celestica-E1031.
Cherry-pick PR to 202012: #16193
What is the motivation for this PR?
PR sonic-net/sonic-buildimage#16083 introduced the cpu_wdt service on the Celestica E1031 platform. The cpu_wdt service periodically sends a keep-alive message to the watchdog via the "watchdogutil arm -s " command. This may affect the result of test_watchdog_reboot. This PR adds a step to stop the cpu_wdt service before doing the watchdog reboot on the DUT.
How did you do it?
Add a step in test_watchdog_reboot to stop the cpu_wdt service before doing the watchdog reboot.
How did you verify/test it?
Verified on a Celestica-E1031 testbed.
Signed-off-by: Zhijian Li <zhijianli@microsoft.com>
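The ordering described above (stop cpu_wdt first, then arm the watchdog and let it expire) could be sketched as follows. This is not the actual test_watchdog_reboot code, which is not shown here; the function name, the `has_cpu_wdt_service` flag, and the example timeout are hypothetical, and the unit name cpu_wdt.service is taken from the commit message in this thread.

```python
def watchdog_reboot_commands(timeout_s, has_cpu_wdt_service):
    """Return the shell commands to run on the DUT, in order.

    Hypothetical helper: it only builds command strings and runs nothing.
    """
    commands = []
    if has_cpu_wdt_service:
        # Stop the keep-alive service first so it cannot re-arm the
        # watchdog while the test waits for the timeout to expire.
        commands.append("systemctl stop cpu_wdt.service")
    # Arm the watchdog; with no keep-alive it should expire after timeout_s.
    commands.append("watchdogutil arm -s {}".format(timeout_s))
    return commands


# Example with an assumed 5-second timeout on a platform that has cpu_wdt.
for command in watchdog_reboot_commands(5, has_cpu_wdt_service=True):
    print(command)
```

Building the command list separately from executing it keeps the ordering easy to check without a DUT.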
Backport #9745: add a step in test_watchdog_reboot to stop the cpu_wdt service before doing the watchdog reboot.
* [Celestica-E1031] Enable CPU watchdog (#16083)
  Enable CPU watchdog on Celestica-E1031.
* Add info syslog for cpu_wdt.service (#16678)
  Why I did it: add an info-level syslog message for cpu_wdt.service when the watchdog arm action is triggered.
  How I did it: add an info-level syslog message for cpu_wdt.service when the watchdog arm action is triggered.
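The behaviour attributed to cpu_wdt in this thread (arm the watchdog periodically, log at info level on each arm, disarm before exit) could look roughly like the sketch below. It is not the service shipped by the PR, whose source is not shown here. The function names, the `run`/`sleep`/`max_ticks` parameters, the log wording, and the 180/60-second values are assumptions; `watchdogutil arm -s` is quoted from the PR text and `watchdogutil disarm` is assumed to be the matching disarm command.

```python
import logging
import subprocess
import time

log = logging.getLogger("cpu_wdt")


def arm_command(timeout_s):
    """Build the arm command quoted in this thread for a given timeout."""
    return ["watchdogutil", "arm", "-s", str(timeout_s)]


def keepalive_loop(timeout_s, interval_s, run=subprocess.check_call,
                   sleep=time.sleep, max_ticks=None):
    """Re-arm the watchdog every interval_s seconds, then disarm on exit.

    run, sleep and max_ticks are injectable so the loop can be exercised
    without real hardware; max_ticks=None means loop until interrupted.
    """
    if interval_s >= timeout_s:
        raise ValueError("keep-alive interval must be shorter than the timeout")
    ticks = 0
    try:
        while max_ticks is None or ticks < max_ticks:
            run(arm_command(timeout_s))
            log.info("armed watchdog for %d seconds", timeout_s)
            ticks += 1
            sleep(interval_s)
    finally:
        # Disarm so that stopping the loop does not leave a live timer behind.
        run(["watchdogutil", "disarm"])
        log.info("disarmed watchdog")


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    keepalive_loop(180, 60, run=print, sleep=lambda _s: None, max_ticks=2)
```

The `finally` block only covers normal exit and Python exceptions; a real service stopped by systemd would also need a SIGTERM handler or an explicit stop command to guarantee the disarm.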
Why I did it
Enable CPU watchdog on Celestica-E1031.
Work item tracking
How I did it
Add a system service, cpu_wdt, to enable the CPU watchdog and periodically send a keep-alive signal to it.
How to verify it
Build the SONiC image and install it on a physical device. The cpu_wdt service works as expected.
When I stopped the cpu_wdt service, it disarmed the watchdog before exiting.
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Link to config_db schema for YANG module changes