[Mellanox] [pmon] Fix for PMON service not starting when restarting SWSS service after fast/warm reboot #10901

Merged
merged 4 commits into from
Jun 16, 2022

@shlomibitton commented May 23, 2022
Contributor

Why I did it

A recent change that delays the PMON service after a fast/warm reboot introduced an issue on Nvidia platforms when only the SWSS service is restarted after such a reboot.
The timer is triggered only at system boot. If the system has come up from a fast/warm reboot and the user then restarts the SWSS service, the syncd.sh script stops the PMON service, but the timer does not start again, so PMON stays down.

How I did it

In the syncd.sh script, when a fast/warm boot indication is present, check whether pmon.timer is running.
If it is running, this is the first boot, so continue normally.
If it is not running, the service was restarted, so start the timer to keep system behavior consistent.
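
The check described above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: the function name `ensure_pmon_start` and the boot-type argument values are hypothetical, and only the standard `systemctl is-active` and `systemctl start` commands are assumed.

```shell
#!/bin/bash
# Hypothetical sketch of the pmon.timer check; names are illustrative
# and are not copied from syncd.sh.
#
# ensure_pmon_start BOOT_TYPE
#   BOOT_TYPE: "fast", "warm", or anything else (assumed values)
ensure_pmon_start() {
    case "$1" in
        fast|warm)
            if systemctl is-active --quiet pmon.timer; then
                # Timer is still pending: first boot after the fast/warm
                # reboot, so the timer will start PMON on its own.
                echo "pmon.timer already active; nothing to do"
            else
                # Timer is not pending: the service was restarted after
                # boot, so start the timer again.
                systemctl start pmon.timer
                echo "pmon.timer started"
            fi
            ;;
        *)
            # Other boot types are outside the scope of this sketch.
            echo "not a fast/warm boot; nothing to do"
            ;;
    esac
}
```

Calling `ensure_pmon_start warm` after a SWSS restart would then re-arm the timer only when it is no longer active, which keeps the delayed PMON start in both the first-boot and the restart case.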

How to verify it

  1. Run a fast/warm reboot.
  2. Run service swss restart.
  3. Observe that the PMON service starts.
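
Steps 2 and 3 can be scripted once the fast/warm reboot has completed. The helper below is a sketch under assumptions: `wait_for_unit` is a hypothetical name, and the default 300-second timeout is a guess, since the timer delay is not stated in this PR.

```shell
#!/bin/bash
# Hypothetical helper: poll a systemd unit until it becomes active.
#
# wait_for_unit UNIT [TIMEOUT_SECONDS]
# Returns 0 once UNIT is active, 1 if the timeout expires first.
# Note: unit, timeout and elapsed are plain (global) shell variables.
wait_for_unit() {
    unit="$1"
    timeout="${2:-300}"   # assumed default; adjust to the timer delay
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit is active after ${elapsed}s"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "$unit did not become active within ${timeout}s" >&2
    return 1
}
```

After the reboot, running `sudo service swss restart` followed by `wait_for_unit pmon.service` should report success with the fix in place and time out without it.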

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

…fast/warm reboot

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
@liat-grozovik changed the title from "[PMON] [Nvidia] Fix for PMON service not starting when restarting SWSS service after fast/warm reboot" to "[Mellanox] [pmon] Fix for PMON service not starting when restarting SWSS service after fast/warm reboot" May 24, 2022
@liat-grozovik added the Bug 🐛, Platform: Mellanox, and Request for 202111 Branch labels May 24, 2022
@liat-grozovik
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

liat-grozovik previously approved these changes May 25, 2022
@shlomibitton
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@liat-grozovik
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@liat-grozovik
Collaborator

@shlomibitton could you please check out the failures?

@shlomibitton
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@shlomibitton
Contributor Author

> @shlomibitton could you please check out the failures?

@liat-grozovik it looks like an environment issue, so I retriggered the pipeline.

@shlomibitton
Contributor Author

@liat-grozovik all checks are now passing.

@liat-grozovik merged commit 1474ad7 into sonic-net:master Jun 16, 2022
yxieca pushed a commit that referenced this pull request Jun 17, 2022
judyjoseph pushed a commit that referenced this pull request Jun 20, 2022