[sonic-package-manager] support warm/fast reboot for extension packages #1554

Merged: 75 commits, Jul 2, 2021


stepanblyschak
Contributor

- What I did

Implemented functionality in the SONiC package manager to support packages which require special handling for fast and warm reboots. For more details, refer to the HLD: https://github.com/stepanblyschak/SONiC/blob/sonic-app-ext-3/doc/sonic-application-extention/sonic-application-extention-hld.md#warmboot-and-fastboot-design-impact.

- How I did it

I extended the manifest with warm/fast shutdown fields and added logic that accounts for a package's special requirements on fast/warm reboot. The fast/warm reboot scripts now read the ordered list of services from a file on the filesystem instead of having the list hardcoded in the script. This file is regenerated when a package is installed, uninstalled, or upgraded, and it is also generated once at build time. Similarly, the warmboot-finalizer service now reads a file on the filesystem that lists the processes which perform reconciliation.
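To illustrate how "after"/"before" constraints can be turned into an ordered service list, here is a minimal sketch using a topological sort. It is not the code from this PR: the function names, the input shape, and the one-name-per-line file format are assumptions made for illustration only.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def shutdown_order(base_order, constraints):
    """Return a stop order that keeps base_order and honours package constraints.

    base_order:  built-in services in the order they are already stopped.
    constraints: {package: {"after": [...], "before": [...]}} taken from each
                 package's shutdown section (hypothetical input shape).
    """
    sorter = TopologicalSorter()
    for svc in base_order:
        sorter.add(svc)
    # Chain the built-in services so their relative order is preserved.
    for earlier, later in zip(base_order, base_order[1:]):
        sorter.add(later, earlier)

    known = set(base_order) | set(constraints)
    for pkg, rule in constraints.items():
        sorter.add(pkg)
        for svc in rule.get("after", []):
            if svc in known:
                sorter.add(pkg, svc)  # pkg is stopped after svc
        for svc in rule.get("before", []):
            if svc in known:
                sorter.add(svc, pkg)  # pkg is stopped before svc
    # static_order() raises graphlib.CycleError on contradictory constraints.
    return list(sorter.static_order())


def write_order(path, order):
    """Write one service name per line (the file format is an assumption)."""
    with open(path, "w") as f:
        f.write("\n".join(order) + "\n")
```

With a base order of swss, teamd, syncd and a package that must stop after swss and before syncd, the package lands somewhere between swss and syncd. Its position relative to teamd is not fixed by these constraints.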

- How to verify it

There is an open example extension that I pushed to Docker Hub: stepanblischak/cpu-report:warm.
Its manifest can be inspected and the package installed on the switch:

admin@sonic:~$ sudo sonic-package-manager show package manifest --from-repository stepanblischak/cpu-report:warm | grep warm -A 6
        "warm-shutdown": {
            "after": [
                "swss"
            ],
            "before": [
                "syncd"
            ]
admin@sonic:~$ sudo sonic-package-manager install --from-repository stepanblischak/cpu-report:warm -y -v DEBUG
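The manifest fragment shown above can also be read programmatically. The sketch below assumes that "warm-shutdown" sits under a top-level "service" key and that a matching "fast-shutdown" key exists; both are assumptions based on the indentation of the grep output and the PR description. The helper name is hypothetical.

```python
import json

# Sample manifest text; only the "warm-shutdown" values come from the output
# above. The enclosing "service" key and the "name" field are assumptions.
MANIFEST_TEXT = """
{
    "service": {
        "name": "cpu-report",
        "warm-shutdown": {
            "after": ["swss"],
            "before": ["syncd"]
        }
    }
}
"""


def shutdown_constraints(manifest, reboot_type):
    """Return (after, before) lists for "warm" or "fast"; empty if absent."""
    section = manifest.get("service", {}).get(f"{reboot_type}-shutdown", {})
    return list(section.get("after", [])), list(section.get("before", []))


manifest = json.loads(MANIFEST_TEXT)
after, before = shutdown_constraints(manifest, "warm")
print("stop after:", after, "stop before:", before)
```

A package that does not declare a section for a given reboot type simply yields two empty lists, so callers can treat it as unconstrained.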

Then perform a warm reboot and observe that cpu-report is stopped at the right place in the shutdown sequence (after swss and before syncd):

admin@sonic:~$ sudo warm-reboot -v
Wed 31 Mar 2021 12:54:10 PM UTC Saving counters folder before warmboot...
Wed 31 Mar 2021 12:54:13 PM UTC Prepare MLNX ASIC to fastfast-reboot: install new FW if required
Wed 31 Mar 2021 12:54:15 PM UTC Pausing orchagent ...
Wed 31 Mar 2021 12:54:15 PM UTC Collecting logs to check ssd health before fastfast-reboot...
Wed 31 Mar 2021 12:54:15 PM UTC Stopping lldp ...
Wed 31 Mar 2021 12:54:17 PM UTC Stopped lldp
Wed 31 Mar 2021 12:54:17 PM UTC Stopping nat ...
Dumping conntrack entries failed
Wed 31 Mar 2021 12:54:18 PM UTC Stopped nat
Wed 31 Mar 2021 12:54:18 PM UTC Stopping radv ...
Wed 31 Mar 2021 12:54:18 PM UTC Stopped radv
Wed 31 Mar 2021 12:54:18 PM UTC Stopping sflow ...
Wed 31 Mar 2021 12:54:18 PM UTC Stopped sflow
Wed 31 Mar 2021 12:54:18 PM UTC Stopping bgp ...
Wed 31 Mar 2021 12:54:22 PM UTC Stopped bgp
Wed 31 Mar 2021 12:54:22 PM UTC Stopping swss ...
Wed 31 Mar 2021 12:54:31 PM UTC Stopped swss
Wed 31 Mar 2021 12:54:31 PM UTC Initialize pre-shutdown ...
Wed 31 Mar 2021 12:54:31 PM UTC Requesting pre-shutdown ...
Wed 31 Mar 2021 12:54:32 PM UTC Waiting for pre-shutdown ...
Wed 31 Mar 2021 12:54:41 PM UTC Pre-shutdown succeeded, state: pre-shutdown-succeeded ...
Wed 31 Mar 2021 12:54:41 PM UTC Backing up database ...
Wed 31 Mar 2021 12:54:41 PM UTC Stopping cpu-report...
Wed 31 Mar 2021 12:54:41 PM UTC Stopped cpu-report
Wed 31 Mar 2021 12:54:41 PM UTC Stopping teamd ...
Wed 31 Mar 2021 12:54:48 PM UTC Stopped teamd
Wed 31 Mar 2021 12:54:48 PM UTC Stopping syncd ...
Wed 31 Mar 2021 12:54:51 PM UTC Stopped syncd
Wed 31 Mar 2021 12:54:51 PM UTC Stopping all remaining containers ...
Wed 31 Mar 2021 12:54:53 PM UTC Stopped all remaining containers ...
Wed 31 Mar 2021 12:54:55 PM UTC Enabling Watchdog before fastfast-reboot
Watchdog armed for 180 seconds
Wed 31 Mar 2021 12:54:56 PM UTC Rebooting with /sbin/kexec -e to SONiC-OS-master.0-ae9ccf39 ...
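The check described above can be automated against captured output. This is a minimal sketch that assumes the "Stopped <service>" line format seen in this log; the function names are hypothetical and the sample lines are copied from the output above.

```python
import re

# Matches lines ending in "Stopped <single-token>"; multi-word lines such as
# "Stopped all remaining containers ..." are deliberately not matched.
STOPPED = re.compile(r"\bStopped (\S+)\s*$")

SAMPLE_LOG = """\
Wed 31 Mar 2021 12:54:31 PM UTC Stopped swss
Wed 31 Mar 2021 12:54:41 PM UTC Stopping cpu-report...
Wed 31 Mar 2021 12:54:41 PM UTC Stopped cpu-report
Wed 31 Mar 2021 12:54:48 PM UTC Stopped teamd
Wed 31 Mar 2021 12:54:51 PM UTC Stopped syncd
Wed 31 Mar 2021 12:54:53 PM UTC Stopped all remaining containers ...
"""


def stopped_services(log_text):
    """Return service names in the order their "Stopped" lines appear."""
    names = []
    for line in log_text.splitlines():
        match = STOPPED.search(line)
        if match:
            names.append(match.group(1))
    return names


def stopped_between(log_text, service, after, before):
    """True if service was stopped after `after` and before `before`."""
    order = stopped_services(log_text)
    if any(name not in order for name in (service, after, before)):
        return False
    return order.index(after) < order.index(service) < order.index(before)


print(stopped_between(SAMPLE_LOG, "cpu-report", "swss", "syncd"))
```

The same helper can be pointed at the full output of `sudo warm-reboot -v` saved to a file.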

- Previous command output (if the output of a command-line utility has changed)

- New command output (if the output of a command-line utility has changed)

stepanblyschak and others added 30 commits November 3, 2020 15:56
Signed-off-by: Stepan Blyshchak <stepanb@nvidia.com>
…packages-migration

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Co-authored-by: Joe LeVeque <jleveque@users.noreply.github.com>
@liat-grozovik
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

stepanblyschak added a commit to stepanblyschak/sonic-buildimage that referenced this pull request Jun 11, 2021
vs has components from swss, bgp, teamd and nat. This table is needed by this change sonic-net/sonic-utilities#1554.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
@stepanblyschak
Contributor Author

The VS test failed. This change requires sonic-net/sonic-buildimage#7857 in order to pass the VS tests.

@liat-grozovik
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@stepanblyschak
Contributor Author

A successful run requires sonic-net/sonic-buildimage#7857 to be merged and vstest to run on the updated VS image.

lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Jun 23, 2021
The VS image has components from swss, bgp, teamd and nat. The feature table is needed by sonic-net/sonic-utilities#1554, because that change requires "config warm_restart enable FEATURE" and the FEATURE has to be present in the feature table.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
@liat-grozovik
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@stepanblyschak
Contributor Author

Build failure due to a canceled job: [image]

@prsunny
Contributor

prsunny commented Jul 1, 2021

It doesn't look good to open and close the PR multiple times to trigger tests. They can be triggered with /azp run.

@renukamanavalan renukamanavalan merged commit 4818360 into sonic-net:master Jul 2, 2021
liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Jul 7, 2021
Updates:
888701b [Mellanox] Remove mstdump from Mellanoxs collect dump script ([sonic-net/sonic-utilities#1706])
4818360 [sonic-package-manager] support warm/fast reboot for extension packages ([sonic-net/sonic-utilities#1554])
793b847 [show priority-group drop counters] Remove backup with cached PG drop counters after 'config reload' ([sonic-net/sonic-utilities#1679])
24fe1ac [show][config] support for interface alias for muxcable commands ([sonic-net/sonic-utilities#1699])
xumia pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Jul 9, 2021
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021
raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-utilities that referenced this pull request Aug 10, 2021
…es (sonic-net#1554)
