-
Notifications
You must be signed in to change notification settings - Fork 667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fast/warm-reboot] add cpufreq.default_governor=performance to BOOT_OPTIONS #3435
Conversation
…PTIONS Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
What platform and T0 topology were this change tested on? Can you share what changed in the boot up sequence when We need to make sure that this inbuilt performance improvement will not lead to some unexpected bootup sequence, such as race conditions, unexpected preemption sequence, etc. 1s improvement is too tricky to be rightfully attributed to this change. How certain are we that 1s improvement is due to this change? If we know that a certain sequence that got changed due to enabling |
scripts/fast-reboot
Outdated
|
||
# Set governor to performance to speed up boot process. | ||
# The governor is reset back to kernel default in warmboot-finalizer script. | ||
BOOT_OPTIONS="${BOOT_OPTIONS} cpufreq.default_governor=performance" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that you don't add this as a default behavior. This needs to be set by an option provided to warm/fast-reboot script.
That way we can tightly control when/if this is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think an optimization is an optinal feature.
We already did enhancements which are enabled by default, e.g. LACP retry count, that even had a larger impact as the changes are applied also on the peer side.
Also, from the testing perspective we would need to test all combinations of possible optimization flags user may pass to the script.
From the user perspective, it becomes unclear what flags he needs to use, what is heavily tested and what is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need an architecture check? Is this supported on armhf and arm64 as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be useful in some scenarios and I think user should be given an option to control this behavior. Some examples I can think of are:
- High CPU usage during boot up path.
- High thermal load due to use new file location for sys eeprom and sfp eeprom on msn2700 #1.
- If this mode has dependency on underlying hardware then it depends on the hardware if this is supported.
Mostly these concerns are not critical on new platforms. But some old/slow platforms do not really want this high CPU, high fan activity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vaibhavhd @saiarcot895 I limit this setting for nvidia where this is tested and working.
Results were shared over emails.
Could you please clarify what information you are requesting? This PR does not change the boot sequence.
I consider this a good thing, if this change exposes unexpected bootup sequence or a race condition we will catch it and fix it rather then waiting for a hidden bug to occur in production with .1% repro rate without this change.
fast-reboot test gives a consistent result with deviations less then 1s on nvidia platform. Another our platform with another CPU gives 1.5 sec improvement. A reduction of the average time may be considered an improvement. If you'd like us to prove it with statistics, please suggest the number of runs that will raise the confidence. |
@vaibhavhd Can you please help review? |
@saiarcot895 please review this PR. |
Why is this being done only for reboot? Why not for service warm-restart? Has it been considered what other scenarios will/may benefit from this? |
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
@vaibhavhd I do not find performance issues with service warm-restart |
#### Why I did it Set cpufreq.default_governor to *performance* for faster boot time. We observe consistent 1 sec improvement across several devices. The change in finalize-warmboot.sh restores the default governor after fast or warm boot is finished. **NOTE**: This will apply to upgrades starting from 202405 since this is set in shutdown path to avoid any extra scripts running at boot time. Upgrade from older versions/branches will require a runtime patch to fast-reboot and warm-reboot script. #### How I did it After fast or warm boot is finished restore to default governor. #### How to verify it Run fast-reboot or warm-reboot. Check: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance ``` After boot is finalized check that it is reset back to default: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor schedutil ``` Tested with sonic-net/sonic-utilities#3435
…9634) #### Why I did it Set cpufreq.default_governor to *performance* for faster boot time. We observe consistent 1 sec improvement across several devices. The change in finalize-warmboot.sh restores the default governor after fast or warm boot is finished. **NOTE**: This will apply to upgrades starting from 202405 since this is set in shutdown path to avoid any extra scripts running at boot time. Upgrade from older versions/branches will require a runtime patch to fast-reboot and warm-reboot script. #### How I did it After fast or warm boot is finished restore to default governor. #### How to verify it Run fast-reboot or warm-reboot. Check: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance ``` After boot is finalized check that it is reset back to default: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor schedutil ``` Tested with sonic-net/sonic-utilities#3435
…9634) #### Why I did it Set cpufreq.default_governor to *performance* for faster boot time. We observe consistent 1 sec improvement across several devices. The change in finalize-warmboot.sh restores the default governor after fast or warm boot is finished. **NOTE**: This will apply to upgrades starting from 202405 since this is set in shutdown path to avoid any extra scripts running at boot time. Upgrade from older versions/branches will require a runtime patch to fast-reboot and warm-reboot script. #### How I did it After fast or warm boot is finished restore to default governor. #### How to verify it Run fast-reboot or warm-reboot. Check: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance ``` After boot is finalized check that it is reset back to default: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor schedutil ``` Tested with sonic-net/sonic-utilities#3435
…9634) #### Why I did it Set cpufreq.default_governor to *performance* for faster boot time. We observe consistent 1 sec improvement across several devices. The change in finalize-warmboot.sh restores the default governor after fast or warm boot is finished. **NOTE**: This will apply to upgrades starting from 202405 since this is set in shutdown path to avoid any extra scripts running at boot time. Upgrade from older versions/branches will require a runtime patch to fast-reboot and warm-reboot script. #### How I did it After fast or warm boot is finished restore to default governor. #### How to verify it Run fast-reboot or warm-reboot. Check: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance ``` After boot is finalized check that it is reset back to default: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor schedutil ``` Tested with sonic-net/sonic-utilities#3435
#### Why I did it Set cpufreq.default_governor to *performance* for faster boot time. We observe consistent 1 sec improvement across several devices. The change in finalize-warmboot.sh restores the default governor after fast or warm boot is finished. **NOTE**: This will apply to upgrades starting from 202405 since this is set in shutdown path to avoid any extra scripts running at boot time. Upgrade from older versions/branches will require a runtime patch to fast-reboot and warm-reboot script. #### How I did it After fast or warm boot is finished restore to default governor. #### How to verify it Run fast-reboot or warm-reboot. Check: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance ``` After boot is finalized check that it is reset back to default: ``` admin@sonic:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor schedutil ``` Tested with sonic-net/sonic-utilities#3435
What I did
Set cpufreq.default_governor to performance for faster boot time. We observe consistent 1 sec improvement across several devices.
NOTE: This will apply to upgrades starting from 202405 since this is set in shutdown path to avoid any extra scripts running at boot time. Upgrade from older versions/branches will require a runtime patch to fast-reboot and warm-reboot script.
DEPENDS ON: sonic-net/sonic-buildimage#19634
How I did it
Append this option to BOOT_OPTIONS variable.
How to verify it
Run fast-reboot or warm-reboot. Check:
After boot is finalized check that it is reset back to default:
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)