Fix an issue where the rsyslog-config service fails to start #20840
Closed
Why I did it
The rsyslog-config service is a one-shot service. We found that it can sometimes fail after a config reload. The issue was observed on our OC test bed, on a supervisor card with fully loaded fabric cards. The failing test case is platform_tests/test_reload_config.py::test_reload_configuration_checks:
assert wait_until(300, 20, 0, config_system_checks_passed, duthost, delayed_services)
E AssertionError
After config reload:
admin@ixre-cpm-chassis19:~$ sudo systemctl status rsyslog-config.service
× rsyslog-config.service - Update rsyslog configuration
Loaded: loaded (/lib/systemd/system/rsyslog-config.service; enabled-runtime; preset: enabled)
Active: failed (Result: exit-code) since Thu 2024-11-14 17:06:12 UTC; 9min ago
Main PID: 11123 (code=exited, status=1/FAILURE)
Nov 14 17:06:11 ixre-cpm-chassis19 systemd[1]: Starting rsyslog-config.service - Update rsyslog configuration...
Nov 14 17:06:12 ixre-cpm-chassis19 systemctl[11277]: Job for rsyslog.service failed because the control process exited with error code.
Nov 14 17:06:12 ixre-cpm-chassis19 systemctl[11277]: See "systemctl status rsyslog.service" and "journalctl -xeu rsyslog.service" for details.
Nov 14 17:06:12 ixre-cpm-chassis19 systemd[1]: rsyslog-config.service: Main process exited, code=exited, status=1/FAILURE
Nov 14 17:06:12 ixre-cpm-chassis19 systemd[1]: rsyslog-config.service: Failed with result 'exit-code'.
Nov 14 17:06:12 ixre-cpm-chassis19 systemd[1]: Failed to start rsyslog-config.service - Update rsyslog configuration.
admin@ixre-cpm-chassis19:~$ sudo systemctl list-units --state=failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● rsyslog-config.service loaded failed failed Update rsyslog configuration
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
1 loaded units listed.
Work item tracking
How I did it
Add "Restart=on-failure" to rsyslog-config.service so that systemd restarts the service if it fails to start.
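As a rough illustration, the change might look like the unit-file fragment below. The actual contents of rsyslog-config.service are not shown in this description, so everything except the Restart=on-failure line is an assumption: Type=oneshot is inferred from "one shot service", and the RestartSec value is hypothetical.

```ini
# Sketch only: the real rsyslog-config.service may differ.
[Service]
# Assumed from the description, which calls this a one-shot service.
Type=oneshot
# The fix described in this PR: retry when the service exits with a failure.
Restart=on-failure
# Hypothetical retry delay; the PR description does not state one.
RestartSec=5
```

Note that, to my understanding, systemd only accepts Restart=on-failure on Type=oneshot units from version 244 onward, so this relies on a reasonably recent systemd.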
How to verify it
With the fix, the issue was no longer seen on the same setup where it was originally observed.
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
202405