[reboot-cause] Fix a broken symlink of previous-reboot-cause file removal issue #46
@judyjoseph @abdosi Hi Judy and Abhishek, please review this PR, which fixes "show reboot-cause history" showing no info. Thanks. |
This is a fix that was reverted before -- sonic-net/sonic-buildimage#10751. Per discussion, we need this PR to fix the issue in case the symlink happens to be broken. |
@mlok-nokia did we investigate why the symlink is getting broken in the first place? |
On a platform that doesn't have an RTC and super-cap, the clock goes back to an old time when the system reboots or stays shut down for a long period. If NTP is not enabled, the system starts with that old time. The filename of each reboot-cause history file is constructed from the current timestamp, and "previous-reboot-cause" is created as a symlink to that file. When the system has more than 10 histories, a newer history with an older timestamp is removed because its filename has the oldest timestamp. Hence "previous-reboot-cause" points to a file that has just been removed, and it becomes a broken link. Once this broken symlink exists, it stays forever until the user manually removes it. This PR just recovers from the problem once it happens. |
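The failure mode described above can be illustrated with a small simulation. This is only a sketch under assumptions: the helper names (history_name, prune_oldest) are hypothetical and are not taken from the process-reboot-cause script, and the pruning is modeled simply as "sort by filename, keep the ten newest".

```python
from datetime import datetime

MAX_HISTORY = 10  # the thread says only the ten latest histories are kept
TIME_FORMAT = "%Y_%m_%d_%H_%M_%S"  # matches the filename quoted in this thread


def history_name(ts: datetime) -> str:
    """Build a history filename from a timestamp (hypothetical helper)."""
    return f"reboot-cause-{ts.strftime(TIME_FORMAT)}.json"


def prune_oldest(names, keep=MAX_HISTORY):
    """Return (kept, removed), keeping the `keep` names with the newest timestamps."""
    ordered = sorted(names)  # zero-padded timestamps sort chronologically as strings
    cut = max(len(ordered) - keep, 0)
    return ordered[cut:], ordered[:cut]


# Ten histories written while the clock was correct.
history = [history_name(datetime(2023, 3, day, 12, 0, 0)) for day in range(1, 11)]

# The clock falls back, so the newest record gets the oldest-looking name.
newest = history_name(datetime(2020, 1, 1, 0, 0, 0))
history.append(newest)
symlink_target = newest  # previous-reboot-cause points at the newest record

kept, removed = prune_oldest(history)
print("removed:", removed)
print("symlink broken:", symlink_target not in kept)
```

In this model the record that was just written is the one that gets pruned, which is how the symlink ends up pointing at a file that no longer exists.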
Please create a separate PR for 202205, as master has a submodule while 202205 does not. |
A new PR has been created on 202205 branch -- sonic-net/sonic-buildimage#14106 |
@mlok-nokia where is the code which removes |
This is done in the process-reboot-cause script, in the function read_reboot_cause_files_and_save_state_db(). |
@mlok-nokia Thanks for pointing me to the code. The reboot-cause files with timestamps get generated before the reboot, i.e. the current timestamp after reboot has no impact on the deletion of files, because the code removes the OLDEST files and keeps only the ten latest ones. So how is NTP causing the wrong deletion of files? Is it that over a long period of system operation and reboots, the timestamps of the various reboot-cause files get out of order? |
|
…cause file removal issue (sonic-host-services #46) (#14106)
Why I did it
Porting/cherry-pick of PR sonic-net/sonic-host-services#46. "show reboot-cause history" shows empty history. When previous-reboot-cause is a broken symlink, rebooting the system cannot generate a new previous-reboot-cause symlink.
admin@sonic:~$ show reboot-cause history
Name Cause Time User Comment
------ ------- ------ ------ ---------
How I did it
When the symlink /host/reboot-cause/previous-reboot-cause is broken (its destination file doesn't exist), the current condition check "if os.path.exists(PREVIOUS_REBOOT_CAUSE_FILE)" in the determine-reboot-cause script returns False. Hence the existing previous-reboot-cause is not removed, and the creation of the new previous-reboot-cause fails. To handle a broken symlink, add the condition os.path.islink(PREVIOUS_REBOOT_CAUSE_FILE) to the check so that the remove operation can happen.
How to verify it
Manually make /host/reboot-cause/previous-reboot-cause a broken symlink by removing its destination file, then reboot the system. "show reboot-cause history" should show the correct info.
Signed-off-by: mlok <marty.lok@nokia.com>
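The behavior of the checks on a dangling symlink can be reproduced outside SONiC. The following is a minimal sketch that works in a temporary directory rather than /host/reboot-cause; the function name broken_symlink_report and the file names are made up for illustration.

```python
import os
import tempfile


def broken_symlink_report(workdir):
    """Create a dangling symlink in workdir and report how the checks see it."""
    target = os.path.join(workdir, "history.json")
    link = os.path.join(workdir, "previous-reboot-cause")

    with open(target, "w") as f:
        f.write("{}")
    os.symlink(target, link)
    os.remove(target)  # the link now points at nothing

    report = {
        "exists": os.path.exists(link),    # follows the link
        "islink": os.path.islink(link),    # inspects the link itself
        "lexists": os.path.lexists(link),  # like exists(), but does not follow
    }
    try:
        os.symlink(target, link)  # recreating without removing the old link
        report["recreate_error"] = None
    except FileExistsError as err:
        report["recreate_error"] = type(err).__name__
    return report


with tempfile.TemporaryDirectory() as tmp:
    print(broken_symlink_report(tmp))
```

Because os.path.exists() follows the link, it reports False for a dangling symlink even though the link entry is still on disk, so an attempt to create a new symlink at the same path fails.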
@mlok-nokia I deleted the symlink itself, and I don't see the issue that you are seeing. I don't understand your explanation that
|
@mlok-nokia please help me to understand better
You need to remove the source file -- /host/reboot-cause/history/reboot-cause-2023_03_06_23_24_13.json. Then the symlink becomes broken. After that, reboot the system and the problem will occur. |
@mlok-nokia I rebooted after deleting the history file. Because the symlink is broken, the reboot cause is 'unknown'. But I don't see the issue. If there are history files under
|
@prgeor The following is what I got using a 202205 build from 02/24/2023.
|
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@mlok-nokia could you take care of test coverage? |
…oval issue Signed-off-by: mlok <marty.lok@nokia.com>
4d97bfe to 2e909f3
@judyjoseph I have added the test case test_determine_reboot_cause_main() for it. Thanks |
@mlok-nokia could you update the test to cover this change so that we get the coverage? Thanks |
@prgeor Hi Prince, I have moved the code to the place you suggested. But somehow our conversation has been cleared, so I am just letting you know. Please review the change. Thanks |
… to the place before its new symlink creation
d4a309f to 6c11931
All set. Thanks. |
[reboot-cause] Fix a broken symlink of previous-reboot-cause file removal issue
Why I did it
"show reboot-cause history" shows empty history. When previous-reboot-cause is a broken symlink, rebooting the system cannot generate a new previous-reboot-cause symlink.
How I did it
When the symlink /host/reboot-cause/previous-reboot-cause is broken (its destination file doesn't exist), the current condition check "if os.path.exists(PREVIOUS_REBOOT_CAUSE_FILE)" in the determine-reboot-cause script returns False. Hence the existing previous-reboot-cause is not removed, and the creation of the new previous-reboot-cause fails. To handle a broken symlink, add the condition os.path.islink(PREVIOUS_REBOOT_CAUSE_FILE) to the check so that the remove operation can happen.
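A minimal sketch of the check described here, assuming the fix combines the two conditions with "or" before removing the old link. The helper name replace_symlink and the file names are hypothetical and are not copied from the determine-reboot-cause script.

```python
import os
import tempfile


def replace_symlink(target, link_path):
    """Point link_path at target, removing any previous link first."""
    # exists() alone is False for a dangling link, so islink() is also checked.
    if os.path.exists(link_path) or os.path.islink(link_path):
        os.remove(link_path)
    os.symlink(target, link_path)


with tempfile.TemporaryDirectory() as tmp:
    old_target = os.path.join(tmp, "reboot-cause-old.json")
    new_target = os.path.join(tmp, "reboot-cause-new.json")
    link = os.path.join(tmp, "previous-reboot-cause")

    # Leave a dangling link behind: it points at a file that was never created.
    os.symlink(old_target, link)

    with open(new_target, "w") as f:
        f.write("{}")
    replace_symlink(new_target, link)

    points_to_new = os.readlink(link) == new_target
    resolves = os.path.exists(link)
    print("points to new file:", points_to_new, "| resolves:", resolves)
```

os.path.lexists() would express the same test in a single call; the two-condition form is shown because it mirrors the wording of this description.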
How to verify it