Fix "/host unmount failure" during reboot #4558

Merged
merged 3 commits into from
May 20, 2020

Conversation

Contributor

@rkdevi27 rkdevi27 commented May 8, 2020

- Why I did it
Unmounting /host failed during /sbin/reboot.
- How I did it
/var/log is mounted as a separate ext4 filesystem. That filesystem is backed by a loop device and is mounted along with the root partition. The loop device must be detached after /var/log is unmounted, which systemd does not do.
A script that stops the journal services, unmounts /var/log, and deletes the loop device is now invoked when syslog.socket is closed.
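
The sequence described above can be sketched roughly as follows. This is an illustration only, not the script added by this PR: the function names, the `DRY_RUN` switch, and the choice of `systemd-journald.service` as the unit to stop are assumptions. The commands themselves (`systemctl stop`, `umount`, `losetup -d`) are standard systemd and util-linux tools.

```shell
#!/bin/sh
# Sketch only: function names and the DRY_RUN switch are illustrative
# assumptions, not taken from the PR.

DRY_RUN="${DRY_RUN:-1}"   # default to printing commands instead of running them

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

release_var_log() {
    mount_point="$1"   # e.g. /var/log
    loop_dev="$2"      # e.g. the output of: findmnt -n -o SOURCE /var/log

    # 1. Stop the journal service so nothing keeps files open under the mount
    #    (assumed unit name; the PR only says "journal services").
    run systemctl stop systemd-journald.service
    # 2. Unmount the ext4 filesystem.
    run umount "$mount_point"
    # 3. Detach the loop device that backed it.
    run losetup -d "$loop_dev"
}
```

With `DRY_RUN=1`, calling `release_var_log /var/log /dev/loop1` only prints the three commands in order, which makes it possible to review the sequence before running it as root with `DRY_RUN=0`.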

- How to verify it
I ran 100 reboot iterations and confirmed that the issue is fixed on all Dell platforms.
Once the device was up, I checked that the /var/log and /host partitions were mounted properly.
Fast-boot, warm-boot, and normal reboot were also verified.
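
A post-boot check of this kind could be scripted along these lines. The helper names `is_mounted` and `check_mounts` are hypothetical, and the sketch assumes the usual `/proc/mounts` layout, where the second whitespace-separated field of each line is the mount point.

```shell
#!/bin/sh
# Sketch only: is_mounted and check_mounts are hypothetical helper names.

is_mounted() {
    # $1: mount point to look for
    # $2: mounts table to read (defaults to /proc/mounts)
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

check_mounts() {
    # $1: mounts table; remaining arguments: mount points that must be present.
    table="$1"
    shift
    status=0
    for mp in "$@"; do
        if is_mounted "$mp" "$table"; then
            echo "OK: $mp"
        else
            echo "MISSING: $mp"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_mounts /proc/mounts /host /var/log` prints one line per mount point and returns non-zero if either is missing, so it can be dropped into a reboot loop.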

- Description for the changelog

Fixed the /host unmount failure issue.
Attaching the logs of 100 iterations on an S6100 device.

- A picture of a cute animal (not mandatory but encouraged)

logs_S6100.txt

@lguohan lguohan requested a review from qiluo-msft May 9, 2020 18:37
@lguohan
Collaborator

lguohan commented May 11, 2020

retest broadcom please

@lguohan
Collaborator

lguohan commented May 11, 2020

retest vsimage please

Collaborator

@qiluo-msft qiluo-msft left a comment

As comments

@qiluo-msft
Collaborator

Do you have a new commit for review?

@rkdevi27
Contributor Author

Hi Qi,

I'm working on the changes. I will commit them soon.

@joyas-joseph
Contributor

Here's an observation.

In stretch based builds:

root@sonic:~# which reboot
/usr/bin/reboot
root@sonic:~# 
root@sonic:~# reboot
requested COLD shutdown
/var/log: 3.9 GiB (4134080512 bytes) trimmed
/boot: 8.7 GiB (9333067776 bytes) trimmed
Thu May 14 03:07:19 UTC 2020 Rebooting with platform x86_64-dell_s6000_s1220-r0 specific tool ...

In buster based builds:

root@sonic:~# which reboot
/usr/sbin/reboot
root@sonic:~# file /usr/sbin/reboot
/usr/sbin/reboot: symbolic link to /bin/systemctl
root@sonic:~# 
root@sonic:~# 
root@sonic:~# reboot
[ 1805.643776] kdump-tools[11990]: Stopping kdump-tools: unloaded kdump kernel.
[ 1809.078008] i2c i2c-1: delete_device: Deleting device 24c02 at 0x50
[ 1809.090970] i2c i2c-1: delete_device: Deleting device 24c02 at 0x51
[ 1809.098838] i2c i2c-1: delete_device: Deleting device dni_dps460 at 0x58
[ 1809.106068] i2c i2c-1: delete_device: Deleting device dni_dps460 at 0x59
[  OK  ] Unmounted /var/lib/docker/…3e104c5c2af56de201/mounts/shm.

We do really want the reboot to happen through the platform specific tool.
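
The difference between the two builds can be checked with a small helper like the one below. It is a sketch with hypothetical function names, and it assumes `readlink -f` (GNU coreutils) is available, as it is on Debian-based images.

```shell
#!/bin/sh
# Sketch only: reboot_target and is_systemd_reboot are hypothetical names.

reboot_target() {
    # $1: path of the reboot command (defaults to whatever is first on PATH).
    cmd="${1:-$(command -v reboot)}"
    # Follow symbolic links to the file that actually gets executed.
    readlink -f "$cmd"
}

is_systemd_reboot() {
    # Succeeds when the reboot command resolves to systemctl, i.e. the
    # platform-specific reboot wrapper is not the one being picked up.
    case "$(reboot_target "$1")" in
        */systemctl) return 0 ;;
        *) return 1 ;;
    esac
}
```

Running `is_systemd_reboot && echo "reboot goes straight to systemd"` on a device shows which of the two behaviours above applies.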

@rkdevi27
Contributor Author

Hi Joyas,

On S6100 and Z9100, the platform reboot is done through systemd services to ensure a graceful unmount.
Please refer to PR [https://github.com//pull/2912]
On those devices, the normal reboot command also hits the /host unmount issue.

@rkdevi27
Contributor Author

> Do you have new commit for reviewing?

Hi Qi,

I have committed the latest changes addressing the review comments.

Collaborator

@qiluo-msft qiluo-msft left a comment

Looks good to me. Please also check the other reviewers' comments.

@rkdevi27
Contributor Author

Qi/Joyas,

Please let us know if there are any further queries on this before the merge can happen.

@qiluo-msft
Collaborator

Why does the title contain 'DellEMC'? I think it is a general fix.

@rkdevi27
Contributor Author

Yes, right. I will change it.

@rkdevi27 rkdevi27 changed the title from 'DellEMC: Fix for the issue "/host unmount failure" during reboot' to 'Fix for the issue "/host unmount failure" during reboot' on May 20, 2020
@joyas-joseph
Contributor

> Qi/Joyas,
>
> Please let us know if there are any further queries on this before the merge can happen.

Looks good to me.

@qiluo-msft qiluo-msft changed the title from 'Fix for the issue "/host unmount failure" during reboot' to 'Fix "/host unmount failure" during reboot' on May 20, 2020
@qiluo-msft qiluo-msft merged commit 32f58b5 into sonic-net:master May 20, 2020
lguohan pushed a commit that referenced this pull request Jul 25, 2020
The fix for the /host unmount issue in PR #4558 and #4865 causes the closure of syslog.socket to time out during reboot, because the journald socket closure was included in syslog.socket.

Removed the journal socket closure. The /host unmount is fixed by just stopping the services, which are restarted only after /var/log is unmounted and therefore do not cause the unmount issues.
abdosi pushed a commit that referenced this pull request Aug 9, 2020