-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run fsck filesystem check support prior mounting filesystem #4431
Conversation
If the filesystem become non clean ("dirty"), SONiC does not run fsck to repair and mark it as clean again. This patch adds the functionality to run fsck on each boot, prior to the filesystem being mounted. This allows the filesystem to be repaired if needed. Note that if the filesystem is maked as clean, fsck does nothing and simply return so this is perfectly fine to call fsck every time prior to mount the filesystem. How to verify this patch (using bash): Using an image without this patch: Make the filesystem "dirty" (not clean) [we are making the assumption that filesystem is stored in /dev/sda3 - Please adjust depending of the platform] [do this only on a test platform!] dd if=/dev/sda3 of=superblock bs=1 count=2048 printf "$(printf '\\x%02X' 2)" | dd of="superblock" bs=1 seek=1082 count=1 conv=notrunc &> /dev/null dd of=/dev/sda3 if=superblock bs=1 count=2048 Verify that filesystem is not clean tune2fs -l /dev/sda3 | grep "Filesystem state:" reboot and verify that the filesystem is still not clean Redo the same test with an image with this patch, and verify that at next reboot the filesystem is repaired and becomes clean. fsck log is stored on syslog, using the string FSCK as markup.
@@ -340,4 +340,10 @@ if [ -f $FIRST_BOOT_FILE ]; then | |||
firsttime_exit | |||
fi | |||
|
|||
# Copy the fsck log into syslog |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy [](start = 2, length = 4)
Why copy to syslog? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syslog being the unified way of seeing system messages, we thought that its was more appropriate to hold the fsck log messages instead of a compressed file under the /var/log directory
@@ -85,5 +85,12 @@ then | |||
mount -t tmpfs -o rw,nosuid,nodev,size=${varlogsize}M tmpfs ${rootmnt}/var/log | |||
[ -f ${rootmnt}/host/disk-img/var-log.ext4 ] && rm -rf ${rootmnt}/host/disk-img/var-log.ext4 | |||
else | |||
[ -f ${rootmnt}/host/disk-img/var-log.ext4 ] && fsck.ext4 -v -p ${rootmnt}/host/disk-img/var-log.ext4 2>&1 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fsck.ext4 [](start = 50, length = 9)
Is there an option -y
to prevent interactive prompt? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The -p option that we use here will prevent interactive prompt.
The "-p" option as documented in fsck.ext3 and fsck.ext4 :
-p Automatic repair (no questions)
[ -f ${rootmnt}/host/disk-img/var-log.ext4 ] && mount -t ext4 -o loop,rw ${rootmnt}/host/disk-img/var-log.ext4 ${rootmnt}/var/log | ||
fi | ||
|
||
## fscklog file: /tmp will be lost when overlayfs is mounted | ||
if [ -f /tmp/fsck.log.gz ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only need to mv inside above else
block. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we leave here?
if running fsck is later added to arista platforms, the test will need to be done.
@@ -85,5 +85,12 @@ then | |||
mount -t tmpfs -o rw,nosuid,nodev,size=${varlogsize}M tmpfs ${rootmnt}/var/log | |||
[ -f ${rootmnt}/host/disk-img/var-log.ext4 ] && rm -rf ${rootmnt}/host/disk-img/var-log.ext4 | |||
else | |||
[ -f ${rootmnt}/host/disk-img/var-log.ext4 ] && fsck.ext4 -v -p ${rootmnt}/host/disk-img/var-log.ext4 2>&1 \ | |||
| gzip -c >> /tmp/fsck.log.gz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/tmp/fsck.log.gz [](start = 60, length = 19)
Why append instead of overwrite? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you only gzip 2 files, use 2 names and mv
twice.
In reply to: 409184727 [](ancestors = 409184727)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are actually two ext3/ext4 filesystems that need to be checked:
- the root filesystem mounted on the hard-drive or ssd, defined as root=X in /proc/cmdline
(this is done early from initrd by script fsck-rootfs) - the log filesystem, which is stored in an ext4 file [host/disk-img/var-log.ext4]
This is why we use one file, and we append to this file when the log fileystem is checked.
@@ -165,6 +165,10 @@ sudo chmod +x $FILESYSTEM_ROOT/etc/initramfs-tools/scripts/init-premount/arista- | |||
sudo cp files/initramfs-tools/resize-rootfs $FILESYSTEM_ROOT/etc/initramfs-tools/scripts/init-premount/resize-rootfs | |||
sudo chmod +x $FILESYSTEM_ROOT/etc/initramfs-tools/scripts/init-premount/resize-rootfs | |||
|
|||
# Hook into initramfs: run fsck to repair a non-clean filesystem prior to be mounted | |||
sudo cp files/initramfs-tools/fsck-rootfs $FILESYSTEM_ROOT/etc/initramfs-tools/scripts/init-premount/fsck-rootfs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fsck [](start = 101, length = 4)
How slow is the fsck? any impact on fast-reboot or warm-reboot?
If it is slow, can we only check and log, but expect user to repair it later? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is no error on the filesystem (i.e not marked as dirty), the command execution is immediate, So, on a clean filesystem, there is no impact on either fast-reboot or warm-reboot.
But if the filesystem is marked as dirty, there is no guarantee that the filesystem can be mounted as R/W. Or worse, if it still mounted, some files can be truncated or corrupted. Therefore, we strongly advise to repair a filesystem marked as dirty to avoid potential issues later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor issue remains. ok to me right now.
How long does fsck take? This operation will cut into fast/warm reboot time budget. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you assess the impact to warm/fast reboot from this change?
I hope my reply to Qi Luo answers your questions. If the filesystem has not been cleanly unmounted, fsck detects a dirty bit on the filesystem, which is a matter of reading a single block from the disk or the ssd. If clean, fsck then return immediately, . so the time budget is negligible. If the filesystem is detected as dirty, this might be a different story. But would-you want to mount R/W a filesystem which was not cleanly unmounted (power outage, kernel crash, etc.)? |
@yxieca , i agree with @olivier-singla , it is better to do fsck during reboot. for fast reboot or warm reboot, we should umount disk cleanly before reboot. |
…tem (#4431) * Run fsck filesystem check support prior mounting filesystem If the filesystem become non clean ("dirty"), SONiC does not run fsck to repair and mark it as clean again. This patch adds the functionality to run fsck on each boot, prior to the filesystem being mounted. This allows the filesystem to be repaired if needed. Note that if the filesystem is maked as clean, fsck does nothing and simply return so this is perfectly fine to call fsck every time prior to mount the filesystem. How to verify this patch (using bash): Using an image without this patch: Make the filesystem "dirty" (not clean) [we are making the assumption that filesystem is stored in /dev/sda3 - Please adjust depending of the platform] [do this only on a test platform!] dd if=/dev/sda3 of=superblock bs=1 count=2048 printf "$(printf '\\x%02X' 2)" | dd of="superblock" bs=1 seek=1082 count=1 conv=notrunc &> /dev/null dd of=/dev/sda3 if=superblock bs=1 count=2048 Verify that filesystem is not clean tune2fs -l /dev/sda3 | grep "Filesystem state:" reboot and verify that the filesystem is still not clean Redo the same test with an image with this patch, and verify that at next reboot the filesystem is repaired and becomes clean. fsck log is stored on syslog, using the string FSCK as markup.
If the filesystem become non clean ("dirty"), SONiC does not run fsck to
repair and mark it as clean again.
This patch adds the functionality to run fsck on each boot, prior to the
filesystem being mounted. This allows the filesystem to be repaired if
needed.
Note that if the filesystem is maked as clean, fsck does nothing and simply
return so this is perfectly fine to call fsck every time prior to mount the
filesystem.
How to verify this patch (using bash):
Using an image without this patch:
Make the filesystem "dirty" (not clean)
[we are making the assumption that filesystem is stored in /dev/sda3 - Please adjust depending of the platform]
[do this only on a test platform!]
dd if=/dev/sda3 of=superblock bs=1 count=2048
printf "$(printf '\x%02X' 2)" | dd of="superblock" bs=1 seek=1082 count=1 conv=notrunc &> /dev/null
dd of=/dev/sda3 if=superblock bs=1 count=2048
Verify that filesystem is not clean
tune2fs -l /dev/sda3 | grep "Filesystem state:"
reboot and verify that the filesystem is still not clean
Redo the same test with an image with this patch, and verify that at next reboot the filesystem is repaired and becomes clean.
fsck log is stored on syslog, using the string FSCK as markup.