Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/deploy: Use a FIFREEZE/FITHAW cycle for /boot #1049

Closed
wants to merge 3 commits into from

Conversation

cgwalters
Copy link
Member

See: http://marc.info/?l=linux-fsdevel&m=149520244919284&w=2

XFS doesn't flush the journal on syncfs(). GRUB doesn't know how to follow the
XFS journal, so if the filesystem is in a dirty state (possible with xfs
/boot, extremely likely with /, if the journaled data includes content for
/boot, the system may be unbootable if a system crash occurs.

Fix this by doing a FIFREEZE+FITHAW cycle. Now, most people
probably would have replaced the syncfs() invocation with those two
ioctls. But this would have become (I believe) the only place in
libostree where we weren't safe against interruption. The failure
mode would be ugly; nothing else would be able to write to the filesystem
until manual intervention.

The real fix here I think is to land an atomic FIFREEZETHAW ioctl
in the kernel. I might try a patch.

In the meantime though, let's jump through some hoops and set up
a "watchdog" child process that acts as a fallback unfreezer.

Closes: #876

@vtolstov
Copy link

vtolstov commented Aug 3, 2017

@cgwalters thanks, sorry i don't understand about

The failure
mode would be ugly; nothing else would be able to write to the filesystem
until manual intervention.

as i know, that kernel by default supports automatic unfreeze via:

 A value of 0 for the argument also means there is no timeout, but any other value is treated as a pointer to a timeout value in seconds

@cgwalters
Copy link
Member Author

@vtolstov
Copy link

vtolstov commented Aug 3, 2017

@cgwalters thanks =). i'm use this feature is more then 3 years and think that it worked =)).
I'm lucky that i need to unfreeze by hand only two times =)

const char *path,
GCancellable *cancellable,
GError **error)
fsfreeze_thaw_cycle (int rootfs_dfd,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a small comment here with some of the rationale from the commit msg? E.g. why we're doing FIFREEZE/FITHAW instead of just syncfs.

else
return glnx_throw_errno_prefix (error, "ioctl(FIFREEZE)");
}
if (ioctl (rootfs_dfd, FITHAW, 0) != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a shortcircuit here based on some env var so we can add an insttest that the watchdog kicks in as predicted?

if (TEMP_FAILURE_RETRY (write (pipe_write, &c, sizeof (c))) < 0)
err (1, "write");
/* Wait for the parent to say it's going to freeze. */
(void) TEMP_FAILURE_RETRY (read (pipefds[0], &c, sizeof (c)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, this is referring to the same pipe, though right? Won't we just pick up the data we just wrote up above? If we want bidirectional comms, I think we need two pipes, right? (So 4 fds total).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, oops 😄 Fixed this by using a socketpair().

See: http://marc.info/?l=linux-fsdevel&m=149520244919284&w=2

XFS doesn't flush the journal on `syncfs()`. GRUB doesn't know how to follow the
XFS journal, so if the filesystem is in a dirty state (possible with xfs
`/boot`, extremely likely with `/`, if the journaled data includes content for
`/boot`, the system may be unbootable if a system crash occurs.

Fix this by doing a `FIFREEZE`+`FITHAW` cycle.  Now, most people
probably would have replaced the `syncfs()` invocation with those two
ioctls.  But this would have become (I believe) the *only* place in
libostree where we weren't safe against interruption.  The failure
mode would be ugly; nothing else would be able to write to the filesystem
until manual intervention.

The real fix here I think is to land an atomic `FIFREEZETHAW` ioctl
in the kernel.  I might try a patch.

In the meantime though, let's jump through some hoops and set up
a "watchdog" child process that acts as a fallback unfreezer.

Closes: ostreedev#876
@jlebon
Copy link
Member

jlebon commented Aug 8, 2017

@rh-atomic-bot r+ 84f064e

@rh-atomic-bot
Copy link

⌛ Testing commit 84f064e with merge 8642ef5...

@rh-atomic-bot
Copy link

☀️ Test successful - status-atomicjenkins
Approved by: jlebon
Pushing 8642ef5 to master...

@cmurf
Copy link

cmurf commented Aug 8, 2017

While the original bug that brought this up only reproduces on XFS, the problem could happen on ext4 or Btrfs. So they all probably should get freeze/thaw after bootloader related things change.

cgwalters referenced this pull request in rpm-software-management/rpm Sep 6, 2017
Bit hysterical that we haven't done this...
On Linux we could call syncfs() for only those filesystems that we
actually touched but somehow I doubt it's worth the trouble.

Another option might be doing this at rpmtxnEnd() but maybe that's
excessive.
cgwalters added a commit to cgwalters/coreos-assembler that referenced this pull request May 26, 2020
Trying to get osmet to work with RHCOS, I kept being unable to
mount the rootfs and it turned out that while the code I'd added
to unpack the LUKS bits was working fine, mounting the rootfs
was failing because the block device was read-only, but XFS
wanted to replay the journal.

And for some reason I haven't traced through yet, this apparently
isn't affecting FCOS, but the unclean journal is breaking osmet
w/RHCOS.

This of course is the *second* time I've hit this problem of
"ext4 flushes the journal on umount, XFS doesn't" - see
also ostreedev/ostree#1049

Adding the necessary invocation of `xfs_freeze` ensures
that our firstboot has a clean XFS journal for `/` which
is just a good idea anyways.
openshift-merge-robot pushed a commit to coreos/coreos-assembler that referenced this pull request May 27, 2020
Trying to get osmet to work with RHCOS, I kept being unable to
mount the rootfs and it turned out that while the code I'd added
to unpack the LUKS bits was working fine, mounting the rootfs
was failing because the block device was read-only, but XFS
wanted to replay the journal.

And for some reason I haven't traced through yet, this apparently
isn't affecting FCOS, but the unclean journal is breaking osmet
w/RHCOS.

This of course is the *second* time I've hit this problem of
"ext4 flushes the journal on umount, XFS doesn't" - see
also ostreedev/ostree#1049

Adding the necessary invocation of `xfs_freeze` ensures
that our firstboot has a clean XFS journal for `/` which
is just a good idea anyways.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

boot: Tracker for single partition journaling+GRUB issue
5 participants