-
Notifications
You must be signed in to change notification settings - Fork 616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raft: Garbage collect WAL files #1327
Conversation
/cc @chanwit, this will fix the problem you had with wal files taking too much space. I believe that if you want to try this out, you could stop dockerd, start swarmd with this patch and @aaronlehmann is that correct? |
Not quite. The garbage collection happens every time a snapshot is triggered, so it takes on the order of 10,000 raft commits to make it happen. |
ecd52e4
to
24ed09e
Compare
I'm having some trouble with the unit test in CI. It seems to get stuck. It passes locally for me. I think it may just be too memory-intensive for the CI virtual machine. The WAL package has a hardcoded 64 MB file rotation threshold, so we need to fill the logs with that much data, and multiplied by 3 nodes in the cluster it must end up being too much for the VM to handle. I've skipped the test when |
Current coverage is 55.07% (diff: 43.10%)@@ master #1327 diff @@
==========================================
Files 80 80
Lines 12582 12606 +24
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 6931 6943 +12
- Misses 4693 4702 +9
- Partials 958 961 +3
|
err := os.Remove(filepath.Join(n.snapDir(), snapFile)) | ||
if err != nil && removeErr == nil { | ||
removeErr = err | ||
if curSnapshotIdx >= 0 && i > curSnapshotIdx { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we construct remainingSnapshots
within this loop? So that it can reduce disk I/O by avoiding the call of ioutil.ReadDir
, (or it may not harm the performance because of cache? I don't know).
Also, it may reduce some redundant code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@runshenzhu: Updated.
} | ||
|
||
// Sort snapshot filenames in lexical order | ||
sort.Sort(sort.StringSlice(remainingSnapshots)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remainingSnapshots
is already sorted, (in reversed order), because snapshots
is sorted.
We may even replace this array with var oldestSnap string
and construct it from the loop above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@runshenzhu: Further simplified.
We currently garbage collect snapshot files (keeping only KeepOldSnapshot outdated snapshots, which defaults to 0). However, we don't garbage collect the WAL files that the snapshots replace. Delete any WALs which are so old that they only contain information that predates the oldest of the snapshots we have retained. This means that by default, we will remove old WALs once they are supplanted by a snapshot. However, if KeepOldSnapshots is set above 0, we will keep whichever WALs are necessary to catch up from the oldest of the retained snapshots. This makes sure that the old snapshots we retain are actually useful, and avoids adding an independent knob for WAL retention that might end up with an inconsistent setting. Also, fix serious brokenness in the the deletion of old snapshots (it was deleting the most recent outdated snapshots, instead of the oldest). Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
LGTM |
Thank you @aaronlehmann for implementing this and @aluzzardi for the ping. An addition requirement here. Is it possible to make
|
Yes, I agree that |
But note that setting |
@aaronlehmann I see thank you for that :-) |
LGTM |
🎉 🎉 |
We currently garbage collect snapshot files (keeping only
KeepOldSnapshot outdated snapshots, which defaults to 0). However, we
don't garbage collect the WAL files that the snapshots replace.
Delete any WALs which are so old that they only contain information that
predates the oldest of the snapshots we have retained. This means that
by default, we will remove old WALs once they are supplanted by a
snapshot. However, if KeepOldSnapshots is set above 0, we will keep
whichever WALs are necessary to catch up from the oldest of the retained
snapshots. This makes sure that the old snapshots we retain are actually
useful, and avoids adding an independent knob for WAL retention that
might end up with an inconsistent setting.
Also, fix serious brokenness in the the deletion of old snapshots (it
was deleting the most recent outdated snapshots, instead of the oldest).
Tested using the following patch, using a failure loop to create a steady stream of writes to raft, and restarting the daemon frequently:
Also added a unit test that adds a lot of data to force a WAL rollover, checks that the old WAL is deleted iff it completely predates the snapshot, and restarts the cluster to make sure the remaining wal/snapshot combination contains all the data.
cc @abronan @runshenzhu @aluzzardi