Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdserver: separate maybeCompactRaftLog function to compact raft log independently #18635

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
22 changes: 15 additions & 7 deletions server/etcdserver/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -980,6 +980,7 @@ func (s *EtcdServer) applyAll(ep *etcdProgress, apply *toApply) {
<-apply.notifyc

s.triggerSnapshot(ep)
s.maybeCompactRaftLog(ep)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you really want to get this small merged firstly, then please ensure it's as independent as possible. Currently the s.snapshot performs both snapshot and compaction operations. It makes sense to extract the compaction operation as an independent function/method, but let's call the method inside s.triggerSnapshot,

func (s *EtcdServer) triggerSnapshot(ep *etcdProgress) {
	if !s.shouldSnapshot(ep) {
		return
	}
	lg := s.Logger()
	lg.Info(
		"triggering snapshot",
		zap.String("local-member-id", s.MemberID().String()),
		zap.Uint64("local-member-applied-index", ep.appliedi),
		zap.Uint64("local-member-snapshot-index", ep.snapi),
		zap.Uint64("local-member-snapshot-count", s.Cfg.SnapshotCount),
		zap.Bool("snapshot-forced", s.forceSnapshot),
	)
	s.forceSnapshot = false

	s.snapshot(ep.appliedi, ep.confState)
	ep.snapi = ep.appliedi
	s.compact(xxxxx)   // call the new method here, so we still do it each time after creating a snapshot.

}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No rush on merging this PR. If we do merge it, we need to ensure etcd actually benefits from it. Let's resolve #17098 (comment) first.

Copy link
Member

@serathius serathius Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of the PR is to make compaction independent from snapshot. Not just refactoring it to function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just refactor the function.

Just refactoring the function (extract the compact into a separate method) is an independent and safe change, accordingly can be merged soon.

The goal of the PR is to make compaction independent from snapshot

It modifies the logic/semantics, so it's no longer an independent change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Refactoring it to function" is a subset of "refactoring".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clearer about #18635 (comment), I am proposing an independent & safe minor refactoring below as the very first step

ahrtr@efae0d2

select {
// snapshot requested via send()
case m := <-s.r.msgSnapC:
Expand Down Expand Up @@ -2170,6 +2171,19 @@ func (s *EtcdServer) snapshot(snapi uint64, confState raftpb.ConfState) {
"saved snapshot",
zap.Uint64("snapshot-index", snap.Metadata.Index),
)
}

func (s *EtcdServer) maybeCompactRaftLog(ep *etcdProgress) {
// Retain all log entries up to the latest snapshot index to ensure any member can recover from that snapshot.
// Beyond the snapshot index, preserve the most recent s.Cfg.SnapshotCatchUpEntries entries in memory.
// This allows slow followers to catch up by synchronizing entries instead of requiring a full snapshot transfer.
if ep.snapi <= s.Cfg.SnapshotCatchUpEntries {
return
}

compacti := ep.snapi - s.Cfg.SnapshotCatchUpEntries
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
compacti := ep.snapi - s.Cfg.SnapshotCatchUpEntries
compacti := ep.appliedi - s.Cfg.SnapshotCatchUpEntries

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of the change?
If you use ep.snapi, then the behaviour is exactly the same as existing behaviour, because ep.snapi only gets updated each time after creating the (v2) snapshot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahrtr #18622 made a lot of changes, but Serathius and I agreed to keep PRs small. So, in this PR, we just separated the compaction from the snapshot while keeping the existing behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see #17098 (comment). Can we get that confirmed firstly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change doesn't change when compaction is run or how many times it's executed. I was aware of https://github.com/etcd-io/raft/blob/5d6eb55c4e6929e461997c9113aba99a5148e921/storage.go#L266-L269 code, that's why I was proposing compacting only ever X applies.


lg := s.Logger()

// When sending a snapshot, etcd will pause compaction.
// After receives a snapshot, the slow follower needs to get all the entries right after
Expand All @@ -2181,13 +2195,7 @@ func (s *EtcdServer) snapshot(snapi uint64, confState raftpb.ConfState) {
return
}

// keep some in memory log entries for slow followers.
compacti := uint64(1)
if snapi > s.Cfg.SnapshotCatchUpEntries {
compacti = snapi - s.Cfg.SnapshotCatchUpEntries
}

err = s.r.raftStorage.Compact(compacti)
err := s.r.raftStorage.Compact(compacti)
serathius marked this conversation as resolved.
Show resolved Hide resolved
if err != nil {
// the compaction was done asynchronously with the progress of raft.
// raft log might already been compact.
Expand Down
Loading