-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raft: use RawNode for node's event loop; clean up bootstrap #10892
Conversation
tbg
commented
Jul 15, 2019
•
edited
Loading
edited
4d598d8
to
ecf60c2
Compare
ecf60c2
to
069125b
Compare
The first commit LGTM. I agree in spirit with the second commit, but worry about backwards compatibility. For better or worse, RawNode has supported the same auto-initialization as Node, so someone other than CockroachDB could be depending on it (at least the signature change on NewRawNode will break their build so the failure can't pass silently). I think that since the first commit reduces the duplication so the bootstrapping only lives in one place, it's probably best if that one place continues to be in RawNode instead of StartNode, at least until the anticipated helpers are available. |
It has always bugged me that any new feature essentially needed to be tested twice due to the two ways in which apps can use raft (`*node` and `*RawNode`). Due to upcoming testing work for joint consensus, now is a good time to rectify this somewhat. This commit removes most logic from `(*node).run` and uses `*RawNode` internally. This simplifies the logic and also lead (via debugging) to some insight on how the semantics of the approaches differ, which is now documented in the comments.
This is the first (maybe not last) step in cleaning up the bootstrap code around StartNode. Initializing a Raft group for the first time is awkward, since a configuration has to be pulled from thin air. The way this is solved today is unclean: The app is supposed to pass peers to StartNode(), we add configuration changes for them to the log, immediately pretend that they are applied, but actually leave them unapplied (to give the app a chance to observe them, though if the app did decide to not apply them things would really go off the rails), and then return control to the app. The app will then process the initial Readys and as a result the configuration will be persisted to disk; restarts of the node then use RestartNode which doesn't take any peers. The code that did this lived awkwardly in two places fairly deep down the callstack, though it was really only necessary in StartNode(). This commit refactors things to make this more obvious: only StartNode does this dance now. In particular, RawNode does not support this at all any more; it expects the app to set up its Storage correctly. Future work may provide helpers to make this "preseeding" of the Storage more user-friendly. It isn't entirely straightforward to do so since the Storage interface doesn't provide the right accessors for this purpose. Briefly speaking, we want to make sure that a non-bootstrapped node can never catch up via the log so that we can implicitly use one of the "skipped" log entries to represent the configuration change into the bootstrap configuration. This is an invasive change that affects all consumers of raft, and it is of lower urgency since the code (post this commit) already encapsulates the complexity sufficiently.
We are worried about breaking backwards compatibility for any application out there that may have relied on the old behavior. Their RawNode invocation would have been broken by the removal of the peers argument so it would not have changed silently; an associated comment tells callers how to fix it.
0aadfa8
to
500af91
Compare
Makes sense. I didn't want to bring this mostly-useless extra arg back though, so I added |
It has a data race between the test's call to `reduceUncommittedSize` and a corresponding call during Ready handling in `(*node).run()`. The corresponding RawNode test still verifies the functionality, so instead of fixing the test we can remove it.
I changed `(*RawNode).Ready`'s behavior in etcd-io#10892 in a problematic way. Previously, `Ready()` would create and immediately "accept" a Ready (i.e. commit the app to actually handling it). In etcd-io#10892, Ready() became a pure read-only operation and the "accepting" was moved to `Advance(rd)`. As a result it was illegal to use the RawNode in certain ways while the Ready was being handled. Failure to do so would result in dropped messages (and perhaps worse). For example, with the following operations 1. `rd := rawNode.Ready()` 2. `rawNode.Step(someMsg)` 3. `rawNode.Advance(rd)` `someMsg` would be dropped, because `Advance()` would clear out the outgoing messages thinking that they had all been handled by the client. I mistakenly assumed that this restriction had existed prior, but this is incorrect. I noticed this while trying to pick up the above PR in CockroachDB, where it caused unit test failures, precisely due to the above example. This PR reestablishes the previous behavior (result of `Ready()` must be handled by the app) and adds a regression test. While I was there, I carried out a few small clarifying refactors.