Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add getTxStatus RPC #1111

Merged
merged 38 commits into from
May 1, 2020
Merged

Add getTxStatus RPC #1111

merged 38 commits into from
May 1, 2020

Conversation

eddyashton
Copy link
Member

@eddyashton eddyashton commented Apr 28, 2020

This adds a getTxStatus RPC, replacing the getCommit(N) variant. getCommit now has a single purpose of fetching the current kv version, and the "is this committed" logic is handled in a single place, by the service.

I'm unsure about naming:

  • I've used term and commit for consistency with existing RPCs/header names/externally facing things, are we planning to convert all of these to view and index? FWIW, I'm still fond of formalising version = view.index, so I've sprinkled that into the docs.
  • What do we call a rolled back transactions? I've gone with LOST, but am open to suggestions.
  • Should we add an UNKNOWN (I've never heard of this transaction) state distinct from PENDING (I know it, but I don't know if it is globally committed)? A long time in the UNKNOWN state may indicate that we've had an election but haven't yet reached your index, but this is only a heuristic (not true for partitioned nodes), and introduces possibility of odd response chains (UNKNOWN -> PENDING -> UNKNOWN -> PENDING -> ... -> [COMMITTED|LOST]).

I believe in the current implementation, if you ask a single node, the response will always transition exactly once from PENDING to either COMMITTED or LOST, and will be that way forever. And the transition will be consistent across all nodes - once you've heard COMMITTED, another node may still say PENDING, but can never say LOST (and vice versa). These are things I don't think we've clearly described in the docs, so I've added a note - let me know if it makes sense.

I've also removed the SigHttpRpcTlsClient - its functionality was almost entirely (and now entirely) handled by the base HttpRpcTlsClient.

EDIT - leaving the original description for posterity, but here's a more accurate current state:

  • Endpoint is simply /tx (ie /users/tx, /members/tx), as an exemplar of where we want to go with HTTP naming style
  • Similarly the transaction ID is given view.seqno. Although we currently use term and commit in response headers/other RPCs, we intend to rename soon.
  • There are 4 states: UNKNOWN, PENDING, COMMITTED, and INVALID. Unknown and Pending can be usefully distinguished, and we can report Invalid in some cases I'd missed initially. There's a non-systematic unit test for how we derive this status.

@eddyashton eddyashton requested a review from a team as a code owner April 28, 2020 16:52
@ccf-bot
Copy link
Collaborator

ccf-bot commented Apr 28, 2020

tx_status@7906 aka 20200501.10 vs master ewma over 50 builds from 7390 to 7895
images

@achamayou
Copy link
Member

I suggest we start with #968 with new endpoints, to minimise future disruption to users.

Not sure about LOST, I think ROLLED_BACK is more precise. I believe we'd agreed a commit id was composed of a view and seqno, which is PBFT terminology?

@eddyashton
Copy link
Member Author

I suggest we start with #968 with new endpoints, to minimise future disruption to users.

Not sure about LOST, I think ROLLED_BACK is more precise. I believe we'd agreed a commit id was composed of a view and seqno, which is PBFT terminology?

I was thinking ROLLED_BACK is too Raft-specific, its not really true for PBFT or a general consensus? We don't know if your transaction was partially accepted then rolled back (and the fact we do this in the kv seems like an implementation detail - in standard Raft terminology these 'local commits' are just stored in a pending lost, not applied then rolled back). The general externally visible behaviour is just "this didn't happen".

And we've settled on view/seqno within the code, but all of the external references are still to term/commit. I think we fix that with #968, in a single sweeping change. Unsure if its preferable for this RPC to be consistent with others, or in its final form already? Same point applies to removing the get prefix - we could make this inconsistent early, if we choose txStatus vs tx_status - but I think its easier to describe if it stays consistent?

@achamayou
Copy link
Member

@eddyashton there's a tradeoff between consistency and user-pain here. If we start making the new endpoints from this one onwards correct, users who rely on those won't have to update the code that deals with it. If we go for consistency, we're making more work for ourselves and for users, in return for slightly less surprise across RPCs.

I would prefer to go straight for the end state here, and to add an OpenAPI schema for this, but I'd happy to hear everyone's opinion about this too.

Lost to me sounds like Rolled back but worse, it's an acknowledgement it existed, but then it was misplaced. I do agree that rolled back isn't quite right either.

@achamayou
Copy link
Member

How about:

[ REPLICATING | UNKNOWN ] -> [ COMMITTED | WONT_COMMIT | UNKNOWN ]

@eddyashton
Copy link
Member Author

How about:

[ REPLICATING | UNKNOWN ] -> [ COMMITTED | WONT_COMMIT | UNKNOWN ]

UNKNOWN is definitely growing on me. Don't think its safe/worth it to distinguish REPLICATING and UNKNOWN. WONT_COMMIT sounds a little too much like a choice. Maybe UNKNOWN -> [ COMMITTED | NOT_COMMITTED ]? (This may look like a bikeshed, but it houses the emergency generator)

@olgavrou
Copy link
Member

NOT_COMMITTED may imply that the transaction is not committed yet, but it might get committed in the future. Another option is REJECTED which clearly states the transaction state.


This example queries the status of transaction ID 2.18 (constructed from view 2 and sequence number 18). The response indicates this was successfully committed. The headers also show that the service has since made progress with other requests, and global commit index has continued to increase.

The possible statuses are:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we are not removing all the mentions to globally committed just yet, but I think that in the description of the statuses it is a bit confusing. I would say globally replicated, replicated to enough nodes to make the transaction durable, or something similar


It is possible that intermediate states are not visible (eg - a transition from Unknown to Committed may never publically show a Pending result). Nodes may disagree on the current state due to communication delays, but will never disagree on transitions (in other words, they may believe a Committed transaction is still Unknown or Pending, but will never report it as Invalid).

Note that transaction IDs are uniquely assigned by the service - once a request has been assigned an ID, this ID will never be associated with a different write transaction. In normal operation, the next requests will be given versions 2.19, then 2.20, and so on, and after a short delay 2.18 will be globally committed. If requests are submitted in parallel, they will be applied in a consistent order indicated by their assigned versions. If the network is unable to reach consensus, it will trigger a leadership election which increments the view. In this case the user's next request may be given a version 3.16, followed by 3.17, then 3.18. The sequence number is reused, but in a different view; the service knows that 2.18 can never be assigned, so it can report this as an invalid ID. Read-only transactions are an exception - they do not get a unique transaction ID but instead return the ID of the last write transaction whose state they may have read.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to remove globally committed from this paragraph too and just leave it as committed

@@ -251,49 +253,36 @@ namespace timing
record_receive(response.id, commit_ids);
}

const auto response_term = body["term"].get<size_t>();
if (response_term == 0)
// NB: Eventual header re-org should be exposing API types so
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s... a submarine TODO? I would suggest updating the headers ticket instead.

x-ccf-global-commit: 33
x-ccf-term: 2
x-ccf-commit: 42
x-ccf-global-commit: 40
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we were changing those to seqno/view? Next time then.

@achamayou achamayou merged commit 95e7ea3 into microsoft:master May 1, 2020
olgavrou added a commit that referenced this pull request May 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants