Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachpb: introduce QueryIntent request #26335

Merged
merged 1 commit into from
Jun 25, 2018

Conversation

nvanbenschoten
Copy link
Member

This change introduces a new request method called QueryIntent. The request
type draws parallels to the QueryTxn method, but it visits an intent instead
of a transaction record and returns whether the intent exists or not. The
request type also includes an "if missing" behavior, which allows clients
to optionally ensure that the request prevents a missing intent from ever
being written or return an error if the intent is found to be missing.

This request type was proposed/discussed in both #24194 and #16026. It is a
prerequisite for either proposal.

Release note: None

@nvanbenschoten nvanbenschoten requested review from tbg and a team June 1, 2018 21:31
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@bdarnell
Copy link
Contributor

bdarnell commented Jun 4, 2018

Review status: all files reviewed at latest revision, all discussions resolved, all commit checks successful.


pkg/roachpb/api.go, line 1060 at r1 (raw file):

func (*PushTxnRequest) flags() int               { return isWrite | isAlone }
func (*QueryTxnRequest) flags() int              { return isRead | isAlone }
func (*QueryIntentRequest) flags() int           { return isRead | updatesReadTSCache }

Comment that this only updates the read TS cache sometimes (and a reference to the code below)


pkg/roachpb/api.proto, line 797 at r1 (raw file):

    // response.
    DO_NOTHING = 0;
    // Throw an IntentMissingError.

s/throw/return/


pkg/roachpb/api.proto, line 810 at r1 (raw file):

  // behavior could be a function of (BatchRequest.Txn.ID == ExpectedTxn.ID).
  // This might be convenient, but I think it's probably better in the long run
  // to keep this behavior explicit and orthogonal to the state of the caller.

👍


pkg/roachpb/api.proto, line 816 at r1 (raw file):

  // It also might make more sense to make if_missing_throw_error and
  // if_missing_prevent independent fields so that they become completely
  // orthogonal options that can be used together or independently.

I'm not sure we would ever want both - if an error is returned, the rest of the batch is failed, so we wouldn't reliably prevent all the intents if any of them encountered an error.


pkg/roachpb/api.proto, line 825 at r1 (raw file):

  // Whether an intent matching the expected transaction was found at the key.
  //
  // TODO DURING REIVEW: we could return the entire TxnMeta here instead. This

I think it's fine to not return a TxnMeta, since the caller is required to pass in a TxnMeta.

Does a bool capture all possibilities? Do we need to distinguish between "not found, but could still be written" from "not found, and could no longer be written"?


pkg/roachpb/errors.proto, line 384 at r1 (raw file):

  option (gogoproto.equal) = true;

  optional storage.engine.enginepb.TxnMeta txn = 1 [(gogoproto.nullable) = false];

Document this. Why do we echo the caller's TxnMeta back to them? If we need a TxnMeta here, maybe we should include one in the QueryIntentResponse.


pkg/storage/replica.go, line 2089 at r1 (raw file):

						// TODO DURING REVIEW: do we want to update the write timestamp
						// cache instead so that we ensure that the WriteToOld flag
						// is set on the eventual intent write?

No, I don't think setting WriteTooOld is required here. WriteTooOld forces transaction restarts because there is known to be a conflicting write; to prevent the writing of an intent the read timestamp cache is sufficient and a bit cheaper.


pkg/storage/batcheval/cmd_query_intent.go, line 59 at r1 (raw file):

		i := intents[0]
		reply.FoundIntent = (i.Txn.ID == args.Txn.ID) &&
			(i.Txn.Epoch >= args.Txn.Epoch) &&

Is the >= required here? When would the epoch and sequence not be an exact match?


Comments from Reviewable

@nvanbenschoten
Copy link
Member Author

Review status: all files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


pkg/roachpb/api.go, line 1060 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Comment that this only updates the read TS cache sometimes (and a reference to the code below)

Done.


pkg/roachpb/api.proto, line 797 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/throw/return/

Done.


pkg/roachpb/api.proto, line 816 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I'm not sure we would ever want both - if an error is returned, the rest of the batch is failed, so we wouldn't reliably prevent all the intents if any of them encountered an error.

Good point. I'll keep this as is.


pkg/roachpb/api.proto, line 825 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I think it's fine to not return a TxnMeta, since the caller is required to pass in a TxnMeta.

Does a bool capture all possibilities? Do we need to distinguish between "not found, but could still be written" from "not found, and could no longer be written"?

I don't think we need to make this distinction, at least for any of our proposed uses of this request. If a client needs to prevent a missing intent then it will use the PREVENT behavior, in which case this bool indicates the latter. If a client doesn't need to prevent the intent, then it currently doesn't cares about the difference.


pkg/roachpb/errors.proto, line 384 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Document this. Why do we echo the caller's TxnMeta back to them? If we need a TxnMeta here, maybe we should include one in the QueryIntentResponse.

Done. And added the incorrect intent that was found, if any, to give the error message more color.


pkg/storage/replica.go, line 2089 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

No, I don't think setting WriteTooOld is required here. WriteTooOld forces transaction restarts because there is known to be a conflicting write; to prevent the writing of an intent the read timestamp cache is sufficient and a bit cheaper.

This depends on what we mean by "prevent the writing of an intent". I'm realizing (see TestQueryIntentRequest) that the read timestamp cache doesn't prevent the intent entirely, it just forces it to bump its timestamp. That's sufficient in the case of parallel commits because DistSender will check whether the transaction had its timestamp pushed by any promised write before committing the transaction.

Removed this comment.


pkg/storage/batcheval/cmd_query_intent.go, line 59 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Is the >= required here? When would the epoch and sequence not be an exact match?

The epoch should have been ==. This was also missing the very important Timestamp comparison, which is necessary because an intent that has a larger timestamp than expected must have been pushed and should not count as a matching intent.

The sequence number is the only interesting case where we want more than exact equality. See the parallel commits RFC or the command I added to QueryIntentRequest.Txn for a discussion on why.


Comments from Reviewable

@nvanbenschoten nvanbenschoten changed the title [WIP] roachpb: introduce QueryIntent request roachpb: introduce QueryIntent request Jun 7, 2018
@nvanbenschoten
Copy link
Member Author

Review status: 0 of 13 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


pkg/storage/batcheval/cmd_query_intent.go, line 59 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

The epoch should have been ==. This was also missing the very important Timestamp comparison, which is necessary because an intent that has a larger timestamp than expected must have been pushed and should not count as a matching intent.

The sequence number is the only interesting case where we want more than exact equality. See the parallel commits RFC or the command I added to QueryIntentRequest.Txn for a discussion on why.

I think the Timestamp comparison is off here and in the parallel commits RFC. I think it actually should be checking that the intent has a timestamp that is "less than or equal to" the txn's provisional commit timestamp. The reason for this is that a txn's intents don't need to be at its commit timestamp to allow it to commit, they just need to be at a timestamp equal to or less than its commit timestamp.

This comes up with write pipelining because a batch may have its timestamp bumped after it laid down intents but from the point of view of the txn coordinator it's unclear which of its intents were laid down at which timestamp. If the txn refreshes successfully then its intents, even at older timestamps, are still valid and a QueryIntent request should still see them.

Likewise, for parallel commits, a pusher trying to prevent a transaction from committing needs to prove that the transaction has failed to lay down an intent that it promises to write before committing. If the pusher finds an intent with the same sequence number but a smaller timestamp then bumping the timestamp cache won't do anything from the point of view of the committing transaction. In its eyes, it has laid down a "committable" intent.

cc. @tschottdorf to double check this reasoning.


Comments from Reviewable

@tbg
Copy link
Member

tbg commented Jun 9, 2018 via email

@bdarnell
Copy link
Contributor

bdarnell commented Jun 9, 2018

Reviewed 13 of 13 files at r2.
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


pkg/roachpb/api.proto, line 797 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Done.

Change the constant too.


pkg/roachpb/api.proto, line 749 at r2 (raw file):

  // the specified key, the intent is only considered a match if it has the same
  // ID, the same epoch, and the same provisional commit timestamp. The
  // timestamp check enforces that if a transaction was pushed before laying

Update this comment with the new rules for timestamp matches.


pkg/roachpb/errors.proto, line 384 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Done. And added the incorrect intent that was found, if any, to give the error message more color.

Why do we need to return this at all since it was a part of the request?


pkg/storage/replica.go, line 2089 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This depends on what we mean by "prevent the writing of an intent". I'm realizing (see TestQueryIntentRequest) that the read timestamp cache doesn't prevent the intent entirely, it just forces it to bump its timestamp. That's sufficient in the case of parallel commits because DistSender will check whether the transaction had its timestamp pushed by any promised write before committing the transaction.

Removed this comment.

Yes, I think we just mean "prevent writing an intent at this timestamp", so the read timestamp cache is sufficient.


Comments from Reviewable

@nvanbenschoten
Copy link
Member Author

I updated this change with the new timestamp rule. I also made two other changes:

  1. The MVCCGet call now operates at MaxTimestamp so that we ensure that we see all intents. This is similar to how MVCCGet will observe a transaction's own intent regardless of its timestamp.
  2. PREVENT is no longer supported with SNAPSHOT transactions, as pushing a snapshot transaction's timestamp forward won't force it to abort.

PTAL.


Review status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/roachpb/api.proto, line 797 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Change the constant too.

Done.


pkg/roachpb/api.proto, line 749 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Update this comment with the new rules for timestamp matches.

Done.


pkg/roachpb/errors.proto, line 384 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Why do we need to return this at all since it was a part of the request?

Done.


pkg/storage/replica.go, line 2089 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Yes, I think we just mean "prevent writing an intent at this timestamp", so the read timestamp cache is sufficient.

Done.


Comments from Reviewable

This change introduces a new request method called QueryIntent. The request
type draws parallels to the QueryTxn method, but it visits an intent instead
of a transaction record and returns whether the intent exists or not. The
request type also includes an "if missing" behavior, which allows clients
to optionally ensure that the request prevents a missing intent from ever
being written or return an error if the intent is found to be missing.

This request type was proposed/discussed in both cockroachdb#24194 and cockroachdb#16026. It is a
prerequisite for either proposal.

Release note: None
@bdarnell
Copy link
Contributor

:lgtm:


Reviewed 8 of 8 files at r3.
Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


Comments from Reviewable

@nvanbenschoten
Copy link
Member Author

TFTR.

bors r=bdarnell

craig bot pushed a commit that referenced this pull request Jun 25, 2018
26335: roachpb: introduce QueryIntent request r=bdarnell a=nvanbenschoten

This change introduces a new request method called QueryIntent. The request
type draws parallels to the QueryTxn method, but it visits an intent instead
of a transaction record and returns whether the intent exists or not. The
request type also includes an "if missing" behavior, which allows clients
to optionally ensure that the request prevents a missing intent from ever
being written or return an error if the intent is found to be missing.

This request type was proposed/discussed in both #24194 and #16026. It is a
prerequisite for either proposal.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig
Copy link
Contributor

craig bot commented Jun 25, 2018

Build succeeded

@craig craig bot merged commit b2a8f75 into cockroachdb:master Jun 25, 2018
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/checkIntent branch June 25, 2018 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants