-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: make BeginTxn optional proposal #25437
Comments
gist for a test demonstrating the issue: |
Update the proposal above. |
Note that we do not yet support continuing after an error like this, and whenever this comes up we usually decide we'd rather add an option to turn ConditionFailedError into a non-error return value instead of supporting continuing after an error.
This is a big change from the current semantics. A missing txn record is currently equivalent to an aborted one. I'll need a lot more convincing that "either case is fine". Is the transaction state machine just obsolete because it predates our current uses of the timestamp cache and abort span?
What about repeated EndTxns without BeginTxn? EndTxn doesn't consult the timestamp cache. If we can (sometimes) succeed without a BeginTxn, why would we ever send one? Is BeginTxn now simply the first HeartbeatTxn? |
Release note (sql change): introduce experimental statements `CREATE ROLE` and `DROP ROLE` to create/delete roles. Part of [role-based access control](https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20171220_sql_role_based_access_control.md). This is almost exactly the same as CREATE/DROP USER, with the change of a single field. Thus, we do the following: * introduce wrapped plans in CCL code * in CCL: check license then call into OSS code * lookup "isRole" on conflict for clearer error message * add node-creation helpers to `PlanHookState` We really need to improve CCL/OSS interaction. eg: * better OSS-build errors: "unknown statement type" sucks * make more OSS code reusable from CCL: maybe export planner and nodes
Tobi, Ben and I discussed this offline. Saying that "one can't continue with a txn after a non-retriable error" seems indeed good for now, so that takes away the pressure from doing anything here. However the proposal generally seemed like a good idea.
I think the answer is that, if there's any writes in the EndTxn batch, then there'll be a
Indeed, the |
Why would it be inconsequential? Note that EndTxn is normally sent in a batch by itself for any non-1PC transaction.
Are you sure about that? Today there's |
I'm currently getting rid of Does this answer your concern? |
Yes, I think so. |
Because it won't be committing anything that should already be committed, right? It could be pointing at intents that don't exist because they have already been resolved, but we already handle that case. |
My concern is that this opens the door for a transaction to transition from an aborted state to a committed state, and for some of its intents to be resolved while it is considered to be aborted (by a concurrent pusher) and others to be resolved while it is committed. We currently prevent this by the combination of the transaction state machine (specifically that a transaction is only allowed to transition to a committed state from a pending state) and the timestamp cache. If we relax the transaction state transition rules, we'd be relying solely on the timestamp cache (as I understand it). Is that sufficient? It might be, but I don't think I've seen a strong enough argument for that yet. |
My reading of the transaction abortion logic is that a push that doesn't find a txn record writes one in the aborted state. It then leaves this record around until the record is GCed or an This is related to this section in the parallel commits RFC. It's also very similar to what we do with the txn AbortSpan and the poison option. |
@andreimatei and I spent a while talking about this today. The conclusion we came to is that we shouldn't need the timestamp cache to store txn IDs at all in order to prevent aborted txns from becoming committed or to prevent committed txn replays from causing issues. Here are all of the states possible (I think):
*Unfortunately this isn't actually true because we allow transactions that hit WriteTooOld errors to commit without client buy-in here and here. I think this alone necessitates the timestamp cache check. @bdarnell I'm curious if you have any feelings about this optimization. In the meantime, checking the timestamp cache for EndTxn requests instead of BeginTxn requests should provide the same protection, so it's not a blocker for this change. |
What's this talk about txn IDs in the timestamp cache? The timestamp cache already stores txn IDs (so that txns can overwrite values that they read at the same timestamp).
The former case (advance timestamp in EndTransaction if no refresh spans) makes a difference in ycsb, i think (it affects blind upserts that experience contention). I think the replica-level retry on WriteTooOldError is ancient and I'm skeptical that it provides much value (without this retry we stop laying down intents for a batch after the first WriteTooOldError; with it we stop after the second). There are two ways that txn records get cleaned up by the GC operation: the GC queue (which sets TxnSpanGCThreshold) and intentResolver (which does not). The latter happens when an EndTransaction runs and can't delete the record immediately because it has external intents. I think this is generally equivalent to the "immediate" case since there is an on-disk record until it's done but it muddies the list of cases a bit. |
Sorry not txn IDs, txn keys in the write timestamp cache on
Yeah, it affects blind upserts that experience contention. Unfortunately ycsb performs UPDATEs which read first, so it doesn't take advantage of the optimisation.
Continuing to lay down intents after the first WriteTooOldError is fine. The part that makes this dangerous is c6f9b92, which allows us to commit after doing so without client buy-in.
Yes, we should still be protected by WriteTooOld errors. |
Informs cockroachdb#25437. Informs cockroachdb#32971. This is the first part of addressing cockroachdb#32971. Part two will update concurrent txns to not immediately abort missing txn records and part three will update the txn client to stop sending BeginTxn records. The change closely resembles what was laid out in the corresponding RFC sections. `HeartbeatTxn` and `EndTxn` are both updated to permit the creation of transaction records when they finds that one is missing. To prevent replays from causing issues, they *both* check the write timestamp cache when creating a transaction record. This is one area where the change diverges from the RFC. Instead of having `EndTxn` always check the write timestamp cache, it only does so if it is creating a txn record from scratch. To make this safe, `HeartbeatTxn` also checks the write timestamp cache if it is creating a txn record from scratch. This prevents a long running transaction from increasingly running the risk of being aborted by a lease transfer as its `EndTxn` continues to be delayed. Instead, a lease transfer will only abort a transaction if it comes before the transaction record creation, which should be within a second of the transaction starting. The change pulls out a new `CanCreateTxnRecord` method, which has the potential of being useful for detecting eagerly GCed transaction records and solving the major problem from the RFC without an extra QueryIntent. This is what Andrei was pushing for before. More details are included in TODOs within the PR. _### Migration Safety Most of the changes to the transaction state machine are straightforward to validate. Transaction records can still never be created without checking for replays and only an EndTxn from the client's sync path can GC an ABORTED txn record. This means that it is still impossible for a transaction to move between the two finalized statuses (at some point it would be worth it to draw this state machine). The one tricky part is ensuring that the changes in this PR are safe when run in mixed-version clusters. The two areas of concern are: 1. lease transfers between a new and an old cluster on a range that should/does contain a transaction record. 2. an old txn client that is not expecting HeartbeatTxn and EndTxn requests to create txn records if they are found to be missing. 3. an old txn client may not expect a HeartbeatTxn to hit the write timestamp cache if run concurrently with an EndTxn. The first concern is protected by the write timestamp cache. Regardless of which replica creates that transaction record from a BeginTxn req or a HeartbeatTxn req (on the new replica), it will first need to check the timestamp cache. This prevents against any kind of replay that could create a transaction record that the old replica is not prepared to handle. We can break the second concern into two parts. First, an old txn client will never send an EndTxn without having sent a successful BeginTxn, so the new behavior there is not an issue. Second, an old txn client may send a HeartbeatTxn concurrently with a BeginTxn. If the heartbeat wins, it will create a txn record on a new replica. If the BeginTxn evaluates on a new replica, it will be a no-op. If it evaluates on an old replica, it will result in a retryable error. The third concern is specific to the implementation of the heartbeat loop itself. If a HeartbeatTxn loses a race with an EndTxn, it may get an aborted error after checking the timestamp cache. This is desirable from the replica-side, but we'd need to ensure that the client will handle it correctly. This PR adds some protection (see `sendingEndTxn`) to the txn client to prevent this case from causing weirdness on the client, but I don't think this could actually cause issues even without this protection. The reason is that the txn client might mark itself as aborted due to the heartbeat, but this will be overwritten when the EndTxn returns and the sql client will still hear back from the successful EndTxn. This must have actually always been an issue because it was always possible for a committed txn record to be GCed and then written as aborted later, at which point a concurrent heartbeat could have observed it. I'd like Andrei to sign off on this last hazard, as he's thought about this kind of thing more than anybody. Release note: None
33396: storage: allow HeartbeatTxn and EndTxn requests to create txn records r=nvanbenschoten a=nvanbenschoten Informs #25437. Informs #32971. This is the first part of addressing #32971. Part two will update concurrent txns to not immediately abort missing txn records and part three will update the txn client to stop sending BeginTxn records. The change closely resembles what was laid out in the corresponding RFC sections. `HeartbeatTxn` and `EndTxn` are both updated to permit the creation of transaction records when they finds that one is missing. To prevent replays from causing issues, they *both* check the write timestamp cache when creating a transaction record. This is one area where the change diverges from the RFC. Instead of having `EndTxn` always check the write timestamp cache, it only does so if it is creating a txn record from scratch. To make this safe, `HeartbeatTxn` also checks the write timestamp cache if it is creating a txn record from scratch. This prevents a long-running transaction from increasingly running the risk of being aborted by a lease transfer as its `EndTxn` continues to be delayed. Instead, a lease transfer will only abort a transaction if it comes before the transaction record creation, which should be within a second of the transaction starting. The change pulls out a new `CanCreateTxnRecord` method, which has the potential of being useful for detecting eagerly GCed transaction records and solving the major problem from the RFC without an extra QueryIntent. This is what Andrei was pushing for before. More details are included in TODOs within the PR. ### Migration Safety Most of the changes to the transaction state machine are straightforward to validate. Transaction records can still never be created without checking for replays and only an EndTxn from the client's sync path can GC an ABORTED txn record. This means that it is still impossible for a transaction to move between the two finalized statuses (at some point it would be worth it to draw this state machine). The one tricky part is ensuring that the changes in this PR are safe when run in mixed-version clusters. The two areas of concern are: 1. lease transfers between a new and an old cluster on a range that should/does contain a transaction record. 2. an old txn client that is not expecting HeartbeatTxn and EndTxn requests to create txn records if they are found to be missing. 3. an old txn client may not expect a HeartbeatTxn to hit the write timestamp cache if run concurrently with an EndTxn. The first concern is protected by the write timestamp cache. Regardless of which replica creates that transaction record from a BeginTxn req or a HeartbeatTxn req (on the new replica), it will first need to check the timestamp cache. This prevents against any kind of replay that could create a transaction record that the old replica is not prepared to handle. We can break the second concern into two parts. First, an old txn client will never send an EndTxn without having sent a successful BeginTxn, so the new behavior there is not an issue. Second, an old txn client may send a HeartbeatTxn concurrently with a BeginTxn. If the heartbeat wins, it will create a txn record on a new replica. If the BeginTxn evaluates on a new replica, it will be a no-op. If it evaluates on an old replica, it will result in a retryable error. The third concern is specific to the implementation of the heartbeat loop itself. If a HeartbeatTxn loses a race with an EndTxn, it may get an aborted error after checking the timestamp cache. This is desirable from the replica-side, but we'd need to ensure that the client will handle it correctly. This PR adds some protection (see `sendingEndTxn`) to the txn client to prevent this case from causing weirdness on the client, but I don't think this could actually cause issues even without this protection. The reason is that the txn client might mark itself as aborted due to the heartbeat, but this will be overwritten when the EndTxn returns and the sql client will still hear back from the successful EndTxn. This must have actually always been an issue because it was always possible for a committed txn record to be GCed and then written as aborted later, at which point a concurrent heartbeat could have observed it. I'd like Andrei to sign off on this last hazard, as he's thought about this kind of thing more than anybody. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Informs cockroachdb#25437. Informs cockroachdb#32971. This is the second part of addressing cockroachdb#32971. The final part will be updating the txn client to stop sending BeginTxn requests. The change closely resembles what is laid out in the corresponding RFC sections (`PushTxn` and `QueryTxn / txnwait.Queue`). Both `PushTxn` requests and `QueryTxn` requests are adjusted to tolerate missing transaction records and to synthesize them based on the information pulled from intents if necessary. By hiding these details behind the request API abstraction, we don't actually need to modify the `txnwait.Queue` at all! The change does diverge from the RFC in two distinct ways. The first is that it introduces a new invariant about transaction record creation. Transaction can now only ever be created by requests originating from their own coordinator (`BeginTxn`, `HeartbeatTxn`, and `EndTxn`). They can never be created in any state (not even ABORTED) by concurrent actors. This is actually a much stronger invariant than what previously existed (and even what was proposed in the RFC), and it simplifies a lot of logic by dramatically refining the transaction state machine. We now know that for a transaction record to exist, it must have been created by its own coordinator and it must have checked with `CanCreateTxnRecord`. Making this new invariant work required the second major divergence from the RFC. The RFC suggested that we use the read timestamp cache for successful transaction timestamp pushes before a transaction record was written. This PR takes this a step further and uses the write timestamp cache for successful transaction aborts before a transaction record was written. In doing this, we emerge with a very sane timestamp cache policy - the read timestamp cache is used to push transaction timestamps and the write timestamp cache is used to abort them entirely (which it was already essentially being used for). The txnID marker on these timestamp cache entries then becomes the transaction that initiated the push/abort. Transactions then consult both of these sources before creating their transaction record in `CanCreateTxnRecord`. In doing this, we avoid an entire class of situations where concurrent transactions created and abandoned transaction records for another transaction, requiring eventual GC. Release note: None
Informs cockroachdb#25437. Informs cockroachdb#32971. This is the second part of addressing cockroachdb#32971. The final part will be updating the txn client to stop sending BeginTxn requests. The change closely resembles what is laid out in the corresponding RFC sections (`PushTxn` and `QueryTxn / txnwait.Queue`). Both `PushTxn` requests and `QueryTxn` requests are adjusted to tolerate missing transaction records and to synthesize them based on the information pulled from intents if necessary. By hiding these details behind the request API abstraction, we don't actually need to modify the `txnwait.Queue` at all! The change does diverge from the RFC in two distinct ways. The first is that it introduces a new invariant about transaction record creation. Transaction can now only ever be created by requests originating from their own coordinator (`BeginTxn`, `HeartbeatTxn`, and `EndTxn`). They can never be created in any state (not even ABORTED) by concurrent actors. This is actually a much stronger invariant than what previously existed (and even what was proposed in the RFC), and it simplifies a lot of logic by dramatically refining the transaction state machine. We now know that for a transaction record to exist, it must have been created by its own coordinator and it must have checked with `CanCreateTxnRecord`. Making this new invariant work required the second major divergence from the RFC. The RFC suggested that we use the read timestamp cache for successful transaction timestamp pushes before a transaction record was written. This PR takes this a step further and uses the write timestamp cache for successful transaction aborts before a transaction record was written. In doing this, we emerge with a very sane timestamp cache policy - the read timestamp cache is used to push transaction timestamps and the write timestamp cache is used to abort them entirely (which it was already essentially being used for). The txnID marker on these timestamp cache entries then becomes the transaction that initiated the push/abort. Transactions then consult both of these sources before creating their transaction record in `CanCreateTxnRecord`. In doing this, we avoid an entire class of situations where concurrent transactions created and abandoned transaction records for another transaction, requiring eventual GC. Release note: None
Based on cockroachdb#33523. Closes cockroachdb#25437. This is the final part of addressing cockroachdb#32971. The change gates the use of BeginTransaction on a cluster version. When a cluster's version is sufficiently high, it will stop sending them. To make this work, a small change was needed for the detection of 1PC transactions. We now use sequence numbers to determine that a batch includes all of the writes in a transaction. This is in line with the proposal from the RFC's `1PC Transactions` section and will work correctly with parallel commits. Release note (performance improvement): Reduce the network and storage overhead of multi-Range transactions.
Informs cockroachdb#25437. Informs cockroachdb#32971. This is the second part of addressing cockroachdb#32971. The final part will be updating the txn client to stop sending BeginTxn requests. The change closely resembles what is laid out in the corresponding RFC sections (`PushTxn` and `QueryTxn / txnwait.Queue`). Both `PushTxn` requests and `QueryTxn` requests are adjusted to tolerate missing transaction records and to synthesize them based on the information pulled from intents if necessary. By hiding these details behind the request API abstraction, we don't actually need to modify the `txnwait.Queue` at all! The change does diverge from the RFC in two distinct ways. The first is that it introduces a new invariant about transaction record creation. Transaction can now only ever be created by requests originating from their own coordinator (`BeginTxn`, `HeartbeatTxn`, and `EndTxn`). They can never be created in any state (not even ABORTED) by concurrent actors. This is actually a much stronger invariant than what previously existed (and even what was proposed in the RFC), and it simplifies a lot of logic by dramatically refining the transaction state machine. We now know that for a transaction record to exist, it must have been created by its own coordinator and it must have checked with `CanCreateTxnRecord`. Making this new invariant work required the second major divergence from the RFC. The RFC suggested that we use the read timestamp cache for successful transaction timestamp pushes before a transaction record was written. This PR takes this a step further and uses the write timestamp cache for successful transaction aborts before a transaction record was written. In doing this, we emerge with a very sane timestamp cache policy - the read timestamp cache is used to push transaction timestamps and the write timestamp cache is used to abort them entirely (which it was already essentially being used for). The txnID marker on these timestamp cache entries then becomes the transaction that initiated the push/abort. Transactions then consult both of these sources before creating their transaction record in `CanCreateTxnRecord`. In doing this, we avoid an entire class of situations where concurrent transactions created and abandoned transaction records for another transaction, requiring eventual GC. Release note: None
Informs cockroachdb#25437. Informs cockroachdb#32971. This is the second part of addressing cockroachdb#32971. The final part will be updating the txn client to stop sending BeginTxn requests. The change closely resembles what is laid out in the corresponding RFC sections (`PushTxn` and `QueryTxn / txnwait.Queue`). Both `PushTxn` requests and `QueryTxn` requests are adjusted to tolerate missing transaction records and to synthesize them based on the information pulled from intents if necessary. By hiding these details behind the request API abstraction, we don't actually need to modify the `txnwait.Queue` at all! The change does diverge from the RFC in two distinct ways. The first is that it introduces a new invariant about transaction record creation. Transaction can now only ever be created by requests originating from their own coordinator (`BeginTxn`, `HeartbeatTxn`, and `EndTxn`). They can never be created in any state (not even ABORTED) by concurrent actors. This is actually a much stronger invariant than what previously existed (and even what was proposed in the RFC), and it simplifies a lot of logic by dramatically refining the transaction state machine. We now know that for a transaction record to exist, it must have been created by its own coordinator and it must have checked with `CanCreateTxnRecord`. Making this new invariant work required the second major divergence from the RFC. The RFC suggested that we use the read timestamp cache for successful transaction timestamp pushes before a transaction record was written. This PR takes this a step further and uses the write timestamp cache for successful transaction aborts before a transaction record was written. In doing this, we emerge with a very sane timestamp cache policy - the read timestamp cache is used to push transaction timestamps and the write timestamp cache is used to abort them entirely (which it was already essentially being used for). The txnID marker on these timestamp cache entries then becomes the transaction that initiated the push/abort. Transactions then consult both of these sources before creating their transaction record in `CanCreateTxnRecord`. In doing this, we avoid an entire class of situations where concurrent transactions created and abandoned transaction records for another transaction, requiring eventual GC. Release note: None
33523: storage: tolerate missing transaction records when pushing r=nvanbenschoten a=nvanbenschoten Informs #25437. Informs #32971. This is the second part of addressing #32971. The final part will be updating the txn client to stop sending BeginTxn requests. The change closely resembles what is laid out in the corresponding RFC sections (`PushTxn` and `QueryTxn / txnwait.Queue`). Both `PushTxn` requests and `QueryTxn` requests are adjusted to tolerate missing transaction records and to synthesize them based on the information pulled from intents if necessary. By hiding these details behind the request API abstraction, we don't actually need to modify the `txnwait.Queue` at all! The change does diverge from the RFC in two distinct ways. The first is that it introduces a new invariant about transaction record creation. Transaction records can now only ever be created by requests originating from their own coordinator (`BeginTxn`, `HeartbeatTxn`, and `EndTxn`). They can never be created in any state (not even ABORTED) by concurrent actors. This is a much stronger invariant than what previously existed (and even what was proposed in the RFC), and it simplifies a lot of logic by dramatically refining the transaction state machine. We now know that for a transaction record to exist, it must have been created by its own coordinator and it must have checked with `CanCreateTxnRecord`. Making this new invariant work required the second major divergence from the RFC. The RFC suggested that we use the read timestamp cache for successful transaction timestamp pushes before a transaction record is written. This PR takes this a step further and uses the write timestamp cache for successful transaction aborts before a transaction record is written. In doing this, we emerge with a very sane timestamp cache policy - the read timestamp cache is used to push transaction timestamps and the write timestamp cache is used to abort them entirely (which it was already essentially being used for). The txnID marker on these timestamp cache entries becomes the transaction that initiated the push/abort. Transactions then consult both of these sources before creating their transaction record in `CanCreateTxnRecord`. In doing this, we avoid an entire class of situations where concurrent transactions created and abandoned transaction records for another transaction, requiring eventual GC. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Based on cockroachdb#33523. Closes cockroachdb#25437. This is the final part of addressing cockroachdb#32971. The change gates the use of BeginTransaction on a cluster version. When a cluster's version is sufficiently high, it will stop sending them. To make this work, a small change was needed for the detection of 1PC transactions. We now use sequence numbers to determine that a batch includes all of the writes in a transaction. This is in line with the proposal from the RFC's `1PC Transactions` section and will work correctly with parallel commits. Release note (performance improvement): Reduce the network and storage overhead of multi-Range transactions.
Based on cockroachdb#33523. Closes cockroachdb#25437. This is the final part of addressing cockroachdb#32971. The change gates the use of BeginTransaction on a cluster version. When a cluster's version is sufficiently high, it will stop sending them. To make this work, a small change was needed for the detection of 1PC transactions. We now use sequence numbers to determine that a batch includes all of the writes in a transaction. This is in line with the proposal from the RFC's `1PC Transactions` section and will work correctly with parallel commits. Release note (performance improvement): Reduce the network and storage overhead of multi-Range transactions.
33566: kv: stop using BeginTransaction requests r=nvanbenschoten a=nvanbenschoten Based on #33523. Closes #25437. This is the final part of addressing #32971. The change gates the use of BeginTransaction on a cluster version. When a cluster's version is sufficiently high, it will stop sending them. To make this work, a small change was needed for the detection of 1PC transactions. We now use sequence numbers to determine that a batch includes all of the writes in a transaction. This is in line with the proposal from the RFC's `1PC Transactions` section and will work correctly with parallel commits. Release note (performance improvement): Reduce the network and storage overhead of multi-Range transactions. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
If a transaction sends a batch that touches multiple ranges, and if the sub-batch for the anchor range succeeds, but another one fails, we are now in a situation where the txn record has been written, but the
client.Txn
is not aware of that fact so it will attempt to send anotherBeginTxn
with the next batch.So, if the client choses to ignore the error (say, it's an expected
ConditionFailedError
on anInitPut
), and send another batch, that batch will have anBeginTxn
which fails withRETRY_POSSIBLE_REPLAY
because the txn record already exists. If you're inside adb.Txn
function, for example, you'll retry endlessly.On the other hand, if we don't send the
BeginTxn
again then, in case it does exist, it would fail (because of its replay protection via the timestamp cache).Discussing this with @tschottdorf, this is the proposal. We're moving towards a world with best-effort
BeginTxn
and idempotentEndTxn
.client.Txn
never send more than oneBeginTxn
. Once aBeginTxn
has been sent, regardless of whether it succeeded or not (or if it's unclear whether it did), there's no secondBeginTxn
. This way there's no failure when continuing to write in a transaction whoseBeginTxn
batch has previously failed. (*)EndTxn
succeed if it doesn't find a transaction record. In this case, theEndTxn
will simply write the transaction record and declare success. The txn record might not exist because it was never written, or it might not exist because this is a replay and it had already been cleaned up. Either case is fine.TxnCoordSender
never sends more than oneEndTxn
. This ensures that a commit doesn't race with a rollback such that the rollback succeeds and then the commit also succeeds.EndTxn
will continue to populate the timestamp cache, as it currently does, andBeginTxn
will continue to consult it. This way, we continue to be protected against a replayedBeginTxn
recreating a phantom txn record after the correspondingEndTxn
.(*) This also simplifies the logic in
client.Txn
a bunch, which currently uses a combination ofTxnProto.Writing
andwritingTxnRecord
to figure out whether aBeginTxn/EndTxn
needs to be sent. Beside being complex, this logic is also broken because it's not in sync with the logic used by theTxnCoordSender
when starting the hearbeat loop. It's currently possible for theTCS
to think that there might be intents laid down and so there's a heartbeat loop, but for theclient.Txn
to think that the transaction is read-only and so anEndTxn
does not need to be sent (in which case nobody stops the heartbeat loop). Attempting to fix these things led to finding this current bug.cc @tschottdorf @nvanbenschoten @bdarnell
The text was updated successfully, but these errors were encountered: