Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jsonrpc: add finality to tx requests (tx, EXPERIMENTAL_tx_status), add method broadcast_tx #9644

Merged
merged 3 commits into from
Oct 24, 2023

Conversation

telezhnaya
Copy link
Contributor

@telezhnaya telezhnaya commented Oct 5, 2023

Fixes #6837

I'll need to update the docs and jsonrpc_client (and maybe something else, I haven't already checked)

Important updates:

  1. We have a new concept for all tx-related methods - finality. finality in the response is merged to master 2 weeks ago: add status field to response in RPC methods tx, EXPERIMENTAL_tx_status, broadcast_tx_commit, broadcast_tx_async #9556. finality field in the request means "I want at least this level of confidence". So, the stricter the level, the longer the user needs to wait.
  2. I decided to use Final as a default finality value because it gives the strongest guarantee and does not change backward compatibility for any of the methods (though, the waiting time may be increased sometimes to achieve the strictest level of confidence).
  3. Methods tx, EXPERIMENTAL_tx_status have finality as an additional optional field.
  4. A new method appeared - broadcast_tx. It allows to send the tx, it also have the optional field finality.
  5. broadcast_tx_async is equal to broadcast_tx with hardcoded finality None, I'll mark it as deprecated in the doc
  6. broadcast_tx_commit is equal to broadcast_tx with hardcoded finality Final, I'll mark it as deprecated in the doc.

@telezhnaya telezhnaya force-pushed the add_tx_request_parameter branch 4 times, most recently from 3ee947f to e577372 Compare October 16, 2023 17:17
@telezhnaya telezhnaya changed the title wip jsonrpc: add finality to tx requests (tx, EXPERIMENTAL_tx_status), add method broadcast_tx Oct 16, 2023
pub enum TxExecutionStatus {
/// Transaction is waiting to be included into the block
None,
/// Transaction is included into the block. The block may be not finalised yet
Inclusion,
Included,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change might be dangerous, but we merged this enum only few weeks ago, and it's not used in production yet.
I realised that it's better if all the options would be adjectives, it's more intuitive to see "included" near "executed"

@telezhnaya telezhnaya force-pushed the add_tx_request_parameter branch 3 times, most recently from d9e4f26 to 1b113c0 Compare October 17, 2023 14:04
break Err(err);
if finality == TxExecutionStatus::None {
break Err(err);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the ability to provide finality to requests of methods tx, EXPERIMENTAL_tx_status, and Final as the default status value, the behavior changes in some cases. It may be considered as a breaking of backward compatibility.
In particular, I can imagine 2 situations:

  1. When the tx was just sent to the network, we previously showed "Unknown transaction" error next ~3-6 seconds. With this change, we will wait these ~3-6 seconds and then return the existing tx.
  2. When the tx never existed (let's say it's the tx from testnet, and the user tries to find in on mainnet), and the user provides any finality other than None (or leaves the field blanc and we take the default value), user will wait for 10 seconds and then see Timeout error. Previously, we showed "Unknown transaction" almost immediately.

My opinion here:
The first one seems more like an improvement to me. (@khorolets will be happy because I will accidentally improve the correctness of Read RPC with this change)
The second one seems rare enough. If someone has a bunch of random hashes and they need to find all the existing transactions, they may use finality None to speed up the process

@telezhnaya telezhnaya force-pushed the add_tx_request_parameter branch 2 times, most recently from 0febd67 to be36a7a Compare October 17, 2023 15:47
> {
self.broadcast_tx(RpcBroadcastTransactionRequest {
signed_transaction: request_data.signed_transaction,
finality: TxExecutionStatus::Final,
Copy link
Contributor Author

@telezhnaya telezhnaya Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I have to hardcode finality because it's the handler of the legacy method.
Digging into the code, I see that we never checked in broadcast_tx_commit if the tx was included into the canonical chain. Ofk, same applies to all the corresponding receipts.
At the same time, the doc says that broadcast_tx_commit gives strictest guarantees that we may give: https://docs.near.org/api/rpc/transactions#send-transaction-await
I think we have never seen any issues here because the situation of parallel chains is very rare, so the majority of the existing blocks were included into the canonical chain.
Moreover, if the receipts chain is long enough (3+ sequential receipts), we have a huge probability that the original transaction's block was finalized. broadcast_tx_commit waits for all the execution outcomes, so accidentally it waits for tx finalization as well.
The closest finality we defined to this behavior is Executed.

However, I think we should use Final here because broadcast_tx_commit was always promoted as the method with the strictest guarantees, so let's follow these expectations.

@telezhnaya telezhnaya marked this pull request as ready for review October 17, 2023 16:00
@telezhnaya telezhnaya requested a review from a team as a code owner October 17, 2023 16:00
@telezhnaya
Copy link
Contributor Author

telezhnaya commented Oct 17, 2023

@khorolets @firatNEAR I've added you to the reviewers because I want your opinion about the changes to the interface I've proposed.
You may totally ignore the code changes, it's enough just to read the comments I've left and argue with me :)

@nikurt sorry but you need to look at the code as well

khorolets
khorolets previously approved these changes Oct 18, 2023
Copy link
Member

@khorolets khorolets left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find anything to argue with. I've left a small comment, though.

While I agree with the changes and cleanups you did, it seems like you have refactored some of the methods changing the legacy behavior. Don't get me wrong, this is nice and all, but I think it'd be better to add all of these like EXPERIMENTAL_ while keeping the legacy, since we don't have a way to actually deprecate the stuff.

I don't insist, though.

chain/jsonrpc/src/lib.rs Outdated Show resolved Hide resolved
nikurt
nikurt previously approved these changes Oct 18, 2023
Copy link
Contributor

@nikurt nikurt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good.
Inclusion->Included is a good change.
Thank you for the comments, that makes reviewing easier :)

@@ -312,6 +314,9 @@ impl JsonRpcHandler {
Ok(match request.method.as_ref() {
// Handlers ordered alphabetically
"block" => process_method_call(request, |params| self.block(params)).await,
"broadcast_tx" => {
Copy link
Contributor

@nikurt nikurt Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word broadcast bothers me a bit.
It's not a broadcast (doesn't send a message to everyone), and it exposes an implementation detail.
In the documentation this method is described as "Send transaction", which I like more.

That's bikeshedding, and I don't suggest to rename existing methods, but maybe this new method can have a better naming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

firatNEAR
firatNEAR previously approved these changes Oct 18, 2023
Copy link
Contributor

@firatNEAR firatNEAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a minor comment about 'EXPERIMENTAL_', other than that LGTM.

@telezhnaya telezhnaya dismissed stale reviews from firatNEAR, nikurt, and khorolets via 58b96be October 18, 2023 18:48
@telezhnaya telezhnaya force-pushed the add_tx_request_parameter branch 2 times, most recently from 2a616a9 to 3396a9c Compare October 18, 2023 19:12
nikurt
nikurt previously approved these changes Oct 19, 2023
Copy link
Contributor Author

@telezhnaya telezhnaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikurt please have a look

// previous blocks may be not in the canonical chain
Ok(match self.chain.check_blocks_final_and_canonical(&headers) {
Err(_) => TxExecutionStatus::Executed,
Ok(_) => TxExecutionStatus::Final,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a bug in this code. For InclusionFinal, there was no check of block finality.
I've rewritten it fully. It's not written in a conventional Rust way, but, imho, now it's much easier to read.

@@ -33,7 +33,7 @@ pub fn start_all_with_validity_period_and_no_epoch_sync(
enable_doomslug: bool,
) -> (Addr<ViewClientActor>, tcp::ListenerAddr) {
let actor_handles = setup_no_network_with_validity_period_and_no_epoch_sync(
vec!["test1".parse().unwrap(), "test2".parse().unwrap()],
vec!["test1".parse().unwrap()],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this change, we miss every second block -> we have no finalised blocks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I'm not familiar with these tests I don't have an opinion whether 1 validator is always a good idea.
I assume it is, the fewer moving pieces, the simpler the tests.

.tx(RpcTransactionStatusRequest {
transaction_info: TransactionInfo::TransactionId {
tx_hash,
sender_account_id: signer_account_id,
Copy link
Contributor Author

@telezhnaya telezhnaya Oct 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've copy-pasted this logic from chain/jsonrpc/src/lib.rs
And added explicit naming

And with these changes, I'm curious
How will it work with meta tx?
And is sender_account_id naming correct here? (I've copy pasted this from the doc)

upd that comment should be here https://github.com/near/nearcore/pull/9644/files#diff-b0dd5d6f8f4d2a7b135860666232d02f02a60ee1dd9d7fd65ee6afe79550e98eR80
It's not about tests, it's about general implementation in file chain/jsonrpc-primitives/src/types/transactions.rs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In meta tx doc, we also have this sender/signer mess
https://nomicon.io/RuntimeSpec/Actions#delegate-actions

struct DelegateAction {
    /// Signer of the delegated actions
    pub sender_id: AccountId,
...
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code has simply sender_id, receiver_id:

pub struct DelegateAction {

Should be fine

}

#[derive(Clone, Debug, serde::Serialize, serde::Deserialize)]
pub enum SignedTransaction {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this +1 nesting enum is required to make serde's annotations magic work.
I've added functions below to avoid unpacking it each time

@@ -454,7 +454,7 @@ fn test_validators_ordered() {
.unwrap();
assert_eq!(
validators.into_iter().map(|v| v.take_account_id()).collect::<Vec<_>>(),
vec!["test1".parse().unwrap(), "test2".parse().unwrap()]
vec!["test1".parse().unwrap()]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I treat this test as "we have this method and it works at least somehow"
Previous 2 validators did not help us with checking if the order is correct (whatever it should be).

@@ -33,7 +33,7 @@ pub fn start_all_with_validity_period_and_no_epoch_sync(
enable_doomslug: bool,
) -> (Addr<ViewClientActor>, tcp::ListenerAddr) {
let actor_handles = setup_no_network_with_validity_period_and_no_epoch_sync(
vec!["test1".parse().unwrap(), "test2".parse().unwrap()],
vec!["test1".parse().unwrap()],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I'm not familiar with these tests I don't have an opinion whether 1 validator is always a good idea.
I assume it is, the fewer moving pieces, the simpler the tests.

.tx(RpcTransactionStatusRequest {
transaction_info: TransactionInfo::TransactionId {
tx_hash,
sender_account_id: signer_account_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code has simply sender_id, receiver_id:

pub struct DelegateAction {

Should be fine

@telezhnaya telezhnaya added this pull request to the merge queue Oct 24, 2023
Merged via the queue into near:master with commit fa1cfa1 Oct 24, 2023
14 checks passed
@telezhnaya telezhnaya deleted the add_tx_request_parameter branch October 24, 2023 09:17
nikurt pushed a commit that referenced this pull request Nov 2, 2023
…, add method `broadcast_tx` (#9644)

Fixes #6837

I'll need to update the [docs](https://github.com/near/docs) and
[jsonrpc_client](https://github.com/near/near-jsonrpc-client-rs) (and
maybe something else, I haven't already checked)

Important updates:
1. We have a new concept for all tx-related methods - `finality`.
`finality` in the response is merged to master 2 weeks ago:
#9556. `finality` field in the
request means "I want at least this level of confidence". So, the
stricter the level, the longer the user needs to wait.
2. I decided to use `Final` as a default `finality` value because it
gives the strongest guarantee and does not change backward compatibility
for any of the methods (though, the waiting time may be increased
sometimes to achieve the strictest level of confidence).
3. Methods `tx`, `EXPERIMENTAL_tx_status` have `finality` as an
additional optional field.
4. A new method appeared - `broadcast_tx`. It allows to send the tx, it
also have the optional field `finality`.
6. `broadcast_tx_async` is equal to `broadcast_tx` with hardcoded
`finality None`, I'll mark it as deprecated in the doc
7. `broadcast_tx_commit` is equal to `broadcast_tx` with hardcoded
`finality Final`, I'll mark it as deprecated in the doc.
marcelo-gonzalez added a commit to marcelo-gonzalez/nearcore that referenced this pull request Nov 9, 2023
near#9644 added a new RPC method
for sending transactions, and added unused "wait_until" arguments
to the existing ones. This change was actually not a no-op for the
"broadcast_tx_commit" method, since now that method will wait until
all transactions and descendant receipts are final, whereas the previous
behavior was closer to the new "INCLUDED" "wait_until" parameter, and
we used to just wait for the receipt outcomes to show up, but would return
before they're executed and finalized.

So fix it by making send_tx_and_wait() call the send_tx method to get something
closer to the old behavior.

Also, an unrelated problem with this test is that the "consensus" changes
to the config were put in the wrong place, so that measure to decrease the flakiness
of this test wasn't actually in place
github-merge-queue bot pushed a commit that referenced this pull request Nov 10, 2023
#9644 added a new RPC method for
sending transactions, and added unused "wait_until" arguments to the
existing ones. This change was actually not a no-op for the
"broadcast_tx_commit" method, since now that method will wait until all
transactions and descendant receipts are final, whereas the previous
behavior was closer to the new "INCLUDED" "wait_until" parameter, and we
used to just wait for the receipt outcomes to show up, but would return
before they're executed and finalized.

So fix it by making send_tx_and_wait() call the send_tx method to get
something closer to the old behavior.

Also, an unrelated problem with this test is that the "consensus" change
to the config were put in the wrong place, so that measure to decrease
the flakiness of this test wasn't actually in place
@bowenwang1996
Copy link
Collaborator

@telezhnaya question: from the PR description it seems that we broadcast_tx_commit now waits for a transaction is finalized before returning. I am pretty sure the behavior before was that it just wait for the transaction to finish, not necessarily finalized. What prompted the change in behavior? cc @gmilescu

@telezhnaya
Copy link
Contributor Author

@bowenwang1996 we promoted this method as the one which gives 100% guarantees that the transaction was finished [in a way how people usually understand this, not in a Near-specific terms]. In reality, the transaction could be included and executed inside non-canonical chain which would be fully ignored later. We never explained this anywhere.
After the discussion with other engineers, I decided to change this behaviour and add some more guarantees.

Do we have any issues because of this change?

@bowenwang1996
Copy link
Collaborator

@telezhnaya yes please revert this part of the change. The issue is that it will introduce a nontrivial amount of latency (3s in the good case, potentially much longer if blocks or chunks are missed), which would make a lot of UI that relies on real time confirmation (such as wallets) break. I wonder whether this is also related to the latency issue that the SRE team is investigating. cc @rtsainear

@telezhnaya
Copy link
Contributor Author

@bowenwang1996 let's leave the check if the block with the initial transaction is finalised?
I want to set the guarantee to TxExecutionStatus::Executed

pub enum TxExecutionStatus {
/// Transaction is waiting to be included into the block
None,
/// Transaction is included into the block. The block may be not finalised yet
Included,
/// Transaction is included into finalised block
IncludedFinal,
/// Transaction is included into finalised block +
/// All the transaction receipts finished their execution.
/// The corresponding blocks for each receipt may be not finalised yet
Executed,
/// Transaction is included into finalised block +
/// Execution of transaction receipts is finalised
#[default]
Final,
}

It sounds like a good balance between the speed (we no longer wait for the finalisation of blocks with all the receipts) and the guarantees (the block with the transaction is finalised, it means that sooner or later all the receipts will also be included into finalised blocks).

@bowenwang1996
Copy link
Collaborator

@telezhnaya in general I am not against discussing what the default behavior of this method should be and making changes to it. However, given the vast impact it has on millions of users that are using NEAR today, I think we should be very careful with rolling it out. As far as I know, all the major apps (HOT, Kaiching, Sweat) rely on the endpoint being fast. So I suggest we revert the change to the default behavior and start a separate discussion as to what it should be changed to and make the community aware of the change before hand, rather than doing it unilaterally.

@telezhnaya
Copy link
Contributor Author

OK, I'll return the logic of broadcast_tx_commit back as it was before the changes.
FYI, I'll do this not by reverting, but by the new commit, so that we can still have the other changes (e.g. adding new method send_tx).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RPC end point to configure broadcast transaction await
5 participants