-
Notifications
You must be signed in to change notification settings - Fork 307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix wallet sync for RpcBlockchain
#683
Conversation
5045f92
to
894e744
Compare
1116b58
to
d984ac1
Compare
RpcBlockchain
RpcBlockchain
c7fbd0d
to
1205e6a
Compare
This comment was marked as resolved.
This comment was marked as resolved.
0ed7e52
to
45caf45
Compare
We should probably check whether this fixes #598 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This review was a collaborative effort between me and @afilini, but I want to make it clear that if some comments are stupid, it's only his fault. (jk) (well, maybe not)
Concept ACK, some comments
src/blockchain/rpc.rs
Outdated
let is_derivable = db_spk_count > 2; | ||
|
||
// ensure db scripts meet start script count requirements | ||
if is_derivable && db_spks.len() < self.sync_params.start_script_count { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use db_spk_count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, maybe you should check those per KeychainKind: start_script_count = 100
imho should mean 100 cached for the external and 100 cached for the internal descriptor. Otherwise you risk having start_script_count = 100
and, instead of 50/50, having 100 cached for the external and 0 for the internal.
If that's the case, maybe we should rename count
-> index
? So you can say "index = 50" instead of count = 100
and it's clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Changed.
let ext_spks = db.iter_script_pubkeys(Some(KeychainKind::External))?;
let int_spks = db.iter_script_pubkeys(Some(KeychainKind::Internal))?;
// This is a hack to see whether atleast one of the keychains comes from a derivable
// descriptor. We assume that non-derivable descriptors always has a script count of 1.
let last_count = std::cmp::max(ext_spks.len(), int_spks.len());
let has_derivable = last_count > 1;
// If atleast one descriptor is derivable, we need to ensure scriptPubKeys are sufficiently
// cached.
if has_derivable && last_count < params.start_script_count {
let inner_err = MissingCachedScripts {
last_count,
missing_count: params.start_script_count - last_count,
};
debug!("requesting more spks with: {:?}", inner_err);
return Err(Error::MissingCachedScripts(inner_err));
}
src/blockchain/rpc.rs
Outdated
.map(|v| v.map(|i| (*keychain, i))) | ||
.transpose() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe here it's better for readability if you use match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it's a matter of taste, but I've change it :/
.filter_map(|keychain| match db.get_last_index(*keychain) {
Ok(li_opt) => li_opt.map(|li| Ok((*keychain, li))),
Err(err) => Some(Err(err)),
})
src/blockchain/rpc.rs
Outdated
let raw_tx = | ||
match &db_tx.transaction { | ||
Some(raw_tx) => raw_tx, | ||
None => { | ||
updated = true; | ||
db_tx.transaction.insert(client.get_raw_transaction( | ||
&tx_res.info.txid, | ||
tx_res.info.blockhash.as_ref(), | ||
)?) | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit unclear that in the None case, you're inserting in the same thing you're matching on (took me a second). What about:
match &mut db_tx.transaction {
Some(raw_tx) => raw_tx,
tx_opt => {
updated = true;
tx_opt.insert(client.get_raw_transaction(
&tx_res.info.txid,
tx_res.info.blockhash.as_ref(),
)?)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea. Changed.
src/blockchain/rpc.rs
Outdated
self._sent_from_raw_tx(db, db_tx.transaction.as_ref()?) | ||
.map(|sent| { | ||
if db_tx.sent != sent { | ||
Some((*txid, sent)) | ||
} else { | ||
None | ||
} | ||
}) | ||
.transpose() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, a match here might be more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
}) | ||
.transpose() | ||
}) | ||
.collect::<Result<Vec<_>, _>>()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might not need to collect into a Vec, as you're iterating them just below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH... Why do you need two loops? Can't you record the updates directly when you found them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's because we cannot have two mutable references (self
is already a mutable reference).
The solution I changed to is to clone self.txs
first. However, I am thinking now DbState::update_state
should really returns DbStateUpdates
instead of having self
mutable? But I am already spending too much time on this PR!!! haha
src/blockchain/rpc.rs
Outdated
keychain.as_byte(), | ||
index | ||
); | ||
self.updated_last_indexes.insert(keychain); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's worth it to have both a last_indexes
and a updated_last_indexes
, as it's quite complex and not a great performance improvement. You could simply have last_indexes
and at the end always rewrite the last index of each keychain in the database - without really caring if you didn't modify it. It's just two rows to update, not thousands, and I feel like it would make the code significantly easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed!
src/blockchain/rpc.rs
Outdated
self.txs | ||
.keys() | ||
.filter(|&txid| !self.retained_txs.contains(txid)) | ||
.try_for_each(|txid| batch.del_tx(txid, false).map(|_| del_txs += 1))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will work, as usually batch.del_tx
will return Some
only if you're deleting something that has just been inserted in the batch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it? You're still doing:
if batch.del_tx(txid, false)?.is_some() {
debug!("deleting tx: {}", txid);
del_txs += 1;
}
But my point is that del_tx
will delete a transaction even when returning None...
Ping @afilini for an explanation, I don't remember why is the BatchDatabase doing so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's because the batch object is basically disconnected from the actual database, it's just a collection of operations that are then applied at once at the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@afilini @danielabrozzoni Thanks for pointing this out! I'll actually fix it now
info!( | ||
"db batch updates: del_txs={}, update_txs={}, update_utxos={}", | ||
del_txs, | ||
self.updated_txs.len(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this count correct? Above, you filter_map
the updated_txs
before inserting them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only filter the deleted txs, in which case I have the del_txs
variable :)
src/blockchain/rpc.rs
Outdated
// update txs | ||
self.updated_txs | ||
.iter() | ||
.filter_map(|txid| self.txs.get(txid)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe if self.txs.get(txid)
never returns None
it's better to expect with an explanation?
....Unless there are some cases where None
might be returned! But I can't think of any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorted.
// update txs
self.updated_txs
.iter()
.inspect(|&txid| debug!("updating tx: {}", txid))
.try_for_each(|txid| batch.set_tx(self.txs.get(txid).unwrap()))?;
90d1d36
to
d851ea3
Compare
d851ea3
to
cbe2c98
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have a few very small comment, code looks very good now!
/// Calculates received amount from raw tx. | ||
fn received_from_raw_tx(db: &D, raw_tx: &Transaction) -> Result<u64, Error> { | ||
raw_tx.output.iter().try_fold(0_u64, |recv, txo| { | ||
let v = if db.is_mine(&txo.script_pubkey)? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can use self.db
and remove the argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at how received_from_raw_tx
is used, it is used while a mutable reference to a field of self
still exists.
The suggested change will result in:
error[E0502]: cannot borrow `*self` as immutable because it is also borrowed as mutable
src/blockchain/rpc.rs
Outdated
let last_count = std::cmp::max(ext_spks.len(), int_spks.len()); | ||
let has_derivable = last_count > 1; | ||
|
||
// If atleast one descriptor is derivable, we need to ensure scriptPubKeys are sufficiently |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
English is such an annoying language. Rules do not make sense. Anyway, thank you for pointing this out! haha
src/blockchain/rpc.rs
Outdated
|
||
// update tx sent fields from tx inputs | ||
self.txs | ||
.clone() // clone is required as we cannot have more than two mutable references |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: more than one (or alternatively just "two")
src/blockchain/rpc.rs
Outdated
} | ||
|
||
// update tx sent fields from tx inputs | ||
self.txs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a couple of suggestions here to improve this a bit:
- instead of cloning the whole thing you can call
values()
first, so at least you avoid cloning the keys - the check that looks for the tx in
retained_txs
could be moved into afilter()
method, which maybe makes this a bit more readable. Also this check could be moved before theclone()
to again avoid cloning things you don't need
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@afilini Thank you for the suggestions!
However, using .values().cloned()
would not work, as calling .cloned()
only clones one element at a time (and we have a mutable reference of the whole self
).
In my original commit, I had all filters in an .filter_map
and had two loop (one for marking changes and one for applying changes) so we avoid the mutable-reference problems.
I think it is best to kinda go back to my original solution, but improve commenting and split "filter logic" into multiple .filter..()
calls.
This is what I have now:
// obtain vector of `TransactionDetails::sent` changes
let sent_updates = self
.txs
.values()
// only bother to update txs that are retained
.filter(|db_tx| self.retained_txs.contains(&db_tx.txid))
// only bother to update txs where the raw tx is accessable
.filter_map(|db_tx| (db_tx.transaction.as_ref().map(|tx| (tx, db_tx.sent))))
// recalcuate sent value, only update txs in which sent value is changed
.filter_map(|(raw_tx, old_sent)| {
self.sent_from_raw_tx(raw_tx)
.map(|sent| {
if sent != old_sent {
Some((raw_tx.txid(), sent))
} else {
None
}
})
.transpose()
})
.collect::<Result<Vec<_>, _>>()?;
// record send updates
sent_updates.iter().for_each(|&(txid, sent)| {
// apply sent field changes
self.txs.entry(txid).and_modify(|db_tx| db_tx.sent = sent);
// mark tx as modified
self.updated_txs.insert(txid);
});
This was my original:
Lines 454 to 476 in 9d787bf
// update sent from tx inputs | |
let sent_updates = self | |
.txs | |
.values() | |
.filter_map(|db_tx| { | |
let txid = self.retained_txs.get(&db_tx.txid)?; | |
self._sent_from_raw_tx(db, db_tx.transaction.as_ref()?) | |
.map(|sent| { | |
if db_tx.sent != sent { | |
Some((*txid, sent)) | |
} else { | |
None | |
} | |
}) | |
.transpose() | |
}) | |
.collect::<Result<Vec<_>, _>>()?; | |
// record send updates | |
sent_updates.into_iter().for_each(|(txid, sent)| { | |
self.txs.entry(txid).and_modify(|db_tx| db_tx.sent = sent); | |
self.updated_txs.insert(txid); | |
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just had a thought. If we can guarantee that txs returned from listtransactions
RPC call are in chronological order (which I assume it would be), we can have the TransactionDetails::sent
recalculations in the same loop as the recalculations of all the other fields.
The only benefit of this is for a slight performance gain, so maybe no point to bother? (I would assume that reducing IO would contribute the most to performance)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I remember this now, I think I was the one who suggested merging the two loops 😅
In the end I like what you came up with!
Also if you could rebase to fix the conflicts in changelog.md |
cbe2c98
to
9d04879
Compare
The new implementation fixes the following: * We can track more than 100 scriptPubKeys * We can obtain more than 1000 transactions per sync * `TransactionDetails` for already-synced transactions are updated when new scriptPubKeys are introduced (fixing the missing balance/coins issue of supposedly tracked scriptPubKeys) `RpcConfig` changes: * Introduce `RpcSyncParams`. * Remove `RpcConfig::skip_blocks` (this is replaced by `RpcSyncParams::start_time`).
Before this commit, the rpc backend would not notice immature utxos (`listunspent` does not return them), making the rpc balance different to other blockchain implementations. Co-authored-by: Daniela Brozzoni <danielabrozzoni@protonmail.com>
9d04879
to
d5cdc53
Compare
These are as suggested by @danielabrozzoni and @afilini Also introduced `RpcSyncParams::force_start_time` for users who prioritise reliability above all else. Also improved logging.
d5cdc53
to
5eeba6c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 5eeba6c
74e2c47 Replace `rpc::CoreTxIter` with `list_transactions` fn. (志宇) Pull request description: ### Description This fixes a bug where `CoreTxIter` attempts to call `listtransactions` immediately after a tx result is filtered (instead of being returned), when in fact, the correct logic will be to pop another tx result. The new logic also ensures that tx results are returned in chonological order. The test `test_list_transactions` verifies this. We also now ensure that `page_size` is between the range `[0 to 1000]` otherwise an error is returned. Some needless cloning is removed from `from_config` as well as logging improvements. ### Notes to the reviewers This is an oversight by me (sorry) for PR #683 ### Checklists #### All Submissions: * [x] I've signed all my commits * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md) * [x] I ran `cargo fmt` and `cargo clippy` before committing #### Bugfixes: ~* [ ] This pull request breaks the existing API~ * [x] I've added tests to reproduce the issue which are now passing * [x] I'm linking the issue being fixed by this PR ACKs for top commit: afilini: ACK 74e2c47 Tree-SHA512: f32314a9947067673d19d95da8cde36b350c0bb0ebe0924405ad50602c14590f7ccb09a3e03cdfdd227f938dccd0f556f3a2b4dd7fdd6eba1591c0f8d3e65182
Fixes #677
Description
Unfortunately to fix all the problems, I had to do a complete re-implementation of
RpcBlockchain
.The new implementation fixes the following:
RpcConfig
changes:RpcSyncParams
.RpcConfig::skip_blocks
(this is replaced byRpcSyncParams::start_time
).Notes to the reviewers
RpcConfig
structure is changed. It will be good to confirm whether this is an okay change.Checklists
All Submissions:
cargo fmt
andcargo clippy
before committingNew Features:
* [ ] I've added tests for the new featureCHANGELOG.md
Bugfixes: