-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds --forceupdate to teosd #216
Conversation
I've tested this manually and it seems to run just fine. In order to test this you'd need a tower that has a last known block at height I may add some E2E testing for this if I manage to create a test where full blocks are created, so eviction can be triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused about what the chain_monitor
would do given it's being passed an old/pruned tip. Is poll_best_tip()
gonna just skip the pruned blocks between tip
and the best tip?
teos/src/main.rs
Outdated
.validate(block_hash) | ||
.unwrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ummm, doesn't this fail as well when we are out of blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it does not because the headers chain is preserved even if the blocks are pruned.
teos/src/main.rs
Outdated
bitcoin_cli | ||
.deref() | ||
.get_header(&hash, Some(height as u32)) | ||
.await | ||
.unwrap() | ||
.validate(hash) | ||
.unwrap(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we move this into a variable and call it tip
.
teos/src/main.rs
Outdated
.await | ||
{ | ||
Ok(blocks) => blocks, | ||
Err(_) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we go into this branch we would return some last_n_blocks whose tip isn't the same as the tip
variable we are using in the initialization of the tower components (watcher, etc...).
teos/src/main.rs
Outdated
let hash = rpc | ||
.get_block_hash(height + IRREVOCABLY_RESOLVED as u64) | ||
.unwrap(); | ||
if let Ok(blocks) = get_last_n_blocks( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These n
blocks won't be delivered to the tower components, right?
Yeah, you're right. I've come up with a better solution than this, will force push asap |
This should make more sense now |
teos/src/main.rs
Outdated
// If we are running in pruned mode some data may be missing (if we happen to have been offline for a wile) | ||
if let Some(prune_height) = rpc.get_blockchain_info().unwrap().prune_height { | ||
if last_known_header.height < prune_height as u32 { | ||
log::debug!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be logged as a warning.
teos/src/main.rs
Outdated
|
||
// If we are running in pruned mode some data may be missing (if we happen to have been offline for a wile) | ||
if let Some(prune_height) = rpc.get_blockchain_info().unwrap().prune_height { | ||
if last_known_header.height < prune_height as u32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would still fail if some of the 100 blocks that proceeds the last known one are pruned.
The check should be: last_known_header.height - 100 < prune_height as u32
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<=
, but you're right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, NVM, I was thinking about the positive case (>=
)
I fixed suggestions and updated some of the logs since I didn't like them much. I've also updated the PR description with guidelines for how to test this in regtest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, looks good otherwise.
Didn't manually test it yet though.
Nits should be fixed. I'd be nice to get a tACK |
Yeah will work on that tonight. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tACK
Some comments (again :p) inline, looks good otherwise.
teos/src/config.rs
Outdated
@@ -102,6 +102,11 @@ pub struct Opt { | |||
#[structopt(long)] | |||
pub tor_support: bool, | |||
|
|||
/// Forces the tower to run even if the underlying chain has gone too far out of sync. This can only happen | |||
// if the node is being run in pruned mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
teos/src/main.rs
Outdated
if let Some(prune_height) = rpc.get_blockchain_info().unwrap().prune_height { | ||
if last_known_header.height - IRREVOCABLY_RESOLVED < prune_height as u32 { | ||
log::warn!( | ||
"Cannot load blocks in the range {}-{}. Chain has gone to far out of sync", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "totoo far our of sync"
teos/src/main.rs
Outdated
|
||
// If we are running in pruned mode some data may be missing (if we happen to have been offline for a while) | ||
if let Some(prune_height) = rpc.get_blockchain_info().unwrap().prune_height { | ||
if last_known_header.height - IRREVOCABLY_RESOLVED < prune_height as u32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has an off-by-one error 😂, should be last_known_header.height - IRREVOCABLY_RESOLVED + 1 < prune_height as u32
.
Let's say we last witnessed block 3000, then we will want to load blocks 2901-3000 (100 blocks). prune_height
indicates the lowest block we have stored, so if prune_height
is 2901 we should be fine, but last_known_header.height - IRREVOCABLY_RESOLVED
will be 2900, which would make the check fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right
teos/src/main.rs
Outdated
if last_known_header.height - IRREVOCABLY_RESOLVED < prune_height as u32 { | ||
log::warn!( | ||
"Cannot load blocks in the range {}-{}. Chain has gone to far out of sync", | ||
last_known_header.height - IRREVOCABLY_RESOLVED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be last_known_header.height - IRREVOCABLY_RESOLVED + 1
.
Addressed comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tACK c7f9f0a
If a tower hasn't been running for a long time and the backend runs in pruned mode it could be the case that by the time the tower comes back online, the Last known block by the tower is not being known by the node anymore. In this situation, the tower cannot bootstrap normally, given the cache cannot be populated. This commits adds a new argument to teosd (`--forceupdate`) that can be used to force a tower to update its last known block to the earliest known block by the backend under this situation. Notice that doing so may make the tower miss some of its state transitions (the ones triggered by missed blocks), so this must be done as a last resource.
If a tower hasn't been running for a long time and the backend runs in pruned mode it could be the case that by the time the tower comes back online, the Last known block by the tower is not being known by the node anymore. In this situation, the tower cannot bootstrap normally, given the cache cannot be populated.
This commits adds a new argument to teosd (
--forceupdate
) that can be used to force a tower to update its last known block to the earliest known block by the backend under this situation. Notice that doing so may make the tower miss some of its state transitions (the ones triggered by missed blocks), so this must be done as a last resource.Fixes #209
Supersedes #212
Testing
This can be tested easily in regtest by running
bitcoind
with the (hidden, undocumented)-fastprune
modifier. That would make blocks smaller and easier to prune. After that, you can prune the chain usingbitcoin-cli pruneblockchain x
wherex
is your target prune height. The pruning is not likely to happen to the exact block you specified, given block files are pruned as a whole, so the chain would be pruned to the closest value that allows a bock file to be deleted.So the flow of testing should be:
teosd
and mine a few blocks so there is alast_known_block
known by the towerbitcoin-cli pruneblockchain x
and checked the returnedy
valuey - IRREVOCABLY_RESOLVED < last_known_block_height
, continue, otherwise go back to3.
teosd
again and test it with and without--forceupdate