Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: OversizedVectorAllocation during initial synch (0.9.9) #806

Closed
amosgroth opened this issue Nov 19, 2022 · 26 comments
Closed

Bug: OversizedVectorAllocation during initial synch (0.9.9) #806

amosgroth opened this issue Nov 19, 2022 · 26 comments
Labels
bug Something isn't working

Comments

@amosgroth
Copy link

Describe the bug
Fresh Umbrel installation, fully synched Bitcoin Core, freshly installed Electrum fails and stops to synch:

Attaching to electrs_app_1, electrs_app_proxy_1, electrs_electrs_1, electrs_tor_1
electrs_1 | [2022-11-19T11:25:17.079Z INFO electrs::index] indexing 2000 blocks: [548741..550740]
electrs_1 | thread 'p2p_loop' panicked at 'invalid message: OversizedVectorAllocation { requested: 14330322073680149042, max: 4000000 }', src/p2p.rs:250:70
electrs_1 | note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
electrs_1 | [2022-11-19T11:25:17.578Z INFO electrs::db] closing DB at /data/db/bitcoin
electrs_1 | Error: electrs failed
electrs_1 |
electrs_1 | Caused by:
electrs_1 | 0: sync failed
electrs_1 | 1: failed to get block 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5
electrs_1 | 2: receiving on an empty and disconnected channel

I have tried this for the 3rd time now (un- and reinstalled Electrum, deleted all files in between, restarted synching). It seems it's always giving me the same error and it always stops at the same block height or same synching percentage (72%).

Electrs version
0.9.9

To Reproduce
Steps to reproduce the behavior:

  1. Configure and start Umbrel with BTC node, install Electrum after BTC is fully synched
  2. Install Electrum App and let it synch initially
  3. Wait
  4. See error

Expected behavior
Electrum Server synchs to 100% and doesn't stop in between, not being able to finish.

Configuration

  • Umbrel 0.5.3 on RasPi
  • Bitcoin Core 23.0
  • Electrum 0.9.9
@amosgroth amosgroth added the bug Something isn't working label Nov 19, 2022
@amosgroth amosgroth changed the title Bug: OversizedVectorAllocation Bug: OversizedVectorAllocation during initial synch (0.9.9) Nov 19, 2022
@romanz
Copy link
Owner

romanz commented Nov 19, 2022

It seems that electrs fails to parse 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5 block returned from bitcoind :(

Could you please try to retrieve it using:

$ bitcoin-cli getblock 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5 0 > block.hex

and attach the resulting block.hex file?

@amosgroth
Copy link
Author

Hi Roman,

Thx for your support. Pls find the file attached...
block.hex.zip

@amosgroth
Copy link
Author

Would there be the possibility to redownload and integrate the corrupted block with BTC Core in order to fix my problem with Electrum? (to prevent synching the whole blockchain from scratch)

@romanz
Copy link
Owner

romanz commented Nov 20, 2022

It seems that your node doesn't have the block available (the content of the block.hex.zip above):

error code: -1
error message:
Block not found on disk

@romanz
Copy link
Owner

romanz commented Nov 20, 2022

BTW, are the any errors in bitcoind logs?

@amosgroth
Copy link
Author

amosgroth commented Nov 20, 2022

It seems that your node doesn't have the block available (the content of the block.hex.zip above):

error code: -1
error message:
Block not found on disk

That's strange. My node says it's 100% synched and didn't complain while synching. I can also connect wallets through BTC Core. The log seems to be fine, only errors regarding peer connections, nothing said about block errors or equivalent.

One additional info that might be relevant for this is that I had a backup copy from the Bitcoin data directories (blocks/chainstate/indexes) from a few months ago - I used that to save time and bandwidth for synching...

@amosgroth
Copy link
Author

amosgroth commented Nov 20, 2022

BTW, are the any errors in bitcoind logs?

I searched the whole log file at ~/umbrel/app-data/bitcoin/data/bitcoin/debug.log:

  • block number 548750 is not mentioned
  • 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5 is not mentioned

Only mentioned by Electrum...

@romanz
Copy link
Owner

romanz commented Nov 20, 2022

Could you please look for ERROR: in the logs?

//// in src/rpc/blockchain.cpp
static CBlock GetBlockChecked(BlockManager& blockman, const CBlockIndex* pblockindex) EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
{
    AssertLockHeld(::cs_main);
    CBlock block;
    if (blockman.IsBlockPruned(pblockindex)) {
        throw JSONRPCError(RPC_MISC_ERROR, "Block not available (pruned data)");
    }

    if (!ReadBlockFromDisk(block, pblockindex, Params().GetConsensus())) {
        // Block not found on disk. This could be because we have the block
        // header in our index but not yet have the block or did not accept the
        // block.
        throw JSONRPCError(RPC_MISC_ERROR, "Block not found on disk");
    }

    return block;
}

//// in src/node/blockstorage.cpp
bool ReadBlockFromDisk(CBlock& block, const FlatFilePos& pos, const Consensus::Params& consensusParams)
{
    block.SetNull();

    // Open history file to read
    CAutoFile filein(OpenBlockFile(pos, true), SER_DISK, CLIENT_VERSION);
    if (filein.IsNull()) {
        return error("ReadBlockFromDisk: OpenBlockFile failed for %s", pos.ToString());
    }

    // Read block
    try {
        filein >> block;
    } catch (const std::exception& e) {
        return error("%s: Deserialize or I/O error - %s at %s", __func__, e.what(), pos.ToString());
    }

    // Check the header
    if (!CheckProofOfWork(block.GetHash(), block.nBits, consensusParams)) {
        return error("ReadBlockFromDisk: Errors in block header at %s", pos.ToString());
    }

    // Signet only: check block solution
    if (consensusParams.signet_blocks && !CheckSignetBlockSolution(block, consensusParams)) {
        return error("ReadBlockFromDisk: Errors in block solution at %s", pos.ToString());
    }

    return true;
}

//// in src/util/system.h
template<typename... Args>
bool error(const char* fmt, const Args&... args)
{
    LogPrintf("ERROR: %s\n", tfm::format(fmt, args...));
    return false;
}

@romanz
Copy link
Owner

romanz commented Nov 20, 2022

Would there be the possibility to redownload and integrate the corrupted block with BTC Core in order to fix my problem with Electrum? (to prevent synching the whole blockchain from scratch)

Maybe you can try:

getblockfrompeer "blockhash" peer_id

Attempt to fetch block from a given peer.

We must have the header for this block, e.g. using submitheader.
Subsequent calls for the same block and a new peer will cause the response from the previous peer to be ignored.

Returns an empty JSON object if the request was successfully scheduled.

Arguments:
1. blockhash    (string, required) The block hash to try to fetch
2. peer_id      (numeric, required) The peer to fetch it from (see getpeerinfo for peer IDs)

Result:
{}    (empty JSON object)

Examples:
> bitcoin-cli getblockfrompeer "00000000c937983704a73af28acdec37b049d214adbda81d7e2a3dd146f6ed09" 0

@amosgroth
Copy link
Author

amosgroth commented Nov 20, 2022

Downloaded the whole log file, searched for string "error:". First error I found was this:

...
2022-11-18T12:21:57Z tor: Thread interrupt
2022-11-18T12:21:57Z addcon thread exit
2022-11-18T12:21:57Z torcontrol thread exit
2022-11-18T12:21:57Z Shutdown: In progress...
2022-11-18T12:21:57Z net thread exit
2022-11-18T12:21:57Z msghand thread exit
2022-11-18T12:21:58Z ERROR: Error while reading proxy response
2022-11-18T12:21:58Z opencon thread exit
2022-11-18T12:21:58Z DumpAnchors: Flush 2 outbound block-relay-only peer addresses to anchors.dat started
2022-11-18T12:21:58Z DumpAnchors: Flush 2 outbound block-relay-only peer addresses to anchors.dat completed (0.01s)
2022-11-18T12:21:58Z scheduler thread exit
2022-11-18T12:21:59Z Writing 0 unbroadcast transactions to disk.
2022-11-18T12:21:59Z Dumped mempool: 0.01994s to copy, 1.23883s to dump
2022-11-18T12:22:02Z Shutdown: done
...

I would guess this is irrelevant and has to do with the shutdown of the server because I found this similar error 3 times in the logs and it got thrown every time the node was shutting down.

More interesting, I guess:

...
2022-11-19T20:35:54Z New outbound peer connected: version: 70016, blocks=763900, peer=10 (outbound-full-relay)
2022-11-19T20:36:03Z UpdateTip: new best=00000000000000000004a22df70b00432b727159a79e6cfdee240241e5705774 height=763851 version=0x2608e000 log2_work=93.852113 tx=781602834 date='2022-11-19T14:28:21Z' progress=0.999918 cache=28.7MiB(216571txo)
2022-11-19T20:36:56Z UpdateTip: new best=000000000000000000008f61f85025a0281e71851e2eb7045bbae1496b71c454 height=763852 version=0x29e5c000 log2_work=93.852126 tx=781605824 date='2022-11-19T14:39:23Z' progress=0.999921 cache=37.7MiB(273802txo)
2022-11-19T20:37:13Z UpdateTip: new best=000000000000000000031d01bdddd233879dcd7ee789c54b51426d515204b324 height=763853 version=0x20600000 log2_work=93.852139 tx=781607574 date='2022-11-19T14:39:43Z' progress=0.999921 cache=40.8MiB(299195txo)
2022-11-19T20:37:20Z ERROR: ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: iostream error at FlatFilePos(nFile=1421, nPos=50038035)
2022-11-19T20:37:24Z Socks5() connect to 143.110.240.88:8446 failed: general failure
2022-11-19T20:37:31Z UpdateTip: new best=00000000000000000000bce516dfc3213e00e3c0142a791003d599e2e63110a4 height=763854 version=0x20000000 log2_work=93.852151 tx=781609976 date='2022-11-19T14:52:02Z' progress=0.999923 cache=43.9MiB(324437txo)
...

This error was thrown once and after that the log is free from errors...

@amosgroth
Copy link
Author

Would there be the possibility to redownload and integrate the corrupted block with BTC Core in order to fix my problem with Electrum? (to prevent synching the whole blockchain from scratch)

Maybe you can try:

getblockfrompeer "blockhash" peer_id

Attempt to fetch block from a given peer.

We must have the header for this block, e.g. using submitheader.
Subsequent calls for the same block and a new peer will cause the response from the previous peer to be ignored.

Returns an empty JSON object if the request was successfully scheduled.

Arguments:
1. blockhash    (string, required) The block hash to try to fetch
2. peer_id      (numeric, required) The peer to fetch it from (see getpeerinfo for peer IDs)

Result:
{}    (empty JSON object)

Examples:
> bitcoin-cli getblockfrompeer "00000000c937983704a73af28acdec37b049d214adbda81d7e2a3dd146f6ed09" 0

No luck:

umbrel@umbrel:~ $ ./umbrel/scripts/app compose bitcoin exec bitcoind bitcoin-cli getblockfrompeer 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5 84
error code: -1
error message:
Block already downloaded

@romanz
Copy link
Owner

romanz commented Nov 20, 2022

2022-11-19T20:37:20Z ERROR: ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: iostream error at FlatFilePos(nFile=1421, nPos=50038035)

is indeed the relevant issue - looks like the block is corrupted on disk :(

@amosgroth
Copy link
Author

2022-11-19T20:37:20Z ERROR: ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: iostream error at FlatFilePos(nFile=1421, nPos=50038035)

is indeed the relevant issue - looks like the block is corrupted on disk :(

Oh, boy. :(

A few questions regarding that:

  • If BTC Core knows that it has downloaded that block already and knows where to look for it, it's not possible to "force-replace" that one? Either manually or programmatically?
  • Why does my node tell me everything is fine and is 100% synched, isn't that a false behavior of BTC Core since I have at least one corrupted block? Shouldn't BTC Core have stopped synching after throwing this error, as Electrum does?
  • Why is Electrum so strict comparing to BTC Core? Would there be a workaround for Electrs to continue synching? (I ask because I have funds to recover and feeling a little bit impatient/insecure with waiting, also I am on mobile Internet - that's why I refuse to resynch BTC entirely again...)

@romanz
Copy link
Owner

romanz commented Nov 20, 2022

  • If BTC Core knows that it has downloaded that block already and knows where to look for it, it's not possible to "force-replace" that one? Either manually or programmatically?

Not sure currently it's possible to force re-downloading a specific block :(

  • Why does my node tell me everything is fine and is 100% synched, isn't that a false behavior of BTC Core since I have at least one corrupted block? Shouldn't BTC Core have stopped synching after throwing this error, as Electrum does?

If the corruption happened after bitcoind sync was over, it's hard to detect (since most of the queries do not involve historical blocks) except of re-reading the blocks (as we did above using getblock RPC).

  • Why is Electrum so strict comparing to BTC Core?

electrs needs access to all historical blocks to build its index.

Would there be a workaround for Electrs to continue synching? (I ask because I have funds to recover and feeling a little bit impatient/insecure with waiting, also I am on mobile Internet - that's why I refuse to resynch BTC entirely again...)

Maybe you could try https://github.com/bwt-dev/bwt? CC: @shesek

@Kixunil
Copy link
Contributor

Kixunil commented Nov 20, 2022

I don't think BWT will help here since it will need to rescan whole chain and will hit the same bug. The only solution here is to run Core with -reindex and wait. Unfortunately, RPis are slow so you'll need to wait 2-3 days.

@romanz
Copy link
Owner

romanz commented Nov 20, 2022

I hope it can work if we rescan only the last N blocks - see https://github.com/bwt-dev/bwt#pruning

@amosgroth
Copy link
Author

amosgroth commented Nov 20, 2022

I don't think BTW will help here since it will need to rescan whole chain and will hit the same bug. The only solution here is to run Core with -reindex and wait. Unfortunately, RPis are slow so you'll need to wait 2-3 days.

Yeah, wouldn't help me. If it was only for that I would just use a public Electrum server and Sparrow. But the app I had installed on Umbrel and not have the recovery words from (ItchySats) derives it's wallet from the Umbrel seed/wallet (which I have and recovered so far with BTC Core and LND). Since ItchySats needs Electrum Server, I am kind of in a trap.

I know I am stupid, so telling me so wouldn't help anyone. ;)

@amosgroth
Copy link
Author

Thanks so much for your support and time anyway! If you have more ideas to share, let me know! Otherwise I will be sitting in the library tomorrow, trying to resync. :)

@Kixunil
Copy link
Contributor

Kixunil commented Nov 20, 2022

Just wishing you good luck that it's not caused by HW failure. RPis are not very reliable either. :(

@svpi11
Copy link

svpi11 commented Nov 27, 2022

Hello,

I have a similar issue with electrs 0.9.10... With a different block :
I'm running electrs as a service.

nov 27 14:45:10 electrs[437134]: thread 'p2p_recv' panicked at 'invalid message: OversizedVectorAllocation { requested: 12000072, max: 4000000 }', src/p2p.rs:210:62
nov 27 14:45:10  electrs[437134]: note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
nov 27 14:40:13 : Error: electrs failed
nov 27 14:40:13 : Caused by:
nov 27 14:40:13     0: sync failed
nov 27 14:40:13  electrs[436819]:     1: failed to get block 000000000000000000070742427fa10ec3c66d006160155a704d9f56d090a3ea
nov 27 14:40:13  electrs[436819]:     2: receiving on an empty and disconnected channel

But while reading answers from romanz, my case is a bit different cause I can actually find the block from my bitcoin node (here the file is attached).

block.hex.zip

@romanz
Copy link
Owner

romanz commented Nov 27, 2022

@svpi11 it seems like #783.

It should be resolved in electrs 0.9.10 - could you please check the logs for the version you are running?

"Starting electrs {} on {} {} with {:?}",

@svpi11
Copy link

svpi11 commented Nov 27, 2022

Thank you very much for your answer !

I see this in the log of the service, while starting it:

nov 27 16:08:58  systemd[1]: Started Electrs daemon.
nov 27 16:08:58  electrs[443265]: Starting electrs 0.9.1 on x86_64 linux with Config { network: Bitcoin, db_path: "/home/electrs/.electrs/db/bitcoin", daemon_dir: "/home/bitcoin/.bitcoin", daemon_auth: CookieFile("/home/bitcoin/.bitcoin/.cookie"), daemon_rpc_addr: 127.0.0.1:8332, daemon_p2p_addr: 127.0.0.1:8333, electrum_rpc_addr: 127.0.0.1:50001, monitoring_addr: 127.0.0.1:4224, wait_duration: 10s, jsonrpc_timeout: 15s, index_batch_size: 10, index_lookup_limit: Some(1000), reindex_last_blocks: 0, auto_reindex: true, ignore_mempool: false, sync_once: false, disable_electrum_rpc: false, server_banner: "Welcome to electrs 0.9.1 (Electrum Rust Server)!", args: [] }

Is there another way of printing the logs ? It seems to be 0.9.1 running.

Thank you

@romanz
Copy link
Owner

romanz commented Nov 27, 2022

Please upgrade to the latest release (0.9.10).

You can also check the version by running:

$ ./target/release/electrs --version
v0.9.10

@svpi11
Copy link

svpi11 commented Nov 27, 2022

Thank you...

Just did a stupid mistake mixing up beetwen v0.9.1 and v0.9.10, last time I upgraded, sorry 😅

Thank you for your answers and great work 👍

@romanz
Copy link
Owner

romanz commented Nov 27, 2022

No problem, happy to help :)

@romanz
Copy link
Owner

romanz commented Nov 27, 2022

@amosgroth I am closing this issue for now, please reopen if the corruption reproduces.

@romanz romanz closed this as completed Nov 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants