Bug: OversizedVectorAllocation during initial synch (0.9.9) #806

amosgroth · 2022-11-19T11:42:25Z

Describe the bug
Fresh Umbrel installation, fully synched Bitcoin Core, freshly installed Electrum fails and stops to synch:

Attaching to electrs_app_1, electrs_app_proxy_1, electrs_electrs_1, electrs_tor_1
electrs_1 | [2022-11-19T11:25:17.079Z INFO electrs::index] indexing 2000 blocks: [548741..550740]
electrs_1 | thread 'p2p_loop' panicked at 'invalid message: OversizedVectorAllocation { requested: 14330322073680149042, max: 4000000 }', src/p2p.rs:250:70
electrs_1 | note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
electrs_1 | [2022-11-19T11:25:17.578Z INFO electrs::db] closing DB at /data/db/bitcoin
electrs_1 | Error: electrs failed
electrs_1 |
electrs_1 | Caused by:
electrs_1 | 0: sync failed
electrs_1 | 1: failed to get block 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5
electrs_1 | 2: receiving on an empty and disconnected channel

I have tried this for the 3rd time now (un- and reinstalled Electrum, deleted all files in between, restarted synching). It seems it's always giving me the same error and it always stops at the same block height or same synching percentage (72%).

Electrs version
0.9.9

To Reproduce
Steps to reproduce the behavior:

Configure and start Umbrel with BTC node, install Electrum after BTC is fully synched
Install Electrum App and let it synch initially
Wait
See error

Expected behavior
Electrum Server synchs to 100% and doesn't stop in between, not being able to finish.

Configuration

Umbrel 0.5.3 on RasPi
Bitcoin Core 23.0
Electrum 0.9.9

The text was updated successfully, but these errors were encountered:

romanz · 2022-11-19T17:49:10Z

It seems that electrs fails to parse 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5 block returned from bitcoind :(

Could you please try to retrieve it using:

$ bitcoin-cli getblock 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5 0 > block.hex

and attach the resulting block.hex file?

amosgroth · 2022-11-19T20:42:40Z

Hi Roman,

Thx for your support. Pls find the file attached...
block.hex.zip

amosgroth · 2022-11-20T12:33:56Z

Would there be the possibility to redownload and integrate the corrupted block with BTC Core in order to fix my problem with Electrum? (to prevent synching the whole blockchain from scratch)

romanz · 2022-11-20T15:20:41Z

It seems that your node doesn't have the block available (the content of the block.hex.zip above):

error code: -1
error message:
Block not found on disk

romanz · 2022-11-20T15:32:01Z

BTW, are the any errors in bitcoind logs?

amosgroth · 2022-11-20T15:49:18Z

It seems that your node doesn't have the block available (the content of the block.hex.zip above):
error code: -1
error message:
Block not found on disk

That's strange. My node says it's 100% synched and didn't complain while synching. I can also connect wallets through BTC Core. The log seems to be fine, ~~only errors regarding peer connections, nothing said about block errors or equivalent.~~

One additional info that might be relevant for this is that I had a backup copy from the Bitcoin data directories (blocks/chainstate/indexes) from a few months ago - I used that to save time and bandwidth for synching...

amosgroth · 2022-11-20T15:57:46Z

BTW, are the any errors in bitcoind logs?

I searched the whole log file at ~/umbrel/app-data/bitcoin/data/bitcoin/debug.log:

block number 548750 is not mentioned
0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5 is not mentioned

Only mentioned by Electrum...

romanz · 2022-11-20T17:15:12Z

Could you please look for ERROR: in the logs?

//// in src/rpc/blockchain.cpp
static CBlock GetBlockChecked(BlockManager& blockman, const CBlockIndex* pblockindex) EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
{
    AssertLockHeld(::cs_main);
    CBlock block;
    if (blockman.IsBlockPruned(pblockindex)) {
        throw JSONRPCError(RPC_MISC_ERROR, "Block not available (pruned data)");
    }

    if (!ReadBlockFromDisk(block, pblockindex, Params().GetConsensus())) {
        // Block not found on disk. This could be because we have the block
        // header in our index but not yet have the block or did not accept the
        // block.
        throw JSONRPCError(RPC_MISC_ERROR, "Block not found on disk");
    }

    return block;
}

//// in src/node/blockstorage.cpp
bool ReadBlockFromDisk(CBlock& block, const FlatFilePos& pos, const Consensus::Params& consensusParams)
{
    block.SetNull();

    // Open history file to read
    CAutoFile filein(OpenBlockFile(pos, true), SER_DISK, CLIENT_VERSION);
    if (filein.IsNull()) {
        return error("ReadBlockFromDisk: OpenBlockFile failed for %s", pos.ToString());
    }

    // Read block
    try {
        filein >> block;
    } catch (const std::exception& e) {
        return error("%s: Deserialize or I/O error - %s at %s", __func__, e.what(), pos.ToString());
    }

    // Check the header
    if (!CheckProofOfWork(block.GetHash(), block.nBits, consensusParams)) {
        return error("ReadBlockFromDisk: Errors in block header at %s", pos.ToString());
    }

    // Signet only: check block solution
    if (consensusParams.signet_blocks && !CheckSignetBlockSolution(block, consensusParams)) {
        return error("ReadBlockFromDisk: Errors in block solution at %s", pos.ToString());
    }

    return true;
}

//// in src/util/system.h
template<typename... Args>
bool error(const char* fmt, const Args&... args)
{
    LogPrintf("ERROR: %s\n", tfm::format(fmt, args...));
    return false;
}

romanz · 2022-11-20T18:41:51Z

Would there be the possibility to redownload and integrate the corrupted block with BTC Core in order to fix my problem with Electrum? (to prevent synching the whole blockchain from scratch)

Maybe you can try:

getblockfrompeer "blockhash" peer_id

Attempt to fetch block from a given peer.

We must have the header for this block, e.g. using submitheader.
Subsequent calls for the same block and a new peer will cause the response from the previous peer to be ignored.

Returns an empty JSON object if the request was successfully scheduled.

Arguments:
1. blockhash    (string, required) The block hash to try to fetch
2. peer_id      (numeric, required) The peer to fetch it from (see getpeerinfo for peer IDs)

Result:
{}    (empty JSON object)

Examples:
> bitcoin-cli getblockfrompeer "00000000c937983704a73af28acdec37b049d214adbda81d7e2a3dd146f6ed09" 0

amosgroth · 2022-11-20T18:48:05Z

Downloaded the whole log file, searched for string "error:". First error I found was this:

...
2022-11-18T12:21:57Z tor: Thread interrupt
2022-11-18T12:21:57Z addcon thread exit
2022-11-18T12:21:57Z torcontrol thread exit
2022-11-18T12:21:57Z Shutdown: In progress...
2022-11-18T12:21:57Z net thread exit
2022-11-18T12:21:57Z msghand thread exit
2022-11-18T12:21:58Z ERROR: Error while reading proxy response
2022-11-18T12:21:58Z opencon thread exit
2022-11-18T12:21:58Z DumpAnchors: Flush 2 outbound block-relay-only peer addresses to anchors.dat started
2022-11-18T12:21:58Z DumpAnchors: Flush 2 outbound block-relay-only peer addresses to anchors.dat completed (0.01s)
2022-11-18T12:21:58Z scheduler thread exit
2022-11-18T12:21:59Z Writing 0 unbroadcast transactions to disk.
2022-11-18T12:21:59Z Dumped mempool: 0.01994s to copy, 1.23883s to dump
2022-11-18T12:22:02Z Shutdown: done
...

I would guess this is irrelevant and has to do with the shutdown of the server because I found this similar error 3 times in the logs and it got thrown every time the node was shutting down.

More interesting, I guess:

...
2022-11-19T20:35:54Z New outbound peer connected: version: 70016, blocks=763900, peer=10 (outbound-full-relay)
2022-11-19T20:36:03Z UpdateTip: new best=00000000000000000004a22df70b00432b727159a79e6cfdee240241e5705774 height=763851 version=0x2608e000 log2_work=93.852113 tx=781602834 date='2022-11-19T14:28:21Z' progress=0.999918 cache=28.7MiB(216571txo)
2022-11-19T20:36:56Z UpdateTip: new best=000000000000000000008f61f85025a0281e71851e2eb7045bbae1496b71c454 height=763852 version=0x29e5c000 log2_work=93.852126 tx=781605824 date='2022-11-19T14:39:23Z' progress=0.999921 cache=37.7MiB(273802txo)
2022-11-19T20:37:13Z UpdateTip: new best=000000000000000000031d01bdddd233879dcd7ee789c54b51426d515204b324 height=763853 version=0x20600000 log2_work=93.852139 tx=781607574 date='2022-11-19T14:39:43Z' progress=0.999921 cache=40.8MiB(299195txo)
2022-11-19T20:37:20Z ERROR: ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: iostream error at FlatFilePos(nFile=1421, nPos=50038035)
2022-11-19T20:37:24Z Socks5() connect to 143.110.240.88:8446 failed: general failure
2022-11-19T20:37:31Z UpdateTip: new best=00000000000000000000bce516dfc3213e00e3c0142a791003d599e2e63110a4 height=763854 version=0x20000000 log2_work=93.852151 tx=781609976 date='2022-11-19T14:52:02Z' progress=0.999923 cache=43.9MiB(324437txo)
...

This error was thrown once and after that the log is free from errors...

amosgroth · 2022-11-20T19:10:17Z

Would there be the possibility to redownload and integrate the corrupted block with BTC Core in order to fix my problem with Electrum? (to prevent synching the whole blockchain from scratch)

Maybe you can try:

getblockfrompeer "blockhash" peer_id

Attempt to fetch block from a given peer.

We must have the header for this block, e.g. using submitheader.
Subsequent calls for the same block and a new peer will cause the response from the previous peer to be ignored.

Returns an empty JSON object if the request was successfully scheduled.

Arguments:
1. blockhash    (string, required) The block hash to try to fetch
2. peer_id      (numeric, required) The peer to fetch it from (see getpeerinfo for peer IDs)

Result:
{}    (empty JSON object)

Examples:
> bitcoin-cli getblockfrompeer "00000000c937983704a73af28acdec37b049d214adbda81d7e2a3dd146f6ed09" 0

No luck:

umbrel@umbrel:~ $ ./umbrel/scripts/app compose bitcoin exec bitcoind bitcoin-cli getblockfrompeer 0000000000000000000b187c3d983a2b73d33d873532d44f8c9081ec28d403b5 84
error code: -1
error message:
Block already downloaded

romanz · 2022-11-20T19:10:38Z

2022-11-19T20:37:20Z ERROR: ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: iostream error at FlatFilePos(nFile=1421, nPos=50038035)

is indeed the relevant issue - looks like the block is corrupted on disk :(

amosgroth · 2022-11-20T19:26:45Z

2022-11-19T20:37:20Z ERROR: ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: iostream error at FlatFilePos(nFile=1421, nPos=50038035)

is indeed the relevant issue - looks like the block is corrupted on disk :(

Oh, boy. :(

A few questions regarding that:

If BTC Core knows that it has downloaded that block already and knows where to look for it, it's not possible to "force-replace" that one? Either manually or programmatically?
Why does my node tell me everything is fine and is 100% synched, isn't that a false behavior of BTC Core since I have at least one corrupted block? Shouldn't BTC Core have stopped synching after throwing this error, as Electrum does?
Why is Electrum so strict comparing to BTC Core? Would there be a workaround for Electrs to continue synching? (I ask because I have funds to recover and feeling a little bit impatient/insecure with waiting, also I am on mobile Internet - that's why I refuse to resynch BTC entirely again...)

romanz · 2022-11-20T19:42:45Z

If BTC Core knows that it has downloaded that block already and knows where to look for it, it's not possible to "force-replace" that one? Either manually or programmatically?

Not sure currently it's possible to force re-downloading a specific block :(

Why does my node tell me everything is fine and is 100% synched, isn't that a false behavior of BTC Core since I have at least one corrupted block? Shouldn't BTC Core have stopped synching after throwing this error, as Electrum does?

If the corruption happened after bitcoind sync was over, it's hard to detect (since most of the queries do not involve historical blocks) except of re-reading the blocks (as we did above using getblock RPC).

Why is Electrum so strict comparing to BTC Core?

electrs needs access to all historical blocks to build its index.

Would there be a workaround for Electrs to continue synching? (I ask because I have funds to recover and feeling a little bit impatient/insecure with waiting, also I am on mobile Internet - that's why I refuse to resynch BTC entirely again...)

Maybe you could try https://github.com/bwt-dev/bwt? CC: @shesek

Kixunil · 2022-11-20T19:52:36Z

I don't think BWT will help here since it will need to rescan whole chain and will hit the same bug. The only solution here is to run Core with -reindex and wait. Unfortunately, RPis are slow so you'll need to wait 2-3 days.

romanz · 2022-11-20T19:54:55Z

I hope it can work if we rescan only the last N blocks - see https://github.com/bwt-dev/bwt#pruning

amosgroth · 2022-11-20T20:00:02Z

I don't think BTW will help here since it will need to rescan whole chain and will hit the same bug. The only solution here is to run Core with -reindex and wait. Unfortunately, RPis are slow so you'll need to wait 2-3 days.

Yeah, wouldn't help me. If it was only for that I would just use a public Electrum server and Sparrow. But the app I had installed on Umbrel and not have the recovery words from (ItchySats) derives it's wallet from the Umbrel seed/wallet (which I have and recovered so far with BTC Core and LND). Since ItchySats needs Electrum Server, I am kind of in a trap.

I know I am stupid, so telling me so wouldn't help anyone. ;)

amosgroth · 2022-11-20T20:17:33Z

Thanks so much for your support and time anyway! If you have more ideas to share, let me know! Otherwise I will be sitting in the library tomorrow, trying to resync. :)

Kixunil · 2022-11-20T20:53:51Z

Just wishing you good luck that it's not caused by HW failure. RPis are not very reliable either. :(

svpi11 · 2022-11-27T13:42:24Z

Hello,

I have a similar issue with electrs 0.9.10... With a different block :
I'm running electrs as a service.

nov 27 14:45:10 electrs[437134]: thread 'p2p_recv' panicked at 'invalid message: OversizedVectorAllocation { requested: 12000072, max: 4000000 }', src/p2p.rs:210:62
nov 27 14:45:10  electrs[437134]: note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
nov 27 14:40:13 : Error: electrs failed
nov 27 14:40:13 : Caused by:
nov 27 14:40:13     0: sync failed
nov 27 14:40:13  electrs[436819]:     1: failed to get block 000000000000000000070742427fa10ec3c66d006160155a704d9f56d090a3ea
nov 27 14:40:13  electrs[436819]:     2: receiving on an empty and disconnected channel

But while reading answers from romanz, my case is a bit different cause I can actually find the block from my bitcoin node (here the file is attached).

block.hex.zip

romanz · 2022-11-27T14:22:56Z

@svpi11 it seems like #783.

It should be resolved in electrs 0.9.10 - could you please check the logs for the version you are running?

electrs/src/config.rs

Line 346 in faf1f07

"Starting electrs {} on {} {} with {:?}",

svpi11 · 2022-11-27T15:18:28Z

Thank you very much for your answer !

I see this in the log of the service, while starting it:

nov 27 16:08:58  systemd[1]: Started Electrs daemon.
nov 27 16:08:58  electrs[443265]: Starting electrs 0.9.1 on x86_64 linux with Config { network: Bitcoin, db_path: "/home/electrs/.electrs/db/bitcoin", daemon_dir: "/home/bitcoin/.bitcoin", daemon_auth: CookieFile("/home/bitcoin/.bitcoin/.cookie"), daemon_rpc_addr: 127.0.0.1:8332, daemon_p2p_addr: 127.0.0.1:8333, electrum_rpc_addr: 127.0.0.1:50001, monitoring_addr: 127.0.0.1:4224, wait_duration: 10s, jsonrpc_timeout: 15s, index_batch_size: 10, index_lookup_limit: Some(1000), reindex_last_blocks: 0, auto_reindex: true, ignore_mempool: false, sync_once: false, disable_electrum_rpc: false, server_banner: "Welcome to electrs 0.9.1 (Electrum Rust Server)!", args: [] }

Is there another way of printing the logs ? It seems to be 0.9.1 running.

Thank you

romanz · 2022-11-27T15:57:32Z

Please upgrade to the latest release (0.9.10).

You can also check the version by running:

$ ./target/release/electrs --version
v0.9.10

svpi11 · 2022-11-27T16:38:00Z

Thank you...

Just did a stupid mistake mixing up beetwen v0.9.1 and v0.9.10, last time I upgraded, sorry 😅

Thank you for your answers and great work 👍

romanz · 2022-11-27T16:45:33Z

No problem, happy to help :)

romanz · 2022-11-27T16:46:20Z

@amosgroth I am closing this issue for now, please reopen if the corruption reproduces.

amosgroth added the bug Something isn't working label Nov 19, 2022

amosgroth changed the title ~~Bug: OversizedVectorAllocation~~ Bug: OversizedVectorAllocation during initial synch (0.9.9) Nov 19, 2022

romanz mentioned this issue Nov 19, 2022

Improve error logging in case of p2p parsing error #807

Merged

romanz closed this as completed Nov 27, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug: OversizedVectorAllocation during initial synch (0.9.9) #806

Bug: OversizedVectorAllocation during initial synch (0.9.9) #806

amosgroth commented Nov 19, 2022

romanz commented Nov 19, 2022

amosgroth commented Nov 19, 2022

amosgroth commented Nov 20, 2022

romanz commented Nov 20, 2022 •

edited

Loading

romanz commented Nov 20, 2022

amosgroth commented Nov 20, 2022 •

edited

Loading

amosgroth commented Nov 20, 2022 •

edited

Loading

romanz commented Nov 20, 2022 •

edited

Loading

romanz commented Nov 20, 2022

amosgroth commented Nov 20, 2022 •

edited

Loading

amosgroth commented Nov 20, 2022

romanz commented Nov 20, 2022

amosgroth commented Nov 20, 2022

romanz commented Nov 20, 2022

Kixunil commented Nov 20, 2022 •

edited by romanz

Loading

romanz commented Nov 20, 2022 •

edited

Loading

amosgroth commented Nov 20, 2022 •

edited

Loading

amosgroth commented Nov 20, 2022

Kixunil commented Nov 20, 2022

svpi11 commented Nov 27, 2022 •

edited

Loading

romanz commented Nov 27, 2022 •

edited

Loading

svpi11 commented Nov 27, 2022

romanz commented Nov 27, 2022

svpi11 commented Nov 27, 2022

romanz commented Nov 27, 2022

romanz commented Nov 27, 2022

Bug: OversizedVectorAllocation during initial synch (0.9.9) #806

Bug: OversizedVectorAllocation during initial synch (0.9.9) #806

Comments

amosgroth commented Nov 19, 2022

romanz commented Nov 19, 2022

amosgroth commented Nov 19, 2022

amosgroth commented Nov 20, 2022

romanz commented Nov 20, 2022 • edited Loading

romanz commented Nov 20, 2022

amosgroth commented Nov 20, 2022 • edited Loading

amosgroth commented Nov 20, 2022 • edited Loading

romanz commented Nov 20, 2022 • edited Loading

romanz commented Nov 20, 2022

amosgroth commented Nov 20, 2022 • edited Loading

amosgroth commented Nov 20, 2022

romanz commented Nov 20, 2022

amosgroth commented Nov 20, 2022

romanz commented Nov 20, 2022

Kixunil commented Nov 20, 2022 • edited by romanz Loading

romanz commented Nov 20, 2022 • edited Loading

amosgroth commented Nov 20, 2022 • edited Loading

amosgroth commented Nov 20, 2022

Kixunil commented Nov 20, 2022

svpi11 commented Nov 27, 2022 • edited Loading

romanz commented Nov 27, 2022 • edited Loading

svpi11 commented Nov 27, 2022

romanz commented Nov 27, 2022

svpi11 commented Nov 27, 2022

romanz commented Nov 27, 2022

romanz commented Nov 27, 2022

romanz commented Nov 20, 2022 •

edited

Loading

amosgroth commented Nov 20, 2022 •

edited

Loading

amosgroth commented Nov 20, 2022 •

edited

Loading

romanz commented Nov 20, 2022 •

edited

Loading

amosgroth commented Nov 20, 2022 •

edited

Loading

Kixunil commented Nov 20, 2022 •

edited by romanz

Loading

romanz commented Nov 20, 2022 •

edited

Loading

amosgroth commented Nov 20, 2022 •

edited

Loading

svpi11 commented Nov 27, 2022 •

edited

Loading

romanz commented Nov 27, 2022 •

edited

Loading