Besu uses 100 % CPU after losing the internet connection #5348
Comments
Same issue after downtime for about 10 days (power supply failure).
From what I gathered using debug logs, this issue could be related to the Besu enode being blacklisted by other nodes. I would recommend deleting the key file inside the Besu data path and restarting Besu.
I'm pretty sure I had tried that earlier with no success, based on something I found in the Besu documentation. I just tested again and after deleting the Eventually I get:
After that I repeatedly get stack traces thrown by Again, this is all after I deleted the
Thanks for the feedback @Pablosan. Looking at the stack trace of the thread holding 99% of the heap, we can see that this happens because Besu tries to do a huge reorg to block number 15_537_445 (0a46f5e956718147fe6b8f16301496283c28059fd07c16e6f842c0a3dfc1a711). The shouldContinuedownloading method calls rewindToBlock(0a46f5e956718147fe6b8f16301496283c28059fd07c16e6f842c0a3dfc1a711), which calls handleChainReorg. This method in DefaultBlockchain creates two local list variables, and each one consumes around 2 GiB of memory because the chosen pivot block is too far from the current chain head in the database (more than 292 days old).
The peak in CPU usage comes from GC threads trying to free up memory without success, as the lists just keep growing.
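To illustrate why the depth of the reorg matters, here is a toy model of a reorg that walks two branches back to their common ancestor and collects the blocks on each side in two lists. It is only a sketch under stated assumptions: `ReorgSketch`, `Header`, `planReorg` and `extend` are hypothetical names, and this is not the code of `handleChainReorg` in Besu. It only shows that the lists grow linearly with the distance between the current head and the target block.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a chain reorg. Hypothetical code, not Besu's DefaultBlockchain. */
final class ReorgSketch {

    /** Minimal stand-in for a block header. */
    static final class Header {
        final long number;
        final String hash;
        final String parentHash;

        Header(long number, String hash, String parentHash) {
            this.number = number;
            this.hash = hash;
            this.parentHash = parentHash;
        }
    }

    /** Blocks collected while walking both branches back to the common ancestor. */
    static final class ReorgResult {
        final List<Header> removed = new ArrayList<>();
        final List<Header> added = new ArrayList<>();
    }

    /** Walks from both heads towards the common ancestor, filling the two lists. */
    static ReorgResult planReorg(Map<String, Header> store, Header oldHead, Header newHead) {
        ReorgResult result = new ReorgResult();
        Header oldCursor = oldHead;
        Header newCursor = newHead;
        // Bring the higher branch down to the height of the lower one.
        while (oldCursor.number > newCursor.number) {
            result.removed.add(oldCursor);
            oldCursor = store.get(oldCursor.parentHash);
        }
        while (newCursor.number > oldCursor.number) {
            result.added.add(newCursor);
            newCursor = store.get(newCursor.parentHash);
        }
        // Step both branches back together until they meet.
        while (!oldCursor.hash.equals(newCursor.hash)) {
            result.removed.add(oldCursor);
            result.added.add(newCursor);
            oldCursor = store.get(oldCursor.parentHash);
            newCursor = store.get(newCursor.parentHash);
        }
        return result;
    }

    /** Appends `length` blocks on top of `parent`, stores them, and returns the new tip. */
    static Header extend(Map<String, Header> store, Header parent, String prefix, int length) {
        Header tip = parent;
        for (int i = 0; i < length; i++) {
            long number = tip.number + 1;
            Header next = new Header(number, prefix + number, tip.hash);
            store.put(next.hash, next);
            tip = next;
        }
        return tip;
    }

    public static void main(String[] args) {
        Map<String, Header> store = new HashMap<>();
        Header genesis = new Header(0, "genesis", "");
        store.put(genesis.hash, genesis);
        Header oldPivot = extend(store, genesis, "common-", 100);
        Header head = extend(store, oldPivot, "main-", 50_000);
        // Rewinding to a far-away ancestor keeps every block in between in memory at once.
        ReorgResult result = planReorg(store, head, oldPivot);
        System.out.println("blocks held in the removed list: " + result.removed.size());
    }
}
```

In this model, rewinding to an ancestor N blocks behind the head puts N entries in one list before anything can be released. The real lists hold far heavier objects than these small headers.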
To understand why Besu trusts this old pivot block and triggers a huge reorg, @matkt suggested checking the file inside [Besu_data_path]/fastsync/pivotBlockHeader.rlp.
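As a first, rough check of that file, the sketch below reports whether it exists, how large it is, and when it was last modified. The class and method names (`PivotFileCheck`, `pivotFile`, `describe`) are hypothetical, and the data path is passed as an argument. Note that the file's modification time is only a proxy: it says when Besu last wrote the pivot header, not how old the pivot block itself is, and the sketch does not decode the RLP content.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper that inspects the stored pivot block header file. */
final class PivotFileCheck {

    /** Resolves fastsync/pivotBlockHeader.rlp below the given Besu data path. */
    static Path pivotFile(Path dataPath) {
        return dataPath.resolve("fastsync").resolve("pivotBlockHeader.rlp");
    }

    /** Describes the file's presence, size, and time since last modification. */
    static String describe(Path dataPath, Instant now) throws IOException {
        Path file = pivotFile(dataPath);
        if (!Files.exists(file)) {
            return "no pivot block header file at " + file;
        }
        long size = Files.size(file);
        Instant modified = Files.getLastModifiedTime(file).toInstant();
        long ageDays = Duration.between(modified, now).toDays();
        return file + ": " + size + " bytes, last modified " + ageDays + " day(s) ago";
    }

    public static void main(String[] args) throws IOException {
        // The data path is an assumption: pass your own Besu data path as the first argument.
        Path dataPath = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(describe(dataPath, Instant.now()));
    }
}
```

A file that was last written many months ago while the node has otherwise been running would be consistent with a stale pivot block being reused.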
Thanks for all the details @ahamlat. Removing the
This clears the database and starts the syncing process back at the beginning, but it did get my node back online eventually. On my machine running at home it took about 38 hours (using SNAP sync on a Mac Mini M1, 16 GB RAM, 1 GbE Ethernet to a 1 Gigabit fiber connection, using a Thunderbolt 4 2 TB SSD).
Thanks for the feedback @Pablosan. The logs you shared with me make me think you're using fast sync and not snap sync.
My apologies. It did fix the issue, but it didn't get me back to a fully synced node without the extra step I mentioned above. Regarding the long sync time: I must have something wrong in my command line args. I'll take a closer look and see if I notice something amiss.
Reference #5699
Description
A Discord user reported Besu consuming 100% CPU after restarting his router.
RAM usage was affected too (from 5.19 to 11.7 GB of 32 GB total).
Below are the logs shared by the user.
Besu shouldn't trigger an OutOfMemoryError after losing the internet connection.