
BSC synchronization issues #338

Closed
j75689 opened this issue Jul 30, 2021 · 250 comments · Fixed by #257 or #333
Assignees
Labels
enhancement New feature or request

Comments

@j75689
Contributor

j75689 commented Jul 30, 2021

Description

In the 24 hours of July 28, Binance Smart Chain (BSC) processed 12.9 million transactions. This figure, and the numbers below, all come from the excellent BSC network explorer bscscan.com, powered by the Etherscan team.

That works out to 150 transactions per second (TPS) processed on mainnet, not in isolated test environments or in a white paper. Zooming in, these were also not light transactions such as BNB or BEP20 transfers, but heavy ones: many users were "fighting" each other in "Play and Earn" activity, contributed largely by GameFi dApps from MVBII.

The total gas used on July 28 was 2,052,084 million. If all of it had gone to simple BEP20 transfers, which typically cost 50k gas each, it would cover 41 million transactions and correspond to roughly 470 TPS.
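As a rough sanity check of those figures (all values taken from the paragraph above; 86,400 seconds per day):

```shell
# Back-of-the-envelope check: total gas on July 28 divided by the cost of a
# simple BEP20 transfer, then spread over one day.
total_gas=2052084000000   # 2,052,084 million gas
gas_per_tx=50000          # typical simple BEP20 transfer
txs=$((total_gas / gas_per_tx))
tps=$((txs / 86400))
echo "~$txs transactions, ~$tps TPS"
```

which comes out to about 41 million transactions and ~475 TPS, in line with the ~470 quoted above.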

On the other hand, with this flood of volume the network experienced congestion for about 4 hours on July 28, and many low-spec or old-version nodes could not keep up with processing blocks in time.

Updates

A new beta client with better performance has been released to handle the high volume. Please feel free to upgrade and raise bug reports if you encounter any problems. Note that this is only a beta version; fixes for some known bugs are on the way. Click here to download the beta client.

To improve the performance of nodes and achieve faster block times, we recommend the following specifications.

  • validator:
    • 2 TB of free disk space, solid-state drive (SSD), gp3, 8k IOPS, 250 MB/s throughput, read latency <1 ms.
    • 12 CPU cores and 48 GB of memory (RAM).
    • m5zn.3xlarge instance type on AWS, or c2-standard-8 on Google Cloud.
    • A broadband Internet connection with upload/download speeds of 10 megabytes per second.
  • fullnode:
    • 1 TB of free disk space, solid-state drive (SSD), gp3, 3k IOPS, 125 MB/s throughput, read latency <1 ms. (If starting with snap/fast sync, an NVMe SSD is needed.)
    • 8 CPU cores and 32 GB of memory (RAM).
    • c5.4xlarge instance type on AWS, or c2-standard-8 on Google Cloud.
    • A broadband Internet connection with upload/download speeds of 5 megabytes per second.

If you don’t need an archive node, grab the latest snapshot and resync from there rather than syncing from scratch.
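A minimal sketch of that snapshot route. The URL, archive name, and paths below are placeholders; substitute the current link from the official snapshot page, and note that `tar -I lz4` assumes GNU tar with the lz4 utility installed:

```shell
# Placeholder URL and paths: substitute the current link from the official
# BSC snapshot page before running.
wget -O geth.tar.lz4 "https://<snapshot-host>/geth-latest.tar.lz4"
tar -I lz4 -xvf geth.tar.lz4 -C ./node    # unpack into the geth datadir
./geth --config ./config.toml --datadir ./node --syncmode full
```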

Problems

  • Fast/snap sync mode cannot catch up with the current state data.
  • Full sync cannot catch up with the current block.
  • High CPU usage.

Suggestions

  • Use the latest released binary version.
  • Don't use fast/snap sync for now; use the snapshot we provide and run full sync.
  • Confirm that your hardware is sufficient; you can refer to our official documents (we will update them as there are new findings).
  • Regularly prune data to reduce disk pressure.
  • Make sure the peer you connect to is not too slow.
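On the pruning suggestion above, a minimal offline-pruning sketch using geth's built-in `snapshot prune-state` subcommand (the node must be stopped first; the `./node` datadir is an assumed path):

```shell
# Offline state pruning: reclaims disk space from stale trie nodes.
# Stop geth before running this; "./node" is a placeholder datadir.
./geth snapshot prune-state --datadir ./node
```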

Reference PRs


We will update this board if there are any updates.
If you have a suggestion or want to propose improvements, please visit our GitHub.
If you encounter any synchronization issues, please report them here.

@j75689 j75689 added the enhancement New feature or request label Jul 30, 2021
@j75689 j75689 pinned this issue Jul 30, 2021
@kgcdream2019

I updated to the latest binary, 1.1.1-beta.
My hardware: 36 CPU cores, 72 GB RAM, GP3 with 10000 IOPS, 1000 MB/s.
The node is now 1 day 23 hours behind, generating 2~3 blocks per 10 seconds.
What is the issue? This is my geth command:
./build/bin/geth --config ./config.toml --datadir ./node --gcmode archive --syncmode=full --snapshot=false --http.vhosts=* --cache=18000 --cache.preimages --rpc.allow-unprotected-txs --txlookuplimit 0 console

@bifot

bifot commented Dec 3, 2021

Generated snapshot successfully:

t=2021-12-03T14:29:50+0000 lvl=info msg="Generated state snapshot"               accounts=141,851,636 slots=484,368,228 storage="41.12 GiB" elapsed=29h14m53.500s

Then it got stuck at block 12,965,001, and eth.syncing returns false. My output:

t=2021-12-03T18:18:01+0000 lvl=info msg="Looking for peers"                      peercount=0 tried=122 static=14
t=2021-12-03T18:18:15+0000 lvl=info msg="Looking for peers"                      peercount=0 tried=185 static=14

@newb23

newb23 commented Dec 4, 2021

Not sure if this helps to sync, but:

Remove clients not running geth 1.1.5 (this is bash inline, adapt to your needs) : for i in echo "admin.peers" | ./geth --datadir ./node/ attach | grep -i name | grep -v 1.1.5 | grep -v 'name: ""' | sort | uniq ; do echo "admin.peers" | ./geth --datadir ./node/ attach | grep -i $i -B 5 | grep -i enode | awk '{ print $NF }' | tr -d ',' | awk '{ print "admin.removePeer(" $1 ")" }' | ./geth --datadir ./node/ attach; done

Remove clients without diffsync on echo "admin.peers" | ./geth --datadir ./node/ attach | grep -i "diff_sync: false" -B 12 | grep -i enode | awk '{ print $NF }' | tr -d ',' | awk '{ print "admin.removePeer(" $1 ")" }' | ./geth --datadir ./node/ attach

For your first command, I cannot for the life of me get it to run. Running just the first section, dropping the for/do, works fine, but after adding it back I get:
/usr/local/bin/removepeers.sh: line 7: syntax error near unexpected token `|'
/usr/local/bin/removepeers.sh: line 7: `for i in echo "admin.peers" | ./geth --datadir ./node/ attach | grep -i name | grep -v 1.1.6 | grep -v 'name: ""' | sort | uniq; do'

The second command works flawlessly, though. Ideas, @jbriaux?

Edit: The issue is the damned "`" (backtick) marks before the first echo and after the uniq, which GitHub swallowed. Thanks, GitHub! >_<

for i in `echo "admin.peers" | ./geth --datadir ./node/ attach | grep -i name | grep -v 1.1.5 | grep -v 'name: ""' | sort | uniq` ; do
        echo "admin.peers" | ./geth --datadir ./node/ attach | grep -i $i -B 5 | grep -i enode | awk '{ print $NF }' | tr -d ',' | awk '{ print "admin.removePeer(" $1 ")" }' | ./geth --datadir ./node/ attach ;
done
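As a side note, the text-processing half of that script can be sanity-checked without a live node. The snippet below runs the same grep/awk/tr chain over a canned two-peer excerpt; the enodes and version strings are made up for illustration:

```shell
# Canned excerpt mimicking the geth console's "admin.peers" output
# (made-up enodes and versions, for illustration only).
peers='      enode: "enode://aaa@1.2.3.4:30311",
      name: "Geth/v1.1.4-stable/linux-amd64/go1.16",
      enode: "enode://bbb@5.6.7.8:30311",
      name: "Geth/v1.1.5-stable/linux-amd64/go1.16",'

# Same chain as the script above: take the enode printed one line above
# the name of the peer still on 1.1.4, strip the trailing comma.
stale=$(echo "$peers" | grep -B 1 'v1.1.4' | grep -i enode \
        | awk '{ print $NF }' | tr -d ',')
echo "$stale"
```

The extracted enode would then be fed to admin.removePeer(...), exactly as in the loop above.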

@guiltylelouch

Same issue here: sync is too slow.
infra: AWS
instance type: c5.4xlarge
config: geth --config ./config.toml --datadir ./ --cache 18000 --rpc.allow-unprotected-txs --snapshot=false --ws --ws.addr 0.0.0.0 --ws.port 3334 --ws.api eth,net,web3 --txlookuplimit 0 --diffsync

@mj-dcb

mj-dcb commented Dec 5, 2021 via email

@TehnobitSystems

AWS doesn’t perform well. (Sent from my iPhone)

What is your advice or recommendation?

@jun0tpyrc

For those who need to make heavy HTTP JSON-RPC requests, from my experience you should consider at least i3en.3xlarge.

@TehnobitSystems

What disks do you have in this instance, and what is the disk configuration?

@jun0tpyrc

We could have a 7.5 TB NVMe instance store there; xfs would work well.

@TehnobitSystems

How did you directly attach the NVMe on i3en.3xlarge?

@mj-dcb

mj-dcb commented Dec 5, 2021 via email

@a114437

a114437 commented Dec 6, 2021

Why does snapshot synchronization automatically switch to full synchronization?

command: ./geth-linux --config ./config.toml --datadir ./node --diffsync --syncmode=snap --snapshot=true --cache 32000 --rpc.allow-unprotected-txs --txlookuplimit 0

@yaoyf888

yaoyf888 commented Dec 8, 2021

I am using Amazon Cloud (instance type c5) with 32 cores, 128 GB of RAM, and a 2 TB SSD, but my BSC node is often too slow to keep up with synchronization. Does anyone have a good suggestion?

@hdiass

hdiass commented Dec 8, 2021

@yaoyf888 AFAIK c5 instances don't have NVMe, right? I think you need to change to i3 instances and make use of the NVMe for a proper sync.

@yaoyf888

yaoyf888 commented Dec 8, 2021

@yaoyf888 AFAIK c5 instances don't have NVMe, right? I think you need to change to i3 instances and make use of the NVMe for a proper sync.

My node often can't keep up with synchronization, and then syncing is very slow. Is it OK to change it to i3 instances?

@DeepBorys

My current setup is i3en.3xlarge with version 1.1.5 of the client. I experimented with a few other instance types, but this is by far the best.

I have trouble syncing from scratch, but no problems keeping up once in sync, and it's fast enough to catch up from the bsc_snapshots that are posted daily.

@vietthang207

I had a lot of problems staying in sync with the network on various setups in the past, but things seem to be much better now. I guess it is because of the recent market cooldown, or because of the recent v1.1.7 release.

@guiltylelouch

I have the same issue as #649; I don't know how to solve it.

@noXi89

noXi89 commented Dec 28, 2021

Can't sync here. Running v1.1.7 on Windows with decent, latest-generation hardware (i7-12700K, 2 TB NVMe dedicated to BSC, 100 Mb/s); tried all sync methods, from genesis and from a snapshot.

@Tronglx

Tronglx commented Feb 15, 2022

Please provide the snapshot for testnet.

@KyloYang888

KyloYang888 commented Feb 16, 2022

Hi team. My BSC full node cannot find other nodes to sync data from. How should I deal with it, please? Thanks.

#770

@goev-lab

(Quoting @newb23's peer-removal script from earlier in the thread.)

I did this, and now all my peers are gone, lol. Or maybe I did something wrong?
