Devnet: not able to sync public node with current tfchain docker image tfchain-devnet2 #442

Closed
coesensbert opened this issue Sep 7, 2022 · 13 comments · Fixed by #462
Labels: type_bug (Something isn't working)

Comments

@coesensbert
Contributor

The 3 current external devnet validators also use the dylanverstraete/tfchain-devnet2 docker image, and they work fine.

Trying to sync a public node results in the following:
image

Not sure who this peer is, but we could try to find out if necessary. A node with these errors also seems to be on "a different chain" in the telemetry data:
image

Such a faulty public node has been started with:
docker run -d --restart unless-stopped -v /storage/:/storage/ --name tfchain-dev-pub-int --network host dylanverstraete/tfchain-devnet2 --name tfchain-dev-pub-int --base-path /storage --chain /etc/chainspecs/dev/chainSpecRaw.json --bootnodes /ip4/185.206.122.7/tcp/30333/p2p/12D3KooWRdfuKqX8hULMZz521gdqZB2TXJjfrJE5FV71WiuAUrpk --rpc-cors all --node-key 0b4945a1e5c568a453f017f11f331d0e6e0ee3ea433c41e3c389c09ffcb53405 --prometheus-external --ws-external --ws-max-connections=148576 --pruning archive --telemetry-url 'wss://shard1.telemetry.tfchain.grid.tf/submit 1' --rpc-methods Unsafe --rpc-external

By now we have excluded networking issues. We have set up 4 unsafe public nodes for each net on silver boxes with only one internal IP. The current images for qa, test and mainnet work fine. The devnet node was set up using the same procedures and the same networking environment and hardware.
Normally we start the docker image with specific ports exposed: -p 0.0.0.0:9944:9944 -p 0.0.0.0:30333:30333 -p 0.0.0.0:9933:9933 -p 0.0.0.0:9615:9615
Changing to --network host made no difference.
There is also regular communication over port 30333.
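
The port mappings above can be checked from another machine with a small script. This is a minimal sketch, not part of the original setup: the role names attached to each port are assumptions based on common Substrate defaults, and `127.0.0.1` is a placeholder for the node's address.

```python
import socket

# Ports taken from the -p mappings above. The role names are assumptions
# based on common Substrate defaults, not something stated in this issue.
PORTS = {30333: "p2p", 9933: "rpc-http", 9944: "rpc-ws", 9615: "prometheus"}


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(host: str) -> dict:
    """Map each expected port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in PORTS}


if __name__ == "__main__":
    # Placeholder address; replace with the node's address when checking remotely.
    for port, ok in check_node("127.0.0.1").items():
        print(f"{port:>5} ({PORTS[port]}): {'open' if ok else 'closed'}")
```

A successful TCP connect only shows that the port accepts connections. It says nothing about whether blocks import correctly, which matches the observation that port 30333 traffic looked normal while sync still failed.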

Will test some older images

@DylanVerstraete
Contributor

I can sync a node on devnet using latest development tfchain. It's probably the image that doesn't work.

@coesensbert
Contributor Author

testing sync with: dylanverstraete/tfchain:2.1.0-b6

@coesensbert
Contributor Author

Image tfchain:2.1.0-b6 started to sync but stopped at block 113254

image
image

Restarted the node with the tfchain-devnet2 image using the same data directory, but there was no improvement: the container crashes

image

@DylanVerstraete
Contributor

was able to reproduce on devnet with tfchain v2.1.0

2022-09-21 11:20:30 ⚙️  Syncing 32.0 bps, target=#2416742 (5 peers), best: #112869 (0xfa0a…8119), finalized #112640 (0x3c2b…c601), ⬇ 2.8kiB/s ⬆ 2.1kiB/s    
2022-09-21 11:20:35 ⚙️  Syncing 33.2 bps, target=#2416743 (5 peers), best: #113035 (0xd32f…a386), finalized #112640 (0x3c2b…c601), ⬇ 0.6kiB/s ⬆ 0.3kiB/s    
2022-09-21 11:20:40 ⚙️  Syncing 30.8 bps, target=#2416744 (5 peers), best: #113189 (0x7100…bb96), finalized #113152 (0x5062…70ed), ⬇ 194.5kiB/s ⬆ 3.9kiB/s    
2022-09-21 11:20:42 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWAnibsVN4yBcKNKnnRm8pxmHkEf3DUT65dpb3P252RpZK, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"    
2022-09-21 11:20:42 💔 Error importing block 0x0a43d051e38aafd6b414911dd8154e2827cc17c1170b9ac5bd7a0dafea9eaef0: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0x0b5da794c93fb95e9091401649719c8bb0824229649d2810c939ffd610c2aedb: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0xc866caa4a8e26f9e01db1bd39faa13aeb258362e73dd077f00c71ae8da912f07: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0x16a49afbb90e8ede648740b517a0a415edbdd17f891c017dc0f1708c04052f4c: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0x69feefa70018caa90d232056a2cd6fc156e0bef4c8553c46f21d826e84b1c3ba: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0x5b46ae3b7307a9014419b99dbd9f85bcded04457378207930d3129154de36b76: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0x8c9026aa1e70c718ea63c1e3381846ec7bbd96b4d5c7e2434b61bb3985e33eb8: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0x860bfafefed3f956a1ce69a2f98d4fe07087bbe9d1c423003c3346bf72cc9b7d: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0xcc82f27a3484112f1de904e5d5fa62f60e5e7a2c29902e086dec31782a3008ae: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0x4fbde896e13fafac42d1aeeaf3d73411e1c7973f8213be0fef7cb8271be57e10: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0xf6547f1cee026a4c3d8e779ce20debc79185af1b08668cfe8a083c7670b3a48f: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0xcee33e8a2c09c2b8a6ef4fb7b2f0da18594d3ac38e76b9b0da715749dce83be9: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0x96d3fc4ab602d8206f2afe4e42c9bf1746425758abb1e72ca2b7be3496b19e55: block has an unknown parent    
2022-09-21 11:20:42 💔 Error importing block 0xec9867ba1ad9050e18ca0e5c8443cb6688853cdba35734ca3e9e416cd3148c23: block has an unknown parent    
2022-09-21 11:20:42 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWRdfuKqX8hULMZz521gdqZB2TXJjfrJE5FV71WiuAUrpk, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"    
2022-09-21 11:20:42 💔 Error importing block 0x2dd86bd25075ce978dc873316b02a49b848175e7cf87213003c57e55c85a7aee: block has an unknown parent    
2022-09-21 11:20:42 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWBkwH8LfJsz48Q8LHXSQnuqKJK8YdoQokDeS9wQX1j8mm, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"    
2022-09-21 11:20:42 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWMPnWPkAi9UhVgGv9QmozXiPDE94rymPCsXLJ5EKynwta, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"    
2022-09-21 11:20:45 💤 Idle (1 peers), best: #113254 (0x9511…8c92), finalized #113152 (0x5062…70ed), ⬇ 348.1kiB/s ⬆ 2.3kiB/s    
2022-09-21 11:20:47 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWAnibsVN4yBcKNKnnRm8pxmHkEf3DUT65dpb3P252RpZK, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"    
2022-09-21 11:20:47 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWRdfuKqX8hULMZz521gdqZB2TXJjfrJE5FV71WiuAUrpk, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"    
2022-09-21 11:20:47 💔 Error importing block 0x31cba8aa187f41df3ee545f95e0227c0900e0d161c5fec51b0dc96c9d239565e: block has an unknown parent    
2022-09-21 11:20:47 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWBkwH8LfJsz48Q8LHXSQnuqKJK8YdoQokDeS9wQX1j8mm, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"    
2022-09-21 11:20:50 💤 Idle (1 peers), best: #113254 (0x9511…8c92), finalized #113152 (0x5062…70ed), ⬇ 239.8kiB/s ⬆ 3.7kiB/s  
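
The log excerpt above can be condensed with a short script. This is a minimal sketch that assumes the log line wording shown in this thread; `summarize` and `SAMPLE` are hypothetical names, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter, defaultdict

BAD_SIG = re.compile(
    r"Verification failed for block (0x[0-9a-f]+) received from peer: (\w+)"
)
UNKNOWN_PARENT = re.compile(
    r"Error importing block (0x[0-9a-f]+): block has an unknown parent"
)
BEST = re.compile(r"best: #(\d+)")


def summarize(lines):
    """Summarize sync failures found in node log lines."""
    bad_sig_peers = defaultdict(set)  # block hash -> peers that served it
    unknown_parent = Counter()        # block hash -> number of rejections
    last_best = None                  # last reported best block number
    for line in lines:
        if m := BAD_SIG.search(line):
            bad_sig_peers[m.group(1)].add(m.group(2))
        elif m := UNKNOWN_PARENT.search(line):
            unknown_parent[m.group(1)] += 1
        elif m := BEST.search(line):
            last_best = int(m.group(1))
    return {
        "last_best": last_best,
        "bad_signature": {h: sorted(p) for h, p in bad_sig_peers.items()},
        "unknown_parent_blocks": len(unknown_parent),
    }


# A few lines copied from the excerpt above.
SAMPLE = [
    "2022-09-21 11:20:40 ⚙️  Syncing 30.8 bps, target=#2416744 (5 peers), best: #113189 (0x7100…bb96), finalized #113152 (0x5062…70ed), ⬇ 194.5kiB/s ⬆ 3.9kiB/s",
    '2022-09-21 11:20:42 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWAnibsVN4yBcKNKnnRm8pxmHkEf3DUT65dpb3P252RpZK, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"',
    "2022-09-21 11:20:42 💔 Error importing block 0x0a43d051e38aafd6b414911dd8154e2827cc17c1170b9ac5bd7a0dafea9eaef0: block has an unknown parent",
    '2022-09-21 11:20:42 💔 Verification failed for block 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3 received from peer: 12D3KooWRdfuKqX8hULMZz521gdqZB2TXJjfrJE5FV71WiuAUrpk, "Bad signature on 0xa06d1231c7ad3403216a427883e18bccc95e461c902ff430f13543ce7c11f5d3"',
    "2022-09-21 11:20:45 💤 Idle (1 peers), best: #113254 (0x9511…8c92), finalized #113152 (0x5062…70ed), ⬇ 348.1kiB/s ⬆ 2.3kiB/s",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

In the full excerpt, every "Bad signature" line refers to the same block hash, served by four different peers, and the node then sits idle at best #113254. That pattern points toward local block verification rather than a single misbehaving peer, though the log alone does not prove it.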

@DylanVerstraete
Contributor

Found this issue, looks like polkadot had the same issue at some point:

paritytech/polkadot#3089

Will try some earlier version of tfchain and try to sync further

@DylanVerstraete
Contributor

Trying to sync with an earlier version on a directory that was created with tfchain 2.1.0 does not work.

I will try to sync a node from tfchain 1.12.1

@DylanVerstraete
Contributor

DylanVerstraete commented Sep 21, 2022

-> I can use both 1.12.3 and 2.1.0 to sync to height: 113254
-> 2.1.0 stops after height 113254 was reached, cannot fall back to 1.12.3 due to incompatible database structure
-> 1.12.3 can be used to sync from 0 to ??? height (to be checked)

@LeeSmet started syncing a node on devnet from version 1.12.3, will update with the height it reaches

note: 1.12.3 has incompatible devnet chainspecs, something went wrong on that release but it was fixed in: #368

@DylanVerstraete DylanVerstraete self-assigned this Sep 22, 2022
@DylanVerstraete DylanVerstraete added the type_bug Something isn't working label Sep 22, 2022
@DylanVerstraete
Contributor

DylanVerstraete commented Sep 22, 2022

Okay more progress on investigating what happened:

Syncing with version 2.1.0 fails on both devnet and testnet. In both cases the sync stops at a block where a new validator was added:

dev: https://polkadot.js.org/apps/?rpc=wss%3A%2F%2Ftfchain.dev.grid.tf#/explorer/query/113254
test: https://polkadot.js.org/apps/?rpc=wss%3A%2F%2Ftfchain.test.grid.tf#/explorer/query/362418

It seems it can still import this block but the next one fails. I also see the following in the subsequent block (on both networks):

image

For some reason in the logs we see consensus showing up two times, aura and FRNK?
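
For context on the two names: consensus engine IDs in a block header are 4 ASCII bytes, and "FRNK" is the ID Substrate uses for GRANDPA, while "aura" is the ID for Aura. A header that carries both is therefore likely to hold an Aura item plus a GRANDPA item. Below is a minimal sketch for listing them from a header fetched over JSON-RPC. `chain_getBlockHash` and `chain_getHeader` are standard Substrate RPC methods; the variant bytes in `KINDS` are an assumption about the digest item encoding of the node version in use, and `describe_log` and `rpc_payload` are hypothetical helper names.

```python
import json

# 4-byte engine IDs as they appear inside digest items.
ENGINES = {b"aura": "Aura", b"FRNK": "GRANDPA"}

# Leading byte of a digest item that carries an engine ID.
# Assumption: this matches the DigestItem encoding of the node in use.
KINDS = {4: "Consensus", 5: "Seal", 6: "PreRuntime"}


def rpc_payload(method, params, req_id=1):
    """Build a JSON-RPC 2.0 request body for a Substrate node."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}
    )


def describe_log(hex_log):
    """Return (kind, engine) for one hex-encoded digest log, or None."""
    raw = bytes.fromhex(hex_log[2:] if hex_log.startswith("0x") else hex_log)
    if len(raw) < 5 or raw[0] not in KINDS:
        return None  # no engine ID in this item
    engine = raw[1:5]
    return KINDS[raw[0]], ENGINES.get(engine, engine.decode("ascii", "replace"))


if __name__ == "__main__":
    # Request bodies to send to the node's RPC endpoint, in this order.
    print(rpc_payload("chain_getBlockHash", [113254]))
    print(rpc_payload("chain_getHeader", ["<hash returned by the first call>"]))

    # Hypothetical digest logs with the payload bytes omitted, for illustration.
    logs = ["0x06" + b"aura".hex(), "0x04" + b"FRNK".hex(), "0x05" + b"aura".hex()]
    for log in logs:
        print(describe_log(log))
```

With a real header, the logs to pass in are the hex strings under `digest.logs` in the `chain_getHeader` result.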

@DylanVerstraete
Contributor

related issue found: paritytech/substrate#10103

@DylanVerstraete
Contributor

Fixed in #462

@DylanVerstraete DylanVerstraete added this to the 2.1.1 milestone Sep 26, 2022
@DylanVerstraete
Contributor

Seems like it is still an issue. I will do some more tests locally

@DylanVerstraete DylanVerstraete modified the milestones: 2.1.1, 2.2.0 Oct 3, 2022
@DylanVerstraete
Contributor

@coesensbert this is resolved, right?

@coesensbert
Contributor Author

Yes, other images worked, for example:

dylanverstraete/tfchain:1.12.3-fix
dylanverstraete/tfchain:2.1.1

https://docs.grid.tf/threefold/itenv_threefold_main/src/branch/master/Kubernetes-clusters/hagrid-dev/applications/tfchain/Adding-validators.md#deploy-via-docker
