Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sampling issue #135

Open
Nashatyrev opened this issue Sep 12, 2024 · 2 comments
Open

Sampling issue #135

Nashatyrev opened this issue Sep 12, 2024 · 2 comments

Comments

@Nashatyrev
Copy link
Owner

Issues

1 issue: The slot couldn't be sampled for some reason
2 issue: The remote peer was banned because sent the block which we consider unavailable

Log

At some we couldn't sample block #1059.
We were retrying later but never succeed

[cl-1-teku-geth] 2024-09-12 12:41:15.819 INFO  - Availability check slot 1059 result: DataAndValidationResult{validationResult=NOT_AVAILABLE, dataSize=0, cause=Optional[java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present]}
[cl-1-teku-geth] 2024-09-12 12:41:25.082 INFO  - Availability check slot 1059 result: DataAndValidationResult{validationResult=NOT_AVAILABLE, dataSize=0, cause=Optional[java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present]}
[cl-1-teku-geth] 2024-09-12 12:45:12.531 INFO  - Availability check slot 1059 result: DataAndValidationResult{validationResult=NOT_AVAILABLE, dataSize=0, cause=Optional[java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present]}
[cl-1-teku-geth] 2024-09-12 12:45:17.633 INFO  - Availability check slot 1059 result: DataAndValidationResult{validationResult=NOT_AVAILABLE, dataSize=0, cause=Optional[java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present]}

When importing blocks we decided that the batch is invalid cause it contains NOT_AVAILABLE block which is considered invalid and the peer was significantly downscored

[cl-1-teku-geth] 2024-09-12 12:45:12.532 DEBUG - Failed to import batch EventThreadOnlyBatch{delegate=SyncSourceBatch{firstSlot=1043, lastSlot=1077, count=35, complete=true, requiredParent=9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa, firstBlock=f84355460fdca170c61b2b1252f4a836f7bee466b3f9c349f3009c1be206a24d (1043) (parent: 9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa), lastBlock=a9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d (1077)}}: FAILED_DATA_AVAILABILITY_CHECK_NOT_AVAILABLE
[cl-1-teku-geth] java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present
[skipped]
[cl-1-teku-geth] at tech.pegasys.teku.statetransition.datacolumns.DasSamplerBasic.lambda$checkDataAvailability$3(DasSamplerBasic.java:94) ~[teku-ethereum-statetransition-develop.jar:24.3.1+181-gc5065b695f]
[cl-1-teku-geth] ... 28 more
[cl-1-teku-geth] 2024-09-12 12:45:12.611 DEBUG - Marking batch EventThreadOnlyBatch{delegate=SyncSourceBatch{firstSlot=1043, lastSlot=1077, count=35, complete=true, requiredParent=9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa, firstBlock=f84355460fdca170c61b2b1252f4a836f7bee466b3f9c349f3009c1be206a24d (1043) (parent: 9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa), lastBlock=a9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d (1077)}
    } as invalid because it extends from an invalid block
[cl-1-teku-geth] 2024-09-12 12:45:12.612 DEBUG - Penalising peer DefaultEth2Peer{id=16Uiu2HAmEuEiFQKhoZs2SrTrsqaKTyVkD6Hcuj9DgemWn5nHi9bn, remoteStatus=Optional[PeerStatus{forkDigest=0xd84e2ee0, finalizedRoot=0x52a4cbd4ff41c70b95b63e959e7677a92f26b486579cd9bd229e9abb3fa8df7f, finalizedEpoch=31, headRoot=0xa9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d, headSlot=1077}]
    } for providing invalid batch data SyncSourceBatch{firstSlot=1043, lastSlot=1077, count=35, complete=true, requiredParent=9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa, firstBlock=f84355460fdca170c61b2b1252f4a836f7bee466b3f9c349f3009c1be206a24d (1043) (parent: 9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa), lastBlock=a9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d (1077)}
[cl-1-teku-geth] 2024-09-12 12:45:12.612 DEBUG - Requesting blocks for 35 slots starting at 1043 from peer DefaultEth2Peer{id=16Uiu2HAmEuEiFQKhoZs2SrTrsqaKTyVkD6Hcuj9DgemWn5nHi9bn, remoteStatus=Optional[PeerStatus{forkDigest=0xd84e2ee0, finalizedRoot=0x52a4cbd4ff41c70b95b63e959e7677a92f26b486579cd9bd229e9abb3fa8df7f, finalizedEpoch=31, headRoot=0xa9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d, headSlot=1077}]}
[cl-1-teku-geth] 2024-09-12 12:45:12.612 DEBUG - Sending request for 35 blocks
[cl-1-teku-geth] 2024-09-12 12:45:12.629 DEBUG - Marking 0 batches complete because block at slot 1043 is child of starting point
[cl-1-teku-geth] 2024-09-12 12:45:12.629 INFO  - Created availabilityChecker for slot 1059
[cl-1-teku-geth] 2024-09-12 12:45:12.629 INFO  - Starting data availability check for slot 1059
[cl-1-teku-geth] 2024-09-12 12:45:12.629 INFO  - Requested checkDataAvailability for 0x0b3a6ca1b6b3d348ea3460875ff15da2844e8ed2e0b5e7c32fca64816b1957ef(1059)
[cl-1-teku-geth] 2024-09-12 12:45:17.633 INFO  - Collected [empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty],[empty] DataColumnSidecars in checkDataAvailability for 0x0b3a6ca1b6b3d348ea3460875ff15da2844e8ed2e0b5e7c32fca64816b1957ef(1059)
[cl-1-teku-geth] 2024-09-12 12:45:17.633 INFO  - Availability check slot 1059 result: DataAndValidationResult{validationResult=NOT_AVAILABLE, dataSize=0, cause=Optional[java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present]}
[cl-1-teku-geth] 2024-09-12 12:45:17.633 DEBUG - sidecars validation result: DataAndValidationResult{validationResult=NOT_AVAILABLE, dataSize=0, cause=Optional[java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present]}

... and again

[cl-1-teku-geth] 2024-09-12 12:45:17.633 DEBUG - Failed to import batch EventThreadOnlyBatch{delegate=SyncSourceBatch{firstSlot=1043, lastSlot=1077, count=35, complete=true, requiredParent=9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa, firstBlock=f84355460fdca170c61b2b1252f4a836f7bee466b3f9c349f3009c1be206a24d (1043) (parent: 9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa), lastBlock=a9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d (1077)}}: FAILED_DATA_AVAILABILITY_CHECK_NOT_AVAILABLE
[cl-1-teku-geth] java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present
[skipped]
[cl-1-teku-geth] Caused by: java.util.NoSuchElementException: No value present
[cl-1-teku-geth] at java.base/java.util.Optional.orElseThrow(Unknown Source) ~[?:?]
[cl-1-teku-geth] at tech.pegasys.teku.statetransition.datacolumns.DasSamplerBasic.lambda$checkDataAvailability$3(DasSamplerBasic.java:94) ~[teku-ethereum-statetransition-develop.jar:24.3.1+181-gc5065b695f]
[cl-1-teku-geth] ... 28 more

[cl-1-teku-geth] 2024-09-12 12:45:17.635 DEBUG - Marking batch EventThreadOnlyBatch
  {delegate=SyncSourceBatch{firstSlot=1043, lastSlot=1077, count=35, complete=true, requiredParent=9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa,  firstBlock=f84355460fdca170c61b2b1252f4a836f7bee466b3f9c349f3009c1be206a24d (1043) (parent: 9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa), lastBlock=a9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d (1077)}
    } as invalid because it extends from an invalid block
[cl-1-teku-geth] 2024-09-12 12:45:17.635 DEBUG - Penalising peer 
   DefaultEth2Peer{id=16Uiu2HAmEuEiFQKhoZs2SrTrsqaKTyVkD6Hcuj9DgemWn5nHi9bn, remoteStatus=Optional[PeerStatus{forkDigest=0xd84e2ee0, finalizedRoot=0x52a4cbd4ff41c70b95b63e959e7677a92f26b486579cd9bd229e9abb3fa8df7f, finalizedEpoch=31, headRoot=0xa9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d, headSlot=1077}]
   } for providing invalid batch data SyncSourceBatch{firstSlot=1043, lastSlot=1077, count=35, complete=true, requiredParent=9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa, firstBlock=f84355460fdca170c61b2b1252f4a836f7bee466b3f9c349f3009c1be206a24d (1043) (parent: 9c4f9e98dc2b9b2af84fb1f43607a9f04c17533e7dedf638d35b5b47d039ebfa), lastBlock=a9a71628c3697312d4279df77aa21d363b43999df64f8b0f4680e289332b8a2d (1077)}
[cl-1-teku-geth] 2024-09-12 12:45:17.635 INFO  - Disconnecting cleanly because REMOTE_FAULT from /ip4/172.16.0.13/tcp/9000

now the peer is probably banned

@Nashatyrev
Copy link
Owner Author

Sync is not capturing the latest slots because for some reasons firstIncompleteCustodySlot is not moving and is still 0
So basically if we are missing columns by gossip we never retrieve them later

[cl-1-teku-geth] 2024-09-12 12:40:47.423 INFO  - [nyota] DataColumnSidecarDB.addSidecar: new slot: 1057, prevSlot count: 128, total added: 134403, finalizedSlot: 0
[cl-1-teku-geth] 2024-09-12 12:40:56.004 INFO  - [nyota] DataCustodySync.fillUp: synced=0 pending=0, missingColumns=0([])
[cl-1-teku-geth] 2024-09-12 12:40:59.667 INFO  - [nyota] DataColumnSidecarDB.addSidecar: new slot: 1058, prevSlot count: 128, total added: 134530, finalizedSlot: 0
[cl-1-teku-geth] 2024-09-12 12:41:08.004 INFO  - [nyota] DataCustodySync.fillUp: synced=0 pending=0, missingColumns=0([])
[cl-1-teku-geth] 2024-09-12 12:41:20.004 INFO  - [nyota] DataCustodySync.fillUp: synced=0 pending=0, missingColumns=0([])
[cl-1-teku-geth] 2024-09-12 12:41:32.004 INFO  - [nyota] DataCustodySync.fillUp: synced=0 pending=0, missingColumns=0([])
[cl-1-teku-geth] 2024-09-12 12:41:38.614 INFO  - [nyota] DataColumnSidecarDB.addSidecar: new slot: 1061, prevSlot count: 128, total added: 134684, finalizedSlot: 0
[cl-1-teku-geth] 2024-09-12 12:41:44.004 INFO  - [nyota] DataCustodySync.fillUp: synced=0 pending=0, missingColumns=0([])
[cl-1-teku-geth] 2024-09-12 12:41:50.415 INFO  - [nyota] DataColumnSidecarDB.addSidecar: new slot: 1062, prevSlot count: 128, total added: 134796, finalizedSlot: 0

@Nashatyrev
Copy link
Owner Author

The peer can't sync from the very beginning due to failed sampling:

2024-09-13 17:06:45.395+04:00 | forkchoice-async-0 | DEBUG | BatchImporter | Failed to import batch EventThreadOnlyBatch{delegate=
    SyncSourceBatch{firstSlot=1, lastSlot=32, count=32, complete=true, requiredParent=c88f63b75a8780fe0f1ef720150256a5d1226ab3cf376a4586f67b449752a7b3, firstBlock=1b39ac777f6bb62dc3b200adec33b7aeb70d2435e885bf57f8dad5ab2a8b2cc2 (2) (parent: c88f63b75a8780fe0f1ef720150256a5d1226ab3cf376a4586f67b449752a7b3), lastBlock=77bf2eea9555b78a34d39b0b08e95c18c12e7d1a3665636581dea28c464a3663 (30)}}: 
    FAILED_DATA_AVAILABILITY_CHECK_NOT_AVAILABLE
    java.util.concurrent.CompletionException: java.util.NoSuchElementException: No value present

Also it assumes remote peer sending invalid data:

2024-09-13 17:06:45.396+04:00 | sync-async-0 | DEBUG | PeerScoringConflictResolutionStrategy | Penalising peer DefaultEth2Peer{id=16Uiu2HAkv2yYzHSyaWM8yNhQKtxdP18v6LQNRThRCgqiipo2bvFe, remoteStatus=Optional[PeerStatus{forkDigest=0x300c9ac3, finalizedRoot=0x77bf2eea9555b78a34d39b0b08e95c18c12e7d1a3665636581dea28c464a3663, finalizedEpoch=4, headRoot=0x883ffcf5b2981e58b0d3e8e9893f3be4dba898f813fd5e961c4b49a186e51747, headSlot=47}]
    } for providing invalid batch data SyncSourceBatch{firstSlot=1, lastSlot=32, count=32, complete=true, requiredParent=c88f63b75a8780fe0f1ef720150256a5d1226ab3cf376a4586f67b449752a7b3, firstBlock=1b39ac777f6bb62dc3b200adec33b7aeb70d2435e885bf57f8dad5ab2a8b2cc2 (2) (parent: c88f63b75a8780fe0f1ef720150256a5d1226ab3cf376a4586f67b449752a7b3), lastBlock=77bf2eea9555b78a34d39b0b08e95c18c12e7d1a3665636581dea28c464a3663 (30)}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant