gossip attestation validation: handle no committee error #4589

Merged
dapplion merged 3 commits into unstable from tuyen/gossip-attestation-log
Oct 3, 2022

Conversation

twoeths
Contributor

@twoeths twoeths commented Sep 23, 2022

Motivation

When an attestation comes from an out-of-sync node, our logs look like this:

Aug-11 02:29:41.278[NETWORK]         error: Gossip validation beacon_attestation threw a non-GossipActionError  Requesting slot committee out of range epoch: 113742 current: 113740
Error: Requesting slot committee out of range epoch: 113742 current: 113740
    at EpochContext.getShufflingAtEpoch (file:///usr/app/node_modules/@lodestar/state-transition/src/cache/epochContext.ts:720:13)
    at EpochContext.getShufflingAtSlot (file:///usr/app/node_modules/@lodestar/state-transition/src/cache/epochContext.ts:709:17)
    at getCommitteeIndices (file:///usr/app/node_modules/@lodestar/beacon-node/src/chain/validation/attestation.ts:287:56)
    at validateGossipAttestation (file:///usr/app/node_modules/@lodestar/beacon-node/src/chain/validation/attestation.ts:91:28)
    at runMicrotasks (<anonymous>)

Two issues here:

  • the log looks dangerous to the user
  • we did not handle the no-committee error well: we treated this case as IGNORE instead of REJECT

Description

  • Handle the no-committee scenario in gossip attestation/aggregateAndProof validation: return REJECT to gossipsub
  • Correct the log level in case we get non-attestation errors
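
As a rough illustration of the two points above, the sketch below maps a thrown validation error to a gossipsub action and a log level. It is not the Lodestar implementation: `GossipAction`, `GossipActionError`, and `classifyValidationError` are hypothetical stand-ins defined here, and the choice of `debug` for expected errors is an assumption.

```typescript
// Hypothetical stand-ins for illustration; not the actual Lodestar types.
const GossipAction = {IGNORE: "IGNORE", REJECT: "REJECT"} as const;
type GossipAction = (typeof GossipAction)[keyof typeof GossipAction];

class GossipActionError extends Error {
  readonly action: GossipAction;

  constructor(action: GossipAction, message: string) {
    super(message);
    // Keep instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, GossipActionError.prototype);
    this.action = action;
  }
}

type LogLevel = "debug" | "error";

// Map a thrown validation error to a gossipsub action and a log level.
function classifyValidationError(e: unknown): {action: GossipAction; logLevel: LogLevel} {
  if (e instanceof GossipActionError) {
    // Expected outcome of validation (for example, no committee for the
    // attestation's slot): use the action carried by the error, log quietly.
    return {action: e.action, logLevel: "debug"};
  }
  // Anything else is unexpected: drop the message without penalising the
  // peer, and keep the loud log level for real bugs.
  return {action: GossipAction.IGNORE, logLevel: "error"};
}
```

The sketch only takes the action from the error; which action the no-committee case should carry is a separate policy choice.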

Closes #4396

@twoeths twoeths requested a review from a team as a code owner September 23, 2022 08:09
@github-actions
Contributor

github-actions bot commented Sep 23, 2022

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 2f73da9 Previous: 3df7e2b Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 2.8634 ms/op 2.2460 ms/op 1.27
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 86.672 us/op 92.068 us/op 0.94
BLS verify - blst-native 2.1440 ms/op 2.7306 ms/op 0.79
BLS verifyMultipleSignatures 3 - blst-native 4.2756 ms/op 5.7843 ms/op 0.74
BLS verifyMultipleSignatures 8 - blst-native 9.1562 ms/op 12.182 ms/op 0.75
BLS verifyMultipleSignatures 32 - blst-native 33.225 ms/op 44.871 ms/op 0.74
BLS aggregatePubkeys 32 - blst-native 46.110 us/op 59.810 us/op 0.77
BLS aggregatePubkeys 128 - blst-native 178.67 us/op 241.96 us/op 0.74
getAttestationsForBlock 110.05 ms/op 109.45 ms/op 1.01
isKnown best case - 1 super set check 503.00 ns/op 522.00 ns/op 0.96
isKnown normal case - 2 super set checks 479.00 ns/op 516.00 ns/op 0.93
isKnown worse case - 16 super set checks 493.00 ns/op 524.00 ns/op 0.94
CheckpointStateCache - add get delete 11.342 us/op 11.490 us/op 0.99
validate gossip signedAggregateAndProof - struct 4.8699 ms/op 6.5009 ms/op 0.75
validate gossip attestation - struct 2.5521 ms/op 2.9810 ms/op 0.86
pickEth1Vote - no votes 2.6112 ms/op 2.5382 ms/op 1.03
pickEth1Vote - max votes 24.322 ms/op 22.973 ms/op 1.06
pickEth1Vote - Eth1Data hashTreeRoot value x2048 13.820 ms/op 14.796 ms/op 0.93
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 23.755 ms/op 24.046 ms/op 0.99
pickEth1Vote - Eth1Data fastSerialize value x2048 1.7748 ms/op 1.8909 ms/op 0.94
pickEth1Vote - Eth1Data fastSerialize tree x2048 16.479 ms/op 15.100 ms/op 1.09
bytes32 toHexString 1.2410 us/op 1.2980 us/op 0.96
bytes32 Buffer.toString(hex) 777.00 ns/op 845.00 ns/op 0.92
bytes32 Buffer.toString(hex) from Uint8Array 1.1180 us/op 1.1810 us/op 0.95
bytes32 Buffer.toString(hex) + 0x 820.00 ns/op 839.00 ns/op 0.98
Object access 1 prop 0.40700 ns/op 0.46700 ns/op 0.87
Map access 1 prop 0.34300 ns/op 0.36100 ns/op 0.95
Object get x1000 15.191 ns/op 16.743 ns/op 0.91
Map get x1000 0.93200 ns/op 0.95900 ns/op 0.97
Object set x1000 120.11 ns/op 124.94 ns/op 0.96
Map set x1000 78.031 ns/op 82.565 ns/op 0.95
Return object 10000 times 0.44060 ns/op 0.44550 ns/op 0.99
Throw Error 10000 times 8.0759 us/op 8.3197 us/op 0.97
enrSubnets - fastDeserialize 64 bits 3.0000 us/op 3.2380 us/op 0.93
enrSubnets - ssz BitVector 64 bits 758.00 ns/op 921.00 ns/op 0.82
enrSubnets - fastDeserialize 4 bits 424.00 ns/op 461.00 ns/op 0.92
enrSubnets - ssz BitVector 4 bits 746.00 ns/op 838.00 ns/op 0.89
prioritizePeers score -10:0 att 32-0.1 sync 2-0 109.43 us/op 113.49 us/op 0.96
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 147.18 us/op 154.39 us/op 0.95
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 264.05 us/op 295.08 us/op 0.89
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 642.20 us/op 643.44 us/op 1.00
prioritizePeers score 0:0 att 64-1 sync 4-1 520.39 us/op 666.38 us/op 0.78
RateTracker 1000000 limit, 1 obj count per request 207.02 ns/op 216.13 ns/op 0.96
RateTracker 1000000 limit, 2 obj count per request 152.37 ns/op 165.85 ns/op 0.92
RateTracker 1000000 limit, 4 obj count per request 128.19 ns/op 134.71 ns/op 0.95
RateTracker 1000000 limit, 8 obj count per request 112.76 ns/op 123.55 ns/op 0.91
RateTracker with prune 5.2880 us/op 5.8760 us/op 0.90
array of 16000 items push then shift 5.0960 us/op 5.5728 us/op 0.91
LinkedList of 16000 items push then shift 18.744 ns/op 21.545 ns/op 0.87
array of 16000 items push then pop 265.99 ns/op 259.75 ns/op 1.02
LinkedList of 16000 items push then pop 18.508 ns/op 19.202 ns/op 0.96
array of 24000 items push then shift 7.6672 us/op 7.8835 us/op 0.97
LinkedList of 24000 items push then shift 20.861 ns/op 19.355 ns/op 1.08
array of 24000 items push then pop 230.03 ns/op 228.56 ns/op 1.01
LinkedList of 24000 items push then pop 17.772 ns/op 19.107 ns/op 0.93
intersect bitArray bitLen 8 11.972 ns/op 12.649 ns/op 0.95
intersect array and set length 8 200.07 ns/op 204.16 ns/op 0.98
intersect bitArray bitLen 128 64.725 ns/op 75.093 ns/op 0.86
intersect array and set length 128 2.3377 us/op 2.4581 us/op 0.95
Buffer.concat 32 items 2.2250 ns/op 2.7230 ns/op 0.82
pass gossip attestations to forkchoice per slot 4.6191 ms/op 4.6031 ms/op 1.00
computeDeltas 5.5493 ms/op 5.5480 ms/op 1.00
computeProposerBoostScoreFromBalances 826.85 us/op 900.85 us/op 0.92
altair processAttestation - 250000 vs - 7PWei normalcase 4.9763 ms/op 5.1312 ms/op 0.97
altair processAttestation - 250000 vs - 7PWei worstcase 7.3482 ms/op 7.5932 ms/op 0.97
altair processAttestation - setStatus - 1/6 committees join 233.69 us/op 250.30 us/op 0.93
altair processAttestation - setStatus - 1/3 committees join 466.46 us/op 463.97 us/op 1.01
altair processAttestation - setStatus - 1/2 committees join 677.26 us/op 642.07 us/op 1.05
altair processAttestation - setStatus - 2/3 committees join 871.79 us/op 917.85 us/op 0.95
altair processAttestation - setStatus - 4/5 committees join 1.2375 ms/op 1.2752 ms/op 0.97
altair processAttestation - setStatus - 100% committees join 1.5395 ms/op 1.4607 ms/op 1.05
altair processBlock - 250000 vs - 7PWei normalcase 33.228 ms/op 33.397 ms/op 0.99
altair processBlock - 250000 vs - 7PWei normalcase hashState 42.897 ms/op 45.110 ms/op 0.95
altair processBlock - 250000 vs - 7PWei worstcase 104.90 ms/op 121.18 ms/op 0.87
altair processBlock - 250000 vs - 7PWei worstcase hashState 113.89 ms/op 126.72 ms/op 0.90
phase0 processBlock - 250000 vs - 7PWei normalcase 4.2988 ms/op 4.6059 ms/op 0.93
phase0 processBlock - 250000 vs - 7PWei worstcase 54.529 ms/op 64.474 ms/op 0.85
altair processEth1Data - 250000 vs - 7PWei normalcase 990.33 us/op 1.4510 ms/op 0.68
Tree 40 250000 create 909.12 ms/op 1.0275 s/op 0.88
Tree 40 250000 get(125000) 293.82 ns/op 323.55 ns/op 0.91
Tree 40 250000 set(125000) 2.9522 us/op 3.6700 us/op 0.80
Tree 40 250000 toArray() 35.169 ms/op 36.958 ms/op 0.95
Tree 40 250000 iterate all - toArray() + loop 35.889 ms/op 37.664 ms/op 0.95
Tree 40 250000 iterate all - get(i) 125.65 ms/op 134.87 ms/op 0.93
MutableVector 250000 create 19.192 ms/op 18.441 ms/op 1.04
MutableVector 250000 get(125000) 13.232 ns/op 14.361 ns/op 0.92
MutableVector 250000 set(125000) 811.83 ns/op 935.17 ns/op 0.87
MutableVector 250000 toArray() 7.2988 ms/op 7.9912 ms/op 0.91
MutableVector 250000 iterate all - toArray() + loop 7.2256 ms/op 8.7956 ms/op 0.82
MutableVector 250000 iterate all - get(i) 3.2633 ms/op 4.1445 ms/op 0.79
Array 250000 create 6.6429 ms/op 8.3699 ms/op 0.79
Array 250000 clone - spread 3.6091 ms/op 3.3681 ms/op 1.07
Array 250000 get(125000) 1.4420 ns/op 1.4410 ns/op 1.00
Array 250000 set(125000) 1.4890 ns/op 1.4070 ns/op 1.06
Array 250000 iterate all - loop 130.81 us/op 149.10 us/op 0.88
effectiveBalanceIncrements clone Uint8Array 300000 184.93 us/op 269.63 us/op 0.69
effectiveBalanceIncrements clone MutableVector 300000 819.00 ns/op 924.00 ns/op 0.89
effectiveBalanceIncrements rw all Uint8Array 300000 262.26 us/op 307.41 us/op 0.85
effectiveBalanceIncrements rw all MutableVector 300000 189.49 ms/op 224.98 ms/op 0.84
phase0 afterProcessEpoch - 250000 vs - 7PWei 194.47 ms/op 222.30 ms/op 0.87
phase0 beforeProcessEpoch - 250000 vs - 7PWei 78.828 ms/op 73.979 ms/op 1.07
altair processEpoch - mainnet_e81889 633.83 ms/op 677.27 ms/op 0.94
mainnet_e81889 - altair beforeProcessEpoch 169.83 ms/op 193.99 ms/op 0.88
mainnet_e81889 - altair processJustificationAndFinalization 67.452 us/op 70.248 us/op 0.96
mainnet_e81889 - altair processInactivityUpdates 10.767 ms/op 11.942 ms/op 0.90
mainnet_e81889 - altair processRewardsAndPenalties 156.96 ms/op 104.25 ms/op 1.51
mainnet_e81889 - altair processRegistryUpdates 14.525 us/op 14.363 us/op 1.01
mainnet_e81889 - altair processSlashings 3.9800 us/op 4.0800 us/op 0.98
mainnet_e81889 - altair processEth1DataReset 4.2840 us/op 4.1780 us/op 1.03
mainnet_e81889 - altair processEffectiveBalanceUpdates 2.8473 ms/op 2.5248 ms/op 1.13
mainnet_e81889 - altair processSlashingsReset 25.328 us/op 26.531 us/op 0.95
mainnet_e81889 - altair processRandaoMixesReset 25.915 us/op 25.285 us/op 1.02
mainnet_e81889 - altair processHistoricalRootsUpdate 4.3730 us/op 4.2910 us/op 1.02
mainnet_e81889 - altair processParticipationFlagUpdates 15.188 us/op 15.797 us/op 0.96
mainnet_e81889 - altair processSyncCommitteeUpdates 3.3810 us/op 3.3910 us/op 1.00
mainnet_e81889 - altair afterProcessEpoch 198.76 ms/op 219.14 ms/op 0.91
phase0 processEpoch - mainnet_e58758 690.49 ms/op 645.40 ms/op 1.07
mainnet_e58758 - phase0 beforeProcessEpoch 292.93 ms/op 317.37 ms/op 0.92
mainnet_e58758 - phase0 processJustificationAndFinalization 61.088 us/op 63.789 us/op 0.96
mainnet_e58758 - phase0 processRewardsAndPenalties 151.72 ms/op 120.46 ms/op 1.26
mainnet_e58758 - phase0 processRegistryUpdates 32.135 us/op 35.085 us/op 0.92
mainnet_e58758 - phase0 processSlashings 3.3020 us/op 3.4890 us/op 0.95
mainnet_e58758 - phase0 processEth1DataReset 3.4450 us/op 3.6100 us/op 0.95
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 2.2977 ms/op 2.0624 ms/op 1.11
mainnet_e58758 - phase0 processSlashingsReset 18.029 us/op 16.432 us/op 1.10
mainnet_e58758 - phase0 processRandaoMixesReset 24.197 us/op 26.106 us/op 0.93
mainnet_e58758 - phase0 processHistoricalRootsUpdate 4.4330 us/op 4.2020 us/op 1.05
mainnet_e58758 - phase0 processParticipationRecordUpdates 23.324 us/op 24.633 us/op 0.95
mainnet_e58758 - phase0 afterProcessEpoch 161.61 ms/op 182.31 ms/op 0.89
phase0 processEffectiveBalanceUpdates - 250000 normalcase 2.3515 ms/op 2.4530 ms/op 0.96
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 3.0291 ms/op 3.4001 ms/op 0.89
altair processInactivityUpdates - 250000 normalcase 52.366 ms/op 53.676 ms/op 0.98
altair processInactivityUpdates - 250000 worstcase 62.408 ms/op 54.376 ms/op 1.15
phase0 processRegistryUpdates - 250000 normalcase 29.316 us/op 30.286 us/op 0.97
phase0 processRegistryUpdates - 250000 badcase_full_deposits 499.07 us/op 511.20 us/op 0.98
phase0 processRegistryUpdates - 250000 worstcase 0.5 257.00 ms/op 249.97 ms/op 1.03
altair processRewardsAndPenalties - 250000 normalcase 100.22 ms/op 109.89 ms/op 0.91
altair processRewardsAndPenalties - 250000 worstcase 150.06 ms/op 145.89 ms/op 1.03
phase0 getAttestationDeltas - 250000 normalcase 13.097 ms/op 14.704 ms/op 0.89
phase0 getAttestationDeltas - 250000 worstcase 13.952 ms/op 14.916 ms/op 0.94
phase0 processSlashings - 250000 worstcase 6.2004 ms/op 6.8320 ms/op 0.91
altair processSyncCommitteeUpdates - 250000 344.04 ms/op 349.07 ms/op 0.99
BeaconState.hashTreeRoot - No change 577.00 ns/op 781.00 ns/op 0.74
BeaconState.hashTreeRoot - 1 full validator 68.331 us/op 75.265 us/op 0.91
BeaconState.hashTreeRoot - 32 full validator 764.78 us/op 788.48 us/op 0.97
BeaconState.hashTreeRoot - 512 full validator 7.2234 ms/op 8.2665 ms/op 0.87
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 97.520 us/op 101.75 us/op 0.96
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 1.4531 ms/op 1.4323 ms/op 1.01
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 18.206 ms/op 17.409 ms/op 1.05
BeaconState.hashTreeRoot - 1 balances 70.614 us/op 71.979 us/op 0.98
BeaconState.hashTreeRoot - 32 balances 664.48 us/op 689.61 us/op 0.96
BeaconState.hashTreeRoot - 512 balances 6.9665 ms/op 5.9335 ms/op 1.17
BeaconState.hashTreeRoot - 250000 balances 100.73 ms/op 124.66 ms/op 0.81
aggregationBits - 2048 els - zipIndexesInBitList 32.080 us/op 37.103 us/op 0.86
regular array get 100000 times 55.395 us/op 57.614 us/op 0.96
wrappedArray get 100000 times 52.787 us/op 58.323 us/op 0.91
arrayWithProxy get 100000 times 33.248 ms/op 35.545 ms/op 0.94
ssz.Root.equals 539.00 ns/op 625.00 ns/op 0.86
byteArrayEquals 575.00 ns/op 628.00 ns/op 0.92
shuffle list - 16384 els 11.211 ms/op 12.655 ms/op 0.89
shuffle list - 250000 els 162.27 ms/op 184.01 ms/op 0.88
processSlot - 1 slots 17.046 us/op 17.653 us/op 0.97
processSlot - 32 slots 2.2455 ms/op 2.4162 ms/op 0.93
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 450.60 us/op 499.90 us/op 0.90
getCommitteeAssignments - req 1 vs - 250000 vc 4.8982 ms/op 5.5257 ms/op 0.89
getCommitteeAssignments - req 100 vs - 250000 vc 7.0408 ms/op 7.8395 ms/op 0.90
getCommitteeAssignments - req 1000 vs - 250000 vc 7.6545 ms/op 8.8183 ms/op 0.87
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 8.1000 ns/op 9.4800 ns/op 0.85
state getBlockRootAtSlot - 250000 vs - 7PWei 1.3152 us/op 1.3149 us/op 1.00
computeProposers - vc 250000 19.744 ms/op 21.581 ms/op 0.91
computeEpochShuffling - vc 250000 168.31 ms/op 190.54 ms/op 0.88
getNextSyncCommittee - vc 250000 323.10 ms/op 349.47 ms/op 0.92

by benchmarkbot/action

@wemeetagain
Member

I think the error happens when a node hasn't received blocks for an epoch but still manages to publish an attestation in the next epoch.

E.g., a node last received a block in epoch 1 and publishes an attestation in epoch 3.
Then, when we verify the attestation, we use regen to get the state at epoch 1 and can't get the shuffling for epoch 3.

In order to process the attestation, we could use regen.getBlockSlotState to dial the state to the right epoch.

But I don't think we should just reject, since this may be a valid attestation.
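
The scenario described above can be modelled with a small sketch. It assumes, as a simplification, that a state's epoch cache only serves shufflings for the previous, current, and next epoch; `EpochCacheSketch` and its fields are hypothetical, and only the error message format is taken from the log in this PR.

```typescript
const SLOTS_PER_EPOCH = 32; // mainnet preset value

interface Shuffling {
  epoch: number;
}

// Hypothetical, simplified model of an epoch cache that only holds
// shufflings for the previous, current and next epoch of its state.
class EpochCacheSketch {
  readonly currentEpoch: number;

  constructor(stateSlot: number) {
    this.currentEpoch = Math.floor(stateSlot / SLOTS_PER_EPOCH);
  }

  getShufflingAtEpoch(epoch: number): Shuffling {
    if (epoch < this.currentEpoch - 1 || epoch > this.currentEpoch + 1) {
      // Same message shape as the log line quoted in the PR description.
      throw new Error(
        `Requesting slot committee out of range epoch: ${epoch} current: ${this.currentEpoch}`
      );
    }
    return {epoch};
  }

  getShufflingAtSlot(slot: number): Shuffling {
    return this.getShufflingAtEpoch(Math.floor(slot / SLOTS_PER_EPOCH));
  }
}
```

With a state in epoch 113740, asking for epoch 113742 is two epochs ahead and falls outside the window, which matches the numbers in the logged error.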

@twoeths
Contributor Author

twoeths commented Sep 29, 2022

> In order to process the attestation, we could use regen.getBlockSlotState to dial the state to the right epoch.

Yeah, that's exactly what we did in the past. There's a performance concern with it, as we may need an extra epoch transition just to process an attestation, so we changed to just use the state at the block, which makes our node more stable. The main point of this issue is just the log.

> But I don't think we should just reject, since this may be a valid attestation.

Agree, we should not reject. I'll change it to IGNORE (the same behaviour as before).
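
To make the performance concern concrete, here is a minimal sketch that counts how many epoch transitions dialing a state forward would trigger. `epochTransitionsToReach` is a hypothetical helper, not a Lodestar function; for scale, the benchmark table in this PR reports altair processEpoch on mainnet_e81889 at roughly 634 ms/op.

```typescript
const SLOTS_PER_EPOCH = 32; // mainnet preset value

function computeEpochAtSlot(slot: number): number {
  return Math.floor(slot / SLOTS_PER_EPOCH);
}

// Hypothetical helper: number of epoch boundaries crossed when advancing a
// state from stateSlot to targetSlot. Each boundary costs one epoch transition.
function epochTransitionsToReach(stateSlot: number, targetSlot: number): number {
  if (targetSlot < stateSlot) {
    throw new Error("targetSlot must not be earlier than stateSlot");
  }
  return computeEpochAtSlot(targetSlot) - computeEpochAtSlot(stateSlot);
}
```

For the example in this thread (block in epoch 1, attestation in epoch 3), the count is 2, so a single attestation could cost two full epoch transitions.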

Contributor

@dapplion dapplion left a comment

Instead of adding a try/catch, I would prefer to make getShufflingAtEpoch, or a separate function, return null if the epoch is out of range. Then the validation code can throw the error directly if the returned value is null.
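
One way the suggestion above could look, as a sketch under stated assumptions: `getShufflingAtEpochOrNull`, `AttestationValidationError`, `getShufflingOrThrowIgnore`, and the `NO_COMMITTEE` code are all hypothetical names, and the shuffling store is reduced to a plain object keyed by epoch.

```typescript
// Hypothetical types and names for illustration only.
interface Shuffling {
  epoch: number;
}

type ShufflingsByEpoch = {[epoch: number]: Shuffling | undefined};

class AttestationValidationError extends Error {
  readonly action: "IGNORE" | "REJECT";
  readonly code: string;

  constructor(action: "IGNORE" | "REJECT", code: string) {
    super(code);
    // Keep instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, AttestationValidationError.prototype);
    this.action = action;
    this.code = code;
  }
}

// Returns null instead of throwing when no shuffling is available for the epoch.
function getShufflingAtEpochOrNull(shufflings: ShufflingsByEpoch, epoch: number): Shuffling | null {
  const shuffling = shufflings[epoch];
  return shuffling === undefined ? null : shuffling;
}

// Validation code checks for null and throws a typed error directly,
// so no try/catch around the lookup is needed.
function getShufflingOrThrowIgnore(shufflings: ShufflingsByEpoch, epoch: number): Shuffling {
  const shuffling = getShufflingAtEpochOrNull(shufflings, epoch);
  if (shuffling === null) {
    throw new AttestationValidationError("IGNORE", "NO_COMMITTEE");
  }
  return shuffling;
}
```

Returning null keeps the out-of-range case in the type signature, so callers have to handle it explicitly rather than relying on catching a generic `Error`.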

@dapplion dapplion enabled auto-merge (squash) October 3, 2022 06:51
@dapplion dapplion merged commit 07eb658 into unstable Oct 3, 2022
@dapplion dapplion deleted the tuyen/gossip-attestation-log branch October 3, 2022 07:07
Development

Successfully merging this pull request may close these issues.

Not able to validate attestation: Requesting slot committee out of range
3 participants