PeerManager failed to connect to subnet peers #3940

Closed
twoeths opened this issue Apr 20, 2022 · 6 comments · Fixed by #3955 or #3956
Labels
scope-profitability Issues to directly improve validator performance and its profitability.

Comments

Contributor

twoeths commented Apr 20, 2022

Describe the bug

On hetzner-c0, an attestation was missed after I deployed master there:

Apr-20 02:54:24.422[]                error: Error publishing attestations slot=3633270 Internal Server Error: PublishError.InsufficientPeers
Error: Internal Server Error: PublishError.InsufficientPeers
    at HttpClient.request (/usr/app/node_modules/@chainsafe/lodestar-api/src/client/utils/httpClient.ts:109:15)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at HttpClient.json (/usr/app/node_modules/@chainsafe/lodestar-api/src/client/utils/httpClient.ts:71:12)
    at Object.request [as submitPoolAttestations] (/usr/app/node_modules/@chainsafe/lodestar-api/src/client/utils/client.ts:74:19)
    at AttestationService.signAndPublishAttestations (/usr/app/node_modules/@chainsafe/lodestar-validator/src/services/attestation.ts:156:9)
    at AttestationService.runAttestationTasks (/usr/app/node_modules/@chainsafe/lodestar-validator/src/services/attestation.ts:69:5)

PeerManager is supposed to find subnet peers before validators submit subnet attestations, but it failed to do so.

Expected behavior

PeerManager should be able to find and connect to subnet peers before validators submit subnet attestations
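
As a rough illustration of the expected behavior, the check below finds subnets with upcoming attestation duties that have too few connected peers, so that peer discovery can be requested for them ahead of time. This is only a sketch under assumptions: `subnetsNeedingPeers`, `peersPerSubnet`, and `minPeersPerSubnet` are hypothetical names, not Lodestar's actual PeerManager API.

```typescript
/**
 * Return the subnets that have an upcoming duty but fewer than
 * `minPeersPerSubnet` connected peers. All names here are hypothetical.
 */
function subnetsNeedingPeers(
  upcomingDutySubnets: number[],
  peersPerSubnet: Map<number, number>,
  minPeersPerSubnet = 1
): number[] {
  return upcomingDutySubnets.filter(
    // A subnet missing from the map is treated as having 0 peers.
    (subnet) => (peersPerSubnet.get(subnet) ?? 0) < minPeersPerSubnet
  );
}

// Made-up example data: subnet 3 has two peers, subnet 17 has none,
// and subnet 42 is not tracked at all.
const dutySubnets = [3, 17, 42];
const peerCounts = new Map<number, number>([
  [3, 2],
  [17, 0],
]);

console.log(subnetsNeedingPeers(dutySubnets, peerCounts)); // [ 17, 42 ]
```

If the returned list is non-empty shortly before the duty slot, publishing would be expected to fail with an error like the `PublishError.InsufficientPeers` above.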

@twoeths twoeths self-assigned this Apr 22, 2022
Contributor Author

twoeths commented Apr 22, 2022

part of #3527

Contributor Author

twoeths commented Apr 22, 2022

Some of our peers have 0 long-lived subnets (see the screenshot below); we should prefer disconnecting those peers on each heartbeat.

Screen Shot 2022-04-22 at 10 14 46

Also, just double-check that Lighthouse does that.
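
The preference described in this comment could look roughly like the following. It is a minimal sketch, not the code that landed in the linked fixes: `PeerInfo`, `longLivedSubnets`, `selectPeersToDisconnect`, and `targetPeers` are hypothetical names chosen for illustration.

```typescript
// Hypothetical peer summary; not Lodestar's actual type.
interface PeerInfo {
  id: string;
  // Number of long-lived attestation subnets the peer advertises.
  longLivedSubnets: number;
}

/**
 * When connected to more than `targetPeers` peers, choose the excess peers
 * to disconnect, dropping those with the fewest long-lived subnets first.
 */
function selectPeersToDisconnect(peers: PeerInfo[], targetPeers: number): string[] {
  const excess = Math.max(0, peers.length - targetPeers);
  if (excess === 0) return [];
  return [...peers] // copy so the caller's array is not reordered
    .sort((a, b) => a.longLivedSubnets - b.longLivedSubnets)
    .slice(0, excess)
    .map((peer) => peer.id);
}

// Made-up example data.
const connected: PeerInfo[] = [
  {id: "peerA", longLivedSubnets: 2},
  {id: "peerB", longLivedSubnets: 0},
  {id: "peerC", longLivedSubnets: 1},
  {id: "peerD", longLivedSubnets: 0},
];

console.log(selectPeersToDisconnect(connected, 2)); // [ 'peerB', 'peerD' ]
```

Running this once per heartbeat would gradually replace peers that contribute no long-lived subnets with peers that do, as long as discovery keeps supplying new candidates.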

Contributor Author

twoeths commented Apr 24, 2022

grep -e "PublishError.InsufficientPeers" -rn validator*.log | wc -l returns 100 after 4 days

Also, many unaggregated attestations are sent to only 1 peer:
grep -e "sentPeers=1" -rn beacon*.log | wc -l returns 786

Screen Shot 2022-04-24 at 16 51 36
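
Note that a plain substring search for `sentPeers=1` also matches values such as `sentPeers=10` or `sentPeers=12`, so the count above is an upper bound. A small sketch that tallies the exact `sentPeers` values is shown below; it assumes only that log lines contain a `sentPeers=<number>` token, and the sample lines are made up rather than taken from real logs.

```typescript
/**
 * Count how many log lines report each `sentPeers=<n>` value.
 * Lines without a `sentPeers=` token are ignored.
 */
function sentPeersHistogram(logText: string): Map<number, number> {
  const histogram = new Map<number, number>();
  for (const line of logText.split("\n")) {
    const match = /sentPeers=(\d+)/.exec(line);
    if (!match) continue;
    const sentPeers = Number(match[1]);
    histogram.set(sentPeers, (histogram.get(sentPeers) ?? 0) + 1);
  }
  return histogram;
}

// Made-up sample lines, for illustration only.
const sampleLog = [
  "example line sentPeers=1",
  "example line sentPeers=12",
  "example line sentPeers=1",
  "unrelated line",
].join("\n");

const histogram = sentPeersHistogram(sampleLog);
console.log(histogram.get(1)); // 2
console.log(histogram.get(12)); // 1
```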

Contributor Author

twoeths commented Apr 24, 2022

The issue is not reproducible on prater.
Screen Shot 2022-04-24 at 16 58 21

Contributor Author

twoeths commented May 2, 2022

The missed attestations on hetzner-c0 correlate strongly with the low long-lived subnets metric:

Screen Shot 2022-05-02 at 10 43 40

Screen Shot 2022-05-02 at 10 44 24

@dapplion dapplion added the scope-profitability Issues to directly improve validator performance and its profitability. label May 10, 2022
Contributor Author

twoeths commented May 23, 2022

#3955 and #3956 resolved the issue

@twoeths twoeths closed this as completed May 23, 2022