[R4R] fix: graceful shutdown bug #509

Merged 2 commits on Nov 8, 2021

@SolidityGo SolidityGo commented Nov 3, 2021

fix: graceful shutdown bug

Description

  1. Call delete(ps.diffWait, id) when the handshake fails in handler_diff.go

Rationale

goroutine 21121611 [chan receive, 9 minutes]:
github.com/ethereum/go-ethereum/eth.(*peerSet).waitDiffExtension(0xc0027f8230, 0xc0d4793c00, 0xc266559f40, 0x0, 0x0)
	github.com/ethereum/go-ethereum/eth/peerset.go:206 +0x20c
github.com/ethereum/go-ethereum/eth.(*handler).runEthPeer(0xc0002990e0, 0xc0d4793c00, 0xc207a63a10, 0x0, 0x0)
	github.com/ethereum/go-ethereum/eth/handler.go:261 +0x1db
github.com/ethereum/go-ethereum/eth.(*ethHandler).RunPeer(0xc0002990e0, 0xc0d4793c00, 0xc207a63a10, 0xc0d1709540, 0x7f7d641990e0)
	github.com/ethereum/go-ethereum/eth/handler_eth.go:46 +0x3f
github.com/ethereum/go-ethereum/eth/protocols/eth.MakeProtocols.func1(0xc24b3a55c0, 0x1b02cb8, 0xc0d1709540, 0x0, 0x0)
	github.com/ethereum/go-ethereum/eth/protocols/eth/handler.go:117 +0x11a
github.com/ethereum/go-ethereum/p2p.(*Peer).startProtocols.func1(0xc24b3a55c0, 0xc0d1709540, 0x1b02cb8, 0xc0d1709540)
	github.com/ethereum/go-ethereum/p2p/peer.go:396 +0x98
created by github.com/ethereum/go-ethereum/p2p.(*Peer).startProtocols
	github.com/ethereum/go-ethereum/p2p/peer.go:394 +0x205

The stack trace above appears many times when the node runs stop(): the receive on the wait channel in waitDiffExtension never returns.

In runEthPeer of eth/handler.go, if the peer has a diff extension, the handler waits for it to connect through the peerset.diffWait channel.
While RunPeer of eth/handler_diff.go runs, if peer.Handshake succeeds, the waiter on peerset.diffWait[id] is released and the entry is removed. If it does not succeed, waitDiffExtension keeps waiting on peerset.diffWait[id], which prevents a graceful shutdown.

Example

add an example CLI or API response...

Changes

Notable changes:

  • add each change in a bullet point here
  • ...

Preflight checks

  • build passed (make build)
  • tests passed (make test)
  • manual transaction test passed

Already reviewed by

...

Related issues

#479

@@ -33,6 +33,15 @@ func (h *diffHandler) Chain() *core.BlockChain { return h.chain }
// RunPeer is invoked when a peer joins on the `diff` protocol.
func (h *diffHandler) RunPeer(peer *diff.Peer, hand diff.Handler) error {
	if err := peer.Handshake(h.diffSync); err != nil {
		// ensure that waitDiffExtension receives the exit signal normally
Contributor

need to add lock for peers

Contributor Author

fixed

			delete(ps.diffWait, id)
			wait <- peer
		}
		ps.lock.Unlock()
		return err
Collaborator

What is the error that you observe?


3 participants