Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: monitor wallet peers #1445

Merged
merged 1 commit into from
Feb 11, 2022
Merged

client: monitor wallet peers #1445

merged 1 commit into from
Feb 11, 2022

Conversation

chappjc
Copy link
Member

@chappjc chappjc commented Jan 24, 2022

This adds a PeersChange callback to client/asset.WalletConfig, similar to TipChange, for backends to report changes to the network peer count. The backends now implement a monitorPeers goroutine similar to monitorBlocks but checking and reporting on peer count changes. Any changes in peer count result in a wallet state note. Change in peer count from 0 to >0 also start a sync status monitoring goroutine, which runs until sync is restored.

This also updates each backend's SyncStatus method to ensure that the synced bool will never be true if there are peer count is 0.

Killing the "beta" node with the SPV beta wallet:

image

image

Starting up beta dcrd makes the signal-! icon change to a syncing arrow circle icon, which then goes away when it catches up with the network.

Killing "alpha" bitcoind with native btcwallet:

image

image

That also returns to normal after restarting alpha bitcoind.

Now the full-node wallets (still simnet):

wallet-peer-state-icons.mp4
2022-02-07 11:04:07.769 [ERR] CORE[dcr]: Failed to get peer count: rawrequest error: -9: wallet.NetworkBackend: Decred network is unreachable
2022-02-07 11:04:07.769 [WRN] CORE: Wallet for asset dcr has zero network peers!
2022-02-07 11:04:07.771 [WRN] CORE: notify: |WARNING| (walletconfig) Wallet network issue - Decred wallet has no network peers!
2022-02-07 11:04:07.771 [TRC] CORE: notify: |DATA| (walletstate)
2022-02-07 11:04:16.004 [TRC] CORE: notify: |DATA| (epoch) - Index: 109616897
2022-02-07 11:04:18.586 [ERR] CORE: btc wallet is reporting a failed state: failed to get best block hash from btc node
...
2022-02-07 11:04:22.586 [ERR] CORE[btc]: Failed to get peer count: rawrequest error: Post "http://127.0.0.1:20556/wallet/": dial tcp 127.0.0.1:20556: connect: connection refused
2022-02-07 11:04:22.586 [WRN] CORE: Wallet for asset btc has zero network peers!
2022-02-07 11:04:22.587 [ERR] CORE: btc wallet is reporting a failed state: failed to get best block hash from btc node
2022-02-07 11:04:22.589 [WRN] CORE: notify: |WARNING| (walletconfig) Wallet network issue - Bitcoin wallet has no network peers!
2022-02-07 11:04:22.589 [TRC] CORE: notify: |DATA| (walletstate)
2022-02-07 11:04:23.587 [ERR] CORE: btc wallet is reporting a failed state: failed to get best block hash from btc node
...
<start back up the alpha dcrd and bitcoind>
...
2022-02-07 11:05:42.769 [TRC] CORE: New peer count for asset dcr: 2
2022-02-07 11:05:42.769 [TRC] CORE: notify: |DATA| (walletstate)
2022-02-07 11:05:45.772 [TRC] CORE: notify: |DATA| (walletstate)
2022-02-07 11:05:45.779 [TRC] CORE: notify: |DATA| (balance) balance updated
2022-02-07 11:05:45.779 [INF] CORE: Wallet synced for asset dcr
2022-02-07 11:05:46.004 [TRC] CORE: notify: |DATA| (epoch) - Index: 109616903
...
2022-02-07 11:06:22.588 [TRC] CORE: New peer count for asset btc: 1
2022-02-07 11:06:22.588 [TRC] CORE: notify: |DATA| (walletstate)
2022-02-07 11:06:25.592 [TRC] CORE: notify: |DATA| (walletstate)
2022-02-07 11:06:25.596 [TRC] CORE: notify: |DATA| (balance) balance updated
2022-02-07 11:06:25.596 [INF] CORE: Wallet synced for asset btc

@chappjc chappjc force-pushed the monitor-peers branch 2 times, most recently from 0b49f8d to e231a12 Compare January 25, 2022 21:26
Comment on lines +1251 to +1277
timeDiff := time.Now().Unix() - int64(bh.Time)
if timeDiff > dexeth.MaxBlockInterval && eth.net != dex.Simnet {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoeGruffins I noticed this was dividing now in seconds by a thousand before comparing with the header time in seconds, so in practice this check was almost impossible to fail. I'm pretty sure 180 seconds since last block in our simnet workflows is gonna be too tight so I'm excluding simnet here. Note that the unit tests leave net as 0 (mainnet), so we're still testing the conditions.

select {
case <-ticker.C:
if c.walletCheckAndNotify(wallet) {
return
Copy link
Member Author

@chappjc chappjc Feb 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not new behavior (see the old connectWallet method), but in this case the canceler goroutine above is stuck on Wait until the wallet shuts down even though the sync status loop is done. Arguably a minor goroutine leak, but really just excessive lifetime. No issue IMO, but worth noting.

Copy link
Member

@buck54321 buck54321 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iffy about the pattern used in loadWallet, but otherwise g2g.

@@ -1871,7 +1860,7 @@ func (c *Core) loadWallet(dbWallet *db.Wallet) (*xcWallet, error) {

// Construct the unconnected xcWallet.
contractLockedAmt, orderLockedAmt := c.lockedAmounts(assetID)
return &xcWallet{
wallet = &xcWallet{ // captured by the PeersChange closure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// captured by the PeersChange closure

Assumption is that a wallet will never call its PeersChange callback in the constructor itself, otherwise we'd panic in (*Core).peerChange.

Comment on lines 146 to 158
// PeersChange is a function that will be called when the number of
// wallet/node peers changes, or the wallet fails to get the count.
PeersChange func(uint32)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per other comment, it's important that PeersChange is not called in the constructor (actually a little fuzzier than that). It's presumably not an issue since we wouldn't have a peer change until Connect, but we're relying on the wallet to get this right.

If you wanna roll as-is, please comment here on the importance of not calling this method during initialization.

Comment on lines +1929 to +1940
go func() {
if wallet.connector.On() {
wallet.connector.Wait()
}
cancel()
}()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me want a Done() <-chan struct{} on the ConnectionMaster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing a Done() method in #1474

client/core/core_test.go Outdated Show resolved Hide resolved
@@ -1860,6 +1846,9 @@ func (c *Core) loadWallet(dbWallet *db.Wallet) (*xcWallet, error) {
// of deadlocking a Core method that calls Wallet.Disconnect.
go c.tipChange(assetID, err)
},
PeersChange: func(numPeers uint32) {
go c.peerChange(wallet, numPeers)
Copy link
Member Author

@chappjc chappjc Feb 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buck54321 I can put a wallet != nil check in here.
If the check is hit that would mean there's a data race though..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just pass an assetID and pull the wallet from the map?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had actually started that way. Will see why I veered away. Might be fine.

Copy link
Member Author

@chappjc chappjc Feb 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, because it's not in the wallets map until after successful connect and a variety of other operations depending on whos calling loadWallet. :(
I recall now being mystified why peerChanges sometimes didn't work.

Copy link
Member Author

@chappjc chappjc Feb 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potentially:

diff --git a/client/core/core.go b/client/core/core.go
index f81f9895..45ca809d 100644
--- a/client/core/core.go
+++ b/client/core/core.go
@@ -1834,6 +1834,7 @@ func (c *Core) assetDataDirectory(assetID uint32) string {
 // wallet. The returned wallet is running but not connected.
 func (c *Core) loadWallet(dbWallet *db.Wallet) (*xcWallet, error) {
        var wallet *xcWallet
+       var ready uint32 // in case asset.Open tries to call PeersChange before actually starting
        // Create the client/asset.Wallet.
        assetID := dbWallet.AssetID
        walletCfg := &asset.WalletConfig{
@@ -1847,6 +1848,9 @@ func (c *Core) loadWallet(dbWallet *db.Wallet) (*xcWallet, error) {
                        go c.tipChange(assetID, err)
                },
                PeersChange: func(numPeers uint32) {
+                       if atomic.LoadUint32(&ready) == 0 {
+                               return
+                       }
                        go c.peerChange(wallet, numPeers)
                },
                DataDir: c.assetDataDirectory(assetID),
@@ -1875,6 +1879,7 @@ func (c *Core) loadWallet(dbWallet *db.Wallet) (*xcWallet, error) {
                walletType: dbWallet.Type,
                traits:     asset.DetermineWalletTraits(w),
        }
+       atomic.StoreUint32(&ready, 1)
        return wallet, nil
 }

but... 🤮

@chappjc chappjc merged commit 14f1b48 into decred:master Feb 11, 2022
@chappjc chappjc deleted the monitor-peers branch February 11, 2022 18:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

wallet health check on each order placement need for constant node/wallet connectivity checks
2 participants