Significant increase in force closures after upgrade to LND 0.15 #6744

mcsnubbs · 2022-07-16T16:42:54Z

Background

Over the past 3 months I built a node with about 200 channels using 0.14.3. Initially I saw ~1-3 force closures a month, and typically the other peer confirmed that they had a major issue with their node leading to the failure. After upgrading to 0.15, I've seen on average 3 force closures per day. I've been unable to decipher the logs to get a sense of what is happening and what I should attempt to do to rectify the problem. As is, it seems my most active channels are disappearing daily.

I am not aware of any connectivity issues with my node, running hybrid mode, pubkey: 022a03c83e94ab037a64dd71e54f1796db185f21b1d88ceea5486a274ec257e995.

Although Im not sure if it's relevant, I previously watched the HTLC stream with lntop, which would run for weeks in terminal. After upgrading to 0.15, lntop would lock up with many "pending" HTLCs. Now, it fails to launch at all.

Furthermore, lnd is running on a dedicated machine with bitcoind; prior to upgrading to 0.15, my cpu cores were running at an average of 0-5%; after upgrading to 0.15, most cores are at 5-10% utilization, with 1 or two cores spiking to 60-80% at any given time... No other changes have been made to the node, and I am not running rebalancing software.

Your environment

version of lnd 0.15
which operating system (uname -a on Nix) Linux WhiteBox 5.15.0-41-generic Make routing table persistent #44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
version of btcd, bitcoind, or other backend 0.23
any other relevant environment details

*** My most recent channel with c-otto force closed. I will include log entries for this channel funding outpoint below, followed by what c-otto sent me from his logs:

2022-07-16 11:55:43.342 [ERR] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): failing link: unable to accept revocation: revocation key mismatch with error: invalid revocation

2022-07-16 11:57:14.994 [ERR] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): failing link: ChannelPoint(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): unable to accept new commitment: not enough HTLC signatures with error: invalid commitment

2022-07-16 11:57:34.865 [INF] CNCT: ChannelArbitrator(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): force closing chan

*** Below are the logs sent from c-otto:

2022-07-16 17:55:32.875 [DBG] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): Failed to send 200022399 mSAT

^ Some failed HTLC, i.e. a working connection.

2022-07-16 17:55:39.614 [INF] PEER: Peer(022a03c83e94ab037a64dd71e54f1796db185f21b1d88ceea5486a274ec257e995): unable to read message from peer: EOF
2022-07-16 17:55:39.614 [INF] PEER: Peer(022a03c83e94ab037a64dd71e54f1796db185f21b1d88ceea5486a274ec257e995): disconnecting 022a03c83e94ab037a64dd71e54f1796db185f21b1d88ceea5486a274ec257e995@96.47.237.10:44598, reason: read handler closed
2022-07-16 17:55:39.816 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): stopping
2022-07-16 17:55:39.817 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): exited

^ Disconnected

Then: reconnection

2022-07-16 17:55:39.927 [INF] SRVR: Finalizing connection to 022a03c83e94ab037a64dd71e54f1796db185f21b1d88ceea5486a274ec257e995@96.47.237.10:44602, inbound=true
2022-07-16 17:55:39.934 [INF] PEER: Peer(022a03c83e94ab037a64dd71e54f1796db185f21b1d88ceea5486a274ec257e995): loading ChannelPoint(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1)
2022-07-16 17:55:39.934 [DBG] CNCT: New ChainEventSubscription(id=146) for ChannelPoint(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1)
2022-07-16 17:55:39.934 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): starting
2022-07-16 17:55:39.935 [INF] CNCT: Attempting to update ContractSignals for ChannelPoint(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1)
2022-07-16 17:55:39.935 [INF] PEER: Peer(022a03c83e94ab037a64dd71e54f1796db185f21b1d88ceea5486a274ec257e995): Negotiated chan series queries
2022-07-16 17:55:39.935 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): HTLC manager started, bandwidth=2393706690 mSAT
2022-07-16 17:55:39.935 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): attempting to re-synchronize
2022-07-16 17:55:40.948 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): received re-establishment message from remote side
2022-07-16 17:55:40.959 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): sending 2 updates to synchronize the state
2022-07-16 17:55:40.966 [DBG] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): loaded 1 fwd pks
2022-07-16 17:55:40.966 [DBG] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): removing completed fwd pkg for height=101142
2022-07-16 17:56:44.460 [DBG] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): queueing keystone of ADD open circuit: (Chan ID=0:0:0, HTLC ID=10466197)->(Chan ID=737690:1073:1, HTLC ID=45951)

2022-07-16 17:57:44.522 [ERR] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): failing link: unable to complete dance with error: remote unresponsive

^ That's the only non-common log entry.

2022-07-16 17:57:44.522 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): exited
2022-07-16 17:57:44.525 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): stopping
2022-07-16 17:59:52.987 [INF] NTFN: Dispatching confirmed spend notification for outpoint=671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1, script=0 f4530c60eab59c020938bc93bbdb6028b758a556b8e53f42023018ae5d9cf767 at current height=745254: f68f0277aba6755ae46c98b532da40da6161444c0216ffca115fe061a99a4702[0] spending 671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1 at height=745254

The text was updated successfully, but these errors were encountered:

C-Otto · 2022-07-16T16:58:54Z

Thanks for the issue! Let me know if you need further logs from my node.

C-Otto · 2022-07-16T17:13:33Z

I have a few of those:

2022-07-16 16:31:26.514 [ERR] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): failing link: unable to complete dance with error: remote unresponsive

Previous reconnection:

logs/lnd.log.53511.gz:2022-07-16 17:26:41.212 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): received re-establishment message from remote side
logs/lnd.log.53511.gz:2022-07-16 17:26:41.218 [DBG] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): loaded 1 fwd pks
logs/lnd.log.53511.gz:2022-07-16 17:26:41.218 [DBG] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): removing completed fwd pkg for height=101140
logs/lnd.log.53512.gz:2022-07-16 17:29:51.680 [DBG] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): queueing keystone of ADD open circuit: (Chan ID=0:0:0, HTLC ID=10464892)->(Chan ID=737690:1073:1, HTLC ID=45951)
logs/lnd.log.53512.gz:2022-07-16 17:30:51.744 [ERR] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): failing link: unable to complete dance with error: remote unresponsive
logs/lnd.log.53512.gz:2022-07-16 17:30:51.744 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): exited
logs/lnd.log.53512.gz:2022-07-16 17:30:51.744 [INF] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): stopping

hieblmi · 2022-07-16T17:37:27Z

@mcsnubbs on a side note, lntop works for me running v0.15.0-beta. Could check your ~/.lntop/lntop.log

mcsnubbs · 2022-07-17T10:11:19Z

I had two force closes since posting this issue, both with large reputable nodes. Will include the log info. The second list of log entries has a "fwding package db has been corrupted" error that I have not yet seen.

2022-07-17 05:27:52.213 [INF] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:1): HTLC manager started, bandwidth=3075323126 mSAT
2022-07-17 05:27:52.213 [INF] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:1): attempting to re-synchronize
2022-07-17 05:27:52.213 [INF] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:1): received re-establishment message from remote side
2022-07-17 05:30:40.784 [INF] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): stopping
2022-07-17 05:30:40.784 [WRN] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): error when syncing channel states: link shutting down
2022-07-17 05:30:40.785 [INF] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): exited
2022-07-17 05:30:40.854 [INF] PEER: NodeKey(03c436af41160a355fc1ed230a64f6a64bcbd2ae50f12171d1318f9782602be601) loading ChannelPoint(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2)
2022-07-17 05:30:40.854 [INF] HSWC: Removing channel link with ChannelID(4f8707322809a34b2cd464b35517f3dbc4dc22e0ff9395ae8a3724437465f0ce)
2022-07-17 05:30:40.854 [INF] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): starting
2022-07-17 05:30:40.854 [INF] CNCT: Attempting to update ContractSignals for ChannelPoint(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2)
2022-07-17 05:30:40.854 [INF] NTFN: New block epoch subscription
2022-07-17 05:30:40.854 [INF] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): HTLC manager started, bandwidth=4778017444 mSAT
2022-07-17 05:30:40.854 [INF] HSWC: ChannelLink(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): attempting to re-synchronize
2022-07-17 05:33:01.520 [INF] NTFN: Dispatching confirmed spend notification for outpoint=ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2, script=0 a3921047b009617866ad37ca1738d652dd123e5a5bd6ac2475eebd737421ce48 at current height=745344: f2910adaa3dd4faf75a92309020c13634610db16376ca58085b6b53b15d88606[0] spending ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2 at height=745344
2022-07-17 05:33:01.524 [INF] CNCT: Unilateral close of ChannelPoint(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2) detected
2022-07-17 05:33:01.525 [INF] CNCT: ChannelArbitrator(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): remote party has closed channel out on-chain
2022-07-17 05:33:15.614 [WRN] CHBU: Replacing disk backup for ChannelPoint(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:1) w/ newer version
2022-07-17 05:33:33.674 [INF] CNCT: ChannelArbitrator(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): still awaiting contract resolution
2022-07-17 05:33:45.683 [INF] CNCT: *contractcourt.commitSweepResolver(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): Sweeping with witness type: CommitmentToRemoteConfirmed
2022-07-17 05:33:45.683 [INF] CNCT: *contractcourt.commitSweepResolver(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): sweeping commit output
2022-07-17 05:34:44.332 [INF] NANN: Announcing channel(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2) disabled [detected]
2022-07-17 05:34:44.332 [ERR] NANN: Unable to sign update disabling channel(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): edge not found
2022-07-17 05:39:50.397 [INF] CNCT: *contractcourt.commitSweepResolver(ccf065744324378aae9593ffe022dcc4dbf31755b364d42c4ba309283207874f:2): local commitment output fully resolved by sweep tx: 997a314d838c55958f276afe3f00151453ce54922b3dccd3bfa28d8e7c811cc7

2022-07-16 21:56:36.420 [ERR] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): failing link: unable to complete dance with error: remote unresponsive
2022-07-16 21:56:36.420 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): exited
2022-07-16 21:56:36.949 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): stopping
2022-07-16 21:58:02.945 [INF] PEER: NodeKey(0262ab2e8c0f8d7062f51847eaf05900b43fe0902ba67858a4d8ec74fec68f4ba3) loading ChannelPoint(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0)
2022-07-16 21:58:02.945 [INF] HSWC: Removing channel link with ChannelID(32da91f3615a77ab6625579b80da3f46ed0fa0f19edaf2888712d6fcb25ac844)
2022-07-16 21:58:02.945 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): starting
2022-07-16 21:58:02.946 [INF] CNCT: Attempting to update ContractSignals for ChannelPoint(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0)
2022-07-16 21:58:02.946 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): HTLC manager started, bandwidth=3676794443 mSAT
2022-07-16 21:58:02.946 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): attempting to re-synchronize
2022-07-16 21:58:03.269 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): received re-establishment message from remote side
2022-07-16 21:58:05.661 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): sending 2 updates to synchronize the state
2022-07-16 21:58:27.683 [INF] NTFN: Dispatching confirmed spend notification for outpoint=44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0, script=0 61d99f1ebaa4a483603f65b56015dc50d294c5d2f542d985bc6a8f0ec56f1164 at current height=745303: b6fa986768f6ded26a5e465c8a0dfee1bd65a79de0cab12b20cf45f1c23a33b3[0] spending 44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0 at height=745303
2022-07-16 21:58:27.687 [INF] CNCT: Unilateral close of ChannelPoint(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0) detected
2022-07-16 21:58:27.688 [INF] CNCT: ChannelArbitrator(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): remote party has closed channel out on-chain
**2022-07-16 21:58:41.211 [ERR] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): unable to remove fwd pkg for height=12890: fwding package db has been corrupted
2022-07-16 21:58:41.211 [ERR] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): failing link: unable to resolve fwd pkgs: fwding package db has been corrupted with error: internal error**
2022-07-16 21:58:41.211 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): exited
2022-07-16 21:58:58.495 [INF] CNCT: ChannelArbitrator(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): still awaiting contract resolution
2022-07-16 21:59:06.077 [INF] CNCT: *contractcourt.commitSweepResolver(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): Sweeping with witness type: CommitmentToRemoteConfirmed
2022-07-16 21:59:06.077 [INF] CNCT: *contractcourt.commitSweepResolver(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): sweeping commit output
2022-07-16 21:59:06.529 [INF] CNCT: ChannelArbitrator(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): a contract has been fully resolved!
2022-07-16 21:59:06.529 [INF] CNCT: ChannelArbitrator(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): still awaiting contract resolution
2022-07-16 21:59:09.020 [INF] HSWC: Removing channel link with ChannelID(32da91f3615a77ab6625579b80da3f46ed0fa0f19edaf2888712d6fcb25ac844)
2022-07-16 21:59:09.020 [INF] HSWC: ChannelLink(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): stopping
2022-07-16 22:02:31.623 [INF] CNCT: *contractcourt.commitSweepResolver(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): local commitment output fully resolved by sweep tx: 64775e68ee277d9321f571f67f25a04313f3b0663d59cd356e61190eb69cb0ea
2022-07-16 22:02:40.153 [INF] CNCT: ChannelArbitrator(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): a contract has been fully resolved!
2022-07-16 22:02:40.153 [INF] CNCT: ChannelArbitrator(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): still awaiting contract resolution
2022-07-16 22:02:40.497 [INF] CNCT: ChannelArbitrator(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): a contract has been fully resolved!
2022-07-16 22:02:40.497 [INF] CNCT: ChannelArbitrator(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): still awaiting contract resolution
2022-07-16 22:17:44.891 [INF] NANN: Announcing channel(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0) disabled [detected]
2022-07-16 22:17:44.894 [ERR] NANN: Unable to sign update disabling channel(44c85ab2fcd6128788f2da9ef1a00fed463fda809b572566ab775a61f391da32:0): edge not found

mcsnubbs · 2022-07-17T10:17:07Z

@mcsnubbs on a side note, lntop works for me running v0.15.0-beta. Could check your ~/.lntop/lntop.log

LNTOP log showing a long list of permission denied rpc errors

{"level":"error","ts":1657704389.4633987,"caller":"ui/controller.go:121","msg":"failed","logger":"controller","error":"rpc error: code = Unknown desc = edge not found","errorVerbose":"rpc error: code = Unknown desc = edge not found\ngh.neting.cc/edouardparis/lntop/network/backend/lnd.Backend.GetChannelInfo\n\t/tmp/lntop/network/backend/lnd/lnd.go:346\ngh.neting.cc/edouardparis/lntop/ui/models.(*Models).RefreshChannels\n\t/tmp/lntop/ui/models/models.go:63\ngh.neting.cc/edouardparis/lntop/ui.(*controller).Listen.func1\n\t/tmp/lntop/ui/controller.go:119\ngh.neting.cc/edouardparis/lntop/ui.(*controller).Listen\n\t/tmp/lntop/ui/controller.go:148\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581","stacktrace":"github.com/edouardparis/lntop/ui.(*controller).Listen.func1\n\t/tmp/lntop/ui/controller.go:121\ngh.neting.cc/edouardparis/lntop/ui.(*controller).Listen\n\t/tmp/lntop/ui/controller.go:148"}
{"level":"error","ts":1657788681.9180472,"caller":"pubsub/pubsub.go:49","msg":"SubscribeInvoice returned an error","logger":"pubsub","error":"rpc error: code = Unknown desc = permission denied","stacktrace":"github.com/edouardparis/lntop/pubsub.(*PubSub).invoices.func2\n\t/tmp/lntop/pubsub/pubsub.go:49"}

C-Otto · 2022-07-17T16:34:57Z

Could you please edit your posts and use code block formatting (three backticks - code - three backticks)?

mcsnubbs · 2022-07-17T18:15:14Z

Done. I compacted channel db, went from 18 to 2 gb, since then my routing has largely normalized, as has my cpu utilization and ability to run lntop. Would it make sense that the large channel.db caused the issue?

Crypt-iQ · 2022-07-18T11:25:14Z

I haven't seen this forwarding log error before, that might be an issue

2022-07-16 11:55:43.342 [ERR] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): failing link: unable to accept revocation: revocation key mismatch with error: invalid revocation

Should be fixed in 0.15.1

2022-07-16 11:57:14.994 [ERR] HSWC: ChannelLink(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): failing link: ChannelPoint(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): unable to accept new commitment: not enough HTLC signatures with error: invalid commitment

Should be fixed in 0.15.1

2022-07-16 11:57:34.865 [INF] CNCT: ChannelArbitrator(671cfce1d2f30c93333e4fe472f95c859893343aa6c50dac10b4ff58b6207bdb:1): force closing chan

Regular go-to-chain force close

Crypt-iQ · 2022-07-18T11:39:50Z

The forwarding log error appears to occur when an on-chain spend is detected (forwarding packages get wiped) and the link is still up and tries to remove a forwarding pacakge via resolveFwdPkg. This should be fine, but what I am not sure of is why the peer with the 4f8707322809a34b2cd464b35517f3dbc4dc22e0ff9395ae8a3724437465f0ce channel would reestablish the channel if it went to force close. CC @yyforyongyu . I did make note of this a while ago and it is benign, but is probably something that should be changed

Crypt-iQ · 2022-07-18T11:46:45Z

Things that could cause buggy force closes in 0.15 and prior from either you or C-Otto:

connectivity issues
- tor had a ddos attack on hidden services recently, not sure if that's still ongoing
scripts that call DisconnectPeer and reconnect

C-Otto · 2022-07-18T11:48:09Z

scripts that call DisconnectPeer and reconnect

I do that. My node has the "add linkStopIndex to cleanly shutdown ChannelLink" fix, though.

Crypt-iQ · 2022-07-18T11:58:39Z

I do that. My node has the "add linkStopIndex to cleanly shutdown ChannelLink" fix, though.

I am pretty sure you'll want the DisconnectPeer patch from here #6655 also. I think that even if one ChannelLink waits for the current one to finish, the new one might have outdated state when it is loaded.

C-Otto · 2022-07-18T11:59:59Z

I already have that, thanks.

Roasbeef · 2022-07-19T16:49:24Z

tor had a ddos attack on hidden services recently, not sure if that's still ongoing

Yep it is (though of a slightly different nature): https://status.torproject.org/

Roasbeef · 2022-07-19T16:51:03Z

Based on this comment, I think we can close this issue as most of the known force close scenarios have been resolved in 0.15.1? The one wild card still is the Tor network, as it still isn't hack at prior levels of connectivity/reliability.

mcsnubbs · 2022-07-20T15:29:54Z

So until 0.15.1 I should be regularly compacting channel.db to avoid similar issues?

RE TOR this node runs on clearnet as well; channel issues werent limited to TOR connected peers.

Roasbeef · 2022-07-27T00:03:33Z

So until 0.15.1 I should be regularly compacting channel.db to avoid similar issues?

No, the issues we fixed aren't related to DB compaction.

guggero · 2022-07-27T12:59:19Z

@mcsnubbs are you running with sync-freelist=true? It could be that the heavy CPU usage is coming from that which could perhaps also explain the force closes (your node not being able to respond in time). See #6737.

mcsnubbs · 2022-07-27T13:43:30Z

@mcsnubbs are you running with sync-freelist=true? It could be that the heavy CPU usage is coming from that which could perhaps also explain the force closes (your node not being able to respond in time). See #6737.

You are correct! This was my issue... both .db bloating and CPU usage returned to normal. Appreciate the assistance!

lncyberhub · 2022-07-27T17:29:24Z

@mcsnubbs are you running with sync-freelist=true? It could be that the heavy CPU usage is coming from that which could perhaps also explain the force closes (your node not being able to respond in time). See #6737.

You are correct! This was my issue... both .db bloating and CPU usage returned to normal. Appreciate the assistance!

I had the same thing going on! Crazy db bloating after ~7 days. It looks like sync-freelist=false is helping!

The setting sync-freelist=true caused heavy bloating of the channel.db (several GB per day). Disabling this setting fixes this issue. Related: lightningnetwork/lnd#6800 lightningnetwork/lnd#6744 lightningnetwork/lnd#6737 lightningnetwork/lnd#6837 (comment)

* Change sync-freelist to false in the lnd.conf The setting sync-freelist=true caused heavy bloating of the channel.db (several GB per day). Disabling this setting fixes this issue. Related: lightningnetwork/lnd#6800 lightningnetwork/lnd#6744 lightningnetwork/lnd#6737 lightningnetwork/lnd#6837 (comment) * remove sync-freelist=true from lnd.conf

lucasdcf added bug Unintended code behaviour channel closing Related to the closing of channels cooperatively and uncooperatively channels labels Jul 16, 2022

mcsnubbs closed this as completed Jul 27, 2022

C-Otto mentioned this issue Aug 17, 2022

Proposal: drop support for sync-freelist #6837

Open

Bitcoinite mentioned this issue Aug 25, 2022

Change sync-freelist to false in the lnd.conf raspibolt/raspibolt#1069

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Significant increase in force closures after upgrade to LND 0.15 #6744

Significant increase in force closures after upgrade to LND 0.15 #6744

mcsnubbs commented Jul 16, 2022 •

edited

Loading

C-Otto commented Jul 16, 2022

C-Otto commented Jul 16, 2022 •

edited

Loading

hieblmi commented Jul 16, 2022

mcsnubbs commented Jul 17, 2022 •

edited

Loading

mcsnubbs commented Jul 17, 2022

C-Otto commented Jul 17, 2022

mcsnubbs commented Jul 17, 2022

Crypt-iQ commented Jul 18, 2022 •

edited

Loading

Crypt-iQ commented Jul 18, 2022

Crypt-iQ commented Jul 18, 2022

C-Otto commented Jul 18, 2022

Crypt-iQ commented Jul 18, 2022

C-Otto commented Jul 18, 2022

Roasbeef commented Jul 19, 2022

Roasbeef commented Jul 19, 2022

mcsnubbs commented Jul 20, 2022

Roasbeef commented Jul 27, 2022

guggero commented Jul 27, 2022

mcsnubbs commented Jul 27, 2022

lncyberhub commented Jul 27, 2022 •

edited

Loading

Significant increase in force closures after upgrade to LND 0.15 #6744

Significant increase in force closures after upgrade to LND 0.15 #6744

Comments

mcsnubbs commented Jul 16, 2022 • edited Loading

Background

Your environment

C-Otto commented Jul 16, 2022

C-Otto commented Jul 16, 2022 • edited Loading

hieblmi commented Jul 16, 2022

mcsnubbs commented Jul 17, 2022 • edited Loading

mcsnubbs commented Jul 17, 2022

C-Otto commented Jul 17, 2022

mcsnubbs commented Jul 17, 2022

Crypt-iQ commented Jul 18, 2022 • edited Loading

Crypt-iQ commented Jul 18, 2022

Crypt-iQ commented Jul 18, 2022

C-Otto commented Jul 18, 2022

Crypt-iQ commented Jul 18, 2022

C-Otto commented Jul 18, 2022

Roasbeef commented Jul 19, 2022

Roasbeef commented Jul 19, 2022

mcsnubbs commented Jul 20, 2022

Roasbeef commented Jul 27, 2022

guggero commented Jul 27, 2022

mcsnubbs commented Jul 27, 2022

lncyberhub commented Jul 27, 2022 • edited Loading

mcsnubbs commented Jul 16, 2022 •

edited

Loading

C-Otto commented Jul 16, 2022 •

edited

Loading

mcsnubbs commented Jul 17, 2022 •

edited

Loading

Crypt-iQ commented Jul 18, 2022 •

edited

Loading

lncyberhub commented Jul 27, 2022 •

edited

Loading