Significant increase in force closures after upgrade to LND 0.15 #6744
Comments
Thanks for the issue! Let me know if you need further logs from my node.
I have a few of those:
Previous reconnection:
@mcsnubbs on a side note, lntop works for me running v0.15.0-beta. Could check your …
I had two force closes since posting this issue, both with large reputable nodes. Will include the log info. The second list of log entries has a "fwding package db has been corrupted" error that I have not yet seen.
LNTOP log showing a long list of rpc errors:

```
{"level":"error","ts":1657704389.4633987,"caller":"ui/controller.go:121","msg":"failed","logger":"controller","error":"rpc error: code = Unknown desc = edge not found","errorVerbose":"rpc error: code = Unknown desc = edge not found\ngithub.com/edouardparis/lntop/network/backend/lnd.Backend.GetChannelInfo\n\t/tmp/lntop/network/backend/lnd/lnd.go:346\ngithub.com/edouardparis/lntop/ui/models.(*Models).RefreshChannels\n\t/tmp/lntop/ui/models/models.go:63\ngithub.com/edouardparis/lntop/ui.(*controller).Listen.func1\n\t/tmp/lntop/ui/controller.go:119\ngithub.com/edouardparis/lntop/ui.(*controller).Listen\n\t/tmp/lntop/ui/controller.go:148\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581","stacktrace":"github.com/edouardparis/lntop/ui.(*controller).Listen.func1\n\t/tmp/lntop/ui/controller.go:121\ngithub.com/edouardparis/lntop/ui.(*controller).Listen\n\t/tmp/lntop/ui/controller.go:148"}
```
Could you please edit your posts and use code block formatting (three backticks - code - three backticks)?
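For reference, a fenced block in GitHub-flavored markdown looks like this (the log line inside is just a placeholder):

````
```
{"level":"error","msg":"failed"}
```
````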
Done. I compacted the channel.db; it went from 18 GB to 2 GB. Since then my routing has largely normalized, as have my CPU utilization and ability to run lntop. Would it make sense that the large channel.db caused the issue?
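For anyone else hitting this, lnd can also compact its bolt databases automatically on startup via lnd.conf; a minimal sketch, assuming lnd's db.bolt.auto-compact options (available since lnd 0.12; 168h is an illustrative threshold):

```
; Compact the bolt databases on startup if they haven't been
; compacted within the minimum age below.
db.bolt.auto-compact=true
db.bolt.auto-compact-min-age=168h
```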
I haven't seen this forwarding log error before; that might be an issue.
Should be fixed in 0.15.1
Regular go-to-chain force close
The forwarding log error appears to occur when an on-chain spend is detected (forwarding packages get wiped) and the link is still up and tries to remove a forwarding package via …
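Not lnd's actual code, but a minimal bbolt sketch of that general failure mode: one path wipes a bucket while another still tries to delete entries from it. The bucket name, key, and error text here are illustrative:

```go
package main

import (
	"errors"
	"fmt"
	"log"

	bolt "go.etcd.io/bbolt"
)

func main() {
	db, err := bolt.Open("fwd.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	bucket := []byte("fwd-packages") // illustrative name, not lnd's real key

	// Seed a forwarding-package-like entry.
	if err := db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists(bucket)
		if err != nil {
			return err
		}
		return b.Put([]byte("pkg-0"), []byte("htlcs"))
	}); err != nil {
		log.Fatal(err)
	}

	// Path 1: an on-chain spend is detected and the packages are wiped.
	if err := db.Update(func(tx *bolt.Tx) error {
		return tx.DeleteBucket(bucket)
	}); err != nil {
		log.Fatal(err)
	}

	// Path 2: the still-running link tries to remove a package afterwards,
	// finds the bucket gone, and surfaces a "corrupted"-style error.
	err = db.Update(func(tx *bolt.Tx) error {
		b := tx.Bucket(bucket)
		if b == nil {
			return errors.New("fwding package db has been corrupted")
		}
		return b.Delete([]byte("pkg-0"))
	})
	fmt.Println(err) // fwding package db has been corrupted
}
```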
Things that could cause buggy force closes in 0.15 and prior, from either you or C-Otto:
I do that. My node has the "add linkStopIndex to cleanly shutdown ChannelLink" fix, though.
I am pretty sure you'll want the DisconnectPeer patch from #6655 as well. I think that even if one ChannelLink waits for the current one to finish, the new one might have outdated state when it is loaded.
I already have that, thanks.
Yep it is (though of a slightly different nature): https://status.torproject.org/
Based on this comment, I think we can close this issue, as most of the known force close scenarios have been resolved in 0.15.1? The one wild card is still the Tor network, as it isn't back at prior levels of connectivity/reliability.
So until 0.15.1 I should be regularly compacting channel.db to avoid similar issues? Re: Tor, this node runs on clearnet as well; channel issues weren't limited to Tor-connected peers.
No, the issues we fixed aren't related to DB compaction.
You are correct! This was my issue... both the .db bloating and CPU usage returned to normal. Appreciate the assistance!
I had the same thing going on! Crazy db bloating after ~7 days. It looks like …
Change sync-freelist to false in lnd.conf: the setting sync-freelist=true caused heavy bloating of the channel.db (several GB per day), and disabling it fixes the issue. Related: lightningnetwork/lnd#6800, lightningnetwork/lnd#6744, lightningnetwork/lnd#6737, lightningnetwork/lnd#6837 (comment)
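A minimal lnd.conf sketch of the change described in those commits (sync-freelist is a real lnd option; exact section placement may vary with your config layout):

```
[Application Options]
; Don't sync the bbolt freelist to disk; sync-freelist=true was
; reported to bloat channel.db by several GB per day on 0.15
; (see the linked issues above).
sync-freelist=false
```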
Background
Over the past 3 months I built a node with about 200 channels using 0.14.3. Initially I saw ~1-3 force closures a month, and typically the other peer confirmed that they had a major issue with their node leading to the failure. After upgrading to 0.15, I've seen on average 3 force closures per day. I've been unable to decipher the logs to get a sense of what is happening and what I should attempt to do to rectify the problem. As is, it seems my most active channels are disappearing daily.
I am not aware of any connectivity issues with my node, running hybrid mode, pubkey: 022a03c83e94ab037a64dd71e54f1796db185f21b1d88ceea5486a274ec257e995.
Although I'm not sure if it's relevant, I previously watched the HTLC stream with lntop, which would run for weeks in a terminal. After upgrading to 0.15, lntop would lock up with many "pending" HTLCs. Now it fails to launch at all.
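For anyone who wants to watch the HTLC stream without lntop, here is a minimal sketch against lnd's routerrpc SubscribeHtlcEvents stream; the host, port, and cert/macaroon paths are assumptions for a default mainnet setup:

```go
package main

import (
	"context"
	"encoding/hex"
	"fmt"
	"log"
	"os"

	"github.com/lightningnetwork/lnd/lnrpc/routerrpc"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
	"google.golang.org/grpc/metadata"
)

func main() {
	// Assumed default lnd paths; adjust for your node.
	tlsPath := os.ExpandEnv("$HOME/.lnd/tls.cert")
	macPath := os.ExpandEnv("$HOME/.lnd/data/chain/bitcoin/mainnet/admin.macaroon")

	creds, err := credentials.NewClientTLSFromFile(tlsPath, "")
	if err != nil {
		log.Fatal(err)
	}
	conn, err := grpc.Dial("localhost:10009", grpc.WithTransportCredentials(creds))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// lnd expects the macaroon as hex-encoded gRPC metadata.
	macBytes, err := os.ReadFile(macPath)
	if err != nil {
		log.Fatal(err)
	}
	ctx := metadata.AppendToOutgoingContext(
		context.Background(), "macaroon", hex.EncodeToString(macBytes),
	)

	client := routerrpc.NewRouterClient(conn)
	stream, err := client.SubscribeHtlcEvents(ctx,
		&routerrpc.SubscribeHtlcEventsRequest{})
	if err != nil {
		log.Fatal(err)
	}
	// Print each forward/settle/fail event as it arrives.
	for {
		event, err := stream.Recv()
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("htlc event: %v\n", event)
	}
}
```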
Furthermore, lnd is running on a dedicated machine with bitcoind. Prior to upgrading to 0.15, my CPU cores ran at an average of 0-5%; after upgrading, most cores sit at 5-10% utilization, with one or two cores spiking to 60-80% at any given time. No other changes have been made to the node, and I am not running rebalancing software.
Your environment
- lnd version: 0.15
- uname -a (on *Nix): Linux WhiteBox 5.15.0-41-generic #44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
- btcd, bitcoind, or other backend: bitcoind 0.23

*** My most recent channel with c-otto force closed. I will include log entries for this channel funding outpoint below, followed by what c-otto sent me from his logs:
*** Below are the logs sent from c-otto: