fix: Possible fix to one cause of the "connection deadlock" #1580
First things first: since this issue isn't easily reproducible, I based this fix on a stack trace provided in one issue, in this case #1572. So there are a few things to consider:
The issue
I read the stack trace from #1572 again and again and came to the conclusion that the cause is an exception thrown when a REST request is cancelled: it reaches the ConnectionManager, which then disconnects permanently because of it.
It goes like this:
1. await ConnectAsync(reconnectCancelToken).ConfigureAwait(false);
2. await _onConnecting().ConfigureAwait(false); (that is actually DiscordSocketClient.OnConnectingAsync)
3. await ApiClient.ConnectAsync().ConfigureAwait(false);
4. var gatewayResponse = await GetGatewayAsync().ConfigureAwait(false); fails and throws a TaskCanceledException (it could be any other REST request in DiscordSocketClient.OnConnectingAsync)
5. Since TaskCanceledException inherits from OperationCanceledException, it ends up in the catch that handles OperationCanceledException.

That catch cancels all tokens in Cancel(), including the reconnect token, so the client just disconnects and stops there.

Solution
I decided to wrap await _onConnecting().ConfigureAwait(false); in a try-catch to prevent any TaskCanceledException from bubbling up (and disconnecting the client permanently), and to throw the inner exception instead. I added a null check to avoid throwing null; I don't see why the inner exception would be null, so the check is there just to be safe.

Now, this could cause an unintended side effect. I wasn't able to identify a method that would legitimately throw that specific exception, so I don't believe it will happen, but if I missed one, the client could end up not disconnecting permanently when it is expected to.
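To make the reasoning easier to check, here is a minimal standalone sketch of the failure mode and of the guard. It is not the actual ConnectionManager code: ReconnectSketch, FailingOnConnecting, ConnectUnguarded, ConnectGuarded and the returned strings are all hypothetical names made up for the illustration, and the fallback exception used when InnerException is null is my assumption, not necessarily what the PR does.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the connect step; not Discord.Net's real members.
public static class ReconnectSketch
{
    // Simulates a REST request in the connecting callback that gets cancelled.
    public static Task FailingOnConnecting()
        => Task.FromException(new TaskCanceledException("REST request was cancelled"));

    // Unguarded: TaskCanceledException derives from OperationCanceledException,
    // so it lands in the branch that means "stop reconnecting for good".
    public static async Task<string> ConnectUnguarded(Func<Task> onConnecting)
    {
        try
        {
            await onConnecting().ConfigureAwait(false);
            return "connected";
        }
        catch (OperationCanceledException)
        {
            return "stopped";   // permanent disconnect
        }
        catch (Exception)
        {
            return "retry";     // ordinary failure, reconnect later
        }
    }

    // Guarded: the callback is wrapped so a TaskCanceledException is replaced
    // before it can reach the OperationCanceledException branch.
    public static async Task<string> ConnectGuarded(Func<Task> onConnecting)
    {
        try
        {
            try
            {
                await onConnecting().ConfigureAwait(false);
            }
            catch (TaskCanceledException ex)
            {
                // Null check: never throw null. The fallback exception type
                // is an assumption made for this sketch.
                throw ex.InnerException
                    ?? new Exception("Connecting step was cancelled", ex);
            }
            return "connected";
        }
        catch (OperationCanceledException)
        {
            return "stopped";
        }
        catch (Exception)
        {
            return "retry";
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        // prints "stopped"
        Console.WriteLine(await ReconnectSketch.ConnectUnguarded(ReconnectSketch.FailingOnConnecting));
        // prints "retry"
        Console.WriteLine(await ReconnectSketch.ConnectGuarded(ReconnectSketch.FailingOnConnecting));
    }
}
```

Note that in this sketch an inner exception that is itself an OperationCanceledException would still end in the "stopped" branch, which is the kind of side effect worth looking for when reviewing.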
If anyone who hits this issue regularly could test this PR, it would be highly appreciated. The same goes if you see something I missed that could cause an unintended consequence.