-
-
Notifications
You must be signed in to change notification settings - Fork 600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't swallow errors coming from the shareSession call #2908
Conversation
0cfd51f
to
093901a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
07ca367
to
aaeb7a2
Compare
Force pushed to get rid of the fixup commits. |
The ensureOutboundSession users and modifies the setupPromise of the MegolmEncryption class. Some comments suggest that setupPromise will always resolve, in other words it should never contain a promise that will get rejected. Other comments also seem to suggest that the return value of ensureOutboundSession, a promise as well, may fail. The critical error here is that the promise that gets set as the next setupPromise, as well as the promise that ensureOutboundSession returns, is the same promise. It seems that the intention was for setupPromise to contain a promise that will always resolve to either `null` or `OutboundSessionInfo`. We can see that a couple of lines before we set setupPromise to its new value we construct a promise that logs and discards errors using the `Promise.catch()` method. The `Promise.catch()` method does not mutate the promise, instead it returns a new promise. The intention of the original author might have been to set the next setupPromise to the promise which `Promise.catch()` produces. This patch modifies the updating of setupPromise in the ensureOutboundSession so that setupPromise discards errors correctly. Using `>>=` to represent the promise chaining operation, setupPromise is now updated using the following logic: setupPromise = previousSetupPromise >>= setup >>= discardErrors
A call to ensureSession() has two steps: 1. prepareSession(), where an outbound group session might get created or rotated 2. shareSession(), where an outbound group session might get encrypted and queued up to be sent to other devices Both of those calls may mostly fail due to storage errors, yet only the errors from prepareSession get propagated to the caller. Errors from prepareSession will mean that you can't get an outbound group session so you can't encrypt an event. Errors from shareSession, especially if the error happens in the part where the to-device requests are queued up to be sent out, mean that other people will not be able to decrypt the events that will get encrypted using the outbound group session. Both of those cases are catastrophic, the second case is just much harder to debug, since the error happens on another device at some arbitrary point in the future. Let's just return the error instead, people can then retry and the storage issue might have been resolved, or at least the error becomes visible when it happens.
aaeb7a2
to
752b612
Compare
Force pushed to resolve a merge conflict. What is needed to get this merged? I don't have write access in this repo, if the expectations is that I'm going to do that. |
Oh, sorry, I didn't realise you didn't have write access. And now we have lint failures :(.
It should be possible to resolve a conflict without force-pushing? |
Sure, but I posted some fixup commits that required a rebase/autosquash workflow so I continued it. Are rebases frowned upon in this repo, should I stop using them as well as fixup commits? |
Rebases after a review make it hard to see what has been changed since the review, so one tends to have to do the review all over again. |
I understand this, that's why the fixup commits were used before the approval. Are you re-reviewing my merge-conflict resolution? Should I not use fixup commits for this lint issue now? |
@poljar unfortunately this is failing the test-coverage gate :( |
This has now a unrelated change that I'm trying to merge with #2959. I'm closing this and going to reopen two separate PRs, one for the fix for |
This PR intends to fix element-hq/element-web#23792.
What should have been an easy fix sadly turned out to be a deciphering of the logic here. It seems to me that the
ensureOutboundSession()
has a bug which would wedge the outbound group session in the case of an error.This wasn't problematic until now, since almost no errors happen in the
prepareSession()
call, but as we found out may happen in theshareSession
call.The first patch fixes the mentioned bug, aligning the code with the comments.
The second patch now propagates errors from the
shareSession
call to the caller.The individual commits have even more info around what each of them does and means.
I tested this in Element Web, if there's an error in
shareSession
we get prompted to retry the sending of the event. If the error happened to be transient, a press of the retry button will successfully share the outbound session and send the event.This closes: element-hq/element-web#23792.
Here's what your changelog entry will look like:
🐛 Bug Fixes