You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Update: the algorithm below describes background tasks but after discussion on a PR for this, we think we can avoid that, and instead trigger a retry when receive_keys_query_response runs. I updated #3546 to specify that we should do this during receive_keys_query_response as well as on a schedule.
When we receive a to-device message containing a megolm session, fetch and store the sender data for it.
Part of #3544 which is part of Invisible Crypto. Depends on #3542 because it holds the information in the data structures created there.
This task includes adding a SessionManager to prevent clashes between concurrent tasks.
Preventing clashes
Cross-process, we are protected by the cross-process lock - the entire OlmMachine will be reloaded if some other process takes the lock. But we need a way to prevent 2 tasks both updating a session at the same time.
Possibly something like this (but could do with some validation from the Rust team):
struct SessionManager {
sessions_being_processed: HashSet<OwnedSessionId>
}
impl SessionManager {
fn try_lock(session_id: &SessionId) -> Option<SessionGuard>; // If None, give up on this session
}
Pass the SessionGuard in to any async tasks you spawn. I.e. you keep hold of the lock even across the async boundary.
Algorithm
When we receive a to-device message establishing a megolm session:
A (start)
[take the lock] TODO: if we fail to get it, can we just bail out here, dropping the information in this to-device message? What if it contained device info that we need? How will this work if someone maliciously sent us a duplicate of someone else's session id?
Does the to-device message contain the device_keys property from MSC4147? Yes->D No->B
B (no device info in to-device message)
We need to find the device details. If we have them in the store, we should use them immediately (rather than waiting for a background task to pick up the session for further processing).
Does the locally-cached (in the store) devices list contain a device with the curve key of the sender of the to-device message? Yes->D No->C
C (no device info locally)
Save this session into the store with no device info, marked as not-legacy, next_retry_time_ms = now (in case the app gets killed) and retry_count = 0.
↗️ Return, and kick off an async task [keep the lock]: run OlmMachine::get_user_devices (which waits for /keys/query to complete, then fetches all device info for the user.) Then it should find a device with the curve key we know we used to decrypt the to-device message (same as in get_verification_state. Probably we want to move the impl of get_verification_state into another function we call now, and get_verification_state will look up what we stored instead of calculating it at the time it is called).
If the device is there, -> D
If we still don’t have the device info, -> 😴 Wait to see whether we get device info later. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
D (we have device info)
Is the device info cross-signed?
No -> 😴 Wait to see if the device becomes cross-signed soon. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
Yes -> E
E (we have cross-signed device info)
Do we have the cross-signing key for this user? Yes -> G No -> F
F (we have cross-signed device info, but no cross-signing keys)
Upsert the session with the (cross-signed) device info we have, still marked as not-legacy. Set next_retry_time_ms = now and retry_count = 0.
↗️ Return, and kick off an async task [keep the lock]: run OlmMachine::get_identity (which waits for /keys/query to complete, then fetches this user's cross-signing key from the store.) If we still don’t have a cross-signing key -> 😴 Wait to see if we get one soon. Do nothing; let the background job pick it up [drop the lock]
G (we have cross-signing key)
Does the cross-signing key match that used to sign the device info?
Yes -> H
No -> 😴 Wait to see if the cross-signing key is updated soon. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
H (cross-signing key matches that used to sign the device info!)
Is the signature in the device info (ed25519:<ssk_id>) valid (SelfSigningPubKey::verify_device_keys)?
Yes -> J
No -> ❗Session is invalid: drop it from the store and forget it (also the device???)
J (device info is verified by matching cross-signing key)
Look up the MXID and MSK for the user sending the to-device message.
Decide the MSK trust level based on whether we have verified this user (matrix_sdk_crypto::identities::user::UserIdentity::is_verified).
Upsert the session including the MXID, MSK and trust level. Remove the device info and retries since we don't need them.
Add this information to the sender_data.
[drop the lock]
Note: the sender data may become out-of-date if we later verify the user. We have no plans to update it if so.
The text was updated successfully, but these errors were encountered:
Part of #3543.
Builds on top of #3556
Implements the "fast lane" as described in
#3544
This will begin to populate `InboundGroupSession`s with the new
`SenderData` struct introduced in
#3556 but it will only
do it when the information is already available in the store. Future PRs
for this issue will query Matrix APIs using spawned async tasks.
Future issues will do retries and migration of old sessions.
---------
Signed-off-by: Andy Balaam <mail@artificialworlds.net>
Co-authored-by: Damir Jelić <poljar@termina.org.uk>
Update: the algorithm below describes background tasks but after discussion on a PR for this, we think we can avoid that, and instead trigger a retry when
receive_keys_query_response
runs. I updated #3546 to specify that we should do this duringreceive_keys_query_response
as well as on a schedule.When we receive a to-device message containing a megolm session, fetch and store the sender data for it.
Part of #3544 which is part of Invisible Crypto. Depends on #3542 because it holds the information in the data structures created there.
This task includes adding a
SessionManager
to prevent clashes between concurrent tasks.Preventing clashes
Cross-process, we are protected by the cross-process lock - the entire OlmMachine will be reloaded if some other process takes the lock. But we need a way to prevent 2 tasks both updating a session at the same time.
Possibly something like this (but could do with some validation from the Rust team):
Pass the
SessionGuard
in to any async tasks you spawn. I.e. you keep hold of the lock even across the async boundary.Algorithm
When we receive a to-device message establishing a megolm session:
A (start)
[take the lock] TODO: if we fail to get it, can we just bail out here, dropping the information in this to-device message? What if it contained device info that we need? How will this work if someone maliciously sent us a duplicate of someone else's session id?
Does the to-device message contain the device_keys property from MSC4147? Yes->D No->B
B (no device info in to-device message)
We need to find the device details. If we have them in the store, we should use them immediately (rather than waiting for a background task to pick up the session for further processing).
C (no device info locally)
Save this session into the store with no device info, marked as not-legacy,
next_retry_time_ms = now
(in case the app gets killed) andretry_count = 0.
OlmMachine::get_user_devices
(which waits for /keys/query to complete, then fetches all device info for the user.) Then it should find a device with the curve key we know we used to decrypt the to-device message (same as in get_verification_state. Probably we want to move the impl ofget_verification_state
into another function we call now, andget_verification_state
will look up what we stored instead of calculating it at the time it is called).If the device is there, -> D
If we still don’t have the device info, -> 😴 Wait to see whether we get device info later. Increment
retry_count
and setnext_retry_time_ms
per backoff algorithm; let the background job pick it up [drop the lock]D (we have device info)
No -> 😴 Wait to see if the device becomes cross-signed soon. Increment
retry_count
and setnext_retry_time_ms
per backoff algorithm; let the background job pick it up [drop the lock]Yes -> E
E (we have cross-signed device info)
F (we have cross-signed device info, but no cross-signing keys)
next_retry_time_ms = now
andretry_count = 0
.OlmMachine::get_identity
(which waits for /keys/query to complete, then fetches this user's cross-signing key from the store.) If we still don’t have a cross-signing key -> 😴 Wait to see if we get one soon. Do nothing; let the background job pick it up [drop the lock]G (we have cross-signing key)
Yes -> H
No -> 😴 Wait to see if the cross-signing key is updated soon. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
H (cross-signing key matches that used to sign the device info!)
ed25519:<ssk_id>
) valid (SelfSigningPubKey::verify_device_keys
)?Yes -> J
No -> ❗Session is invalid: drop it from the store and forget it (also the device???)
J (device info is verified by matching cross-signing key)
matrix_sdk_crypto::identities::user::UserIdentity::is_verified
).sender_data
.Note: the sender data may become out-of-date if we later verify the user. We have no plans to update it if so.
The text was updated successfully, but these errors were encountered: