Sender Data: Fetch and store sender data when we receive a new megolm session #3543

andybalaam · 2024-06-13T11:07:15Z

Update: the algorithm below describes background tasks but after discussion on a PR for this, we think we can avoid that, and instead trigger a retry when receive_keys_query_response runs. I updated #3546 to specify that we should do this during receive_keys_query_response as well as on a schedule.

When we receive a to-device message containing a megolm session, fetch and store the sender data for it.

Part of #3544 which is part of Invisible Crypto. Depends on #3542 because it holds the information in the data structures created there.

This task includes adding a SessionManager to prevent clashes between concurrent tasks.

Preventing clashes

Cross-process, we are protected by the cross-process lock - the entire OlmMachine will be reloaded if some other process takes the lock. But we need a way to prevent 2 tasks both updating a session at the same time.

Possibly something like this (but could do with some validation from the Rust team):

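use std::collections::HashSet;

struct SessionManager {
    // IDs of the sessions that some task is currently working on
    sessions_being_processed: HashSet<OwnedSessionId>,
}

impl SessionManager {
    // Returns None if another task holds this session: give up on it
    fn try_lock(&mut self, session_id: &SessionId) -> Option<SessionGuard> {
        todo!()
    }
}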

Pass the SessionGuard in to any async tasks you spawn. I.e. you keep hold of the lock even across the async boundary.
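
A minimal, self-contained sketch of how such a guard could work, not a validated design. It assumes String in place of OwnedSessionId, shares the set through Arc<Mutex<...>> so that the guard can be moved into a spawned task, and releases the session when the guard is dropped. None of these choices come from the issue itself.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Shared set of session IDs that some task is currently processing.
/// `String` stands in for `OwnedSessionId` to keep the sketch self-contained.
#[derive(Clone, Default)]
pub struct SessionManager {
    sessions_being_processed: Arc<Mutex<HashSet<String>>>,
}

/// Held for as long as a task works on a session; releases it on drop.
pub struct SessionGuard {
    session_id: String,
    sessions_being_processed: Arc<Mutex<HashSet<String>>>,
}

impl SessionManager {
    /// Returns `None` if another task already holds this session.
    pub fn try_lock(&self, session_id: &str) -> Option<SessionGuard> {
        let mut held = self.sessions_being_processed.lock().unwrap();
        // `insert` returns false when the ID was already present.
        if !held.insert(session_id.to_owned()) {
            return None;
        }
        Some(SessionGuard {
            session_id: session_id.to_owned(),
            sessions_being_processed: Arc::clone(&self.sessions_being_processed),
        })
    }
}

impl Drop for SessionGuard {
    fn drop(&mut self) {
        // Skip cleanup on a poisoned mutex rather than panicking inside drop.
        if let Ok(mut held) = self.sessions_being_processed.lock() {
            held.remove(&self.session_id);
        }
    }
}
```

Because the guard owns its data, it can be moved into an async task and the session stays locked until that task finishes or is cancelled.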

Algorithm

When we receive a to-device message establishing a megolm session:

A (start)

[take the lock] TODO: if we fail to get it, can we just bail out here, dropping the information in this to-device message? What if it contained device info that we need? How will this work if someone maliciously sent us a duplicate of someone else's session id?
Does the to-device message contain the device_keys property from MSC4147? Yes->D No->B

B (no device info in to-device message)

We need to find the device details. If we have them in the store, we should use them immediately (rather than waiting for a background task to pick up the session for further processing).

Does the locally-cached (in the store) devices list contain a device with the curve key of the sender of the to-device message? Yes->D No->C

C (no device info locally)

Save this session into the store with no device info, marked as not-legacy, next_retry_time_ms = now (in case the app gets killed) and retry_count = 0.
↗️ Return, and kick off an async task [keep the lock]: run OlmMachine::get_user_devices, which waits for /keys/query to complete and then fetches all device info for the user. The task should then look for a device with the curve key we used to decrypt the to-device message, which is the same lookup that get_verification_state does. We probably want to move the implementation of get_verification_state into another function that we call at this point, so that get_verification_state looks up what we stored instead of calculating it when it is called.

If the device is there, -> D
If we still don’t have the device info, -> 😴 Wait to see whether we get device info later. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
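
The issue does not spell out the backoff algorithm. As an illustration only, the retry bookkeeping could look like the sketch below, which assumes a capped exponential backoff. The RetryDetails struct, its methods, and the two delay constants are hypothetical; only the field names retry_count and next_retry_time_ms come from the text above.

```rust
/// Retry bookkeeping stored next to a session that still lacks sender data.
/// Field names follow the issue text; the struct itself is hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RetryDetails {
    pub retry_count: u32,
    pub next_retry_time_ms: u64,
}

// Assumed tuning values, not taken from the issue.
const BASE_DELAY_MS: u64 = 1_000;
const MAX_DELAY_MS: u64 = 60 * 60 * 1_000;

impl RetryDetails {
    /// State written when the session is first saved: retry immediately,
    /// so that the work is picked up again if the app gets killed.
    pub fn new(now_ms: u64) -> Self {
        Self { retry_count: 0, next_retry_time_ms: now_ms }
    }

    /// Record a failed attempt: bump the count and push the next retry out
    /// by BASE_DELAY_MS * 2^(retry_count - 1), capped at MAX_DELAY_MS.
    pub fn record_failure(&mut self, now_ms: u64) {
        self.retry_count = self.retry_count.saturating_add(1);
        // Clamp the exponent so the shift and multiply cannot overflow.
        let exponent = (self.retry_count - 1).min(32);
        let delay = BASE_DELAY_MS.saturating_mul(1u64 << exponent).min(MAX_DELAY_MS);
        self.next_retry_time_ms = now_ms.saturating_add(delay);
    }

    /// Whether a retry job should pick this session up now.
    pub fn is_due(&self, now_ms: u64) -> bool {
        now_ms >= self.next_retry_time_ms
    }
}
```

The same bookkeeping would apply to the other "wait and retry" branches in steps D and G.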

D (we have device info)

Is the device info cross-signed?

No -> 😴 Wait to see if the device becomes cross-signed soon. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]
Yes -> E

E (we have cross-signed device info)

Do we have the cross-signing key for this user? Yes -> G No -> F

F (we have cross-signed device info, but no cross-signing keys)

Upsert the session with the (cross-signed) device info we have, still marked as not-legacy. Set next_retry_time_ms = now and retry_count = 0.
↗️ Return, and kick off an async task [keep the lock]: run OlmMachine::get_identity, which waits for /keys/query to complete and then fetches this user's cross-signing key from the store. If we now have the cross-signing key, -> G. If we still don’t have a cross-signing key -> 😴 Wait to see if we get one soon. Do nothing; let the background job pick it up [drop the lock]

G (we have cross-signing key)

Does the cross-signing key match that used to sign the device info?

Yes -> H
No -> 😴 Wait to see if the cross-signing key is updated soon. Increment retry_count and set next_retry_time_ms per backoff algorithm; let the background job pick it up [drop the lock]

H (cross-signing key matches that used to sign the device info!)

Is the signature in the device info (ed25519:<ssk_id>) valid (SelfSigningPubKey::verify_device_keys)?

Yes -> J
No -> ❗Session is invalid: drop it from the store and forget it (also the device???)

J (device info is verified by matching cross-signing key)

Look up the MXID and MSK for the user sending the to-device message.
Decide the MSK trust level based on whether we have verified this user (matrix_sdk_crypto::identities::user::UserIdentity::is_verified).
Upsert the session including the MXID, MSK and trust level. Remove the device info and retries since we don't need them.
Add this information to the sender_data.
[drop the lock]

Note: the sender data may become out-of-date if we later verify the user. We have no plans to update it if so.
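
The branching in steps A to J can be condensed into a single pure function. The sketch below is a reading aid, not the SDK's implementation: Known, Outcome and next_step are hypothetical names, and each boolean stands for the answer to one of the questions asked above.

```rust
/// Facts known about a session's sender at one point in time.
/// All names are hypothetical; they mirror the questions in steps A to J.
#[derive(Debug, Clone, Copy, Default)]
pub struct Known {
    pub have_device_info: bool,       // A/B: from the message or the store
    pub device_cross_signed: bool,    // D
    pub have_cross_signing_key: bool, // E
    pub key_matches_signer: bool,     // G
    pub signature_valid: bool,        // H
    pub user_verified: bool,          // J
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// C: save the session without device info and fetch the user's devices.
    FetchDevices,
    /// F: save the device info and fetch the user's cross-signing key.
    FetchIdentity,
    /// D or G "No": wait and let a later retry pick the session up.
    RetryLater,
    /// H "No": the session is invalid and should be dropped.
    Invalid,
    /// J: store the MXID, master key and its trust level.
    SenderKnown { master_key_verified: bool },
}

/// Walks the questions in the same order as the steps above.
pub fn next_step(k: Known) -> Outcome {
    if !k.have_device_info {
        return Outcome::FetchDevices;
    }
    if !k.device_cross_signed {
        return Outcome::RetryLater;
    }
    if !k.have_cross_signing_key {
        return Outcome::FetchIdentity;
    }
    if !k.key_matches_signer {
        return Outcome::RetryLater;
    }
    if !k.signature_valid {
        return Outcome::Invalid;
    }
    Outcome::SenderKnown { master_key_verified: k.user_verified }
}
```

Keeping the decision separate from the store and network calls makes each branch easy to test on its own.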


Part of #3543. Builds on top of #3556.

Implements the "fast lane" as described in #3544.

This will begin to populate `InboundGroupSession`s with the new `SenderData` struct introduced in #3556, but it will only do so when the information is already available in the store.

Future PRs for this issue will query Matrix APIs using spawned async tasks. Future issues will do retries and migration of old sessions.

Signed-off-by: Andy Balaam <mail@artificialworlds.net>
Co-authored-by: Damir Jelić <poljar@termina.org.uk>

andybalaam · 2024-07-16T15:47:40Z

Given that the background task stuff is no longer relevant, this is done.

This was referenced Jun 13, 2024

Sender Data: Background task to retry fetching sender data for megolm sessions #3546

Closed

Store sender data for Megolm sessions as per Invisible crypto #3544

Closed

andybalaam self-assigned this Jun 17, 2024

This was referenced Jun 20, 2024

crypto: Fetch immediately-available sender data when we receive a room key #3590

Merged

crypto: Storage changes for keeping sender data with InboundGroupSessions #3556

Merged

andybalaam mentioned this issue Jul 10, 2024

crypto: Rename msk to master_key for consistency with the wider codebase #3677

Merged

andybalaam closed this as completed Jul 16, 2024

BillCarsonFr mentioned this issue Sep 6, 2024

[Shields] Shields disappearing when a session is removed element-hq/element-x-android#3411

Closed
