This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

New homeserver doesn't know about cross-signing keys created before it was set up #7504

Closed
babolivier opened this issue May 14, 2020 · 18 comments · Fixed by #7594
Labels
z-bug (Deprecated Label)

Comments

@babolivier
Contributor

Not sure if this is related, but I recently set up a new homeserver B and am missing everyone's master_keys and self_signing_keys except for people who have since set up their E2E cross signing keys.

When I compared a sample user in device_lists_remote_extremeties I found that their stream_id was much larger (59005256) than on my long-running homeserver A (26308567). Unfortunately I can't check the other end (as it's matrix.org).

I tested checking out #7453 and inserted their user_id into device_lists_remote_resync. After it was removed from the table, I still didn't have master_keys or self_signing_keys for the user. However, I did notice a timeout entry for the user_id in the log:

2020-05-12 22:04:53,641 - synapse.http.matrixfederationclient - 408 - INFO -  - {GET-O-3986} [matrix.org] Sending request: GET matrix://matrix.org/_matrix/federation/v1/user/devices/<user_id>; timeout 30.000000s
2020-05-12 22:04:53,651 - synapse.handlers.presence - 343 - INFO - persist_presence_changes-1 - Persisting 3 unpersisted presence updates
2020-05-12 22:04:53,809 - synapse.http.matrixfederationclient - 164 - INFO -  - {GET-O-3986} [matrix.org] Completed: 200 OK
2020-05-12 22:04:53,810 - synapse.storage.database - 527 - WARNING -  - Starting db txn 'update_remote_device_list_cache' from sentinel context
2020-05-12 22:04:53,810 - synapse.storage.database - 566 - WARNING -  - Starting db connection from sentinel context: metrics will be lost
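
For readers unfamiliar with the manual step described above (inserting a user_id into device_lists_remote_resync), here is a minimal sketch of what that could look like. It runs against an in-memory SQLite database so it is self-contained. The two-column schema (user_id, added_ts) and the helper name queue_resync are assumptions made for illustration, not something stated in this thread, so check your own database before adapting it. INSERT OR IGNORE is SQLite syntax; a PostgreSQL deployment would need ON CONFLICT DO NOTHING instead.

```python
import sqlite3
import time

# ASSUMPTION: a two-column schema modelled on the table named in this thread.
# Verify the real column names on your own homeserver before running anything.
SCHEMA = """
CREATE TABLE device_lists_remote_resync (
    user_id TEXT NOT NULL,
    added_ts BIGINT NOT NULL
);
CREATE UNIQUE INDEX device_lists_remote_resync_idx
    ON device_lists_remote_resync (user_id);
"""


def queue_resync(conn, user_ids):
    """Mark each remote user as needing a device list resync (hypothetical helper).

    Returns the number of newly queued users; already-queued users are skipped.
    """
    now_ms = int(time.time() * 1000)
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO device_lists_remote_resync (user_id, added_ts) "
        "VALUES (?, ?)",
        [(user_id, now_ms) for user_id in user_ids],
    )
    conn.commit()
    return conn.total_changes - before


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# The duplicate entry is ignored thanks to the unique index.
queued = queue_resync(
    conn, ["@alice:example.org", "@bob:example.org", "@alice:example.org"]
)
print(queued)
```

As the rest of the thread explains, queuing a resync this way only helps once the homeserver actually stores the cross-signing keys returned by the remote server.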

Originally posted by @flackr in #7418 (comment)

@babolivier
Contributor Author

@flackr also says:

I checked with one of my friends who runs a homeserver. My server does not exist in their device_lists_outbound_last_success table, but their server's users do exist in my device_lists_remote_extremeties, though their keys aren't on my homeserver.

@turt2live
Member

So, it looks like t2bot.io is running into this (not a new homeserver, but it would meet the criteria of 'new' here). For background: t2bot.io only stopped dropping EDUs on the floor yesterday, which means it would have dropped all device list updates and the like, so it is effectively starting from scratch.

I did hack in some support for a device list cache purge (t2bot@41af03f), though this doesn't appear to help when used.

For some people this is fine, like when I verified my own device after a little while of turning on the support for EDUs again. For most this issue appears to come up.

@babolivier
Contributor Author

babolivier commented May 28, 2020

I think I've managed to track this bug down, which is that we were not processing cross-signing keys when resyncing a device list. I've opened #7594 which should fix this.
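
To make the description above concrete, here is a rough sketch of what "processing cross-signing keys when resyncing a device list" could mean. It is not Synapse's actual code: the function names, the in-memory caches, and the sample payload are all hypothetical. The only thing taken from the Matrix federation API is that the response to GET /_matrix/federation/v1/user/devices/{userId} can carry optional master_key and self_signing_key fields alongside devices.

```python
from typing import Any, Dict, Optional, Tuple


def extract_cross_signing_keys(
    response: Dict[str, Any],
) -> Tuple[Optional[dict], Optional[dict]]:
    """Pull the optional cross-signing keys out of a device list response."""
    return response.get("master_key"), response.get("self_signing_key")


def resync_user(
    response: Dict[str, Any],
    device_cache: Dict[str, dict],
    key_cache: Dict[str, dict],
) -> None:
    """Hypothetical resync step: store the devices AND the cross-signing keys."""
    user_id = response["user_id"]

    # Replace the cached device list for this user.
    device_cache[user_id] = {
        device["device_id"]: device for device in response.get("devices", [])
    }

    # The step this issue is about: do not drop the keys on the floor.
    master_key, self_signing_key = extract_cross_signing_keys(response)
    if master_key is not None:
        key_cache.setdefault(user_id, {})["master"] = master_key
    if self_signing_key is not None:
        key_cache.setdefault(user_id, {})["self_signing"] = self_signing_key


# Made-up payload; the key material is a placeholder, not real data.
sample_response = {
    "user_id": "@alice:example.org",
    "stream_id": 5,
    "devices": [{"device_id": "ABCDEFG", "keys": {}}],
    "master_key": {
        "user_id": "@alice:example.org",
        "usage": ["master"],
        "keys": {"ed25519:placeholder": "placeholder"},
    },
    "self_signing_key": {
        "user_id": "@alice:example.org",
        "usage": ["self_signing"],
        "keys": {"ed25519:placeholder2": "placeholder2"},
    },
}

device_cache: Dict[str, dict] = {}
key_cache: Dict[str, dict] = {}
resync_user(sample_response, device_cache, key_cache)
print(sorted(key_cache["@alice:example.org"]))
```

A resync that only updated device_cache would leave key_cache empty, which matches the symptom reported in this issue.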

@flackr

flackr commented May 28, 2020

Very excited for this, thanks! So after this patch inserting into device_lists_remote_resync will sync the cross signing keys?

@babolivier
Contributor Author

It should, yes. However, @turt2live told me that this patch might be malfunctioning, so I'm going to investigate this today.

@babolivier
Contributor Author

(the reason for Travis's issue was human error, and the patch seems to work for his HS)

@turt2live
Member

(that human error was using the wrong database ftr. Terminal windows are hard to use.)

@flackr

flackr commented Jun 1, 2020

Thanks for fixing! I'm looking forward to being able to verify people after the next synapse update.

On a related note, I assume this only fixes the issue if you are the administrator of your synapse HS and can insert into device lists for remote resync, and it's a pretty confusing issue. Is there a plan for the HS to be able to automatically resolve missing keys?

I.e. if matrix.org hadn't picked up the cross signing keys for someone I wouldn't be able to fix it.

@babolivier
Contributor Author

babolivier commented Jun 1, 2020

On a related note, I assume this only fixes the issue if you are the administrator of your synapse HS and can insert into device lists for remote resync

That's mostly true. If the remote server needs to resync for some reason (e.g. it missed an update) then it's going to save cross-signing keys. Though it doesn't mean it's going to catch up for every user it missed.

Is there a plan for the HS to be able to automatically resolve missing keys?

I'm afraid there is no realistic way that I know of for a server to figure out which users it's missing keys for, as it believes it's got up-to-date device lists for every user (except where it knows it doesn't, in which case it'll retry them automatically). I don't really see how we could make Synapse fetch the missing keys without requesting the device list of every single user it knows about.

I know this isn't an ideal answer. Hopefully we caught that bug early enough so that not too many people will be bitten by it.

@flackr

flackr commented Jun 1, 2020

I was thinking that at the time of people sending verification requests would be a great time to check validity of current device lists / signatures. Perhaps the request could be signed and if the HS can't verify the signature it tries to resync the device list?

@flackr

flackr commented Jun 1, 2020

I think this is a good time because in most cases this issue shows up when you try to verify someone on a homeserver that isn't aware of your keys yet, and if for some reason your keys / device lists get out of sync the likely action a user would take is to try to reverify that person who is no longer verified.

@flackr

flackr commented Jun 9, 2020

Any word on when 1.15 will be released (presumably with this fix)? 😇

@babolivier
Contributor Author

RC1 got released today (https://github.com/matrix-org/synapse/releases/tag/v1.15.0rc1), so it should be out later this week or early next :)

@flackr

flackr commented Jun 12, 2020

I can confirm that resyncing the device lists of all of the users missing keys on 1.15 worked perfectly. Thanks again for fixing this!
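
For anyone wanting to do something similar, here is one possible way to build a list of candidate users: remote users whose device list is cached but for whom no master key is stored. This is only a sketch against an in-memory SQLite database. The column layouts of device_lists_remote_extremeties and especially e2e_cross_signing_keys (a table not mentioned in this thread) are assumptions, and the helper name is hypothetical. As noted earlier in the thread, the server cannot tell a user whose keys were missed from one who never set up cross-signing, so the result will likely include both.

```python
import sqlite3

# ASSUMPTION: simplified table layouts for illustration only.
SCHEMA = """
CREATE TABLE device_lists_remote_extremeties (
    user_id TEXT NOT NULL,
    stream_id TEXT NOT NULL
);
CREATE TABLE e2e_cross_signing_keys (
    user_id TEXT NOT NULL,
    keytype TEXT NOT NULL,
    keydata TEXT NOT NULL,
    stream_id BIGINT NOT NULL
);
"""


def users_without_master_key(conn):
    """Return remote users with a cached device list but no stored master key."""
    rows = conn.execute(
        """
        SELECT ex.user_id
        FROM device_lists_remote_extremeties AS ex
        LEFT JOIN e2e_cross_signing_keys AS k
            ON k.user_id = ex.user_id AND k.keytype = 'master'
        WHERE k.user_id IS NULL
        ORDER BY ex.user_id
        """
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO device_lists_remote_extremeties VALUES (?, ?)",
    [("@alice:example.org", "10"), ("@bob:example.org", "11")],
)
# Only alice has a master key stored locally.
conn.execute(
    "INSERT INTO e2e_cross_signing_keys VALUES (?, ?, ?, ?)",
    ("@alice:example.org", "master", "{}", 1),
)

candidates = users_without_master_key(conn)
print(candidates)
```

The resulting user IDs could then be inserted into device_lists_remote_resync, at the cost of one device list request per user.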

@babolivier
Contributor Author

babolivier commented Jun 12, 2020

Very glad to hear it!! :)

@babolivier
Contributor Author

Just reacting to this now because I totally missed it then (sorry!):

I was thinking that at the time of people sending verification requests would be a great time to check validity of current device lists / signatures

This would be a great idea, unfortunately it's not possible to act on that at the homeserver level given that in encrypted rooms verification requests are sent as m.room.encrypted. So the homeserver isn't able to differentiate a verification request from a simple message.

This would probably be a nice thing to have at the client level, though. It would require a client endpoint, so we'd need to spec it before we implement anything. I'm not saying it's not worth the trouble, because it definitely is, and iirc that feature was already requested some time ago (though I can't find the issue anymore), just that it's not likely to land now now now.
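
To illustrate the point about m.room.encrypted, here is a small sketch of what a homeserver can and cannot tell from an event without decrypting it. The function name and the sample events are hypothetical and heavily trimmed; the ciphertext is a placeholder.

```python
def visible_event_kind(event: dict) -> str:
    """Classify an event using only what is visible without decryption."""
    event_type = event.get("type")
    if event_type == "m.room.encrypted":
        # The real type lives inside the ciphertext, out of the server's reach.
        return "encrypted (payload type unknown)"
    msgtype = event.get("content", {}).get("msgtype")
    if event_type == "m.room.message" and msgtype == "m.key.verification.request":
        return "verification request"
    return "other"


# In an unencrypted room the request is recognisable...
plaintext_request = {
    "type": "m.room.message",
    "content": {"msgtype": "m.key.verification.request", "to": "@bob:example.org"},
}

# ...but in an encrypted room the server only sees an opaque envelope.
encrypted_event = {
    "type": "m.room.encrypted",
    "content": {
        "algorithm": "m.megolm.v1.aes-sha2",
        "ciphertext": "<opaque placeholder>",
    },
}

print(visible_event_kind(plaintext_request))
print(visible_event_kind(encrypted_event))
```

In this sketch the encrypted verification request and an ordinary encrypted chat message would be classified identically, which is why the check cannot be triggered server-side.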

@babolivier
Contributor Author

babolivier commented Jun 15, 2020

FYI, I've opened MSC2638 to fix the specific issue of not always being able to tell the homeserver to resync.

@flackr

flackr commented Jun 15, 2020

That's great, thanks for the update!

4 participants