This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Current state oscillating between join and leave states cause confusing /sync responses (for clients) #8434

Closed
anoadragon453 opened this issue Oct 1, 2020 · 8 comments
Labels
z-bug (Deprecated Label) z-major (Deprecated Label) z-p2 (Deprecated Label)

Comments

@anoadragon453
Member

anoadragon453 commented Oct 1, 2020

Description

@bwindels reported that upon initial sync he was seeing some weirdness with regard to the #synapse:matrix.org room, which was upgraded in the past from v1 to v4. The room ID for the v1 room is !HsxjoYRFsDtWBgDQPh:matrix.org, while the v4 room is !mjbDjyNsRXndKLkHIe:matrix.org.

The weirdness he was seeing was that the v4 room showed up in the leave section during initial sync, yet events from it were showing up in the join section of incremental syncs (not in initial sync). What's more, his client thought he was in the room, reporting the latest membership event to be:

{
  "content": {
    "avatar_url": "mxc://matrix.org/aerWVfICBMcyFcEyREcivLuI",
    "displayname": "Bruno",
    "membership": "join"
  },
  "origin_server_ts": 1560419507930,
  "sender": "@bwindels:matrix.org",
  "state_key": "@bwindels:matrix.org",
  "type": "m.room.member",
  "unsigned": {
    "age": 39661833298
  },
  "event_id": "$36bgSvxvl7Tuq-Y8op7gHARa1pA5MgsWxNEDHdAeMhs",
  "room_id": "!mjbDjyNsRXndKLkHIe:matrix.org"
}

However trying to send messages into the room failed with:

{
  "errcode": "M_FORBIDDEN",
  "error": "User @bwindels:matrix.org not in room !mjbDjyNsRXndKLkHIe:matrix.org (<FrozenEventV3 event_id='$3h7iJZGFOFjO74NTSGkyTZZJ7zU1lAqzEAwq63saTls', type='m.room.member', state_key='@bwindels:matrix.org'>)"
}

Querying matrix.org's database, we find that some tables think he is in the room, while local_current_membership thinks he is not:

matrix=> select * from users_in_public_rooms where user_id = '@bwindels:matrix.org' and room_id = '!mjbDjyNsRXndKLkHIe:matrix.org';
       user_id        |            room_id             
----------------------+--------------------------------
 @bwindels:matrix.org | !mjbDjyNsRXndKLkHIe:matrix.org
(1 row)

matrix=> select * from room_memberships where sender = '@bwindels:matrix.org' and room_id = '!mjbDjyNsRXndKLkHIe:matrix.org';
                   event_id                   |       user_id        |        sender        |            room_id             | membership | forgotten | display_name |                avatar_url                 
----------------------------------------------+----------------------+----------------------+--------------------------------+------------+-----------+--------------+-------------------------------------------
 $36bgSvxvl7Tuq-Y8op7gHARa1pA5MgsWxNEDHdAeMhs | @bwindels:matrix.org | @bwindels:matrix.org | !mjbDjyNsRXndKLkHIe:matrix.org | join       |         0 | Bruno        | mxc://matrix.org/aerWVfICBMcyFcEyREcivLuI
(1 row)

matrix=> select * from local_current_membership where user_id = '@bwindels:matrix.org' and room_id = '!mjbDjyNsRXndKLkHIe:matrix.org';
            room_id             |       user_id        |                   event_id                   | membership 
--------------------------------+----------------------+----------------------------------------------+------------
 !mjbDjyNsRXndKLkHIe:matrix.org | @bwindels:matrix.org | $3h7iJZGFOFjO74NTSGkyTZZJ7zU1lAqzEAwq63saTls | leave
(1 row)

The leave event in question ($3h7iJZGFOFjO74NTSGkyTZZJ7zU1lAqzEAwq63saTls) has the body of:

                   event_id                   |            room_id             |        internal_metadata        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                      json                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | format_version 
----------------------------------------------+--------------------------------+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------
 $3h7iJZGFOFjO74NTSGkyTZZJ7zU1lAqzEAwq63saTls | !mjbDjyNsRXndKLkHIe:matrix.org | {"stream_ordering": 1202884248} | {"auth_events": ["$36bgSvxvl7Tuq-Y8op7gHARa1pA5MgsWxNEDHdAeMhs", "$44vIat4gq6zdBgaaPeWPRZDjuU09wigV-dDKUhh4ffw", "$6HynbprLKscfJ208llg1BIcNjokH2r8SD9vV8snQKDI", "$VHLoOcFcMNwaE5VFn7mT10vhRIMh6J_D-CST_cGe-0M"], "content": {"membership": "leave", "reason": "User has been idle for 30+ days."}, "depth": 43688, "hashes": {"sha256": "phDZh5igT3Nz7JwKOnAjNn0Icrk8jLAjHKERYe2OuzY"}, "origin": "matrix.org", "origin_server_ts": 1577453605402, "prev_events": ["$4KCCa2MX-iFq7fbw3G9wfuRUuErnBX5IyJUoYCDXxro"], "prev_state": [], "room_id": "!mjbDjyNsRXndKLkHIe:matrix.org", "sender": "@appservice-irc:matrix.org", "state_key": "@bwindels:matrix.org", "type": "m.room.member", "signatures": {"matrix.org": {"ed25519:a_RXGa": "NyteqCKOtAdxaHbRoJ4CXcujJ9O+P0yIoB+YaUYOBgBhvOVu0IIPP0n+eNSyJtogHaXklrzqG7p1090YbkMzAA"}}, "unsigned": {"age_ts": 1577453605402, "replaces_state": "$36bgSvxvl7Tuq-Y8op7gHARa1pA5MgsWxNEDHdAeMhs"}} |              3

So what's happened here is that @appservice-irc:matrix.org kicked @bwindels:matrix.org from the room, and that did not propagate to all tables properly.

Looking from my Element Desktop on my own homeserver, the correct leave state event is shown as the latest:

{
  "content": {
    "membership": "leave",
    "reason": "User has been idle for 30+ days."
  },
  "origin_server_ts": 1577453605402,
  "sender": "@appservice-irc:matrix.org",
  "state_key": "@bwindels:matrix.org",
  "type": "m.room.member",
  "unsigned": {
    "replaces_state": "$36bgSvxvl7Tuq-Y8op7gHARa1pA5MgsWxNEDHdAeMhs",
    "age": 23587161173
  },
  "event_id": "$3h7iJZGFOFjO74NTSGkyTZZJ7zU1lAqzEAwq63saTls",
  "room_id": "!mjbDjyNsRXndKLkHIe:matrix.org"
}

So somehow this kick that happened entirely locally has resulted in some corrupted state on matrix.org, and it's slightly breaking Element and Hydrogen.

Version information

  • Homeserver: matrix.org
@ptman
Contributor

ptman commented Oct 6, 2020

I've often wondered at the lack of foreign keys and other constraints.

@clokep
Member

clokep commented Oct 6, 2020

Unfortunately since this happened back in December 2019, we're not going to be able to inspect logs and see if that helps at all.

It looks like all these tables get updated in _update_metadata_tables_txn at the same time so I'm not sure how they can get out of sync.

@bwindels Might be worth trying to leave that room and rejoin to see if it fixes it for you? This wouldn't help with an underlying issue though.

@erikjohnston
Member

It's worth noting that the following isn't doing what it first looks like:

matrix=> select * from room_memberships where sender = '@bwindels:matrix.org' and room_id = '!mjbDjyNsRXndKLkHIe:matrix.org';

That is fetching all membership events in the room sent by @bwindels:matrix.org, not the current membership state of Bruno. It only returns one event because kicks are sent by other users. Querying by user_id (which is a terrible column name for the state_key) produces the following:

matrix=> select * from room_memberships where user_id = '@bwindels:matrix.org' and room_id = '!mjbDjyNsRXndKLkHIe:matrix.org';

                   event_id                   |       user_id        |           sender           |            room_id             | membership | forgotten | display_name |                avatar_url                 
----------------------------------------------+----------------------+----------------------------+--------------------------------+------------+-----------+--------------+-------------------------------------------
 $36bgSvxvl7Tuq-Y8op7gHARa1pA5MgsWxNEDHdAeMhs | @bwindels:matrix.org | @bwindels:matrix.org       | !mjbDjyNsRXndKLkHIe:matrix.org | join       |         0 | Bruno        | mxc://matrix.org/aerWVfICBMcyFcEyREcivLuI
 $3h7iJZGFOFjO74NTSGkyTZZJ7zU1lAqzEAwq63saTls | @bwindels:matrix.org | @appservice-irc:matrix.org | !mjbDjyNsRXndKLkHIe:matrix.org | leave      |         0 |              | 
(2 rows)
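
The difference between the two filters can be reproduced on a toy table. The following is a minimal sketch using an in-memory SQLite database rather than matrix.org's PostgreSQL; the table keeps only the columns discussed here, and the `memberships` helper is hypothetical, not part of Synapse.

```python
import sqlite3

# Toy stand-in for room_memberships (assumption: only the columns
# discussed in this thread are modelled; the real table has more).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE room_memberships ("
    "event_id TEXT, user_id TEXT, sender TEXT, room_id TEXT, membership TEXT)"
)

ROOM = "!mjbDjyNsRXndKLkHIe:matrix.org"
BRUNO = "@bwindels:matrix.org"

# The two rows shown in the query output above.
conn.executemany(
    "INSERT INTO room_memberships VALUES (?, ?, ?, ?, ?)",
    [
        ("$36bgSvxvl7Tuq-Y8op7gHARa1pA5MgsWxNEDHdAeMhs",
         BRUNO, BRUNO, ROOM, "join"),
        ("$3h7iJZGFOFjO74NTSGkyTZZJ7zU1lAqzEAwq63saTls",
         BRUNO, "@appservice-irc:matrix.org", ROOM, "leave"),
    ],
)


def memberships(column):
    """Hypothetical helper: membership values where `column` matches Bruno."""
    # The column name is interpolated only because it is checked against
    # a fixed allow-list; the values themselves are bound as parameters.
    if column not in ("sender", "user_id"):
        raise ValueError(column)
    rows = conn.execute(
        f"SELECT membership FROM room_memberships "
        f"WHERE {column} = ? AND room_id = ? ORDER BY rowid",
        (BRUNO, ROOM),
    )
    return [row[0] for row in rows]


by_sender = memberships("sender")      # events Bruno sent himself
by_state_key = memberships("user_id")  # events about Bruno (the state_key)
print(by_sender)
print(by_state_key)
```

Filtering on `sender` misses the kick because it was sent by the appservice; filtering on `user_id` returns both the join and the leave.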

@clokep clokep removed their assignment Oct 15, 2020
@erikjohnston erikjohnston self-assigned this Oct 19, 2020
@erikjohnston
Member

Looking at initial sync:

  1. !mjbDjyNsRXndKLkHIe:matrix.org appears in the leave section.
  2. The join appears in the state section (this is expected, as that state is for the start of the timeline).
  3. The timeline section has no membership event for Bruno.

This is a perfectly valid*, albeit confusing, response to get. Usually you'd see the leave in the timeline section, but it doesn't have to be there if the state has changed for another reason, e.g. as a result of state resolution, which is what appears to have happened here.

(* well, I don't think Synapse is working quite as intended here, but I think this is valid according to the spec...)

What's more, his client thought he was in the room, ....

This is probably because the client isn't correctly handling the above case.
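
One way a client could cope with this case is to take the user's membership from the section of the /sync response that the room appears in, rather than from the last m.room.member event it happens to have seen. The following is a rough sketch of that idea only; `membership_from_sync` is a hypothetical helper, the response is heavily simplified, and this is not a description of how Element or Hydrogen actually behave.

```python
from typing import Optional


def membership_from_sync(sync_response: dict, room_id: str) -> Optional[str]:
    """Return the membership implied by the section containing the room."""
    rooms = sync_response.get("rooms", {})
    for section in ("join", "invite", "leave"):
        if room_id in rooms.get(section, {}):
            return section
    return None  # the room is not mentioned in this response


ROOM = "!mjbDjyNsRXndKLkHIe:matrix.org"

# Simplified shape of the initial sync described above: the room is in
# the leave section, the state section still carries the old join, and
# the timeline has no membership event for the user.
initial_sync = {
    "rooms": {
        "leave": {
            ROOM: {
                "state": {
                    "events": [
                        {
                            "type": "m.room.member",
                            "state_key": "@bwindels:matrix.org",
                            "content": {"membership": "join"},
                        }
                    ]
                },
                "timeline": {"events": []},
            }
        }
    }
}

result = membership_from_sync(initial_sync, ROOM)
print(result)
```

With this approach the stale join in the state section does not override the fact that the server listed the room under leave.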

@erikjohnston
Member

Looking at the DB, the current state appears to be oscillating between the join and the leave, which would explain all of this.

@erikjohnston erikjohnston changed the title from "Some Synapse DB tables think the user is joined to a room, and some think they've left after appservice IRC idle kick" to "Current state oscillating between join and leave states cause confusing /sync responses (for clients)" Oct 19, 2020
@erikjohnston erikjohnston removed their assignment Oct 22, 2020
@richvdh
Member

richvdh commented Oct 22, 2020

presumably related to #8629 (causing the oscillation) and matrix-org/matrix-spec#1209 (state changes due to state res aren't clearly propagated to clients)

@neilisfragile neilisfragile added the z-p2 (Deprecated Label) label Oct 22, 2020
@thegcat
Contributor

thegcat commented Jan 18, 2021

I too seem to have !mjbDjyNsRXndKLkHIe:matrix.org show up in at least my Element Desktop each time I start it. The room also shows up as Github [@root:matrix.echelon4.xyz] (@_neb_github_=40root=3amatrix.echelon4.xyz:matrix.org) and 5013 others, not anything that would indicate that it's Synapse HQ.

Anything I can do to help debug this and/or make sure I stay out of the room? :-)

@richvdh
Member

richvdh commented Apr 9, 2021

I'm going to assume this is a somewhat expected artifact of #8629, and there's not much that can really be done about it otherwise.

@richvdh richvdh closed this as completed Apr 9, 2021

8 participants