-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Prevent join->join membership transitions changing member count #7977
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the right fix be to change the if
on line 244 into an elif
? It seems to me that this is what's causing the bug (i.e. not no-oping if the membership doesn't change).
I'm not sure bailing out early buys us much at this point, but if you're still really keen on doing so it should probably be
if membership == prev_membership:
continue
instead, and getting rid of the similar conditional clauses lower in the function.
Oh and also could you add a test please? |
Oh, yes indeed! I completely missed that. Thanks 🙂
👍 |
Seems that running
works to fix the broken room counts, and runs in the background during operation rather than blocking startup. |
Applied your fix and added a test @babolivier 🙂 Still not sure what to do w.r.t the background update however. It will likely take a long time on big homeservers, but it does run in the background. I'm tempted to just chuck it in migration step. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code lgtm!
Still not sure what to do w.r.t the background update however. It will likely take a long time on big homeservers, but it does run in the background. I'm tempted to just chuck it in migration step.
What exactly are you referring to with "in migration step"? A schema delta?
I'd be in favour of just making homeservers rerun the update, but it's probably better to get other opinions in #synapse-dev first.
Conclusion from #synapse-dev is that we're going to include a delta to re-run the update. However, it's important to note that the Additionally, if you've upgraded to v1.18.0 (which doesn't include this fix), the earlier bg job runs, and then update to v1.19.0, you'd end up with only half of your rooms having room stats recalculated after this fix was in place. So the proposed solution is to switch the old |
by creating a new bg job, and converting the old one to a no-op. Reasoning behind this is explained in the PR comments
…bership_join_count * 'develop' of github.com:matrix-org/synapse: Update workers docs (#7990) Fix invite rejection when we have no forward-extremeties (#7980) Fix typo in docs/workers.md (#7992) Convert federation client to async/await. (#7975) Convert appservice to async. (#7973) Convert some of the data store to async. (#7976) Fix formatting of changelog and upgrade notes Ensure that remove_pusher is always async (#7981) Add deprecation warnings 1.18.0 Update worker docs with recent enhancements (#7969) Ensure the msg property of HttpResponseException is a string. (#7979) Remove from the event_relations table when purging historical events. (#7978) Add additional logging for SAML sessions. (#7971) Add MSC reference to changelog for #7736 Re-implement unread counts (#7736) Various improvements to the docs (#7899)
I've requested the two people involved in the discussion linked above so one can check what I've done matches up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
I'm not completely sure this is the only bug in the room lists, but it's obviously a big one!
synapse/storage/data_stores/main/schema/delta/58/12room_stats.sql
Outdated
Show resolved
Hide resolved
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
I think the title of the linked bug is a bit misleading, as the discussing therein seems to only discuss the accidental increasing room members. If this continues to happen via another bug then we can open a new issue. |
Synapse 1.19.0rc1 (2020-08-13) ============================== Removal warning --------------- As outlined in the [previous release](https://github.com/matrix-org/synapse/releases/tag/v1.18.0), we are no longer publishing Docker images with the `-py3` tag suffix. On top of that, we have also removed the `latest-py3` tag. Please see [the announcement in the upgrade notes for 1.18.0](https://github.com/matrix-org/synapse/blob/develop/UPGRADE.rst#upgrading-to-v1180). Features -------- - Add option to allow server admins to join rooms which fail complexity checks. Contributed by @lugino-emeritus. ([\#7902](#7902)) - Add an option to purge room or not with delete room admin endpoint (`POST /_synapse/admin/v1/rooms/<room_id>/delete`). Contributed by @dklimpel. ([\#7964](#7964)) - Add rate limiting to users joining rooms. ([\#8008](#8008)) - Add a `/health` endpoint to every configured HTTP listener that can be used as a health check endpoint by load balancers. ([\#8048](#8048)) - Allow login to be blocked based on the values of SAML attributes. ([\#8052](#8052)) - Allow guest access to the `GET /_matrix/client/r0/rooms/{room_id}/members` endpoint, according to MSC2689. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#7314](#7314)) Bugfixes -------- - Fix a bug introduced in Synapse v1.7.2 which caused inaccurate membership counts in the room directory. ([\#7977](#7977)) - Fix a long standing bug: 'Duplicate key value violates unique constraint "event_relations_id"' when message retention is configured. ([\#7978](#7978)) - Fix "no create event in auth events" when trying to reject invitation after inviter leaves. Bug introduced in Synapse v1.10.0. ([\#7980](#7980)) - Fix various comments and minor discrepencies in server notices code. ([\#7996](#7996)) - Fix a long standing bug where HTTP HEAD requests resulted in a 400 error. ([\#7999](#7999)) - Fix a long-standing bug which caused two copies of some log lines to be written when synctl was used along with a MemoryHandler logger. ([\#8011](#8011), [\#8012](#8012)) Updates to the Docker image --------------------------- - We no longer publish Docker images with the `-py3` tag suffix, as [announced in the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/UPGRADE.rst#upgrading-to-v1180). ([\#8056](#8056)) Improved Documentation ---------------------- - Document how to set up a client .well-known file and fix several pieces of outdated documentation. ([\#7899](#7899)) - Improve workers docs. ([\#7990](#7990), [\#8000](#8000)) - Fix typo in `docs/workers.md`. ([\#7992](#7992)) - Add documentation for how to undo a room shutdown. ([\#7998](#7998), [\#8010](#8010)) Internal Changes ---------------- - Reduce the amount of whitespace in JSON stored and sent in responses. Contributed by David Vo. ([\#7372](#7372)) - Switch to the JSON implementation from the standard library and bump the minimum version of the canonicaljson library to 1.2.0. ([\#7936](#7936), [\#7979](#7979)) - Convert various parts of the codebase to async/await. ([\#7947](#7947), [\#7948](#7948), [\#7949](#7949), [\#7951](#7951), [\#7963](#7963), [\#7973](#7973), [\#7975](#7975), [\#7976](#7976), [\#7981](#7981), [\#7987](#7987), [\#7989](#7989), [\#8003](#8003), [\#8014](#8014), [\#8016](#8016), [\#8027](#8027), [\#8031](#8031), [\#8032](#8032), [\#8035](#8035), [\#8042](#8042), [\#8044](#8044), [\#8045](#8045), [\#8061](#8061), [\#8062](#8062), [\#8063](#8063), [\#8066](#8066), [\#8069](#8069), [\#8070](#8070)) - Move some database-related log lines from the default logger to the database/transaction loggers. ([\#7952](#7952)) - Add a script to detect source code files using non-unix line terminators. ([\#7965](#7965), [\#7970](#7970)) - Log the SAML session ID during creation. ([\#7971](#7971)) - Implement new experimental push rules for some users. ([\#7997](#7997)) - Remove redundant and unreliable signature check for v1 Identity Service lookup responses. ([\#8001](#8001)) - Improve the performance of the register endpoint. ([\#8009](#8009)) - Reduce less useful output in the newsfragment CI step. Add a link to the changelog section of the contributing guide on error. ([\#8024](#8024)) - Rename storage layer objects to be more sensible. ([\#8033](#8033)) - Change the default log config to reduce disk I/O and storage for new servers. ([\#8040](#8040)) - Add an assertion on `prev_events` in `create_new_client_event`. ([\#8041](#8041)) - Add a comment to `ServerContextFactory` about the use of `SSLv23_METHOD`. ([\#8043](#8043)) - Log `OPTIONS` requests at `DEBUG` rather than `INFO` level to reduce amount logged at `INFO`. ([\#8049](#8049)) - Reduce amount of outbound request logging at `INFO` level. ([\#8050](#8050)) - It is no longer necessary to explicitly define `filters` in the logging configuration. (Continuing to do so is redundant but harmless.) ([\#8051](#8051)) - Add and improve type hints. ([\#8058](#8058), [\#8064](#8064), [\#8060](#8060), [\#8067](#8067))
* commit '5d92a1428': Prevent join->join membership transitions changing member count (#7977)
Fixes #1294
StatsHandler
handles updates to thecurrent_state_delta_stream
, and updates room stats such as the amount of state events, joined users, etc.However, it counts every new join membership as a new user entering a room (and that user being in another room), whereas it's possible for a user's membership status to go from join -> join, for instance when they change their per-room profile information.
This PR adds a check for join->join membership transitions, and bails out early, as none of the further checks are necessary at that point.
Due to this bug, membership stats in many rooms have ended up being wildly larger than their true values. I am not sure if we also want to include a migration step which recalculates these statistics (possibly using the
_populate_stats_process_rooms
bg update).Bug introduced in the initial implementation #4338.