speed up /members and add at= and membership params #3568

ara4n · 2018-07-20T02:48:39Z

~~builds on #3567 (in turn #3331 (in turn #2970)).~~ (all now merged)

as per the proposal; we can deduplicate redundant lazy-loaded members which are sent in the same sync sequence. we do this heuristically rather than requiring the client to somehow tell us which members it has chosen to cache, by instead caching the last N members sent to a client, and not sending them again. For now we hardcode N to 100. Each cache for a given (user,device) tuple is in turn cached for up to X minutes (to avoid the caches building up). For now we hardcode X to 30.

ara4n · 2018-07-21T11:21:06Z

synapse/storage/state.py

@@ -89,6 +89,69 @@ def _get_current_state_ids_txn(txn):
            _get_current_state_ids_txn,
        )

+    # FIXME: how should this be cached?


I don't understanding the caching stuff to know how best to handle this - or whether this method should be combined with get_current_state_ids above, and somehow share the same caching.

I guess you've got three choices:

treat each call to get_filtered_current_state_ids as a separate cacheable thing - so for example if you do a lookup of get_filtered_current_state_ids(room, None, None), a lookup of get_filtered_current_state_ids(room, [(t, k)], None) would not hit the cache. To do this, you just need an @cached decorator, plus invalidation calls.

Implement this in terms of a call to get_current_state_ids followed by a filter. This might increase cache hits, but cache misses will of course be very expensive.

use a DictionaryCache, like _state_group_cache uses, since DictionaryCache has support for partial caches. The keys would be room ids rather than state group ids. again you'd need to consider invalidation.

Not bother with caching and see how much it hurts?

…azy_members

…y_load_apis

…azy_members

…y_load_apis

this feels much clunkier and more complicated for both clients and senders than just querying based on event_id, and i'm failing to see any way it's more correct than querying based on event_id. also fixes a thinko to check whether the user is allowed to view membership as of the given token

ara4n · 2018-08-12T15:25:11Z

The conclusions of the API shape were roughly:

/members is the right API, as /context provides other unnecessary stuff, /state is needlessly generic (and the boat has sailed on fixing that) and /joined_members is either obsolete or very specifically optimised for a single use case (depending on who you ask).
We need to sync /members to /sync responses to avoid races when doing a bg sync on /members in keeping with responses from the /sync API.

However, as requested out-of-band, I've switched /members?at to take a stream token rather than an event ID. I am failing to justify this to myself as I can't see any bug introduced by using event IDs, and meanwhile the implementation gets more complicated both serverside & clientside by thinking in terms of stream tokens instead. I've done it anyway to try to get this through review before getting blocked any further. matrix-org/sytest#469 has also been updated.

@richvdh PTAL.

richvdh · 2018-08-13T15:40:32Z

However, as requested out-of-band, I've switched /members?at to take a stream token rather than an event ID. I am failing to justify this to myself as I can't see any bug introduced by using event IDs

one question which using event ids opens is: are you talking about the state before or after that event? I think you want the state after the event, but nothing else that takes an event_id does that. Indeed, I don't think "after event X" is powerful enough on its own to define a particular point in the room state.

Consider a room with a diamond in the DAG:

   A
 /   \
B     C
 \   /
   D

B and C are both state events. Does the state after B change once C arrives?

You can solve it by knowing that B's stream ordering is lower than C's, so that a client referring to the "state after B" will not have seen the state at C - but then you are just using the event id as a token in the sync stream, for which we use sync tokens. (A server could just use event ids as its stream tokens. But we don't mandate that, because it leaves more options for server implementations).

richvdh · 2018-08-13T13:42:08Z

synapse/handlers/message.py

+    @defer.inlineCallbacks
+    def get_state_events(
+        self, user_id, room_id, types=None, filtered_types=None,
+        at_token=None, is_guest=False


nit: trailing comma would be nice :/

richvdh · 2018-08-13T14:51:07Z

synapse/handlers/message.py

+                at_token.room_key
+            ).stream
+
+            # XXX: is this the right method to be using?  What id we don't yet have an


no, it's not; as you note, there is a problem if we don't yet have an event after the token, which is moderately likely given my understanding of the proposed usecase (cf https://docs.google.com/document/d/11yn-mAkYll10RJpN0mkYEVqraTbU3U4eQx9MNrzqX1U/edit?disco=AAAACCLJnz0: please can we have an answer there?)

I'd suggest doing the same as SyncHandler.get_state_at (which is basically to call get_recent_events_for_room and then apply the change implied by the event in question - though as the comments there say, that is buggy too :/)

richvdh · 2018-08-13T14:51:23Z

synapse/handlers/message.py

-            room_state = yield self.store.get_state_for_events(
-                [membership_event_id], None
+            # check we are even allowed to be reading the room at this point
+            event = yield self.store.get_event(event_id, allow_none=True)


why allow_none? it's only going to make it explode in a more subtle way

richvdh · 2018-08-13T14:59:57Z

synapse/handlers/message.py

+            event = yield self.store.get_event(event_id, allow_none=True)
+            visible_events = yield filter_events_for_client(self.store, user_id, [event])
+
+            if len(visible_events) > 0:


nit: just if visible_events is more pythonic here

richvdh · 2018-08-13T15:44:54Z

synapse/handlers/message.py

+                )
+                room_state = room_state[event.event_id]
+            else:
+                room_state = {}


What's the logic for returning an empty result rather than a 403?

only that get_state_events() felt like a pretty generic method, and raising a 403 might break stuff upstream. in practice it's only used by /state and /members so i think it's fine - have switched to throw 404 if the token isn't recognised and 403 if the event at that token is filtered out wrt our user.

ara4n · 2018-08-14T20:05:30Z

@richvdh thanks for review - ptal.

richvdh · 2018-08-15T15:16:04Z

synapse/handlers/message.py

-            room_state = yield self.store.get_state_for_events(
-                [membership_event_id], None
+            if not last_events:
+                raise SynapseError(


let's use NotFoundError here

(done in 0d5770d)

richvdh · 2018-08-15T15:23:22Z

synapse/handlers/message.py

@@ -82,14 +82,49 @@ def get_room_data(self, user_id=None, room_id=None,
        defer.returnValue(data)

    @defer.inlineCallbacks
-    def get_state_events(self, user_id, room_id, is_guest=False):
+    def _check_in_room_or_world_readable(self, room_id, user_id):


fixed in f5189b9

richvdh · 2018-08-15T15:27:47Z

I'm going to squash-merge this

…rg#3568)" This reverts commit 2f78f43.

@vojeroen

Features -------- - Add support for the SNI extension to federation TLS connections. Thanks to @vojeroen! ([\#3439](#3439)) - Add /_media/r0/config ([\#3184](#3184)) - speed up /members API and add `at` and `membership` params as per MSC1227 ([\#3568](#3568)) - implement `summary` block in /sync response as per MSC688 ([\#3574](#3574)) - Add lazy-loading support to /messages as per MSC1227 ([\#3589](#3589)) - Add ability to limit number of monthly active users on the server ([\#3633](#3633)) - Support more federation endpoints on workers ([\#3653](#3653)) - Basic support for room versioning ([\#3654](#3654)) - Ability to disable client/server Synapse via conf toggle ([\#3655](#3655)) - Ability to whitelist specific threepids against monthly active user limiting ([\#3662](#3662)) - Add some metrics for the appservice and federation event sending loops ([\#3664](#3664)) - Where server is disabled, block ability for locked out users to read new messages ([\#3670](#3670)) - set admin uri via config, to be used in error messages where the user should contact the administrator ([\#3687](#3687)) - Synapse's presence functionality can now be disabled with the "use_presence" configuration option. ([\#3694](#3694)) - For resource limit blocked users, prevent writing into rooms ([\#3708](#3708)) Bugfixes -------- - Fix occasional glitches in the synapse_event_persisted_position metric ([\#3658](#3658)) - Fix bug on deleting 3pid when using identity servers that don't support unbind API ([\#3661](#3661)) - Make the tests pass on Twisted < 18.7.0 ([\#3676](#3676)) - Don’t ship recaptcha_ajax.js, use it directly from Google ([\#3677](#3677)) - Fixes test_reap_monthly_active_users so it passes under postgres ([\#3681](#3681)) - Fix mau blocking calulation bug on login ([\#3689](#3689)) - Fix missing yield in synapse.storage.monthly_active_users.initialise_reserved_users ([\#3692](#3692)) - Improve HTTP request logging to include all requests ([\#3700](#3700)) - Avoid timing out requests while we are streaming back the response ([\#3701](#3701)) - Support more federation endpoints on workers ([\#3705](#3705), [\#3713](#3713)) - Fix "Starting db txn 'get_all_updated_receipts' from sentinel context" warning ([\#3710](#3710)) - Fix bug where `state_cache` cache factor ignored environment variables ([\#3719](#3719)) - Fix bug in v0.33.3rc1 which caused infinite loops and OOMs ([\#3723](#3723)) - Fix bug introduced in v0.33.3rc1 which made the ToS give a 500 error ([\#3732](#3732)) Deprecations and Removals ------------------------- - The Shared-Secret registration method of the legacy v1/register REST endpoint has been removed. For a replacement, please see [the admin/register API documentation](https://github.com/matrix-org/synapse/blob/master/docs/admin_api/register_api.rst). ([\#3703](#3703)) Internal Changes ---------------- - The test suite now can run under PostgreSQL. ([\#3423](#3423)) - Refactor HTTP replication endpoints to reduce code duplication ([\#3632](#3632)) - Tests now correctly execute on Python 3. ([\#3647](#3647)) - Sytests can now be run inside a Docker container. ([\#3660](#3660)) - Port over enough to Python 3 to allow the sytests to start. ([\#3668](#3668)) - Update docker base image from alpine 3.7 to 3.8. ([\#3669](#3669)) - Rename synapse.util.async to synapse.util.async_helpers to mitigate async becoming a keyword on Python 3.7. ([\#3678](#3678)) - Synapse's tests are now formatted with the black autoformatter. ([\#3679](#3679)) - Implemented a new testing base class to reduce test boilerplate. ([\#3684](#3684)) - Rename MAU prometheus metrics ([\#3690](#3690)) - add new error type ResourceLimit ([\#3707](#3707)) - Logcontexts for replication command handlers ([\#3709](#3709)) - Update admin register API documentation to reference a real user ID. ([\#3712](#3712))

ara4n added 11 commits June 4, 2018 01:04

Merge branch 'develop' into matthew/remove_redundant_lazy_members

c341d81

add include_redundant_members filter option & make it work

f7bd5da

merge and apply isort

589e5aa

merge in #2970

8e66dd1

fix bad merge

a08b37b

make /context lazyload & filter aware

7362e6c

speed up /members and add at= and membership params

cd28d2f

make it work

a17f0b6

Merge branch 'matthew/lazy_load_apis' into matthew/members_at

9ba6ef2

make it work

c6117fa

ara4n added a commit to matrix-org/sytest that referenced this pull request Jul 20, 2018

test /members (for matrix-org/synapse#3568)

88c27a7

ara4n mentioned this pull request Jul 20, 2018

test /members matrix-org/sytest#469

Merged

ara4n requested a review from richvdh July 20, 2018 23:57

ara4n assigned richvdh Jul 20, 2018

make filtering work

8f1585d

ara4n commented Jul 21, 2018

View reviewed changes

ara4n mentioned this pull request Jul 23, 2018

initial cut at a room summary API #3574

Merged

richvdh requested a review from a team July 23, 2018 10:50

ara4n added 6 commits July 23, 2018 19:23

Merge branch 'matthew/filter_members' into matthew/remove_redundant_l…

f9c3c26

…azy_members

Merge branch 'matthew/remove_redundant_lazy_members' into matthew/laz…

c2870ab

…y_load_apis

Merge branch 'matthew/lazy_load_apis' into matthew/members_at

ffb7a4c

changelog

7d99b0e

changelog

cd27a77

changelog

4018a6d

ara4n mentioned this pull request Jul 23, 2018

lazyload aware /messages #3589

Merged

ara4n added 4 commits July 24, 2018 21:53

Merge branch 'matthew/filter_members' into matthew/remove_redundant_l…

d32e5f8

…azy_members

deduplicating redundant members via event_id rather than mxid

238f750

Merge branch 'matthew/remove_redundant_lazy_members' into matthew/laz…

2a79e1a

…y_load_apis

Merge branch 'matthew/lazy_load_apis' into matthew/members_at

48f3e43

richvdh assigned ara4n and unassigned richvdh Jul 27, 2018

ara4n added 2 commits July 30, 2018 09:17

incorporate review

e952368

Merge branch 'develop' into matthew/members_at

2d9c062

ara4n changed the base branch from matthew/lazy_load_apis to develop August 12, 2018 11:12

ara4n assigned richvdh and unassigned ara4n Aug 12, 2018

richvdh suggested changes Aug 13, 2018

View reviewed changes

richvdh assigned ara4n and unassigned richvdh Aug 13, 2018

incorporate PR review

859ad35

ara4n assigned richvdh and unassigned ara4n Aug 14, 2018

ara4n and others added 4 commits August 14, 2018 22:06

Merge branch 'develop' into matthew/members_at

dd4498f

Merge branch 'develop' into matthew/members_at

217d5dd

remove incorrectly reintroduced method

f5189b9

cleanups

0d5770d

richvdh reviewed Aug 15, 2018

View reviewed changes

richvdh approved these changes Aug 15, 2018

View reviewed changes

richvdh merged commit 2f78f43 into develop Aug 15, 2018

krombel added a commit to krombel/synapse that referenced this pull request Aug 15, 2018

Revert "speed up /members and add at= and membership params (matrix-o…

fb0960c

…rg#3568)" This reverts commit 2f78f43.

richvdh mentioned this pull request Aug 16, 2018

Proposal for lazy-loading room members to improve initial sync speed and client RAM usage matrix-org/matrix-spec-proposals#1227

Closed

hawkowl deleted the matthew/members_at branch September 20, 2018 14:01

turt2live mentioned this pull request May 28, 2019

Spec lazy-loading room members matrix-org/matrix-spec-proposals#2035

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

speed up /members and add at= and membership params #3568

speed up /members and add at= and membership params #3568

ara4n commented Jul 20, 2018 •

edited by richvdh

Loading

ara4n Jul 21, 2018

richvdh Jul 27, 2018

ara4n commented Aug 12, 2018

richvdh commented Aug 13, 2018

richvdh Aug 13, 2018

richvdh Aug 13, 2018

richvdh Aug 13, 2018

richvdh Aug 13, 2018

richvdh Aug 13, 2018

ara4n Aug 14, 2018

ara4n commented Aug 14, 2018

richvdh Aug 15, 2018

richvdh Aug 15, 2018

richvdh Aug 15, 2018

richvdh commented Aug 15, 2018

speed up /members and add at= and membership params #3568

speed up /members and add at= and membership params #3568

Conversation

ara4n commented Jul 20, 2018 • edited by richvdh Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ara4n commented Aug 12, 2018

richvdh commented Aug 13, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ara4n commented Aug 14, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

richvdh commented Aug 15, 2018

ara4n commented Jul 20, 2018 •

edited by richvdh

Loading