This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

have_seen_events is eating up time while processing state and auth events while backfilling #13625

Open
Tracked by #15182
MadLittleMods opened this issue Aug 25, 2022 · 0 comments
Labels
A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) A-Performance Performance, both client-facing and admin-facing O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Blocks non-critical functionality, workarounds exist. T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@MadLittleMods
Contributor

MadLittleMods commented Aug 25, 2022

Mentioned in internal doc. Part of #13356


Optimize `have_seen_events`: when backfilling `#matrix:matrix.org`, 20s is spent just calling `have_seen_events` on the 200k state and auth events in the room.

  • have_seen_events (157 db.have_seen_events) takes 6.62s to process 77k events
  • have_seen_events (246 db.have_seen_events) takes 13.19s to process 122k events
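
As a quick sanity check, the two calls above add up to roughly the 20s and 200k events quoted. A minimal sketch of that arithmetic follows; the event counts are the rounded "77k" and "122k" figures from the bullets, so the per-event costs are approximate:

```python
# Figures copied from the two bullet points above (event counts are rounded).
calls = [
    {"seconds": 6.62, "events": 77_000},
    {"seconds": 13.19, "events": 122_000},
]

total_seconds = sum(c["seconds"] for c in calls)
total_events = sum(c["events"] for c in calls)

for c in calls:
    # Average wall-clock cost per event, in microseconds.
    per_event_us = c["seconds"] / c["events"] * 1e6
    print(f'{c["events"]} events: {per_event_us:.0f} µs per event')

print(f"total: {total_seconds:.2f}s for {total_events} events")
```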


The benchmark and timings below are from the in-flight PR, #13561.

The `@cachedList` is so slow 🐌 that we're better off removing it at this point. It would be good to see if `@cachedList` can be improved or if we can improve things further with a better cache.

| # events | Benchmark run | Timing (s) | Timing (s), removing the `@cachedList` cache |
| --- | --- | --- | --- |
| 50k | 1, cold cache | 3.7170820236206055 | 0.3248419761657715 |
| 50k | 2, warm cache | 0.2985079288482666 | 0.32351016998291016 |
| 50k | 3, warm cache | 0.28847789764404297 | 0.3136260509490967 |
| 50k | 4, odds | 0.1537461280822754 | 0.15899014472961426 |
| 50k | 5, odds | 0.14780497550964355 | 0.15054106712341309 |
| 50k | 6, evens | 0.1475691795349121 | 0.15465688705444336 |
| 50k | 7, evens | 0.14868617057800293 | 0.1412408351898193 |
| 100k | 1, cold cache | 8.10055136680603 | 0.8466510772705078 |
| 100k | 2, warm cache | 0.6121761798858643 | 0.8022150993347168 |
| 100k | 3, warm cache | 0.6093218326568604 | 0.7888422012329102 |
| 100k | 4, odds | 0.29950785636901855 | 0.3941817283630371 |
| 100k | 5, odds | 0.3049640655517578 | 0.416118860244751 |
| 100k | 6, evens | 0.3025388717651367 | 0.42328405380249023 |
| 100k | 7, evens | 0.29833483695983887 | 0.3695280551910400 |
| 200k | 1, cold cache | 19.106724977493286 | 1.328582763671875 |
| 200k | 2, warm cache | 22.98161005973816 | 1.279066801071167 |
| 200k | 3, warm cache | 23.126408100128174 | 1.2781598567962646 |
| 200k | 4, odds | 11.401129007339478 | 0.6520607471466064 |
| 200k | 5, odds | 0.6159579753875732 | 0.647273063659668 |
| 200k | 6, evens | 12.087002992630005 | 0.6393017768859863 |
| 200k | 7, evens | 0.6241748332977295 | 0.6427278518676758 |

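
For readers who want to reproduce the shape of these runs, here is a rough sketch of a harness that produces the same seven labels. It is not the benchmark from #13561: `have_seen_events` below is a hypothetical dict-backed stand-in, `run_benchmark` is an invented helper, and the meaning of "odds" and "evens" (every other event ID) is an assumption inferred from the labels.

```python
import time

# Hypothetical stand-ins: a fake "database" and a naive cache in front of it.
# This is NOT Synapse's implementation, only something to time.
_seen_in_db = set()
_cache = {}


def have_seen_events(room_id, event_ids):
    """Return the subset of event_ids that the fake database has seen."""
    seen = set()
    for event_id in event_ids:
        key = (room_id, event_id)
        if key not in _cache:
            _cache[key] = event_id in _seen_in_db
        if _cache[key]:
            seen.add(event_id)
    return seen


def run_benchmark(num_events):
    room_id = "!room:test"
    event_ids = [f"$event{i}" for i in range(num_events)]
    _seen_in_db.clear()
    _seen_in_db.update(event_ids)
    _cache.clear()  # so that run 1 really starts cold

    runs = [
        ("1, cold cache", event_ids),
        ("2, warm cache", event_ids),
        ("3, warm cache", event_ids),
        ("4, odds", event_ids[1::2]),
        ("5, odds", event_ids[1::2]),
        ("6, evens", event_ids[::2]),
        ("7, evens", event_ids[::2]),
    ]
    timings = {}
    for label, ids in runs:
        start = time.perf_counter()
        result = have_seen_events(room_id, ids)
        timings[label] = time.perf_counter() - start
        assert len(result) == len(ids)
        print(f"Benchmark time ({label}): {timings[label]}")
    return timings
```

Calling `run_benchmark(50_000)` prints one line per run in the same format as the results above; the absolute numbers will not match, since the stand-in does no real database work.
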
@MadLittleMods MadLittleMods added the A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) label Aug 25, 2022
@MadLittleMods MadLittleMods self-assigned this Aug 25, 2022
@DMRobertson DMRobertson added S-Minor Blocks non-critical functionality, workarounds exist. T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. A-Performance O-Uncommon Most users are unlikely to come across this or unexpected workflow A-Performance Performance, both client-facing and admin-facing and removed A-Performance labels Aug 25, 2022
MadLittleMods added a commit that referenced this issue Sep 27, 2022
Fix #13856
Fix #13865

> Discovered while trying to make Synapse fast enough for [this MSC2716 test for importing many batches](matrix-org/complement#214 (comment)). As an example, disabling the `have_seen_event` cache saves 10 seconds for each `/messages` request in that MSC2716 Complement test because we're not making as many federation requests for `/state` (speeding up `have_seen_event` itself is related to #13625) 
> 
> But this will also make `/messages` faster in general so we can include it in the [faster `/messages` milestone](https://github.com/matrix-org/synapse/milestone/11).
> 
> *-- #13856*


### The problem

`_invalidate_caches_for_event` doesn't run in monolith mode which means we never even tried to clear the `have_seen_event` and other caches. And even in worker mode, it only runs on the workers, not the master (AFAICT).

Additionally, there was a bug with the key being wrong, so `_invalidate_caches_for_event` never invalidated the `have_seen_event` cache even when it did run.

Because we were using the `@cachedList` wrong, it was putting items in the cache under keys like `((room_id, event_id),)`, a tuple nested in a tuple (ex. `(('!TnCIJPKzdQdUlIyXdQ:test', '$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k'),)`), and we were trying to invalidate with just `(room_id, event_id)`, which did nothing.
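
The key mismatch can be illustrated with a plain `dict` standing in for the cache. This is only a sketch of the failure mode, not Synapse's cache class; `cache`, `stored_key` and `removed` are names made up for the example.

```python
# A plain dict standing in for the cache (illustration only).
cache = {}

room_id = "!TnCIJPKzdQdUlIyXdQ:test"
event_id = "$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k"

# The entry was stored under a 1-tuple wrapping the (room_id, event_id) tuple.
stored_key = ((room_id, event_id),)
cache[stored_key] = True

# Invalidating with the un-nested key silently removes nothing.
removed = cache.pop((room_id, event_id), None)
assert removed is None
assert stored_key in cache

# Only the exact, nested key actually evicts the entry.
removed = cache.pop(stored_key, None)
```

Because a missing key is not an error during invalidation, a mismatch like this fails silently and the stale entry stays cached.
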
@MadLittleMods MadLittleMods removed their assignment Jan 31, 2023