Element never flushes accumulated non-LL room members, and so slowly uses more ambient RAM. #25264
Labels
A-Performance
O-Uncommon
Most users are unlikely to come across this or unexpected workflow
S-Major
Severely degrades major functionality or product features, with no satisfactory workaround
T-Defect
Steps to reproduce
Presumably we get ~2x of these on the heap - perhaps one for prevState and currentState per room, or similar.
So the problem seems to be that we never flush non-LL members from the sync accumulator. If you idle in Matrix HQ on a single login for a year, the persisted room state gradually grows as it sees members appear in the timeline. (Lazily loaded members don't get persisted.) On my account it had gone from 7 members to 11,000 members, and similarly in other busy rooms, increasing the 'actual' LL subset of 17K members by ~10x, up to 158K members.
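To illustrate the growth pattern described here, a toy model follows. It is a sketch only: `ToyMemberAccumulator`, `toyJoin` and the room and user IDs are hypothetical names invented for illustration, not the matrix-js-sdk sync accumulator. It assumes member events arriving via the timeline are stored per room and per user, and are never evicted.

```typescript
// Hypothetical toy model of an accumulator that only ever grows.
// This is NOT the real matrix-js-sdk SyncAccumulator.
interface ToyMemberEvent {
    roomId: string;
    userId: string;
    membership: "join" | "leave" | "invite" | "ban";
}

// Helper (hypothetical) to build a join event for a user in a room.
function toyJoin(roomId: string, userId: string): ToyMemberEvent {
    return { roomId, userId, membership: "join" };
}

class ToyMemberAccumulator {
    // roomId -> userId -> latest member event seen for that user
    private rooms = new Map<string, Map<string, ToyMemberEvent>>();

    // Store every member event we see; nothing is ever removed.
    accumulate(events: ToyMemberEvent[]): void {
        for (const ev of events) {
            let members = this.rooms.get(ev.roomId);
            if (!members) {
                members = new Map<string, ToyMemberEvent>();
                this.rooms.set(ev.roomId, members);
            }
            members.set(ev.userId, ev);
        }
    }

    memberCount(roomId: string): number {
        return this.rooms.get(roomId)?.size ?? 0;
    }
}

// Usage: start with a small lazy-loaded subset, then idle in a busy room.
const hqRoom = "!hq:example.org"; // made-up room ID
const toyAcc = new ToyMemberAccumulator();
for (let i = 0; i < 7; i++) {
    toyAcc.accumulate([toyJoin(hqRoom, `@initial${i}:example.org`)]);
}
for (let i = 0; i < 100; i++) {
    // Each new sender seen in the timeline adds one more stored member.
    toyAcc.accumulate([toyJoin(hqRoom, `@sender${i}:example.org`)]);
}
console.log(toyAcc.memberCount(hqRoom)); // 107
```

Repeated events for the same user overwrite rather than add, so the stored set is bounded by the number of distinct members seen, which in a busy room over a year approaches a large fraction of the room.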
While looking into this I saw some other memory leaks too:
For the main bug (slowly leaking members), about the only actual solution that comes to mind is that we could occasionally flush the accumulated m.room.member events, and toggle the include_redundant_members field on the /sync filter to persuade the server to start resending non-LL members to us. (Assuming the server flushes the LL LRU cache on seeing include_redundant_members.)
Meanwhile, sliding sync (or a Hydrogen or rust-sdk style architecture where we don't store members in RAM) is clearly the better solution - i.e. killing the sync accumulator entirely.
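The flush-and-toggle idea could look roughly like the sketch below. Everything here is an assumption for illustration: `flushMemberEvents`, `toggleRedundantMembers` and the simplified types are hypothetical and are not matrix-js-sdk APIs. Only the `lazy_load_members` and `include_redundant_members` field names come from the Matrix filter format. Whether a server actually resets its lazy-loading cache when the filter changes is the untested assumption stated in this issue.

```typescript
// Simplified, hypothetical shapes; the real SDK types are richer.
interface SketchStateEvent {
    type: string;
    state_key: string;
}

interface SketchSyncFilter {
    room: {
        state: {
            lazy_load_members: boolean;
            include_redundant_members: boolean;
        };
    };
}

// Drop all accumulated m.room.member events from every room and
// return how many were dropped. Other state events are kept.
function flushMemberEvents(roomState: Map<string, SketchStateEvent[]>): number {
    let dropped = 0;
    for (const [roomId, events] of roomState) {
        const kept = events.filter((ev) => ev.type !== "m.room.member");
        dropped += events.length - kept.length;
        roomState.set(roomId, kept);
    }
    return dropped;
}

// Return a copy of the filter with include_redundant_members flipped,
// in the hope (an assumption) that the server then resends members.
function toggleRedundantMembers(filter: SketchSyncFilter): SketchSyncFilter {
    return {
        room: {
            state: {
                ...filter.room.state,
                include_redundant_members: !filter.room.state.include_redundant_members,
            },
        },
    };
}
```

A real implementation would also need to keep whatever members the client cannot do without (for example the user's own membership) and to persist the new filter before the next /sync; neither is shown here.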
I propose not fixing this, in favour of landing SS instead, and meanwhile the workaround is for power users to log out and log in again once a year or so.
Outcome
What did you expect?
App to not OOM overnight
What happened instead?
Many many OOMs.
Operating system
macOS 13.3.1
Browser information
Chrome
URL for webapp
element.io
Application version
Element Nightly version: 2023041901 Olm version: 3.2.12
Homeserver
matrix.org
Will you send logs?
No