
RoomMessageList worker postgres connections leaking memory #8516

Closed
chr-1x opened this issue Oct 10, 2020 · 2 comments


chr-1x commented Oct 10, 2020

Description

The immediate symptom I've been tracking is my postgres database server getting OOMkilled every 24-28 hours.

After setting up metrics for my database box and synapse, and keeping an eye on them for several days, I noticed a clear association between requests to the RoomMessageList servlet and jumps in memory usage on the database server. A stark example:

[screenshots: rml_dbusg, rml_memory]

Furthermore, during times of high RAM usage, I looked at the postgres connections with the highest resident set size (RSS) as reported by ps, and compared them against the pg_stat_activity table to see which worker each connection was associated with. When nearing the server's memory limit, the connections for the worker handling this endpoint were using nearly double the RAM of any other connection:

RSS (KB) application_name
361792 homeserver
362364 homeserver
362500 homeserver
366628 homeserver
368696 homeserver
372620 homeserver
373136 homeserver
375660 homeserver
376176 homeserver
385900 homeserver
605928 room_message_lister
712208 room_message_lister
732016 room_message_lister
734624 room_message_lister
760680 room_message_lister
768392 room_message_lister
779268 room_message_lister
852376 room_message_lister
876140 room_message_lister
922448 room_message_lister
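
The comparison above can be reproduced with something along these lines (a rough sketch; the database name "synapse" and the exact ps flags are assumptions, not the literal commands used):

# largest postgres backends by resident set size
ps -C postgres -o pid=,rss=,cmd= --sort=-rss | head -20

# match those PIDs to workers via application_name
# ("synapse" as the database name is a placeholder)
psql -d synapse -c "SELECT pid, application_name, state FROM pg_stat_activity ORDER BY pid;"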

Specifically, this worker is handling all requests for ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/messages$. I'm using redis replication on the homeserver.
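For reference, redis-based replication is enabled via the redis block in homeserver.yaml; a minimal illustrative version (not the exact config from this deployment) looks like:

redis:
  enabled: true
  # host and port default to localhost:6379 when not set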

Steps to reproduce

I haven't tried to reproduce this in a clean environment, but here's the nginx block for this worker:

    location ~* ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/messages$ {
         proxy_pass http://synapse_room_message_lister;
         proxy_set_header X-Forwarded-For $remote_addr;
    }
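
The upstream it proxies to is just the worker's local HTTP listener (port 8084 in the worker config below); something like:

    upstream synapse_room_message_lister {
         server 127.0.0.1:8084;
    }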

And here's the worker's config file:

worker_app: synapse.app.generic_worker

# The replication listener on the synapse to talk to.
worker_replication_host: 127.0.0.1
worker_replication_http_port: 9093

worker_listeners:
 - type: http
   port: 8084
   resources:
     - names:
       - client
 - port: 5094
   bind_address: '127.0.0.1'
   type: http
   resources:
   - names: [metrics]

worker_daemonize: True
worker_pid_file: /home/matrix/synapse/room_message_lister.pid
worker_log_config: /home/matrix/synapse/config/room_message_lister-log.yaml

and the worker's systemd unit:

[Unit]
Description=Synapse Matrix homeserver

[Service]
Type=forking
PIDFile=/home/matrix/synapse/room_message_lister.pid
User=matrix
Group=matrix
Environment="SYNAPSE_CACHE_FACTOR=2.0"
Environment="PGAPPNAME=synapse@room_message_lister"
WorkingDirectory=/home/matrix/synapse
ExecStart=/home/matrix/synapse/env3.6/bin/synctl -w /home/matrix/synapse/workers/room_message_lister.yaml start /home/matrix/synapse/homeserver.yaml
ExecStop=/home/matrix/synapse/env3.6/bin/synctl -w /home/matrix/synapse/workers/room_message_lister.yaml stop /home/matrix/synapse/homeserver.yaml
Restart=always
RestartSec=60

[Install]
WantedBy=multi-user.target

I can provide relevant portions of the homeserver config, or additional metrics that might help debug, on request.

Version information

  • Homeserver: https://matrix.cybre.space

  • Version: {"server_version":"1.20.1 (b=master,86a72d1)","python_version":"3.6.8"}

  • Install method: pip

  • Platform: Ubuntu 18.04 VPS, not containerized.


clokep commented Oct 13, 2020

It seems a bit surprising that those are using so much memory. Maybe someone is making a request to retrieve an extremely large amount of messages? It would be useful to see INFO logs for the room_message_lister worker.
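
For example, something like the following would surface unusually large limit parameters in those requests, assuming the worker log records the full request URI (the log path here is hypothetical):

# hypothetical log path; adjust to wherever room_message_lister-log.yaml writes
grep '/messages' /home/matrix/synapse/room_message_lister.log \
  | grep -o 'limit=[0-9]*' | sort -t= -k2 -n | tail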


clokep commented Dec 21, 2020

Closing due to lack of response.

clokep closed this as completed Dec 21, 2020