synapse hangs when getting room history #2726
Comments
Same problem here. I'm running synapse in an LXC container, which sits behind an nginx reverse proxy, and I can't make it work :(
From the nginx logs: Here is my nginx config:
This is not the same issue that I have. You should not forward port 8448 in nginx! Instead, forward port 443 for a virtual host in nginx to port 8008 on the LXC running synapse. (You will then also need a DNS record for matrix.yourdomain.com pointing to the same address as yourdomain.com.) For more help I suggest joining #matrix:matrix.org
Hey, thanks for your reply. Port 8008 is configured correctly, but I'm struggling with 8448. I removed it from the reverse proxy and am now trying to do the port forwarding with iptables. I have a bridge interface, and all my LXCs are in the subnet 192.168.1.0/24. I did the following: I tried a lot of iptables rules, and for the port forwarding I actually got:
I receive all messages but I can't send any. When I try to join a room, I get "Internal server error"... Please help 😭
Sorry, it was many years ago that I last used iptables, so I think you will find better help with that somewhere else.
@spacebug0 No problem, thanks anyway
I finally got a chance to look at this - @spacebug0, sorry for the delay and thanks for your patience.
Looking at the individual stuck requests, they all seem to get stuck trying to resolve state after checking a given auth event, which is a power level change from July 2015 (https://riot.im/develop/#/room/#matrix:matrix.org/$1436805007295iQZlO:jki.re)
etc. The first one (GET-74) also makes a huge number of backfill attempts (which seem to succeed) before getting stuck on this 'bad' event. I've asked @spacebug0 to leave the room, but it sounds like state resets cause them to spontaneously rejoin it, and then on calling /messages in the room the problem recurs. However, after the stack of requests which fail, there are then a few which succeed (after restarting the server):
Looking at the request where /messages does manage to return (after 113s), the only difference I can see is that it tries to backfill from jki.re rather than matrix.org. And then subsequent ones seem to be quick:
So I'm wondering if the initial /messages request (GET-74) gets completely stuck, wedging all subsequent ones until the server is restarted and leaking RAM as they stack up. But it's unclear why it gets wedged, and why it succeeds later on when it backfills from jki.re. A final observation:
...does not look good at all. @erikjohnston, any idea what's going on here? Is it possible that a backfill failure like this would cause this explosion?
I'm in a similar situation. I had given up trying to join big rooms, but even when running synapse with only 5 users and still "federated" (just to be able to chat with a couple of friends), my VPS (2 GB of RAM) suffers too much and the process takes too much memory. It can run for 2-3 days without crashing, but then (I still don't know the cause) it starts eating RAM and the oom-killer starts killing things. Initially I was furious because other processes (like my postgres db) were killed at random, but now I added the
Duplicate of #2504
Description
Scrolling up in Riot to get history for one or more rooms can sometimes cause the server to use a huge amount of memory and 100% of the CPU, and finally to become unresponsive.
Steps to reproduce
I expect to get room history from the server, but the whole server goes bananas and every connected client gets disconnected.
Some stuff from my homeserver.log
That's the last line before the server hung completely -^
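Scrolling up in a client results in a backwards-paginating call to the client-server `/messages` endpoint, so the hang can in principle be triggered without Riot. The following is only a minimal sketch of such a request, assuming the `r0` client-server path and bearer-token authentication; the helper names `messages_url` and `fetch_history`, the base URL, the room ID, the tokens and the timeout are all hypothetical placeholders, not values from this report.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def messages_url(base_url, room_id, from_token, limit=20):
    """Build a /messages URL that paginates backwards (dir=b) from from_token."""
    # Room IDs contain '!' and ':', so percent-encode the whole ID.
    path = "/_matrix/client/r0/rooms/{}/messages".format(quote(room_id, safe=""))
    query = urlencode({"from": from_token, "dir": "b", "limit": limit})
    return base_url.rstrip("/") + path + "?" + query


def fetch_history(base_url, room_id, from_token, access_token, timeout=30):
    """Request one page of history; raises on timeout instead of waiting forever."""
    request = Request(
        messages_url(base_url, room_id, from_token),
        # Assumption: the server accepts the access token as a bearer header.
        headers={"Authorization": "Bearer " + access_token},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_history` with a short timeout against a test account would make it easier to tell a slow backfill apart from a request that never returns.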
Version information
My own homeserver lithen.net
If not matrix.org:
Version: What version of Synapse is running?
Synapse/0.25.1
Install method: package manager/git clone/pip
Installed using pip
Platform: Tell us about the environment in which your homeserver is operating
- distro, hardware, if it's running in a vm/container, etc.
The platform is GNU/Linux (Debian Stretch), where synapse and turnserver are running inside an LXC.
I'm using postgresql with synapse. (I also had this issue when using sqlite, which is why I switched to postgresql to test.)
In another LXC, I have apache running as a web server, which redirects matrix.lithen.net to synapse.
I have an SRV record in my DNS for matrix so I can have lithen.net as the homeserver name.
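For readers unfamiliar with this kind of delegation, the sketch below formats what such a record could look like in zone-file syntax. It assumes the `_matrix._tcp` service name used for federation and the default federation port 8448; the helper `srv_record` is hypothetical, and the priority, weight and TTL are illustrative values, not the reporter's actual DNS settings.

```python
def srv_record(server_name, target, port=8448, priority=10, weight=0, ttl=3600):
    """Format a zone-file SRV line delegating server_name's federation to target."""
    # Strip any trailing dot first so the output always has exactly one.
    name = "_matrix._tcp.{}.".format(server_name.rstrip("."))
    return "{} {} IN SRV {} {} {} {}.".format(
        name, ttl, priority, weight, port, target.rstrip(".")
    )


# Hypothetical values based on the host names mentioned above.
print(srv_record("lithen.net", "matrix.lithen.net"))
```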