Sync is timing out #12971
Comments
Is the timeout variable set from Element? Maybe it needs to be higher? |
I am curious as to what you are seeing that indicates the sync is timing out? |
I now understand that it's normal for the request to time out while the server keeps preparing the response in the background, so that a later sync request can pick it up once it is ready. The problem is that this isn't happening for me. I think I've just hit the hardware limits, even though I have 8GB RAM and 4 cores. The hardware is old, so I probably need to get something better... |
I'm asking because in the logs I see /sync requests that look like they are being successfully processed, so I am wondering what you are actually observing re: timeout. Does the client throw an error message saying that the sync times out? Does the spinner just spin endlessly? etc |
The spinner spins endlessly, and I can see 504s in the network tab. I'll send it on Element. |
Please could you provide some logs, at INFO or DEBUG (preferably DEBUG), for multiple syncs over a long enough period of time (let's say >10 minutes for the sake of it)? Initial sync is notoriously slow. That won't realistically be improved dramatically until sliding sync is supported — maybe some cache tuning or something could help a bit, but at 1500 rooms I wouldn't say it's unusual for it to time out a few times. It's expected (but unfortunate) behaviour that the initial sync will time out multiple times, but the progress isn't lost and so eventually the client will retry and succeed. |
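To illustrate the retry behaviour described above, here is a minimal client-side sketch. It is not Element's or any SDK's actual code: `fetch`, `SyncTimeout`, and the attempt and delay values are all hypothetical, and `fetch` stands in for whatever performs the initial `/sync` request and raises when the proxy answers 504.

```python
import time


class SyncTimeout(Exception):
    """Hypothetical error: the initial /sync request ended in a 504."""


def initial_sync_with_retry(fetch, max_attempts=100, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it returns a response instead of timing out.

    fetch is a hypothetical callable that performs one initial /sync request
    and raises SyncTimeout on a 504. The server is assumed to keep working on
    the response between attempts, so a later attempt can succeed.
    Returns (response, number_of_attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(), attempt
        except SyncTimeout:
            # Wait a little before retrying, except after the final attempt.
            if attempt < max_attempts:
                sleep(delay_seconds)
    raise SyncTimeout(f"initial sync still timing out after {max_attempts} attempts")
```

Passing `sleep` in as a parameter keeps the sketch easy to exercise without real waiting.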
Hey, thanks. Could I send you the logs on Matrix? I think I might have to reproduce the problem to get the reverse proxy logs, but I do have a few hours of Synapse debug logs. Also, I've been trying for a few hours; is that normal? Would moving to better hardware perhaps solve this issue? I tried tuning the caches as well: I tried 2.0 and 3.0 for the two room-specific caches (I don't remember the exact names). |
It would be nice to share some version of the logs (with personal info redacted if needed) in the issue so that others are able to benefit from the information (both other people investigating as well as other people who may have the problem in the future!). If you've been initial syncing for hours then I don't think that's expected and I don't think I'd suspect a hardware problem straight away. Let's see what the logs say. If you don't have the reverse proxy logs then let's just look at Synapse's logs to start with. |
Oops. I didn't know this wasn't configured by default, but there's an option for Let me know if this helps? |
Thx! I just tried reproducing with 2m and I got the logs (proxy logs and console logs as well). I'll remove the personal stuff and upload them shortly. I also tried with 20m and it's still failing, so I'll try an hour or so and see what happens. Thanks for pointing me to that config! |
I can't figure out what I should hide and the file is 10,000 lines so I'm just going to upload it like this and hope for the best: |
I managed to log in! Thx!! I set the variable to 200m and I think I was in after about 30 to 40 minutes. I had also turned off presence, so I'll turn that back on and try again. |
That sounds excessively long, so I'll reopen for now just so that we can have a look to see if anything is awry. |
The slowness might be down to my very slow hardware... |
I tried again today from web and didn't manage to log in. Should there be any difference between web and Android? I also turned presence back on, so maybe it's related to that? I don't think I changed anything else. I didn't get debug logs this time, but I can try again. Or should I maybe try again without presence? |
Looking at the nginx logs it looks like some @squahtx is it possible that the |
Don't think so, we decided to not cancel |
I somehow lost the web console logs, but the client tried more than 80 times, and I have the network info from Chrome zipped: |
Actually, |
I tried without presence, and it seems to not work, on Android this time (experiment-spaceSwitching) |
same problem duplicating is 0, but still stuck in sync |
I tried again from Android. I sent logs from Android mentioning this ticket, and I set debug logging for Synapse and nginx. Here are the logs and a screen capture of me doing it: |
Hm, there's an instance in these logs where we're making use of a cached sync response:
However grepping for @chagai95 I assume this Synapse instance does not have any workers other than the main process? |
Yes I'm not using any workers. |
I am having trouble making sense of these logs.
Can you explain your setup some more? Synapse requests are being routed to two different upstreams in nginx-6-18.log: http://172.18.0.9:8008 and http://172.18.0.11:12080 |
I'm using the docker Ansible setup, are you familiar with that? |
Is it this one? https://github.com/spantaleev/matrix-docker-ansible-deploy |
Yes |
http://172.18.0.11:12080 turns out to also be the nginx reverse proxy. nginx proxies requests to itself, which then proxies requests on to Synapse. So the log lines for timed out requests come in pairs:
Actually nginx might just be leaving connections open even though clients have disconnected. |
There are a couple of long-running sync requests in there that don't finish by the end of the Synapse logs. The timestamps in the logs also show frequent unexplained 20-30 second gaps where Synapse appears to be doing nothing. On average, there are only log lines for 10 seconds out of every minute. The pauses are probably the cause of the long sync times. What does the CPU usage of Synapse look like? Are there any CPU quotas or memory limits applied to Synapse? Is the host out of memory? |
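For anyone who wants to check their own logs for this kind of pause, here is a rough sketch of the gap analysis described above. It assumes each log line starts with a timestamp like `2022-06-18 12:00:00,123` (Python logging's default `asctime` layout), and the function names and the 20-second threshold are illustrative choices, not anything from Synapse itself.

```python
import re
from datetime import datetime, timedelta

# Assumed line prefix: "YYYY-MM-DD HH:MM:SS,mmm" (Python logging's default asctime).
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def parse_timestamps(lines):
    """Return the timestamps of all lines that start with the assumed prefix."""
    stamps = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            base = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            stamps.append(base + timedelta(milliseconds=int(match.group(2))))
    return stamps


def find_gaps(stamps, threshold_seconds=20.0):
    """List (earlier, later, seconds) for consecutive lines at least threshold apart."""
    gaps = []
    for earlier, later in zip(stamps, stamps[1:]):
        silence = (later - earlier).total_seconds()
        if silence >= threshold_seconds:
            gaps.append((earlier, later, silence))
    return gaps


def active_seconds_per_minute(stamps):
    """Count, per minute, the distinct seconds that contain at least one log line."""
    seen = {}
    for ts in stamps:
        minute = ts.replace(second=0, microsecond=0)
        seen.setdefault(minute, set()).add(ts.second)
    return {minute: len(seconds) for minute, seconds in seen.items()}
```

Feeding it the lines of a log file (for example `parse_timestamps(open(path))`, with `path` being wherever your log lives) and then calling `find_gaps` and `active_seconds_per_minute` on the result should show whether the process is silent for long stretches.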
I am not sure how to check that, but yeah I think the CPU might be a bit overloaded |
You could run
If |
I'd have to test this again soon and do that, thx! |
Hey, so I switched to an SSD and only tested after that, so I can't compare. It seems to jump straight to zero when I try logging in, then goes up to numbers between 1 and 10, and sometimes very briefly drops back to near 0 (0.1-0.9). Thx for helping me check that!
|
So the memory is down to 300,000 out of 2,400,000 |
I'm logged in! So I guess it was the slow HDD after all. Sorry for all the mess, and thx for helping me with this! Should I close the ticket, or would you like to use it for other stuff? P.S. I'm down to barely 100,000 |
Description
I cannot use new devices because, after the login succeeds, the sync times out.
Steps to reproduce
Version information
If not matrix.org:
mx.chagai.website
1.59.1
docker ansible playbook
CentOS