Federation no longer upholds retry_interval for exponential backoff when talking to dead servers (SYN-504) #1404
Comments
Jira watchers: @erikjohnston @ara4n |
Links exported from Jira: relates to #1463 |
It seems we (still) have a really serious regression on federation retries.
sqlite> select * from destinations where destination='tyler.cat';
implies to me that it should be trying tyler.cat every 4 hours? (15 million milliseconds)
2016-01-07 03:16:24,131 - synapse.http.matrixfederationclient - 183 - WARNING - GET-31539 - {GET-O-371} Sending request failed to tyler.cat: GET matrix://tyler.cat/_matrix/media/v1/download/tyler.cat/LPaWtHDwcLyglmqQlsDKBySP: ConnectionRefusedError - ConnectionRefusedError: Connection refused
-- @ara4n |
Hmm, it seems to be working correctly on jki.re |
After restarting No idea what's going on, will continue to monitor. |
I think we should assume that, modulo specific cases such as #1737, this has gone away. Worth noting that the limiter is applied more-or-less per API call, rather than in the federation HTTP layer, so it's entirely possible for some retries to still happen even though a server is considered "dead" in general. |
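To illustrate the point about where the limiter sits, here is a rough sketch, not Synapse's actual code. All names (`RetryLimiter`, `guarded_call`, `unguarded_call`, `send_http`) are hypothetical. It only shows that if the backoff check lives in individual call paths rather than in the shared HTTP layer, any path that skips the check can still reach a destination that is marked dead.

```python
class DestinationBackedOff(Exception):
    """Raised when a destination is still inside its backoff window."""


class RetryLimiter:
    """Hypothetical per-destination limiter, consulted by callers."""

    def __init__(self):
        self._backoff_until_ms = {}

    def mark_dead(self, destination, until_ms):
        self._backoff_until_ms[destination] = until_ms

    def check(self, destination, now_ms):
        # Refuse the call while the destination is still backing off.
        if now_ms < self._backoff_until_ms.get(destination, 0):
            raise DestinationBackedOff(destination)


def send_http(destination, path, sent):
    # Stand-in for the shared HTTP layer: it performs no backoff check.
    sent.append((destination, path))


def guarded_call(limiter, destination, now_ms, sent):
    # A call path that consults the limiter before sending.
    limiter.check(destination, now_ms)
    send_http(destination, "/guarded", sent)


def unguarded_call(destination, sent):
    # A call path that never consults the limiter (assumed for illustration).
    send_http(destination, "/unguarded", sent)
```

With `tyler.cat` marked dead, `guarded_call` raises `DestinationBackedOff`, while `unguarded_call` still sends a request. Moving the check into `send_http` would close that gap, at the cost of making every caller handle the backoff error.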
Submitted by @matthew:matrix.org
We're trying to connect every ~10s to dead servers despite destinations.retry_interval being 3600000ms (1h). Also, each failed request logs the connection failure 12 times...
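For reference, a minimal sketch of the behaviour expected here, assuming a `retry_last_ts` timestamp is stored alongside `retry_interval`. The constants, the doubling factor, and the function names are assumptions for illustration, not Synapse's actual values or API.

```python
from dataclasses import dataclass

# Assumed tuning values, for illustration only.
MIN_RETRY_INTERVAL_MS = 5_000
MAX_RETRY_INTERVAL_MS = 24 * 60 * 60 * 1000
MULTIPLIER = 2


@dataclass(frozen=True)
class DestinationRetryState:
    destination: str
    retry_last_ts: int   # ms timestamp of the last failure, 0 if none
    retry_interval: int  # ms to wait after retry_last_ts, 0 if healthy


def is_backing_off(state, now_ms):
    """True while the destination should not be contacted."""
    if state.retry_interval == 0:
        return False
    return now_ms < state.retry_last_ts + state.retry_interval


def record_failure(state, now_ms):
    """Start backing off, or double the existing interval up to the cap."""
    if state.retry_interval == 0:
        new_interval = MIN_RETRY_INTERVAL_MS
    else:
        new_interval = min(state.retry_interval * MULTIPLIER, MAX_RETRY_INTERVAL_MS)
    return DestinationRetryState(state.destination, now_ms, new_interval)


def record_success(state):
    """Clear the backoff once the destination responds again."""
    return DestinationRetryState(state.destination, 0, 0)
```

Under this sketch, a destination with `retry_interval` of 3600000 ms would be skipped for a full hour after its last failure, so a connection attempt 10 s later should never leave the process.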
(Imported from https://matrix.org/jira/browse/SYN-504)