Synapse v1.0.0rc1 significantly increases RAM usage #5395
Comments
I'm running Synapse v1.0.0rc1 on a VPS with 8 GB of RAM. Not long after restarting Synapse (less than an hour), I see it using about 2.8 GB of resident (RES) memory according to top. Unfortunately, I don't have a baseline reading from before upgrading to v1.0.0rc1.
My RAM usage also spiked immediately after upgrading to 1.0.0rc1. I usually hover around 2.1-2.8 GB used, but it takes a while to get there. After upgrading to 1.0.0rc1 my RAM usage climbed steadily until it hit 3.7 GB (out of 3.85 GB total), and the process is currently sitting at 100% CPU on both cores.
It's possible that the matrix.org outages are causing problems for people - how are extremities (#1760) looking?
Extremities are OK. The highest is 5. (This is after Synapse OOMed and I stopped it from restarting, but I assume it doesn't need to be running.)
Haven't set up Grafana yet, but this gets spammed in the logs right as memory usage is climbing rapidly. When I restart my server it goes back to roughly normal memory levels, but it spikes as soon as I say something in a big room.
I have set up Grafana, but the graphs are useless if I enable presence, since the server just crashes.
Just as an update: RAM usage did keep increasing until the host crashed, even with presence disabled; apparently that just bought this server some time. Started a
Adding our comments from the other thread: for me, as long as I don't speak in large rooms the memory usage stays stable, but it spikes as soon as I try to speak in one.
Seems to be correct.
I'm on 0.99.5.2 (current). I had activated federation_verify_certificates from the first install a few weeks ago. At 10:20 I restarted with federation_verify_certificates: false; at 12:20 I restarted with true again. Memory quickly grows to 3.4 GB, from 1.5 GB before. Maybe this is only more caching, but as you can see in the graph, it can grow up to 5.5 GB.
I'm on 0.99.5.2 and reproduced @ingothierack's results, with a procedure of starting the server and then posting in #synapse:matrix.org:
I added a stub to my synapse install that dumps a tracemalloc snapshot to a file when it receives a SIGUSR1. This time, I used a procedure of: start synapse, start typing in #synapse:matrix.org (thereby sending out typing notifications), and watch the memory usage. Again I ran it once with federation_verify_certificates off and once with it on; I sent the SIGUSR1 to the first case once it had settled, and to the second case when it was around 1.5 GB and about to kill the machine.

Surprisingly, there is very little difference between the two snapshots! The largest difference is only +18.3 MiB. I see two possibilities:
I might see if I can figure out how to debug/prove the second case.
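For reference, a minimal sketch of that kind of SIGUSR1-triggered dump, using only the standard library; the file path and handler name here are made up, not the actual stub:

```python
import signal
import tracemalloc

tracemalloc.start(25)  # record up to 25 stack frames per allocation

def _dump_tracemalloc_snapshot(signum, frame):
    # Write the current snapshot to disk; it can later be loaded with
    # tracemalloc.Snapshot.load() and diffed via Snapshot.compare_to().
    tracemalloc.take_snapshot().dump("/tmp/synapse-tracemalloc.dump")

signal.signal(signal.SIGUSR1, _dump_tracemalloc_snapshot)
```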
I can confirm the issue is with the change of default for federation_verify_certificates.
I found this likely-relevant issue: pyca/cryptography#4833, which is fixed in the recently-released 2.7 (my version, provided by the latest matrix-provided deb, is 2.6).
(I say likely-relevant because the leaks are specifically from ASN.1 parsing, and the issue relates to X509, which is ASN.1-encoded.)
@JJJollyjim: the server seeing issues on my side has
@JJJollyjim do you get backtraces in heaptrack to see where the leaky ASN.1 is being leaked from? Also, do you get the same behaviour if you upgrade to 2.7?
I do have the backtraces (of the C stack frames, not the Python ones): all the leaks (actually about 800 MB, now that I've figured out how to read this) come from the ASN.1 parsing mentioned above.

I could try and see if there is some allocation-tracing tool that can also collect Python function names from the stack frames. At a guess, though: are we loading the CA store every time we do some sort of SSL thing, and keeping it around, instead of just having one shared copy?
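A rough way to check that hypothesis locally, as a sketch only (it assumes pyOpenSSL and psutil are installed; the loop count and TLS method constant are arbitrary):

```python
import os

import psutil              # assumed available; used only to read the process RSS
from OpenSSL import SSL

proc = psutil.Process(os.getpid())
rss_before = proc.memory_info().rss

contexts = []
for _ in range(200):
    ctx = SSL.Context(SSL.TLSv1_2_METHOD)
    ctx.set_default_verify_paths()   # load a fresh copy of the system CA store
    contexts.append(ctx)             # keep them alive, as pooled connections would

rss_after = proc.memory_info().rss
print("approx. per-context overhead: %.0f KiB"
      % ((rss_after - rss_before) / len(contexts) / 1024))
```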
Interesting. I've been trying a patch, and it seems to work well so far. There is still increased RAM usage, but not terribly so; there may still be room for improvement, but it was written in 10 minutes.

Basically, the way things are, each message sent over federation defines a new class and holds an instance of a class, and there appears to be no caching whatsoever. This is the modified file:

I'll see if I manage, time-wise, to add context caching and specifically look at the CA bits, but it'd be useful if someone could confirm that this is an improvement.

PS, the related file:
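To illustrate the caching idea (this is not the actual patch; it's a minimal sketch assuming pyOpenSSL, and the function name is made up):

```python
from functools import lru_cache

from OpenSSL import SSL

@lru_cache(maxsize=1)
def _shared_verify_context():
    # Build the verifying context once per process, so the CA bundle is
    # read from disk and parsed a single time rather than per connection.
    ctx = SSL.Context(SSL.TLSv1_2_METHOD)
    ctx.set_default_verify_paths()
    ctx.set_verify(SSL.VERIFY_PEER, lambda conn, cert, errno, depth, ok: ok)
    return ctx
```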
The patch definitely didn't fix the whole problem for me (can't say if it had any effect, as the memory usage is still at the point where the server crashes). Setting up a Synapse instance on my workstation to test further.
Thank you. I saw lower memory usage, but not by much.
Thanks to everyone who has helped investigate this so far, and sorry we didn't catch it before putting out the RC.
This is very helpful indeed. Some background here:

Every time we start an outbound federation connection, ClientTLSOptionsFactory.get_options is called. That in turn creates a new SSL context and loads the CA certificates into it.

In theory that should be fine, because the context (along with the loaded certs) gets thrown away when the connection is closed. However, in practice we are making several hundred outbound connections at once, and, thanks to HTTP keepalive and connection pooling, the connections stay open for several minutes.

What I therefore suspect is happening is that this is not a true leak, but simply massively increased memory usage thanks to attaching all of the CA certificates to each of the outbound connections. If Synapse lasted long enough to send all of the outbound requests and time out the connections, then we would see memory usage drop again. (The connection keepalive timeout is 120 seconds.)

Intuitively it seems stupid that we are reading the certs from disk and storing them in a separate memory store for every single outbound connection; however, I'm not aware of a way to avoid this with the OpenSSL interface. I don't think it is correct to share OpenSSL contexts between connections. Any thoughts or advice from those familiar with the OpenSSL API would be welcome.
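For illustration only (not Synapse's actual code, and the function name is made up), the per-connection pattern described above looks roughly like this:

```python
from OpenSSL import SSL

def options_for_new_connection():
    # Called for every outbound federation connection: each call builds a
    # brand-new context and loads its own copy of the CA certificate store,
    # which then stays in memory for as long as the pooled/keepalive
    # connection keeps the context alive (up to the 120 s timeout).
    context = SSL.Context(SSL.TLSv1_2_METHOD)
    context.set_default_verify_paths()
    return context
```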
https://www.postgresql.org/message-id/E1bsROr-0002Z2-25%40gemulon.postgresql.org implies that it /is/ possible to share contexts between connections, albeit fiddly (which is why postgres stopped doing so).
(#4673 was a bug that happened when we briefly tried sharing contexts: I'm not sure offhand if there is a better solution to that than having one context per connection.)
http://openssl.6102.n7.nabble.com/Possibility-to-cache-ca-bundle-and-reuse-it-between-SSL-sessions-td51090.html looks to give most of the answers on how the API is meant to be used for this scenario.
Right now, you need one context per connection, because you set the info callback (which is how you do SNI) on the context, not the connection. twisted/twisted#1128 will make it a little faster, though.
This node.js issue from 2011 also discusses the problem: https://paul.querna.org/articles/2011/04/05/openssl-memory-use/. They link this commit, nodejs/node-v0.x-archive@5c35dff, which does the caching across contexts, and mention that they plan to make node.crypto reuse contexts in future -- not sure if this ended up happening. The other issue they discuss in the article - allocation of zlib buffers for TLS compression - is likely no longer relevant, since TLS compression was turned off by default in OpenSSL 1.1.0 (2016) to mitigate the CRIME attack.
The conclusion of the discussion elsewhere is that you probably can share context objects, but you have to do some hoop-jumping to get the SNI and cert verification done right. Those interested can try playing with my
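As a rough sketch of that shared-context direction (assuming pyOpenSSL; the names are illustrative and the verification hook itself is omitted):

```python
from OpenSSL import SSL

# One context for the whole process: the CA store is loaded exactly once.
_SHARED_CONTEXT = SSL.Context(SSL.TLSv1_2_METHOD)
_SHARED_CONTEXT.set_default_verify_paths()

def connection_for(hostname: bytes) -> SSL.Connection:
    # Per-connection state (the SNI name, the hostname to verify against)
    # lives on the Connection rather than the Context, which is what lets
    # the context be shared safely.
    conn = SSL.Connection(_SHARED_CONTEXT)
    conn.set_tlsext_host_name(hostname)
    conn.set_app_data({"expected_hostname": hostname})
    return conn
```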
|
Me and others have indeed seen a huge improvement. For the record: this improvement also shows nearly 20% less RAM consumption on my biggest Matrix server compared to
Description
My main Synapse server had been running happily with 4 GB of RAM for some months. After upgrading to v1.0.0rc1, RAM usage increased a lot, leading to a system crash.
After disabling presence (the usual trick to reduce RAM usage), RAM consumption went down enough for the server to be usable, but it is still out of the ordinary.
Steps to reproduce
Version information
Version: v1.0.0rc1
Install method: pip
Note: I waited until another user on #synapse-admins:matrix.org confirmed this.