This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Reuse SSL contexts for .well-known lookups #5163

Closed
richvdh opened this issue May 9, 2019 · 11 comments
Labels
A-Performance Performance, both client-facing and admin-facing z-outbound-federation-meltdown (Deprecated Label) Synapse melting down by trying to talk to too many servers

Comments

@richvdh
Member

richvdh commented May 9, 2019

While I was investigating why Synapse 0.99 appeared to be slower than Synapse 0.34 (#4713), I tried disabling the .well-known lookup in the federation agent. Doing so appeared to make my synapse settle at 1.5G of memory usage rather than 2G.

The .well-known cache sits at around 1500 entries, so I can't believe it's all coming from there - it would imply about 333K per entry, which seems immense. I wonder if the connection pool is hanging on to data it shouldn't. It would probably be fruitful to investigate with a memory profiler.
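The per-entry estimate above can be reproduced with a quick calculation. This is only a sketch of the arithmetic: the variable names are made up for illustration, and it assumes decimal units and that the whole 0.5G difference is attributable to the cache, which is exactly the assumption being doubted.

```python
# Memory difference observed with and without .well-known lookups (2G vs 1.5G).
extra_memory_bytes = 500 * 1000**2

# Approximate number of entries in the .well-known cache.
cache_entries = 1500

# Implied cost per cache entry, in kilobytes.
per_entry_kb = extra_memory_bytes / cache_entries / 1000
print(f"about {round(per_entry_kb)}K per entry")
```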

@dkasak
Member

dkasak commented May 16, 2019

#4713 was closed in favour of this bug, but this bug only mentions memory. To me it also appears that our synapse 0.99 instance is much slower than it was on 0.34. Should there be a separate issue for the performance regression?

@neilisfragile
Contributor

If you have details of a performance regression that is not caused by memory usage, then please do create a new ticket.

@JJJollyjim

I would guess that this shares the root cause with #5395 (assuming synapse 0.99 validates SSL certs during .well-known lookups)

@richvdh
Member Author

richvdh commented Jun 10, 2019

> I would guess that this shares the root cause with #5395 (assuming synapse 0.99 validates SSL certs during .well-known lookups)

yes, absolutely. I think the only reason it was bad rather than catastrophic for .well-known is that we tend to make fewer connections for .well-known lookups, and then cache the results for a while.

We can probably apply the same fix to .well-known lookup that we did to #5395.
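For readers unfamiliar with the idea, here is a minimal sketch of what "reusing SSL contexts" means in general. It is not the code from #5395 or any Synapse patch: Synapse's federation client is built on Twisted and pyOpenSSL, whereas this uses the standard-library `ssl` module as a stand-in, and `ContextFactory`, `get_context` and `contexts_built` are hypothetical names invented for the example.

```python
import ssl
from typing import Optional


class ContextFactory:
    """Hypothetical factory that builds one verifying client context and reuses it."""

    def __init__(self) -> None:
        self._context: Optional[ssl.SSLContext] = None
        self.contexts_built = 0  # counter kept only to show how many contexts exist

    def get_context(self) -> ssl.SSLContext:
        # Build the context (and load the trust store) only on first use.
        if self._context is None:
            self._context = ssl.create_default_context()
            self.contexts_built += 1
        return self._context


factory = ContextFactory()

# Simulate lookups against several servers; each one gets the same context object.
# The hostname would still be verified per connection, e.g. via
# context.wrap_socket(sock, server_hostname=host).
hosts = ["example.org", "example.com", "example.net"]
contexts = [factory.get_context() for _ in hosts]
print(factory.contexts_built)
```

The point of the sketch is that the trust store is loaded once rather than once per connection, so memory does not grow with the number of remote servers contacted.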

@Ralith
Contributor

Ralith commented Jun 12, 2019

As of 1.0, my notoriously beset homeserver, which historically settled around 1.4GB on a good day and occasionally exhausted my 4GB of RAM entirely and brought down the whole server with swap thrashing, is using under 400MB.

@ara4n
Member

ara4n commented Jun 12, 2019

that might just mean that it doesn't have any extremities atm and so isn't having to do much state resolution

@Ralith
Contributor

Ralith commented Jun 12, 2019

I've actually been keeping a close eye on extremities lately; both before and after, no room has had more than 4 and only a handful have had more than 1. The improvement is without doubt due to a change between 0.99.5.2 and 1.0. I think #5417 has fixed a serious long-standing problem.

@ara4n
Member

ara4n commented Jun 12, 2019

anecdatally, my HS has gone from idling at 1.3GB to being at 650MB fwiw.

@richvdh richvdh added the A-Performance Performance, both client-facing and admin-facing label Jun 28, 2019
@richvdh richvdh changed the title from "Investigate why .well-known support for outbound federation connections costs 500M of memory" to "Reuse SSL contexts for .well-known lookups" Jun 28, 2019
@richvdh richvdh removed the z-bug (Deprecated Label) label Jun 28, 2019
@richvdh
Member Author

richvdh commented Jun 28, 2019

I'm turning this issue into a tracker for applying the #5395 fix to .well-known lookups.

@hawkowl hawkowl self-assigned this Jul 11, 2019
@hawkowl hawkowl added the z-outbound-federation-meltdown (Deprecated Label) Synapse melting down by trying to talk to too many servers label Jul 11, 2019
@aaronraimist
Contributor

This was fixed in #5794 right?

@richvdh
Member Author

richvdh commented Sep 18, 2019

yes!

@richvdh richvdh closed this as completed Sep 18, 2019

8 participants