This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Create a machine-readable document for routing requests to workers #12139

Open
richvdh opened this issue Mar 2, 2022 · 9 comments
Labels
A-Workers Problems related to running Synapse in Worker Mode (or replication) T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@richvdh
Member

richvdh commented Mar 2, 2022

Constructing a reverse-proxy configuration for routing requests to workers is currently left largely as an exercise for the reader in https://matrix-org.github.io/synapse/develop/workers.html.

I think we should maintain a single source of truth which documents a mapping between endpoints and workers, which could ultimately be used as an input to other scripts which construct configuration files suitable for various reverse-proxies.
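As a rough illustration of the idea, a single routing table could be rendered into a reverse-proxy configuration by a small script. This is only a sketch: the `Route` type, the `to_nginx` helper and the worker names `client_reader` and `federation_reader` are hypothetical and not part of Synapse; the output uses nginx's regex `location` syntax.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Route:
    pattern: str  # regex matched against the request path
    worker: str   # hypothetical upstream/worker name


# Hypothetical routing table; the worker names are illustrative only.
ROUTES = [
    Route(r"^/_matrix/client/(r0|v3|unstable)/keys/query$", "client_reader"),
    Route(r"^/_matrix/federation/v1/send_join/", "federation_reader"),
]


def to_nginx(routes: List[Route]) -> str:
    """Render one nginx regex `location` block per route."""
    blocks = []
    for route in routes:
        blocks.append(
            f"location ~ {route.pattern} {{\n"
            f"    proxy_pass http://{route.worker};\n"
            f"}}"
        )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(to_nginx(ROUTES))
```

A second renderer for another reverse proxy could consume the same `ROUTES` list, which is the point of keeping the table separate from any one output format.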

@reivilibre reivilibre added the T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. label Mar 2, 2022
@clokep
Member

clokep commented Mar 2, 2022

Not exactly the same...but I did wonder if we could annotate certain servlets as being "worker" friendly and then have an entry point that prints them out.

@clokep
Member

clokep commented Mar 3, 2022

Not exactly the same...but I did wonder if we could annotate certain servlets as being "worker" friendly and then have an entry point that prints them out.

I have a proof of concept of this at https://github.com/matrix-org/synapse/pull/new/clokep/autogen-workers, it just splats things to the screen for now, but could probably be augmented to spit things out in a few different formats.

The downside is that we don't get good prose, comments, or grouping, but I suspect that the pros outweigh this.

In the current state it generates mostly the same endpoints, but there would be some rough edges.

@richvdh
Member Author

richvdh commented Mar 4, 2022

Interesting. Having the single source of truth be the servlets themselves does seem compelling. What does the output from that look like?

I think there's more detail here than just "can it be routed to a worker", by the way. Ideally we want to be able to represent things like:

  • Different ways of sharding endpoints:
    • Endpoints which should be sharded by requesting user (/sync, auth stuff ?)
    • Endpoints which should be sharded by target room
    • Endpoints which should be routed to whichever worker is handling a particular stream.
  • Whether the endpoint absolutely must be routed a given way, or whether it's just "optimal".
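One possible way to encode the properties listed above is a small record per endpoint. This is only a sketch under stated assumptions: the `Sharding` and `EndpointRouting` names are hypothetical, the `/sync` pattern is made up for the example, and the `sharding`/`required` values assigned below are illustrative, not a statement of how Synapse actually requires these endpoints to be routed.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Sharding(Enum):
    NONE = "none"                    # any suitable worker
    BY_USER = "user"                 # shard by requesting user
    BY_ROOM = "room"                 # shard by target room
    BY_STREAM_WRITER = "stream_writer"  # worker handling a given stream


@dataclass(frozen=True)
class EndpointRouting:
    pattern: str
    sharding: Sharding = Sharding.NONE
    required: bool = False           # True: must be routed this way; False: merely optimal
    stream: Optional[str] = None     # only meaningful for BY_STREAM_WRITER

    def __post_init__(self) -> None:
        # A stream name must be given exactly when routing by stream writer.
        if (self.sharding is Sharding.BY_STREAM_WRITER) != (self.stream is not None):
            raise ValueError("stream must be set iff sharding is BY_STREAM_WRITER")

    def to_dict(self) -> dict:
        return {
            "pattern": self.pattern,
            "sharding": self.sharding.value,
            "required": self.required,
            "stream": self.stream,
        }


# Illustrative entries only; the values are assumptions for the example.
ENDPOINTS = [
    EndpointRouting(r"^/_matrix/client/(r0|v3|unstable)/sync$", Sharding.BY_USER),
    EndpointRouting(r"^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/send", Sharding.BY_ROOM),
    EndpointRouting(
        r"^/_matrix/client/(r0|v3|unstable)/rooms/.*/receipt",
        Sharding.BY_STREAM_WRITER,
        required=True,
        stream="receipts",
    ),
]


def render(endpoints: List[EndpointRouting]) -> str:
    """Serialise the table as JSON so other scripts can consume it."""
    return json.dumps([e.to_dict() for e in endpoints], indent=2)


if __name__ == "__main__":
    print(render(ENDPOINTS))
```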

@clokep
Member

clokep commented Mar 7, 2022

Interesting. Having the single source of truth be the servlets themselves does seem compelling. What does the output from that look like?

It is just a simple list right now:

^/_matrix/client/(api/v1|r0|v3|unstable)/createRoom$
^/_matrix/client/(api/v1|r0|v3|unstable)/join/
^/_matrix/client/(api/v1|r0|v3|unstable)/joined_rooms$
^/_matrix/client/(api/v1|r0|v3|unstable)/login$
^/_matrix/client/(api/v1|r0|v3|unstable)/presence/
^/_matrix/client/(api/v1|r0|v3|unstable)/profile/
^/_matrix/client/(api/v1|r0|v3|unstable)/publicRooms$
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/(join|invite|leave|ban|unban|kick)$
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/context/.*$
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/joined_members$
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/members$
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/redact
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/send
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/state$
^/_matrix/client/(api/v1|r0|v3|unstable)/search$
^/_matrix/client/(api/v1|r0|v3|unstable)/voip/turnServer$
^/_matrix/client/(r0|v3|unstable)/.*/account_data
^/_matrix/client/(r0|v3|unstable)/.*/tags
^/_matrix/client/(r0|v3|unstable)/account/3pid
^/_matrix/client/(r0|v3|unstable)/devices$
^/_matrix/client/(r0|v3|unstable)/joined_groups$
^/_matrix/client/(r0|v3|unstable)/keys/changes$
^/_matrix/client/(r0|v3|unstable)/keys/claim$
^/_matrix/client/(r0|v3|unstable)/keys/query$
^/_matrix/client/(r0|v3|unstable)/publicised_groups$
^/_matrix/client/(r0|v3|unstable)/publicised_groups/
^/_matrix/client/(r0|v3|unstable)/register$
^/_matrix/client/(r0|v3|unstable)/room_keys/
^/_matrix/client/(r0|v3|unstable)/rooms/.*/event/
^/_matrix/client/(r0|v3|unstable)/rooms/.*/read_markers
^/_matrix/client/(r0|v3|unstable)/rooms/.*/receipt
^/_matrix/client/(r0|v3|unstable)/rooms/.*/state/
^/_matrix/client/(r0|v3|unstable)/sendToDevice/
^/_matrix/client/(v1|unstable)/register/m.login.registration_token/validity
^/_matrix/client/versions$
^/_matrix/federation/unstable/org.matrix.msc2946/hierarchy/
^/_matrix/federation/v1/backfill/
^/_matrix/federation/v1/event/
^/_matrix/federation/v1/event_auth/
^/_matrix/federation/v1/exchange_third_party_invite/
^/_matrix/federation/v1/get_groups_publicised$
^/_matrix/federation/v1/get_missing_events/
^/_matrix/federation/v1/hierarchy/
^/_matrix/federation/v1/invite/
^/_matrix/federation/v1/make_join/
^/_matrix/federation/v1/make_leave/
^/_matrix/federation/v1/publicRooms
^/_matrix/federation/v1/query/
^/_matrix/federation/v1/send_join/
^/_matrix/federation/v1/send_leave/
^/_matrix/federation/v1/state/
^/_matrix/federation/v1/state_ids/
^/_matrix/federation/v1/user/devices/
^/_matrix/federation/v2/invite/
^/_matrix/federation/v2/send_join/
^/_matrix/federation/v2/send_leave/
^/_matrix/key/v2/query

I think there's more detail here than just "can it be routed to a worker", by the way. Ideally we want to be able to represent things like:

I agree! An interesting thing about annotating the servlets with this information (and taking the config into account) is that the result could be tailored to your setup (at least, I think, I'm not 100% sure the config has enough info). E.g. if you have a receipts stream configured the output could tell you to send the /receipts endpoint to the appropriate worker.

I haven't figured out a good way of encoding any of the above though, but thank you for putting together a list of some of the info we would need!
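To make the "tailored to your setup" idea concrete, here is a rough sketch of resolving stream-bound endpoints against a configured stream-to-worker mapping. The `STREAM_ENDPOINTS` table, the `tailored_routes` function, the `"main"` default and the worker name in the usage note are assumptions for illustration; which endpoints belong to which stream is also assumed here rather than taken from Synapse's code.

```python
from typing import Dict, List

# Assumed association of streams with endpoint patterns (illustrative only).
STREAM_ENDPOINTS: Dict[str, List[str]] = {
    "receipts": [
        r"^/_matrix/client/(r0|v3|unstable)/rooms/.*/receipt",
        r"^/_matrix/client/(r0|v3|unstable)/rooms/.*/read_markers",
    ],
}


def tailored_routes(
    stream_writers: Dict[str, str], default: str = "main"
) -> Dict[str, str]:
    """Map each stream-bound pattern to the worker configured for its stream.

    Streams with no configured writer fall back to `default`.
    """
    routes: Dict[str, str] = {}
    for stream, patterns in STREAM_ENDPOINTS.items():
        target = stream_writers.get(stream, default)
        for pattern in patterns:
            routes[pattern] = target
    return routes


if __name__ == "__main__":
    for pattern, worker in tailored_routes({"receipts": "receipts_worker"}).items():
        print(f"{pattern} -> {worker}")
```

Calling `tailored_routes({})` instead sends both patterns to the default, which corresponds to a deployment with no dedicated receipts worker.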

@richvdh
Member Author

richvdh commented Mar 8, 2022

An interesting thing about annotating the servlets with this information (and taking the config into account) is that the result could be tailored to your setup

mmmhmm, that does sound interesting. Still, even if there is a way to generate a tailored setup, I'd still like a way to generate a comprehensive list.

@clokep
Member

clokep commented Mar 9, 2022

It looks like there are also some endpoints which are GET only for workers.
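A machine-readable format would therefore need to carry the allowed HTTP methods alongside each pattern. A minimal sketch, assuming a hypothetical `ROUTES` table in which `None` means "any method"; which endpoints are marked GET-only below is an illustrative assumption, not a list taken from Synapse:

```python
import re
from typing import FrozenSet, List, Optional, Pattern, Tuple

# (pattern, allowed methods); None means any method may go to a worker.
ROUTES: List[Tuple[Pattern[str], Optional[FrozenSet[str]]]] = [
    # Assumed GET-only for the sake of the example.
    (re.compile(r"^/_matrix/client/(r0|v3|unstable)/devices$"), frozenset({"GET"})),
    (re.compile(r"^/_matrix/client/(r0|v3|unstable)/keys/query$"), None),
]


def worker_can_handle(method: str, path: str) -> bool:
    """Return True if some route matches both the path and the HTTP method."""
    for pattern, methods in ROUTES:
        if pattern.search(path) and (methods is None or method.upper() in methods):
            return True
    return False


if __name__ == "__main__":
    print(worker_can_handle("GET", "/_matrix/client/v3/devices"))
    print(worker_can_handle("PUT", "/_matrix/client/v3/devices"))
```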

@clokep
Member

clokep commented Mar 21, 2022

We seem to maintain the following which all hard-code worker information:

Note that these are all slightly out of sync with each other right now.

@olmari
Contributor

olmari commented Apr 11, 2023

+1 to this. Even if we can never match all the possible needs for the exact number and type of workers for everyone, we do need one firm place for the truth, or "to get the truth", so that we at least know which endpoints currently exist. I also like the current "grouping" of endpoints in the docs by "what it is for"; it is then up to the sysadmin to decide whether to use more or fewer workers, and which groups get even more workers...

Currently I feel that while the endpoint listing in workers.md is somewhat up to date, updating it always seems to be a quite manual, "if someone happened to remember" document change.

@clokep
Member

clokep commented May 30, 2023

I have a proof of concept of this at https://github.com/matrix-org/synapse/pull/new/clokep/autogen-workers, it just splats things to the screen for now, but could probably be augmented to spit things out in a few different formats.

This was merged as part of #15243.
