Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Don't send normal presence updates over federation replication stream #9828

Merged
merged 7 commits into from
Apr 19, 2021

Conversation

erikjohnston
Copy link
Member

Instead we can read from the presence replication stream.

Split out from #9819

@erikjohnston erikjohnston force-pushed the erikj/presence_stream_federation branch from 338470a to 10fdce3 Compare April 16, 2021 11:10
@erikjohnston erikjohnston requested a review from a team April 16, 2021 12:07
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I'm really struggling to grok this. Or rather, I'm really struggling to grok how it worked in the first place, and the changes aren't helping.

Comment on lines 129 to 131
self._federation = None
if hs.should_send_federation():
self._federation = hs.get_federation_sender()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's rather confusing that PresenceHandler ends up with both _federation and federation attributes, which are similar but different. I rather wonder if maybe_send_presence_to_interested_destinations should be unconditional and the should_send_federation should happen in the callers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's rather confusing that PresenceHandler ends up with both _federation and federation attributes, which are similar but different. I rather wonder if maybe_send_presence_to_interested_destinations should be unconditional and the should_send_federation should happen in the callers.

Oh yes, that is horrible. At the end of this PR we end up in a bit of a halfway house, annoyingly. The really tedious bit is that get_federation_sender will throw on workers, but on master will return the send queue instead. I can move the should_send_federation to the call sites and then have self._federation set if its on master or a federation sender, and then in the next PR we can revert and clean it up a bit? (The next PR will remove the weirdness where get_federation_sender returns the queue on master)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up doing something slightly different, and replaced instances of self.federation with self._federation. This involves some assertions to keep mypy happy, but they are temporary until the next PR where we rip out the usage of the federation send queue.

synapse/handlers/presence.py Outdated Show resolved Hide resolved
synapse/handlers/presence.py Show resolved Hide resolved
Comment on lines 709 to 717
hosts_and_states = await get_interested_remotes(
self.store,
self.presence_router,
list(to_federation_ping.values()),
self.state,
)

for destinations, states in hosts_and_states:
self.federation.send_presence_to_destinations(states, destinations)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this bit unconditional, while the send_presence_to_destinations in _persist_and_notify only happens if we're on the federation sender?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because _persist_and_notify will cause the presence to be replicated to the other instances, which will then send out the presence, while here we're resending the presence to particular remotes. i.e. there are two cases:

  1. Presence state update, which gets streamed out via presence replication stream; and
  2. Times when we want to send existing presence states to remotes, which happens a) when a host joins a room and we send out all local presence and b) we resend "online" presence periodically as a ping/keepalive mechanism.

@erikjohnston
Copy link
Member Author

erikjohnston commented Apr 16, 2021

sorry, I'm really struggling to grok this. Or rather, I'm really struggling to grok how it worked in the first place, and the changes aren't helping.

In case it helps: the current situation is that the federation senders send out presence by reading from the federation replication stream, rather than the presence replication stream (this means that we're often sending the same presence updates twice over replication). This PR changes it so that we don't send presence states down the federation replication stream if we send it down the presence stream, and then the federation senders read from both the federation and presence streams.

A future PR changes it so that we don't send any presence states over the federation stream (and so we can delete it entirely), and instead have a stream just for ad hoc sending of presence states over federation.

(An alternative of adding a presence federation stream is to move the handling for periodic pings and sending presence to newly joined hosts to the federation senders, but that means they're doing more work and its duplicated work across all senders)

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I get it now.

Comment on lines 130 to 131
if hs.should_send_federation() or not hs.config.worker_app:
self._federation = hs.get_federation_sender()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this bit going away again soon? If so, it's ok. If not, it could do with more comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, it is.

synapse/handlers/presence.py Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
Comment on lines +267 to +268
if not self._send_federation:
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still think you should move this out to the callers and rename the function send_presence_to_interested_destinations; then you can use it from PresenceHandler._update_states. I might be missing something though, or you might already have plans here, or something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of things that make me a bit hesitant to do that: 1) in the end state the federation ping code won't hit the federation sender, but instead the federation queue, and so everywhere that calls this function will end up having to do that check. 2) We don't always have a federation sender, so we need to do some check, whether its an assert or a no-op (and bear in mind that this will soon turn back into a if self._federation check again).

I think it'll be easier to discuss this at the next PR, as it'll be clearer (for both of us) what the end result feels like

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
@erikjohnston erikjohnston merged commit 2b7dd21 into develop Apr 19, 2021
@erikjohnston erikjohnston deleted the erikj/presence_stream_federation branch April 19, 2021 09:50
anoadragon453 added a commit that referenced this pull request Apr 28, 2021
Synapse 1.33.0rc1 (2021-04-28)
==============================

Features
--------

- Update experimental support for [MSC3083](matrix-org/matrix-spec-proposals#3083): restricting room access via group membership. ([\#9800](#9800), [\#9814](#9814))
- Add experimental support for handling presence on a worker. ([\#9819](#9819), [\#9820](#9820), [\#9828](#9828), [\#9850](#9850))
- Return a new template when an user attempts to renew their account multiple times with the same token, stating that their account is set to expire. This replaces the invalid token template that would previously be shown in this case. This change concerns the optional account validity feature. ([\#9832](#9832))

Bugfixes
--------

- Fixes the OIDC SSO flow when using a `public_baseurl` value including a non-root URL path. ([\#9726](#9726))
- Fix thumbnail generation for some sites with non-standard content types. Contributed by @rkfg. ([\#9788](#9788))
- Add some sanity checks to identity server passed to 3PID bind/unbind endpoints. ([\#9802](#9802))
- Limit the size of HTTP responses read over federation. ([\#9833](#9833))
- Fix a bug which could cause Synapse to get stuck in a loop of resyncing device lists. ([\#9867](#9867))
- Fix a long-standing bug where errors from federation did not propagate to the client. ([\#9868](#9868))

Improved Documentation
----------------------

- Add a note to the docker docs mentioning that we mirror upstream's supported Docker platforms. ([\#9801](#9801))

Internal Changes
----------------

- Add a dockerfile for running Synapse in worker-mode under Complement. ([\#9162](#9162))
- Apply `pyupgrade` across the codebase. ([\#9786](#9786))
- Move some replication processing out of `generic_worker`. ([\#9796](#9796))
- Replace `HomeServer.get_config()` with inline references. ([\#9815](#9815))
- Rename some handlers and config modules to not duplicate the top-level module. ([\#9816](#9816))
- Fix a long-standing bug which caused `max_upload_size` to not be correctly enforced. ([\#9817](#9817))
- Reduce CPU usage of the user directory by reusing existing calculated room membership. ([\#9821](#9821))
- Small speed up for joining large remote rooms. ([\#9825](#9825))
- Introduce flake8-bugbear to the test suite and fix some of its lint violations. ([\#9838](#9838))
- Only store the raw data in the in-memory caches, rather than objects that include references to e.g. the data stores. ([\#9845](#9845))
- Limit length of accepted email addresses. ([\#9855](#9855))
- Remove redundant `synapse.types.Collection` type definition. ([\#9856](#9856))
- Handle recently added rate limits correctly when using `--no-rate-limit` with the demo scripts. ([\#9858](#9858))
- Disable invite rate-limiting by default when running the unit tests. ([\#9871](#9871))
- Pass a reactor into `SynapseSite` to make testing easier. ([\#9874](#9874))
- Make `DomainSpecificString` an `attrs` class. ([\#9875](#9875))
- Add type hints to `synapse.api.auth` and `synapse.api.auth_blocking` modules. ([\#9876](#9876))
- Remove redundant `_PushHTTPChannel` test class. ([\#9878](#9878))
- Remove backwards-compatibility code for Python versions < 3.6. ([\#9879](#9879))
- Small performance improvement around handling new local presence updates. ([\#9887](#9887))
babolivier added a commit to matrix-org/synapse-dinsic that referenced this pull request Sep 1, 2021
Synapse 1.33.0 (2021-05-05)
===========================

Features
--------

- Build Debian packages for Ubuntu 21.04 (Hirsute Hippo). ([\#9909](matrix-org/synapse#9909))

Synapse 1.33.0rc2 (2021-04-29)
==============================

Bugfixes
--------

- Fix tight loop when handling presence replication when using workers. Introduced in v1.33.0rc1. ([\#9900](matrix-org/synapse#9900))

Synapse 1.33.0rc1 (2021-04-28)
==============================

Features
--------

- Update experimental support for [MSC3083](matrix-org/matrix-spec-proposals#3083): restricting room access via group membership. ([\#9800](matrix-org/synapse#9800), [\#9814](matrix-org/synapse#9814))
- Add experimental support for handling presence on a worker. ([\#9819](matrix-org/synapse#9819), [\#9820](matrix-org/synapse#9820), [\#9828](matrix-org/synapse#9828), [\#9850](matrix-org/synapse#9850))
- Return a new template when an user attempts to renew their account multiple times with the same token, stating that their account is set to expire. This replaces the invalid token template that would previously be shown in this case. This change concerns the optional account validity feature. ([\#9832](matrix-org/synapse#9832))

Bugfixes
--------

- Fixes the OIDC SSO flow when using a `public_baseurl` value including a non-root URL path. ([\#9726](matrix-org/synapse#9726))
- Fix thumbnail generation for some sites with non-standard content types. Contributed by @rkfg. ([\#9788](matrix-org/synapse#9788))
- Add some sanity checks to identity server passed to 3PID bind/unbind endpoints. ([\#9802](matrix-org/synapse#9802))
- Limit the size of HTTP responses read over federation. ([\#9833](matrix-org/synapse#9833))
- Fix a bug which could cause Synapse to get stuck in a loop of resyncing device lists. ([\#9867](matrix-org/synapse#9867))
- Fix a long-standing bug where errors from federation did not propagate to the client. ([\#9868](matrix-org/synapse#9868))

Improved Documentation
----------------------

- Add a note to the docker docs mentioning that we mirror upstream's supported Docker platforms. ([\#9801](matrix-org/synapse#9801))

Internal Changes
----------------

- Add a dockerfile for running Synapse in worker-mode under Complement. ([\#9162](matrix-org/synapse#9162))
- Apply `pyupgrade` across the codebase. ([\#9786](matrix-org/synapse#9786))
- Move some replication processing out of `generic_worker`. ([\#9796](matrix-org/synapse#9796))
- Replace `HomeServer.get_config()` with inline references. ([\#9815](matrix-org/synapse#9815))
- Rename some handlers and config modules to not duplicate the top-level module. ([\#9816](matrix-org/synapse#9816))
- Fix a long-standing bug which caused `max_upload_size` to not be correctly enforced. ([\#9817](matrix-org/synapse#9817))
- Reduce CPU usage of the user directory by reusing existing calculated room membership. ([\#9821](matrix-org/synapse#9821))
- Small speed up for joining large remote rooms. ([\#9825](matrix-org/synapse#9825))
- Introduce flake8-bugbear to the test suite and fix some of its lint violations. ([\#9838](matrix-org/synapse#9838))
- Only store the raw data in the in-memory caches, rather than objects that include references to e.g. the data stores. ([\#9845](matrix-org/synapse#9845))
- Limit length of accepted email addresses. ([\#9855](matrix-org/synapse#9855))
- Remove redundant `synapse.types.Collection` type definition. ([\#9856](matrix-org/synapse#9856))
- Handle recently added rate limits correctly when using `--no-rate-limit` with the demo scripts. ([\#9858](matrix-org/synapse#9858))
- Disable invite rate-limiting by default when running the unit tests. ([\#9871](matrix-org/synapse#9871))
- Pass a reactor into `SynapseSite` to make testing easier. ([\#9874](matrix-org/synapse#9874))
- Make `DomainSpecificString` an `attrs` class. ([\#9875](matrix-org/synapse#9875))
- Add type hints to `synapse.api.auth` and `synapse.api.auth_blocking` modules. ([\#9876](matrix-org/synapse#9876))
- Remove redundant `_PushHTTPChannel` test class. ([\#9878](matrix-org/synapse#9878))
- Remove backwards-compatibility code for Python versions < 3.6. ([\#9879](matrix-org/synapse#9879))
- Small performance improvement around handling new local presence updates. ([\#9887](matrix-org/synapse#9887))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants