Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Exclude OOB memberships from the federation sender #12570

Merged
merged 3 commits into from
May 3, 2022

Conversation

richvdh
Copy link
Member

@richvdh richvdh commented Apr 27, 2022

As the comment says, there is no need to process such events, and indeed we need to avoid doing so.

Fixes #12509.

As the comment says, there is no need to process such events, and indeed we
need to avoid doing so.

Fixes #12509.
@richvdh richvdh requested a review from a team as a code owner April 27, 2022 17:30
@richvdh
Copy link
Member Author

richvdh commented Apr 27, 2022

The new error was introduced by #12191. Until that point, these events would be silently ignored, because _get_state_group_for_events, and hence get_hosts_in_room_at_events would return an empty list for an outlier.

@DMRobertson DMRobertson self-assigned this Apr 28, 2022
Copy link
Contributor

@DMRobertson DMRobertson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code change looks fine, but I don't have much context, so you're getting loads of questions.

Does this mean that:

  • we were sending events of type (2), (2a) and (3) out over federation prior to Synapse 1.57?
  • If so, presumably they ought to be rejected by other servers?

@@ -355,6 +355,39 @@ async def handle_event(event: EventBase) -> None:
if not is_mine and send_on_behalf_of is None:
return

# We also want to not send out-of-band membership events.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is out-of-band membership a spec-defined or Synapse-internal term? (Seems to be the latter.) Is there a reference for what they are somewhere?

The 3.5 cases below seem to enumerate them and explain why we don't want to send them out. It might be useful to have this list of 3.5 event types here (a more authoritative place?):

def is_out_of_band_membership(self) -> bool:
"""Whether this is an out of band membership, like an invite or an invite
rejection. This is needed as those events are marked as outliers, but
they still need to be processed as if they're new events (e.g. updating
invite state in the database, relaying to clients, etc).
(Added in synapse 0.99.0, so may be unreliable for events received before that)
"""
return self._dict.get("out_of_band_membership", False)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is out-of-band membership a spec-defined or Synapse-internal term? (Seems to be the latter.) Is there a reference for what they are somewhere?

https://matrix-org.github.io/synapse/develop/development/room-dag-concepts.html#out-of-band-membership-events has some info on it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 3.5 cases below seem to enumerate them and explain why we don't want to send them out. It might be useful to have this list of 3.5 event types here (a more authoritative place?):

The 3.5 cases here are intended as a proof by enumeration that it's correct to drop these events in the federation sender, rather than acting as an authoritative list of what OOB memberships are.

The authoritative source for compiling this list was searching the codebase for out_of_band_membership = True (there are four such hits, corresponding to cases (1), the federation and local cases for (2), and (3)). I could add such a list to is_out_of_band_membership, or the DAG concepts doc, but it could run the risk of getting out of date.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes sense. Perhaps just the introductory sentence from https://matrix-org.github.io/synapse/develop/development/room-dag-concepts.html#out-of-band-membership-events

A special case of outlier events are some membership events for federated rooms that we aren't full members of.

would help illuminate is_out_of_band_membership? Not going to block this on that though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the docstring for is_out_of_band_membership in 947f651.

# (2) rejections of invites to federated rooms. These are normally
# created via federation (in which case the remote server is
# responsible for sending out the rejection). If that fails,
# we'll create a leave event locally, but that's only really
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it's this local fallback event that is considered out-of-band?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, both the federated and local cases are out-of-band.

Any thoughts on how I can reword this paragraph to make this clear? How about:

                    # (2) rejections of invites to federated rooms - either remotely
                    #     or locally generated. (Such rejections are normally
                    #     created via federation, in which case the remote server is
                    #     responsible for sending out the rejection. If that fails,
                    #     we'll create a leave event locally, but that's only really
                    #     for the benefit of the invited user - we don't have enough
                    #     information to send it out over federation).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the rework. But the underlying difficulty I'm having here is that I don't grok the meaning of out of band. It makes me think of an event that has been received by this server over some non-Matrix protocol, e.g. carrier pigeon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is maybe not the best term (see #4405 (review), which is where we expanded the idea beyond case (1).)

It's "out of band" inasmuch as it's not sent over the regular federation/v1/send API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only two hard problems, after all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the wording here in ca5b502.

synapse/federation/sender/__init__.py Show resolved Hide resolved
changelog.d/12570.bugfix Show resolved Hide resolved
@richvdh
Copy link
Member Author

richvdh commented Apr 29, 2022

Does this mean that:

* we were sending events of type (2), (2a) and (3) out over federation prior to Synapse 1.57?

No. Such events are always (I think) outliers, which meant that, pre-1.57, get_hosts_in_room_at_events would return an empty list (because resolve_state_groups_for_events would return a _StateCacheEntry with an empty state, because get_state_groups_ids would return an empty list of state groups).

1.57 changed the behaviour of get_hosts_in_room_at_events to raise an exception for an outlier, instead of returning an empty list, hence the problem.

Copy link
Contributor

@DMRobertson DMRobertson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm happy with this. Thanks for explaining.

@richvdh richvdh enabled auto-merge (squash) May 3, 2022 12:44
@richvdh richvdh merged commit db2edf5 into develop May 3, 2022
@richvdh richvdh deleted the rav/no_send_oob_memberships branch May 3, 2022 12:47
DMRobertson pushed a commit that referenced this pull request May 10, 2022
Synapse 1.59.0rc1 (2022-05-10)
==============================

This release makes several changes that server administrators should be aware of:

- Device name lookup over federation is now disabled by default. ([\#12616](#12616))
- The `synapse.app.appservice` and `synapse.app.user_dir` worker application types are now deprecated. ([\#12452](#12452), [\#12654](#12654))

See [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1590) for more details.

Additionally, this release removes the non-standard `m.login.jwt` login type from Synapse. It can be replaced with `org.matrix.login.jwt` for identical behaviour. This is only used if `jwt_config.enabled` is set to `true` in the configuration. ([\#12597](#12597))

Features
--------

- Support [MSC3266](matrix-org/matrix-spec-proposals#3266) room summaries over federation. ([\#11507](#11507))
- Implement [changes](matrix-org/matrix-spec-proposals@4a77139) to [MSC2285 (hidden read receipts)](matrix-org/matrix-spec-proposals#2285). Contributed by @SimonBrandner. ([\#12168](#12168), [\#12635](#12635), [\#12636](#12636), [\#12670](#12670))
- Extend the [module API](https://github.com/matrix-org/synapse/blob/release-v1.59/synapse/module_api/__init__.py) to allow modules to change actions for existing push rules of local users. ([\#12406](#12406))
- Add the `notify_appservices_from_worker` configuration option (superseding `notify_appservices`) to allow a generic worker to be designated as the worker to send traffic to Application Services. ([\#12452](#12452))
- Add the `update_user_directory_from_worker` configuration option (superseding `update_user_directory`) to allow a generic worker to be designated as the worker to update the user directory. ([\#12654](#12654))
- Add new `enable_registration_token_3pid_bypass` configuration option to allow registrations via token as an alternative to verifying a 3pid. ([\#12526](#12526))
- Implement [MSC3786](matrix-org/matrix-spec-proposals#3786): Add a default push rule to ignore `m.room.server_acl` events. ([\#12601](#12601))
- Add new `mau_appservice_trial_days` configuration option to specify a different trial period for users registered via an appservice. ([\#12619](#12619))

Bugfixes
--------

- Fix a bug introduced in Synapse 1.48.0 where the latest thread reply provided failed to include the proper bundled aggregations. ([\#12273](#12273))
- Fix a bug introduced in Synapse 1.22.0 where attempting to send a large amount of read receipts to an application service all at once would result in duplicate content and abnormally high memory usage. Contributed by Brad & Nick @ Beeper. ([\#12544](#12544))
- Fix a bug introduced in Synapse 1.57.0 which could cause `Failed to calculate hosts in room` errors to be logged for outbound federation. ([\#12570](#12570))
- Fix a long-standing bug where status codes would almost always get logged as `200!`, irrespective of the actual status code, when clients disconnect before a request has finished processing. ([\#12580](#12580))
- Fix race when persisting an event and deleting a room that could lead to outbound federation breaking. ([\#12594](#12594))
- Fix a bug introduced in Synapse 1.53.0 where bundled aggregations for annotations/edits were incorrectly calculated. ([\#12633](#12633))
- Fix a long-standing bug where rooms containing power levels with string values could not be upgraded. ([\#12657](#12657))
- Prevent memory leak from reoccurring when presence is disabled. ([\#12656](#12656))

Updates to the Docker image
---------------------------

- Explicitly opt-in to using [BuildKit-specific features](https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/syntax.md) in the Dockerfile. This fixes issues with building images in some GitLab CI environments. ([\#12541](#12541))
- Update the "Build docker images" GitHub Actions workflow to use `docker/metadata-action` to generate docker image tags, instead of a custom shell script. Contributed by @henryclw. ([\#12573](#12573))

Improved Documentation
----------------------

- Update SQL statements and replace use of old table `user_stats_historical` in docs for Synapse Admins. ([\#12536](#12536))
- Add missing linebreak to `pipx` install instructions. ([\#12579](#12579))
- Add information about the TCP replication module to docs. ([\#12621](#12621))
- Fixes to the formatting of `README.rst`. ([\#12627](#12627))
- Fix docs on how to run specific Complement tests using the `complement.sh` test runner. ([\#12664](#12664))

Deprecations and Removals
-------------------------

- Remove unstable identifiers from [MSC3069](matrix-org/matrix-spec-proposals#3069). ([\#12596](#12596))
- Remove the unspecified `m.login.jwt` login type and the unstable `uk.half-shot.msc2778.login.application_service` from
  [MSC2778](matrix-org/matrix-spec-proposals#2778). ([\#12597](#12597))
- Synapse now requires at least Python 3.7.1 (up from 3.7.0), for compatibility with the latest Twisted trunk. ([\#12613](#12613))

Internal Changes
----------------

- Use supervisord to supervise Postgres and Caddy in the Complement image to reduce restart time. ([\#12480](#12480))
- Immediately retry any requests that have backed off when a server comes back online. ([\#12500](#12500))
- Use `make_awaitable` instead of `defer.succeed` for return values of mocks in tests. ([\#12505](#12505))
- Consistently check if an object is a `frozendict`. ([\#12564](#12564))
- Protect module callbacks with read semantics against cancellation. ([\#12568](#12568))
- Improve comments and error messages around access tokens. ([\#12577](#12577))
- Improve docstrings for the receipts store. ([\#12581](#12581))
- Use constants for read-receipts in tests. ([\#12582](#12582))
- Log status code of cancelled requests as 499 and avoid logging stack traces for them. ([\#12587](#12587), [\#12663](#12663))
- Remove special-case for `twisted` logger from default log config. ([\#12589](#12589))
- Use `getClientAddress` instead of the deprecated `getClientIP`. ([\#12599](#12599))
- Add link to documentation in Grafana Dashboard. ([\#12602](#12602))
- Reduce log spam when running multiple event persisters. ([\#12610](#12610))
- Add extra debug logging to federation sender. ([\#12614](#12614))
- Prevent remote homeservers from requesting local user device names by default. ([\#12616](#12616))
- Add a consistency check on events which we read from the database. ([\#12620](#12620))
- Remove use of the `constantly` library and switch to enums for `EventRedactBehaviour`. Contributed by @andrewdoh. ([\#12624](#12624))
- Remove unused code related to receipts. ([\#12632](#12632))
- Minor improvements to the scripts for running Synapse in worker mode under Complement. ([\#12637](#12637))
- Move `pympler` back in to the `all` extras. ([\#12652](#12652))
- Fix spelling of `M_UNRECOGNIZED` in comments. ([\#12665](#12665))
- Release script: confirm the commit to be tagged before tagging. ([\#12556](#12556))
- Fix a typo in the announcement text generated by the Synapse release development script. ([\#12612](#12612))

- Fix scripts-dev to pass typechecking. ([\#12356](#12356))
- Add some type hints to datastore. ([\#12485](#12485))
- Remove unused `# type: ignore`s. ([\#12531](#12531))
- Allow unused `# type: ignore` comments in bleeding edge CI jobs. ([\#12576](#12576))
- Remove redundant lines of config from `mypy.ini`. ([\#12608](#12608))
- Update to mypy 0.950. ([\#12650](#12650))
- Use `Concatenate` to better annotate `_do_execute`. ([\#12666](#12666))
- Use `ParamSpec` to refine type hints. ([\#12667](#12667))
- Fix mypy against latest pillow stubs. ([\#12671](#12671))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"Failed to calculate hosts in room" error from outbound federation
3 participants