Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Optimise backfill calculation #12522

Merged
merged 16 commits into from
Apr 26, 2022
Merged

Optimise backfill calculation #12522

merged 16 commits into from
Apr 26, 2022

Conversation

richvdh
Copy link
Member

@richvdh richvdh commented Apr 22, 2022

Try to avoid an OOM by checking fewer extremities.

Generally this is a big rewrite of _maybe_backfill, to try and fix some of the TODOs and other problems in it. It's best reviewed commit-by-commit.

Fixes: #12523

@richvdh richvdh force-pushed the rav/optimise_maybe_backfill branch 2 times, most recently from 46a64f4 to 97bfe26 Compare April 25, 2022 11:30
richvdh added 8 commits April 25, 2022 12:39
we may as well just chain together the two inputs
We can potentially skip some expensive db work by moving this synchronous code
earlier. The `sorted` might be expensive, but nowhere near as expensive as the
db lookups.
Both of these queries `GROUP BY b.event_id`, so there's no deduplication being
done, and the dict has no value for us.
Rather than checking all of the backwards extremities (which may be legion), we
check one at a time and bail out as soon as we find a visible one
We don't need the dict here.
@richvdh richvdh force-pushed the rav/optimise_maybe_backfill branch from 97bfe26 to 7e7adb6 Compare April 25, 2022 11:49
@richvdh richvdh marked this pull request as ready for review April 25, 2022 11:53
@richvdh richvdh requested a review from a team as a code owner April 25, 2022 11:53
@richvdh richvdh force-pushed the rav/optimise_maybe_backfill branch from 7e7adb6 to a23e5a1 Compare April 25, 2022 12:00
@DMRobertson DMRobertson self-assigned this Apr 25, 2022
Copy link
Contributor

@DMRobertson DMRobertson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little unsure about the a couple of comments (see discussion of "filtering" and "successor events", but I think the logical changes are sound. I'm not too fussed about the Set versus List comment.

I'll leave those minor things to your discretion; otherwise LGTM.

synapse/handlers/federation.py Outdated Show resolved Hide resolved
richvdh and others added 2 commits April 25, 2022 18:12
Co-authored-by: David Robertson <davidr@element.io>
richvdh and others added 2 commits April 25, 2022 18:15
It's impossible for us to have duplicates here, so we may as well use a boring
list.
@@ -0,0 +1 @@
Fix a bug introduced in Synapse 0.99.3 which could cause Synapse to consume large amounts of RAM when back-paginating in a large room.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose we're in the worst case, where we have many backfill points to consider and none of them/their successor events are visible to us. Wouldn't we end up doing the same amount of work as before this change? (I.e. isn't this a bugfix for the best/average case rather than the worst?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we end up doing the same amount of work as before this change?

I think we won't try to load all state maps into memory at the same time (which we used to do in the depths of filter_events_for_server, in _event_to_memberships). So yeah, we might end up churning a bunch of CPU which we didn't before, but at least we won't OOM.

@richvdh richvdh merged commit 17d99f7 into develop Apr 26, 2022
@richvdh richvdh deleted the rav/optimise_maybe_backfill branch April 26, 2022 09:27
@MadLittleMods MadLittleMods added the A-Performance Performance, both client-facing and admin-facing label Apr 26, 2022
class _BackfillPointType(Enum):
# a regular backwards extremity (ie, an event which we don't yet have, but which
# is referred to by other events in the DAG)
BACKWARDS_EXTREMITY = enum.auto()
Copy link
Contributor

@MadLittleMods MadLittleMods Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "backward extremity" (not plural) is the correct spelling using the database as the source of truth (and the docs docs/development/room-dag-concepts.md#backward-extremity but I derived those from the code).

babolivier added a commit to matrix-org/synapse-dinsic that referenced this pull request Apr 28, 2022
Synapse 1.58.0rc1 (2022-04-26)
==============================

As of this release, the groups/communities feature in Synapse is now disabled by default. See [\#11584](matrix-org/synapse#11584) for details. As mentioned in [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1580), this feature will be removed in Synapse 1.61.

Features
--------

- Implement [MSC3383](matrix-org/matrix-spec-proposals#3383) for including the destination in server-to-server authentication headers. Contributed by @Bubu and @jcgruenhage for Famedly. ([\#11398](matrix-org/synapse#11398))
- Docker images and Debian packages from matrix.org now contain a locked set of Python dependencies, greatly improving build reproducibility. ([Board](https://github.com/orgs/matrix-org/projects/54), [\#11537](matrix-org/synapse#11537))
- Enable processing of device list updates asynchronously. ([\#12365](matrix-org/synapse#12365), [\#12465](matrix-org/synapse#12465))
- Implement [MSC2815](matrix-org/matrix-spec-proposals#2815) to allow room moderators to view redacted event content. Contributed by @tulir. ([\#12427](matrix-org/synapse#12427))
- Build Debian packages for Ubuntu 22.04 "Jammy Jellyfish". ([\#12543](matrix-org/synapse#12543))

Bugfixes
--------

- Prevent a sync request from removing a user's busy presence status. ([\#12213](matrix-org/synapse#12213))
- Fix bug with incremental sync missing events when rejoining/backfilling. Contributed by Nick @ Beeper. ([\#12319](matrix-org/synapse#12319))
- Fix a long-standing bug which incorrectly caused `GET /_matrix/client/v3/rooms/{roomId}/event/{eventId}` to return edited events rather than the original. ([\#12476](matrix-org/synapse#12476))
- Fix a bug introduced in Synapse 1.27.0 where the admin API for [deleting forward extremities](https://github.com/matrix-org/synapse/blob/erikj/fix_delete_event_response_count/docs/admin_api/rooms.md#deleting-forward-extremities) would always return a count of 1, no matter how many extremities were deleted. ([\#12496](matrix-org/synapse#12496))
- Fix a long-standing bug where the image thumbnails embedded into email notifications were broken. ([\#12510](matrix-org/synapse#12510))
- Fix a bug in the implementation of [MSC3202](matrix-org/matrix-spec-proposals#3202) where Synapse would use the field name `device_unused_fallback_keys`, rather than `device_unused_fallback_key_types`. ([\#12520](matrix-org/synapse#12520))
- Fix a bug introduced in Synapse 0.99.3 which could cause Synapse to consume large amounts of RAM when back-paginating in a large room. ([\#12522](matrix-org/synapse#12522))

Improved Documentation
----------------------

- Fix rendering of the documentation site when using the 'print' feature. ([\#12340](matrix-org/synapse#12340))
- Add a manual documenting config file options. ([\#12368](matrix-org/synapse#12368), [\#12527](matrix-org/synapse#12527))
- Update documentation to reflect that both the `run_background_tasks_on` option and the options for moving stream writers off of the main process are no longer experimental. ([\#12451](matrix-org/synapse#12451))
- Update worker documentation and replace old `federation_reader` with `generic_worker`. ([\#12457](matrix-org/synapse#12457))
- Strongly recommend [Poetry](https://python-poetry.org/) for development. ([\#12475](matrix-org/synapse#12475))
- Add some example configurations for workers and update architectural diagram. ([\#12492](matrix-org/synapse#12492))
- Fix a broken link in `README.rst`. ([\#12495](matrix-org/synapse#12495))
- Add HAProxy delegation example with CORS headers to docs. ([\#12501](matrix-org/synapse#12501))
- Remove extraneous comma in User Admin API's device deletion section so that the example JSON is actually valid and works. Contributed by @olmari. ([\#12533](matrix-org/synapse#12533))

Deprecations and Removals
-------------------------

- The groups/communities feature in Synapse is now disabled by default. ([\#12344](matrix-org/synapse#12344))
- Remove unstable identifiers from [MSC3440](matrix-org/matrix-spec-proposals#3440). ([\#12382](matrix-org/synapse#12382))

Internal Changes
----------------

- Preparation for faster-room-join work: start a background process to resynchronise the room state after a room join. ([\#12394](matrix-org/synapse#12394))
- Preparation for faster-room-join work: Implement a tracking mechanism to allow functions to wait for full room state to arrive. ([\#12399](matrix-org/synapse#12399))
- Remove an unstable identifier from [MSC3083](matrix-org/matrix-spec-proposals#3083). ([\#12395](matrix-org/synapse#12395))
- Run CI in the locked [Poetry](https://python-poetry.org/) environment, and remove corresponding `tox` jobs. ([\#12425](matrix-org/synapse#12425), [\#12434](matrix-org/synapse#12434), [\#12438](matrix-org/synapse#12438), [\#12441](matrix-org/synapse#12441), [\#12449](matrix-org/synapse#12449), [\#12478](matrix-org/synapse#12478), [\#12514](matrix-org/synapse#12514), [\#12472](matrix-org/synapse#12472))
- Change Mutual Rooms' `unstable_features` flag to `uk.half-shot.msc2666.mutual_rooms` which matches the current iteration of [MSC2666](matrix-org/matrix-spec-proposals#2666). ([\#12445](matrix-org/synapse#12445))
- Fix typo in the release script help string. ([\#12450](matrix-org/synapse#12450))
- Fix a minor typo in the Debian changelogs generated by the release script. ([\#12497](matrix-org/synapse#12497))
- Reintroduce the list of targets to the linter script, to avoid linting unwanted local-only directories during development. ([\#12455](matrix-org/synapse#12455))
- Limit length of `device_id` to less than 512 characters. ([\#12454](matrix-org/synapse#12454))
- Dockerfile-workers: reduce the amount we install in the image. ([\#12464](matrix-org/synapse#12464))
- Dockerfile-workers: give the master its own log config. ([\#12466](matrix-org/synapse#12466))
- complement-synapse-workers: factor out separate entry point script. ([\#12467](matrix-org/synapse#12467))
- Back out experimental implementation of [MSC2314](matrix-org/matrix-spec-proposals#2314). ([\#12474](matrix-org/synapse#12474))
- Fix grammatical error in federation error response when the room version of a room is unknown. ([\#12483](matrix-org/synapse#12483))
- Remove unnecessary configuration overrides in tests. ([\#12511](matrix-org/synapse#12511))
- Refactor the relations code for clarity. ([\#12519](matrix-org/synapse#12519))
- Add type hints so `docker` and `stubs` directories pass `mypy --disallow-untyped-defs`. ([\#12528](matrix-org/synapse#12528))
- Update `delay_cancellation` to accept any awaitable, rather than just `Deferred`s. ([\#12468](matrix-org/synapse#12468))
- Handle cancellation in `EventsWorkerStore._get_events_from_cache_or_db`. ([\#12529](matrix-org/synapse#12529))
babolivier added a commit to matrix-org/synapse-dinsic that referenced this pull request May 3, 2022
Synapse 1.58.0 (2022-05-03)
===========================

As of this release, the groups/communities feature in Synapse is now disabled by default. See [\#11584](matrix-org/synapse#11584) for details. As mentioned in [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1580), this feature will be removed in Synapse 1.61.

No significant changes since 1.58.0rc2.

Synapse 1.58.0rc2 (2022-04-26)
==============================

This release candidate fixes bugs related to Synapse 1.58.0rc1's logic for handling device list updates.

Bugfixes
--------

- Fix a bug introduced in Synapse 1.58.0rc1 where the main process could consume excessive amounts of CPU and memory while handling sentry logging failures. ([\#12554](matrix-org/synapse#12554))
- Fix a bug introduced in Synapse 1.58.0rc1 where opentracing contexts were not correctly sent to whitelisted remote servers with device lists updates. ([\#12555](matrix-org/synapse#12555))

Internal Changes
----------------

- Reduce unnecessary work when handling remote device list updates. ([\#12557](matrix-org/synapse#12557))

Synapse 1.58.0rc1 (2022-04-26)
==============================

Features
--------

- Implement [MSC3383](matrix-org/matrix-spec-proposals#3383) for including the destination in server-to-server authentication headers. Contributed by @Bubu and @jcgruenhage for Famedly. ([\#11398](matrix-org/synapse#11398))
- Docker images and Debian packages from matrix.org now contain a locked set of Python dependencies, greatly improving build reproducibility. ([Board](https://github.com/orgs/matrix-org/projects/54), [\#11537](matrix-org/synapse#11537))
- Enable processing of device list updates asynchronously. ([\#12365](matrix-org/synapse#12365), [\#12465](matrix-org/synapse#12465))
- Implement [MSC2815](matrix-org/matrix-spec-proposals#2815) to allow room moderators to view redacted event content. Contributed by @tulir @ Beeper. ([\#12427](matrix-org/synapse#12427))
- Build Debian packages for Ubuntu 22.04 "Jammy Jellyfish". ([\#12543](matrix-org/synapse#12543))

Bugfixes
--------

- Prevent a sync request from removing a user's busy presence status. ([\#12213](matrix-org/synapse#12213))
- Fix bug with incremental sync missing events when rejoining/backfilling. Contributed by Nick @ Beeper. ([\#12319](matrix-org/synapse#12319))
- Fix a long-standing bug which incorrectly caused `GET /_matrix/client/v3/rooms/{roomId}/event/{eventId}` to return edited events rather than the original. ([\#12476](matrix-org/synapse#12476))
- Fix a bug introduced in Synapse 1.27.0 where the admin API for [deleting forward extremities](https://github.com/matrix-org/synapse/blob/erikj/fix_delete_event_response_count/docs/admin_api/rooms.md#deleting-forward-extremities) would always return a count of 1, no matter how many extremities were deleted. ([\#12496](matrix-org/synapse#12496))
- Fix a long-standing bug where the image thumbnails embedded into email notifications were broken. ([\#12510](matrix-org/synapse#12510))
- Fix a bug in the implementation of [MSC3202](matrix-org/matrix-spec-proposals#3202) where Synapse would use the field name `device_unused_fallback_keys`, rather than `device_unused_fallback_key_types`. ([\#12520](matrix-org/synapse#12520))
- Fix a bug introduced in Synapse 0.99.3 which could cause Synapse to consume large amounts of RAM when back-paginating in a large room. ([\#12522](matrix-org/synapse#12522))

Improved Documentation
----------------------

- Fix rendering of the documentation site when using the 'print' feature. ([\#12340](matrix-org/synapse#12340))
- Add a manual documenting config file options. ([\#12368](matrix-org/synapse#12368), [\#12527](matrix-org/synapse#12527))
- Update documentation to reflect that both the `run_background_tasks_on` option and the options for moving stream writers off of the main process are no longer experimental. ([\#12451](matrix-org/synapse#12451))
- Update worker documentation and replace old `federation_reader` with `generic_worker`. ([\#12457](matrix-org/synapse#12457))
- Strongly recommend [Poetry](https://python-poetry.org/) for development. ([\#12475](matrix-org/synapse#12475))
- Add some example configurations for workers and update architectural diagram. ([\#12492](matrix-org/synapse#12492))
- Fix a broken link in `README.rst`. ([\#12495](matrix-org/synapse#12495))
- Add HAProxy delegation example with CORS headers to docs. ([\#12501](matrix-org/synapse#12501))
- Remove extraneous comma in User Admin API's device deletion section so that the example JSON is actually valid and works. Contributed by @olmari. ([\#12533](matrix-org/synapse#12533))

Deprecations and Removals
-------------------------

- The groups/communities feature in Synapse is now disabled by default. ([\#12344](matrix-org/synapse#12344))
- Remove unstable identifiers from [MSC3440](matrix-org/matrix-spec-proposals#3440). ([\#12382](matrix-org/synapse#12382))

Internal Changes
----------------

- Preparation for faster-room-join work: start a background process to resynchronise the room state after a room join. ([\#12394](matrix-org/synapse#12394))
- Preparation for faster-room-join work: Implement a tracking mechanism to allow functions to wait for full room state to arrive. ([\#12399](matrix-org/synapse#12399))
- Remove an unstable identifier from [MSC3083](matrix-org/matrix-spec-proposals#3083). ([\#12395](matrix-org/synapse#12395))
- Run CI in the locked [Poetry](https://python-poetry.org/) environment, and remove corresponding `tox` jobs. ([\#12425](matrix-org/synapse#12425), [\#12434](matrix-org/synapse#12434), [\#12438](matrix-org/synapse#12438), [\#12441](matrix-org/synapse#12441), [\#12449](matrix-org/synapse#12449), [\#12478](matrix-org/synapse#12478), [\#12514](matrix-org/synapse#12514), [\#12472](matrix-org/synapse#12472))
- Change Mutual Rooms' `unstable_features` flag to `uk.half-shot.msc2666.mutual_rooms` which matches the current iteration of [MSC2666](matrix-org/matrix-spec-proposals#2666). ([\#12445](matrix-org/synapse#12445))
- Fix typo in the release script help string. ([\#12450](matrix-org/synapse#12450))
- Fix a minor typo in the Debian changelogs generated by the release script. ([\#12497](matrix-org/synapse#12497))
- Reintroduce the list of targets to the linter script, to avoid linting unwanted local-only directories during development. ([\#12455](matrix-org/synapse#12455))
- Limit length of `device_id` to less than 512 characters. ([\#12454](matrix-org/synapse#12454))
- Dockerfile-workers: reduce the amount we install in the image. ([\#12464](matrix-org/synapse#12464))
- Dockerfile-workers: give the master its own log config. ([\#12466](matrix-org/synapse#12466))
- complement-synapse-workers: factor out separate entry point script. ([\#12467](matrix-org/synapse#12467))
- Back out experimental implementation of [MSC2314](matrix-org/matrix-spec-proposals#2314). ([\#12474](matrix-org/synapse#12474))
- Fix grammatical error in federation error response when the room version of a room is unknown. ([\#12483](matrix-org/synapse#12483))
- Remove unnecessary configuration overrides in tests. ([\#12511](matrix-org/synapse#12511))
- Refactor the relations code for clarity. ([\#12519](matrix-org/synapse#12519))
- Add type hints so `docker` and `stubs` directories pass `mypy --disallow-untyped-defs`. ([\#12528](matrix-org/synapse#12528))
- Update `delay_cancellation` to accept any awaitable, rather than just `Deferred`s. ([\#12468](matrix-org/synapse#12468))
- Handle cancellation in `EventsWorkerStore._get_events_from_cache_or_db`. ([\#12529](matrix-org/synapse#12529))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Performance Performance, both client-facing and admin-facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pagination in a large room with restricted visibility can cause an OOM
3 participants