Wait for streams to catch up when processing HTTP replication. #14820
This should hopefully mitigate a class of races where data gets out of sync due to an HTTP replication request racing with the replication streams.
Otherwise we can deadlock as we wait for the positions we are asking for.
This is so that if a stream advances its position *without* writing a row to the stream, other instances will get told about the updated position quickly anyway.
This is already true when asking for stream positions of other instances, but for our own instance we have fudged it. Changing this should be fine (it was just an optimisation), and I don't think it should have much impact in practice at all. The reason to do this is so that when we tell remotes what our current position is, we only include *our* writes, rather than writes of other instances. This reduces delays when the remote instance is waiting for stream positions to update. In practice, this is probably only a problem for tests, though we may as well do it for all of them.
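To make the distinction between "our writes" and "writes of other instances" concrete, here is a minimal sketch. `PositionTracker` and its method names are hypothetical and are not Synapse's API. The "fudge" is assumed here to mean reporting the highest position seen from any writer, which is only one plausible reading of the commit message.

```python
from typing import Dict


class PositionTracker:
    """Hypothetical tracker of per-writer stream positions (not Synapse's API)."""

    def __init__(self, local_instance: str) -> None:
        self.local_instance = local_instance
        self._positions: Dict[str, int] = {}

    def advance(self, instance: str, stream_id: int) -> None:
        # Positions only ever move forwards.
        current = self._positions.get(instance, 0)
        self._positions[instance] = max(current, stream_id)

    def token_for_writer(self, instance: str) -> int:
        # Only counts writes made by `instance` itself.
        return self._positions.get(instance, 0)

    def highest_seen(self) -> int:
        # Assumed shape of the old "fudge": the highest position from any writer.
        return max(self._positions.values(), default=0)


tracker = PositionTracker(local_instance="worker1")
tracker.advance("worker1", 5)
tracker.advance("worker2", 9)
```

With these values, `tracker.token_for_writer("worker1")` reports only worker1's own writes, while `tracker.highest_seen()` also reflects worker2's.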
Any idea about the failing Complement test? Seems to be #14432
Not sure what's going on there yet. The failure is happening in monolith mode, so it's probably a flake unrelated to the PR.
I haven't thoroughly dug into this, but I trust your guruship. I've asked some probing questions though.
# After persistence we always need to notify replication there may
# be new data.
self._notifier.notify_replication()
Before this change, did we have to wait for something else to notify replication?
We poke the notifier below for all non-backfilled events, and since I don't think anything "waits" on the backfill stream that has broadly been OK.
But yeah, it's not ideal. I kinda want to move the poke to replication closer to where we advance the stream tokens, but that proved a bit of a PITA due to circular dependencies.
synapse/replication/http/_base.py
@@ -104,6 +111,8 @@ class ReplicationEndpoint(metaclass=abc.ABCMeta):
    RETRY_ON_CONNECT_ERROR = True
    RETRY_ON_CONNECT_ERROR_ATTEMPTS = 5  # =63s (2^6-1)

    WAIT_FOR_STREAMS = True
Should this be annotated as `ClassVar[bool]`? (We always look it up via `cls.WAIT_FOR_STREAMS` AFAICS)
Err, yeah can do!
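For reference, a minimal sketch of what the suggested annotation could look like. `ReplicationEndpointSketch`, `should_wait` and `NoWaitEndpoint` are hypothetical stand-ins, not the real classes from `synapse/replication/http/_base.py`.

```python
import abc
from typing import ClassVar


class ReplicationEndpointSketch(metaclass=abc.ABCMeta):
    """Hypothetical stand-in for the endpoint base class in the diff."""

    # ClassVar tells type checkers this is a per-class setting rather than
    # a per-instance attribute.
    WAIT_FOR_STREAMS: ClassVar[bool] = True

    @classmethod
    def should_wait(cls) -> bool:
        # Looked up via `cls`, so a subclass override takes effect.
        return cls.WAIT_FOR_STREAMS


class NoWaitEndpoint(ReplicationEndpointSketch):
    # Hypothetical subclass that opts out of waiting for streams.
    WAIT_FOR_STREAMS = False
```

Since the flag is only ever read through the class, the annotation documents that intent and lets a type checker flag accidental assignment through an instance.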
Hmm, currently
Not sure what you mean by "first instance". There are two call sites:
Are you saying we should log-but-continue in the first of these, so the user gets some kind of vaguely prompt response?
I'd be interested to see what Grafana looks like before and after this change. Presumably this is going to make some endpoints take (slightly?) longer to respond; I wonder if that will be user-perceptible?
Sigh, serves me right for trying to work while on the move. I think what I meant was: for this PR and the next release, change the behaviour to "log error but continue", potentially moving it back to throwing an exception if we never see it later.
I'm saying we should wait e.g. 10s, but then continue on for both. I'm cautious that this change might surface some bugs somewhere that we don't hit in CI.
Thanks, that makes sense. We should be able to see if we hit timeouts by looking for the result of `logging.error` calls in Sentry.
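The "wait e.g. 10s, log an error, then continue" behaviour discussed above could be sketched roughly as follows. Synapse itself is built on Twisted; this sketch swaps in `asyncio` purely for illustration. `wait_for_position` and the `caught_up` event are hypothetical names.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def wait_for_position(caught_up: asyncio.Event, timeout: float = 10.0) -> bool:
    """Wait until `caught_up` is set, giving up after `timeout` seconds.

    Returns True if the position was reached, or False if we timed out and
    the caller should carry on regardless.
    """
    try:
        await asyncio.wait_for(caught_up.wait(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        # Log at error level so the timeout is visible in error reporting,
        # but do not raise: the request continues anyway.
        logger.error(
            "Timed out after %.1fs waiting for stream position; continuing",
            timeout,
        )
        return False
```

Returning a flag instead of raising keeps the request alive, and the error log leaves a trail that can be checked later to decide whether raising would be safe.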
Synapse 1.76.0 (2023-01-31) =========================== The 1.76 release is the first to enable faster joins ([MSC3706](matrix-org/matrix-spec-proposals#3706) and [MSC3902](matrix-org/matrix-spec-proposals#3902)) by default. Admins can opt-out: see [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.76/docs/upgrade.md#faster-joins-are-enabled-by-default) for more details. The upgrade from 1.75 to 1.76 changes the account data replication streams in a backwards-incompatible manner. Server operators running a multi-worker deployment should consult [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.76/docs/upgrade.md#changes-to-the-account-data-replication-streams). Those who are `poetry install`ing from source using our lockfile should ensure their poetry version is 1.3.2 or higher; [see upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.76/docs/upgrade.md#minimum-version-of-poetry-is-now-132). Notes on faster joins --------------------- The faster joins project sees the most benefit when joining a room with a large number of members (joined or historical). We expect it to be particularly useful for joining large public rooms like the [Matrix HQ](https://matrix.to/#/#matrix:matrix.org) or [Synapse Admins](https://matrix.to/#/#synapse:matrix.org) rooms. After a faster join, Synapse considers that room "partially joined". In this state, you should be able to - read incoming messages; - see incoming state changes, e.g. room topic changes; and - send messages, if the room is unencrypted. Synapse has to spend more effort to complete the join in the background. Once this finishes, you will be able to - send messages, if the room is in encrypted; - retrieve room history from before your join, if permitted by the room settings; and - access the full list of room members. Improved Documentation ---------------------- - Describe the ideas and the internal machinery behind faster joins. 
([\#14677](matrix-org/synapse#14677)) Synapse 1.76.0rc2 (2023-01-27) ============================== Bugfixes -------- - Faster joins: Fix a bug introduced in Synapse 1.69 where device list EDUs could fail to be handled after a restart when a faster join sync is in progress. ([\#14914](matrix-org/synapse#14914)) Internal Changes ---------------- - Faster joins: Improve performance of looking up partial-state status of rooms. ([\#14917](matrix-org/synapse#14917)) Synapse 1.76.0rc1 (2023-01-25) ============================== Features -------- - Update the default room version to [v10](https://spec.matrix.org/v1.5/rooms/v10/) ([MSC 3904](matrix-org/matrix-spec-proposals#3904)). Contributed by @FSG-Cat. ([\#14111](matrix-org/synapse#14111)) - Add a `set_displayname()` method to the module API for setting a user's display name. ([\#14629](matrix-org/synapse#14629)) - Add a dedicated listener configuration for `health` endpoint. ([\#14747](matrix-org/synapse#14747)) - Implement support for [MSC3890](matrix-org/matrix-spec-proposals#3890): Remotely silence local notifications. ([\#14775](matrix-org/synapse#14775)) - Implement experimental support for [MSC3930](matrix-org/matrix-spec-proposals#3930): Push rules for ([MSC3381](matrix-org/matrix-spec-proposals#3381)) Polls. ([\#14787](matrix-org/synapse#14787)) - Per [MSC3925](matrix-org/matrix-spec-proposals#3925), bundle the whole of the replacement with any edited events, and optionally inhibit server-side replacement. ([\#14811](matrix-org/synapse#14811)) - Faster joins: always serve a partial join response to servers that request it with the stable query param. ([\#14839](matrix-org/synapse#14839)) - Faster joins: allow non-lazy-loading ("eager") syncs to complete after a partial join by omitting partial state rooms until they become fully stated. ([\#14870](matrix-org/synapse#14870)) - Faster joins: request partial joins by default. Admins can opt-out of this for the time being---see the upgrade notes. 
([\#14905](matrix-org/synapse#14905)) Bugfixes -------- - Add index to improve performance of the `/timestamp_to_event` endpoint used for jumping to a specific date in the timeline of a room. ([\#14799](matrix-org/synapse#14799)) - Fix a long-standing bug where Synapse would exhaust the stack when processing many federation requests where the remote homeserver has disconencted early. ([\#14812](matrix-org/synapse#14812), [\#14842](matrix-org/synapse#14842)) - Fix rare races when using workers. ([\#14820](matrix-org/synapse#14820)) - Fix a bug introduced in Synapse 1.64.0 when using room version 10 with frozen events enabled. ([\#14864](matrix-org/synapse#14864)) - Fix a long-standing bug where the `populate_room_stats` background job could fail on broken rooms. ([\#14873](matrix-org/synapse#14873)) - Faster joins: Fix a bug in worker deployments where the room stats and user directory would not get updated when finishing a fast join until another event is sent or received. ([\#14874](matrix-org/synapse#14874)) - Faster joins: Fix incompatibility with joins into restricted rooms where no local users have the ability to invite. ([\#14882](matrix-org/synapse#14882)) - Fix a regression introduced in Synapse 1.69.0 which can result in database corruption when database migrations are interrupted on sqlite. ([\#14910](matrix-org/synapse#14910)) Updates to the Docker image --------------------------- - Bump default Python version in the Dockerfile from 3.9 to 3.11. ([\#14875](matrix-org/synapse#14875)) Improved Documentation ---------------------- - Include `x_forwarded` entry in the HTTP listener example configs and remove the remaining `worker_main_http_uri` entries. ([\#14667](matrix-org/synapse#14667)) - Remove duplicate commands from the Code Style documentation page; point to the Contributing Guide instead. ([\#14773](matrix-org/synapse#14773)) - Add missing documentation for `tag` to `listeners` section. 
([\#14803](matrix-org/synapse#14803)) - Updated documentation in configuration manual for `user_directory.search_all_users`. ([\#14818](matrix-org/synapse#14818)) - Add `worker_manhole` to configuration manual. ([\#14824](matrix-org/synapse#14824)) - Fix the example config missing the `id` field in [application service documentation](https://matrix-org.github.io/synapse/latest/application_services.html). ([\#14845](matrix-org/synapse#14845)) - Minor corrections to the logging configuration documentation. ([\#14868](matrix-org/synapse#14868)) - Document the export user data command. Contributed by @thezaidbintariq. ([\#14883](matrix-org/synapse#14883)) Deprecations and Removals ------------------------- - Poetry 1.3.2 or higher is now required when `poetry install`ing from source. ([\#14860](matrix-org/synapse#14860)) Internal Changes ---------------- - Faster remote room joins (worker mode): do not populate external hosts-in-room cache when sending events as this requires blocking for full state. ([\#14749](matrix-org/synapse#14749)) - Enable Complement tests for Faster Remote Room Joins against worker-mode Synapse. ([\#14752](matrix-org/synapse#14752)) - Add some clarifying comments and refactor a portion of the `Keyring` class for readability. ([\#14804](matrix-org/synapse#14804)) - Add local poetry config files (`poetry.toml`) to `.gitignore`. ([\#14807](matrix-org/synapse#14807)) - Add missing type hints. ([\#14816](matrix-org/synapse#14816), [\#14885](matrix-org/synapse#14885), [\#14889](matrix-org/synapse#14889)) - Refactor push tests. ([\#14819](matrix-org/synapse#14819)) - Re-enable some linting that was disabled when we switched to ruff. ([\#14821](matrix-org/synapse#14821)) - Add `cargo fmt` and `cargo clippy` to the lint script. ([\#14822](matrix-org/synapse#14822)) - Drop unused table `presence`. ([\#14825](matrix-org/synapse#14825)) - Merge the two account data and the two device list replication streams. 
([\#14826](matrix-org/synapse#14826), [\#14833](matrix-org/synapse#14833)) - Faster joins: use stable identifiers from [MSC3706](matrix-org/matrix-spec-proposals#3706). ([\#14832](matrix-org/synapse#14832), [\#14841](matrix-org/synapse#14841)) - Add a parameter to control whether the federation client performs a partial state join. ([\#14843](matrix-org/synapse#14843)) - Add check to avoid starting duplicate partial state syncs. ([\#14844](matrix-org/synapse#14844)) - Add an early return when handling no-op presence updates. ([\#14855](matrix-org/synapse#14855)) - Fix `wait_for_stream_position` to correctly wait for the right instance to advance its token. ([\#14856](matrix-org/synapse#14856), [\#14872](matrix-org/synapse#14872)) - Always notify replication when a stream advances automatically. ([\#14877](matrix-org/synapse#14877)) - Reduce max time we wait for stream positions. ([\#14881](matrix-org/synapse#14881)) - Faster joins: allow the resync process more time to fetch `/state` ids. ([\#14912](matrix-org/synapse#14912)) - Bump regex from 1.7.0 to 1.7.1. ([\#14848](matrix-org/synapse#14848)) - Bump peaceiris/actions-gh-pages from 3.9.1 to 3.9.2. ([\#14861](matrix-org/synapse#14861)) - Bump ruff from 0.0.215 to 0.0.224. ([\#14862](matrix-org/synapse#14862)) - Bump types-pillow from 9.4.0.0 to 9.4.0.3. ([\#14863](matrix-org/synapse#14863)) - Bump types-opentracing from 2.4.10 to 2.4.10.1. ([\#14896](matrix-org/synapse#14896)) - Bump ruff from 0.0.224 to 0.0.230. ([\#14897](matrix-org/synapse#14897)) - Bump types-requests from 2.28.11.7 to 2.28.11.8. ([\#14899](matrix-org/synapse#14899)) - Bump types-psycopg2 from 2.9.21.2 to 2.9.21.4. ([\#14900](matrix-org/synapse#14900)) - Bump types-commonmark from 0.9.2 to 0.9.2.1. ([\#14901](matrix-org/synapse#14901)) Synapse 1.75.0 (2023-01-17) =========================== No significant changes since 1.75.0rc2. 
Synapse 1.75.0rc2 (2023-01-12) ============================== Bugfixes -------- - Fix a bug introduced in Synapse 1.75.0rc1 where device lists could be miscalculated with some sync filters. ([\#14810](matrix-org/synapse#14810)) - Fix race where calling `/members` or `/state` with an `at` parameter could fail for newly created rooms, when using multiple workers. ([\#14817](matrix-org/synapse#14817)) Synapse 1.75.0rc1 (2023-01-10) ============================== Features -------- - Add a `cached` function to `synapse.module_api` that returns a decorator to cache return values of functions. ([\#14663](matrix-org/synapse#14663)) - Add experimental support for [MSC3391](matrix-org/matrix-spec-proposals#3391) (removing account data). ([\#14714](matrix-org/synapse#14714)) - Support [RFC7636](https://datatracker.ietf.org/doc/html/rfc7636) Proof Key for Code Exchange for OAuth single sign-on. ([\#14750](matrix-org/synapse#14750)) - Support non-OpenID compliant userinfo claims for subject and picture. ([\#14753](matrix-org/synapse#14753)) - Improve performance of `/sync` when filtering all rooms, message types, or senders. ([\#14786](matrix-org/synapse#14786)) - Improve performance of the `/hierarchy` endpoint. ([\#14263](matrix-org/synapse#14263)) Bugfixes -------- - Fix the *MAU Limits* section of the Grafana dashboard relying on a specific `job` name for the workers of a Synapse deployment. ([\#14644](matrix-org/synapse#14644)) - Fix a bug introduced in Synapse 1.70.0 which could cause spurious `UNIQUE constraint failed` errors in the `rotate_notifs` background job. ([\#14669](matrix-org/synapse#14669)) - Ensure stream IDs are always updated after caches get invalidated with workers. Contributed by Nick @ Beeper (@Fizzadar). ([\#14723](matrix-org/synapse#14723)) - Remove the unspecced `device` field from `/pushrules` responses. 
([\#14727](matrix-org/synapse#14727)) - Fix a bug introduced in Synapse 1.73.0 where the `picture_claim` configured under `oidc_providers` was unused (the default value of `"picture"` was used instead). ([\#14751](matrix-org/synapse#14751)) - Unescape HTML entities in URL preview titles making use of oEmbed responses. ([\#14781](matrix-org/synapse#14781)) - Disable sending confirmation email when 3pid is disabled. ([\#14725](matrix-org/synapse#14725)) Improved Documentation ---------------------- - Declare support for Python 3.11. ([\#14673](matrix-org/synapse#14673)) - Fix `target_memory_usage` being used in the description for the actual `cache_autotune` sub-option `target_cache_memory_usage`. ([\#14674](matrix-org/synapse#14674)) - Move `email` to Server section in config file documentation. ([\#14730](matrix-org/synapse#14730)) - Fix broken links in the Synapse documentation. ([\#14744](matrix-org/synapse#14744)) - Add missing worker settings to shared configuration documentation. ([\#14748](matrix-org/synapse#14748)) - Document using Twitter as a OAuth 2.0 authentication provider. ([\#14778](matrix-org/synapse#14778)) - Fix Synapse 1.74 upgrade notes to correctly explain how to install pyICU when installing Synapse from PyPI. ([\#14797](matrix-org/synapse#14797)) - Update link to towncrier in contribution guide. ([\#14801](matrix-org/synapse#14801)) - Use `htmltest` to check links in the Synapse documentation. ([\#14743](matrix-org/synapse#14743)) Internal Changes ---------------- - Faster remote room joins: stream the un-partial-stating of events over replication. ([\#14545](matrix-org/synapse#14545), [\#14546](matrix-org/synapse#14546)) - Use [ruff](https://github.com/charliermarsh/ruff/) instead of flake8. ([\#14633](matrix-org/synapse#14633), [\#14741](matrix-org/synapse#14741)) - Change `handle_new_client_event` signature so that a 429 does not reach clients on `PartialStateConflictError`, and internally retry when needed instead. 
([\#14665](matrix-org/synapse#14665)) - Remove dependency on jQuery on reCAPTCHA page. ([\#14672](matrix-org/synapse#14672)) - Faster joins: make `compute_state_after_events` consistent with other state-fetching functions that take a `StateFilter`. ([\#14676](matrix-org/synapse#14676)) - Add missing type hints. ([\#14680](matrix-org/synapse#14680), [\#14681](matrix-org/synapse#14681), [\#14687](matrix-org/synapse#14687)) - Improve type annotations for the helper methods on a `CachedFunction`. ([\#14685](matrix-org/synapse#14685)) - Check that the SQLite database file exists before porting to PostgreSQL. ([\#14692](matrix-org/synapse#14692)) - Add `.direnv/` directory to .gitignore to prevent local state generated by the [direnv](https://direnv.net/) development tool from being committed. ([\#14707](matrix-org/synapse#14707)) - Batch up replication requests to request the resyncing of remote users's devices. ([\#14716](matrix-org/synapse#14716)) - If debug logging is enabled, log the `msgid`s of any to-device messages that are returned over `/sync`. ([\#14724](matrix-org/synapse#14724)) - Change GHA CI job to follow best practices. ([\#14772](matrix-org/synapse#14772)) - Switch to our fork of `dh-virtualenv` to work around an upstream Python 3.11 incompatibility. ([\#14774](matrix-org/synapse#14774)) - Skip testing built wheels for PyPy 3.7 on Linux x86_64 as we lack new required dependencies in the build environment. ([\#14802](matrix-org/synapse#14802)) ### Dependabot updates <details> - Bump JasonEtco/create-an-issue from 2.8.1 to 2.8.2. ([\#14693](matrix-org/synapse#14693)) - Bump anyhow from 1.0.66 to 1.0.68. ([\#14694](matrix-org/synapse#14694)) - Bump blake2 from 0.10.5 to 0.10.6. ([\#14695](matrix-org/synapse#14695)) - Bump serde_json from 1.0.89 to 1.0.91. ([\#14696](matrix-org/synapse#14696)) - Bump serde from 1.0.150 to 1.0.151. ([\#14697](matrix-org/synapse#14697)) - Bump lxml from 4.9.1 to 4.9.2. 
([\#14698](matrix-org/synapse#14698)) - Bump types-jsonschema from 4.17.0.1 to 4.17.0.2. ([\#14700](matrix-org/synapse#14700)) - Bump sentry-sdk from 1.11.1 to 1.12.0. ([\#14701](matrix-org/synapse#14701)) - Bump types-setuptools from 65.6.0.1 to 65.6.0.2. ([\#14702](matrix-org/synapse#14702)) - Bump minimum PyYAML to 3.13. ([\#14720](matrix-org/synapse#14720)) - Bump JasonEtco/create-an-issue from 2.8.2 to 2.9.1. ([\#14731](matrix-org/synapse#14731)) - Bump towncrier from 22.8.0 to 22.12.0. ([\#14732](matrix-org/synapse#14732)) - Bump isort from 5.10.1 to 5.11.4. ([\#14733](matrix-org/synapse#14733)) - Bump attrs from 22.1.0 to 22.2.0. ([\#14734](matrix-org/synapse#14734)) - Bump black from 22.10.0 to 22.12.0. ([\#14735](matrix-org/synapse#14735)) - Bump sentry-sdk from 1.12.0 to 1.12.1. ([\#14736](matrix-org/synapse#14736)) - Bump setuptools from 65.3.0 to 65.5.1. ([\#14738](matrix-org/synapse#14738)) - Bump serde from 1.0.151 to 1.0.152. ([\#14758](matrix-org/synapse#14758)) - Bump ruff from 0.0.189 to 0.0.206. ([\#14759](matrix-org/synapse#14759)) - Bump pydantic from 1.10.2 to 1.10.4. ([\#14760](matrix-org/synapse#14760)) - Bump gitpython from 3.1.29 to 3.1.30. ([\#14761](matrix-org/synapse#14761)) - Bump pillow from 9.3.0 to 9.4.0. ([\#14762](matrix-org/synapse#14762)) - Bump types-requests from 2.28.11.5 to 2.28.11.7. ([\#14763](matrix-org/synapse#14763)) - Bump dawidd6/action-download-artifact from 2.24.2 to 2.24.3. ([\#14779](matrix-org/synapse#14779)) - Bump peaceiris/actions-gh-pages from 3.9.0 to 3.9.1. ([\#14791](matrix-org/synapse#14791)) - Bump types-pillow from 9.3.0.4 to 9.4.0.0. ([\#14792](matrix-org/synapse#14792)) - Bump pyopenssl from 22.1.0 to 23.0.0. ([\#14793](matrix-org/synapse#14793)) - Bump types-setuptools from 65.6.0.2 to 65.6.0.3. ([\#14794](matrix-org/synapse#14794)) - Bump importlib-metadata from 4.2.0 to 6.0.0. ([\#14795](matrix-org/synapse#14795)) - Bump ruff from 0.0.206 to 0.0.215. 
([\#14796](matrix-org/synapse#14796)) </details> Synapse 1.74.0 (2022-12-20) =========================== Improved Documentation ---------------------- - Add release note and update documentation regarding optional ICU support in user search. ([\#14712](matrix-org/synapse#14712)) Synapse 1.74.0rc1 (2022-12-13) ============================== Features -------- - Improve user search for international display names. ([\#14464](matrix-org/synapse#14464)) - Stop using deprecated `keyIds` parameter when calling `/_matrix/key/v2/server`. ([\#14490](matrix-org/synapse#14490), [\#14525](matrix-org/synapse#14525)) - Add new `push.enabled` config option to allow opting out of push notification calculation. ([\#14551](matrix-org/synapse#14551), [\#14619](matrix-org/synapse#14619)) - Advertise support for Matrix 1.5 on `/_matrix/client/versions`. ([\#14576](matrix-org/synapse#14576)) - Improve opentracing and logging for to-device message handling. ([\#14598](matrix-org/synapse#14598)) - Allow selecting "prejoin" events by state keys in addition to event types. ([\#14642](matrix-org/synapse#14642)) Bugfixes -------- - Fix a long-standing bug where a device list update might not be sent to clients in certain circumstances. ([\#14435](matrix-org/synapse#14435), [\#14592](matrix-org/synapse#14592), [\#14604](matrix-org/synapse#14604)) - Suppress a spurious warning when `POST /rooms/<room_id>/<membership>/`, `POST /join/<room_id_or_alias`, or the unspecced `PUT /join/<room_id_or_alias>/<txn_id>` receive an empty HTTP request body. ([\#14600](matrix-org/synapse#14600)) - Return spec-compliant JSON errors when unknown endpoints are requested. ([\#14620](matrix-org/synapse#14620), [\#14621](matrix-org/synapse#14621)) - Update html templates to load images over HTTPS. Contributed by @ashfame. ([\#14625](matrix-org/synapse#14625)) - Fix a long-standing bug where the user directory would return 1 more row than requested. 
([\#14631](matrix-org/synapse#14631)) - Reject invalid read receipt requests with empty room or event IDs. Contributed by Nick @ Beeper (@Fizzadar). ([\#14632](matrix-org/synapse#14632)) - Fix a bug introduced in Synapse 1.67.0 where not specifying a config file or a server URL would lead to the `register_new_matrix_user` script failing. ([\#14637](matrix-org/synapse#14637)) - Fix a long-standing bug where the user directory and room/user stats might be out of sync. ([\#14639](matrix-org/synapse#14639), [\#14643](matrix-org/synapse#14643)) - Fix a bug introduced in Synapse 1.72.0 where the background updates to add non-thread unique indexes on receipts would fail if they were previously interrupted. ([\#14650](matrix-org/synapse#14650)) - Improve validation of field size limits in events. ([\#14664](matrix-org/synapse#14664)) - Fix bugs introduced in Synapse 1.55.0 and 1.69.0 where application services would not be notified of events in the correct rooms, due to stale caches. ([\#14670](matrix-org/synapse#14670)) Improved Documentation ---------------------- - Update worker settings for `pusher` and `federation_sender` functionality. ([\#14493](matrix-org/synapse#14493)) - Add links to third party package repositories, and point to the bug which highlights Ubuntu's out-of-date packages. ([\#14517](matrix-org/synapse#14517)) - Remove old, incorrect minimum postgres version note and replace with a link to the [Dependency Deprecation Policy](https://matrix-org.github.io/synapse/v1.73/deprecation_policy.html). ([\#14590](matrix-org/synapse#14590)) - Add Single-Sign On setup instructions for Mastodon-based instances. ([\#14594](matrix-org/synapse#14594)) - Change `turn_allow_guests` example value to lowercase `true`. ([\#14634](matrix-org/synapse#14634)) Internal Changes ---------------- - Optimise push badge count calculations. Contributed by Nick @ Beeper (@Fizzadar). 
([\#14255](matrix-org/synapse#14255)) - Faster remote room joins: stream the un-partial-stating of rooms over replication. ([\#14473](matrix-org/synapse#14473), [\#14474](matrix-org/synapse#14474)) - Share the `ClientRestResource` for both workers and the main process. ([\#14528](matrix-org/synapse#14528)) - Add `--editable` flag to `complement.sh` which uses an editable install of Synapse for faster turn-around times whilst developing iteratively. ([\#14548](matrix-org/synapse#14548)) - Faster joins: use servers list approximation to send read receipts when in partial state instead of waiting for the full state of the room. ([\#14549](matrix-org/synapse#14549)) - Modernize unit tests configuration related to workers. ([\#14568](matrix-org/synapse#14568)) - Bump jsonschema from 4.17.0 to 4.17.3. ([\#14591](matrix-org/synapse#14591)) - Fix Rust lint CI. ([\#14602](matrix-org/synapse#14602)) - Bump JasonEtco/create-an-issue from 2.5.0 to 2.8.1. ([\#14607](matrix-org/synapse#14607)) - Alter some unit test environment parameters to decrease time spent running tests. ([\#14610](matrix-org/synapse#14610)) - Switch to Go recommended installation method for `gotestfmt` template in CI. ([\#14611](matrix-org/synapse#14611)) - Bump phonenumbers from 8.13.0 to 8.13.1. ([\#14612](matrix-org/synapse#14612)) - Bump types-setuptools from 65.5.0.3 to 65.6.0.1. ([\#14613](matrix-org/synapse#14613)) - Bump twine from 4.0.1 to 4.0.2. ([\#14614](matrix-org/synapse#14614)) - Bump types-requests from 2.28.11.2 to 2.28.11.5. ([\#14615](matrix-org/synapse#14615)) - Bump cryptography from 38.0.3 to 38.0.4. ([\#14616](matrix-org/synapse#14616)) - Remove useless cargo install with apt from Dockerfile. ([\#14636](matrix-org/synapse#14636)) - Bump certifi from 2021.10.8 to 2022.12.7. ([\#14645](matrix-org/synapse#14645)) - Bump flake8-bugbear from 22.10.27 to 22.12.6. ([\#14656](matrix-org/synapse#14656)) - Bump packaging from 21.3 to 22.0. 
([\#14657](matrix-org/synapse#14657)) - Bump types-pillow from 9.3.0.1 to 9.3.0.4. ([\#14658](matrix-org/synapse#14658)) - Bump serde from 1.0.148 to 1.0.150. ([\#14659](matrix-org/synapse#14659)) - Bump phonenumbers from 8.13.1 to 8.13.2. ([\#14660](matrix-org/synapse#14660)) - Bump authlib from 1.1.0 to 1.2.0. ([\#14661](matrix-org/synapse#14661)) - Move `StateFilter` to `synapse.types`. ([\#14668](matrix-org/synapse#14668)) - Improve type hints. ([\#14597](matrix-org/synapse#14597), [\#14646](matrix-org/synapse#14646), [\#14671](matrix-org/synapse#14671))
* Have replication clients remove `_INT_STREAM_POS`

  Suppose worker A makes an internal HTTP request to worker B. B may make changes that A later learns about over replication. We want A's request to block until it has seen those changes, mainly to ensure A's caches are invalidated promptly. This helps provide read-after-write consistency, eliminating entire categories of races and test flakes.

  To implement this, B includes a top-level field `_INT_STREAM_POS` in its response JSON. Roughly speaking, the field's value tells A what to wait for. But we weren't removing that internal field before A's request completed!

  Introduced in #14820. Fixes #15308.

* Changelog
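As a rough illustration of the fix described above, the client side can pop the internal field before handing the payload back to the caller. `split_response` is a hypothetical helper. The value is assumed here to be a mapping from stream name to position, which the commit message does not spell out.

```python
from typing import Any, Dict, Tuple

# Field name taken from the commit message above.
_STREAM_POSITION_KEY = "_INT_STREAM_POS"


def split_response(body: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, int]]:
    """Separate the internal position field from the payload the caller sees.

    Assumption: the field maps stream names to positions.
    """
    payload = dict(body)  # shallow copy, so the caller's dict is not mutated
    positions = payload.pop(_STREAM_POSITION_KEY, {})
    return payload, positions


payload, positions = split_response(
    {"event_id": "$abc", "_INT_STREAM_POS": {"events": 42}}
)
```

Here `payload` no longer carries the internal field, and `positions` holds what worker A would wait on before returning.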
* Fix bug where a new writer advances their token too quickly

  When starting a new writer (e.g. for persisting events), the `MultiWriterIdGenerator` doesn't have a minimum token for it, as there are no rows matching that new writer in the DB. This results in the first stream ID it acquires being announced as persisted *before* it actually finishes persisting, if another writer gets and persists a subsequent stream ID. This is because the minimum persisted position is set to the minimum known position across all writers, and the new writer starts off not being considered.

* Fix sending out POSITIONs when our token advances without update

  Broke in #14820

* For replication HTTP requests, only wait for minimal position
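The new-writer bug in the first bullet can be reproduced with a toy model. This is a sketch and not the real `MultiWriterIdGenerator`. `persisted_upto` and the writer names are hypothetical, and seeding the new writer with a starting position is only one possible way to address it.

```python
from typing import Dict


def persisted_upto(writer_positions: Dict[str, int]) -> int:
    """Everything at or below the returned ID is treated as persisted.

    Modelled as the minimum position across all *known* writers.
    """
    return min(writer_positions.values(), default=0)


# Buggy case: new writer "w2" has acquired stream ID 11 but has no rows in
# the DB, so it is missing from the map. "w1" then persists ID 12.
buggy = persisted_upto({"w1": 12})

# Possible fix: seed the new writer with the position it started from, so
# the minimum cannot move past its in-flight ID 11.
fixed = persisted_upto({"w1": 12, "w2": 10})
```

In the buggy case the result covers ID 11 even though `w2` has not finished persisting it. Once `w2` is tracked, the minimum stays at its starting position until it reports progress.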
This should hopefully mitigate a class of races where data gets out of sync due to an HTTP replication request racing with the replication streams.
This should hopefully fix flakes with worker-mode Complement tests for partial joins.