Add metrics to track how often events are soft_failed
#10156
Spawned from missing messages we were seeing on `matrix.org` from a federated Gitter bridged room, https://gitlab.com/gitterHQ/webapp/-/issues/2770. The underlying issue in Synapse is tracked by #10066, where the message and join event race and the message is `soft_failed` before the `join` event reaches the remote federated server. Fewer `soft_failed` events is better; usually this should only trigger for events where people are doing bad things and trying to fuzz and fake everything.
Thanks for the detailed explanation. Code looks good, though I wonder if a simple counter is the most useful metric here versus something that may give us an idea of how many rooms soft failed events are appearing across.
I'm not sure of a metric type that could be used for that though, so leaving this open for someone to have a second look.
As @anoadragon453 says, I'm not sure if this will help a great deal, because you don't know where the failures are happening. That said, I don't really think there's a better way to do this, and at least if the soft-failures counter is zero, that gives you an easy way to rule out soft-failure as a problem.
Thanks for the review passes @anoadragon453 and @richvdh 🕺 I think it's useful to know whether it's happening or not regardless of how.
lgtm, thanks!
@@ -2498,6 +2504,7 @@ async def _check_for_soft_fail(
            event_auth.check(room_version_obj, event, auth_events=current_auth_events)
        except AuthError as e:
            logger.warning("Soft-failing %r because %s", event, e)
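The pattern in this hunk, counting each soft-failure in the `except` branch next to the existing warning, can be sketched as below. This is a minimal illustration, not the code from the PR: `SimpleCounter` is a hypothetical in-process stand-in for a Prometheus counter, `AuthError` is redefined locally, and `check_for_soft_fail` and its `auth_check` parameter are simplified assumptions. Only the metric name `synapse_federation_soft_failed_events_total` comes from the release notes.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


class AuthError(Exception):
    """Local stand-in for Synapse's AuthError (assumption for this sketch)."""


class SimpleCounter:
    """Hypothetical in-process counter standing in for a Prometheus counter."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = 0

    def inc(self) -> None:
        self.value += 1


# Metric name taken from the changelog entry; the counter type is a stand-in.
soft_failed_event_counter = SimpleCounter(
    "synapse_federation_soft_failed_events_total"
)


def check_for_soft_fail(event: dict, auth_check: Callable[[dict], None]) -> bool:
    """Return True if the event fails auth against current state (soft-failed)."""
    try:
        auth_check(event)
    except AuthError as e:
        logger.warning("Soft-failing %r because %s", event, e)
        # Count the soft-failure right where it is detected.
        soft_failed_event_counter.inc()
        return True
    return False
```

As noted in the review, a single counter does not say where the failures happen, but a value of zero is a quick way to rule soft-failure out.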
For the future:
If we want to dig deeper into what rooms are soft_failing messages, we could use the Elasticsearch logs and some aggregations (Kibana or raw ES queries). We can add a few fields to the Elasticsearch mapping, `room_id`, `mxid`, `event_id`, which should handle the high cardinality (lots of different values) just fine.
It looks like we use Logstash or Filebeat and I assume there is a way we can parse another field out besides `message`? Maybe https://stackoverflow.com/q/40460830/796832
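As a rough local equivalent of the aggregation described above, the sketch below counts soft-failures per room from JSON log lines. It assumes each line is a JSON object with a `message` key holding the log text and a `room_id` key; those key names and the `soft_failures_per_room` helper are assumptions for illustration, not the actual log schema.

```python
import json
from collections import Counter
from typing import Iterable


def soft_failures_per_room(log_lines: Iterable[str]) -> Counter:
    """Count "Soft-failing ..." log records per room_id.

    Lines that are not JSON objects, or that lack a room_id, are skipped.
    """
    counts: Counter = Counter()
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        message = record.get("message", "")
        room_id = record.get("room_id")
        if isinstance(message, str) and message.startswith("Soft-failing") and room_id:
            counts[room_id] += 1
    return counts
```

Calling `soft_failures_per_room(lines).most_common(10)` would then list the rooms with the most soft-failed events, which is the per-room view the plain counter cannot give.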
To document my additional findings here: we can already use `logger.info('foo', extra={"foo": "bar"})`, as shown in tests/logging/test_terse_json.py#L64-L74, to add extra fields to the structured logging.
- Docs on structured logging in Synapse: https://github.com/matrix-org/synapse/blob/develop/docs/structured_logging.md
- Improve structured logging: #8683
- Record more information into structured logs: #9654
- Use structured logging for EMS: https://github.com/matrix-org/matrix-hosted/issues/1229
- Use structured logging for `matrix.org`: https://github.com/matrix-org/internal-config/issues/889
- Allow simultaneously enabling structured logging and "standard" logging, which was implemented in #8607 (Support outputting structured logs in addition to standard logs)
- List of issues with the `z-logging` label: https://github.com/matrix-org/synapse/labels/z-logging
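To illustrate how `extra=` carries fields such as `room_id`, `mxid` and `event_id` into a structured log line, here is a small sketch using only the standard library. `JsonFormatter` is a hypothetical stand-in written for this example, not Synapse's own formatter, and the `log` and `level` output keys are assumptions.

```python
import io
import json
import logging

# Attribute names present on every LogRecord; anything else came from `extra`.
_STANDARD_ATTRS = set(
    logging.LogRecord("", 0, "", 0, "", (), None).__dict__
) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: emit the message plus any `extra` fields as JSON."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"log": record.getMessage(), "level": record.levelname}
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        return json.dumps(payload)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

log = logging.getLogger("soft_fail_example")
log.handlers = [handler]
log.setLevel(logging.INFO)
log.propagate = False

# The extra dict becomes attributes on the record, then top-level JSON keys.
log.warning(
    "Soft-failing %s because %s",
    "$event1",
    "sender is not joined",
    extra={
        "room_id": "!room:example.org",
        "mxid": "@bob:example.org",
        "event_id": "$event1",
    },
)
```

Because the fields end up as separate top-level keys, a log shipper can index them directly instead of parsing them out of the message text.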
Created #10168 to add these fields
Synapse 1.37.0rc1 (2021-06-24)
==============================

This release deprecates the current spam checker interface. See the [upgrade notes](https://matrix-org.github.io/synapse/develop/upgrade#deprecation-of-the-current-spam-checker-interface) for more information on how to update to the new generic module interface.

This release also removes support for fetching and renewing TLS certificates using the ACME v1 protocol, which has been fully decommissioned by Let's Encrypt on June 1st 2021. Admins previously using this feature should use a [reverse proxy](https://matrix-org.github.io/synapse/develop/reverse_proxy.html) to handle TLS termination, or use an external ACME client (such as [certbot](https://certbot.eff.org/)) to retrieve a certificate and key and provide them to Synapse using the `tls_certificate_path` and `tls_private_key_path` configuration settings.

Features
--------

- Implement "room knocking" as per [MSC2403](matrix-org/matrix-spec-proposals#2403). Contributed by @Sorunome and anoa. ([\#6739](#6739), [\#9359](#9359), [\#10167](#10167), [\#10212](#10212), [\#10227](#10227))
- Add experimental support for backfilling history into rooms ([MSC2716](matrix-org/matrix-spec-proposals#2716)). ([\#9247](#9247))
- Implement a generic interface for third-party plugin modules. ([\#10062](#10062), [\#10206](#10206))
- Implement config option `sso.update_profile_information` to sync SSO users' profile information with the identity provider each time they login. Currently only displayname is supported. ([\#10108](#10108))
- Ensure that errors during startup are written to the logs and the console. ([\#10191](#10191))

Bugfixes
--------

- Fix a bug introduced in Synapse v1.25.0 that prevented the `ip_range_whitelist` configuration option from working for federation and identity servers. Contributed by @mikure. ([\#10115](#10115))
- Remove a broken import line in Synapse's `admin_cmd` worker. Broke in Synapse v1.33.0. ([\#10154](#10154))
- Fix a bug introduced in Synapse v1.21.0 which could cause `/sync` to return immediately with an empty response. ([\#10157](#10157), [\#10158](#10158))
- Fix a minor bug in the response to `/_matrix/client/r0/user/{user}/openid/request_token` causing `expires_in` to be a float instead of an integer. Contributed by @lukaslihotzki. ([\#10175](#10175))
- Always require users to re-authenticate for dangerous operations: deactivating an account, modifying an account password, and adding 3PIDs. ([\#10184](#10184))
- Fix a bug introduced in Synapse v1.7.2 where remote server count metrics collection would be incorrectly delayed on startup. Found by @heftig. ([\#10195](#10195))
- Fix a bug introduced in Synapse v1.35.1 where an `allow` key of a `m.room.join_rules` event could be applied for incorrect room versions and configurations. ([\#10208](#10208))
- Fix performance regression in responding to user key requests over federation. Introduced in Synapse v1.34.0rc1. ([\#10221](#10221))

Improved Documentation
----------------------

- Add a new guide to decoding request logs. ([\#8436](#8436))
- Mention in the sample homeserver config that you may need to configure max upload size in your reverse proxy. Contributed by @aaronraimist. ([\#10122](#10122))
- Fix broken links in documentation. ([\#10180](#10180))
- Deploy a snapshot of the documentation website upon each new Synapse release. ([\#10198](#10198))

Deprecations and Removals
-------------------------

- The current spam checker interface is deprecated in favour of a new generic modules system. See the [upgrade notes](https://matrix-org.github.io/synapse/develop/upgrade#deprecation-of-the-current-spam-checker-interface) for more information on how to update to the new system. ([\#10062](#10062), [\#10210](#10210), [\#10238](#10238))
- Stop supporting the unstable spaces prefixes from MSC1772. ([\#10161](#10161))
- Remove Synapse's support for automatically fetching and renewing certificates using the ACME v1 protocol. This protocol has been fully turned off by Let's Encrypt for existing installations on June 1st 2021. Admins previously using this feature should use a [reverse proxy](https://matrix-org.github.io/synapse/develop/reverse_proxy.html) to handle TLS termination, or use an external ACME client (such as [certbot](https://certbot.eff.org/)) to retrieve a certificate and key and provide them to Synapse using the `tls_certificate_path` and `tls_private_key_path` configuration settings. ([\#10194](#10194))

Internal Changes
----------------

- Update the database schema versioning to support gradual migration away from legacy tables. ([\#9933](#9933))
- Add type hints to the federation servlets. ([\#10080](#10080))
- Improve OpenTracing for event persistence. ([\#10134](#10134), [\#10193](#10193))
- Clean up the interface for injecting OpenTracing over HTTP. ([\#10143](#10143))
- Limit the number of in-flight `/keys/query` requests from a single device. ([\#10144](#10144))
- Refactor EventPersistenceQueue. ([\#10145](#10145))
- Document `SYNAPSE_TEST_LOG_LEVEL` to see the logger output when running tests. ([\#10148](#10148))
- Update the Complement build tags in GitHub Actions to test currently experimental features. ([\#10155](#10155))
- Add a `synapse_federation_soft_failed_events_total` metric to track how often events are soft failed. ([\#10156](#10156))
- Fetch the corresponding complement branch when performing CI. ([\#10160](#10160))
- Add some developer documentation about boolean columns in database schemas. ([\#10164](#10164))
- Add extra logging fields to better debug where events are being soft failed. ([\#10168](#10168))
- Add debug logging for when we enter and exit `Measure` blocks. ([\#10183](#10183))
- Improve comments in structured logging code. ([\#10188](#10188))
- Update [MSC3083](matrix-org/matrix-spec-proposals#3083) support with modifications from the MSC. ([\#10189](#10189))
- Remove redundant DNS lookup limiter. ([\#10190](#10190))
- Upgrade `black` linting tool to 21.6b0. ([\#10197](#10197))
- Expose OpenTracing trace id in response headers. ([\#10199](#10199))
Synapse 1.37.0 (2021-06-29)
===========================

This release deprecates the current spam checker interface. See the [upgrade notes](https://matrix-org.github.io/synapse/develop/upgrade#deprecation-of-the-current-spam-checker-interface) for more information on how to update to the new generic module interface.

This release also removes support for fetching and renewing TLS certificates using the ACME v1 protocol, which has been fully decommissioned by Let's Encrypt on June 1st 2021. Admins previously using this feature should use a [reverse proxy](https://matrix-org.github.io/synapse/develop/reverse_proxy.html) to handle TLS termination, or use an external ACME client (such as [certbot](https://certbot.eff.org/)) to retrieve a certificate and key and provide them to Synapse using the `tls_certificate_path` and `tls_private_key_path` configuration settings.

Synapse 1.37.0rc1 (2021-06-24)
==============================

Features
--------

- Implement "room knocking" as per [MSC2403](matrix-org/matrix-spec-proposals#2403). Contributed by @Sorunome and anoa. ([\#6739](matrix-org/synapse#6739), [\#9359](matrix-org/synapse#9359), [\#10167](matrix-org/synapse#10167), [\#10212](matrix-org/synapse#10212), [\#10227](matrix-org/synapse#10227))
- Add experimental support for backfilling history into rooms ([MSC2716](matrix-org/matrix-spec-proposals#2716)). ([\#9247](matrix-org/synapse#9247))
- Implement a generic interface for third-party plugin modules. ([\#10062](matrix-org/synapse#10062), [\#10206](matrix-org/synapse#10206))
- Implement config option `sso.update_profile_information` to sync SSO users' profile information with the identity provider each time they login. Currently only displayname is supported. ([\#10108](matrix-org/synapse#10108))
- Ensure that errors during startup are written to the logs and the console. ([\#10191](matrix-org/synapse#10191))

Bugfixes
--------

- Fix a bug introduced in Synapse v1.25.0 that prevented the `ip_range_whitelist` configuration option from working for federation and identity servers. Contributed by @mikure. ([\#10115](matrix-org/synapse#10115))
- Remove a broken import line in Synapse's `admin_cmd` worker. Broke in Synapse v1.33.0. ([\#10154](matrix-org/synapse#10154))
- Fix a bug introduced in Synapse v1.21.0 which could cause `/sync` to return immediately with an empty response. ([\#10157](matrix-org/synapse#10157), [\#10158](matrix-org/synapse#10158))
- Fix a minor bug in the response to `/_matrix/client/r0/user/{user}/openid/request_token` causing `expires_in` to be a float instead of an integer. Contributed by @lukaslihotzki. ([\#10175](matrix-org/synapse#10175))
- Always require users to re-authenticate for dangerous operations: deactivating an account, modifying an account password, and adding 3PIDs. ([\#10184](matrix-org/synapse#10184))
- Fix a bug introduced in Synapse v1.7.2 where remote server count metrics collection would be incorrectly delayed on startup. Found by @heftig. ([\#10195](matrix-org/synapse#10195))
- Fix a bug introduced in Synapse v1.35.1 where an `allow` key of a `m.room.join_rules` event could be applied for incorrect room versions and configurations. ([\#10208](matrix-org/synapse#10208))
- Fix performance regression in responding to user key requests over federation. Introduced in Synapse v1.34.0rc1. ([\#10221](matrix-org/synapse#10221))

Improved Documentation
----------------------

- Add a new guide to decoding request logs. ([\#8436](matrix-org/synapse#8436))
- Mention in the sample homeserver config that you may need to configure max upload size in your reverse proxy. Contributed by @aaronraimist. ([\#10122](matrix-org/synapse#10122))
- Fix broken links in documentation. ([\#10180](matrix-org/synapse#10180))
- Deploy a snapshot of the documentation website upon each new Synapse release. ([\#10198](matrix-org/synapse#10198))

Deprecations and Removals
-------------------------

- The current spam checker interface is deprecated in favour of a new generic modules system. See the [upgrade notes](https://matrix-org.github.io/synapse/develop/upgrade#deprecation-of-the-current-spam-checker-interface) for more information on how to update to the new system. ([\#10062](matrix-org/synapse#10062), [\#10210](matrix-org/synapse#10210), [\#10238](matrix-org/synapse#10238))
- Stop supporting the unstable spaces prefixes from MSC1772. ([\#10161](matrix-org/synapse#10161))
- Remove Synapse's support for automatically fetching and renewing certificates using the ACME v1 protocol. This protocol has been fully turned off by Let's Encrypt for existing installations on June 1st 2021. Admins previously using this feature should use a [reverse proxy](https://matrix-org.github.io/synapse/develop/reverse_proxy.html) to handle TLS termination, or use an external ACME client (such as [certbot](https://certbot.eff.org/)) to retrieve a certificate and key and provide them to Synapse using the `tls_certificate_path` and `tls_private_key_path` configuration settings. ([\#10194](matrix-org/synapse#10194))

Internal Changes
----------------

- Update the database schema versioning to support gradual migration away from legacy tables. ([\#9933](matrix-org/synapse#9933))
- Add type hints to the federation servlets. ([\#10080](matrix-org/synapse#10080))
- Improve OpenTracing for event persistence. ([\#10134](matrix-org/synapse#10134), [\#10193](matrix-org/synapse#10193))
- Clean up the interface for injecting OpenTracing over HTTP. ([\#10143](matrix-org/synapse#10143))
- Limit the number of in-flight `/keys/query` requests from a single device. ([\#10144](matrix-org/synapse#10144))
- Refactor EventPersistenceQueue. ([\#10145](matrix-org/synapse#10145))
- Document `SYNAPSE_TEST_LOG_LEVEL` to see the logger output when running tests. ([\#10148](matrix-org/synapse#10148))
- Update the Complement build tags in GitHub Actions to test currently experimental features. ([\#10155](matrix-org/synapse#10155))
- Add a `synapse_federation_soft_failed_events_total` metric to track how often events are soft failed. ([\#10156](matrix-org/synapse#10156))
- Fetch the corresponding complement branch when performing CI. ([\#10160](matrix-org/synapse#10160))
- Add some developer documentation about boolean columns in database schemas. ([\#10164](matrix-org/synapse#10164))
- Add extra logging fields to better debug where events are being soft failed. ([\#10168](matrix-org/synapse#10168))
- Add debug logging for when we enter and exit `Measure` blocks. ([\#10183](matrix-org/synapse#10183))
- Improve comments in structured logging code. ([\#10188](matrix-org/synapse#10188))
- Update [MSC3083](matrix-org/matrix-spec-proposals#3083) support with modifications from the MSC. ([\#10189](matrix-org/synapse#10189))
- Remove redundant DNS lookup limiter. ([\#10190](matrix-org/synapse#10190))
- Upgrade `black` linting tool to 21.6b0. ([\#10197](matrix-org/synapse#10197))
- Expose OpenTracing trace id in response headers. ([\#10199](matrix-org/synapse#10199))

Synapse 1.36.0 (2021-06-15)
===========================

No significant changes.

Synapse 1.36.0rc2 (2021-06-11)
==============================

Bugfixes
--------

- Fix a bug which caused presence updates to stop working some time after a restart, when using a presence writer worker. Broke in v1.33.0. ([\#10149](matrix-org/synapse#10149))
- Fix a bug when using federation sender worker where it would send out more presence updates than necessary, leading to high resource usage. Broke in v1.33.0. ([\#10163](matrix-org/synapse#10163))
- Fix a bug where Synapse could send the same presence update to a remote twice. ([\#10165](matrix-org/synapse#10165))

Synapse 1.36.0rc1 (2021-06-08)
==============================

Features
--------

- Add new endpoint `/_matrix/client/r0/rooms/{roomId}/aliases` from Client-Server API r0.6.1 (previously [MSC2432](matrix-org/matrix-spec-proposals#2432)). ([\#9224](matrix-org/synapse#9224))
- Improve performance of incoming federation transactions in large rooms. ([\#9953](matrix-org/synapse#9953), [\#9973](matrix-org/synapse#9973))
- Rewrite logic around verifying JSON object and fetching server keys to be more performant and use less memory. ([\#10035](matrix-org/synapse#10035))
- Add new admin APIs for unprotecting local media from quarantine. Contributed by @dklimpel. ([\#10040](matrix-org/synapse#10040))
- Add new admin APIs to remove media by media ID from quarantine. Contributed by @dklimpel. ([\#10044](matrix-org/synapse#10044))
- Make reason and score parameters optional for reporting content. Implements [MSC2414](matrix-org/matrix-spec-proposals#2414). Contributed by Callum Brown. ([\#10077](matrix-org/synapse#10077))
- Add support for routing more requests to workers. ([\#10084](matrix-org/synapse#10084))
- Report OpenTracing spans for database activity. ([\#10113](matrix-org/synapse#10113), [\#10136](matrix-org/synapse#10136), [\#10141](matrix-org/synapse#10141))
- Significantly reduce memory usage of joining large remote rooms. ([\#10117](matrix-org/synapse#10117))

Bugfixes
--------

- Fixed a bug causing replication requests to fail when receiving a lot of events via federation. ([\#10082](matrix-org/synapse#10082))
- Fix a bug in the `force_tracing_for_users` option introduced in Synapse v1.35 which meant that the OpenTracing spans produced were missing most tags. ([\#10092](matrix-org/synapse#10092))
- Fixed a bug that could cause Synapse to stop notifying application services. Contributed by Willem Mulder. ([\#10107](matrix-org/synapse#10107))
- Fix bug where the server would attempt to fetch the same history in the room from a remote server multiple times in parallel. ([\#10116](matrix-org/synapse#10116))
- Fix a bug introduced in Synapse 1.33.0 which caused replication requests to fail when receiving a lot of very large events via federation. ([\#10118](matrix-org/synapse#10118))
- Fix bug when using workers where pagination requests failed if a remote server returned zero events from `/backfill`. Introduced in 1.35.0. ([\#10133](matrix-org/synapse#10133))

Improved Documentation
----------------------

- Clarify security note regarding hosting Synapse on the same domain as other web applications. ([\#9221](matrix-org/synapse#9221))
- Update CAPTCHA documentation to mention turning off the verify origin feature. Contributed by @aaronraimist. ([\#10046](matrix-org/synapse#10046))
- Tweak wording of database recommendation in `INSTALL.md`. Contributed by @aaronraimist. ([\#10057](matrix-org/synapse#10057))
- Add initial infrastructure for rendering Synapse documentation with mdbook. ([\#10086](matrix-org/synapse#10086))
- Convert the remaining Admin API documentation files to markdown. ([\#10089](matrix-org/synapse#10089))
- Make a link in docs use HTTPS. Contributed by @RhnSharma. ([\#10130](matrix-org/synapse#10130))
- Fix broken link in Docker docs. ([\#10132](matrix-org/synapse#10132))

Deprecations and Removals
-------------------------

- Remove the experimental `spaces_enabled` flag. The spaces features are always available now. ([\#10063](matrix-org/synapse#10063))

Internal Changes
----------------

- Tell CircleCI to build Docker images from `main` branch. ([\#9906](matrix-org/synapse#9906))
- Simplify naming convention for release branches to only include the major and minor version numbers. ([\#10013](matrix-org/synapse#10013))
- Add `parse_strings_from_args` for parsing an array from query parameters. ([\#10048](matrix-org/synapse#10048), [\#10137](matrix-org/synapse#10137))
- Remove some dead code regarding TLS certificate handling. ([\#10054](matrix-org/synapse#10054))
- Remove redundant, unmaintained `convert_server_keys` script. ([\#10055](matrix-org/synapse#10055))
- Improve the error message printed by synctl when synapse fails to start. ([\#10059](matrix-org/synapse#10059))
- Fix GitHub Actions lint for newsfragments. ([\#10069](matrix-org/synapse#10069))
- Update opentracing to inject the right context into the carrier. ([\#10074](matrix-org/synapse#10074))
- Fix up `BatchingQueue` implementation. ([\#10078](matrix-org/synapse#10078))
- Log method and path when dropping request due to size limit. ([\#10091](matrix-org/synapse#10091))
- In Github Actions workflows, summarize the Sytest results in an easy-to-read format. ([\#10094](matrix-org/synapse#10094))
- Make `/sync` do fewer state resolutions. ([\#10102](matrix-org/synapse#10102))
- Add missing type hints to the admin API servlets. ([\#10105](matrix-org/synapse#10105))
- Improve opentracing annotations for `Notifier`. ([\#10111](matrix-org/synapse#10111))
- Enable Prometheus metrics for the jaeger client library. ([\#10112](matrix-org/synapse#10112))
- Work to improve the responsiveness of `/sync` requests. ([\#10124](matrix-org/synapse#10124))
- OpenTracing: use a consistent name for background processes. ([\#10135](matrix-org/synapse#10135))
Add metrics to track how often events are `soft_failed`

Spawned from missing messages we were seeing on `matrix.org` from a federated Gitter bridged room, https://gitlab.com/gitterHQ/webapp/-/issues/2770. The underlying issue in Synapse is tracked by #10066, where the message and join event race and the message is `soft_failed` before the `join` event reaches the remote federated server.

Fewer `soft_failed` events = better, and usually this should only trigger for events where people are doing bad things and trying to fuzz and fake everything.

This metric does not track what situation causes the event to be `soft_failed`, but it could be used to know roughly how many people are potentially running into #10066.

Pull Request Checklist
- Pull request includes a sign off
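
For anyone skimming, here is a rough toy model of the race from the description and of what the counter would tally. It is only a sketch under loud assumptions: the `Event` class, the `receive` helper, the user ID and the "sender must already be joined" check are all hypothetical simplifications, not Synapse's actual auth or soft-fail logic, and a plain `collections.Counter` stands in for a real Prometheus counter. Only the metric name comes from the changelog entry.

```python
from collections import Counter
from dataclasses import dataclass

# Stand-in for a Prometheus counter. The metric name is taken from the
# changelog entry; everything else in this sketch is hypothetical.
metrics = Counter()
METRIC_NAME = "synapse_federation_soft_failed_events_total"


@dataclass
class Event:
    event_id: str
    sender: str
    type: str  # "m.room.member" (treated as a join here) or "m.room.message"
    soft_failed: bool = False


def receive(events, joined=None):
    """Process events in arrival order against the receiver's current state."""
    joined = set(joined or ())
    for ev in events:
        if ev.type == "m.room.member":
            # Simplification: every membership event is a successful join.
            joined.add(ev.sender)
        elif ev.sender not in joined:
            # As far as this server knows the sender is not in the room yet,
            # so the message is soft failed and the counter is bumped.
            ev.soft_failed = True
            metrics[METRIC_NAME] += 1
    return events


join = Event("$join", "@alice:gitter.example", "m.room.member")
msg = Event("$msg", "@alice:gitter.example", "m.room.message")

# The message wins the race and arrives before the join.
receive([msg, join])
print(msg.soft_failed, metrics[METRIC_NAME])
```

If the same two events arrive in the order `[join, msg]`, nothing is soft failed and the counter stays put, which is the "counter is zero, so soft-failure can be ruled out" case mentioned in the review.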