Improve handling of SRV records for federation connections #3016
Can one of the admins verify this patch?
@matrixbot: test this please
@silkeh: can you merge latest develop into your branch? I think the test failures may have been fixed.
So I have no real idea why this was so complicated in the first place, but this looks correct now. I suspect it was so that IPv4 was preferred when both were available, though as you note, the reasons for that have largely gone away and our attempts to work around it are causing bigger problems. I'd like to see the tests passing, but otherwise unless @erikjohnston objects I'm going to land this.
9a314fd to 9e4160d
Done. All green :)
@silkeh: can you also update the comment at https://github.com/silkeh/synapse/blob/9e4160dc0309c419fdf6d2ae23fb38c0b2983fc9/synapse/http/endpoint.py#L34 ?
Signed-off-by: Silke Hofstra <silke@slxh.eu>
9e4160d to 72251d1
@richvdh sure.
thank you!
@silkeh ftr this fixed my personal homeserver (which was trying to talk ipv6 most of the time despite not having an ipv6 stack...) so: thanks! :)
Changes in synapse v0.28.0-rc1 (2018-04-26)
===========================================

Bug Fixes:

* Fix quarantine media admin API and search reindex (PR #3130)
* Fix media admin APIs (PR #3134)

Changes in synapse v0.28.0-rc1 (2018-04-24)
===========================================

Minor performance improvement to federation sending and bug fixes.

(Note: This release does not include state resolutions discussed in matrix live)

Features:

* Add metrics for event processing lag (PR #3090)
* Add metrics for ResponseCache (PR #3092)

Changes:

* Synapse on PyPy (PR #2760) Thanks to @Valodim!
* move handling of auto_join_rooms to RegisterHandler (PR #2996) Thanks to @krombel!
* Improve handling of SRV records for federation connections (PR #3016) Thanks to @silkeh!
* Document the behaviour of ResponseCache (PR #3059)
* Preparation for py3 (PR #3061, #3073, #3074, #3075, #3103, #3104, #3106, #3107, #3109, #3110) Thanks to @NotAFile!
* update prometheus dashboard to use new metric names (PR #3069) Thanks to @krombel!
* use python3-compatible prints (PR #3074) Thanks to @NotAFile!
* Send federation events concurrently (PR #3078)
* Limit concurrent event sends for a room (PR #3079)
* Improve R30 stat definition (PR #3086)
* Send events to ASes concurrently (PR #3088)
* Refactor ResponseCache usage (PR #3093)
* Clarify that SRV may not point to a CNAME (PR #3100) Thanks to @silkeh!
* Use str(e) instead of e.message (PR #3103) Thanks to @NotAFile!
* Use six.itervalues in some places (PR #3106) Thanks to @NotAFile!
* Refactor store.have_events (PR #3117)

Bug Fixes:

* Return 401 for invalid access_token on logout (PR #2938) Thanks to @dklug!
* Return a 404 rather than a 500 on rejoining empty rooms (PR #3080)
* fix federation_domain_whitelist (PR #3099)
* Avoid creating events with huge numbers of prev_events (PR #3113)
* Reject events which have lots of prev_events (PR #3118)
This is an updated version of #1934 (which cannot be reopened because this is a rework rather than a continuation).
SRV records are currently resolved not only to hosts, but also to the IP addresses belonging to those hosts. This is in contrast to hosts without SRV records, for which the hostname is used. Removing the manual address resolution allows Twisted to do the heavy lifting.
The problem with the current SRV resolving behaviour is that while the code currently does resolve both A and AAAA, the server's native capabilities are not taken into account.
This breaks connections from IPv6-only servers to dual-stack servers, including setups that rely on IPv6 transition methods like DNS64/NAT64.
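To illustrate the idea, here is a minimal sketch of resolving a server name to (hostname, port) candidates without looking up A/AAAA records by hand. It is not the code in this PR: `SrvRecord`, `connection_candidates`, and the deterministic ordering are hypothetical names and simplifications introduced for the example, and the default port of 8448 is assumed.

```python
from typing import List, NamedTuple, Tuple


class SrvRecord(NamedTuple):
    """Hypothetical container for one SRV answer (not Synapse's own type)."""
    host: str
    port: int
    priority: int
    weight: int


def connection_candidates(
    server_name: str,
    srv_records: List[SrvRecord],
    default_port: int = 8448,  # assumed default federation port
) -> List[Tuple[str, int]]:
    """Return (hostname, port) pairs to try, in order.

    No address lookup happens here: the hostnames are returned as-is so
    that a hostname-based endpoint can pick an address family that the
    local network stack actually supports.
    """
    if not srv_records:
        # No SRV records: connect to the server name itself.
        return [(server_name, default_port)]

    # Simplification: lowest priority first, then highest weight first.
    # RFC 2782 asks for a weighted random choice within a priority; that
    # is left out to keep the sketch deterministic.
    ordered = sorted(srv_records, key=lambda r: (r.priority, -r.weight))
    return [(r.host, r.port) for r in ordered]


if __name__ == "__main__":
    records = [
        SrvRecord("backup.example.org", 8448, priority=20, weight=0),
        SrvRecord("matrix.example.org", 443, priority=10, weight=5),
    ]
    for host, port in connection_candidates("example.org", records):
        # Each pair could be handed to Twisted's
        # HostnameEndpoint(reactor, host, port), which resolves the name
        # and attempts the connection itself.
        print(host, port)
```

The point of the sketch is only that the SRV target stays a hostname until the endpoint connects, so address selection is no longer done manually.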
The concerns of the last PR were:
@14mRh4X0r:
#1886 has been fixed now. The configuration of currently misconfigured servers (~100) will need to be updated. Until then, “connection refused” messages will occur. This is technically correct behaviour.
@erikjohnston:
I think Happy Eyeballs should suffice. Twisted's HostnameEndpoint should support this, and it is used in the current implementation.
Actually performing a fallback on a TCP “connection refused” seems to me to be outside of the scope of this PR (and related code) as this can be caused by many other things.
Edit: this may also resolve the behaviour seen in #2850