This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

In IPv6-only environment, can't connect to Redis due to DNS only looking for A records #10694

Open
xfk opened this issue Aug 25, 2021 · 14 comments
Labels
S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments


xfk commented Aug 25, 2021

Description

My Redis instance has only an IPv6 address. The matrix-synapse container is throwing a DNS lookup error:
2021-08-24 13:30:55,585 - twisted - 258 - ERROR - sentinel- Message: 'Connection to redis server %s failed: %s'
2021-08-24 13:30:55,585 - twisted - 258 - ERROR - sentinel- Arguments: ("b'matrix-test-redis-master':6379", DNSLookupError(b'matrix-test-redis-master'))
It also doesn't work with a FQDN IPv6-only address.

I ran tcpdump on the DNS traffic in the container; it only queries for A records (not AAAA).

  • Platform:

It's running in an IPv6-only Kubernetes cluster.
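
To check from inside the container whether the name resolves over IPv6 at all, independently of Twisted's resolver, one option is to ask the system resolver directly. The following is a minimal sketch using only the Python standard library; the helper name `resolved_families` is made up for illustration. Note that `getaddrinfo` goes through the system resolver, so it shows what the name resolves to, not which record types Synapse or Twisted will query.

```python
import socket


def resolved_families(host, port=6379):
    """Return the sorted names of the address families `host` resolves to.

    Uses the system resolver (not Twisted's), so this only tells us what
    the name can resolve to from this machine.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return sorted({family.name for family, *_ in infos})
```

Calling `resolved_families("matrix-test-redis-master")` in the affected container would be expected to list only `AF_INET6` if the name really has no A record. A `socket.gaierror` would instead point at the cluster DNS rather than at Twisted.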

@reivilibre reivilibre changed the title DNS problems in IPv6-only environment In IPv6-only environment, can't connect to Redis due to DNS only looking for A records Aug 26, 2021
Contributor

reivilibre commented Aug 26, 2021

> It also doesn't work with a FQDN IPv6-only address.

I'm assuming 'FQDN' is spurious here and that you meant a full IPv6 address (rather than a domain name).

In any case, I tried this briefly with Wireshark and confirm that it only appears to look up A records (whereas other traffic on my machine does appear to be searching for AAAA records, so I don't think my Wiresharking skills are to blame).
I'm not sure why this is the case: the code looks to be using the right Twisted machinery (whose documentation even claims it will accept IPv6 addresses), and I've heard from people that IPv6 works for federation.

@reivilibre reivilibre added T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. S-Major Major functionality / product severely impaired, no satisfactory workaround. labels Aug 26, 2021
Author

xfk commented Aug 26, 2021

By FQDN, I meant a domain name (matrix-test-redis.example.com) which only has AAAA records.

I've been trying to debug the problem further with twisted. I added "IPv6Address" in the list in the following line and it started searching for (and found) AAAA records:
https://github.com/twisted/twisted/blob/trunk/src/twisted/internet/_resolver.py#L305

Now I'm getting a different error and trying to debug it further:
2021-08-26 11:46:57,319 - synapse.replication.tcp.redis - 271 - INFO - sentinel- Connection to redis server b'matrix-test-redis-master':6379 failed: An error occurred while connecting: -9: Unknown error -9.

EDIT: Although the correct address is found with the above modification, Synapse/Twisted never tries to connect to that address (judging by tcpdump), and only "Unknown error -9" shows up in the Synapse logs.

Member

richvdh commented Aug 26, 2021

I suspect this has the same cause as #7720.

Member

richvdh commented Aug 26, 2021

@xfk: could you open an issue on the twisted bug tracker and link back to it here?

Author

xfk commented Aug 26, 2021

Member

clokep commented Aug 26, 2021

This might also be related to https://twistedmatrix.com/trac/ticket/10062 / https://twistedmatrix.com/trac/ticket/9691 / twisted/twisted#1488, where Twisted uses ANY queries to look up DNS records, which seems to fail to find AAAA records in some situations.


telmich commented Aug 26, 2021

ping @evilham -- this might be relevant for you, too

Contributor

glyph commented Aug 30, 2021

Anywhere you need to be looking up A or AAAA, you want to be using HostnameEndpoint. There's a rules-lawyer-y answer to the IResolverSimple.getHostByName dilemma, but it's going to be long and drawn out and complicated and at the end of the day, this API is fundamentally too narrow for the use-case of actually connecting to a TCP endpoint.

Author

xfk commented Aug 30, 2021

@glyph Thanks for the suggestion! I tried it out and everything seems to be working. I'm not an experienced twisted user, so I'm not sure if I'm using it the proper way.

xfk@4d7037a

Member

clokep commented Aug 30, 2021

@xfk Would you mind making a PR? Seems like it is overall the right fix.

Author

xfk commented Aug 30, 2021

@clokep I created a PR.

#10717


a-0-dev commented Apr 21, 2022

I just ran into this exact problem. Are there any issues holding the PR up? I could work around it by querying the IPv6 address at install time, but that is very hacky... :/
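
A rough sketch of the install-time workaround mentioned above, using only the Python standard library. The helper name `first_ipv6_address` is hypothetical, and how the resulting literal is written into the Synapse configuration is left out because it depends on the deployment. The obvious drawback is that the pinned address goes stale if the AAAA record changes later.

```python
import socket


def first_ipv6_address(host, port=6379):
    """Resolve `host` via the system resolver and return its first IPv6 address.

    Raises socket.gaierror if the name has no IPv6 address.
    """
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET6, type=socket.SOCK_STREAM
    )
    # sockaddr for AF_INET6 is (address, port, flowinfo, scope_id).
    return infos[0][4][0]
```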

Member

richvdh commented Apr 21, 2022

@a-0-dev: I don't think anyone is working on this currently.

Member

clokep commented Aug 29, 2023

Note that #7720 is now fixed; we can likely apply the same fix for Redis.

7 participants