Timeout fetching PAA root certificates from Git #284

Closed
weinshel opened this issue Apr 30, 2023 · 25 comments

Comments

@weinshel

When using the Matter server add-on, I'm encountering an issue that appears to be a timeout fetching PAA root certificates from Git when the server is starting up:

2023-04-29 21:25:03 core-matter-server matter_server.server.helpers.paa_certificates[126] INFO Fetching the latest PAA root certificates from Git.
2023-04-29 21:30:04 core-matter-server asyncio[126] ERROR Task exception was never retrieved
future: <Task finished name='Task-1' coro=<run.<locals>.new_coro() done, defined at /usr/local/lib/python3.10/site-packages/aiorun.py:227> exception=TimeoutError()>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/aiorun.py", line 237, in new_coro
    await coro
  File "/usr/local/lib/python3.10/site-packages/matter_server/server/server.py", line 94, in start
    await self.device_controller.initialize()
  File "/usr/local/lib/python3.10/site-packages/matter_server/server/device_controller.py", line 73, in initialize
    await fetch_certificates()
  File "/usr/local/lib/python3.10/site-packages/matter_server/server/helpers/paa_certificates.py", line 153, in fetch_certificates
    fetch_count += await fetch_git_certificates()
  File "/usr/local/lib/python3.10/site-packages/matter_server/server/helpers/paa_certificates.py", line 126, in fetch_git_certificates
    async with http_session.get(f"{GIT_URL}/{cert}.pem") as response:
  File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line 1141, in __aenter__
    self._resp = await self._coro
  File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line 467, in _request
    with timer:
  File "/usr/local/lib/python3.10/site-packages/aiohttp/helpers.py", line 721, in __exit__
    raise asyncio.TimeoutError from None
asyncio.exceptions.TimeoutError

On the Matter integration side I'm getting an error that the integration is unable to connect to the Matter server:

2023-04-29 21:38:12.705 WARNING (MainThread) [homeassistant.config_entries] Config entry 'Matter' for matter integration not ready yet: Failed to connect to matter server; Retrying in background
@jusicgn

jusicgn commented May 6, 2023

I have the exact same issue.

@darkxst

darkxst commented May 11, 2023

I've noticed this in a number of places in Home Assistant, so I don't think it's specific to the Matter server; it looks like an issue at a lower level. For example, I am also having trouble importing blueprints from GitHub.

It seems that, at least on my system (HA OS in a VM), downloads from GitHub links time out in aiohttp when IPv6 is enabled. The issue goes away when IPv6 is disabled, but of course that is not a fix, since IPv6 is core to Matter.

@marcelveldt
Contributor

GitHub has had some issues over the last couple of weeks. We'll have to add a guard to catch the timeout.
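
A guard of the kind mentioned above could look roughly like the sketch below. It is only an illustration, not the change that was made in the Matter server: `fetch_with_guard` is a hypothetical helper, the 30-second default is an assumed value, and `fetch` stands in for a coroutine function such as the `fetch_git_certificates()` seen in the traceback, which returns a certificate count. Real code would probably also want to catch `aiohttp.ClientError`.

```python
import asyncio
import logging
from typing import Awaitable, Callable

LOGGER = logging.getLogger(__name__)


async def fetch_with_guard(
    fetch: Callable[[], Awaitable[int]], timeout: float = 30.0
) -> int:
    """Run `fetch` and return its certificate count, or 0 if it times out."""
    try:
        # wait_for cancels the fetch and raises TimeoutError after `timeout` seconds.
        return await asyncio.wait_for(fetch(), timeout)
    except asyncio.TimeoutError:
        # Log the failure instead of letting the exception abort server startup.
        LOGGER.warning("Timed out fetching PAA root certificates")
        return 0
```

With a wrapper like this, a slow or unreachable GitHub would be reported as a warning and counted as zero fetched certificates, rather than surfacing as "Task exception was never retrieved".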

@agners
Collaborator

agners commented May 15, 2023

In case it is not a temporary outage, as @marcelveldt suggested:

@darkxst at least the Matter Server add-on runs on the host network. What IPv6 settings are you using? If you see issues when IPv6 is set to "auto", there is likely a general IPv6 problem in your network. I'd check IPv6 connectivity from another host where IPv6 is enabled, using https://test-ipv6.com. If that looks all green, testing with another Linux host might be worthwhile.

@weinshel
Author

Disabling IPv6 also seemed to get me past these timeouts in the Matter server. I wasn't able to pair a Matter accessory though, either because IPv6 wasn't available or because of another bug.

I'm on Home Assistant OS on a Raspberry Pi 4. Oddly, my devices are all receiving IPv6 addresses and my ISP supports IPv6, but I'm getting a 0/11 on https://test-ipv6.com/, so it looks like I have some investigating to do…

@hidaris

hidaris commented May 15, 2023

> Disabling IPv6 also seemed to get me past these timeouts in the Matter server. I wasn't able to pair a Matter accessory though, either because IPv6 wasn't available or because of another bug.
>
> I'm on Home Assistant OS on a Raspberry Pi 4. Oddly, my devices are all receiving IPv6 addresses and my ISP supports IPv6, but I'm getting a 0/11 on https://test-ipv6.com/, so it looks like I have some investigating to do…

Pairing failure is a separate issue. Do you have any logs? We need to determine at which step the failure occurred.

@weinshel
Author

weinshel commented May 15, 2023

Filed #292 for the pairing issue

@darkxst

darkxst commented May 16, 2023

> What IPv6 settings are you using? If you see issues when having IPv6 set to "auto", likely there is a general IPv6 problem in your network.

@agners I am using IPv6 auto. LAN and DNS are working, but HA doesn't seem to have any internet connectivity via IPv6. IPv6 works fine on the host (Ubuntu) and on the other VMs, which are all Debian/Ubuntu based.

I'm kind of thinking there is some issue with router advertisements, though. If I ping the router from HA, IPv6 internet works for a while.

# ping -6 google.com
No response from google.com
# ping -6 fritz.box
fritz.box is alive!
# ping -6 google.com
google.com is alive!

@agners
Collaborator

agners commented May 16, 2023

@darkxst this is not really the right place to discuss HAOS IPv6 issues. If you believe it is a HAOS issue, report a bug here; otherwise Discord or the community forum is better suited.

@darkxst

darkxst commented May 16, 2023

I will file a bug there.

In terms of this bug, though, perhaps it should fall back to IPv4 to connect to the GitHub servers for the certificate download if IPv6 fails with a timeout? Matter was working here prior to c5a6f8e being added.

@marcelveldt
Contributor

That is not how networking works. We just request a DNS name, and if the underlying network supports IPv6, that is always preferred (come on, it's 2023, time to get IPv6 support everywhere); IPv4 will always be the fallback if IPv6 fails.

So, to put it shortly, it's not the application's responsibility to choose between IPv4 and IPv6.

@agners
Collaborator

agners commented May 16, 2023

An IPv6-capable IP stack should prefer IPv6 if global IPv6 addressing is available. The problem is that sometimes, even though IPv6 addresses are available, reaching the IPv6 server doesn't work (possible reasons: routing that is not set up properly, a firewall, or a wrong IPv6 address returned by the DNS server in the first place). Typically, people notice and resolve such issues quickly with IPv4 (since nothing works), but IPv6 issues of that sort are often ignored or not diagnosed correctly, and left unresolved.

In any case, because there are many misconfigured IPv6 networks around, there is an IETF standard that supports fallback at the application level. The algorithm is called Happy Eyeballs (RFC 8305). It needs to be implemented at the application level (user space), in this case probably in aiohttp.

It seems that aiohttp doesn't support that currently, see aio-libs/aiohttp#4451.
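
For readers unfamiliar with Happy Eyeballs, the sketch below illustrates the idea in simplified form; it is not aiohttp code. `interleave_families` is a hypothetical helper that orders resolved addresses so that IPv6 and IPv4 candidates alternate, with IPv6 first. `open_with_fallback` is also a hypothetical name; it relies on the `happy_eyeballs_delay` argument that asyncio's own `create_connection` accepts since Python 3.8, and the 0.25-second delay is an assumed value.

```python
import asyncio
import socket
from itertools import zip_longest


def interleave_families(addrinfos):
    """Order getaddrinfo() results so IPv6 and IPv4 entries alternate, IPv6 first."""
    v6 = [ai for ai in addrinfos if ai[0] == socket.AF_INET6]
    v4 = [ai for ai in addrinfos if ai[0] == socket.AF_INET]
    ordered = []
    for pair in zip_longest(v6, v4):
        # zip_longest pads the shorter list with None; skip the padding.
        ordered.extend(ai for ai in pair if ai is not None)
    return ordered


async def open_with_fallback(host: str, port: int = 443):
    """Open a TLS connection, letting asyncio race IPv6 and IPv4 attempts."""
    # If the first address has not connected after 0.25 s, asyncio starts the
    # next attempt in parallel and keeps whichever connection succeeds first.
    return await asyncio.open_connection(
        host, port, ssl=True, happy_eyeballs_delay=0.25
    )
```

The point of the interleaving is that a broken IPv6 path costs only a short delay before an IPv4 attempt starts, instead of a full request timeout.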

@marcelveldt
Contributor

So in that case we'll have to wait for aiohttp as we really need IPv6 to be enabled

@weinshel
Author

Figured out what the issue was on my end: a misconfiguration or bug in my router was causing IPv6 to be blocked, apparently because of how Ubiquiti routers handle VPN routing.

After resolving that and restarting the Matter server (and a few reboots of the accessories), I was able to successfully pair my accessories!

@marcelveldt
Contributor

Great news and thanks for sharing your experience, that will definitely help others!

@agners
Collaborator

agners commented May 16, 2023

Just to be clear: Matter itself can happily work with a router that is not capable of IPv6. Matter devices announce themselves on the local network via mDNS (multicast), and communication happens peer to peer (no interaction with the router needed). That said, if the router also acts as a switch, and if its filtering also applies to switching, then of course that might be a problem. However, I don't think that Ubiquiti stuff is that broken.

> After resolving that and restarting the Matter server (and a few reboots of the accessories), I was able to successfully pair my accessories!

That sounds as if you retried pairing then? Maybe it just worked after a few retries? I've seen issues with our Matter v1.0 example apps where mDNS announcements are sometimes sent with a delay, in fact so late that the Matter server's commissioning flow had already timed out.

I think that is solved with Matter v1.1; at least I haven't seen that behavior with the latest example apps.

What accessory did you try to pair?

@ArturoGuerra
Contributor

Maybe we should force the Git certificate fetching to happen over IPv4, since I've noticed GitHub seems to have lots of issues with IPv6. I've run into problems fetching things from GitHub over IPv6 with NVIDIA's container runtime, which hosts its apt repo on GitHub, and I've also encountered this issue myself.

@marcelveldt
Contributor

Yeah, but how do we tell aiohttp to only use IPv4 DNS? In my opinion, the best solution would be for Happy Eyeballs to work in aiohttp, so that an automatic failover to IPv4 takes place.

That said, maybe the issues are already gone now, as GitHub said it has fixed its problems: https://github.blog/2023-05-16-addressing-githubs-recent-availability-issues/
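
On the question of restricting aiohttp to IPv4: `aiohttp.TCPConnector` accepts a `family` argument, and passing `socket.AF_INET` limits both name resolution and connections to IPv4. The sketch below shows one way this could be wired up. The helper names `ipv4_only_session` and `fetch_over_ipv4` are hypothetical, and nothing in this thread indicates that the Matter server adopted this approach.

```python
import socket

import aiohttp


def ipv4_only_session() -> aiohttp.ClientSession:
    """Create a session whose connector resolves and connects over IPv4 only.

    Call this from inside a running event loop.
    """
    connector = aiohttp.TCPConnector(family=socket.AF_INET)
    return aiohttp.ClientSession(connector=connector)


async def fetch_over_ipv4(url: str) -> bytes:
    """Download `url`, forcing IPv4 for DNS resolution and the TCP connection."""
    async with ipv4_only_session() as session:
        async with session.get(url) as response:
            response.raise_for_status()
            return await response.read()
```

The trade-off is the one raised in the discussion: pinning the download to IPv4 works around a broken IPv6 path, but it also hides the underlying network problem and fails on IPv6-only networks.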

@weinshel
Author

I was trying to pair the Nanoleaf Essentials bulbs. Before the IPv6 issues were resolved, pairing would consistently time out at mDNS. One bulb paired pretty quickly after resolving the IPv6 issues; the other took a few more reboots and retries. Unfortunately, I didn't capture logs for the errors.

@djandrew2005

I selected "Allow All" on Promiscuous Mode (VM Settings -> Network -> Advanced) and now ipv6 github certs are reachable

@marcelveldt
Contributor

Marking this issue as complete, with the comment above from @djandrew2005 as the solution.

@agners
Collaborator

agners commented May 22, 2023

> I selected "Allow All" on Promiscuous Mode (VM Settings -> Network -> Advanced) and now ipv6 github certs are reachable

I am guessing this in the end made the IPv6 Neighbor Discovery protocol work correctly, which in turn fixed IPv6 connectivity.

In the end: currently, Home Assistant Core as well as the Python Matter Server require working IPv6 connectivity if an IPv6 address with global scope is assigned.
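
To make the "global scope" condition concrete, the sketch below uses the standard `ipaddress` module to tell link-local and unique-local addresses apart from globally routable ones. `has_global_ipv6` is a hypothetical helper for illustration only; it is not how Home Assistant or the Matter server decide anything.

```python
import ipaddress
from typing import Iterable


def has_global_ipv6(addresses: Iterable[str]) -> bool:
    """Return True if any of the given addresses is a globally routable IPv6 address."""
    for raw in addresses:
        # Drop a zone identifier such as "%eth0" before parsing.
        addr = ipaddress.ip_address(raw.split("%")[0])
        if addr.version == 6 and addr.is_global:
            return True
    return False
```

A host that only has `fe80::` (link-local) or `fd00::/8` (unique-local) addresses would return False here; once a global address is assigned, the stack prefers IPv6, and outbound IPv6 has to actually work.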

@marcelveldt
Contributor

Shall we note this one in the docs for others to discover? I guess lots of people run HAOS in a VM?

@agners
Collaborator

agners commented May 22, 2023

Well, it is complicated. In theory, a hypervisor should make sure that whatever Layer 2 network card it emulates is properly handled on the physical card. Most physical cards filter by their known MAC address by default. I'd expect a hypervisor that creates a virtual network card with a (pseudo) random MAC to make sure that the host's physical card also listens on that MAC address.

Some cards support multiple MAC addresses they can listen on. Others need to use promiscuous mode. The downside of promiscuous mode is that ALL traffic is forwarded up the stack, leading to higher system load.

So: if possible, I'd prefer not to suggest enabling promiscuous mode. I also think it should not be necessary... But without knowing the exact system configuration (judging from the configuration path, I think it is VMware in djandrew2005's case, but I'm not sure about the version and the rest of the system configuration) and without further debugging/investigation, I'd rather not put such things into the documentation.

@robert-alfaro

This bug happened to me; it ended up being the loss of the IPv6 address at the Comcast modem 😒 ... a release/renew fixed it!
