Connectivity Issues (router dropouts/offline) - ZBDongle-P and Philips Hue #19416

Closed
charlesomer opened this issue Oct 24, 2023 · 13 comments
Labels
problem Something isn't working stale Stale issues

Comments

@charlesomer

charlesomer commented Oct 24, 2023

What happened?

Hi, I've been running Zigbee2mqtt for a while now and have been trying to get it stable. Unfortunately I've been unsuccessful with everything I've tried and now I've pretty much run out of ideas so any help will be much appreciated!

I have 18 devices currently joined (all routers):

  • 11 x Philips Hue LCT001
  • 5 x IKEA control outlets
  • 2 x ZBDongle-P with router firmware (20221102)

The IKEA outlets and the 2 routers are generally okay, are available most of the time and respond as expected. They show up in the map etc. Sometimes one or two can become unavailable after a restart or similar, but this usually resolves itself (there's 1 IKEA outlet unavailable at the moment for example).

The Hue bulbs are much more inconsistent; most will remain offline (at the moment 7 are offline). Due to the way they are set up, they are only powered at night (this is something I am not able to change). I understand this isn't ideal, but I have a similar situation with the same bulbs in another setup (although not as many) and this works fine. They reconnect once powered on etc.

Here are some of the things I've tried so far:

  • Using ZBDongle-E as the coordinator. This is what I started with originally as I am using the same in a different setup elsewhere and it works fine. I know it's not one of the recommended ones though which is why I switched to ZBDongle-P.
  • Using a different Zigbee channel, with the ZBDongle-E I was using channel 25 but I switched this to the default 11 for the ZBDongle-P.
  • Adjusting the power of the routers and coordinator, I have tried 5dBm to 20dBm. At the moment they are all set to 15.
  • Adding additional router devices close to the bulbs. Extra IKEA outlets and the ZBDongle-P routers - some of these are on the other side of the same wall as a hue bulb that will be marked as offline.
  • Updating the firmware of the coordinator.
  • Updating to latest-dev.
  • Powering off the PC for 15 minutes (or more).
  • Re-pairing all the devices (this was done when switching coordinators).
  • Using USB extension cables to get the coordinator as far away from the PC as possible.
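
For reference, the channel and coordinator transmit power mentioned in the list above are set under `advanced` in Zigbee2MQTT's `configuration.yaml`. A minimal sketch using the values from this post (the rest of the file is omitted; as far as I know this only affects the coordinator, not the routers, and changing the channel on an existing network may require re-pairing devices):

```yaml
advanced:
  # Zigbee channel: 11 is the default, 25 was used with the ZBDongle-E
  channel: 11
  # Coordinator transmit power in dBm (5 to 20 were tried; currently 15)
  transmit_power: 15
```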

I have read the map isn't the best tool but I noticed on there that some bulbs will have links drawn despite the LQI being 0. Some of the devices marked as offline will have single numbers associated with their routes but never a "/". I'm not 100% sure what this means but my guess is that something can "see" the device but the device can't talk back maybe?

Also, when the bulbs are powered on in the evening, they do show up as "online". Which leads me to believe that zigbee2mqtt can communicate with them in some way, but then upon the next availability check they go offline again.

The distance between some of these bulbs is only 2 metres or so and the distance to a known working router is sometimes less than a metre (albeit through a wall). In fact one of the ones furthest away from the next closest router appears to work all the time.

The logs report timeouts when trying to ping or execute a command (get/set).

Is there anything else I can try to get this more reliable?

I'll add more if I remember or discover extra info.

Thank you
Charles

What did you expect to happen?

Devices connect when powered on and are stable after a short amount of time.

How to reproduce it (minimal and precise)

Run zigbee2mqtt (docker) with ZBDongle-P on Ubuntu 22.04.3 LTS.
Pair multiple LCT001.
Observe offline devices.

Zigbee2MQTT version

1.33.1-dev commit: 0e55b38

Adapter firmware version

20230507

Adapter

ZBDongle-P

Debug log

No response

@charlesomer charlesomer added the problem Something isn't working label Oct 24, 2023
@charlesomer
Author

I've also noticed that the Hue bulbs do not mesh with one another (at least according to the map). There's not a single link between two LCT001 devices. I was expecting the hue bulbs to behave correctly but perhaps this is an incorrect assumption.

@charlesomer charlesomer changed the title Connectivity Issues (router dropouts/offline) - ZBDongle-P Connectivity Issues (router dropouts/offline) - ZBDongle-P and Philips Hue Oct 30, 2023
@charlesomer
Author

Some weak routes do appear when selecting "none of the above", but it's not consistent. I am able to force devices to an "online" state again by refreshing one of the device's properties a few times (4) very quickly; after 30 seconds or so the device is marked as online again. However, this is not permanent and devices will go back offline again shortly after. This behaviour includes devices which are powered all the time too.

So far I've tried the following firmware:

  • 20220219
  • 20221226
  • 20230507
  • 20231112

The next things I'm debating trying are switching coordinators to either zzh or slaesh and changing the channel back to 25.

@xelemorf

xelemorf commented Dec 13, 2023

If you still have your Dongle-E you could use that one as a router and pair those problematic devices to it. Dongle-E is quite good at router functionality by providing an outstanding range even through concrete walls, also found it very reliable as router (quite the opposite as coordinator).

It would be worth checking whether anything is connected through those bulbs that go offline during the night (which I suppose act as routers as well), to see whether any end device pairs with them and ends up disconnected. You would want to avoid that by pairing those devices to the Dongle-E instead.

You mentioned you are using USB extension cable, it's USB 2.0 right? Make sure it's not USB 3 because of the interference.

@charlesomer
Author

Thanks for your reply. I don't have any end devices, they are all routers so should dynamically adjust their routes to the best path from what I understand. I have the Dongle-E setup and running as a router (firmware 20220515) along with two other Dongle-P.

Some permanently powered devices do end up in an "offline" state occasionally once the hue bulbs are powered on (one is offline at the moment). This confuses me the most really, if the Zigbee network knows a good route to a router, why is switching on another router elsewhere causing drop outs?

I believe it's a USB2 extension cable and a USB2 port.

@charlesomer
Author

I've just checked again and every single router (including the hue bulbs) shows up as expected in Home Assistant history as "available" when they get powered on. I'm struggling to understand why the availability check fails if they are all shown online at this point.

@xelemorf

xelemorf commented Dec 13, 2023

The default availability state values did not work for me so ended up changing to advanced mode and specifying custom values and then fine tuning them to the ones shown on the screenshot.

Screenshot_20231213_224918_Home Assistant

On the other hand you may want to experiment with the "Legacy availability payload" and "Legacy API" under Advanced settings. Attaching another screenshot for that.

Screenshot_20231213_225150_Home Assistant

Most Zigbee devices do not necessarily comply with the Zigbee standards (I believe this is mainly a vendor lock-in attempt to keep you in a closed ecosystem, or the vendors simply don't care enough once the devices work with the Zigbee hub they sell themselves), and even the standards have multiple revisions (currently I believe we are on Zigbee 3.0). I have seen major issues, for example, with all Xiaomi and Aqara devices (both are manufactured by LUMI): they refuse to pair with any other vendor's router and even refuse to connect to the coordinator directly. In some cases I was able to force it, but they ended up disconnecting with zero LQI. Sometimes they are able to reconnect, but in the case of a switch the first button press might be lost and only the second gets registered, and eventually they are gone from the network after a few hours. They (Xiaomi/Aqara, just as an example) are also unable to change to another router dynamically, so you do not get the capabilities the standards promise.

You need to find out which devices from which vendors play nicely with each other. I have found that the cheapest TuYa-compatible devices sold on AliExpress (under "Choice" sales) work the best and are the most reliable, not to mention you can buy 4-5 devices with mostly the same capabilities for the price of one Aqara. Just a thought to consider.

You might want to also consider using Zigbee2Mqtt Edge version instead of the stable branch, that helped me a lot in similar cases. (For me the Edge version seems to be more stable actually)

@charlesomer
Author

I have tried increasing the availability timeout to 30 minutes in the past, which didn't fix things; it just delayed how long it took for devices to be marked as offline when they were powered on. Communication with the devices was still problematic.
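
For anyone wanting to reproduce this, a sketch of what a 30 minute timeout looks like in `configuration.yaml`, assuming the `availability` layout from the Zigbee2MQTT device availability docs (timeouts are in minutes; routers and other mains-powered devices count as "active"):

```yaml
availability:
  active:
    # Minutes without a response before an active device is marked offline
    timeout: 30
```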

Many people have recommended Hue, Sonoff ZB-Dongle E/P and Ikea on the basis that they work well as Zigbee devices, so I'd have expected these to comply with the specification enough to work reliably.

On using the legacy availability payload, I believe this only relates to how the availability is reported by Zigbee2mqtt rather than how the availability is retrieved for the devices: https://www.zigbee2mqtt.io/guide/configuration/device-availability.html#state-retrieval. I did try it though, it made no difference :/

@xelemorf

xelemorf commented Dec 14, 2023

I'm using this fw below for Dongle-P which works quite well, can you check your version?
https://github.com/Koenkk/Z-Stack-firmware/blob/master/coordinator/Z-Stack_3.x.0/bin/CC1352P2_CC2652P_launchpad_coordinator_20230507.zip

Screenshot_20231214_013834_Home Assistant

By the way, you are still on Dongle-P, right? I had the exact same issues with Dongle-E (after HAOS v10 was released late October), that's why I have switched to Dongle-P recently.

Please check the following below:

  • Check if you have enabled the hardware DIP switch for your Dongle-P to enable hardware flow control? Will need to open it up to see if you are unsure.
  • Share a network map with visible LQI values?
  • Share your /homeassistant/zigbee2mqtt/configuration.yaml file contents while making sure not to share the network_key, ext_pan_id, and pan_id segments?
  • Confirm you are in Edge version of Zigbee2Mqtt?

Another idea: enable the options below for all of your Zigbee devices, then save at the bottom of the page.

  • QoS: 2
  • Retain: True
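
In `configuration.yaml` terms, a sketch of applying these two options to every device at once, assuming the top-level `device_options` block is used rather than per-device entries:

```yaml
device_options:
  # MQTT quality of service level for messages published for each device
  qos: 2
  # Ask the MQTT broker to retain the last published state
  retain: true
```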

Screenshot_20231214_012211_Home Assistant

Also check the Serial settings:
Screenshot_20231214_013813_Home Assistant

@charlesomer
Author

My coordinator version is currently 20230507 and I'm using the dongle-P as the coordinator.

I haven't enabled hardware flow control.
z2m.zip

Current version of z2m is: 1.34.0-dev commit 25de4fd

Those settings are related to MQTT, I didn't think these made any difference to how z2m communicates with devices? I see the issues in the Z2M UI which I don't think is dependent on MQTT.

@xelemorf

xelemorf commented Dec 14, 2023

I'd suggest:

  • Set transmit power to 20 for Dongle-P
  • Disassemble your coordinator Dongle-P and enable the DIP switch.
  • Configure baudrate to 115200
  • Check the Dongle-E router firmware. It appears to have a very poor signal, even though its firmware defaults to a transmit power of 20 (that device should be the strongest on your network, with an LQI well above 200; that is what I see in a 61nm flat with concrete walls). This is the firmware I use: https://github.com/itead/Sonoff_Zigbee_Dongle_Firmware/tree/master/Dongle-E/Router
  • How much distance is there between these devices, what type of walls are there, and are there any large metal-cased objects near the affected routers, such as a fridge, washing machine or dishwasher?
  • Which Wi-Fi channels are you using for 2.4GHz Wi-Fi? Do you live in a flat with neighbors in close proximity? Did you analyze 2.4GHz wireless congestion in the neighborhood in relation to Zigbee coexistence? You might need to disable auto channel selection for your 2.4GHz Wi-Fi as well (I ended up using the highest channel, 13, for 2.4GHz Wi-Fi and the lowest, 11, for Zigbee)
    https://www.metageek.com/training/resources/zigbee-wifi-coexistence/
    https://play.google.com/store/apps/details?id=net.techet.netanalyzer.an
  • What type of host system are you using? Raspberry Pi, x64 pc, VM, what resources do you have on it?
  • Do you use HAOS or Supervised?
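
The serial and power suggestions in the list above would look roughly like this in `configuration.yaml`. This is only a sketch: the port path is a placeholder, and `rtscts: true` is my assumption for how hardware flow control is enabled on the Zigbee2MQTT side once the DIP switch is set.

```yaml
serial:
  # Placeholder path; use the path of your own adapter
  port: /dev/ttyUSB0
  baudrate: 115200
  # Assumption: only enable after hardware flow control is switched on in the dongle
  rtscts: true
advanced:
  # Coordinator transmit power in dBm
  transmit_power: 20
```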

Overall this looks to me like a Zigbee signal issue, especially if you live in a very big house, perhaps with multiple floors.
Would there be a way to bring these devices into close proximity with each other for a day and see whether the signal is adequate for them? That would easily rule out some of these things.
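
On the Wi-Fi coexistence point, a rough worked calculation of the channel centre frequencies (in MHz), assuming 20 MHz wide Wi-Fi channels:

```latex
\begin{align*}
f_{\text{Zigbee}}(k) &= 2405 + 5\,(k - 11), \quad k = 11,\dots,26 \\
f_{\text{WiFi}}(n) &= 2407 + 5\,n, \quad n = 1,\dots,13 \\
f_{\text{Zigbee}}(11) &= 2405 \\
f_{\text{Zigbee}}(25) &= 2475 \\
f_{\text{WiFi}}(13) &= 2472 \quad (\text{occupying roughly } 2462 \text{ to } 2482)
\end{align*}
```

So Zigbee channel 11 sits about 67 MHz below the centre of Wi-Fi channel 13, while Zigbee channel 25 would fall inside that Wi-Fi channel.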

@charlesomer
Author

So over Christmas I tried to stabilise the setup, I have now got it to a point where only roughly 3 of the hue bulbs are problematic (slow to respond sometimes). The rest so far have been better - I'm going to keep an eye on it and see how it goes.

I ended up moving the coordinator to a Pi 3 so I could place it somewhere else. I believe that previously Zigbee was struggling to determine which route from the coordinator was best. It appeared to switch between high and low LQIs at random; my best guess so far is that the lower-LQI routes had fewer "hops" than the higher-LQI ones.

This also meant I could move some of my repeaters around too. I still need to see if I can get the remaining bulbs working reliably but at least it's an improvement. Hue bulbs still don't show as being linked to one another (just to other manufacturers) in the Zigbee map unless "none of the above" is checked.

Contributor

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 30 days

@github-actions github-actions bot added the stale Stale issues label Jul 12, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 11, 2024
@antst

antst commented Nov 17, 2024

Actually, I think there is something here. As it happens, I have a couple of LCT001.
They used to work like a charm, but over the last year I have seen clear degradation as I updated Z2M to newer and newer versions.
In the end, with 1.40.1, they almost stopped working, but at least they were still formally part of the network (although they were paired a long time ago). That was with the ZBDongle-P (TI).
Over the last few days I finally made the transition to ZBDongle-E and Z2M 1.41.0 in a new setup, and those two LCT001 proved impossible to pair. And I am no novice at pairing, with my 120 devices of all possible brands in the Zigbee network.
I spent hours trying to get the LCT001 into the new system, while the rest of the devices paired more smoothly with ember than they had with zstack.
It always ends with
"Interview failed because can not get node descriptor".
