Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Everything unavailable when z2m restarts/crashes #87

Closed
copystring opened this issue Dec 13, 2022 · 24 comments
Closed

Everything unavailable when z2m restarts/crashes #87

copystring opened this issue Dec 13, 2022 · 24 comments
Labels
problem Something isn't working stale

Comments

@copystring
Copy link

What happened?

It looks like z2m crashed and restarted. When this happens every device is marked as unavailable and only sometimes return to available.

What did you expect to happen?

Maybe it makes more sense to automatically restart the adapter when there's a connection loss instead of endlessly trying to reconnect?
Also as you can see in the third screenshot of the logs it seems like iobroker.zigbee2mqtt behaves different than a restart when the connection ist lost which seems to be the reason the devices do not always turn back to being available.

How to reproduce it (minimal and precise)

Wait for z2m to restart for example be connection loss of the adapter.

Adapter version

2.4.4

Log

Disconnect:
image

Eventually z2m works again and iobroker.zigbee2mqtt reconnects:
image

This is what it looks like when manually restarting iobroker.zigbee2mqtt:
image

@copystring copystring added the problem Something isn't working label Dec 13, 2022
@o0shojo0o
Copy link
Contributor

o0shojo0o commented Dec 13, 2022

What happened?

It looks like z2m crashed and restarted. When this happens every device is marked as unavailable and only sometimes return to available.

I have now tried so several things, but do not get this behavior readjusted, then z2m always sends all available states on a new connection.

Maybe it makes more sense to automatically restart the adapter when there's a connection loss instead of endlessly trying to reconnect?

The websocket connection is completely reinitialized, as with an adapter restart, that would make no difference.
Also a full adapter restart would be a problem for users using the 'dummy MQTT server' from the adapter, for them it would end in an endless loop.

Also as you can see in the third screenshot of the logs it seems like iobroker.zigbee2mqtt behaves different than a restart when the connection ist lost which seems to be the reason the devices do not always turn back to being available.

No this is only because I suppress the log when an adapter is running and reconnects.
https://github.com/o0shojo0o/ioBroker.zigbee2mqtt/blob/ea52abe240a43fe729f7236773281f92b27264d0/main.js#L142-L146

@o0shojo0o o0shojo0o added stale and removed stale labels Feb 2, 2023
@github-actions
Copy link

github-actions bot commented Mar 5, 2023

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the stale label Mar 5, 2023
@github-actions github-actions bot removed the stale label Mar 13, 2023
@o0shojo0o
Copy link
Contributor

I assume from your other issues (#148) that you are using the MQTT connection type here.
Please switch to the websocket connection.

@copystring
Copy link
Author

I do still have this Problem.
For example this device:
image
As you can see it's clearly available in z2m but not in the ioBroker adapter.
A few seconds later everything is online again.

🤷‍♂️

image
I will try to get a log when this happens the next time.

@copystring
Copy link
Author

I have not been able to get a good log for this, but I'm still confident it has something to do with the websocket connection reconnecting.

For example, this light: image

All the other devices have the same available timestamp as this light, which is roughly 10 seconds after websocket reconnected.
image

Here's a log with the time range https://pastebin.com/XNcRB09a

@github-actions
Copy link

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added stale and removed stale labels Apr 28, 2023
@github-actions
Copy link

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added stale and removed stale labels May 29, 2023
@tonez18
Copy link

tonez18 commented Jun 13, 2023

image

Same issue here. After restarting zigbee2mqtt and adapter everything is working fine for 3-4 days. After this period the problem comes back :/

@github-actions
Copy link

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the stale label Jul 14, 2023
@copystring
Copy link
Author

Issue still persists.

@github-actions github-actions bot removed the stale label Jul 15, 2023
@o0shojo0o
Copy link
Contributor

Sorry, but no matter what I try, I do not get the problem readjusted.
I am at a loss as to why a few users have this kind of problem.

@o0shojo0o
Copy link
Contributor

What else I could offer is the connection via mqtt, can you test that?
Possibly also the dev branch if you need authentication.
https://github.com/o0shojo0o/ioBroker.zigbee2mqtt/tree/external-mqtt-credentials-%23148

@copystring
Copy link
Author

Yes, I was just thinking about that yesterday evening.
I don't need authentication so the normal release should work.
Sure, if it works this is fine. However it feels like a workaround rather than finding the actual reason. Having said that, I understand the difficulty finding the issue when it's not possible to reproduce.
Just yesterday I was thinking maybe additional logging for each device could help. Debug logging the availability for each device when the availability changes might give a clue when this happens.

@o0shojo0o
Copy link
Contributor

I get a message for each change of availability on my Zigbee devices.
And I actually only get messages that make sense since one battery is dead.

@copystring
Copy link
Author

What does a message like this look like? I could not find one in my logs.
I will test the adapter via MQTT for the next days and report back. For testing, is the latest release ok or install via GitHub for latest changes?

@copystring
Copy link
Author

I get errors via MQTT.

z2m_log1.txt

@o0shojo0o
Copy link
Contributor

Ok... that was of course not with me.
But tonight it goes away for a week on vacation, I will look at it afterwards.

@o0shojo0o
Copy link
Contributor

o0shojo0o commented Aug 8, 2023

z2m_log1

I looked at the log, but what's totally weird is that it's banging at the JSON.Pars() in the websocket even though you have MQTT enabled?!
Totally wild ...

2023-07-31 21:01:39.862 - error: zigbee2mqtt.0 (240608) Unexpected token 'o', ...""payload":online,"to"... is not valid JSON
2023-07-31 21:01:39.862 - error: zigbee2mqtt.0 (240608) Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch().
2023-07-31 21:01:39.862 - error: zigbee2mqtt.0 (240608) unhandled promise rejection: Unexpected token 'o', ...""payload":online,"to"... is not valid JSON
2023-07-31 21:01:39.863 - error: zigbee2mqtt.0 (240608) SyntaxError: Unexpected token 'o', ...""payload":online,"to"... is not valid JSON
at JSON.parse ()
at Zigbee2mqtt.messageParse (/opt/iobroker/node_modules/iobroker.zigbee2mqtt/main.js:135:33)

https://github.com/o0shojo0o/ioBroker.zigbee2mqtt/blob/2ba7ce53c19213400299221405d61c1bcc7b07ae/main.js#L126-L142

Can you test the current git version?

@o0shojo0o
Copy link
Contributor

@Eistee82 are you by any chance online in Discord today?

@o0shojo0o
Copy link
Contributor

I have already found you ;)

@copystring
Copy link
Author

I want to try this later. Does the PR mean it's fixed?

@o0shojo0o
Copy link
Contributor

No, the PR was for the problem of @Eistee82

o0shojo0o added a commit that referenced this issue Aug 12, 2023
-   (o0shojo0o) optimisation for the MQTT connection
-   (o0shojo0o) fix for MQTT output type: attribute_and_json ([#87](#87))
-   (o0shojo0o) added support for external MQTT-Server credentials ([#148](#148))
  	- *After update, Websocket Auth-Token must be set again, if used.*
@copystring
Copy link
Author

I've been using 2.10.1 for a few days now and the issue has since been resolved using mqtt instead of websocket.
So the issue must be with websocket.

@github-actions
Copy link

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the stale label Sep 20, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
problem Something isn't working stale
Projects
None yet
Development

No branches or pull requests

3 participants