Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] tcp://ip:port connection ETIMEDOUT error doesn't match ZWaveErrorCodes.Driver_Failed #3509

Closed
3 tasks
vladbabii opened this issue Oct 9, 2021 · 37 comments · Fixed by #3520
Closed
3 tasks
Assignees

Comments

@vladbabii
Copy link

Checklist:

  • [ X ] I am not using Home Assistant.
  • [ X ] I have checked the troubleshooting section and my problem is not described there.
  • [ X ] I have read the changelog and my problem is not mentioned there.

Build/Run method

  • [ X ] Docker
  • PKG
  • Snap package
  • Manually built (git clone - yarn install - yarn run build )

Zwavejs2Mqtt version: 5.8.0
Z-Wave JS version: 8.4.1

Describe the bug

  • I am using a tcp connection for zwave, in the form of tcp://192.168.1.5:7676
  • on the other end of the connection there's a ser2net server
  • everything works great until the network device reboot

After reboot, the web interface shows
Driver: read ETIMEDOUT
and in MQTT I have (topic and value)
service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error {"data":[{"errno":-110,"code":"ETIMEDOUT","syscall":"read"}]}

and it never works again unless i restart the zwave2mqtt docker image or i save setting again in the web interface

To Reproduce

Steps to reproduce the behavior:

  1. Go to setting -> zwave, set tcp://: for ser2net server attached to a zwave stick on a network device
  2. Save setting
  3. Check zwave works by turning a device/on/off
  4. Reboot network device that runs ser2net or pull network cable
  5. see in web interface "Driver: read ETIMEDOUT"
  6. see in mqtt service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error {"data":[{"errno":-110,"code":"ETIMEDOUT","syscall":"read"}]}

Expected behavior

When zwave driver enter that specific error state i expect it to be restarted after some amount of time.

Additional context

  1. I don't think it matters but my network device is a vera edge with all it's vera software turned off and ser2net being started automatically
  2. I am using docker image with :latest tag pulled one hour ago and i did a zwave config updates then restarted everything

How do i clean up my logs of any private data ? I'd like to post them here but i want to make sure i don't leak any keys or other information.

Thank you for your time! This is great software!

@robertsLando
Copy link
Member

robertsLando commented Oct 9, 2021

@vladbabii thanks for your issue, could you provide me some logs when this happens? I need both zwavejs and Zwavejs2mqtt logs. You can find instructions about how to get them here: https://zwave-js.github.io/zwavejs2mqtt/#/troubleshooting/generating-logs

@vladbabii
Copy link
Author

zwavejs2mqtt-store.zip

Here you go. I made a separate install with nothing private on it.

@vladbabii
Copy link
Author

vladbabii commented Oct 9, 2021

The log file has the same docker stuff as before, connected to a second vera edge with ser2net with only one qubino device.
The same issue is present. I checked the log files and the error appears in both

2021-10-09 13:32:57.930 INFO ZWAVE: Controller status: Driver: read ETIMEDOUT

2021-10-09T10:32:57.930Z DRIVER Serial port errored: read ETIMEDOUT

@vladbabii
Copy link
Author

You can probably use a ser2net docker image (many available on docker hub) to map your zwave /dev/... device to a network port inside docker to reproduce this issue

@robertsLando
Copy link
Member

I'm not on my pc right now I will give a look at your logs on Monday

@robertsLando
Copy link
Member

@AlCalzone I think that ETIMEDOUT shuld have an error code like `ZwaveErrorCodes.Driver_Failed' because i already check errors like that to restart the driver:

https://github.com/zwave-js/zwavejs2mqtt/blob/master/lib/ZwaveClient.ts#L2232

@AlCalzone AlCalzone transferred this issue from zwave-js/zwave-js-ui Oct 11, 2021
@robertsLando robertsLando changed the title When using tcp://ip:port as connection, when driver has ETIMEDOUT error, it never recovers [bug] tcp://ip:port connection ETIMEDOUT error doesn't match ZWaveErrorCodes.Driver_Failed Oct 11, 2021
@AlCalzone
Copy link
Member

@vladbabii do you have a foolproof way to reproduce this? I've tried with ser2net on a Raspberry Pi, but as soon as I reboot it from another shell, the socket gets closed and the driver destroys itself to force a restart.

Note I'm working without Docker, but if necessary, I could try with it.

@vladbabii
Copy link
Author

With docker it happens every time. I can reproduce it like clock work. Let me try again with the latest docker image.

@vladbabii
Copy link
Author

Seems like it's fixed. I had to pull latest image ( from 17 hours ago according to docker hub) it detects the issue then tries to reconnect until it succeeds.

When i kill ser2net

2021-10-13 09:59:18.172 INFO ZWAVE: Controller status: Driver: The socket closed unexpectedly!
2021-10-13 09:59:18.173 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error: {
  data: [
    ZWaveError: The socket closed unexpectedly!
        at Socket.<anonymous> (/usr/src/app/node_modules/@zwave-js/serial/src/ZWaveSocket.ts:24:10)
        at Socket.emit (node:events:394:28)
        at Socket.emit (node:domain:470:12)
        at TCP.<anonymous> (node:net:661:12) {
      [stack]: 'ZWaveError: The socket closed unexpectedly!\n' +
        '    at Socket.<anonymous> (/usr/src/app/node_modules/@zwave-js/serial/src/ZWaveSocket.ts:24:10)\n' +
        '    at Socket.emit (node:events:394:28)\n' +
        '    at Socket.emit (node:domain:470:12)\n' +
        '    at TCP.<anonymous> (node:net:661:12)',
      [message]: 'The socket closed unexpectedly!',
      code: 5,
      context: undefined,
      transactionSource: undefined,
      name: 'ZWaveError'
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 09:59:18.173 INFO ZWAVE-SERVER: Server closed
2021-10-13 09:59:18.177 INFO ZWAVE: Client closed
2021-10-13 09:59:18.179 INFO ZWAVE: Connecting to tcp://10.11.8.88:7676
Logging to file:
        /usr/src/app/store/zwavejs_2021-10-13.log
2021-10-13 09:59:18.183 INFO ZWAVE: Controller status: Driver: connect ECONNREFUSED 10.11.8.88:7676
2021-10-13 09:59:18.183 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error: {
  data: [
    Error: connect ECONNREFUSED 10.11.8.88:7676
        at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1133:16) {
      [stack]: 'Error: connect ECONNREFUSED 10.11.8.88:7676\n' +
        '    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1133:16)',
      [message]: 'connect ECONNREFUSED 10.11.8.88:7676',
      errno: -111,
      code: 'ECONNREFUSED',
      syscall: 'connect',
      address: '10.11.8.88',
      port: 7676
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 09:59:18.183 WARN ZWAVE: Retry connection in 3 seconds...
2021-10-13 09:59:21.188 INFO ZWAVE: Connecting to tcp://10.11.8.88:7676
Logging to file:
        /usr/src/app/store/zwavejs_2021-10-13.log
2021-10-13 09:59:21.192 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error: {
  data: [
    Error: connect ECONNREFUSED 10.11.8.88:7676
        at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1133:16) {
      [stack]: 'Error: connect ECONNREFUSED 10.11.8.88:7676\n' +
        '    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1133:16)',
      [message]: 'connect ECONNREFUSED 10.11.8.88:7676',
      errno: -111,
      code: 'ECONNREFUSED',
      syscall: 'connect',
      address: '10.11.8.88',
      port: 7676
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 09:59:21.192 WARN ZWAVE: Retry connection in 3 seconds...
2021-10-13 09:59:24.198 INFO ZWAVE: Connecting to tcp://10.11.8.88:7676
Logging to file:
        /usr/src/app/store/zwavejs_2021-10-13.log

When i reboot - so connection is dropped from one side with no immediate tcp notification

2021-10-13 10:00:32.328 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/controller/statistics_updated: {
  data: [
    {
      messagesTX: 9,
      messagesRX: 13,
      messagesDroppedRX: 0,
      NAK: 0,
      CAN: 0,
      timeoutACK: 3,
      timeoutResponse: 0,
      timeoutCallback: 0,
      messagesDroppedTX: 0
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 10:00:33.431 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/controller/statistics_updated: {
  data: [
    {
      messagesTX: 9,
      messagesRX: 13,
      messagesDroppedRX: 0,
      NAK: 0,
      CAN: 0,
      timeoutACK: 4,
      timeoutResponse: 0,
      timeoutCallback: 0,
      messagesDroppedTX: 0
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }

going up until

2021-10-13 10:00:38.142 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/controller/statistics_updated: {
  data: [
    {
      messagesTX: 9,
      messagesRX: 13,
      messagesDroppedRX: 0,
      NAK: 0,
      CAN: 0,
      timeoutACK: 6,
      timeoutResponse: 0,
      timeoutCallback: 0,
      messagesDroppedTX: 0
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 10:00:40.246 DEBUG ZWAVE: 2-37-0-currentValue will be polled in 30 seconds
2021-10-13 10:00:40.246 ERROR ZWAVE: Error while polling value 2-37-0-currentValue: Timeout while waiting for an ACK from the controller

then

2021-10-13 10:00:55.488 INFO ZWAVE: Controller status: Driver: read ECONNRESET
2021-10-13 10:00:55.489 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error: {
  data: [
    Error: read ECONNRESET
        at TCP.onStreamRead (node:internal/stream_base_commons:211:20) {
      [stack]: 'Error: read ECONNRESET\n' +
        '    at TCP.onStreamRead (node:internal/stream_base_commons:211:20)',
      [message]: 'read ECONNRESET',
      errno: -104,
      code: 'ECONNRESET',
      syscall: 'read'
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }

then it stays at

2021-10-13 10:01:10.258 DEBUG ZWAVE: Polling value 37-0-currentValue
2021-10-13 10:01:10.259 DEBUG ZWAVE: 2-37-0-currentValue will be polled in 30 seconds
2021-10-13 10:01:10.259 ERROR ZWAVE: Error while polling value 2-37-0-currentValue: The driver is not ready or has been destroyed
2021-10-13 10:01:40.268 DEBUG ZWAVE: Polling value 37-0-currentValue
2021-10-13 10:01:40.269 ERROR ZWAVE: Error while polling value 2-37-0-currentValue: The driver is not ready or has been destroyed
2021-10-13 10:01:40.269 DEBUG ZWAVE: 2-37-0-currentValue will be polled in 30 seconds
2021-10-13 10:02:10.271 DEBUG ZWAVE: Polling value 37-0-currentValue
2021-10-13 10:02:10.271 ERROR ZWAVE: Error while polling value 2-37-0-currentValue: The driver is not ready or has been destroyed
2021-10-13 10:02:10.272 DEBUG ZWAVE: 2-37-0-currentValue will be polled in 30 seconds

If i then go to settings and click on save it connects and works again. I can't get the ECONNTIMEOUT error i was getting initially...

Oh wait, i managed to get it again by doing a couple of reboots on the network device with ser2net - Driver: read ETIMEDOUT

2021-10-13 10:03:59.531 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/node/statistics_updated: {
  data: [
    2,
    {
      commandsTX: 2,
      commandsRX: 1,
      commandsDroppedRX: 0,
      commandsDroppedTX: 0,
      timeoutResponse: 0
    },
    [length]: 2
  ]
} with options { qos: 1, retain: false }
2021-10-13 10:04:19.647 INFO ZWAVE: Controller status: Driver: read ETIMEDOUT
2021-10-13 10:04:19.648 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error: {
  data: [
    Error: read ETIMEDOUT
        at TCP.onStreamRead (node:internal/stream_base_commons:211:20) {
      [stack]: 'Error: read ETIMEDOUT\n' +
        '    at TCP.onStreamRead (node:internal/stream_base_commons:211:20)',
      [message]: 'read ETIMEDOUT',
      errno: -110,
      code: 'ETIMEDOUT',
      syscall: 'read'
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 10:04:29.287 DEBUG ZWAVE: Polling value 37-0-currentValue
2021-10-13 10:04:29.288 ERROR ZWAVE: Error while polling value 2-37-0-currentValue: The driver is not ready or has been destroyed
2021-10-13 10:04:29.288 DEBUG ZWAVE: 2-37-0-currentValue will be polled in 30 seconds

I'll try to get some logs now....

@vladbabii
Copy link
Author

Here is log files...
zwavejs2mqtt-store (3).zip

Here is what I see in the interface
https://www.youtube.com/watch?v=QadwqiFkMmE

So when TCP is broken correctly (zwave2mqtt is notified connection is closed) it reconnects fine after a 3 second delay (that it appears in logs).

But when the TCP times out I get ETIMEDOUT or ECONNRESET but the driver is not restarted / not reconnecting

If it would help I can do a screen share session or provide a tunnel for my device ser2net so you can test it yourself...

@vladbabii
Copy link
Author

Wait, seems latest is from 4 days ago, i will try with a newer image.

@robertsLando
Copy link
Member

@vladbabii Use master tag instead

@vladbabii
Copy link
Author

Testing with master from 18 hours ago ( https://hub.docker.com/layers/zwavejs/zwavejs2mqtt/master/images/sha256-141126158ae796f9a8610676079f9488cc1858bec1bb8150245b575593c129fc?context=explore )

Killing ser2net - works fine, 3 seconds reconnect.

Rebooting device ... timeoutACK increases... Getting error while polling (timeout while waiting for ACK) then ECONNRESET.

From docker logs:

} with options { qos: 1, retain: false }
2021-10-13 10:21:41.785 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/controller/statistics_updated: {
  data: [
    {
      messagesTX: 9,
      messagesRX: 17,
      messagesDroppedRX: 0,
      NAK: 0,
      CAN: 0,
      timeoutACK: 6,
      timeoutResponse: 0,
      timeoutCallback: 0,
      messagesDroppedTX: 0
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 10:21:43.892 ERROR ZWAVE: Error while polling value 2-37-0-currentValue: Timeout while waiting for an ACK from the controller
2021-10-13 10:21:43.893 DEBUG ZWAVE: 2-37-0-currentValue will be polled in 30 seconds
2021-10-13 10:21:59.104 INFO ZWAVE: Controller status: Driver: read ECONNRESET
2021-10-13 10:21:59.105 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error: {
  data: [
    Error: read ECONNRESET
        at TCP.onStreamRead (node:internal/stream_base_commons:211:20) {
      [stack]: 'Error: read ECONNRESET\n' +
        '    at TCP.onStreamRead (node:internal/stream_base_commons:211:20)',
      [message]: 'read ECONNRESET',
      errno: -104,
      code: 'ECONNRESET',
      syscall: 'read'
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 10:22:13.907 DEBUG ZWAVE: Polling value 37-0-currentValue
2021-10-13 10:22:13.908 ERROR ZWAVE: Error while polling value 2-37-0-currentValue: The driver is not ready or has been destroyed
2021-10-13 10:22:13.908 DEBUG ZWAVE: 2-37-0-currentValue will be polled in 30 seconds

After 2-3 minutes i don't see any mention of restarting/reconnecting in 3 seconds like when tcp breaks cleanly.

@vladbabii
Copy link
Author

I can start a jitsi meet session or google hangouts and give you screen share if it helps in any way

Log files from master test

zwavejs2mqtt-store (4).zip
.

@vladbabii
Copy link
Author

In zwavejs2mqtt log

2021-10-13 10:24:26.656 INFO ZWAVE: Controller status: Driver: read ETIMEDOUT
2021-10-13 10:24:26.659 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error: {
  data: [
    Error: read ETIMEDOUT
        at TCP.onStreamRead (node:internal/stream_base_commons:211:20) {
      [stack]: 'Error: read ETIMEDOUT\n' +
        '    at TCP.onStreamRead (node:internal/stream_base_commons:211:20)',
      [message]: 'read ETIMEDOUT',
      errno: -110,
      code: 'ETIMEDOUT',
      syscall: 'read'
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 10:24:44.415 DEBUG ZWAVE: Polling value 37-0-currentValue
2021-10-13 10:24:44.416 ERROR ZWAVE: Error while polling value 2-37-0-currentValue: The driver is not ready or has been destroyed
2021-10-13 10:24:44.416 DEBUG ZWAVE: 2-37-0-currentValue will be polled in 30 seconds
2021-10-13 10:25:14.432 DEBUG ZWAVE: Polling value 37-0-currentValue

in zwavejs log

2021-10-13T07:24:14.307Z SERIAL « 0x011800130500000200c37f7f7f7f000003000000000301000031              (26 bytes)
2021-10-13T07:24:14.308Z SERIAL » [ACK]                                                                   (0x06)
2021-10-13T07:24:14.308Z DRIVER « [REQ] [SendData]
                                    callback id:     5
                                    transmit status: OK
2021-10-13T07:24:14.404Z SERIAL « 0x010b0004000203250300c30014                                        (13 bytes)
2021-10-13T07:24:14.405Z CNTRLR   [Node 002] [~] [Binary Switch] currentValue: false => false       [Endpoint 0]
2021-10-13T07:24:14.409Z SERIAL » [ACK]                                                                   (0x06)
2021-10-13T07:24:14.410Z DRIVER « [Node 002] [REQ] [ApplicationCommand]
                                  └─[BinarySwitchCCReport]
                                      current value: false
2021-10-13T07:24:26.655Z DRIVER   Serial port errored: read ETIMEDOUT
2021-10-13T07:24:26.659Z DRIVER   destroying driver instance...

@vladbabii
Copy link
Author

Is this ok @robertsLando ?

@robertsLando
Copy link
Member

robertsLando commented Oct 13, 2021

@vladbabii Like I said above I think that this whould be handled on zwave-js side as I already listen for driver errors and restart driver when the error match a driver failed error, I think that @AlCalzone should just add the correct error code to those errors:

https://github.com/zwave-js/zwavejs2mqtt/blob/master/lib/ZwaveClient.ts#L2235

@vladbabii
Copy link
Author

@AlCalzone can i help in some other way than provide logs?

@AlCalzone
Copy link
Member

I think that is enough, thanks. The error causes the driver to be destroyed, but zwavejs2mqtt has no means of detecting that.

@vladbabii
Copy link
Author

@AlCalzone @robertsLando please let me know when i can test with any docker image to validate the fix. Also, if you have any other bugs that need docker testing please let me know, i have a couple zwave devices, zwave sticks and other zwave-related things.

@robertsLando
Copy link
Member

As soon as @AlCalzone makes a new release I will update my repo

@AlCalzone
Copy link
Member

You can do it now:
https://zwave-js.github.io/zwavejs2mqtt/#/development/custom-docker?id=building-a-container-using-dockerfilecontrib
Use master branch for zwavejs2mqtt, wrap-serialport-error for node-zwave-js

@vladbabii
Copy link
Author

2021-10-13 12:36:30.682 INFO ZWAVE: Next update scheduled for: Thu Oct 14 2021 00:00:00 GMT+0300 (Eastern European Summer Time)
2021-10-13 12:36:44.832 INFO ZWAVE: Controller status: Driver: Serial port errored: read ETIMEDOUT
2021-10-13 12:36:44.833 DEBUG MQTT: Publishing to service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error: {
  data: [
    ZWaveError: Serial port errored: read ETIMEDOUT
        at ZWaveSocket.<anonymous> (/usr/src/app/node_modules/zwave-js/src/lib/driver/Driver.ts:702:19)
        at ZWaveSocket.emit (node:events:394:28)
        at ZWaveSocket.emit (node:domain:470:12)
        at Socket.<anonymous> (/usr/src/app/node_modules/@zwave-js/serial/src/ZWaveSerialPortBase.ts:112:9)
        at Socket.emit (node:events:394:28)
        at Socket.emit (node:domain:470:12)
        at emitErrorNT (node:internal/streams/destroy:193:8)
        at emitErrorCloseNT (node:internal/streams/destroy:158:3)
        at processTicksAndRejections (node:internal/process/task_queues:83:21) {
      [stack]: 'ZWaveError: Serial port errored: read ETIMEDOUT\n' +
        '    at ZWaveSocket.<anonymous> (/usr/src/app/node_modules/zwave-js/src/lib/driver/Driver.ts:702:19)\n' +
        '    at ZWaveSocket.emit (node:events:394:28)\n' +
        '    at ZWaveSocket.emit (node:domain:470:12)\n' +
        '    at Socket.<anonymous> (/usr/src/app/node_modules/@zwave-js/serial/src/ZWaveSerialPortBase.ts:112:9)\n' +
        '    at Socket.emit (node:events:394:28)\n' +
        '    at Socket.emit (node:domain:470:12)\n' +
        '    at emitErrorNT (node:internal/streams/destroy:193:8)\n' +
        '    at emitErrorCloseNT (node:internal/streams/destroy:158:3)\n' +
        '    at processTicksAndRejections (node:internal/process/task_queues:83:21)',
      [message]: 'Serial port errored: read ETIMEDOUT',
      code: 5,
      context: undefined,
      transactionSource: undefined,
      name: 'ZWaveError'
    },
    [length]: 1
  ]
} with options { qos: 1, retain: false }
2021-10-13 12:36:44.836 INFO ZWAVE-SERVER: Server closed
2021-10-13 12:36:44.840 INFO ZWAVE: Client closed
2021-10-13 12:36:44.843 INFO ZWAVE: Connecting to tcp://10.11.8.88:7676
Logging to file:
        /usr/src/app/store/zwavejs_2021-10-13.log

Then if it fails i see the retry

2021-10-13 12:37:22.375 WARN ZWAVE: Retry connection in 3 seconds...
2021-10-13 12:37:25.380 INFO ZWAVE: Connecting to tcp://10.11.8.88:7676

And then it seems to be working fine. I'll do some testing over the next 1-2 hours to make sure it's stable

@AlCalzone
Copy link
Member

Thanks - let me know if it stays that way, then I can merge.

@vladbabii
Copy link
Author

@robertsLando I see this topic/message in mqtt

service/zwave_01/_EVENTS/ZWAVE_GATEWAY-zwave_01/driver/driver_error {"data":[{"code":5}]}

But i don't see any message published without any error when the connection works again... Is this intended behaviour ?

@robertsLando
Copy link
Member

But i don't see any message published without any error when the connection works again... Is this intended behaviour ?

The events you see there are all the events coming from zwave-js, I don't emit any event on connection successfull if that's the question. BTW you see that after a reconnect zwave-js will send the all nodes readty event, you can use that as event to know when everything is setted up correctly

@vladbabii
Copy link
Author

vladbabii commented Oct 13, 2021

What i'm trying to do is to write a script to automate breaking zwave tcp connection (by using relays with power source of the device and of a network switch or by running shell commands to restart ser2net) but i don't see an obvious way of detecting state from mqtt topics right now... what should i use? I want to make the script and leave it running for 1-2 hours without my intervention so i can get some statistics at the end...

edit1
I need to be able to get current status by either a mqtt topic or a http api or something easy to read

@vladbabii
Copy link
Author

All the mqtt retained topics are theres

service/zwave_01/_CLIENTS/ZWAVE_GATEWAY-Zwavejs2Mqtt/version {"value":"5.8.0","time":1634119906027}
service/zwave_01/_CLIENTS/ZWAVE_GATEWAY-Zwavejs2Mqtt/status {"value":true,"time":1634119906027}
service/zwave_01/Edge-8/status {"time":1634120088837,"value":true,"status":"Alive","nodeId":1}
service/zwave_01/Edge-8/114/0/manufacturerId {"time":1634120088835,"value":0}
service/zwave_01/Edge-8/114/0/productType {"time":1634120088835,"value":3}
service/zwave_01/Edge-8/114/0/productId {"time":1634120088835,"value":8}
service/zwave_01/Edge-8/134/0/libraryType {"time":1634120088836}
service/zwave_01/Edge-8/134/0/protocolVersion {"time":1634120088836}
service/zwave_01/Edge-8/134/0/firmwareVersions {"time":1634120088836,"value":["5.14"]}
service/zwave_01/PowerStrip/status {"time":1634120088910,"value":true,"status":"Alive","nodeId":2}
service/zwave_01/PowerStrip/37/0/currentValue {"time":1634120088895,"value":false}
service/zwave_01/PowerStrip/37/0/targetValue {"time":1634120088896,"value":false}
service/zwave_01/PowerStrip/50/0/value/65537 {"time":1634120088896,"value":0}
service/zwave_01/PowerStrip/50/0/value/66049 {"time":1634120277444,"value":0}
service/zwave_01/PowerStrip/50/0/reset {"time":1634120088897}
service/zwave_01/PowerStrip/112/0/1 {"time":1634120088897,"value":1}
service/zwave_01/PowerStrip/112/0/2 {"time":1634120088897,"value":0}
service/zwave_01/PowerStrip/112/0/3 {"time":1634120088898,"value":0}
service/zwave_01/PowerStrip/112/0/10 {"time":1634120088898,"value":255}
service/zwave_01/PowerStrip/112/0/11 {"time":1634120088898,"value":0}
service/zwave_01/PowerStrip/112/0/12 {"time":1634120088899,"value":0}
service/zwave_01/PowerStrip/112/0/15 {"time":1634120088899,"value":0}
service/zwave_01/PowerStrip/112/0/30 {"time":1634120088899,"value":1}
service/zwave_01/PowerStrip/112/0/40 {"time":1634120088900,"value":10}
service/zwave_01/PowerStrip/112/0/42 {"time":1634120088900,"value":300}
service/zwave_01/PowerStrip/112/0/63 {"time":1634120088900,"value":0}
service/zwave_01/PowerStrip/112/0/100 {"time":1634120088901,"value":1}
service/zwave_01/PowerStrip/112/0/101 {"time":1634120088903,"value":1}
service/zwave_01/PowerStrip/112/0/110 {"time":1634120088904,"value":32536}
service/zwave_01/PowerStrip/112/0/120 {"time":1634120088904,"value":5}
service/zwave_01/PowerStrip/114/0/manufacturerId {"time":1634120088905,"value":345}
service/zwave_01/PowerStrip/114/0/productType {"time":1634120088905,"value":2}
service/zwave_01/PowerStrip/114/0/productId {"time":1634120088906,"value":82}
service/zwave_01/PowerStrip/134/0/libraryType {"time":1634120088907,"value":3}
service/zwave_01/PowerStrip/134/0/protocolVersion {"time":1634120088907,"value":"4.5"}
service/zwave_01/PowerStrip/134/0/firmwareVersions {"time":1634120088908,"value":["1.1"]}
service/zwave_01/PowerStrip/134/0/hardwareVersion {"time":1634120088909,"value":1}

so currently i see no easy way to get a status from mqtt. If the mqtt client is not connected when the reconnection happens it does not see that at all.

I would expect to see a retained topic with will set up on it so on a bad disconnect mqtt would change value to something to indicate 'this zwave2mqtt is not online/working right now'...

@vladbabii
Copy link
Author

xyz

This is confusing... driver is in a bad state but devices status is "Alive"

@vladbabii
Copy link
Author

Seems /health endpoint gives correct state of service ( 200/500 status).
Could this be mirrored on mqtt with a last will and a publish on disconnect ? @robertsLando can i make a new feature request for this? Or should i keep polling /health to get state of service ?

@robertsLando
Copy link
Member

I was going to answer to you that /health apis are designed for such purposes.

BTW I still don't get the problem here, when the driver fails it emits the driver error, then it connects again and should emit driver_ready event, couldn't you listen to this events on mqtt?

@AlCalzone
Copy link
Member

but devices status is "Alive"

The device status only gets updated when communicating with devices.

@vladbabii
Copy link
Author

I was going to answer to you that /health apis are designed for such purposes.

BTW I still don't get the problem here, when the driver fails it emits the driver error, then it connects again and should emit driver_ready event, couldn't you listen to this events on mqtt?

Using only retain messages, a client can connect to mqtt at any time to read data. the messages of driver error and driver ready are not retained messages. So for stateful information about zwave actual status (not only device status) to be used, a clients would need to either poll /health, or have a second service poll /health and write it to a mqtt topic.

This way an automation script that listens to a motion sensor being tripped and doing something should listen to both /health updates and motion updates. In this way it can device "zwave stack is in a bad state, don't try to do anything with information until health says it's ok"

I can script my way around the information that's available now by polling /health every 250ms and dumping it into mqtt topic but it would be nicer to have /health exposed as mqtt message (with retained messages) and also unset it via last will if service goes down (docker issue, connection issue, zwave issue). this way i can get my automations to be aware of the real state of things regardless of when they connect to mqtt (some things might run for a brief time, some might run for months, but kubernetes/docker could also deploy them on different nodes so it would trigger reconnects).

Or my use case is different than the intended use case of zwave2mqtt. I thought of it as "make available all information from zwave on mqtt so any mqtt client can read it and make valid decisions from (near) real-time data"

@robertsLando
Copy link
Member

Try to check zwave-js/zwave-js-ui#1838

you should see driverReady status now on <prefix>/driver/status payload is 'true'/'false'. Would that be enought?

@vladbabii
Copy link
Author

Regarding the initial issue in the ticket:
@robertsLando @AlCalzone i have zwave still working since yesterday (about 18 hours) with multiple reboots and ser2net restarts (more than 200 probably). It seems pretty stable.

@vladbabii
Copy link
Author

Try to check zwave-js/zwavejs2mqtt#1838

you should see driverReady status now on <prefix>/driver/status payload is 'true'/'false'. Would that be enought?

So build docker image with "driver-status" for z2m branch ?

curl -s https://raw.githubusercontent.com/zwave-js/zwavejs2mqtt/master/docker/Dockerfile.contrib | \
DOCKER_BUILDKIT=1 docker build - --build-arg SRC=git-clone-src --no-cache \
--build-arg ZWJ_BRANCH=wrap-serialport-error --build-arg Z2M_BRANCH=driver-status -t zwavejs2mqtt

There are 2 cases to cover

  1. driver status when service is working normally --> normal updates to mqtt
  2. driver status when service gets killed/times out --> set last will on that topic to set the driver status to false
  3. driver status when service exists normally --> before mqtt disconnect set that status to false

@robertsLando should we move the discussion in a different issue or on that merge request since it's kinda off topic here ?

@robertsLando
Copy link
Member

@vladbabii We can move to zwave-js/zwave-js-ui#1840

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants