pkg/semtech_loramac: deadlock with UNCONFIRMED messages #11530
Labels
Area: LoRa
Type: bug
Description
If you are unlucky, the `semtech_loramac` package can deadlock when you send `UNCONFIRMED` messages from a thread that doesn't have a message queue.

The simplest way to show why is to explain how the `semtech_loramac` package is connected to the LoRaMAC-node library. This library has 3 interfaces: MLME, MCPS and MIB. When you send an application uplink you use the MCPS interface. This interface is asynchronous and can trigger two events: `Confirm` and `Indication`.

When you send an uplink you first call `semtech_loramac_send` and then `semtech_loramac_recv` to wait for the transmission to finish. Currently `semtech_loramac` handles the two LoRaMAC-node events as follows:

- When an `UNCONFIRMED` message is sent, `semtech_loramac_recv` returns on the `Confirm` event.
- When a `CONFIRMED` message is sent, the `Confirm` event is ignored. As we expect an Ack from the LoRa server, `semtech_loramac_recv` returns on the `Indication` event (when this Ack is received). Waiting for the `Indication` event also makes it possible to retrieve data from downlinks.

`semtech_loramac_recv` waits on the caller thread using the message module. When a `Confirm`/`Indication` event is received, the loramac event loop sends a `MSG_TYPE_LORAMAC_TX_STATUS` message to the caller thread so that it can return from `semtech_loramac_recv`. If this thread doesn't have a message queue, sending the message blocks the event loop until the caller thread receives the message.

The problem is that when you transmit an `UNCONFIRMED` message you are also waiting for the 2 RX windows. If the LoRa server sends a downlink to the device, the `Indication` event is fired, but the loramac event loop has already sent `MSG_TYPE_LORAMAC_TX_STATUS` on the `Confirm` event. It then sends another `MSG_TYPE_LORAMAC_TX_STATUS`, which blocks the event loop forever because the caller thread has already left `semtech_loramac_recv`. Even worse, if the caller thread now calls `semtech_loramac_send`, a deadlock occurs.

With the current architecture of the `semtech_loramac` package, this issue is quite difficult to fix. The simplest way would be to ignore the `Indication` event when an `UNCONFIRMED` message was sent, but data from downlink messages would then be lost.

Steps to reproduce the issue
Send an `UNCONFIRMED` message from a thread that doesn't have a message queue to a LoRa server that has a downlink message to transmit to the end device.

In fact, if you are lucky, the LoRa server can send a downlink by itself if it has MAC parameters to transmit to the end device (for example `NewChannelReq`).

The image below describes what happens when a downlink is received after an `UNCONFIRMED` uplink.

The code that was used to produce this example is the following:
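That code is not reproduced here. As a stand-in, the sketch below is a small model in plain C (no RIOT headers, no radio) of the sequence described above. It is not the original reproducer and not how the package is implemented. Every name in it (`model_t`, `caller_send`, `caller_recv`, `event_notifies`, `loop_event`) is hypothetical and exists only for illustration. It assumes, as the description implies, that `semtech_loramac_send` needs the event loop to accept a request, so it cannot complete while the loop is blocked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the behaviour described in this issue.
 * None of these names exist in RIOT or LoRaMAC-node. */

typedef enum { TX_UNCONFIRMED, TX_CONFIRMED } tx_mode_t;
typedef enum { EV_CONFIRM, EV_INDICATION } mcps_event_t;

typedef struct {
    bool has_msg_queue;  /* caller thread owns a message queue */
    bool in_recv;        /* caller is waiting in semtech_loramac_recv */
    bool loop_blocked;   /* event loop is stuck sending TX_STATUS */
    size_t queued;       /* TX_STATUS messages parked in the caller's queue */
    tx_mode_t mode;      /* mode of the uplink in flight */
} model_t;

void model_init(model_t *m, bool has_msg_queue)
{
    m->has_msg_queue = has_msg_queue;
    m->in_recv = false;
    m->loop_blocked = false;
    m->queued = 0;
    m->mode = TX_UNCONFIRMED;
}

/* Caller side, standing in for semtech_loramac_send().
 * Assumption: the call needs the event loop to accept a request.
 * Returns false when the call could never complete (deadlock). */
bool caller_send(model_t *m, tx_mode_t mode)
{
    if (m->loop_blocked) {
        return false;
    }
    m->mode = mode;
    return true;
}

/* Caller side, standing in for entering semtech_loramac_recv().
 * Simplification: messages already queued are not consumed here. */
void caller_recv(model_t *m)
{
    m->in_recv = true;
}

/* Does this event make the loop send MSG_TYPE_LORAMAC_TX_STATUS?
 * CONFIRMED: Confirm is ignored, only Indication notifies.
 * UNCONFIRMED: Confirm notifies, and so does a later Indication. */
bool event_notifies(tx_mode_t mode, mcps_event_t ev)
{
    return (ev == EV_INDICATION) || (mode == TX_UNCONFIRMED);
}

/* Event loop side: handle one MCPS event. */
void loop_event(model_t *m, mcps_event_t ev)
{
    if (m->loop_blocked || !event_notifies(m->mode, ev)) {
        return;
    }
    if (m->in_recv) {
        m->in_recv = false;      /* delivered: recv returns */
    }
    else if (m->has_msg_queue) {
        m->queued++;             /* parked in the queue, loop stays free */
    }
    else {
        m->loop_blocked = true;  /* nobody receives: loop blocks */
    }
}
```

To walk through the failing case, call `model_init(&m, false)`, then `caller_send(&m, TX_UNCONFIRMED)`, `caller_recv(&m)`, `loop_event(&m, EV_CONFIRM)` and `loop_event(&m, EV_INDICATION)`. After the second event `m.loop_blocked` is set, and a further `caller_send` returns `false`. Running the same sequence after `model_init(&m, true)` leaves the loop free, with the second status message parked in the queue.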