Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gnrc_netif: use event loops by default to process ISR #16748

Merged

Conversation

jia200x
Copy link
Member

@jia200x jia200x commented Aug 16, 2021

Contribution description

This PR removes the NETDEV_MSG_TYPE_EVENT and forces GNRC Netif to do the Bottom Half processing (ISR offload) using event loops instead of msg queues.

Msg queues are not safe for processing ISR because they might be lost. Indeed this might happen often if the device is under stress (e.g Border Router). Losing these events tend to have catastrophic consequences such as:

  • Radios might stop receiving frames if the FB is lock and nobody cares to read the framebuffer
  • The IEEE 802.15.4 SubMAC might be go to the wrong state if one of the radio events get lost.

I suspect some of the BR freezes might be related to this.

This PR also removes the gnrc_netif_events since it's not optional anymore. We should adapt it in the future to use thread_flags directly though (I know @maribu already did this in one of the confirm_send PRs).

Testing procedure

Make sure GNRC netif works properly.

Issues/PRs references

Depends on #16747, otherwise that radio won't work anymore.

@jia200x jia200x requested review from maribu and fjmolinas August 16, 2021 13:49
@github-actions github-actions bot added Area: build system Area: Build system Area: drivers Area: Device drivers Area: network Area: Networking Area: sys Area: System labels Aug 16, 2021
@benpicco benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Aug 16, 2021
@jia200x
Copy link
Member Author

jia200x commented Aug 17, 2021

I will need #16751 first

@benpicco
Copy link
Contributor

I'm a bit surprised about the size increase. Is the failure on native related to #16751?

@jia200x
Copy link
Member Author

jia200x commented Aug 17, 2021

I'm a bit surprised about the size increase.

I guess it can be reduced when we use thread_flags instead of event loops.

Is the failure on native related to #16751?

Yes it is

@maribu
Copy link
Member

maribu commented Aug 17, 2021

I'm a bit surprised about the size increase.

I guess it can be reduced when we use thread_flags instead of event loops.

This is what the PR to adapt gnrc_netif to use confirm_send(). And indeed, it is more lightweight. But note that if there are additional users of the event queue, I'm confident that this will pay out. With the ability to handle multiple event queues (of different priorities) in a single thread, it would be even possible to run a full network stack in a single thread while keeping the strict separation of network layers and the flexibility this yields as in GNRC.

@stale
Copy link

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Mar 2, 2022
@jia200x
Copy link
Member Author

jia200x commented Apr 14, 2022

is there something missing here? I would like to have this one to remove the netdev dependency on GNRC (required for the openDSME integration)

@stale stale bot removed the State: stale State: The issue / PR has no activity for >185 days label Apr 14, 2022
@benpicco
Copy link
Contributor

Does this still depend on #16751?

@jia200x
Copy link
Member Author

jia200x commented Apr 14, 2022

Does this still depend on #16751?

Hmmmm, yes, it actually depends on that...

@benpicco benpicco added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels May 6, 2022
@benpicco
Copy link
Contributor

benpicco commented May 7, 2022

Hm CI does not like it anymore

@benpicco benpicco force-pushed the pr/gnrc_netif/remove_netdev_event_isr_msg branch from 374991d to 869c5f3 Compare May 10, 2022 15:50
@benpicco
Copy link
Contributor

I've taken the liberty to give this a rebase.

@github-actions github-actions bot removed the Area: drivers Area: Device drivers label May 10, 2022
@github-actions github-actions bot added the Area: tests Area: tests and testing framework label May 10, 2022
Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This basically just makes gnrc_netif_events the default and removes the !gnrc_netif_events case.

@benpicco benpicco merged commit 1e315c7 into RIOT-OS:master May 13, 2022
gschorcht added a commit to gschorcht/RIOT-Xtensa-ESP that referenced this pull request Aug 16, 2022
When FreeRTOS semaphores, as required by ESP-IDF, are used together with `gnrc_netif`, RIOT may crash if `STATUS_RECEIVE_BLOCKED` is used as a blocking mechanism in the FreeRTOS adaptation layer. The reason for this is that `gnrc_netif` uses thread flags since PR RIOT-OS#16748. If the `gnrc_netif` thread is blocked because of a FreeRTOS semaphore, and is thus in `STATUS_RECEIVE_BLOCKED` state, the `_msg_send` function will cause a crash because it then assumes that `target->wait_data` contains a pointer to a message of type `msg_t`, but by using thread flags it contains the flag mask. This situation can happen if the ESP hardware is used while another thread is sending something timer controlled to the `gnrc_netif` thread.

To solve this problem `STATUS_MUTEX_LOCKED` is used instead of `STATUS_RECEIVE_BLOCKED` and `STATUS_SEND_BLOCKED`
gschorcht added a commit to gschorcht/RIOT-Xtensa-ESP that referenced this pull request Aug 18, 2022
When FreeRTOS semaphores, as required by ESP-IDF, are used together with `gnrc_netif`, RIOT may crash if `STATUS_RECEIVE_BLOCKED` is used as a blocking mechanism in the FreeRTOS adaptation layer. The reason for this is that `gnrc_netif` uses thread flags since PR RIOT-OS#16748. If the `gnrc_netif` thread is blocked because of a FreeRTOS semaphore, and is thus in `STATUS_RECEIVE_BLOCKED` state, the `_msg_send` function will cause a crash because it then assumes that `target->wait_data` contains a pointer to a message of type `msg_t`, but by using thread flags it contains the flag mask. This situation can happen if the ESP hardware is used while another thread is sending something timer controlled to the `gnrc_netif` thread.

To solve this problem `STATUS_MUTEX_LOCKED` is used instead of `STATUS_RECEIVE_BLOCKED` and `STATUS_SEND_BLOCKED`
gschorcht added a commit to gschorcht/RIOT-Xtensa-ESP that referenced this pull request Aug 21, 2022
When FreeRTOS semaphores, as required by ESP-IDF, are used together with `gnrc_netif`, RIOT may crash if `STATUS_RECEIVE_BLOCKED` is used as a blocking mechanism in the FreeRTOS adaptation layer. The reason for this is that `gnrc_netif` uses thread flags since PR RIOT-OS#16748. If the `gnrc_netif` thread is blocked because of a FreeRTOS semaphore, and is thus in `STATUS_RECEIVE_BLOCKED` state, the `_msg_send` function will cause a crash because it then assumes that `target->wait_data` contains a pointer to a message of type `msg_t`, but by using thread flags it contains the flag mask. This situation can happen if the ESP hardware is used while another thread is sending something timer controlled to the `gnrc_netif` thread.

To solve this problem `STATUS_MUTEX_LOCKED` is used instead of `STATUS_RECEIVE_BLOCKED` and `STATUS_SEND_BLOCKED`
gschorcht added a commit to gschorcht/RIOT-Xtensa-ESP that referenced this pull request Aug 23, 2022
When FreeRTOS semaphores, as required by ESP-IDF, are used together with `gnrc_netif`, RIOT may crash if `STATUS_RECEIVE_BLOCKED` is used as a blocking mechanism in the FreeRTOS adaptation layer. The reason for this is that `gnrc_netif` uses thread flags since PR RIOT-OS#16748. If the `gnrc_netif` thread is blocked because of a FreeRTOS semaphore, and is thus in `STATUS_RECEIVE_BLOCKED` state, the `_msg_send` function will cause a crash because it then assumes that `target->wait_data` contains a pointer to a message of type `msg_t`, but by using thread flags it contains the flag mask. This situation can happen if the ESP hardware is used while another thread is sending something timer controlled to the `gnrc_netif` thread.

To solve this problem `STATUS_MUTEX_LOCKED` is used instead of `STATUS_RECEIVE_BLOCKED` and `STATUS_SEND_BLOCKED`
gschorcht added a commit to gschorcht/RIOT-Xtensa-ESP that referenced this pull request Aug 23, 2022
When FreeRTOS semaphores, as required by ESP-IDF, are used together with `gnrc_netif`, RIOT may crash if `STATUS_RECEIVE_BLOCKED` is used as a blocking mechanism in the FreeRTOS adaptation layer. The reason for this is that `gnrc_netif` uses thread flags since PR RIOT-OS#16748. If the `gnrc_netif` thread is blocked because of a FreeRTOS semaphore, and is thus in `STATUS_RECEIVE_BLOCKED` state, the `_msg_send` function will cause a crash because it then assumes that `target->wait_data` contains a pointer to a message of type `msg_t`, but by using thread flags it contains the flag mask. This situation can happen if the ESP hardware is used while another thread is sending something timer controlled to the `gnrc_netif` thread.

To solve this problem `STATUS_MUTEX_LOCKED` is used instead of `STATUS_RECEIVE_BLOCKED` and `STATUS_SEND_BLOCKED`
gschorcht added a commit to gschorcht/RIOT-Xtensa-ESP that referenced this pull request Aug 24, 2022
When FreeRTOS semaphores, as required by ESP-IDF, are used together with `gnrc_netif`, RIOT may crash if `STATUS_RECEIVE_BLOCKED` is used as a blocking mechanism in the FreeRTOS adaptation layer. The reason for this is that `gnrc_netif` uses thread flags since PR RIOT-OS#16748. If the `gnrc_netif` thread is blocked because of a FreeRTOS semaphore, and is thus in `STATUS_RECEIVE_BLOCKED` state, the `_msg_send` function will cause a crash because it then assumes that `target->wait_data` contains a pointer to a message of type `msg_t`, but by using thread flags it contains the flag mask. This situation can happen if the ESP hardware is used while another thread is sending something timer controlled to the `gnrc_netif` thread.

To solve this problem `STATUS_MUTEX_LOCKED` is used instead of `STATUS_RECEIVE_BLOCKED` and `STATUS_SEND_BLOCKED`
@chrysn chrysn added this to the Release 2022.07 milestone Aug 25, 2022
Teufelchen1 pushed a commit to Teufelchen1/RIOT that referenced this pull request Aug 26, 2022
When FreeRTOS semaphores, as required by ESP-IDF, are used together with `gnrc_netif`, RIOT may crash if `STATUS_RECEIVE_BLOCKED` is used as a blocking mechanism in the FreeRTOS adaptation layer. The reason for this is that `gnrc_netif` uses thread flags since PR RIOT-OS#16748. If the `gnrc_netif` thread is blocked because of a FreeRTOS semaphore, and is thus in `STATUS_RECEIVE_BLOCKED` state, the `_msg_send` function will cause a crash because it then assumes that `target->wait_data` contains a pointer to a message of type `msg_t`, but by using thread flags it contains the flag mask. This situation can happen if the ESP hardware is used while another thread is sending something timer controlled to the `gnrc_netif` thread.

To solve this problem `STATUS_MUTEX_LOCKED` is used instead of `STATUS_RECEIVE_BLOCKED` and `STATUS_SEND_BLOCKED`
Teufelchen1 pushed a commit to Teufelchen1/RIOT that referenced this pull request Aug 26, 2022
When FreeRTOS semaphores, as required by ESP-IDF, are used together with `gnrc_netif`, RIOT may crash if `STATUS_RECEIVE_BLOCKED` is used as a blocking mechanism in the FreeRTOS adaptation layer. The reason for this is that `gnrc_netif` uses thread flags since PR RIOT-OS#16748. If the `gnrc_netif` thread is blocked because of a FreeRTOS semaphore, and is thus in `STATUS_RECEIVE_BLOCKED` state, the `_msg_send` function will cause a crash because it then assumes that `target->wait_data` contains a pointer to a message of type `msg_t`, but by using thread flags it contains the flag mask. This situation can happen if the ESP hardware is used while another thread is sending something timer controlled to the `gnrc_netif` thread.

To solve this problem `STATUS_MUTEX_LOCKED` is used instead of `STATUS_RECEIVE_BLOCKED` and `STATUS_SEND_BLOCKED`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: build system Area: Build system Area: network Area: Networking Area: sys Area: System Area: tests Area: tests and testing framework CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants