drivers: eth: stm32: Fix driver crash caused by RX IRQ trigger #25393
FYI, the problem shown by CI has nothing to do with this change set.
+1 to the fix. If the system has VLANs enabled, then this function will be called multiple times. Does it cause any side effects if the config function is also called multiple times? I see that the mcux driver has the same issue.
@bwasim, thanks for this fix. However, since we're in the V2.3.0 release process, we'll only merge bug fixes that match a known issue. Can you add an issue for this bug?
This is not a regression and has been present in the driver for a long time. The problem appears because we enable interrupts before the "netif" pointer is populated in the device structure. If we get an RX interrupt before the population of the pointer, we see a crash as the "rx_thread" uses this information.
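The ordering problem described above can be sketched in plain C without any Zephyr headers. This is only an illustrative model: `fake_eth_dev`, `fake_eth_initialize`, `fake_iface_init` and `fake_rx_event` are hypothetical stand-ins, not the actual STM32 driver code or Zephyr API. It assumes the RX path dereferences the interface pointer as soon as the IRQ is unmasked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state; not the real Zephyr types. */
struct fake_iface {
    int id;
};

struct fake_eth_dev {
    struct fake_iface *iface; /* populated later by the iface init hook */
    bool irq_enabled;         /* models the Ethernet IRQ being unmasked */
    int frames_delivered;
};

/* Boot-time init: hardware setup only. Passing true models the old
 * behaviour, where the IRQ was enabled before the iface pointer was set. */
void fake_eth_initialize(struct fake_eth_dev *dev, bool enable_irq_early)
{
    dev->iface = NULL;
    dev->frames_delivered = 0;
    dev->irq_enabled = enable_irq_early;
}

/* Called later by the network stack. Models the fix: store the iface
 * pointer first, and only then enable the IRQ. */
void fake_iface_init(struct fake_eth_dev *dev, struct fake_iface *iface)
{
    dev->iface = iface;
    dev->irq_enabled = true;
}

/* Models one incoming frame.
 * Returns  0 if the frame was delivered to the interface,
 *         -1 if the IRQ is still masked (nothing runs yet),
 *         -2 if the RX path would have dereferenced a NULL iface. */
int fake_rx_event(struct fake_eth_dev *dev)
{
    if (!dev->irq_enabled) {
        return -1;
    }
    if (dev->iface == NULL) {
        return -2; /* this is the crash case in the real driver */
    }
    dev->frames_delivered++;
    return 0;
}
```

With `enable_irq_early` set, a frame arriving between the two init steps hits the `-2` case; with the IRQ enabled only in `fake_iface_init`, that window does not exist in this model.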
Given that this is not a regression, do you still want me to open a bug report for this?
No problem in calling this function multiple times, though I think that it will be called only once per Ethernet instance. If you want, I can add changes for the MCUx driver also, or create a separate issue / PR for it.
@bwasim, that's a requirement for bug fixes merged during the pre-release freeze window (as it is now). So, if you'd like to get this into the 2.3 release, please create such a ticket and update the commit message with "Fixes: #ticket_no".
Created #25408 and updated the commit message. Thanks.
Yes, but if VLANs are enabled, then there will be one Ethernet device but multiple network interfaces tied to that device, so the network interface init function will be called multiple times in that case. The mcux init seems to do things the same way as the stm32 one; the gmac driver checks this and does the relevant "one time things" only once. If you are up for it, a separate PR could be created that fixes this both here and in mcux.
could be used for this purpose, so that the things that are supposed to be done only once can be placed inside the if.
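One possible shape for such a guard is sketched below in plain C. It is an assumption for illustration, not the snippet from the comment above and not necessarily what the gmac driver does: `guard_eth_dev`, `guard_iface` and `guard_iface_init` are hypothetical names, and the sketch assumes the first interface passed to the init hook is the one the device should remember.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real Zephyr types. */
struct guard_iface {
    int id;
};

struct guard_eth_dev {
    struct guard_iface *iface; /* first interface bound to this device */
    int one_time_setups;       /* counts device-wide work, e.g. IRQ enable */
    int per_iface_setups;      /* counts work repeated for every interface */
};

/* Interface init hook. With VLANs enabled it may run once per network
 * interface, all sharing the same Ethernet device. */
void guard_iface_init(struct guard_eth_dev *dev, struct guard_iface *iface)
{
    if (dev->iface == NULL) {
        /* First call for this device: do the "one time things" here. */
        dev->iface = iface;
        dev->one_time_setups++;
    }

    /* Work that is safe and necessary for every interface. */
    dev->per_iface_setups++;
}
```

Calling `guard_iface_init` twice on the same zero-initialized device runs the guarded part once and the per-interface part twice.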
@bwasim, I was trying to understand how it happened that this was not detected/reported before.
All initialization of the Ethernet interface is done in the eth_initialize function, which is invoked by the boot code. This function sets up DMA, programs the Ethernet module and enables IRQs. It does not, however, set up the "netif" interface info; that is done when the Ethernet device is enumerated by the NET stack via the "iface_api.init" func. After eth_initialize has been called, the system may therefore receive RX interrupts, and the "rx_thread" accesses the "netif" pointer to get iface info. Because the "netif" info is not necessarily populated at this time, we get a crash (the OS does a NULL access).

Fixed by enabling the Ethernet IRQ after the interface is properly set up.

Tested on Nucleo F767ZI board.

Fixes zephyrproject-rtos#25408

Signed-off-by: Bilal Wasim <bilalwasim676@gmail.com>
Looking at the code, it shouldn't be a problem if we do it multiple times, but I've updated the STM32 Ethernet driver to do this only once because that makes more logical sense. I see that the gmac / mcux drivers have the same problem, which can be addressed in a separate PR.
@erwango, this problem only shows up if there is continuous data on the network while Ethernet is initializing, which results in interrupts being triggered as soon as we enable the IRQ / perform hardware init. This use case is reasonably common in local networks, but I don't think this was tested before.