Poll on DTLS socket returns -EAGAIN if bind & receive any data. #33330
Comments
Thanks for reporting, indeed this seems to be a valid issue. I just wanted to clarify that we're on the same page first. In the first paragraph you wrote:
and then further in the analysis:
Are you sure about the former statement? Because what I've observed was indeed a busy loop, but all inside a single Actually, we used to handle this case properly (please have a look at some older revision: zephyr/subsys/net/lib/sockets/sockets_tls.c Line 1729 in 7d77307)
So in my opinion, how In case of DTLS client:
In case of DTLS server it's a bit different though:
What do you think, does the above make sense? I can work on a fix if we agree it's a sane approach.
Thank you for looking into this @rlubos !
You are right, upon further investigation I concur with you, it busy waits within
Yes this sounds like a good plan. Although it would seem easier to just poll
I have nothing to add, it sounds like the cleanest solution available. It would be great if you could implement this, it feels a bit over my head at the moment. I already have two merge requests to update in my backlog as well 😄 Thanks!
Great, I'll work on the patch then.
Describe the bug
I am running into a problem where poll() is called with a timeout but returns 0 without waiting. This causes a busy-wait loop that blocks lower-priority tasks (and essentially all tasks if time-slicing is not enabled). I am doing the following:
The application is a CoAP server/client where a client call to sendto() starts the DTLS handshake. In other words, my end is a DTLS client.
It seems the problem is that the bind() call opens the socket up to incoming data, which is put into the socket FIFO without being passed to mbedTLS.
zephyr/subsys/net/lib/sockets/sockets.c
Lines 364 to 367 in 4626a57
It seems like data is only given to mbedTLS when recv(from) or send(to) is called. So if any data comes in after bind(), it sits in the FIFO without having reached mbedTLS. Then, from what I can see, sockets_tls returns -EAGAIN in the update_pollin function because.. but for the reasons mentioned, the handshake hasn't even been started yet.
zephyr/subsys/net/lib/sockets/sockets_tls.c
Lines 1875 to 1877 in 4626a57
It will stay this way until the end of time, since no handshake is started and so it will neither time out nor ever finish.
I don't know if this is a usage fault on my side (it probably is), but from what I can see the same would happen if my DTLS socket had the server role: if bind() is called and data is received, the handshake would not be started and poll() would still behave this way. It seems that mbedTLS is first fed data when recvfrom() is called, so any data that arrives after bind() is just put in the queue and not given to mbedTLS. Please correct me if I'm wrong. In any case, in my opinion this behavior is wrong and the case should be handled. Please comment and advise; I can implement a fix if it's deemed necessary.
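For reference, here is a minimal sketch of the bind()-then-poll() pattern discussed above. It is not the original reproduction code: it uses a plain POSIX UDP socket on loopback as a stand-in for the DTLS socket, and the helper names open_bound_udp_socket() and timed_poll() are made up for illustration. On Zephyr the socket would presumably be created with IPPROTO_DTLS_1_2 and configured with credentials, which is omitted here. The sketch only shows the expected behavior: poll() on an idle bound socket should block for about the requested timeout before returning 0.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: open a UDP socket bound to an ephemeral loopback port.
 * Returns the descriptor, or -1 on failure. A plain UDP socket stands in for
 * the DTLS socket from the report (assumption for illustration only). */
int open_bound_udp_socket(void)
{
    struct sockaddr_in addr = {0};
    int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

    if (fd < 0) {
        return -1;
    }
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); /* let the stack pick a free port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Milliseconds elapsed between two monotonic timestamps. */
long elapsed_ms(const struct timespec *a, const struct timespec *b)
{
    return (b->tv_sec - a->tv_sec) * 1000L +
           (b->tv_nsec - a->tv_nsec) / 1000000L;
}

/* Hypothetical helper: call poll() once for POLLIN, store its return value
 * in *poll_ret, and return how long the call took in milliseconds. */
long timed_poll(int fd, int timeout_ms, int *poll_ret)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    struct timespec start, end;

    assert(timeout_ms >= 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    *poll_ret = poll(&pfd, 1, timeout_ms);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_ms(&start, &end);
}
```

Calling timed_poll(fd, 200, &ret) on a socket returned by open_bound_udp_socket() should leave ret at 0 and report roughly 200 ms elapsed. Timing the call this way is one possible check for whether a poll() on a given socket honors its timeout or returns early.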
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Impact
Showstopper, kinda, since the external library is using the bind()/poll() combination. I could change it to recvfrom() with some non-blocking flag, check whether data was received, and then call k_msleep(1), but it seems like a better solution to fix the root problem.
Logs and console output
N/A
Environment (please complete the following information):