Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: sockets: Update msg_controllen in recvmsg properly #77345

Conversation

decsny
Copy link
Member

@decsny decsny commented Aug 21, 2024

If adding any control data like timestamping or pkt_info fails, the msg_controllen wasn't properly cleared.
This commit aims to fix this by clearing the msg_controllen only when adding control data is successful.

Fixes #77303

Author: @axelnxp
(submitted on behalf because of some nxp github policies delaying his ability to post a PR)

awojasinski
awojasinski previously approved these changes Aug 21, 2024
jukkar
jukkar previously approved these changes Aug 21, 2024
@axelnxp
Copy link
Contributor

axelnxp commented Aug 21, 2024

The change from this PR is causing a regression is some socket tests.
I tested on native_sim and I confirmed the change is the culprit.

when running run_ancillary_recvmsg_test, after the change, the second call to comm_sendmsg_recvmsg returns a server_msg with no control data, so it looks like we clear msg_controllen but we shouldn't in this case.

@jukkar @awojasinski what do you think?

@axelnxp
Copy link
Contributor

axelnxp commented Aug 21, 2024

comm_sendmsg_recvmsg seems to be re-using the cmsgbuf without clearing it.
@jukkar is this a valid use of cmsgbuf ? I would expect the caller to clear it before calling recvmsg.

I tried to clear the buffer, and the test can pass.

I push it so you can review.

@jukkar
Copy link
Member

jukkar commented Aug 21, 2024

I tried to find answer when the msg_controllen should be cleared.
The https://www.man7.org/linux/man-pages/man3/cmsg.3.html says

Finally, the msg_controllen field of the msghdr
should be set to the sum of the CMSG_SPACE() of the length of all
control messages in the buffer. 

Also recvmsg(2) says

 When  recvmsg() is called, msg_controllen should contain the length of the available buffer in msg_control; upon return from a successful call it will contain the length of the control message sequence.

So if I am reading this correctly, if the call is successfull, we should not clear the value.
The original code cleared the controllen when there was no control data involved. The commit ee06da5 added more socket option checks but now it looks like the change was not correct.
With this in mind, I did following change and the tests started to pass. Please take a look if this looks sane.

diff --git a/subsys/net/lib/sockets/sockets.c b/subsys/net/lib/sockets/sockets.c
index b61683b0eec..c0ce03e1a16 100644
--- a/subsys/net/lib/sockets/sockets.c
+++ b/subsys/net/lib/sockets/sockets.c
@@ -1572,27 +1572,28 @@ static inline ssize_t zsock_recv_dgram(struct net_context *ctx,
 	if (msg != NULL) {
 		if (msg->msg_control != NULL) {
 			if (msg->msg_controllen > 0) {
-				bool clear_controllen = true;
+				bool clear_controllen = false;
 
 				if (IS_ENABLED(CONFIG_NET_CONTEXT_TIMESTAMPING)) {
 					if (add_timestamping(ctx, pkt, msg) < 0) {
 						msg->msg_flags |= ZSOCK_MSG_CTRUNC;
-					} else {
-						clear_controllen = false;
 					}
+				} else {
+					clear_controllen = true;
 				}
 
 				if (IS_ENABLED(CONFIG_NET_CONTEXT_RECV_PKTINFO) &&
 				    net_context_is_recv_pktinfo_set(ctx)) {
+					clear_controllen = false;
 					if (add_pktinfo(ctx, pkt, msg) < 0) {
 						msg->msg_flags |= ZSOCK_MSG_CTRUNC;
-					} else {
-						clear_controllen = false;
 					}
+				} else {
+					clear_controllen = true;
 				}
 
 				if (clear_controllen) {
-					msg->msg_controllen = 0;
+					msg->msg_controllen = 0U;
 				}
 			}
 		} else {

@jukkar
Copy link
Member

jukkar commented Aug 21, 2024

comm_sendmsg_recvmsg seems to be re-using the cmsgbuf without clearing it. @jukkar is this a valid use of cmsgbuf ? I would expect the caller to clear it before calling recvmsg.

I tried to clear the buffer, and the test can pass.

Indeed, that could also be a problem.

@axelnxp
Copy link
Contributor

axelnxp commented Aug 21, 2024

I tried to find answer when the msg_controllen should be cleared. The https://www.man7.org/linux/man-pages/man3/cmsg.3.html says

Finally, the msg_controllen field of the msghdr
should be set to the sum of the CMSG_SPACE() of the length of all
control messages in the buffer. 

Also recvmsg(2) says

 When  recvmsg() is called, msg_controllen should contain the length of the available buffer in msg_control; upon return from a successful call it will contain the length of the control message sequence.

So if I am reading this correctly, if the call is successfull, we should not clear the value. The original code cleared the controllen when there was no control data involved. The commit ee06da5 added more socket option checks but now it looks like the change was not correct. With this in mind, I did following change and the tests started to pass. Please take a look if this looks sane.

diff --git a/subsys/net/lib/sockets/sockets.c b/subsys/net/lib/sockets/sockets.c
index b61683b0eec..c0ce03e1a16 100644
--- a/subsys/net/lib/sockets/sockets.c
+++ b/subsys/net/lib/sockets/sockets.c
@@ -1572,27 +1572,28 @@ static inline ssize_t zsock_recv_dgram(struct net_context *ctx,
 	if (msg != NULL) {
 		if (msg->msg_control != NULL) {
 			if (msg->msg_controllen > 0) {
-				bool clear_controllen = true;
+				bool clear_controllen = false;
 
 				if (IS_ENABLED(CONFIG_NET_CONTEXT_TIMESTAMPING)) {
 					if (add_timestamping(ctx, pkt, msg) < 0) {
 						msg->msg_flags |= ZSOCK_MSG_CTRUNC;
-					} else {
-						clear_controllen = false;
 					}
+				} else {
+					clear_controllen = true;
 				}
 
 				if (IS_ENABLED(CONFIG_NET_CONTEXT_RECV_PKTINFO) &&
 				    net_context_is_recv_pktinfo_set(ctx)) {
+					clear_controllen = false;
 					if (add_pktinfo(ctx, pkt, msg) < 0) {
 						msg->msg_flags |= ZSOCK_MSG_CTRUNC;
-					} else {
-						clear_controllen = false;
 					}
+				} else {
+					clear_controllen = true;
 				}
 
 				if (clear_controllen) {
-					msg->msg_controllen = 0;
+					msg->msg_controllen = 0U;
 				}
 			}
 		} else {

Are you sure this patch satisfies the following statement:

Finally, the msg_controllen field of the msghdr
should be set to the sum of the CMSG_SPACE() of the length of all
control messages in the buffer. 

I might be missing something, but I don't see where the sum of CMSG_SPACE is computed and set to msg_controllen.
I think with your patch, msg_controllen stays at the input value, but it should actually be the length of the control data sequence, which might be different (inferior or equal to the input len).

Maybe before returning from recvmsg, we should go through all the cmsg buffer and compute the sum of the different control data length.

Am I missing something ?

@jukkar
Copy link
Member

jukkar commented Aug 21, 2024

Maybe before returning from recvmsg, we should go through all the cmsg buffer and compute the sum of the different control data length.

Yes, that is missing. The original code had only support for one ancillary message, and the caller should have been allocating space for it, but now there could be two so we should update the controllen correctly.

@jukkar jukkar dismissed their stale review August 21, 2024 18:25

The fix needs more work

@axelnxp
Copy link
Contributor

axelnxp commented Aug 21, 2024

Maybe before returning from recvmsg, we should go through all the cmsg buffer and compute the sum of the different control data length.

Yes, that is missing. The original code had only support for one ancillary message, and the caller should have been allocating space for it, but now there could be two so we should update the controllen correctly.

I'll cook up something in my tomorrow, I got a rough patch which makes things pass and should correctly update the controllen, will keep you updated.

@axelnxp axelnxp force-pushed the fix/77303/recvmsg-not-updating-msg_controllen branch from 01b8f8b to 8be4ab7 Compare August 22, 2024 09:03
@axelnxp
Copy link
Contributor

axelnxp commented Aug 22, 2024

Fix updated, I updated insert_pktinfo to compute the proper msg_controllen when control data are added to the buffer.
I chose to implement this in this function to take advantage of the for loop which searches for available space in the buffer, this should be more optimized than adding the same loop after calling add_pktinfo in recvmsg.

With this version of the fix, the unit test are passing on my side, will get the full confirmation with the PR checks.

awojasinski
awojasinski previously approved these changes Aug 22, 2024
Copy link
Collaborator

@awojasinski awojasinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix compliance check errors

@axelnxp axelnxp force-pushed the fix/77303/recvmsg-not-updating-msg_controllen branch 2 times, most recently from 02a7a32 to 3acae5c Compare August 22, 2024 09:49
@axelnxp
Copy link
Contributor

axelnxp commented Aug 22, 2024

In last push I fixed the compliance checks and also pushed an update I forgot to add in the patch (the test would fail without it)

Copy link
Member

@jukkar jukkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

jukkar
jukkar previously approved these changes Aug 22, 2024
awojasinski
awojasinski previously approved these changes Aug 22, 2024
@@ -1572,21 +1580,28 @@ static inline ssize_t zsock_recv_dgram(struct net_context *ctx,
if (msg != NULL) {
if (msg->msg_control != NULL) {
if (msg->msg_controllen > 0) {
bool clear_controllen = true;
bool clear_controllen = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after some more thought, I think this clear_controllen logic is not working well.
If we have both TIMESTAMPING and RECV_PKTINFO enabled, we might end up in some corner cases when we clear msg_controllen even if we shouldn't.
The only safe way to implement this seems to compute the control data length in this function

}
} else {
clear_controllen = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as an example for my previous comment, if we have both timestamping and pkt_info enabled, we can have a case where we add timestamping info, then there are no pkt_info to add so we go to this branch and we would clear the control data, including the timestamping.

@axelnxp axelnxp dismissed stale reviews from awojasinski and jukkar via 3a6827d August 22, 2024 13:55
@axelnxp axelnxp force-pushed the fix/77303/recvmsg-not-updating-msg_controllen branch from 3acae5c to 3a6827d Compare August 22, 2024 13:55
@axelnxp
Copy link
Contributor

axelnxp commented Aug 22, 2024

In the last push, I reworked the whole fix and now I think it covers all cases.
The idea was to handle the following cases:

  • Combination of TIMESTAMPING and RECV_PKTINFO enabled/disabled
  • Combination of TIMESTAMPING and RECV_PKTINFO present or not in the context
  • Pre-exisiting control data in the control data buffer. The recvmsg man page doesn't exclude cases where the user reuses the same control data buffer between calls to recvmsg, so in those case, msg_controllen must be updated while taking into account pre-existing data in the buffer

To do so, I implemented next_context_is_timestamping_set to use the same logic than recv_pktinfo (please let me know if this is valid). This avoids calling add_timestamping which would return -ENOTSUP in some cases, but this wasn't handled correctly (we couldn't know if control_data was added or not).

I also implemented update_msg_controllen as an helper. This function goes through the control data buffer to compute the total length.
To me it was the only way to handle cases where the user re-uses the same control data buffer between calls to recvmsg, meaning if recvmsg doesn't add control data, then you can't set msg_controllen to 0 if control data are already present in the buffer, so you have to iterate through each data anyway.

@axelnxp axelnxp force-pushed the fix/77303/recvmsg-not-updating-msg_controllen branch from 3a6827d to 86716c0 Compare August 22, 2024 14:14
According to recvmsg man page, msg_controllen should be set to the sum
of the length of all control messages in the buffer.
This is missing from the current recvmsg implementation.
This commit aims to fix this by updating msg_controllen each time control
data are added to the buffer.
This commit also fixes cases where the msg_controllen is cleared
incorrectly.

Fixes zephyrproject-rtos#77303

Signed-off-by: Axel Le Bourhis <axel.lebourhis@nxp.com>
@axelnxp axelnxp force-pushed the fix/77303/recvmsg-not-updating-msg_controllen branch from 86716c0 to 7d3d37b Compare August 22, 2024 14:26
@jukkar
Copy link
Member

jukkar commented Aug 22, 2024

To me it was the only way to handle cases where the user re-uses the same control data buffer between calls to recvmsg, meaning if recvmsg doesn't add control data, then you can't set msg_controllen to 0 if control data are already present in the buffer, so you have to iterate through each data anyway.

I don't think this is a valid use case, what I mean by that is that the control data is always tied to the data that we just received, so the user should read all the control data after each recvmsg because it would be useless in the next recvmsg call.

@axelnxp
Copy link
Contributor

axelnxp commented Aug 22, 2024

To me it was the only way to handle cases where the user re-uses the same control data buffer between calls to recvmsg, meaning if recvmsg doesn't add control data, then you can't set msg_controllen to 0 if control data are already present in the buffer, so you have to iterate through each data anyway.

I don't think this is a valid use case, what I mean by that is that the control data is always tied to the data that we just received, so the user should read all the control data after each recvmsg because it would be useless in the next recvmsg call.

Ok that makes sense. But who is responsible to clear the control data buffer ? The user or recvmsg ? Couldn't find the answer.

@jukkar
Copy link
Member

jukkar commented Aug 22, 2024

But who is responsible to clear the control data buffer ? The user or recvmsg ? Couldn't find the answer.

I think it can be only the caller, it only knows what to do with the data. So if we assume that, then your code is working fine by not overriding the control data.

@axelnxp
Copy link
Contributor

axelnxp commented Aug 22, 2024

I think it can be only the caller, it only knows what to do with the data. So if we assume that, then your code is working fine by not overriding the control data.

Then I think we need to update the unit tests as the control data buffer is not always cleared between calls to recvmsg.

Currently, the test doesn't clear the control data buffer before
calling recvmsg. This leads to recvmsg being unable to add the new
control data, which corresponds to the current received data.
This commit aims to clear the control data buffer to match most use
cases, when the control data buffer is empty before calling recvmsg.

Signed-off-by: Axel Le Bourhis <axel.lebourhis@nxp.com>
@jukkar jukkar requested a review from awojasinski August 23, 2024 11:16
@dleach02 dleach02 added the backport v3.7-branch Request backport to the v3.7-branch label Aug 23, 2024
@fabiobaltieri fabiobaltieri merged commit 179c85a into zephyrproject-rtos:main Aug 23, 2024
26 checks passed
@axelnxp axelnxp deleted the fix/77303/recvmsg-not-updating-msg_controllen branch August 23, 2024 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Networking area: Sockets Networking sockets backport v3.7-branch Request backport to the v3.7-branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

net: socket: recvmsg() doesn't update msg_controllen accordingly
8 participants