-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nanocoap: follow-up fixes #17950
nanocoap: follow-up fixes #17950
Conversation
76be802
to
aadb304
Compare
Sorry that this has grown larger again, @fjmolinas discovered that retransmissions were broken, which prompted me to dig through the code and clean it up some more. Retransmissions should now work properly, but it seems like we might have discovered a bug in |
aadb304
to
ba15a6e
Compare
Indeed I had come across issues when running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR has grown, and I would like some more context for some of the commits, and I think it would be best to split the improvements from the follow-up fixes to #17474.
- d96ccaf: I understand the reasoning, but I would like to know what prompted you to add this (what success code other than 205 where you getting?)
- 25c9b5a: I think this should be split.
- 75787a7: this one also is better as its own PR
- ba15a6e: can we split was is cosmetic and functional?
- 90ca43d: commit message does not seem to align with the commit itself, related to the commit I only see an assert, while there is some new logic to re-try reception without a re-request.
pktpos += coap_opt_put_uri_pathquery(pktpos, &lastonum, path); | ||
pktpos += coap_opt_put_uint(pktpos, lastonum, COAP_OPT_BLOCK2, | ||
(num << 4) | blksize); | ||
buf += coap_build_hdr(pkt.hdr, COAP_TYPE_CON, NULL, 0, COAP_METHOD_GET, _get_id()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should actually be setting the token (which is 0 right now) to maybe num
and then check on that instead of else if (coap_get_id(pkt) != id)
, would that be correct @maribu?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that a zero length token is still a valid token. The standard asks for 32 bit of entropy in the Token if a device is connected to the Internet, though:
A client that is connected to the general Internet SHOULD use at least 32 bits of randomness, keeping in mind that not being directly connected to the Internet is not necessarily sufficient protection against spoofing. (Note that the Message ID adds little in protection as it is usually sequentially assigned, i.e., guessable, and can be circumvented by spoofing a separate response.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that a zero length token is still a valid token
Yep
The standard asks for 32 bit of entropy in the Token if a device is connected to the Internet, though
In that case, the default token length should at least be 4bytes, thinking of gcoap default
Lines 482 to 490 in e64b1bd
/** | |
* @ingroup net_gcoap_conf | |
* @brief Length in bytes for a token | |
* | |
* Value must be in the range 0 to @ref GCOAP_TOKENLEN_MAX. | |
*/ | |
#ifndef CONFIG_GCOAP_TOKENLEN | |
#define CONFIG_GCOAP_TOKENLEN (2) | |
#endif |
Although for this specific use-case, I guess if a block option is set it could check that the blknum matches what is expected, keeping the token field to 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the individual blocks require different tokens or is this considered a single transfer, so the same token for all blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be different tokens for each block request, but I think you could leave it empty and check in the nanocoap_sock_request_cb
call coap_get_block2(pkt, &block2);
on the request and response, and if its there is a block option and the blknum
does not match retry reception. But this optimization is maybe a bit of a hack to save the token bytes...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is up to the client and needs to aid the client to identify the right response. With NSTART=1 only duplicates (trivial to detect with blockwise) and spoofed responses are needed to be told apart from the right one. In case of UDP transport, one cannot really defend against spoofed responses, but having same entropy in the Token at least requires the adversary to be able to listen to the requests. The standard specifically says 32 bits of randomness and not reusing Tokens for the same source and destination endpoints is what SHOULD be done.
The common strategy is to just use a properly seeded PRNG for generating the token and hope for the best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we don't include a token yet in the request, how about we move this to a follow-up PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments regarding the last commits, otherwise now need a rebase.
do they? |
1aa9a93
to
a5acea3
Compare
Ahh... well then maybe that one is fine, but the interval ones are definitively wrong. |
So we don't have to wait a full interval again if we got a wrong message without resend.
a5acea3
to
b20f899
Compare
Uh, which do you mean? |
Ah wait no, I mis-read where the |
I think this is still pending
This is still un-addressed |
I merged the first one with b20f899. The reasoning is that the (user) callback will evaluate the response code and can better decide if the function was a success than the wrapper function. The second one I hoped the added comment could explain. Now the way the function is implemented for GNRC and LWIP, there will never be a second data fragment. This is what enables this (slight ab)use here in the first place: If there were more fragments, we would still need a temporary buffer. |
Thanks for the details @benpicco, I had missed the first part
I think then this 'abuse' probably needs a comment as well. I would even prefer an assert on this in the case a new network stack would behave differently. |
Blockwise still works after all that 😄 tests/nanocoap_cli
|
examples/suit_update PASS
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maribu said I can go ahead on Matrix, there are more follow-ups to come anyway - thank you all for the review and help bringing nanoCoAP forward 🎉 |
Contribution description
NanoCoAP should use random message IDs and randomize the retransmit timeout.
This PR takes care of this.
Testing procedure
Block-wise does still work:
Issues/PRs references