Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys/net/application_layer/gcoap: fix Observe notifications correlation #20684

Merged

Conversation

fabian18
Copy link
Contributor

@fabian18 fabian18 commented May 22, 2024

Contribution description

When a CONfirmable CoAP request with an Observe option is sent, the sending client has to store the CoAP token to match multiple responses from the server sending Observe notifications.
The gcoap memo is not cleared when the 2.xx response for the client registration contains an Observe option.

if (memo->send_limit >= 0) { /* if confirmable */
*memo->msg.data.pdu_buf = 0; /* clear resend PDU buffer */
}
/* The memo must be kept if the response is an observe notification.
* Non-2.xx notifications indicate that the associated observe entry
* was removed on the server side. Then also free the memo here. */
if (!observe_notification || (code_class != COAP_CLASS_SUCCESS)) {
/* setting the state to unused frees (drops) the memo entry */
memo->state = GCOAP_MEMO_UNUSED;
}

For a confirmable request with a token length >0, the problem happens right before, where the resend buffer is released by actually overwriting the first byte of the request header in pdu_buf with 0. The first byte contains the token length which is required to match in:

static gcoap_request_memo_t* _find_req_memo_by_token(const sock_udp_ep_t *remote,
const uint8_t *token, size_t tkl)
{
/* no need to initialize struct; we only care about buffer contents below */
coap_pkt_t memo_pdu_data;
coap_pkt_t *memo_pdu = &memo_pdu_data;
for (int i = 0; i < CONFIG_GCOAP_REQ_WAITING_MAX; i++) {
if (_coap_state.open_reqs[i].state == GCOAP_MEMO_UNUSED) {
continue;
}
gcoap_request_memo_t *memo = &_coap_state.open_reqs[i];
memo_pdu->hdr = gcoap_request_memo_get_hdr(memo);
if (coap_get_token_len(memo_pdu) == tkl) {
if ((memcmp(token, coap_get_token(memo_pdu), tkl) == 0)
&& (sock_udp_ep_equal(&memo->remote_ep, remote)
/* Multicast addresses are not considered in matching responses */
|| sock_udp_ep_is_multicast(&memo->remote_ep)
)) {
return memo;
}
}
}
return NULL;
}

There, the overwritten request header is accessed in gcoap_request_memo_get_hdr().

RIOT/sys/include/net/gcoap.h

Lines 1182 to 1190 in 3f41494

static inline coap_hdr_t *gcoap_request_memo_get_hdr(const gcoap_request_memo_t *memo)
{
if (memo->send_limit == GCOAP_SEND_LIMIT_NON) {
return (coap_hdr_t *)&memo->msg.hdr_buf[0];
}
else {
return (coap_hdr_t *)memo->msg.data.pdu_buf;
}
}

Solution:

When a response is received, the retransmission buffer is released, but before the header is copied to hdr_buf in:

struct gcoap_request_memo {
unsigned state; /**< State of this memo, a GCOAP_MEMO... */
int send_limit; /**< Remaining resends, 0 if none;
GCOAP_SEND_LIMIT_NON if non-confirmable */
union {
uint8_t hdr_buf[GCOAP_HEADER_MAXLEN];
/**< Copy of PDU header, if no resends */
gcoap_resend_t data; /**< Endpoint and PDU buffer, for resend */
} msg; /**< Request message data; if confirmable,
supports resending message */
sock_udp_ep_t remote_ep; /**< Remote endpoint */
gcoap_resp_handler_t resp_handler; /**< Callback for the response */
void *context; /**< ptr to user defined context data */
event_timeout_t resp_evt_tmout; /**< Limits wait for response */
event_callback_t resp_tmout_cb; /**< Callback for response timeout */
gcoap_socket_t socket; /**< Transport type to remote endpoint */
#if IS_USED(MODULE_NANOCOAP_CACHE) || DOXYGEN
/**
* @brief Cache key for the request
*
* @note Only available with module ['nanocoap_cache'](@ref net_nanocoap_cache)
*/
uint8_t cache_key[CONFIG_NANOCOAP_CACHE_KEY_LENGTH];
#endif
};

Testing procedure

Internal project. CoAP Observe is not so well tested. I hope the explanation makes sense.
In examples/gcoap/client.c:

if (_send(&buf[0], len, rem, NULL, tl) <= 0) {
puts("gcoap_cli: msg send failed");
}
else {
if (observe) {
/* on successful observe request, store that this node is
* observing / not observing anymore */
observing = obs_value == COAP_OBS_REGISTER;
}
/* send Observe notification for /cli/stats */
notify_observers();
}

the notification is sent right after the registration was sent. I have not tried that.

Issues/PRs references

Depends on and includes #20711

@github-actions github-actions bot added Area: network Area: Networking Area: CoAP Area: Constrained Application Protocol implementations Area: sys Area: System labels May 22, 2024
@fabian18 fabian18 added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed Area: network Area: Networking Area: CoAP Area: Constrained Application Protocol implementations Area: sys Area: System labels May 22, 2024
@benpicco benpicco requested a review from chrysn May 22, 2024 14:52
@riot-ci
Copy link

riot-ci commented May 22, 2024

Murdock results

✔️ PASSED

21a46dc examples/cord_lc: adapt Makefile.ci

Success Failures Total Runtime
10193 0 10193 14m:07s

Artifacts

@fabian18
Copy link
Contributor Author

Of course the Observe notification must be sent from the same address as well ...
I am at it to add it here.

size_t gcoap_obs_send(const uint8_t *buf, size_t len,
const coap_resource_t *resource)
{
gcoap_observe_memo_t *memo = NULL;
_find_obs_memo_resource(&memo, resource);
if (memo) {
ssize_t bytes = _tl_send(&memo->socket, buf, len, memo->observer, NULL);
return (size_t)((bytes > 0) ? bytes : 0);
}
else {
return 0;
}
}

@github-actions github-actions bot added Area: network Area: Networking Area: CoAP Area: Constrained Application Protocol implementations Area: sys Area: System labels May 30, 2024
@fabian18 fabian18 force-pushed the pr/fix_gcoap_observe_response_correlation branch 2 times, most recently from 977ce56 to b7ce179 Compare May 31, 2024 10:10
@github-actions github-actions bot added the Area: examples Area: Example Applications label May 31, 2024
@fabian18
Copy link
Contributor Author

Depends on #20711 now due to aux tx local supply in _handle_req()

@fabian18 fabian18 force-pushed the pr/fix_gcoap_observe_response_correlation branch 2 times, most recently from 40bbd42 to 81fd3e7 Compare May 31, 2024 10:47
@fabian18
Copy link
Contributor Author

fabian18 commented Jun 5, 2024

for better testing I added an /rtc resource to the gcoap example.
Use our tapsetup.sh and observe the notifications in wireshark for two native instances.

@fabian18 fabian18 force-pushed the pr/fix_gcoap_observe_response_correlation branch from cd72b36 to 886b1b0 Compare June 7, 2024 08:59
@benpicco benpicco added the Process: API change Integration Process: PR contains or issue proposes an API change. Should be handled with care. label Jul 29, 2024
Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A rebase would be good to only have the changes from this PR visible

examples/gcoap/server.c Outdated Show resolved Hide resolved
sys/include/net/gcoap.h Outdated Show resolved Hide resolved
examples/gcoap/server.c Outdated Show resolved Hide resolved
@benpicco benpicco removed the Process: API change Integration Process: PR contains or issue proposes an API change. Should be handled with care. label Jul 29, 2024
@fabian18 fabian18 force-pushed the pr/fix_gcoap_observe_response_correlation branch from 886b1b0 to 9fe6dfe Compare August 1, 2024 11:04
@fabian18
Copy link
Contributor Author

Only one observer per resource is supported:

 * A CoAP client may register for Observe notifications for any resource that
 * an application has registered with gcoap. An application does not need to
 * take any action to support Observe client registration. However, gcoap
 * limits registration for a given resource to a _single_ observer.

Because, given a request, identifying a resource we look for only one gcoap_observe_memo_t that has a pointer to that resource. See

_find_obs_memo_resource(&resource_memo, resource);

@fabian18
Copy link
Contributor Author

Maybe there is a CoAP way to tell the client, that Observe cannot be honored for the reason that another client is observing the resource currently.

examples/gcoap/client.c Outdated Show resolved Hide resolved
@benpicco
Copy link
Contributor

Please squash directly

@fabian18
Copy link
Contributor Author

Maybe there is a CoAP way to tell the client, that Observe cannot be honored for the reason that another client is observing the resource currently.

From RFC7641

A server that is unable or unwilling to add a new entry to the list
of observers of a resource MAY silently ignore the registration
request and process the GET request as usual. The resulting response
MUST NOT include an Observe Option, the absence of which signals to
the client that it will not be notified of changes to the resource
and, e.g., needs to poll the resource for its state instead.

@fabian18 fabian18 force-pushed the pr/fix_gcoap_observe_response_correlation branch from 33a979c to c898d75 Compare August 21, 2024 16:03
@benpicco benpicco added this pull request to the merge queue Aug 21, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 21, 2024
@benpicco
Copy link
Contributor

Looks like Makefile.ci needs an update - make generate-Makefile.ci should do

@fabian18
Copy link
Contributor Author

I executed make -C examples/gcoap generate-Makefile.ci but actually it only purged boards from the list because I guess the builds are failing in the Docker container due to permission issue:
Example:

Launching build container using image "sha256:f5951bc41dfface6cac869181d703e62cbdd3b7976b0946130a38f2e658000b3".
docker run --rm --tty --user $(id -u) -v '/usr/share/zoneinfo/Europe/Berlin:/etc/localtime:ro' -v '/home/fabian.huessler@ml-pa.loc/RIOT:/data/riotbuild/riotbase:delegated' -v '/home/fabian.huessler@ml-pa.loc/.cargo/registry:/data/riotbuild/.cargo/registry:delegated' -v '/home/fabian.huessler@ml-pa.loc/.cargo/git:/data/riotbuild/.cargo/git:delegated' -e 'RIOTBASE=/data/riotbuild/riotbase' -e 'CCACHE_BASEDIR=/data/riotbuild/riotbase' -e 'BUILD_DIR=/data/riotbuild/riotbase/build' -e 'BUILD_IN_DOCKER=/data/riotbuild/riotbase/examples/gcoap/1' -e 'RIOTPROJECT=/data/riotbuild/riotbase' -e 'RIOTCPU=/data/riotbuild/riotbase/cpu' -e 'RIOTBOARD=/data/riotbuild/riotbase/boards' -e 'RIOTMAKE=/data/riotbuild/riotbase/makefiles'      -e 'BOARD=stm32f469i-disco' -e 'RIOT_CI_BUILD=1' -e 'BOARD=stm32f469i-disco' -e 'RIOT_CI_BUILD=1' -e 'DISABLE_MODULE=' -e 'DEFAULT_MODULE=' -e 'FEATURES_REQUIRED=' -e 'FEATURES_BLACKLIST=' -e 'FEATURES_OPTIONAL=periph_rtc' -e 'USEMODULE=auto_init_gnrc_netif fmt gcoap gnrc_icmpv6_echo gnrc_ipv6_default netdev_default netutils od ps random shell shell_cmds_default uri_parser' -e 'USEPKG=' -e 'DISABLE_MODULE=' -e 'DEFAULT_MODULE=' -e 'FEATURES_REQUIRED=' -e 'FEATURES_BLACKLIST=' -e 'FEATURES_OPTIONAL=periph_rtc' -e 'USEMODULE=auto_init_gnrc_netif fmt gcoap gnrc_icmpv6_echo gnrc_ipv6_default netdev_default netutils od ps random shell shell_cmds_default uri_parser' -e 'USEPKG='  -w '/data/riotbuild/riotbase/examples/gcoap/' 'sha256:f5951bc41dfface6cac869181d703e62cbdd3b7976b0946130a38f2e658000b3' make 'BOARD=stm32f469i-disco' 'BOARD=stm32f469i-disco'   -j16  all
Building application "gcoap_example" for "stm32f469i-disco" with CPU "stm32".

Traceback (most recent call last):
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 147, in <module>
    main(PARSER.parse_args())
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 139, in main
    generate_vectors(context)
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 132, in generate_vectors
    with open(dest_file, "w") as f_dest:
PermissionError: [Errno 13] Permission denied: '/data/riotbuild/riotbase/cpu/stm32/vectors/STM32F469xx.c'
make: *** [/data/riotbuild/riotbase/cpu/stm32/Makefile.include:109: /data/riotbuild/riotbase/cpu/stm32/vectors/STM32F469xx.c] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 155, in <module>
    main(PARSER.parse_args())
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 147, in main
    generate_irqs(context)
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 130, in generate_irqs
    with open(dest_file, "w") as f_dest:
PermissionError: [Errno 13] Permission denied: '/data/riotbuild/riotbase/cpu/stm32/include/irqs/f4/irqs.h'
make: *** [/data/riotbuild/riotbase/cpu/stm32/Makefile.include:123: /data/riotbuild/riotbase/cpu/stm32/include/irqs/f4/irqs.h] Error 1
make[1]: *** [/home/fabian.huessler@ml-pa.loc/RIOT/makefiles/docker.inc.mk:391: ..in-docker-container] Error 2
make[1]: Leaving directory '/home/fabian.huessler@ml-pa.loc/RIOT/examples/gcoap'
stm32f4discovery                        build failed

@fabian18
Copy link
Contributor Author

I executed make -C examples/gcoap generate-Makefile.ci but actually it only purged boards from the list because I guess the builds are failing in the Docker container due to permission issue: Example:

Launching build container using image "sha256:f5951bc41dfface6cac869181d703e62cbdd3b7976b0946130a38f2e658000b3".
docker run --rm --tty --user $(id -u) -v '/usr/share/zoneinfo/Europe/Berlin:/etc/localtime:ro' -v '/home/fabian.huessler@ml-pa.loc/RIOT:/data/riotbuild/riotbase:delegated' -v '/home/fabian.huessler@ml-pa.loc/.cargo/registry:/data/riotbuild/.cargo/registry:delegated' -v '/home/fabian.huessler@ml-pa.loc/.cargo/git:/data/riotbuild/.cargo/git:delegated' -e 'RIOTBASE=/data/riotbuild/riotbase' -e 'CCACHE_BASEDIR=/data/riotbuild/riotbase' -e 'BUILD_DIR=/data/riotbuild/riotbase/build' -e 'BUILD_IN_DOCKER=/data/riotbuild/riotbase/examples/gcoap/1' -e 'RIOTPROJECT=/data/riotbuild/riotbase' -e 'RIOTCPU=/data/riotbuild/riotbase/cpu' -e 'RIOTBOARD=/data/riotbuild/riotbase/boards' -e 'RIOTMAKE=/data/riotbuild/riotbase/makefiles'      -e 'BOARD=stm32f469i-disco' -e 'RIOT_CI_BUILD=1' -e 'BOARD=stm32f469i-disco' -e 'RIOT_CI_BUILD=1' -e 'DISABLE_MODULE=' -e 'DEFAULT_MODULE=' -e 'FEATURES_REQUIRED=' -e 'FEATURES_BLACKLIST=' -e 'FEATURES_OPTIONAL=periph_rtc' -e 'USEMODULE=auto_init_gnrc_netif fmt gcoap gnrc_icmpv6_echo gnrc_ipv6_default netdev_default netutils od ps random shell shell_cmds_default uri_parser' -e 'USEPKG=' -e 'DISABLE_MODULE=' -e 'DEFAULT_MODULE=' -e 'FEATURES_REQUIRED=' -e 'FEATURES_BLACKLIST=' -e 'FEATURES_OPTIONAL=periph_rtc' -e 'USEMODULE=auto_init_gnrc_netif fmt gcoap gnrc_icmpv6_echo gnrc_ipv6_default netdev_default netutils od ps random shell shell_cmds_default uri_parser' -e 'USEPKG='  -w '/data/riotbuild/riotbase/examples/gcoap/' 'sha256:f5951bc41dfface6cac869181d703e62cbdd3b7976b0946130a38f2e658000b3' make 'BOARD=stm32f469i-disco' 'BOARD=stm32f469i-disco'   -j16  all
Building application "gcoap_example" for "stm32f469i-disco" with CPU "stm32".

Traceback (most recent call last):
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 147, in <module>
    main(PARSER.parse_args())
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 139, in main
    generate_vectors(context)
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 132, in generate_vectors
    with open(dest_file, "w") as f_dest:
PermissionError: [Errno 13] Permission denied: '/data/riotbuild/riotbase/cpu/stm32/vectors/STM32F469xx.c'
make: *** [/data/riotbuild/riotbase/cpu/stm32/Makefile.include:109: /data/riotbuild/riotbase/cpu/stm32/vectors/STM32F469xx.c] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 155, in <module>
    main(PARSER.parse_args())
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 147, in main
    generate_irqs(context)
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 130, in generate_irqs
    with open(dest_file, "w") as f_dest:
PermissionError: [Errno 13] Permission denied: '/data/riotbuild/riotbase/cpu/stm32/include/irqs/f4/irqs.h'
make: *** [/data/riotbuild/riotbase/cpu/stm32/Makefile.include:123: /data/riotbuild/riotbase/cpu/stm32/include/irqs/f4/irqs.h] Error 1
make[1]: *** [/home/fabian.huessler@ml-pa.loc/RIOT/makefiles/docker.inc.mk:391: ..in-docker-container] Error 2
make[1]: Leaving directory '/home/fabian.huessler@ml-pa.loc/RIOT/examples/gcoap'
stm32f4discovery                        build failed

Had to delete cpu/stm32/vectors/* and cpu/stm32/include/irqs/*

@benpicco benpicco added this pull request to the merge queue Aug 25, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 25, 2024
@benpicco benpicco added this pull request to the merge queue Aug 26, 2024
Merged via the queue into RIOT-OS:master with commit 1626919 Aug 26, 2024
25 checks passed
@benpicco benpicco added this to the Release 2024.10 milestone Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: CoAP Area: Constrained Application Protocol implementations Area: examples Area: Example Applications Area: network Area: Networking Area: sys Area: System CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants