Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys/net/nanocoap: improve coap_build_reply() #21155

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

maribu
Copy link
Member

@maribu maribu commented Jan 22, 2025

Contribution description

  • The responsibility for handling matching CoAP No-Response Options has been split:
    • coap_build_reply() only needs to report this and return -ECANCLED
    • coap_handle_req() does generate the empty ACK is needed.
      ==> As a result, writing CoAP request handlers correctly becomes a lost easier. Correct error handling to be present is now sufficient for correct handling of No-Response options.
      ==> This change is backward compatible with existing code.
  • The API doc has been cleaned up and straightened

Testing procedure

  • CoAP reply handlers that behaved correctly before should still behave correctly now
  • CoAP reply handlers that did not behave correctly before in presence of No-Response options should now behave correctly, if the had correct error handling

(CoAP reply handlers that did not get error handling correctly will stay broken.)

Issues/PRs references

None

@maribu maribu added Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Area: doc Area: Documentation Impact: minor The PR is small in size and might only require a quick look of a knowledgeable reviewer CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Jan 22, 2025
@maribu maribu requested review from benpicco and mguetschow January 22, 2025 21:01
@github-actions github-actions bot added Area: network Area: Networking Area: sys Area: System and removed Area: doc Area: Documentation labels Jan 22, 2025
@riot-ci
Copy link

riot-ci commented Jan 22, 2025

Murdock results

FAILED

e5d85cd fixup! fixup! sys/net/nanocoap: improve coap_build_reply

Artifacts

Copy link
Contributor

@mguetschow mguetschow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice start! Some suggestions below.

sys/include/net/nanocoap.h Outdated Show resolved Hide resolved
sys/include/net/nanocoap.h Outdated Show resolved Hide resolved
sys/include/net/nanocoap.h Outdated Show resolved Hide resolved
sys/include/net/nanocoap.h Outdated Show resolved Hide resolved
sys/include/net/nanocoap.h Show resolved Hide resolved
sys/include/net/nanocoap.h Outdated Show resolved Hide resolved
sys/include/net/nanocoap.h Show resolved Hide resolved
@maribu maribu force-pushed the sys/net/nanocoap/coap_build_reply branch 3 times, most recently from e2464a0 to 58ca598 Compare January 23, 2025 10:45
Using a constant is easier than explaining where the magic 1 came from
in size estimations.
@maribu maribu force-pushed the sys/net/nanocoap/coap_build_reply branch 2 times, most recently from 7141f4f to 38397bc Compare January 23, 2025 17:43
@maribu maribu force-pushed the sys/net/nanocoap/coap_build_reply branch from 38397bc to 526310d Compare January 24, 2025 15:24
@github-actions github-actions bot added the Area: CoAP Area: Constrained Application Protocol implementations label Jan 24, 2025
@maribu maribu changed the title sys/net/nanocoap: improve doc for coap_build_reply() sys/net/nanocoap: improve coap_build_reply() Jan 24, 2025
@maribu maribu added the Process: API change Integration Process: PR contains or issue proposes an API change. Should be handled with care. label Jan 24, 2025
- The responsibility for handling matching CoAP No-Response Options
  has been split:
    - `coap_build_reply()` only needs to report this and return
      `-ECANCLED`
    - `coap_handle_req()` does generate the empty ACK is needed.
  ==> As a result, writing CoAP request handlers correctly becomes a
      lost easier. Correct error handling to be present is now
      sufficient for correct handling of No-Response options.
  ==> This change is backward compatible with existing code.
- The API doc has been cleaned up and straightened

Co-authored-by: mguetschow <[email protected]>
@maribu maribu force-pushed the sys/net/nanocoap/coap_build_reply branch from 526310d to d1da109 Compare January 24, 2025 15:32
@maribu
Copy link
Member Author

maribu commented Jan 24, 2025

I changed the API for handling with the No-Response case, so that less foot-shooting should occur.

The API change is not breaking:

  • Everything that did work correctly before will keep working correctly
  • Some handlers that were not working correctly (e.g. shot in their foot in presence of No-Response Options in the request) will now work
  • Response handlers that did not check the return value of coap_build_reply() and pass it on were broken before and stay broken

@maribu maribu requested a review from chrysn January 28, 2025 07:35
@maribu
Copy link
Member Author

maribu commented Jan 28, 2025

Let me try to explain the motivation why I believe changing the API is required. The current API does return 0 or 4 in case a No-Response Option in the request matched the class of the reply status code, depending on whether the request was a NON (no reply at all needed) or a CON (an empty ACK is still needed).

The API doc (in master) says:

The caller may just continue populating the payload (the space was checked to suffice), but may also skip that needless step if the returned length is less than the requested payload length.

The condition that payload_len (or more correctly called max_data_len in this PR) being larger than the returned value on matching No-Response Option does not always hold: E.g. if we plan to add no CoAP-Options, a Payload Marker, and a single byte of payload data we will request 2, but the return value will be 4 on matching No-Response Option.

For figuring out the position where to continue writing data to, we usually do:

    size_t max_data_len = COAP_OPT_FOO_MAX_LEN + COAP_OPT_BAR_MAX_LEN
                        + COAP_PAYLOAD_MARKER_SIZE + payload_len;
    ssize_t hdr_len = coap_build_reply(pkt, COAP_CODE_CONTENT, buf, len, max_data_len);
    if (hdr_len > max_data_len) {
        return hdr_len;
    }
    /* per API doc: return value is actually written header length + max_data_en */
    hdr_len -= max_data_len;
    uint8_t *pos = response_buf + hdr_len;

The intuitive alternative to just subtracting max_data_len from the return value would be:

    uint8_t *pos = response_buf + sizeof(coap_hdr_t) + coap_get_token_length(request_pkt);

However, this would work only when support for the Extended Token Length RFC is not enabled. The correct way would actually be parsing back the header written:

    uint8_t *pos = response_buf + coap_hdr_len((const coap_hdr_t *)response_buf);

This however would yet again add another builtin assumption that the format is CoAP over UDP. And there still is a footgun: Because the return value was larger than the requested payload_len (or more correctly max_data_len), the caller may still assume there is enough space to generate the response. It may then also write less data than originally requested, because e.g. it requested space for 3 bytes (1 payload marker and 2 bytes of payload as worst case estimation), but actually wrote only 1 bytes of payload. In this case, it cannot just pass through the return value of coap_build_reply() as was assumed, but has to computed a new length: Either by subtracting the diff between the estimation of payload_len and what was actually written, or by computing the length as difference between uint8_t *pos and uint8_t *response_buf. The latter would just ignore the No-Response option, which would be suboptimal but OK spec wise. The former would send the first 3 bytes of the empty ACK.

IMO this API has so many different ways to shoot your foot that just returning an error on matching No-Response Option and letting the CoAP Server handle that error by ignoring it (request was a NON) or sending an empty ACK (request was a CON) is a lot more sane.

Specifically: The API doc already states, that coap_build_reply() returns -NOSPC when the size check did not match and <0 on other errors. So callers will handle this case already and the API change would just fix all the footguns, without breaking CoAP handlers that at least implemented the error check correctly.

@maribu maribu added the Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) label Jan 28, 2025
@maribu
Copy link
Member Author

maribu commented Jan 28, 2025

With the API in master being such a minefield that I would bet 90% of the implemented CoAP resource handlers implemented do not navigate that minefield safely, I'd like to call this PR a bug fix.

Because of that, I'd also like to interest @MrKevinWeiss in this PR with regard to whether we should backport this.

if (type != COAP_TYPE_ACK) {
return 0;
}
return -ECANCELED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation of coap_build_reply_header() (and coap_reply_simple() which makes use of this) will now also need an update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: CoAP Area: Constrained Application Protocol implementations Area: network Area: Networking Area: sys Area: System CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Impact: minor The PR is small in size and might only require a quick look of a knowledgeable reviewer Process: API change Integration Process: PR contains or issue proposes an API change. Should be handled with care. Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants