There are several MQTT_ERROR_SEND_BUFFER_IS_FULL issues already but I think I've tracked down a possible root cause for how and why QoS 0 messages can cause it.
I'm writing an embedded app using NuttX (which uses MQTT-C 1.1.5), and if an interrupt arrives then the send call in mqtt_pal_sendall can return EINTR. With just mqtt_sync running on its own thread, the client ends up in MQTT_ERROR_SEND_BUFFER_IS_FULL, but if I call mqtt_sync after mqtt_publish then the client ends up in MQTT_ERROR_SOCKET_ERROR. Note that NuttX does not support SA_RESTART, so there's no way to configure system calls to automatically restart when a signal is received.
If __mqtt_send fails to send a QoS 0 message, then it doesn't remove it from the queue. The error state is set to MQTT_ERROR_SOCKET_ERROR, but if another thread immediately calls mqtt_publish, then the pack_call in the MQTT_CLIENT_TRY_PACK macro fails, because the buffer is full (even though it's only got a QoS 0 message in it). The macro then calls mqtt_mq_clean, but the failed message isn't dropped: because the send failed, the post-send logic never ran, so the QoS 0 message was never moved into the MQTT_QUEUED_COMPLETE state. MQTT_CLIENT_TRY_PACK then tries the pack_call again, which fails again, and so the client ends up in MQTT_ERROR_SEND_BUFFER_IS_FULL.
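The stuck-queue behaviour described above can be reproduced in a small toy model. This is only a sketch: every name in it (toy_queue, toy_pack, toy_clean, toy_send_head, the mark_on_failure flag) is hypothetical and does not mirror MQTT-C's real structures or code paths. It assumes that the clean pass only reclaims messages already marked complete, and that packing retries once after cleaning.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not MQTT-C's real types. */
enum toy_state { TOY_UNSENT, TOY_COMPLETE };

struct toy_msg {
    enum toy_state state;
    size_t size;
};

#define TOY_MAX_MSGS 4

struct toy_queue {
    struct toy_msg msgs[TOY_MAX_MSGS];
    size_t count;          /* number of queued messages */
    size_t capacity_bytes; /* total buffer size */
    size_t used_bytes;     /* bytes held by queued messages */
};

static int toy_is_full(const struct toy_queue *q, size_t size) {
    return q->count == TOY_MAX_MSGS || q->used_bytes + size > q->capacity_bytes;
}

/* Reclaim space from the head of the queue, stopping at the first
 * message that is not marked complete. */
static void toy_clean(struct toy_queue *q) {
    size_t drop = 0;
    while (drop < q->count && q->msgs[drop].state == TOY_COMPLETE) {
        q->used_bytes -= q->msgs[drop].size;
        drop++;
    }
    for (size_t i = drop; i < q->count; i++) {
        q->msgs[i - drop] = q->msgs[i];
    }
    q->count -= drop;
}

/* Queue a message; on a full buffer, clean once and try again.
 * Returns 0 on success, -1 if the buffer is still full. */
static int toy_pack(struct toy_queue *q, size_t size) {
    assert(q != NULL);
    if (toy_is_full(q, size)) {
        toy_clean(q);
        if (toy_is_full(q, size)) {
            return -1;
        }
    }
    q->msgs[q->count].state = TOY_UNSENT;
    q->msgs[q->count].size = size;
    q->count++;
    q->used_bytes += size;
    return 0;
}

/* Simulate sending the head message. When the transport fails, the
 * message is marked complete only if mark_on_failure is set.
 * Returns 0 on success, -1 on a transport failure. */
static int toy_send_head(struct toy_queue *q, int transport_ok, int mark_on_failure) {
    assert(q != NULL);
    if (q->count == 0) {
        return 0;
    }
    if (transport_ok || mark_on_failure) {
        q->msgs[0].state = TOY_COMPLETE;
    }
    return transport_ok ? 0 : -1;
}
```

With a 100-byte buffer, packing an 80-byte message, failing the send with mark_on_failure set to 0, and packing another 80-byte message leaves the second toy_pack returning -1, which corresponds to the buffer-full outcome. Repeating the sequence with mark_on_failure set to 1 lets toy_clean drop the failed message, so the second toy_pack succeeds.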
Adding a call to mqtt_sync immediately after mqtt_publish ensures that the client->error is set to MQTT_ERROR_SOCKET_ERROR after which __mqtt_send will error out early (and then nothing will change it again).
There are probably a couple of fixes that could be made:
- if send fails with EINTR, then treat that as a temporary error (like EAGAIN), so that it will be retried (but see the caveat below),
- if send of a QoS 0 message fails, then mark it as MQTT_QUEUED_COMPLETE (so that mqtt_mq_clean can drop it).
Unfortunately, due to apache/nuttx#669 it's not safe to retry after EINTR on NuttX, because some or all of the data may have already been sent. Also, it looks like calling close on the socket can deadlock the system.
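As a rough illustration of the first fix together with this caveat, the errno handling could be factored into a small classifier. This is a sketch only: classify_send_errno, the send_verdict enum, and the eintr_retry_safe flag are hypothetical names that are not part of MQTT-C. The flag is an assumption standing in for "the platform guarantees that an interrupted send transmitted nothing", which per the caveat above does not hold on NuttX.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical verdicts; not part of MQTT-C. */
enum send_verdict {
    SEND_RETRY_LATER, /* temporary condition, leave the message queued */
    SEND_FATAL        /* report a socket error */
};

/* Classify the errno left by a failed send().
 * eintr_retry_safe: non-zero only if the caller knows that an
 * interrupted send() did not transmit any bytes. */
static enum send_verdict classify_send_errno(int err, int eintr_retry_safe) {
    assert(err != 0); /* only meaningful after send() actually failed */

    /* EAGAIN and EWOULDBLOCK may share a value, so compare with ||
     * rather than using duplicate switch cases. */
    if (err == EAGAIN || err == EWOULDBLOCK) {
        return SEND_RETRY_LATER;
    }
    if (err == EINTR && eintr_retry_safe) {
        return SEND_RETRY_LATER;
    }
    return SEND_FATAL;
}
```

On a platform affected by the caveat, the caller would pass 0 for eintr_retry_safe, so EINTR stays fatal and the second fix (marking the failed QoS 0 message complete) is what keeps the queue from filling up.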
I eventually worked around all of these issues by configuring NuttX to have more active sockets and enabling TCP write buffering. I guess it moves the data to buffers sooner, and then doesn't interrupt the send when the signal arrives.