Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Response Issue in CMUX Mode (IDFGH-12431) #536

Closed
3 tasks done
brhmrsln opened this issue Mar 22, 2024 · 15 comments · Fixed by #540
Closed
3 tasks done

Response Issue in CMUX Mode (IDFGH-12431) #536

brhmrsln opened this issue Mar 22, 2024 · 15 comments · Fixed by #540
Assignees

Comments

@brhmrsln
Copy link

Answers checklist.

  • I have read the documentation for esp-protocols components and the issue is not addressed there.
  • I have updated my esp-protocols branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

Environment:

  • Hardware: ESP32-S3
  • ESP-IDF
  • Module: Quectel MC60 (configured in CMUX mode)
  • Connection: UART

Problem Description:
We are utilizing the Quectel MC60 module in conjunction with the ESP32-S3, employing the ESP-MODEM component for communication. Our application involves connecting to the internet via the MC60's all-in-one solution and periodically requesting NMEA sentences.

Issue:
In CMUX mode, after successfully establishing an internet connection and receiving an IP address, we attempt to fetch GLL NMEA sentences using the command "AT+QGNSSRD" periodically. The MC60 appears to respond correctly with the requested sentence followed by an "OK" acknowledgment. However, In the initial response, we receive only "nmeasentence" without the preceding "OK". Subsequent requests then yield an "OK" response that seems to belong to the previous request, leading to a cycle where each NMEA sentence request is followed by an "OK" from the prior request, effectively causing empty responses and misaligned confirmations.

Observations:
The entire response (including "OK") is present in the UART buffer, suggesting that the issue might be with how the response is processed or retrieved.
We employ esp_modem::dce_commands::generic_get_string to capture the response.
The issue consistently occurs when the ESP system clock is set to either 240 MHz or 160 MHz.
Interestingly, adding a delay at the beginning of the bool DTE::command_cb::process_line function mitigates the issue.
At an 80 MHz ESP system clock frequency, the problem does not occur, and the system behaves as expected, receiving both the NMEA sentence and the "OK" acknowledgment without issues.
Upon inspecting the communication with a logic analyzer on the MC60 side, it is observed that the module transmits the complete response (sentence + "OK") as expected, indicating that the UART transmission from the MC60 is functioning correctly.

Request:
Could you provide insight or a solution to this problem? It appears that system clock frequencies above 80 MHz introduce timing issues with UART communication or processing within the ESP-MODEM component, specifically in the context of receiving complete responses from the MC60 module. The observation with a logic analyzer suggests that the issue lies not in the MC60's transmission but possibly in the reception or processing of the data by the ESP32-S3 or the ESP-MODEM component.

Thank you for your assistance.

@github-actions github-actions bot changed the title Response Issue in CMUX Mode Response Issue in CMUX Mode (IDFGH-12431) Mar 22, 2024
@david-cermak
Copy link
Collaborator

we receive only "nmeasentence" without the preceding "OK".

just for me to understand, you're expecting a response pre-pended by "OK" and trying to parse it using dce_commands::generic_get_string()?
This command looks for the OK as a delimiter at the end of the message, for the first occurrence, so it probably just returns the buffer as is and stops looking for any other OK...

Maybe an alternative approach could be to use dce_commands::at_raw() which checks for an arbitrary pattern and returns the entire buffer (doesn't strip the OK)

@brhmrsln
Copy link
Author

Hi David
Thanks for the quick response.
I am expecting the ‘OK’ at the end of the response. And when I check the logic analyzer it is coming from MC60 as expected. (Sentence+OK). And I also log the uart buffer and the answer is as we expected. (Sentence + OK). But when I check the data inside the dce_commands::generic_get_string()(I log the data) I can only see the sentence not the OK at the end.(it fills the parameter and returns timeout, because the OK is not there)When I reguest second nmea sentence, now the ‘previous OK + Sentence + OK ‘ is found and the response is empty.(As you say when it finds the OK, it returns).
Same setup, somehow when we put delay at the beggining of the dte::command_cb::process_line it works as expected. And also if i configure the esp system clock frequency to 80Mhz, it also works.
I tried 240 Mhz and 160Mhz and in both configuration I get the issue.
I get the issue in CMUX mode, in command mode it works.
I will try the dce_commands::at_raw().
Thank you,
Best regards.

@brhmrsln
Copy link
Author

I try the dce_commands::at_raw(). Issue remains. Here are some outputs:

  • I put log inside the dce_commands::at_raw(), in lambda function. Below is the log of "data" ptr with size "len".
    in_at_raw_buffer_log
  • Inside the dce_commands::at_raw(), remaining data in the uart buffer
    in_at_raw_remaining_uart_buffer_log
  • And the response string we pass it to dce_commands::at_raw()
    out_at_raw_response_str_log
  • Logic analyzer outputs:
    logicAnalyzerSnip
    logicAnalyzerSnipDetailed

*80Mhz working log:
80Mhz_working_example

@brhmrsln
Copy link
Author

And also
1 sec delay at the beggining of the DTE::command_cb::process_line, working log:
1secDelay

@david-cermak
Copy link
Collaborator

Thanks for sharing the log. Is it possible that some command before that failed or timeouted? (and it passed while running on 80MHz?)

I think we can workaround it simply by flushing the Rx buffer before each command. (perhaps this would sound like a solution, because we do not need the data from previous commands, but I'd still like to understand where this OK comes from...)

@brhmrsln
Copy link
Author

The shared log is showing the first nmea request we sent and its response. When I checked the logs, there is no fail or timeout situation:)
Actually OK is in the response. I mean when we request NMEA sentence, module is sending the response as expected (nmea sentence + 'OK'). And we see this in the uart buffer. but when we read DTE buffer, somehow we can only read the sentence but not the OK. It is still in the uart buffer however. Next time when we request, the response is like starting with 'OK'.

With the solution 'flushing the rx buffer', my concern is that we can also flush the internet related data because we are also using the gsm module at the same time and its data is also coming from same uart. (One uart connection to the module and CMUX configuration)
A workaround solution I am implementing for now is sending sync command('AT') after the nmea request just to read the buffer and get rid of the 'OK'. It seems it is working now but not sure what would be the side effects.
Thank you for the help, hope we can find a better solution :)

@david-cermak
Copy link
Collaborator

david-cermak commented Mar 25, 2024

A workaround solution I am implementing for now is sending sync command('AT')

This should work, I don't see any downside other than the effective response time.
Would it wok if you send the sync command just once, upon entering the CMUX mode?

One theory for the extra OK would be that it's a leftover from the initial AT+CMUX=0\r command. Could you please check if applying this solves the problem?

diff --git a/components/esp_modem/src/esp_modem_command_library.cpp b/components/esp_modem/src/esp_modem_command_library.cpp
index 6f06bb7..e491b3f 100644
--- a/components/esp_modem/src/esp_modem_command_library.cpp
+++ b/components/esp_modem/src/esp_modem_command_library.cpp
@@ -373,7 +373,7 @@ command_result send_sms(CommandableIf *t, const std::string &number, const std::
 command_result set_cmux(CommandableIf *t)
 {
     ESP_LOGV(TAG, "%s", __func__ );
-    return generic_command_common(t, "AT+CMUX=0\r");
+    return generic_command_common(t, "AT+CMUX=0\r", 5000);
 }
 
 command_result read_pin(CommandableIf *t, bool &pin_ok)
diff --git a/components/esp_modem/src/esp_modem_dce.cpp b/components/esp_modem/src/esp_modem_dce.cpp
index 55c51f0..cce4684 100644
--- a/components/esp_modem/src/esp_modem_dce.cpp
+++ b/components/esp_modem/src/esp_modem_dce.cpp
@@ -135,7 +135,10 @@ bool DCE_Mode::set_unsafe(DTE *dte, ModuleIf *device, Netif &netif, modem_mode m
         if (mode == modem_mode::DATA_MODE || mode == modem_mode::CMUX_MODE || mode >= modem_mode::CMUX_MANUAL_MODE) {
             return false;
         }
-        device->set_mode(modem_mode::CMUX_MODE);    // switch the device into CMUX mode
+        if (!device->set_mode(modem_mode::CMUX_MODE)) {
+            return false;
+        }
+                                                    // switch the device into CMUX mode
         usleep(100'000);                            // some devices need a few ms to switch
 
         if (!dte->set_mode(modem_mode::CMUX_MODE)) {

@brhmrsln
Copy link
Author

"Would it wok if you send the sync command just once, upon entering the CMUX mode?" -> Does not work. I have to send after every NMEA response we received.

I apply the patch but still have the issue.

I think the problem lies in filling the dte buffer. I mean after we send NMEA request, we are waiting for the response (sentence + OK). This answer is coming from the MC60 as expected. And when I check the uart buffer (uart_read_bytes) I see the response is as we expected. But when I get the response with dce_commands::generic_get_string() or dce_commands::at_raw() I can only read the 'sentence'. The remaining response('OK') is in the uart buffer and will be read after a request.

We are expecting dte buffer to be this (with OK at the end, nothing left in the uart buffer and response string is sentence + OK)
1secDelay

But, dte buffer has only sentence and the expected OK is still in the uart buffer waiting to be read(it should be read and should be in the dte buffer. Below logs are sequential):

in_at_raw_buffer_log
in_at_raw_remaining_uart_buffer_log

The below log i get with uart_read_bytes function of esp:
in_at_raw_remaining_uart_buffer_log

I think the problem is not having an extra 'OK' but not be able to get the whole response from uart buffer to dte buffer.
If we get the whole response, both dce_commands::generic_get_string() and dce_commands::at_raw() is going to work as we expected.

Thank you

@david-cermak
Copy link
Collaborator

I see, thanks for testing. the issue is apparently somewhere else.

Could you please share your sdkconfig (I'm only interested in esp_modem options that are set differently than the defaults). From what I can see, you had ESP_MODEM_CMUX_DEFRAGMENT_PAYLOAD set to n, correct? Otherwise you wouldn't get the line processing lambda called with only the partial response (as I saw that the response itself and the OK message come within one CMUX payload).

If it's the case, could you please try to enable ESP_MODEM_USE_INFLATABLE_BUFFER_IF_NEEDED ? (still doens't explain why it works on 80MHz, when the response apparently comes in two pieces...)

@brhmrsln
Copy link
Author

"From what I can see, you had ESP_MODEM_CMUX_DEFRAGMENT_PAYLOAD set to n, correct?", it is set to 'y'

Here is the related part.

#
# esp-modem
#
CONFIG_ESP_MODEM_CMUX_DEFRAGMENT_PAYLOAD=y
CONFIG_ESP_MODEM_USE_INFLATABLE_BUFFER_IF_NEEDED=y
CONFIG_ESP_MODEM_CMUX_DELAY_AFTER_DLCI_SETUP=1000
CONFIG_ESP_MODEM_CMUX_USE_SHORT_PAYLOADS_ONLY=y
# CONFIG_ESP_MODEM_ADD_CUSTOM_MODULE is not set
# end of esp-modem

And i used various settings :)

When I saw the issue , I thought that the response is coming in two pieces too. But I think it is not the case because in logic analyzer there is no delay between 'Sentence' and 'OK' and uart buffer has the "OK" when we read the 'sentence' from dte buffer.

Here is how I got the log:

command_result at_raw(CommandableIf *t, const std::string &cmd, std::string &out, const std::string &pass, const std::string &fail, int timeout= 500)
  {
    ESP_LOGV(TAG, "%s", __func__);
    return t->command(
        cmd,
        [&](uint8_t *data, size_t len)
        {
          out.assign(reinterpret_cast<char *>(data), len);

          // LOG *************************************************************
          //  Only Sentece is read, I also read more than length still no 'OK'
          ESP_LOG_BUFFER_HEXDUMP(TAG, data, len, ESP_LOG_INFO);

          // When we read uart buffer here, OK is there
          uint8_t uart_remaining_data[128];
          size_t uart_data_len= uart_read_bytes(1, uart_remaining_data, 128, 20);
          if (uart_data_len > 0)
          {
            uart_remaining_data[uart_data_len]= 0;
            ESP_LOGI(TAG, "Remaining data: %s", data);
          }
          // LOG *************************************************************

          if (out.find(pass) != std::string::npos)
          {
            return command_result::OK;
          }
          else if (out.find(fail) != std::string::npos)
          {
            return command_result::FAIL;
          }

          return command_result::TIMEOUT;
        },
        timeout);
  }

Thank you

@david-cermak
Copy link
Collaborator

Okay, seems I was wrong again. The response may or may not come in two pieces, but it comes in two distinct cmux messages, now I see it from your logic analyzer picture (was confused by the logs seeing only ascii characters).
What I think is happening on slower CPU rate is that it actually comes in one piece, but the CMUX layers see it as two payloads, so the lambda is called twice, this also explains your initial log
on faster clock rate, you might get a callback immediately after the first part (without the OK), which would also explain why you're seeing some UART data buffered. But wouldn't explain why you won't get the second part of the response. I'll try to simulate this condition on my device.

I think you've already tried, but still:

  • what happens if you remove reading of the UART buffer from the code above? if we read the buffer there then the data wouldn't be available in the second callback
  • increase the timeout of the command to maximum, try 10 seconds or so, just to be sure. Does the command return command_result::TIMEOUT ?

@brhmrsln
Copy link
Author

1)"what happens if you remove reading of the UART buffer from the code above? if we read the buffer there then the data wouldn't be available in the second callback"
-> If I read the uart buffer there, then I can get the 'sentence' because I take the 'OK' from the uart buffer manually by reading it and then when I make a new NMEA request, I again get the 'sentence'(response str is filled with 'sentence') and manually get the 'OK'(with uart_read_bytes) and so on. We are manually removing 'OK' by reading the uart buffer.
-> If I remove it, then I get the first response (just sentence, 'OK' is still in uart buffer). After that we request new NMEA and we only get the 'OK'(from the response of the previous request).
2) increase the timeout of the command to maximum, try 10 seconds or so, just to be sure. Does the command return command_result::TIMEOUT ?
-> I try it with 5 seconds and it returns command_result::TIMEOUT.

"I'll try to simulate this condition on my device."
Remainder: I am using ESP32-S3 (with 240Mhz system clock config) and Quectel MC60 (All in one solution with CMUX config). I connected them via one UART (In command mode no issue). After I make the CMUX configuration and get the ip, I am periodically requesting NMEA sentence(periods can be various.)
Also this can be also very helpful. 1 sec delay at the beggining of the DTE::command_cb::process_line is exactly what we expected(dte buffer contains 'sentence+OK'):
image

I appriciate your time and the help, Thank you David.

@david-cermak
Copy link
Collaborator

It seems like an issue in the modem code (which doesn't exhibit very often, as I've never seen two such a small fragments posted in two cmux frames, which is a legitimate usecase). The question is why we won't get the second callback.
would you mind trying this again and logging the data event directly in the uart task?

--- a/components/esp_modem/src/esp_modem_uart.cpp
+++ b/components/esp_modem/src/esp_modem_uart.cpp
@@ -120,6 +120,7 @@ void UartTerminal::task()
             switch (event.type) {
             case UART_DATA:
                 uart_get_buffered_data_len(uart.port, &len);
+                ESP_LOGW(TAG, "UART_DATA %d", len);
                 if (len && on_read) {
                     on_read(nullptr, len);
                 }

1 sec delay at the beggining of the DTE::command_cb::process_line is exactly what we expected(dte buffer contains 'sentence+OK'):

This actually explains that we get the correct data in two cmux callbacks, but only one uart callback (but not via two uart callbacks).

If this is the case, then this should fix the problem:

--- a/components/esp_modem/src/esp_modem_uart.cpp
+++ b/components/esp_modem/src/esp_modem_uart.cpp
@@ -160,6 +160,11 @@ void UartTerminal::task()
                 ESP_LOGW(TAG, "unknown uart event type: %d", event.type);
                 break;
             }
+        } else {
+            uart_get_buffered_data_len(uart.port, &len);
+            if (len && on_read) {
+                on_read(nullptr, len);
+            }
         }
     }
 }

(which basically just periodically checks for any potential data buffered in case we missed an event)

@brhmrsln
Copy link
Author

Sorry for the late response, I was running some functional tests on the solution. it is working, thank you. The functional tests I carried out also include sending data over the Internet to server while periodically requesting the NMEA sentence and no problem is observed.

The log you requested (regarding the size of the uart buffer)
UartBufferLength

About the solution, could you please clarify if there's a planned timeline for these changes to be merged into the master branch?
Because I want to use this component from Github (as managed_components) to be always up to date with the repository.
I only manage the component manually for test and logging purposes.

Again I appriciate the help, Thank you.

@david-cermak
Copy link
Collaborator

Thanks for testing it and for the good news. I've just put up a PR with the fix. It usually takes few days for people to review it before it gets merged. I'll also add a bump commit, so we'll publish a new version to component manager.

david-cermak added a commit to david-cermak/esp-protocols that referenced this issue Nov 7, 2024
1.2.0
Features
- Add support for guessing mode (52598e5)
- Delete CMUX internal implementation even if terminal exit fails (0e0cbd6)
- Add support for handling URC (1b6a3b3, espressif#180)
- add ability to change ESP_MODEM_C_API_STR_MAX from Kconfig (1790989)
- Added target test config with CHAP authentication (f8ae7de)
- example add esp32p4 usb support (adafeae)
- Publish mbedtls component (0140455)
- host test support of the latest ESP-IDF release (3f74b4e)
Bug Fixes
- Fix console example to use urc/detect features (1a9eaf3)
- Update target test builds to use external Catch2 (554f022)
- Fix arguments names when spawn esp_modem_xxx declarations (b6792c5)
- Remove catch dependency (c348076)
- Examples: use local configs for MQTT topic/data (f5c13b9)
- Fixed clang-tidy warnings (70fa3af)
- Fix CI build per IDFv5.3 (d0c17ef)
- Fixed UART task to check for buffered data periodically (4bdd90c, espressif#536)
- Cleanup unused configs from PPPoS example (08a62cc)
- Update CMUX example with SIM7070_gnss cleaned-up (56fe532)
- Update console example with SIM7070_gnss format comments (5baaf54)
- Fix remaining print format warnings (3b80181)
Updated
- docs(modem): Fix esp_modem_at_raw() description (C-API) (492a6a0)
- ci(common): updated github actions(checkout, upload, download) v3 to 4, Ubuntu 20.04 to v22.04 (a23a002)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants