Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mbedtls Dynamic Port - Memory leak (IDFGH-13555) #14444

Open
3 tasks done
HikingDev opened this issue Aug 26, 2024 · 17 comments · May be fixed by #14614
Open
3 tasks done

Mbedtls Dynamic Port - Memory leak (IDFGH-13555) #14444

HikingDev opened this issue Aug 26, 2024 · 17 comments · May be fixed by #14614
Labels
Status: Opened Issue is new Type: Bug bugs in IDF

Comments

@HikingDev
Copy link

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v5.2-dev-3903-g66992aca7a

Espressif SoC revision.

ESP32 (revision: v3.0)

Operating System used.

Windows

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

PowerShell

Development Kit.

ESP32-Wrover-Kit

Power Supply used.

External 5V

What is the expected behavior?

Running an a https Webserver with dynamic rx/tx buffers and using a Client to send Data > 16KB to the ESP32 HTTPS Server without crashing with "CORRUPT HEAP" message

What is the actual behavior?

The firmware crashes withe a "CORRUPT HEAP" message

Steps to reproduce.

  1. flash firmware with esp32 webserver hosting a website
  2. Open Website hosted via ESP32 in Chrome browser

Debug Logs.

I (55063) Socket parser: [ #55 ] Setting up SSL/TLS data
I (55063) Socket parser: [ #55 ] Performing the SSL/TLS handshake
I (55143) Socket parser: [ #55 ] Handshake successful
W (55143) Socket parser: [initializeServerSSLSession][HEAP] Nr sockets: 6, Heap usage: 112804
I (55143) Socket parser: [initializeServerSSLSession] Releasing lock for clientFd '55'
I (55193) Socket parser: Recv Socket: socketFd: 51  size: 3800
I (55193) Socket parser: [recvSocket][LOCK] Attempting to acquire lock for socket '51'
I (55193) Socket parser: [recvSocket][LOCK] Lock acquired for socket '51'
E (55203) SSL TLS: failed read
CORRUPT HEAP: Bad tail at 0x3ffde6fd. Expected 0xbaad5678 got 0x00000000
Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

Setting breakpoint at 0x4008d895 and returning...
0x4008d895: multi_heap_free at C:/SysGCC/esp32/13.2.0_12.1_idf5.2/esp-idf/v5.2/components/heap/multi_heap_poisoning.c:279

V (1541) esp_apptrace: esp_apptrace_flush_nolock(): enter

More Information.

Currently the esp mmbedtls dynamic implementation uses a function tx_buffer_len(...) in both

esp_mbedtls_add_tx_buffer and esp_mbedtls_add_rx_buffer to get the buffer length used in mbedtls_calloc.

static int tx_buffer_len(mbedtls_ssl_context *ssl, int len)
{
    (void)ssl;

    if (!len) {
        return MBEDTLS_SSL_OUT_BUFFER_LEN;
    } else {
        return len + MBEDTLS_SSL_HEADER_LEN
                   + MBEDTLS_MAX_IV_LENGTH
                   + MBEDTLS_SSL_MAC_ADD
                   + MBEDTLS_SSL_PADDING_ADD
                   + MBEDTLS_SSL_MAX_CID_EXPANSION;
    }
}

If Asymmetric in/out fragment length is chosen to reduce the size of buffers the RX buffer gets the MBEDTLS_SSL_OUT_BUFFER_LEN regardless of the config value in sdkonfig. If MBEDTLS_SSL_OUT_BUFFER_LEN is set to e.g. 4096 or if there is already data in the buffer, the RX buffer is always smaller than the by mbedtls required 16KB for the RX buffer. The documentation specifically states that the incoming buffer should be kept at 16KB, due to the fact that there is no supported way of informing the client about size restriction for incoming messagens. (See mbedtls Source
mbedtls_config.h )

There were discussions and a pull request about the issues in mbedtls:
dynamic ssl buffers
Mandate of 16kb

Currently the espressif dynamic implementation disregards the mandatory 16kb rx buffer size. I suggest adding a rx_buffer_len function:

static int rx_buffer_len(mbedtls_ssl_context *ssl)
{
    (void)ssl;
    return MBEDTLS_SSL_IN_BUFFER_LEN;
}

and using it in esp_mbedtls_add_rx_buffer

@HikingDev HikingDev added the Type: Bug bugs in IDF label Aug 26, 2024
@github-actions github-actions bot changed the title Mbedtls Dynamic Port - Memory leak Mbedtls Dynamic Port - Memory leak (IDFGH-13555) Aug 26, 2024
@espressif-bot espressif-bot added the Status: Opened Issue is new label Aug 26, 2024
@AdityaHPatwardhan
Copy link
Collaborator

Thank you for the issue @HikingDev,
I agree that this behaviour needs to be fixed. Since you have already provide the solution do you want to raise a PR for this?

HikingDev added a commit to HikingDev/esp-idf that referenced this issue Sep 23, 2024
…if#14444)

- Added rx_buffer_len() to handle correct RX buffer size
- Ensured compliance with mbedtls 16KB RX buffer requirement
- Prevents CORRUPT HEAP error when receiving large data (>16KB) over HTTPS
@AdityaHPatwardhan
Copy link
Collaborator

AdityaHPatwardhan commented Sep 24, 2024

@HikingDev Is it possible to share the entire error log in this case?

Actually, I think the issue (and its solution) may not be as straightforward as it seems to be.
As you see in the dynamic implementation the in_msglen is used to allocate the buffer dynamically. Typically the msglen is parsed before setting the rx_buffer len, The only case where this would cause a heap corruption is when the in_msglen = 0.

Can you please provide detailed log with log level verbose? I would particularly like to see the debug log before the heap corruption happens.

@HikingDev
Copy link
Author

HikingDev commented Sep 24, 2024

@AdityaHPatwardhan the issue is, that as documented in mbedtls, the client cannot be informed about the Maximum incoming fragment size.

/** \def MBEDTLS_SSL_IN_CONTENT_LEN
 *
 * Maximum length (in bytes) of incoming plaintext fragments.
 *
 * This determines the size of the incoming TLS I/O buffer in such a way
 * that it is capable of holding the specified amount of plaintext data,
 * regardless of the protection mechanism used.
 *
 * \note When using a value less than the default of 16KB on the client, it is
 *       recommended to use the Maximum Fragment Length (MFL) extension to
 *       inform the server about this limitation. On the server, there
 *       is no supported, standardized way of informing the client about
 *       restriction on the maximum size of incoming messages, and unless
 *       the limitation has been communicated by other means, it is recommended
 *       to only change the outgoing buffer size #MBEDTLS_SSL_OUT_CONTENT_LEN
 *       while keeping the default value of 16KB for the incoming buffer.
 *
 * Uncomment to set the maximum plaintext size of the incoming I/O buffer.
 */
//#define MBEDTLS_SSL_IN_CONTENT_LEN              16384

So indeed the in_msglen is used for dynamic buffer allocation but in case of a read (in a server application) the client cannot be informed that MBEDTLS_SSL_IN_CONTENT_LEN is lower than the required 16386 Bytes.

I cannot set mbedtls to verbose. Since its a heap corruption, the crash doesnt happen immediatley but in a later mbedtls_ssl_read and the verbose loglevel causes too much output, floods the terminal and causes a crash.

EDIT: I can try to make a simple server application, allowing to reproduce the faulty behavior.

For now we use a patched version, implementing the suggested pull request. This indeed fixes our crashes.
I do wonder how many ppl out there use the dynamic port in a server application.

@AdityaHPatwardhan
Copy link
Collaborator

Ahh, I see.
I only wanted more information in the in_msglen value in the server application. So I think this issue is more tied with the server.
I will try to reproduce this at my end to get more information.
So the main issue here is that directly creating a buffer of MBEDTLS_SSL_IN_CONTENT_LEN requires a lot more heap than just creating a buffer for thein_msglen this defeats the purpose of this feature that is to save the RAM.
So Ideally we should only fix the issue for the heap corruption without affecting the heap usage numbers.

@HikingDev
Copy link
Author

Indeed! The client doesn`t care about our buffer size.
There was a discussion in the mbedtls repo: issue 1203

the buffer length of 16384 is the mandatory default for both sides per RFC6066. In an optional extension, per-connection, the client (only) may propose to negotiate it down to 512/1024/2048/4096 (only), and the server (only) may accept the proposal or reject it. The server may not make a counter-proposal. The existence of MBEDTLS_SSL_MAX_CONTENT_LEN itself already violates this, then.

For client connections, an api setting the buffer size can also imply the same 512/1024/2048/4096 restriction to the extension. Assuming the server allows it, that would be safe with both sides aware of the logical limit. RFC6066's extension provides no way for the server to do that for the client, but that's no worse than setting MBEDTLS_SSL_MAX_CONTENT_LEN globally today.

Also interesting read the pull request to add dynamic buffers pull request dynamic buffers mbedtls

Only for the in_buf we cannot save RAM. However, reducing it for the out_buf we can reduce memory usage.

@AdityaHPatwardhan
Copy link
Collaborator

AdityaHPatwardhan commented Oct 7, 2024

Hi @HikingDev Sorry for the delayed reply.
I created a HTTPS file server where I can upload files.
I tried to send the 198 KB file and the server received the file in the chunks of 16KB.
Even without applying any patch the transmission happened successfully as the msglen given by the client was always 16KB.
The only case that I can think of when your server can corrupt the data is when it receives msglen as 0 and it sets up an rx_buffer of length 4096 ( MBEDTLS_SSL_OUT_BUFFER_LEN) and ends up corrupting memory when the actual msg is of length > 4096. I could not reproduce this issue at my end. but from my understanding I have created following patch.
Can you please check if the patch fixes the issue for you?

mbedtls_dynamic_buffers_issue.patch

@HikingDev
Copy link
Author

Hi @AdityaHPatwardhan ,
no worries!
Unfortunately the patch does not fix our issue.
The crash still happens in mbedtls_ssl_read.
In our scenario we have a webserver.
The crash only occurs with Chrome/Edge.
Both open multiple sockets (at the time of the crash there are 6 open sockets.
Heap still has 116kb of available memory.

@AdityaHPatwardhan
Copy link
Collaborator

Can you please provide a demo example where I can reproduce the issue at my end ?

@AdityaHPatwardhan
Copy link
Collaborator

@HikingDev The thing is that, we can't accept #14614 as it is as it drastically reduces the heap saved by the dynamic_buffers feature for mbedtls.
If I am able to reproduce the issue at my end then I can probably try to fix the issue in a different way where the performance of the feature is not reduced.
let me know!

@HikingDev
Copy link
Author

@AdityaHPatwardhan i understand the issue.
I will try to make sample app that allows to reproduce the bug.
The testcase with the upload is probably not sufficient, cause it only opens a single socket (at least that's my assumption).
Not sure if I manage in the next few weeks, but will try!

@github-staff github-staff deleted a comment from Lxx-c Oct 23, 2024
@AdityaHPatwardhan
Copy link
Collaborator

Hi @HikingDev any update on the reproducible code?

@HikingDev
Copy link
Author

Hi @AdityaHPatwardhan,

still havent found the time to make a testcase, allowing to reproduce the issue.
If the ticket needs to be closed for now, I am fine with that.
We can use our current workaround.

@HikingDev
Copy link
Author

Hi @AdityaHPatwardhan, I am unable to apply your patch.
Which branch/tag should I best checkout to apply the patch?

@AdityaHPatwardhan
Copy link
Collaborator

The patch was created for idf master branch.
Are you using any specific branch? I can provide a patch for that particular branch as well

@HikingDev
Copy link
Author

Unfortunately, I was unable to apply the patch to the current master branch, and I suspect this might be due to changes in the branch since the patch was created.

Could you kindly provide the commit hash of the master branch that you based the patch on?

I appreciate your help!

@AdityaHPatwardhan
Copy link
Collaborator

Hi @HikingDev, I am still able to apply the patch on esp-idf.
Just now I checked with the latest version i.e. v5.4-beta1, I see that the patch is applicable.
I am re-sharing the patch for clarity

dynamic_buffers.patch

Also,
Is it possible for you to share you sdkconfig file for reference and if possible some more detailed log?

@HikingDev
Copy link
Author

Hi @AdityaHPatwardhan ,

its a little akward but i downloaded again your patch and it differs from the one i previously downloaded.
I was able to apply it but as I said, it doesnt fix the issue.
Feel free to close the issue for now since it seems no one else experiences the bug.
I will try to find some time to reproduce the issue with an example.

mbedtls_dynamic_buffers_issue.patch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Opened Issue is new Type: Bug bugs in IDF
Projects
None yet
10 participants
@HikingDev @AdityaHPatwardhan @espressif-bot and others