Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for CH390H, but needs further tests. (IDFGH-12291) #23

Merged
merged 2 commits into from
Aug 29, 2024

Conversation

SergeyKharenko
Copy link
Contributor

@SergeyKharenko SergeyKharenko commented Mar 5, 2024

Original PR: IDFGH-12259 and IDFGH-12264
Besides, two examples(iperf and tcp_server) have been added!

ch390/CMakeLists.txt Outdated Show resolved Hide resolved
ch390/README.md Outdated Show resolved Hide resolved
ch390/src/esp_eth_mac_ch390.c Outdated Show resolved Hide resolved
ch390/examples/iperf/main/basic.h Outdated Show resolved Hide resolved
ch390/examples/iperf/main/basic.h Outdated Show resolved Hide resolved
ch390/src/esp_eth_mac_ch390.c Outdated Show resolved Hide resolved
ch390/src/esp_eth_mac_ch390.c Outdated Show resolved Hide resolved
ch390/src/esp_eth_mac_ch390.c Outdated Show resolved Hide resolved
ch390/src/esp_eth_phy_ch390.c Outdated Show resolved Hide resolved
ch390/src/esp_eth_phy_ch390.c Show resolved Hide resolved
@leeebo
Copy link

leeebo commented Mar 7, 2024

Please refer https://github.com/espressif/esp-eth-drivers/blob/master/lan867x/idf_component.yml for the component manager.

And please wait #24 be merged, then add it to ethernet_init

@leeebo
Copy link

leeebo commented Mar 7, 2024

@SergeyKharenko Overall, LGTM!

@github-actions github-actions bot changed the title Added support for CH390H, but needs further tests. Added support for CH390H, but needs further tests. (IDFGH-12291) Mar 7, 2024
@SergeyKharenko
Copy link
Contributor Author

Now ethernet init can use CH390 module with polling mode support. By the way, use override_path to find the ch390 instead.

ch390/idf_component.yml Outdated Show resolved Hide resolved
@igrr
Copy link
Member

igrr commented Mar 22, 2024

Please also add the new component here and the new examples here

@kostaond
Copy link
Collaborator

@SergeyKharenko I would like to share some information of the progress at our side. Once we accept the PR to our repository, the "official" final responsibility for the component maintenance lays at our side (even though it would be great if you helped us with the maintenance even in the future). Therefore, we are currently in the process of developing and manufacturing a test board. Once ready, we run tests and continue with the PR review. Please stay tuned and thank you once again for your contribution.

@SergeyKharenko
Copy link
Contributor Author

@SergeyKharenko I would like to share some information of the progress at our side. Once we accept the PR to our repository, the "official" final responsibility for the component maintenance lays at our side (even though it would be great if you helped us with the maintenance even in the future). Therefore, we are currently in the process of developing and manufacturing a test board. Once ready, we run tests and continue with the PR review. Please stay tuned and thank you once again for your contribution.

Thank you for your attention, it is a great pleasure for me. I will always keep an eye out for the follow-up work!

@SergeyKharenko
Copy link
Contributor Author

SergeyKharenko commented Mar 23, 2024

@igrr @kostaond Both examples and the component have been modified and added to the workflow. All builds are successful in my repo, please check again~

@kostaond
Copy link
Collaborator

Hi @SergeyKharenko, we have the test boards with CH390 in the office! I added the CH390 to esp_eth\test_apps I ran the few of the tests. I haven't run the complete pytest yet since I discovered the following issues:

  • test (6) "ethernet start/stop stress test with IP stack" fails - after the 4th start/stop sequence, the CH390 starts continuously transmitting corrupted DHCP message. There are two extra zero bytes at start of the frame.
0000  00 00 ff ff ff ff ff ff  96 e6 86 da c1 c4 08 00   ········ ········
0010  45 00 01 50 00 0c 00 00  40 11 79 92 00 00 00 00   E··P···· @·y·····
0020  ff ff ff ff 00 44 00 43  01 3c 67 14 01 01 06 00   ·····D·C ·<g·····
0030  47 0a b8 10 00 00 00 00  00 00 00 00 00 00 00 00   G······· ········
0040  00 00 00 00 00 00 00 00  96 e6 86 da c1 c4 00 00   ········ ········
0050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
0060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
0070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
0080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
0090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
00A0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
00B0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
00C0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
00D0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
00E0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
00F0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
0100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
0110  00 00 00 00 00 00 00 00  63 82 53 63 35 01 01 39   ········ c·Sc5··9
0120  02 05 dc 0c 09 65 73 70  72 65 73 73 69 66 37 04   ·····esp ressif7·
0130  01 03 1c 06 3d 07 01 96  e6 86 da c1 c4 ff 00 00   ····=··· ········
0140  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ········ ········
0150  00 00 00 00 00 00 00 00  00 00 00 00 00 00         ········ ······

Could you please try to reproduce at your setup? Just add CH390 to https://github.com/espressif/esp-idf/blob/master/components/esp_eth/test_apps/main/esp_eth_test_common.c build and flash. The process is quite straight forward but if you needed any assistance, just let me know.

@ShunzDai
Copy link

ShunzDai commented May 20, 2024

I tried this modification, but it didn't work

static esp_err_t emac_ch390_transmit(esp_eth_mac_t *mac, uint8_t *buf, uint32_t length)
{
    esp_err_t ret = ESP_OK;
    emac_ch390_t *emac = __containerof(mac, emac_ch390_t, parent);
    /* Check write address */
    uint16_t tx_offset = 0;
    /* Check if last transmit complete */
    uint8_t tcr = 0;

    ESP_GOTO_ON_FALSE(length <= ETH_MAX_PACKET_SIZE, ESP_ERR_INVALID_ARG, err,
                      TAG, "frame size is too big (actual %lu, maximum %u)", length, ETH_MAX_PACKET_SIZE);

    int64_t wait_time = esp_timer_get_time();
    do {
        ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_TCR, &tcr), err, TAG, "read TCR failed");
    } while ((tcr & TCR_TXREQ) && ((esp_timer_get_time() - wait_time) < CH390_MAC_TX_WAIT_TIMEOUT_US));

    if (tcr & TCR_TXREQ) {
        ESP_LOGE(TAG, "last transmit still in progress, cannot send.");
        return ESP_ERR_INVALID_STATE;
    }

    /* set tx length */
    ESP_GOTO_ON_ERROR(ch390_register_write(emac, CH390_TXPLL, length & 0xFF), err, TAG, "write TXPLL failed");
    ESP_GOTO_ON_ERROR(ch390_register_write(emac, CH390_TXPLH, (length >> 8) & 0xFF), err, TAG, "write TXPLH failed");

    uint8_t mrrh = 0, mrrl = 0;
    ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MWRH, &mrrh), err, TAG, "read MDRAH failed");
    ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MWRL, &mrrl), err, TAG, "read MDRAL failed");
    uint16_t addr_prev = mrrh << 8 | mrrl;
    /* copy data to tx memory */
    ESP_GOTO_ON_ERROR(ch390_memory_write(emac, buf, length), err, TAG, "write memory failed");

    ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MWRH, &mrrh), err, TAG, "read MDRAH failed");
    ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MWRL, &mrrl), err, TAG, "read MDRAL failed");
    uint16_t addr_next = mrrh << 8 | mrrl;

    tx_offset = (addr_prev <= addr_next) ? (addr_next - addr_prev) : (0x0C00 - addr_prev + addr_next);

    ESP_GOTO_ON_FALSE(tx_offset == length, ESP_ERR_INVALID_STATE, err, TAG, "check tx data failed");

    /* issue tx polling command */
    ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_TCR, &tcr), err, TAG, "read TCR failed");
    ESP_GOTO_ON_ERROR(ch390_register_write(emac, CH390_TCR, tcr | TCR_TXREQ), err, TAG, "write TCR failed");
    return ESP_OK;
err:
    ESP_ERROR_CHECK_WITHOUT_ABORT(ch390_register_write(emac, CH390_MPTRCR, MPTRCR_RST_TX));
    return ret;
}

static esp_err_t emac_ch390_receive(esp_eth_mac_t *mac, uint8_t *buf, uint32_t *length)
{
    esp_err_t ret = ESP_OK;
    emac_ch390_t *emac = __containerof(mac, emac_ch390_t, parent);
    /* Check read address */
    uint16_t rx_offset = 0;

    uint8_t ready = 0;
    /* dummy read, get the most updated data */
    ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MRCMDX, &ready), err, TAG, "read MRCMDX failed");
    ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MRCMDX, &ready), err, TAG, "read MRCMDX failed");

    // if ready != 1 or 0 reset device
    if (ready & CH390_PKT_ERR) {
        emac_ch390_stop(mac);
        esp_rom_delay_us(1000);
        emac_ch390_start(mac);

        ESP_LOGE(TAG, "PACK ERR");
    } else {
        __attribute__((aligned(4))) ch390_rx_header_t rx_header; // SPI driver needs the rx buffer 4 byte align

        if (ready & CH390_PKT_RDY) {
            ESP_GOTO_ON_ERROR(ch390_memory_read(emac, (uint8_t *) & (rx_header), sizeof(rx_header)), err, TAG, "peek rx header failed");
            *length = (rx_header.length_high << 8) + rx_header.length_low;

            ESP_GOTO_ON_FALSE(!((rx_header.status & 0xBF) || (*length > CH390_PKT_MAX)), ESP_ERR_INVALID_STATE, err, TAG, "check rx status failed");

            uint8_t mrrh = 0, mrrl = 0;
            ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MRRH, &mrrh), err, TAG, "read MDRAH failed");
            ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MRRL, &mrrl), err, TAG, "read MDRAL failed");
            uint16_t addr_prev = mrrh << 8 | mrrl;
            /* copy rx memory to data */
            ESP_GOTO_ON_ERROR(ch390_memory_read(emac, buf, *length), err, TAG, "read rx data failed");

            ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MRRH, &mrrh), err, TAG, "read MDRAH failed");
            ESP_GOTO_ON_ERROR(ch390_register_read(emac, CH390_MRRL, &mrrl), err, TAG, "read MDRAL failed");
            uint16_t addr_next = mrrh << 8 | mrrl;

            rx_offset = (addr_prev <= addr_next) ? (addr_next - addr_prev) : (0x3400 - addr_prev + addr_next);

            ESP_GOTO_ON_FALSE(rx_offset == *length, ESP_ERR_INVALID_STATE, err, TAG, "check rx data failed");
            // ESP_LOG_BUFFER_HEX(TAG, buf, rx_offset);
            *length -= ETH_CRC_LEN;
        } else {
            *length = 0;
        }
        return ESP_OK;
    }

err:
    ESP_ERROR_CHECK_WITHOUT_ABORT(ch390_register_write(emac, CH390_MPTRCR, MPTRCR_RST_RX));
    *length = 0;
    return ret;
}

does this indicate that the problem is not caused by a shift in the ch390 read/write register pointer?

@kostaond
Copy link
Collaborator

@ShunzDai, you tried to address the "start/stop" issue with your modification? Do I understand it correctly?

@ShunzDai
Copy link

@ShunzDai, you tried to address the "start/stop" issue with your modification? Do I understand it correctly?

yes

@kostaond
Copy link
Collaborator

@ShunzDai good to know it is reproducible issue.

@SergeyKharenko we currently don't have any resources to investigate deeper. If you also don't have time to investigate, please state it as known issue in README. I would like to move forward with this PR. Please also address all the previous comments.

@kostaond
Copy link
Collaborator

kostaond commented Jul 9, 2024

@SergeyKharenko just a gentle reminder. Could you please have a look at the comments and address them so we could move forward?

@SergeyKharenko
Copy link
Contributor Author

SergeyKharenko commented Jul 10, 2024

@kostaond @ShunzDai My sincere apologies for the late reply. I was busy with the damn DDL the other day. I'll try my best to fix it this weekend!!!

@kostaond
Copy link
Collaborator

@SergeyKharenko great to having you back! Just two last comments:

  • please squash your commits into one commit
  • you've probably noticed, we established common Ethernet examples folder. Please use it for CH390 too. It's about maintainability, most of the examples in this repo was basically the same so the goal is to reduce number of examples we would have to maintain. CH390 does not have any special characteristics so it should fit to be used in common examples.

@SergeyKharenko
Copy link
Contributor Author

@kostaond Certainly, I synced with the official repository and resolved the conflict. However the problem with the driver seems to be a bit more complicated than I thought... The only good news is that I'm heading into a week-long vacation, and it's a great time to deal with the problem!

@kostaond
Copy link
Collaborator

kostaond commented Aug 9, 2024

@kostaond Certainly, I synced with the official repository and resolved the conflict. However the problem with the driver seems to be a bit more complicated than I thought... The only good news is that I'm heading into a week-long vacation, and it's a great time to deal with the problem!

Do you mean start/stop issue? Give it just few days. If you couldn't find solution, then address it by listing it in Known issues in README and creating GH issue. Once the PR is ready for re-review, please let me know.

@SergeyKharenko
Copy link
Contributor Author

SergeyKharenko commented Aug 20, 2024

@kostaond Yes, this issue bothered me a period. It's heartening that I've finally figured out the cause today!!!

Analysis

Keyword: Dual TX Index; TX Pointer; Mysterious Compensation
Firstly I've observed a discrepancy between the TX pointer offset and the actual number of bytes written (which @leeebo has previously discovered), as shown in the fig below. Interestingly, there is a certain pattern to this difference. The odd number of packets sent from the module boot will cause the TX pointer to point 2 bytes ahead of the theoretical position(for example, 352 actually while 350 in theory), but at the even number of packets, the TX pointer will be the same as the theoretical position. Therefore, I speculate that there is a difference in the actual behavior of the chip between the two consecutive transmissions.
image

That is further corroborated in Chapter 8.4 of the Datasheet, which provides information about the TX Index(fig below).
There are two Indexes(A and B) in this chip. The two Indexes run alternately, ABAB...
image

Through packet capture, it is found that whether it is an odd or even number of packets, the actual length transmitted is consistent with our set value, and there seems to be a mechanism of Mysterious Compensation inside the chip to ensure its correctness. Given the TX pointer position as x, data length as len, transmit starts at x and ends at x+len when using the Index A, while starts at x-2 and ends at x+len-2 when using the Index B. In addition, x increases by len+2 and len-2 respectively.

The above pattern fails when the chip continuously transmitting corrupted DHCP messages(as @kostaond has mentioned). In this case, the actual position of the TX pointer is 2 bytes behind the theoretical position(348 actually while 350 in theory). For odd packets, the behavior is exactly the same as for the even packets mentioned earlier, so I guess that Index B is used when sending odd packets, and Index A is used for even packets, which is the opposite of the normal case. This also explains why the first two characters 0x00 are included in the corrupted packet as @kostaond noticed.
image

After reset, the chip uses the Index A initially. Here is the usage of the Index when the chip is normal (top) and when the issue occurs (bottom). At this point, the conjecture of the Mysterious Compensation has been conclusively verified!
Screenshot_20240820_214651

Looking at the entire start/stop loop, the only thing that comes to the operation directly on the TX pointer is

    /* reset tx and rx memory pointer */
    ESP_GOTO_ON_ERROR(ch390_io_register_write(emac, CH390_MPTRCR, MPTRCR_RST_RX | MPTRCR_RST_TX), err, TAG, "write MPTRCR failed");

in emac_ch390_start. This action ignores the Mysterious Compensation and mistakenly assumes that the reading of the TX data always starts from 0x00, which ultimately led to this terrible error!!!

Solution

Quite simple, two ways:

  1. Force the chip using the Index A when exec emac_ch390_stop. There is no direct way to control usage of the Index. Continuously record the Index. An empty packet could be sent if a switch is needed(a little complex, emm...).
  2. (Recommend) DO NOT OPERATE TX pointer DIRECTLY. Actually after the chip is reset, the TX pointer automatically points to 0, and the TX cache is a circular array (never cause out-of-bounds), so there is no need to reset TX pointer to 0 under normal circumstances manually. So I've removed the action as follows:
    /* reset rx memory pointer */
    ESP_GOTO_ON_ERROR(ch390_io_register_write(emac, CH390_MPTRCR, MPTRCR_RST_RX ), err, TAG, "write MPTRCR failed");

Test Result

A 100 loops start/stop test has been passed. The issue is completely solved. I don't think it will appear again~
image

Conclusion

Sometimes, a bug turns out to be the key to keeping the system running, and if it doesn't affect normal use, it may not be bad to leave it alone😁

PS: The code will be submitted over the next two days...

Copy link
Collaborator

@kostaond kostaond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SergeyKharenko impressive job identifying root cause of the start/stop issue!

I performed the review and asked our intern to run the test_apps with latest CH390 driver version since I'm a little bit busy with other things. Could you please also go over older comments and provide response how you addressed them? That would be great help for me. Thank you.

.github/workflows/build.yml Outdated Show resolved Hide resolved
ch390/README.md Outdated Show resolved Hide resolved
ch390/examples/iperf/CMakeLists.txt Outdated Show resolved Hide resolved
ch390/idf_component.yml Outdated Show resolved Hide resolved
ch390/src/esp_eth_mac_ch390.c Show resolved Hide resolved
ch390/src/esp_eth_mac_ch390.c Outdated Show resolved Hide resolved
ch390/src/esp_eth_mac_ch390.c Show resolved Hide resolved
ch390/src/esp_eth_mac_ch390.c Outdated Show resolved Hide resolved
ethernet_init/README.md Show resolved Hide resolved
@SergeyKharenko
Copy link
Contributor Author

@kostaond Almost all comments are replied to(try my best)~

@kostaond
Copy link
Collaborator

@SergeyKharenko I finished the review. However, I cannot merge it due to conflicts. Please resolve. Please also squash the commits into one commit.

@SergeyKharenko SergeyKharenko force-pushed the patch_ch390 branch 6 times, most recently from bd4e845 to acda86c Compare August 28, 2024 17:28
fix(ch390): length filter is not enabled
build(ch390): lowered the IDF version requirement to 5.1
docs(ch390): change 10BASE-TX to 10BASE-T
@SergeyKharenko
Copy link
Contributor Author

@kostaond Lack of experience with git manipulation led to many invalid pushes... After a period of autism, the problem was finally solved😀!!! I still divide all of the commits into two parts, initial publish and critical bug fix, which allows this issue to be documented on the timeline.

@kostaond kostaond merged commit 6315a47 into espressif:master Aug 29, 2024
10 checks passed
@kostaond
Copy link
Collaborator

@SergeyKharenko merged! Thank you for your patience and dedicated work! Your code has very high standard. Everything is very well structured and cleanly written. To be honest, I decided to update the buffer allocation in DM9051 based on what I've seen in your PR since the previous implementation was over complicated. I'm planning to do the same for the rest SPI Ethernet devices we support.

@SergeyKharenko
Copy link
Contributor Author

SergeyKharenko commented Aug 29, 2024

My pleasure!
Thanks for your valuable advice and continued attention.!!! @leeebo @igrr @kostaond @ShunzDai

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: Opened labels Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants