Occasionally hit TG1WDT_SYS_RESET when OTA (IDFGH-5615) #7335
Comments
I hit a panic during OTA again:
E (33651) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
abort() was called at PC 0x4012bea0 on core 0
Backtrace:0x40088ea6:0x3ffb0750 0x400895ed:0x3ffb0770 0x40090046:0x3ffb0790 0x4012bea0:0x3ffb0800 0x40082fc1:0x3ffb0820 0x40084a80:0x3ffaf7e0 0x40083842:0x3ffaf800 0x40083794:0x3ffaf820
0x400895ed: esp_system_abort at /home/axel/esp/esp-idf/components/esp_system/system_api.c:112
0x40090046: abort at /home/axel/esp/esp-idf/components/newlib/abort.c:46
0x4012bea0: task_wdt_isr at /home/axel/esp/esp-idf/components/esp_common/src/task_wdt.c:182 (discriminator 1)
0x40082fc1: _xt_lowint1 at /home/axel/esp/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1105
0x40084a80: xt_int_enable_mask at /home/axel/esp/esp-idf/components/xtensa/include/xtensa/xtensa_api.h:170
0x40083842: spi_flash_op_block_func at /home/axel/esp/esp-idf/components/spi_flash/cache_utils.c:121
0x40083794: ipc_task at /home/axel/esp/esp-idf/components/esp_ipc/ipc.c:62
Also note that increasing CONFIG_ESP_INT_WDT_TIMEOUT_MS to 10000 does not help; the watchdog timeout still occurs. |
I hit the issue again when testing with v4.4-dev-2359-g58022f859940.
E (34329) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
abort() was called at PC 0x400ffed4 on core 0
Backtrace:0x400d521e:0x3ffb0af00x400890b5:0x3ffb0b10 0x4008fb5a:0x3ffb0b30 0x400ffed4:0x3ffb0ba0 0x40082b0d:0x3ffb0bc0 0x4000bfed:0x3ffca030 0x4008c8fa:0x3ffca040 0x4008201f:0x3ffca060 0x40083eab:0x3ffca080 0x40084e21:0x3ffca0a0 0x40084e5b:0x3ffca0c0 0x40087785:0x3ffca0e0 0x4008447a:0x3ffca100 0x4008495e:0x3ffca120 0x400fca7b:0x3ffca170 0x400fb8ea:0x3ffca190 0x400e6d56:0x3ffca1b0 0x400e7015:0x3ffca1d0 0x400e0de7:0x3ffca210 0x400da642:0x3ffca2b0 0x4019f6ce:0x3ffca2d0 0x4019f74c:0x3ffca320
0x400890b5: esp_system_abort at /home/axel/esp/esp-idf/components/esp_system/esp_system.c:129
0x4008fb5a: abort at /home/axel/esp/esp-idf/components/newlib/abort.c:46
0x400ffed4: task_wdt_isr at /home/axel/esp/esp-idf/components/esp_system/task_wdt.c:184 (discriminator 3)
0x40082b0d: _xt_lowint1 at /home/axel/esp/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1105
0x4008c8fa: vPortExitCritical at /home/axel/esp/esp-idf/components/freertos/port/xtensa/port.c:476
0x4008201f: vPortExitCriticalSafe at /home/axel/esp/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:231
0x40083eab: spi_flash_enable_interrupts_caches_and_other_cpu at /home/axel/esp/esp-idf/components/spi_flash/cache_utils.c:206
0x40084e21: cache_enable at /home/axel/esp/esp-idf/components/spi_flash/spi_flash_os_func_app.c:71
0x40084e5b: spi1_end at /home/axel/esp/esp-idf/components/spi_flash/spi_flash_os_func_app.c:119
0x40087785: spiflash_end_default at /home/axel/esp/esp-idf/components/spi_flash/esp_flash_api.c:129
0x4008447a: flash_end_flush_cache at /home/axel/esp/esp-idf/components/spi_flash/esp_flash_api.c:167
0x4008495e: esp_flash_erase_region at /home/axel/esp/esp-idf/components/spi_flash/esp_flash_api.c:546
0x400fca7b: esp_partition_erase_range at /home/axel/esp/esp-idf/components/spi_flash/partition.c:531
0x400fb8ea: esp_ota_write at /home/axel/esp/esp-idf/components/app_update/esp_ota_ops.c:196 |
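The Backtrace lines in the two reports above are lists of PC:SP pairs, and one pair in the second trace runs straight into the next without a space. As a rough sketch for anyone who wants to re-symbolize such a line themselves (for example with xtensa-esp32-elf-addr2line against the matching ELF), the helper below pulls out only the PC values. The function name `extract_pcs` and the rule "a PC is an 8-digit hex address immediately followed by `:0x`" are assumptions made for this illustration, not part of ESP-IDF.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value; the caller guarantees isxdigit(c). */
static uint32_t hex_value(char c)
{
    if (c >= '0' && c <= '9') return (uint32_t)(c - '0');
    if (c >= 'a' && c <= 'f') return (uint32_t)(c - 'a' + 10);
    return (uint32_t)(c - 'A' + 10);
}

/*
 * Hypothetical helper: scan a "Backtrace:PC:SP PC:SP ..." line and store
 * the PC values in out[]. Returns the number of PCs stored (at most max_out).
 * An address counts as a PC only if it has exactly 8 hex digits and is
 * followed by ":0x", so stack pointers and decoded "0x...: symbol" lines
 * are skipped. Reading at most 8 digits also copes with pairs that are
 * glued together without whitespace.
 */
size_t extract_pcs(const char *s, uint32_t *out, size_t max_out)
{
    size_t n = 0;

    assert(s != NULL && out != NULL);

    while (*s != '\0' && n < max_out) {
        if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
            const char *p = s + 2;
            uint32_t value = 0;
            int digits = 0;

            while (digits < 8 && isxdigit((unsigned char)*p)) {
                value = (value << 4) | hex_value(*p);
                p++;
                digits++;
            }
            if (digits == 8 && p[0] == ':' && p[1] == '0' &&
                (p[2] == 'x' || p[2] == 'X')) {
                out[n++] = value;
            }
            s = p; /* continue right after the digits that were consumed */
        } else {
            s++;
        }
    }
    return n;
}
```

Each returned value can then be printed with `"0x%08x"` and handed to the symbolizer; the stack pointers are not needed for that step.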
I have encountered a similar issue. When I run OTA, I get a task WDT in the ipc0 task. I have only 3 of these custom prototype boards using ESP32-WROVER, and this CPU crash occurs on 2 of the 3. The 3rd one runs the OTA without a hitch every single time. Below are the IDF debug printouts, first for OTA FAILURE and then for OTA SUCCESS (a different device). I've included the boot-up prints to show that the processor silicon version etc. is exactly the same. I know that esp_ota_begin erases the partition internally, but I tried calling the erase separately to see if that helps; the result is still the same. Any help would be greatly appreciated, as this project is on a very strict deadline for the end of Feb 2022.
OTA FAILURE
OTA SUCCESS
|
I noticed that all my CPU crash reports had a common theme: a task WDT failure in the IPC task. I started browsing IPC-related issues on the ESP-IDF repo and found this one. I tried running my application in single-core mode, and the problem went away. PLEASE can the ESP-IDF gurus have an in-depth look at SPI handling where IPC is involved? It seems like there is a serious issue there. Although single-core mode helps temporarily, I suspect that as our project grows we will soon run out of processing power and will definitely need the second CPU online. So please let me know if there is any way I can help speed up the root cause discovery. |
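For readers who want to try the same experiment: in ESP-IDF, single-core operation is normally selected with the FreeRTOS Kconfig option shown below (via menuconfig or sdkconfig.defaults). This is an assumption about which setting was used, since the comment above does not name it.

```
CONFIG_FREERTOS_UNICORE=y
```

With this set, the second core is not started at all, which is why it only works as a stopgap when the application later needs both CPUs.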
My original report does not use Bluetooth at all, so this is probably different from #3923.
No engineer has been involved in this thread or in #3923 for many months, so I don't expect this to be fixed soon. |
How can we escalate this to bring it to their attention? |
I spoke to another developer with extensive ESP32 experience about this. He said he has never experienced issues like this, BUT he has the habit of pinning tasks to cores instead of letting them float to the first available core (so instead of xTaskCreate he uses xTaskCreatePinnedToCore). Furthermore, his strategy is to pin all BLE, WiFi and flash-intensive tasks to core 0 and the rest to core 1, and to leave trivial processing tasks unpinned (thus
His exact message to me in a Skype chat was:
I tried it like that and it worked!!! So my project is now back to dual-core operation :) |
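The placement strategy described in this comment can be sketched roughly as follows. The `work_kind_t` categories, `core_for`, `spawn_task` and the `CORE_ANY` macro are hypothetical names introduced only for this illustration; the one real ESP-IDF call is `xTaskCreatePinnedToCore`, with `tskNO_AFFINITY` as the "unpinned" core ID. The ESP-IDF part is guarded by `ESP_PLATFORM` so the policy function can also be compiled and checked on a host. This only shows the workaround, it does not address the underlying watchdog problem.

```c
#include <assert.h>
#include <stddef.h>

#ifdef ESP_PLATFORM
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#define CORE_ANY tskNO_AFFINITY   /* real ESP-IDF value for "no core affinity" */
#else
#define CORE_ANY (-1)             /* host-build placeholder, not an ESP-IDF value */
#endif

/* Hypothetical task categories, following the comment's description. */
typedef enum {
    WORK_RADIO,    /* BLE / WiFi intensive tasks */
    WORK_FLASH,    /* flash intensive tasks, e.g. the OTA writer */
    WORK_APP,      /* the rest of the application tasks */
    WORK_TRIVIAL   /* light processing that may float between cores */
} work_kind_t;

/* Map a task category to the core it should be pinned to. */
int core_for(work_kind_t kind)
{
    switch (kind) {
    case WORK_RADIO:
    case WORK_FLASH:
        return 0;          /* radio and flash work stays on core 0 */
    case WORK_APP:
        return 1;          /* everything else goes to core 1 */
    case WORK_TRIVIAL:
    default:
        return CORE_ANY;   /* left unpinned */
    }
}

#ifdef ESP_PLATFORM
/* Hypothetical wrapper: create a task on the core chosen by core_for().
 * Note that ESP-IDF takes the stack size in bytes. */
BaseType_t spawn_task(TaskFunction_t fn, const char *name,
                      uint32_t stack_bytes, UBaseType_t priority,
                      work_kind_t kind)
{
    assert(fn != NULL && name != NULL);
    return xTaskCreatePinnedToCore(fn, name, stack_bytes, NULL,
                                   priority, NULL, core_for(kind));
}
#endif
```

On target, a call such as `spawn_task(ota_task, "ota", 8192, 5, WORK_FLASH)` (with `ota_task` being the application's own task function) would then keep the OTA writer on core 0.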
I would suggest doing the reverse and checking whether that makes the issue easier to reproduce, so it can then be fixed properly. (If it is easy to reproduce, there is a better chance of it getting fixed.)
With the above workaround you may simply be less likely to hit the issue. |
Environment
Module or chip used: ESP32-WROOM-32E
IDF version: v4.3-242-g1f7172dbf968
Build System: idf.py
Compiler version: xtensa-esp32-elf-gcc (crosstool-NG esp-2021r1) 8.4.0
Operating System: Linux
Power Supply: USB
CONFIG_SPI_FLASH_YIELD_DURING_ERASE=y
CONFIG_SPI_FLASH_ERASE_YIELD_DURATION_MS=20
CONFIG_SPI_FLASH_ERASE_YIELD_TICKS=1
CONFIG_SPI_FLASH_WRITE_CHUNK_SIZE=8192
Problem Description
The device hits TG1WDT_SYS_RESET during OTA.
It's not easy to reproduce, but this happens occasionally.
ets Jun 8 2016 00:22:57
rst:0x8 (TG1WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:4
load:0x3fff0034,len:6012
load:0x40078000,len:13964
ho 0 tail 12 room 4
load:0x40080400,len:3632
0x40080400: _init at ??:?
entry 0x4008064c
W (183) boot.esp32: PRO CPU has been reset by WDT.
W (184) boot.esp32: WDT reset info: PRO CPU PC=0x4008d0fa
0x4008d0fa: spi_flash_hal_erase_block at /home/axel/esp/esp-idf/components/hal/spi_flash_hal_iram.c:65
W (185) boot.esp32: WDT reset info: APP CPU PC=0x4008176e
0x4008176e: start_cpu_other_cores_default at /home/axel/esp/esp-idf/components/esp_system/startup.c:231
I (654) cpu_start: Pro cpu up.
I (654) cpu_start: Starting app cpu, entry point is 0x40081394
0x40081394: call_start_cpu0 at /home/axel/esp/esp-idf/components/esp_system/port/cpu_start.c:405
I (0) cpu_start: App cpu up.
I (747) cpu_start: Pro cpu start user code
I (747) cpu_start: cpu freq: 240000000
...