Internal RC oscillator issues with 3.6V (IDFGH-11064) #12240

Closed
3 tasks done
peter3099 opened this issue Sep 13, 2023 · 73 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug bugs in IDF

Comments

@peter3099

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v5.1.1

Operating System used.

Windows

How did you build your project?

VS Code IDE

If you are using Windows, please specify command line type.

PowerShell

What is the expected behavior?

Hello

We are currently seeing an intermittent issue with the ESP32-WROVER-E: with the internal 150 kHz RC oscillator as the RTC clock, the time the ESP32 takes to wake up from deep sleep changes, becoming either shorter or longer than configured.

The only two things we changed in our design were moving from IDF 4.4 to 5.1.1 and removing the voltage regulator from the ESP32 supply. We now power it with 3.6V directly from a disposable lithium battery, with a supercapacitor in parallel to help with current demand.

In the previous design, with ESP-IDF v4.4 and a regulated 3.3V supply for the ESP32, we flashed 100 boards and saw no issues with the internal RC oscillator.

Now, with the new design, after flashing 30 boards, 4 of them show this issue, where the ESP32 either never wakes up from deep sleep or wakes up far too often. The issue doesn't appear from the start; it usually gets worse over time.

Does the VCC used to power the ESP32 alter the internal RC oscillation? Or could it be an issue with IDF 5.1.1?

Thanks

What is the actual behavior?

ESP32 doesn't wake up from deep sleep at the correct time.

Steps to reproduce.

Supply the ESP32 with 3.6V.
Intermittent issue where deep sleep doesn't wake up at the correct time and the RTC loses its value.

Build or installation Logs.

No response

More Information.

No response

@peter3099 peter3099 added the Type: Bug bugs in IDF label Sep 13, 2023
@espressif-bot espressif-bot added the Status: Opened Issue is new label Sep 13, 2023
@github-actions github-actions bot changed the title Internal RC oscillator issues with 3.6V Internal RC oscillator issues with 3.6V (IDFGH-11064) Sep 13, 2023
@peter3099
Author

image

image

@esp-wzh
Collaborator

esp-wzh commented Sep 16, 2023

Hi, @peter3099 ,

I tested a few chips that kept sleeping and waking up at 300 s intervals for one day, and they didn't reproduce the problem you're experiencing. (I used a constant 3.6V power supply with no LDO in my test flow, and tested with the IDF deep sleep wake stub example.)

I have the following questions to confirm with you:

  1. Can you confirm that the problem is with the 150 kHz RC clock? That is, if you configure the slow clock source to the external 32 kHz crystal or the 8.5 MHz oscillator, is the problem no longer triggered?

  2. Similarly, can you rule out the effects of power supply differences?

  3. What is the approximate temperature in the environment where the faulty ESP32s are working?

thanks.

@peter3099
Author

Hi @esp-wzh

Thanks for doing the tests

Answers to your questions

  1. Yes, only with the 150 kHz RC clock. When I switch to 8.5 MHz the issue disappears, but it consumes more current in deep sleep.
  2. The power supply drops from 3.6V to 3.4V during measurement and transmission, as there is some fluctuation when the LoRa module sends data and draws more current.
  3. The same as the ones that are not faulty, between 19 °C and 25 °C.

It might be that they are faulty, but what seemed strange to me is that I didn't get any of these issues in the previous 100-board iteration, and now out of 40 I get 4 with a similar issue.

Please let me know if there is anything you need from me.
Thanks and have a nice weekend.

@peter3099
Author

I must add that I get the issue with the batteries connected and also with the 3.6V power supply (which shouldn't see any voltage drop during transmission).

@esp-wzh
Collaborator

esp-wzh commented Sep 17, 2023

Hi~ @peter3099
I compared the changes between v4.4.1 and v5.1.1 regarding the esp32 hardware sleep parameters, and also compared the register configurations that are dumped from RTC_CNTL_REG at the moment of entering sleep. It was confirmed that there are no changes related to this issue. So it's unlikely that the problem was introduced by the new release.

  1. Are you using the same chip batch in both iterations of your board?
  2. If you roll back the firmware of an ESP32 that has the problem to v4.4, does the problem still appear?
  3. Can you simplify the reproduction case to rule out more software configuration factors?

If this issue is blocking your development, could you contact our commercial team and send back some of the faulty chips for us to test?

@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Opened Issue is new labels Sep 17, 2023
@esp-wzh
Collaborator

esp-wzh commented Sep 18, 2023

@peter3099 I discussed this issue with the analog team and confirmed that the supply voltage does affect the RC oscillator circuit. The supply voltage of esp32 should be between 2.3V and 3.6V. It is not recommended to use such a boundary voltage as a normal supply voltage, because the function of the chip cannot be guaranteed once the voltage exceeds this threshold.
And as far as I know, the voltage of most lithium batteries when fully charged will be higher than the nominal voltage. Can you use an oscilloscope to confirm the power supply quality?
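
A rough host-side sketch of why a supply-dependent RC frequency would show up as wrong sleep durations. It assumes the requested sleep time is converted to slow-clock ticks using the frequency measured while the chip is awake, and that the oscillator may then run at a different frequency during sleep. The function names and the 140 kHz / 160 kHz figures are illustrative assumptions, not ESP-IDF code or measurements from this thread.

```c
#include <stdint.h>

/* Ticks the RTC timer is asked to count for a requested sleep, assuming the
 * slow clock runs at the frequency measured during calibration. */
uint64_t sleep_ticks(uint64_t sleep_us, uint32_t calibrated_hz)
{
    return sleep_us * calibrated_hz / 1000000ULL;
}

/* Wall-clock milliseconds that actually pass while those ticks are counted
 * if the oscillator really runs at actual_hz during sleep. */
uint64_t actual_sleep_ms(uint64_t ticks, uint32_t actual_hz)
{
    return ticks * 1000ULL / actual_hz;
}
```

With a 300 s request calibrated at 150 kHz, the timer counts 45,000,000 ticks. If the oscillator slowed to 140 kHz during sleep the device would wake after about 321 s, and at 160 kHz after about 281 s. Drift of this kind shifts the wake-up time gradually; on its own it would not explain a wake-up that never arrives.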

@peter3099
Author

Hi @esp-wzh

Thanks for all this. We are using this battery:
https://www.tme.eu/Document/a73a7427149d2fcfd65fa1d91281179d/EVE-ER14505_S.pdf

The voltage stays flat at 3.6V until the battery dies. It is not rechargeable. With the oscilloscope I saw the voltage close to 3.67V on the ones that have more charge, but it is never above 3.67V. Could it be that, being so close to the maximum voltage, some boards are more susceptible than others and react differently?

Yes, I am using the same chip batch as before. We bought 1000 of them.

I cannot go back to 4.4.1 because of the breaking changes in v5 and because the LoRa module on the board is new, so the commands are different.

What are the contact details for the commercial team? Regards

@esp-wzh
Collaborator

esp-wzh commented Sep 18, 2023

> I must add that I get the issue with the batteries connected and also with the 3.6V power supply (which shouldn't see any voltage drop during transmission)

For the chip that has the problem on a 3.6V power supply, can you step the voltage down to 3.3V and confirm whether the problem still exists?

If you find that the chip still has this issue at voltages lower than 3.6V, you can report the issue via the link below
https://www.espressif.com/en/contact-us/technical-inquiries/hardware-issues

@peter3099
Author

Ok. I will try that. BTW this is my RTC menuconfig details
image

@esp-wzh
Collaborator

esp-wzh commented Sep 18, 2023

The config LGTM, I believe this problem is not caused by software : )

@peter3099
Author

Cool, I'll keep you posted on the test with 3.3V on the failing ESP32.

@peter3099
Author

Hi @esp-wzh, I can confirm the issue still happens with the regulated 3.3V supply on the faulty ESP32s. The question is: were they damaged by the 3.6V, or were they already faulty?

@peter3099
Author

I already started a commercial claim

@peter3099
Author

More than a claim, a request :)

@peter3099
Author

Hi @esp-wzh

To avoid this problem, do you recommend using an external crystal? Will the consumption be close to the consumption with the internal RTC clock?

Because these guys are reporting something similar
#11939

and they are using an external crystal.

Thanks

@peter3099
Author

Hi @esp-wzh, just to let you know that I found 2 more devices in the batch of 40 ESP32s with the same issue.

They should sample every 5 minutes. Have a look at the time between 16:00 and 16:35:

image

That number is the count of motion events detected by the ULP during deep sleep. It is reset every time the device samples. The usual maximum is 15 (1 motion every 20 seconds over a 300-second sample interval). I know the device never woke up because I see the number 37, which means it kept counting because it never woke up to reset it.
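
The reasoning from the motion count can be turned into a rough lower bound on how many wake-ups were skipped. This is only a sketch: it assumes the ULP counts at most one motion per 20 s, as described above, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Lower bound on the seconds elapsed since the counter was last reset,
 * assuming consecutive motion events are at least min_spacing_s apart. */
uint32_t min_elapsed_s(uint32_t count, uint32_t min_spacing_s)
{
    return count == 0 ? 0 : (count - 1) * min_spacing_s;
}

/* Lower bound on the number of scheduled wake-ups that did not happen. */
uint32_t min_missed_wakeups(uint32_t count, uint32_t min_spacing_s,
                            uint32_t interval_s)
{
    return min_elapsed_s(count, min_spacing_s) / interval_s;
}
```

Under those assumptions a count of 37 means at least 720 s passed without a reset, so at least two 300 s wake-ups were missed, while a count of 15 is consistent with no missed wake-up.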

The consumption I get using 3.6V is much lower than with 3.3V, as I avoid the losses in the LDO due to the voltage drop. I would like to keep using this setup, but I'm concerned about this kind of issue.

@esp-wzh
Collaborator

esp-wzh commented Sep 21, 2023

@peter3099 Thank you for your continued reporting.

Yeah, you can try using an external 32 kHz crystal. Compared to the 8M RC it does not consume much power, typically < 5uA.

One other thing I want to confirm: if you do not use the ULP during sleep, does the problem recur? How often does it recur on a faulty ESP32, and does it recur with shorter sleep intervals?

@peter3099
Author

Hi @esp-wzh

Let me check that. Is the consumption with the external 32 kHz crystal better than with the internal RC?

I will check the ULP thing and come back to you with some details.

@peter3099
Author

Confirmed, the issue keeps happening with the ULP. If the sampling frequency is higher, it happens less often. Sometimes it does weird things like this:

image

@peter3099
Author

peter3099 commented Sep 22, 2023

Hi @esp-wzh

Could it be that this issue with the RTC is causing problems with UART1 communication? I have a module connected to UART1, and sometimes I get random errors where the module doesn't understand the commands I'm sending. On the ESP32s that have this RTC issue, this random error happens more often, almost all the time.

void setup_rak_uart(void)
{

  /* Configure parameters of an UART driver,
   * communication pins and install the driver */
  uart_config_t uart_config = {
      .baud_rate = 115200,
      .data_bits = UART_DATA_8_BITS,
      .parity = UART_PARITY_DISABLE,
      .stop_bits = UART_STOP_BITS_1,
      .flow_ctrl = UART_HW_FLOWCTRL_DISABLE,
      .source_clk = UART_SCLK_REF_TICK,
      // .source_clk = UART_SCLK_APB,
      // .source_clk = UART_SCLK_DEFAULT,
  };
  // Install UART driver, and get the queue.
  // uart_driver_install(UART_NUM_1, BUF_SIZE * 2, BUF_SIZE * 2, 200, &uart1_queue, 0);
  ESP_ERROR_CHECK(uart_driver_install(UART_NUM_1, BUF_SIZE * 2, BUF_SIZE, 20, &uart1_queue, 0));
  ESP_ERROR_CHECK(uart_param_config(UART_NUM_1, &uart_config));

  // Set the UART1 TX/RX pins; RTS and CTS are left unchanged.
  ESP_ERROR_CHECK(uart_set_pin(UART_NUM_1, RAK_TX_P2, RAK_RX_P1, UART_PIN_NO_CHANGE, UART_PIN_NO_CHANGE));
  ESP_ERROR_CHECK(uart_flush(UART_NUM_1));
  // ESP_LOGI(TAG, "Lora UART Activated at %" PRId64 "ms", esp_timer_get_time() / 1000);
}

This is the configuration for my UART. Notice that I only got it working with source_clk = UART_SCLK_REF_TICK.

Just wondering if the issues are related.

All my devices are 3.6V capable

@esp-wzh
Collaborator

esp-wzh commented Sep 22, 2023

Hi, @peter3099, I don't think they are related; the two modules belong to different digital domains.

I recommend that you dump the raw data of the UART1 communication and compare it with the data captured by a logic analyzer on the UART TX/RX pins.

Also note that if you have enabled esp_pm or automatic light sleep, you need to acquire the no_lightsleep power lock during UART transmission to avoid entering light sleep automatically.
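
A minimal sketch of holding a no-light-sleep lock around a UART1 write, assuming power management is enabled in the project (CONFIG_PM_ENABLE). The esp_pm and UART calls are the public ESP-IDF ones; rak_uart_lock_init(), rak_uart_send() and the 100 ms timeout are hypothetical names and values chosen for illustration. The #else branch contains host stand-ins only, so that the control flow can be compiled and checked off-target; they are not the ESP-IDF implementations.

```c
#include <stddef.h>

#ifdef ESP_PLATFORM
#include "esp_err.h"
#include "esp_pm.h"
#include "driver/uart.h"
#include "freertos/FreeRTOS.h"
#else
/* Host stand-ins (NOT ESP-IDF code): just enough to build off-target. */
typedef int esp_err_t;
#define ESP_OK 0
#define UART_NUM_1 1
#define pdMS_TO_TICKS(ms) (ms)
typedef enum { ESP_PM_NO_LIGHT_SLEEP } esp_pm_lock_type_t;
typedef int *esp_pm_lock_handle_t;
static int host_lock_depth;
static esp_err_t esp_pm_lock_create(esp_pm_lock_type_t type, int arg,
                                    const char *name, esp_pm_lock_handle_t *out)
{ (void)type; (void)arg; (void)name; *out = &host_lock_depth; return ESP_OK; }
static esp_err_t esp_pm_lock_acquire(esp_pm_lock_handle_t h) { ++*h; return ESP_OK; }
static esp_err_t esp_pm_lock_release(esp_pm_lock_handle_t h) { --*h; return ESP_OK; }
static int uart_write_bytes(int port, const void *src, size_t size)
{ (void)port; (void)src; return (int)size; }
static esp_err_t uart_wait_tx_done(int port, unsigned ticks)
{ (void)port; (void)ticks; return ESP_OK; }
#endif

static esp_pm_lock_handle_t s_uart_lock;

/* Create the lock once, e.g. right after the UART driver is installed. */
esp_err_t rak_uart_lock_init(void)
{
    return esp_pm_lock_create(ESP_PM_NO_LIGHT_SLEEP, 0, "rak_uart", &s_uart_lock);
}

/* Send a command while automatic light sleep is blocked.
 * Returns the number of bytes queued, or -1 if the lock cannot be taken. */
int rak_uart_send(const char *cmd, size_t len)
{
    if (esp_pm_lock_acquire(s_uart_lock) != ESP_OK) {
        return -1;
    }
    int written = uart_write_bytes(UART_NUM_1, cmd, len);
    uart_wait_tx_done(UART_NUM_1, pdMS_TO_TICKS(100)); /* let the FIFO drain */
    esp_pm_lock_release(s_uart_lock);
    return written;
}
```

If the module's reply must also be received reliably, the lock would probably need to stay held until the response has been read, not only until the transmit FIFO is empty.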

@peter3099
Author

Hi @esp-wzh

I've disabled this
image

Now all my problems with the communication are gone :)
sdkconfig.zip

The RTC issue is still happening, but it looks like it happens less often when the sampling frequency is higher.

My sdkconfig is attached, just in case it helps.

Thanks

@peter3099
Author

Some other ESPs do this

image

It says it is sleeping for 300 seconds, but it only sleeps for a minute or less...

@esp-wzh
Collaborator

esp-wzh commented Sep 25, 2023

Does it happen on all chips? Which API are you using to calculate the sleep duration?

@peter3099
Author

So far so good
image

🤞

@esp-wzh
Collaborator

esp-wzh commented Oct 12, 2023

@peter3099 That's good news! Changing dbg_atten_slp to 0 will increase the sleep current by about 5uA. If you accept this solution, you can also try other values: between 0 and 3, the bigger the value, the lower the power consumption, but it looks like the risk of losing wake-ups is also higher on your side.

In our QA testing, we have tested a huge number of chips to confirm that the value of RTC_CNTL_DBG_ATTEN_DEFAULT=0 is safe for RC8M powerdown deepsleep.

And if it's due to insufficient voltage, the wakeup should be stuck but not lost, so I think this parameter is not the direct cause of this problem.

I would like to know: after each wakeup, how do you configure the next wakeup time in your application? With the deep sleep wake stub or the ULP? Can you make sure that the configuration is correct?

One more request: can you give the MAC info for a few chips with this problem? You can dump it with the command esptool.py -p COMx chip_id.

@peter3099
Author

Hi @esp-wzh

Thanks for this, it looks like it is solving the issue.

Regarding this:

> In our QA testing, we have tested a huge number of chips to confirm that the value of RTC_CNTL_DBG_ATTEN_DEFAULT=0 is safe for RC8M powerdown deepsleep.

Do you mean for 150kHz RC?

After each wake-up, once my routine has finished and before calling esp_deep_sleep_start(), I call
ESP_ERROR_CHECK(esp_sleep_enable_timer_wakeup(deepSleepTime * mS_TO_uS_FACTOR));

to set my next deep sleep time.
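
One thing worth checking in a multiplication like the one above is its width. The types of deepSleepTime and mS_TO_uS_FACTOR are not shown in the thread, so the following is only a sketch under the assumption that both could be 32-bit: esp_sleep_enable_timer_wakeup() takes a 64-bit microsecond value, but a product computed in 32 bits wraps before it is widened. The helper names are hypothetical.

```c
#include <stdint.h>

#define MS_TO_US_FACTOR 1000ULL /* 64-bit constant forces 64-bit arithmetic */

/* Safe conversion: the multiplication is carried out in 64 bits. */
uint64_t sleep_ms_to_us(uint32_t sleep_ms)
{
    return (uint64_t)sleep_ms * MS_TO_US_FACTOR;
}

/* For comparison: what a purely 32-bit multiplication would yield. */
uint32_t sleep_ms_to_us_32bit(uint32_t sleep_ms)
{
    return (uint32_t)(sleep_ms * (uint32_t)1000); /* wraps modulo 2^32 */
}
```

For the 300 s interval used here both versions give 300,000,000 µs, so this would not explain the behaviour in this thread. It starts to matter only for sleeps longer than roughly 71 minutes with unsigned 32-bit arithmetic, or about 35 minutes with signed.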

When I run esptool.py -p COMx chip_id I get nothing:
image

Can I make this change to dbg_atten_slp without modifying my ESP-IDF files? At the moment I just made a change in rtc_sleep.c. Or are you going to implement this fix in the next release?

Thanks

@esp-wzh
Collaborator

esp-wzh commented Oct 12, 2023

@peter3099

> Do you mean for 150kHz RC?

Yes. If rtc_clk uses the 150 kHz RC and no other peripheral uses RC8M during sleep, RC8M will power down automatically.

> After each wake up and after finalizing my routine before calling esp_deep_sleep_start()
> I do ESP_ERROR_CHECK(esp_sleep_enable_timer_wakeup(deepSleepTime * mS_TO_uS_FACTOR));

Can you print the wakeup cause and the value of deepSleepTime each time?

> When I do esptool.py -p COMx chip_id I get nothing

Sorry, my fault, it should be esptool -p COMx chip_id in Windows PowerShell.

> Can I do this change to dbg_atten_slp without modifying my ESP-IDF files? At the moment I just made a change on rtc_sleep.c. Or are you going to implement this fix in the next release?

Until the exact cause and explanation of this problem are found, this change cannot be merged into master, as the current value of the parameter has been working stably for many years.

@peter3099
Author

I get this
image

OK, so the only way for me to apply it is to change rtc_sleep.c? Or is there a command with which I can modify dbg_atten_slp?

Regarding the wake-up reason and the value of deepSleepTime, I will implement that and show you the results, but I did log it in the past:
image

If I press the button when it doesn't wake up, I get those weird values.

I will add deepSleepTime to the check.

@esp-wzh
Collaborator

esp-wzh commented Oct 12, 2023

For the esptool issue, please run it in the IDF Python environment, or run pip install esptool first. See espressif/esptool#777 (comment).

> Ok, so the only way for me to adapt it is to change the rtc_sleep.c? Or is there any command where I can modify dbg_atten_slp?

Yes, you must edit the IDF source code manually, because this part of the code was not designed to be configurable by the user.

> if I pressed the button when it doesn't wake up then I get those weird values

Can you provide a minimal project to reproduce this problem?

@peter3099
Author

Hi @esp-wzh

I can confirm the issue is resolved by changing dbg_atten_slp. I will try to get you a piece of the code so you can try it.

So basically, does RTC_CNTL_DBG_ATTEN_DEFAULT=0 select the 150 kHz oscillator and turn the other ones off? Or does it do something else?

Thanks

@esp-wzh
Collaborator

esp-wzh commented Oct 16, 2023

Hi! @peter3099

You can think of this register as controlling the power drive capability of one of the analog circuits inside the chip. You can choose a value between 0 and 3; the larger the value, the weaker the drive capability. Usually the circuit doesn't need a strong drive capability if the 8M RC is turned off during sleep, so it is configured to 3. If the 8M RC is kept on during sleep, the drive capability needs to be increased, so it is configured to 0. But it seems that on your chip, even sleep with only the 150 kHz clock requires a strong drive capability to work properly.
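
The selection rule described above can be written down as a small model. This is not the ESP-IDF source: the names are hypothetical, and the force_strong_drive flag stands for the local workaround discussed in this thread (always using 0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two ends of the 0..3 range described above. */
enum {
    MODEL_DBG_ATTEN_STRONGEST = 0, /* strongest drive, higher sleep current */
    MODEL_DBG_ATTEN_WEAKEST   = 3  /* weakest drive, lowest sleep current */
};

/* Pick the sleep attenuation: strong drive when the 8M RC stays on during
 * sleep, or when the caller forces it as a workaround; weak drive otherwise. */
uint8_t model_dbg_atten_slp(bool rc8m_on_in_sleep, bool force_strong_drive)
{
    if (rc8m_on_in_sleep || force_strong_drive) {
        return MODEL_DBG_ATTEN_STRONGEST;
    }
    return MODEL_DBG_ATTEN_WEAKEST;
}
```

In this model the default 150 kHz RC case corresponds to model_dbg_atten_slp(false, false), which yields 3, and the patched behaviour corresponds to model_dbg_atten_slp(false, true), which yields 0.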

BTW, do you have any other questions about reading the chip MAC info? It will help us trace the process information of the chips you have, to help determine whether this is a software issue.

@peter3099
Author

I forgot about the chip mac, let me try again. Thanks for the info

@peter3099
Author

MAC IDs of 3 of the ESP32s with the RTC issue:
macId1 = 70b8f6156b70
macId2 = 70b8f6156014
macId3 = 70b8f6156b74

@peter3099
Author

Hi @esp-wzh, did anything come back from the MAC IDs?

@peter3099
Author

Here are my tests of deep sleep consumption for each dbg_atten_slp value:
Value 0
image

Value 1
image

image

Value 2
image

Value 3
image

@esp-wzh
Collaborator

esp-wzh commented Oct 26, 2023

Hi, @peter3099
Thank you for the information, but according to the production test data there is nothing unusual about these 3 chips. Power consumption, process corner and other related parameters are perfect.

Another point that needs to be clarified about the rtc_timer wake-up mechanism: after each wakeup, the CPU needs to read the current rtc_timer tick value and add the wakeup period to it to configure the next wakeup. So, theoretically, if one wakeup is lost, all subsequent wakeups should also be lost.
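
The re-arming argument can be illustrated with a small host-side model. It is a simplification under stated assumptions: the alarm is treated as one-shot, the next target is always "current tick plus period", and a "lost" wake-up means the alarm time passed without the CPU starting. All names are hypothetical and none of this is ESP-IDF code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t now_ticks;   /* free-running RTC counter */
    uint64_t alarm_ticks; /* next wake-up target */
    bool     alarm_armed;
} model_rtc_t;

/* What the firmware does before sleeping: target = now + period. */
void model_arm(model_rtc_t *rtc, uint64_t period_ticks)
{
    rtc->alarm_ticks = rtc->now_ticks + period_ticks;
    rtc->alarm_armed = true;
}

/* Advance one period. Returns true if the CPU woke up and re-armed. */
bool model_sleep_one_period(model_rtc_t *rtc, uint64_t period_ticks, bool drop_event)
{
    rtc->now_ticks += period_ticks;
    if (!rtc->alarm_armed || rtc->now_ticks < rtc->alarm_ticks) {
        return false;
    }
    rtc->alarm_armed = false;       /* one-shot alarm has expired */
    if (drop_event) {
        return false;               /* wake-up lost: firmware never runs */
    }
    model_arm(rtc, period_ticks);   /* firmware runs and schedules the next one */
    return true;
}

/* Count wake-ups over a number of periods when one event is dropped. */
unsigned model_count_wakeups(unsigned periods, unsigned lost_index, uint64_t period_ticks)
{
    model_rtc_t rtc = {0};
    unsigned woke = 0;
    model_arm(&rtc, period_ticks);
    for (unsigned i = 0; i < periods; i++) {
        if (model_sleep_one_period(&rtc, period_ticks, i == lost_index)) {
            woke++;
        }
    }
    return woke;
}
```

In this model, dropping the fourth of ten wake-ups leaves only three in total, because nothing re-arms the alarm afterwards. A device that skips one interval and then resumes its normal schedule does not fit this model, which suggests looking at what else could wake or reset it.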

We would still appreciate it if you could provide a minimal reproducible project or send back a setup in which the problem can be reproduced reliably.

thanks.

@peter3099
Author

Hi @esp-wzh thanks for that! I understand, I'm trying to minimize our sleep code so I can send it to you.

One thing I noticed is that with dbg_atten_slp values of 1 and 2 I get random wake-ups in between. Normally the device wakes up every 5 min, but in some cases, for these 2 devices, I see random wake-ups 1 min apart; the rest are OK and occur every 5 min. Are 1 and 2 valid values?

Also, with the value 3 it doesn't look like I'm losing samples anymore. I will keep testing it. Thanks

@peter3099
Author

In terms of hardware design, do you think we need to go for an external RTC, or should the internal one be reliable enough? We don't need super precision; I just need it to wake up when it needs to wake up :)

@esp-wzh
Collaborator

esp-wzh commented Oct 26, 2023

As previously described, the 0 to 3 control of the internal power drive capability is linear. If 0 and 3 work and 1 and 2 do not, then I suspect that the current test data doesn't prove a significant correlation between power drive capability and the wake-up loss issue.

If your application can accept the increased cost of an additional crystal, it is recommended to use one; it is indeed more power efficient and stable than the internal clocks.

@peter3099
Author

perfect thanks man!

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: In Progress Work is in progress labels Dec 1, 2023
@esp-wzh
Collaborator

esp-wzh commented Dec 1, 2023

@peter3099 If there are no more problems, please help close this issue, thanks!

@peter3099
Author

Thanks @esp-wzh

@peter3099 peter3099 reopened this Dec 4, 2023
@peter3099
Author

Hi @esp-wzh, sorry for reopening this, but I wanted to know the status of this in IDF 5.1.2. Do I still need to set out_config->dbg_atten_slp = RTC_CNTL_DBG_ATTEN_NODROP?

Thanks

@esp-wzh
Collaborator

esp-wzh commented Dec 4, 2023

Hi, @peter3099. According to the data from our batch testing, there is no evidence that the current IDF default parameters cause the failure you encountered, so I am sorry that I cannot push for changing this parameter in master. If your tests prove that modifying this value does indeed fix the problem you encountered, you can maintain the patch yourself. What we can confirm is that changing the RTC_CNTL_DBG_ATTEN_DEFAULT value to 0 will not have any negative effects other than increased power consumption.

@peter3099
Author

Perfect, thanks. So far I haven't seen any power increase when using 0 or 3. The issue with deep sleep disappeared, but the power looks the same as before...

@Alvin1Zhang
Collaborator

Thanks for sharing the updates, feel free to reopen.

@peter3099
Author

Hi

Just to confirm: this still happens with 5.1.2. I will try to make the same changes to the IDF to see if that solves the issue.
