"phy_init: saving new calibration data because of checksum failure, mode(0)" on almost every start (IDFGH-14161) #14963

Closed
3 tasks done
kriegste opened this issue Dec 1, 2024 · 12 comments
Assignees
Labels
Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally

Comments

@kriegste

kriegste commented Dec 1, 2024

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

Under what circumstances does this message appear at startup?

phy_init: saving new calibration data because of checksum failure, mode(0)

I am asking because I get this on almost every start, whether it is a software restart or a hardware restart. I have several devices with very different power supplies, and all are affected. This may have started with IDF 5.3.1, but I could be mistaken and it may only have caught my eye now. Experiments show that "mode(n)" refers to whether "full" or "partial" calibration is selected. However, this makes no difference to how often the message appears.
I cannot believe this is really a "checksum failure". That would mean that reading the NVS fails. But my WiFi credentials (SSID and password), which are also stored in the NVS, are never misread. The WiFi connection succeeds every time, regardless of whether the message appears.
Any insights or ways to debug?

@espressif-bot espressif-bot added the Status: Opened Issue is new label Dec 1, 2024
@github-actions github-actions bot changed the title "phy_init: saving new calibration data because of checksum failure, mode(0)" on almost every start "phy_init: saving new calibration data because of checksum failure, mode(0)" on almost every start (IDFGH-14161) Dec 1, 2024
@AxelLin
Contributor

AxelLin commented Dec 2, 2024

I also reported the same symptom with the v5.2 branch in #13251 (comment), but got no response.

@mhdong

@kriegste
Author

kriegste commented Dec 2, 2024

I am afraid of wrecking the flash chip prematurely by writing the calibration data over and over again.

@mhdong
Collaborator

mhdong commented Dec 4, 2024

hi @kriegste
The current output "phy_init: saving new calibration data because of checksum failure, mode(0)" is not entirely accurate. It should differentiate between checksum errors and calibration updates. In the future, we plan to revise the message or use more precise return values to distinguish between the two.

@AxelLin
Contributor

AxelLin commented Dec 4, 2024

@mhdong
Is it OK to write the calibration data so frequently? #14963 (comment)

@mhdong
Collaborator

mhdong commented Dec 4, 2024

hi @kriegste
Considering the typical frequency of software and hardware restarts, the impact on the chip's flash wear is acceptable.
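
For readers who want to sanity-check the flash-wear claim themselves, here is a rough back-of-the-envelope sketch. Every figure in it is an assumption chosen for illustration (sector size, rated erase cycles, partition size, bytes written per save), and the function name `estimate_saves_before_wearout` is hypothetical. It also assumes perfectly even wear levelling and ignores NVS metadata overhead and any other NVS writers, so treat the result as an order-of-magnitude estimate only.

```c
#include <assert.h>
#include <stdint.h>

/* All figures are illustrative assumptions, not Espressif specifications. */
#define ASSUMED_SECTOR_SIZE   4096u    /* flash erase unit, in bytes */
#define ASSUMED_ERASE_CYCLES  100000u  /* rated erase cycles per sector */

/*
 * Rough number of calibration saves before every sector of the NVS
 * partition reaches its rated erase count. Assumes ideal wear levelling
 * and ignores NVS bookkeeping overhead and other writers.
 */
uint64_t estimate_saves_before_wearout(uint32_t nvs_partition_bytes,
                                       uint32_t bytes_per_save)
{
    assert(bytes_per_save > 0);

    /* Only whole sectors can be erased, so round the partition down. */
    uint32_t sectors = nvs_partition_bytes / ASSUMED_SECTOR_SIZE;

    /* Total bytes that can be written before all sectors are worn out. */
    uint64_t total_bytes =
        (uint64_t)sectors * ASSUMED_SECTOR_SIZE * ASSUMED_ERASE_CYCLES;

    return total_bytes / bytes_per_save;
}
```

Under these assumptions, a 24 KiB NVS partition with about 2 KiB written per save gives `estimate_saves_before_wearout(24576, 2048)`, which is 1,200,000 saves. Real-world numbers depend on the actual flash part, partition layout, and how much else is written to NVS.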

@espressif-bot espressif-bot added Status: Reviewing Issue is being reviewed Status: Done Issue is done internally Resolution: NA Issue resolution is unavailable and removed Status: Opened Issue is new Status: Reviewing Issue is being reviewed labels Dec 10, 2024
@AxelLin
Contributor

AxelLin commented Dec 20, 2024

@mhdong

I don't get it; this commit does not fix anything.
See the original report: "I am asking because I get this on almost every start."
That makes it unlikely to be the "calibration data is outdated" case.
So the question is: is it normal to hit "data checksum failure" so frequently?

Also, you should have different return values for "data checksum failure" and "the calibration data is outdated" so that users can identify the root cause instead of having to guess.
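
To illustrate the kind of differentiation being requested, here is a minimal, self-contained sketch. It is not ESP-IDF code: the enum `cal_check_result_t`, the record layout `cal_record_t`, the toy checksum, and the functions `cal_check` and `cal_check_message` are all hypothetical. The sketch also assumes that "outdated" can be detected by comparing a stored version and MAC address against expected values, which is only a stand-in for whatever the PHY library actually checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result codes; these are not part of ESP-IDF. */
typedef enum {
    CAL_CHECK_OK = 0,
    CAL_CHECK_CHECKSUM_FAIL, /* stored bytes are corrupt */
    CAL_CHECK_OUTDATED       /* bytes are intact but no longer applicable */
} cal_check_result_t;

/* Hypothetical layout of a stored calibration record. */
typedef struct {
    uint32_t version;
    uint8_t  mac[6];
    uint8_t  payload[16];
    uint32_t checksum;       /* covers version, mac and payload */
} cal_record_t;

/* Toy multiplicative checksum, used only for illustration. */
uint32_t cal_checksum(const cal_record_t *rec)
{
    uint32_t sum = rec->version;
    for (size_t i = 0; i < sizeof(rec->mac); i++) {
        sum = sum * 31u + rec->mac[i];
    }
    for (size_t i = 0; i < sizeof(rec->payload); i++) {
        sum = sum * 31u + rec->payload[i];
    }
    return sum;
}

/* Check integrity first, then applicability, and report them separately. */
cal_check_result_t cal_check(const cal_record_t *rec,
                             uint32_t expected_version,
                             const uint8_t expected_mac[6])
{
    assert(rec != NULL && expected_mac != NULL);

    if (cal_checksum(rec) != rec->checksum) {
        return CAL_CHECK_CHECKSUM_FAIL;
    }
    if (rec->version != expected_version ||
        memcmp(rec->mac, expected_mac, sizeof(rec->mac)) != 0) {
        return CAL_CHECK_OUTDATED;
    }
    return CAL_CHECK_OK;
}

/* Distinct log text per cause, so a user report reveals which one occurred. */
const char *cal_check_message(cal_check_result_t result)
{
    switch (result) {
    case CAL_CHECK_OK:            return "calibration data valid";
    case CAL_CHECK_CHECKSUM_FAIL: return "calibration data checksum failure";
    case CAL_CHECK_OUTDATED:      return "calibration data outdated";
    default:                      return "unknown calibration check result";
    }
}
```

With separate codes like these, the caller can still take the same action in both failure cases (recalibrate and save), while the log line tells the reader whether stored data was corrupt or merely stale.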

@AxelLin
Contributor

AxelLin commented Dec 20, 2024

> hi @kriegste Current output "phy_init: saving new calibration data because of checksum failure, mode(0)" is not entirely accurate. It should differentiate between checksum errors and calibration updates. In the future, we plan to revise the description here or use more precise return values to distinguish between checksum errors and updates.

But you didn't even determine whether the reported case is a "checksum error" or a "calibration update".
What is the time period for calibration updates?

@kriegste
Author

If there is indeed a checksum failure it should still be logged as a warning. Don't you agree, Espressif?

@mhdong
Collaborator

mhdong commented Dec 23, 2024

For checksum errors or calibration data being outdated, these simply indicate that we need to update calibration data. We don't need to handle them differently based on the specific return value, so there's no need for further differentiation.

@AxelLin
Contributor

AxelLin commented Dec 23, 2024

> For checksum errors or calibration data being outdated, these simply indicate that we need to update calibration data. We don't need to handle them differently based on the specific return value, so there's no need for further differentiation.

If the user cannot differentiate between them when reporting the issue, how could you know from a user-reported log whether it is a bug (checksum error) or expected behavior (expired)?
What is the time interval for calibration data expiration?

@mhdong
Collaborator

mhdong commented Dec 23, 2024

hi @AxelLin
"Calibration data being outdated" has nothing to do with time. It simply means that the stored calibration information no longer matches the current environment.

@AxelLin
Contributor

AxelLin commented Dec 24, 2024

> hi @AxelLin "Calibration data being outdated" has nothing to do with time. It simply means that the stored calibration information no longer matches the current environment.

Which means there is no way to differentiate "calibration data being outdated" from "checksum errors" in a user's report.
The point is not whether you need to handle them differently.
The point is that there is no way to check whether a "checksum error" happened (which should not happen).

5 participants