[chassis] thermal overload seen on LC #110

Open
liamkearney-msft opened this issue Nov 15, 2024 · 4 comments

@liamkearney-msft

liamkearney-msft commented Nov 15, 2024

This has not been seen on other devices.

str2-7804-lc7:

admin@str2-7804-lc7-1:~$ show reboot-cause hist
Name                 Cause                                                                                                      Time                             User    Comment
-------------------  ---------------------------------------------------------------------------------------------------------  -------------------------------  ------  ---------
2024_11_14_23_44_13  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3016, time: 2024-11-14 23:43:18)  N/A                              N/A     Unknown
2024_11_14_23_02_48  reboot                                                                                                     Thu Nov 14 10:58:45 PM UTC 2024          N/A
2024_11_14_20_59_10  Unknown                                                                                                    N/A                              N/A     N/A
2024_11_14_03_48_12  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3014, time: 2024-11-14 03:47:12)  N/A                              N/A     Unknown
2024_11_14_03_22_56  Unknown                                                                                                    N/A                              N/A     N/A
2024_11_14_02_03_46  reboot                                                                                                     Thu Nov 14 01:59:41 AM UTC 2024          N/A
2024_11_14_01_54_41  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3011, time: 2024-11-14 01:53:45)  N/A                              N/A     Unknown
2024_11_14_01_34_58  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3010, time: 2024-11-14 01:34:05)  N/A                              N/A     Unknown
2024_11_14_00_28_47  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3009, time: 2024-11-14 00:27:47)  N/A                              N/A     Unknown
2024_11_14_00_15_29  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3007, time: 2024-11-13 05:21:23)  N/A                              N/A     Unknown
@patrickmacarthur
Contributor

Do you see similar reboot cause entries on other linecards on the same chassis?

@patrickmacarthur
Contributor

Also, are you intentionally rebooting the linecard or chassis at the times that you see these thermal overload entries?

@liamkearney-msft
Author

Hey @patrickmacarthur, we don't seem to see this on other linecards in the chassis.
I noticed this while running reboot tests (so yes, some intentional reboots were happening), but I can't confirm whether the thermal overload entries all line up with those reboots or are directly correlated with them.
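
One rough way to check the correlation is to compare the fault time embedded in each thermal overload entry with the Time column of the nearest earlier `reboot` entry. The sketch below does that on the rows pasted above (column padding collapsed). It assumes the `time:` field is when the fault was latched and that the Time column of a `reboot` row is when the reboot was requested; neither is confirmed in this thread. The helper names `parse_history` and `gap_to_previous_reboot` are made up for illustration.

```python
import re
from datetime import datetime

# Rows copied from the reboot-cause history above, with column padding collapsed.
HISTORY = """\
2024_11_14_23_44_13  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3016, time: 2024-11-14 23:43:18)  N/A  N/A  Unknown
2024_11_14_23_02_48  reboot  Thu Nov 14 10:58:45 PM UTC 2024  N/A
2024_11_14_20_59_10  Unknown  N/A  N/A  N/A
2024_11_14_03_48_12  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3014, time: 2024-11-14 03:47:12)  N/A  N/A  Unknown
2024_11_14_03_22_56  Unknown  N/A  N/A  N/A
2024_11_14_02_03_46  reboot  Thu Nov 14 01:59:41 AM UTC 2024  N/A
2024_11_14_01_54_41  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3011, time: 2024-11-14 01:53:45)  N/A  N/A  Unknown
2024_11_14_01_34_58  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3010, time: 2024-11-14 01:34:05)  N/A  N/A  Unknown
2024_11_14_00_28_47  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3009, time: 2024-11-14 00:27:47)  N/A  N/A  Unknown
2024_11_14_00_15_29  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3007, time: 2024-11-13 05:21:23)  N/A  N/A  Unknown
"""

THERMAL_RE = re.compile(
    r"^\S+\s+Thermal Overload:.*?powerup=(\d+), "
    r"time: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\)"
)
# The weekday is skipped; month, day, 12-hour time and year are captured.
REBOOT_RE = re.compile(
    r"^\S+\s+reboot\s+\w{3} (\w{3} +\d+ \d{2}:\d{2}:\d{2} [AP]M) UTC (\d{4})"
)


def parse_history(text):
    """Return ([(powerup, fault_time)], [reboot_time]) parsed from history rows."""
    thermal, reboots = [], []
    for line in text.splitlines():
        m = THERMAL_RE.match(line)
        if m:
            fault_time = datetime.strptime(m.group(2), "%Y-%m-%d %H:%M:%S")
            thermal.append((int(m.group(1)), fault_time))
            continue
        m = REBOOT_RE.match(line)
        if m:
            stamp = f"{m.group(1)} {m.group(2)}"
            reboots.append(datetime.strptime(stamp, "%b %d %I:%M:%S %p %Y"))
    return thermal, reboots


def gap_to_previous_reboot(thermal, reboots):
    """Map powerup -> seconds since the latest earlier reboot (None if there is none)."""
    gaps = {}
    for powerup, fault_time in thermal:
        earlier = [r for r in reboots if r <= fault_time]
        gaps[powerup] = (fault_time - max(earlier)).total_seconds() if earlier else None
    return gaps


thermal, reboots = parse_history(HISTORY)
for powerup, gap in gap_to_previous_reboot(thermal, reboots).items():
    print(f"powerup={powerup}: seconds since previous intentional reboot = {gap}")
```

A small, consistent gap would suggest the fault is tied to the reboot itself, while large or missing gaps would point elsewhere. Only two `reboot` rows appear in this history, so the sample is too small to conclude much.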

@patrickmacarthur
Contributor

Can you try running the reboot test and then see what reboot-cause entries are added to the history?
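
A minimal sketch of that comparison, assuming the history output is saved to text once before and once after the reboot test: the helper below keys rows by the Name column and reports the rows that only appear afterwards. The function names are hypothetical, and the before/after split of the sample rows is made up for illustration; the rows themselves come from the output earlier in this issue.

```python
import re

# Entry names in the Name column look like 2024_11_14_23_44_13.
NAME_RE = re.compile(r"\d{4}(_\d{2}){5}")


def entries_by_name(history_text):
    """Map each entry name (first column) to its full row, skipping header rows."""
    rows = {}
    for line in history_text.splitlines():
        fields = line.split()
        if fields and NAME_RE.fullmatch(fields[0]):
            rows[fields[0]] = line.strip()
    return rows


def added_entries(before_text, after_text):
    """Return the rows present in after_text whose names are absent from before_text."""
    before = entries_by_name(before_text)
    after = entries_by_name(after_text)
    return [row for name, row in after.items() if name not in before]


# Illustrative snapshots only: the split point is an assumption, not test data.
BEFORE = """\
Name                 Cause    Time                             User    Comment
-------------------  -------  -------------------------------  ------  ---------
2024_11_14_23_02_48  reboot   Thu Nov 14 10:58:45 PM UTC 2024          N/A
2024_11_14_20_59_10  Unknown  N/A                              N/A     N/A
"""
AFTER = """\
Name                 Cause    Time                             User    Comment
-------------------  -------  -------------------------------  ------  ---------
2024_11_14_23_44_13  Thermal Overload: CPU (cpu-overtemp, description: detailed fault powerup=3016, time: 2024-11-14 23:43:18)  N/A  N/A  Unknown
2024_11_14_23_02_48  reboot   Thu Nov 14 10:58:45 PM UTC 2024          N/A
2024_11_14_20_59_10  Unknown  N/A                              N/A     N/A
"""

for row in added_entries(BEFORE, AFTER):
    print(row)
```

Keying on the Name column rather than on row position keeps the comparison stable even if older entries age out of the history between the two snapshots.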
