node-restart takes over 2 minutes to shutdown due to longhorn #9752
Replies: 2 comments 4 replies
-
This isn't strictly a question about K3s; I believe you'll see the same behavior on any node that you shut down while leaving volumes mounted. K3s doesn't stop pods when the service stops, to allow for nondisruptive upgrades. If you wanted to address this delay, you could probably drain the node first, and/or run the killall script before shutdown.
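The drain-then-killall order suggested above could be sketched roughly as follows. This is only an illustration: the helper names `build_shutdown_commands` and `run_before_shutdown` are hypothetical, the script path assumes the default location used by the K3s install script, and the drain flags and timeout are assumptions that may need adjusting for a given cluster.

```python
import subprocess

# Assumption: default location where the K3s install script places the killall script.
KILLALL_SCRIPT = "/usr/local/bin/k3s-killall.sh"


def build_shutdown_commands(node_name, drain_timeout="120s"):
    """Return the commands to run, in order, before shutting a node down."""
    drain = [
        "kubectl", "drain", node_name,
        "--ignore-daemonsets",         # DaemonSet-managed pods are skipped, not evicted
        "--delete-emptydir-data",      # allow evicting pods that use emptyDir volumes
        f"--timeout={drain_timeout}",  # give up instead of waiting indefinitely
    ]
    return [drain, [KILLALL_SCRIPT]]


def run_before_shutdown(node_name, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_shutdown_commands(node_name):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `run_before_shutdown("my-node")` only prints the two commands; passing `dry_run=False` would actually run them, which needs a working `kubectl` context and root privileges for the killall script.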
-
I cannot disagree more on this one, sorry. I understand that Longhorn does add overhead, but it adds so much
I used local-path in the past and use host-path via democratic-csi nowadays, but rather as a less-ephemeral replacement for 'emptyDir' or for cluster databases where backups are not part of the storage solution. But in the end, that is the cool thing about k8s: we do not need to agree on 'a toolbelt', we can pick and choose whatever suits best. If local-path is the one you go for, that's perfect. (I understand that the choice can have implications, as with Longhorn, which tries 'harder' to keep the last replica alive, and that might be an issue for single-node clusters here.)
-
Environmental Info:
K3s Version: 1.28.7
Node(s) CPU architecture, OS, and Version:
ubuntu jammy, amd64
Cluster Configuration:
Describe the bug:
A normal reboot of a node takes very long, probably due to Longhorn, though I am not sure.
Running
https://docs.k3s.io/upgrades/killall
makes it visible that unmounting Longhorn volumes seems to be the cause. It stops at
The entire process takes 2 minutes and 5 seconds. It is always 2 minutes and 5-10s, which looks like a classic 120s timeout expiring somewhere.
Is there anything I can use to debug what it might be? IMHO it seems to be blocked on unmounting the first Longhorn volume; it waits, and then after 2 minutes it instantly unmounts about 10 Longhorn volumes in a row.
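One rough way to see which unmount is the slow one is to time each unmount separately. The sketch below is not part of K3s or Longhorn: the function names are hypothetical, and it assumes that Longhorn volumes show up in `/proc/mounts` with a source device under `/dev/longhorn/`, which should be checked on the node first.

```python
import subprocess
import time


def longhorn_mount_points(mounts_text, device_prefix="/dev/longhorn/"):
    """Return mount points whose source device starts with device_prefix.

    mounts_text is expected in /proc/mounts format: device, mount point, ...
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith(device_prefix):
            points.append(fields[1])
    return points


def time_unmounts(mount_points):
    """Unmount each path and return (path, seconds, returncode) tuples."""
    timings = []
    for path in mount_points:
        start = time.monotonic()
        result = subprocess.run(["umount", path])
        timings.append((path, time.monotonic() - start, result.returncode))
    return timings


def report(mounts_file="/proc/mounts"):
    """Unmount all matching volumes and print how long each one took."""
    with open(mounts_file) as f:
        targets = longhorn_mount_points(f.read())
    for path, seconds, returncode in time_unmounts(targets):
        print(f"{seconds:7.1f}s rc={returncode} {path}")
```

Running `report()` as root on the affected node really does unmount the volumes, so it only makes sense right before a planned reboot. If a single path accounts for roughly 120 seconds and the rest finish immediately, that would match the behavior described above and narrow the search to that volume.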