
kube-apiserver occasionally terminates on cluster boot and doesn't get restarted #7241

Closed
hoo29 opened this issue Nov 7, 2024 · 4 comments

hoo29 commented Nov 7, 2024

Environmental Info:
RKE2 Version:
rke2 version v1.31.1+rke2r1 (909d20d)
go version go1.22.6 X:boringcrypto

Node(s) CPU architecture, OS, and Version:
Rocky 9
Linux 5.14.0-427.24.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Jul 8 17:47:19 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
Running in AWS with 3 servers and 7 agents. Using Cilium. Servers are 2 cores, 4 GB RAM.

Describe the bug:
Our cluster is shut down overnight for cost saving and started back up each morning. Twice in the last month, a server node has failed to start properly, and this has prevented some agent nodes from joining, despite the other 2 servers being marked as healthy. We were previously on v1.29.3+rke2r1 and never had this issue after running it for several months with the same shutdown behaviour.

Looking through the broken server (server0), kube-apiserver gets terminated during rke2-server startup, which causes numerous other issues. ps aux | grep kube-apiserver returns nothing on server0.
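For reference, roughly the checks behind that, assuming a default RKE2 install (RKE2 ships its own crictl and crictl.yaml under /var/lib/rancher/rke2; adjust paths if yours differ):

# Look for a running kube-apiserver process (the [k] keeps grep from matching itself)
ps aux | grep '[k]ube-apiserver'

# Cross-check what the container runtime thinks, using RKE2's bundled crictl
export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml
/var/lib/rancher/rke2/bin/crictl ps --name kube-apiserver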

Steps To Reproduce:

Expected behavior:
The cluster starts up successfully.

The cluster is not severely degraded by 1 out of 3 server nodes having issues.

Actual behavior:
The cluster does not always start up successfully.

The cluster is severely degraded by 1 out of 3 server nodes having issues.

Additional context / logs:
etcd appears healthy. On server0, with 2456 being the PID of etcd:

$ nsenter --target 2456 --mount  --net --pid --uts -- etcdctl \
  --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt \
  --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key \
  --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt endpoint status  --cluster --write-out=table;

+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.128.23.141:2379 | 12659cee018f1be2 |  3.5.13 |   52 MB |      true |      false |       248 |   71361783 |           71361783 |        |
|  https://10.128.23.66:2379 | 857d2c9e2293e976 |  3.5.13 |   52 MB |     false |      false |       248 |   71361783 |           71361783 |        |
| https://10.128.21.171:2379 | f699167f1021d729 |  3.5.13 |   52 MB |     false |      false |       248 |   71361783 |           71361783 |        |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
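A complementary check (same nsenter target and certificates as above) is etcdctl endpoint health, which reports per-member health rather than raft state:

nsenter --target 2456 --mount --net --pid --uts -- etcdctl \
  --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt \
  --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key \
  --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt \
  endpoint health --cluster --write-out=table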

The kube-apiserver logs on server1 & server2 are filled with authentication.go:73] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]".
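A rough way to gauge how frequent these are, assuming server1/server2 use the same static pod log layout as server0 (shown further down):

# Count bearer-token authentication failures in the kube-apiserver static pod logs
grep -c 'Unable to authenticate the request' \
  /var/log/pods/kube-system_kube-apiserver-*/kube-apiserver/*.log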

Cilium is failing to start on server0, but with kube-apiserver down, I assume nothing will.

I have found this comment around load balancer health checks, #5557 (comment), which we are not doing (but will implement), but I don't think it is the issue as etcd appears healthy.
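For the health checks, a minimal sketch of what the load balancer could probe instead of a bare TCP check (ports and endpoints assumed from a default RKE2 setup; /readyz may return 401/403 depending on anonymous-auth settings, and the supervisor /ping on port 9345 is what joining agents talk to):

# kube-apiserver readiness on a server node
curl -sk https://10.128.23.141:6443/readyz

# RKE2 supervisor / agent join endpoint
curl -sk https://10.128.23.141:9345/ping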

Looking at kube-apiserver on server0

/var/log/containers/kube-apiserver-k8s-server0.DOMAIN_kube-system_kube-apiserver-a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171.log -> /var/log/pods/kube-system_kube-apiserver-k8s-server0.DOMAIN_80bcfddfbe1c3d0b4d90fe8d28b7a51c/kube-apiserver/22.log (attached as k8s-server0-not-ready-kube-apiserver-pod.log)

The kube-apiserver on server0 is container a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171. In /var/log/messages I can see

Nov  7 06:02:02 k8s-server0 systemd[1]: Started libcontainer container a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171.
Nov  7 06:02:26 k8s-server0 systemd[1]: run-containerd-runc-k8s.io-a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171-runc.choiNb.mount: Deactivated successfully.
Nov  7 06:02:31 k8s-server0 systemd[1]: run-containerd-runc-k8s.io-a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171-runc.BKojHd.mount: Deactivated successfully.
Nov  7 06:02:42 k8s-server0 systemd[1]: cri-containerd-a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171.scope: Deactivated successfully.
Nov  7 06:02:42 k8s-server0 systemd[1]: cri-containerd-a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171.scope: Consumed 9.180s CPU time.
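The same lifecycle events can be pulled from the journal by filtering on the container ID (a prefix is enough for grep):

journalctl --since "2024-11-07 06:00" | grep a4d23bea00f6796f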

Weirdly, crictl reports it as running on server0

$ crictl ps --name  kube-apiserver
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD
a4d23bea00f67       1b44d478ec444       9 hours ago         Running             kube-apiserver      22                  093e8959f7b4d       kube-apiserver-k8s-server0.DOMAIN

$ crictl inspect a4d23bea00f67 | jq .info.pid
2372

$ ps aux | grep 2372
root      200950  0.0  0.0   6408  2176 pts/0    S+   15:22   0:00 grep --color=auto 2372
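The transient systemd scope from /var/log/messages above can be queried directly to confirm the container's cgroup really is gone even though crictl still says Running:

systemctl status cri-containerd-a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171.scope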

Containerd logs on server0 say it has been killed, but then something is trying to exec into it

time="2024-11-07T06:02:41.609408926Z" level=info msg="Kill container \"a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171\""
time="2024-11-07T06:02:41.987115383Z" level=error msg="ExecSync for \"a4d23bea00f6796f0ae87e8779331cd3abb4557a6208ad6ba52666e2ae544171\" failed" error="failed to exec in container: failed to start exec \"36daae2d0ca0aa2b27f83e18036018231cccee0bdfb824c5d11e9d9ed4020f3c\": OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown"
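These lines are from RKE2's embedded containerd, which logs to a file rather than the journal (default path assumed):

grep a4d23bea /var/lib/rancher/rke2/agent/containerd/containerd.log | grep -E 'Kill container|ExecSync'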

I cannot stop the zombie container

$ crictl stop a4d23bea00f67
E1107 16:15:02.506585  219370 remote_runtime.go:366] "StopContainer from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" containerID="a4d23bea00f67"
FATA[0002] stopping the container "a4d23bea00f67": rpc error: code = DeadlineExceeded desc = context deadline exceeded
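Untested here, but for anyone hitting the same state, force-removing the stuck container record may be worth trying before restarting the whole service:

crictl rm --force a4d23bea00f67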

Running systemctl restart rke2-server on server0 fixes it and all nodes immediately join, but we would like to understand why this is happening.

Log attachments
k8s-agent0-not-ready-rke2-agent.log
k8s-server0-not-ready-containerd.log
k8s-server0-not-ready-rke2-server.log
k8s-server1-rke2-server.log
k8s-server2-rke2-server.log
k8s-agent6-rke2-agent.log
k8s-server0-not-ready-kube-apiserver-pod.log


brandond commented Nov 7, 2024

OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown

This sounds like a containerd bug.

We've updated to containerd v1.7.22 in RKE2 v1.31.2+rke2r1, please try that release.
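After upgrading, the embedded containerd version can be confirmed with (default RKE2 binary path assumed):

/var/lib/rancher/rke2/bin/containerd --version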


hoo29 commented Nov 7, 2024

Thanks for the quick response. We will give that a go.

Do you know why the agents are failing to join when one control-plane node is down? Is that the load balancer health check issue?


brandond commented Nov 7, 2024

One problem per issue please!


hoo29 commented Nov 7, 2024

Thanks - I will close this and reopen if we still experience the same issue with v1.31.2+rke2r1.

hoo29 closed this as completed Nov 7, 2024