Environmental Info:
RKE2 Version:
rke2 version v1.29.0+rke2r1 (4fd30c2)
go version go1.21.5 X:boringcrypto
Node(s) CPU architecture, OS, and Version:
Linux yellowtail 4.18.0-477.10.1.el8_8.x86_64 #1 SMP Wed Apr 5 13:35:01 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
1 RKE2 Server node
SELinux Disabled
Describe the bug:
The etcd container for RKE2 starts up, but segfaults shortly afterwards and is restarted. Because the etcd container keeps crashing, the cluster never finishes starting.
Steps To Reproduce:
Installed RKE2 traditionally, with the following config:
Actual behavior:
Etcd segfaults. Containerd and kubelet start fine and the etcd container launches normally, but it then crashes. The etcd container is relaunched and the same segfault occurs.
This is the relevant portion from journalctl -u rke2-server:
Dec 31 22:18:04 yellowtail rke2[37955]: time="2021-12-31T22:18:04-05:00" level=info msg="containerd is now running"
Dec 31 22:18:04 yellowtail rke2[37955]: time="2021-12-31T22:18:04-05:00" level=debug msg="Deleting existing lease: {rke2 2022-01-01 03:17:17.103412157 +0000 UTC map[]}"
Dec 31 22:18:04 yellowtail rke2[37955]: time="2021-12-31T22:18:04-05:00" level=info msg="Importing images from /var/lib/rancher/rke2/agent/images/rke2-images-canal.linux-amd64.tar.gz"
Dec 31 22:18:21 yellowtail rke2[37955]: time="2021-12-31T22:18:21-05:00" level=info msg="Pod for etcd is synced"
Dec 31 22:18:21 yellowtail rke2[37955]: time="2021-12-31T22:18:21-05:00" level=info msg="Pod for kube-apiserver is synced"
Dec 31 22:18:21 yellowtail rke2[37955]: time="2021-12-31T22:18:21-05:00" level=info msg="ETCD server is now running"
Dec 31 22:18:21 yellowtail rke2[37955]: time="2021-12-31T22:18:21-05:00" level=info msg="rke2 is up and running"
Dec 31 22:18:21 yellowtail systemd[1]: Started Rancher Kubernetes Engine v2 (server).
Dec 31 22:18:25 yellowtail rke2[37955]: {"level":"warn","ts":"2021-12-31T22:18:25.995336-0500","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00086f500/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Dec 31 22:18:25 yellowtail rke2[37955]: {"level":"info","ts":"2021-12-31T22:18:25.995483-0500","logger":"etcd-client","caller":"[email protected]/client.go:210","msg":"Auto sync endpoints failed.","error":"context deadline exceeded"}
Dec 31 22:18:34 yellowtail rke2[37955]: time="2021-12-31T22:18:34-05:00" level=info msg="Failed to get existing traefik HelmChart" error="helmcharts.helm.cattle.io \"traefik\" not found"
Dec 31 22:18:34 yellowtail rke2[37955]: time="2021-12-31T22:18:34-05:00" level=info msg="Reconciling ETCDSnapshotFile resources"
Dec 31 22:18:34 yellowtail rke2[37955]: time="2021-12-31T22:18:34-05:00" level=info msg="Reconciliation of ETCDSnapshotFile resources complete"
Dec 31 22:18:34 yellowtail rke2[37955]: panic: runtime error: invalid memory address or nil pointer dereference
Dec 31 22:18:34 yellowtail rke2[37955]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x28d75cd]
Dec 31 22:18:34 yellowtail rke2[37955]: goroutine 518 [running]:
Dec 31 22:18:34 yellowtail rke2[37955]: github.com/k3s-io/k3s/pkg/etcd.(*ETCD).listLocalSnapshots.func1({0xc0008a0654, 0x29}, {0x0, 0x0}, {0x3bbc700, 0xc001700150})
Dec 31 22:18:34 yellowtail rke2[37955]: /go/pkg/mod/github.com/k3s-io/[email protected]/pkg/etcd/snapshot.go:439 +0x4d
Dec 31 22:18:34 yellowtail rke2[37955]: path/filepath.Walk({0xc0008a0654, 0x29}, 0xc0007f1038)
Dec 31 22:18:34 yellowtail rke2[37955]: /usr/local/go/src/path/filepath/path.go:570 +0x4a
Dec 31 22:18:34 yellowtail rke2[37955]: github.com/k3s-io/k3s/pkg/etcd.(*ETCD).listLocalSnapshots(0xc000b0d0e0)
Dec 31 22:18:34 yellowtail rke2[37955]: /go/pkg/mod/github.com/k3s-io/[email protected]/pkg/etcd/snapshot.go:438 +0xa7
Dec 31 22:18:34 yellowtail rke2[37955]: github.com/k3s-io/k3s/pkg/etcd.(*ETCD).ReconcileSnapshotData(0xc000b0d0e0, {0x3bf3f70, 0xc0009449b0})
Dec 31 22:18:34 yellowtail rke2[37955]: /go/pkg/mod/github.com/k3s-io/[email protected]/pkg/etcd/snapshot.go:735 +0xd6
Dec 31 22:18:34 yellowtail rke2[37955]: github.com/k3s-io/k3s/pkg/cluster.(*Cluster).Start.func1()
Dec 31 22:18:34 yellowtail rke2[37955]: /go/pkg/mod/github.com/k3s-io/[email protected]/pkg/cluster/cluster.go:110 +0x9e
Dec 31 22:18:34 yellowtail rke2[37955]: created by github.com/k3s-io/k3s/pkg/cluster.(*Cluster).Start in goroutine 1
Dec 31 22:18:34 yellowtail rke2[37955]: /go/pkg/mod/github.com/k3s-io/[email protected]/pkg/cluster/cluster.go:101 +0x6ad
Dec 31 22:18:34 yellowtail systemd[1]: rke2-server.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 31 22:18:34 yellowtail systemd[1]: rke2-server.service: Failed with result 'exit-code'.
Dec 31 22:18:39 yellowtail systemd[1]: rke2-server.service: Service RestartSec=5s expired, scheduling restart.
Dec 31 22:18:39 yellowtail systemd[1]: rke2-server.service: Scheduled restart job, restart counter is at 20.
Dec 31 22:18:39 yellowtail systemd[1]: Stopped Rancher Kubernetes Engine v2 (server).
Expected behavior:
Cluster starts up normally
Additional context / logs:
etcd config: