
Pod for kube-apiserver not synced (no current running pod found), retrying #4907

Closed
bataliero opened this issue Oct 17, 2023 · 5 comments

@bataliero

Environmental Info:
RKE2 Version:

rke2 version v1.26.9+rke2r1 (368ba42)
go version go1.20.8 X:boringcrypto

Node(s) CPU architecture, OS, and Version:

Linux ip-172-31-18-26 5.19.0-1025-aws #26~22.04.1-Ubuntu SMP Mon Apr 24 01:58:15 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

Single server

Describe the bug:

It seems that the API server does not start.

Steps To Reproduce:

  • Installed RKE2:
    I just ran the following on a fresh Ubuntu 22.04 (AWS EC2) instance:
curl -sfL https://get.rke2.io | sudo sh -
sudo systemctl start rke2-server.service  #  <- this never ends
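
For reference, a way to follow progress from a second terminal while the start command blocks (standard systemd tooling; the same journalctl invocation appears later in this thread):

sudo journalctl -u rke2-server -f       # follow the server logs while the unit starts
systemctl status rke2-server.service    # check whether the unit is still activating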

Expected behavior:

I would expect kubectl (run locally on the server machine) to be able to connect to the API server.

>> sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
E1017 08:59:07.669658    1911 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": dial tcp 127.0.0.1:6443: connect: connection refused
E1017 08:59:07.669941    1911 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": dial tcp 127.0.0.1:6443: connect: connection refused
E1017 08:59:07.671214    1911 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": dial tcp 127.0.0.1:6443: connect: connection refused
E1017 08:59:07.674289    1911 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": dial tcp 127.0.0.1:6443: connect: connection refused
E1017 08:59:07.674538    1911 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": dial tcp 127.0.0.1:6443: connect: connection refused
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
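
As a quick sanity check that nothing is listening on the apiserver port yet (assuming ss and curl are available on the host; they are not part of this report):

sudo ss -tlnp | grep 6443                # should show kube-apiserver once the static pod is up
curl -k https://127.0.0.1:6443/healthz   # expect "connection refused" until then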

Actual behavior:

Pod for kube-apiserver not synced (no current running pod found), retrying

Additional context / logs:

Oct 17 08:54:47 ip-172-31-18-26 rke2[1497]: time="2023-10-17T08:54:47Z" level=info msg="Waiting for API server to become available"
Oct 17 08:54:47 ip-172-31-18-26 rke2[1497]: time="2023-10-17T08:54:47Z" level=info msg="Pod for etcd is synced"
Oct 17 08:54:47 ip-172-31-18-26 rke2[1497]: time="2023-10-17T08:54:47Z" level=info msg="Pod for kube-apiserver not synced (no current running pod found), retrying"
Oct 17 08:54:52 ip-172-31-18-26 rke2[1497]: time="2023-10-17T08:54:52Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error"
Oct 17 08:54:57 ip-172-31-18-26 rke2[1497]: time="2023-10-17T08:54:57Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error"
Oct 17 08:55:02 ip-172-31-18-26 rke2[1497]: time="2023-10-17T08:55:02Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error"

logs:
journalctl_rke2_server.log
kubelet.log

@brandond
Member

It is normal to see that momentarily during startup while RKE2 is waiting for the pod to start. I don't see the message repeated; it looks like the apiserver is now running. You'd need to look at the apiserver pod logs (in /var/log/pods) to see why it's not ready yet.
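
A rough sketch of how to pull those pod logs; the kube-system directory and file names below are assumptions about the usual static-pod layout, so adjust the glob to whatever actually exists under /var/log/pods:

sudo ls /var/log/pods                                                    # one directory per static pod
sudo tail -n 100 /var/log/pods/kube-system_kube-apiserver-*/kube-apiserver/*.log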

@IsaSih

IsaSih commented Jan 27, 2024

Hello, I'm facing the same situation on an Ubuntu machine (AWS EC2), single server node. After enabling the service and running systemctl start rke2-server.service, I see the same error with rke2 version 1.26.11.

Looking into the API server pod logs, I see this error:

2024-01-24T13:35:14.689255139Z stderr F {"level":"info","ts":"2024-01-24T13:35:14.689002Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["etcd","--config-file=/var/lib/rancher/rke2/server/db/etcd/config"]}
2024-01-24T13:35:14.68963836Z stderr F {"level":"warn","ts":"2024-01-24T13:35:14.689444Z","caller":"etcdmain/etcd.go:446","msg":"found invalid file under data directory","filename":"config","data-dir":"/var/lib/rancher/rke2/server/db/etcd"}
2024-01-24T13:35:14.689647893Z stderr F {"level":"warn","ts":"2024-01-24T13:35:14.689471Z","caller":"etcdmain/etcd.go:446","msg":"found invalid file under data directory","filename":"name","data-dir":"/var/lib/rancher/rke2/server/db/etcd"}
2024-01-24T13:35:14.690816649Z stderr F {"level":"info","ts":"2024-01-24T13:35:14.689519Z","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["https://127.0.0.1:2380","https://172.31.5.23:2380"]}
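
Those two warnings indicate files etcd does not expect inside its data directory; one way to see what is actually there (using the data-dir path from the log above) is:

sudo ls -la /var/lib/rancher/rke2/server/db/etcd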

Here are the full log files:

pod.log
rke2-server.log
journalctl-rke2-server.log

github-actions bot (Contributor) commented Apr 6, 2024

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 45 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Apr 21, 2024
@andriy-bulynko

I have the same issue.

RKE2 Version:

rke2 version v1.30.6+rke2r1 (2959cd2)
go version go1.22.8 X:boringcrypto

Node(s) CPU architecture, OS, and Version:

Arch: x86_64, OS: Rocky Linux, Version: 8.10 (Green Obsidian)

Cluster Configuration:

Single server

journalctl -u rke2-server -f keeps printing logs like:

Nov 18 18:32:28 rocky3 rke2[5945]: time="2024-11-18T18:32:28-05:00" level=info msg="Waiting for API server to become available"
Nov 18 18:32:28 rocky3 rke2[5945]: time="2024-11-18T18:32:28-05:00" level=info msg="Pod for etcd is synced"
Nov 18 18:32:28 rocky3 rke2[5945]: time="2024-11-18T18:32:28-05:00" level=info msg="Pod for kube-apiserver not synced (pod sandbox not found), retrying"
Nov 18 18:32:28 rocky3 rke2[5945]: time="2024-11-18T18:32:28-05:00" level=info msg="Waiting for API server to become available"
Nov 18 18:32:30 rocky3 rke2[5945]: time="2024-11-18T18:32:30-05:00" level=warning msg="Failed to list nodes with etcd role: runtime core not ready"
Nov 18 18:32:45 rocky3 rke2[5945]: time="2024-11-18T18:32:45-05:00" level=warning msg="Failed to list nodes with etcd role: runtime core not ready"

@brandond
Member

Check the kubelet and containerd logs. You might also ensure that your node has sufficient CPU and memory resources available for the kubelet to schedule all the static pods.
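
For reference, the kubelet and containerd logs usually live at the paths below on an RKE2 node (assuming the default data-dir; adjust if it was changed), and a quick resource check can be done with standard tools:

sudo tail -n 100 /var/lib/rancher/rke2/agent/logs/kubelet.log             # kubelet log
sudo tail -n 100 /var/lib/rancher/rke2/agent/containerd/containerd.log    # containerd log
free -h && nproc                                                          # available memory and CPUs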
