RKE2 Canal Pod Issue: Timeout in Creating Service Account Token #5328
Comments
When you see this happen, is there a kube-proxy pod running on the affected node? Are you able to
Thanks @brandond Ran the following to check the status of pods in kube-system:
Output:
Canal is failing with the same error:
Although kube-proxy-dev-worker-4 shows as
Results in:
Steps to Reproduce
Any thoughts?
I was not able to
This sounds like the same underlying issue as #4864
Is I have checked in the host, and the virtual interfaces are missing. I was thinking that it could be related to the CNI plugin. I have deleted However, I tried to intentionally make the kube-proxy pod crash by doing the following: I replaced the certificate with a random string so that it would be invalid.
and then
Subsequently, kube-proxy began to fail, which was the desired outcome. Following this, I terminated the I have also attempted to delete the network interface using Any thoughts?
The logs indicate that the cni-installer failure was due to an inability to reach the in-cluster kubernetes service endpoint to create a token. Access to cluster service endpoints is handled by kube-proxy. So yes.
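For anyone who wants to confirm this from an affected node or pod, here is a minimal sketch of a TCP reachability check against the in-cluster kubernetes service. It is not taken from RKE2 or Canal. `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` are the variables Kubernetes injects into pods; the fallback address `10.43.0.1:443` is an assumption based on the default RKE2 service CIDR and may differ in your cluster. The helper names are hypothetical.

```python
import os
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False


def check_apiserver_service() -> bool:
    """Check the in-cluster kubernetes service endpoint.

    The fallback values are assumptions (default RKE2 service CIDR);
    adjust them if the cluster uses a different service network.
    """
    host = os.environ.get("KUBERNETES_SERVICE_HOST", "10.43.0.1")
    port = int(os.environ.get("KUBERNETES_SERVICE_PORT", "443"))
    return can_reach(host, port)

# Example: print(check_apiserver_service())
```

If this returns False on a node while the API server itself is healthy, that would be consistent with kube-proxy not having programmed the service rules on that node. It only tests TCP connectivity, not TLS or authentication.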
I'm closing this one since it's a duplicate of #4864
Environmental Info:
RKE2 Version:
v1.26.12+rke2r1
Node(s) CPU architecture, OS, and Version:
Linux worker-1 6.5.0-15-generic #15~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Jan 12 18:54:30 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
1 master and 4 workers
Describe the bug:
Canal pod is failing with the following error:
Steps To Reproduce:
We have installed RKE2 using the Ansible role available at https://github.com/lablabs/ansible-role-rke2. During this process, we did not apply any customizations to systemd, nor did we override any environment variables.
Unfortunately, reproducing the issue is not straightforward. However, I managed to replicate it by repeatedly restarting the worker node, as well as by restarting both the master and worker nodes simultaneously.
Expected behavior:
The Canal pod should successfully create virtual interfaces using the CNI plugin, and then it will be able to generate the service account Canal token.
Actual behavior:
It appears that, in certain cases, Canal is unable to create virtual interfaces, which are essential for generating the service account Canal token.
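To check whether the virtual interfaces are present on a node, something like the following sketch could be used. It is only an illustration: the prefixes `cali` and `flannel` are an assumption about how Canal (Calico plus Flannel) usually names its interfaces, and the function names are hypothetical.

```python
import os
from typing import Iterable, List, Tuple

# Assumed interface name prefixes for Canal (Calico veths and the Flannel
# VXLAN device); adjust if your setup names them differently.
CNI_PREFIXES: Tuple[str, ...] = ("cali", "flannel")


def filter_cni_interfaces(
    names: Iterable[str], prefixes: Tuple[str, ...] = CNI_PREFIXES
) -> List[str]:
    """Return the sorted interface names that start with one of the prefixes."""
    return sorted(name for name in names if name.startswith(tuple(prefixes)))


def list_host_interfaces(sysfs_net: str = "/sys/class/net") -> List[str]:
    """List network interface names from sysfs (Linux only)."""
    try:
        return os.listdir(sysfs_net)
    except FileNotFoundError:
        return []

# Example: print(filter_cni_interfaces(list_host_interfaces()))
```

An empty result on a node that should be running pods would match the "missing virtual interfaces" symptom described above, although it does not say why they are missing.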
Additional context / logs:
The entire log of the failed canal pod:
Important note:
The issue occurs on Ubuntu 22.04 hosts that have the following network adapter:
BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
We have several other production servers using the same RKE2 version, equipped with Intel NICs (Ethernet Controller 10-Gigabit X540-AT2), and they are functioning properly.