Flannel pod goes into error state after the node loses its internet connection #2008

Open
pret-nitish opened this issue Jun 26, 2024 · 3 comments
@pret-nitish
pret-nitish commented Jun 26, 2024

I am running a single-node Kubernetes cluster. The issue arises when the node loses its internet connection and is then rebooted. When I checked the pods afterwards, they had all gone into the Unknown state and the flannel pod had gone into an error state (CrashLoopBackOff).

Flannel pod logs:
kubectl logs kube-flannel-ds-vtgxv -n kube-flannel
error from server: GET "https://192.168.1.249:10250/containerLogs/kube-flannel-ds-vtgxv/kube-flannel": dial tcp 192.168.1.249:10250: connect network is unreachable
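The error above comes from the API server failing to reach the kubelet on port 10250, so it is not the flannel container's own output. As a minimal sketch, assuming `crictl` is installed on the node and configured for its container runtime (neither is stated in this issue), the container logs can be read directly on the node. The helper name `node_container_logs` is hypothetical.

```shell
#!/bin/sh
# Sketch: read container logs straight from the container runtime on the node,
# bypassing the API server -> kubelet path that fails above.
# Assumption: crictl is installed and pointed at the node's CRI runtime.

# Print the last 50 log lines of every container (running or exited)
# whose name matches the first argument.
node_container_logs() {
    name="$1"
    for id in $(crictl ps -a --name "$name" -q); do
        echo "=== container $id ==="
        crictl logs --tail 50 "$id" 2>&1
    done
}
```

Run as root on the node, for example `node_container_logs kube-flannel`, where `kube-flannel` is the container name shown in the request URL above.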

Expected Behavior

Even if the node has no internet connection and is rebooted, my Kubernetes cluster should keep working and all microservices should be running. What changes should I make to achieve this behavior?

Current Behavior

After the network interface goes down, we see the log message shared above when requesting the pod logs. After the node reboots, all pods go into the Unknown state and flannel goes into an error state.

Steps to Reproduce (for bugs)

Install flannel with the commands below.
Create the Kubernetes cluster and install the microservices (MS) and CoreDNS.
Install flannel: kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Now remove the network interface eth0, so the node has no internet connection. Check the logs after some time.
Reboot the node and check the pod status and logs.

Your Environment

  • Flannel version: v0.25.4
  • Backend used (e.g. vxlan or udp): vxlan
  • Etcd version:
  • Kubernetes version (if used): 1.29.5
  • Operating System and version: Ubuntu
  • Link to your project (optional):
@rbrtbnfgl
Contributor

If eth0 is the network interface used by flannel, you can't remove it. The interface should remain up after flannel has started; even without an internet connection, the interface should keep the same IP. If the node doesn't need internet access, I suggest creating a dummy interface with a static IP and using it as the flannel interface.
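A minimal sketch of that suggestion with iproute2 follows. The helper name `create_dummy_iface`, the interface name `dummy0`, and the address `10.10.10.1/24` are placeholders for illustration and do not come from this issue.

```shell
#!/bin/sh
# Sketch: create a dummy interface with a static IP for flannel to bind to.
# Assumptions: iproute2 is available and the commands run as root.
# The interface name and address passed in are placeholders.

create_dummy_iface() {
    iface="$1"
    cidr="$2"
    # Create the interface only if it does not exist yet.
    if ! ip link show "$iface" >/dev/null 2>&1; then
        ip link add "$iface" type dummy || return 1
    fi
    # "replace" adds the address, or updates it if already present.
    ip addr replace "$cidr" dev "$iface" || return 1
    ip link set "$iface" up
}
```

For example, `create_dummy_iface dummy0 10.10.10.1/24`, then point flannel at the interface with `--iface=dummy0`. An interface created this way does not survive a reboot on its own, so it would also need to be declared in the distribution's persistent network configuration.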

@pret-nitish
Author

Thank you for your reply.

I already have a dummy interface with a static IP, which remains on the node even when there is no internet connection. But since it is not the public interface, when the device loses internet and is then rebooted, the flannel pod is unable to start once the system comes back up and goes into the CrashLoopBackOff state. The other pods go into the Unknown state, except the static pods.

Dummy interface configuration (DaemonSet conf):

  • --iface=dummyinterface0
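To confirm that the flag above is actually present on the running DaemonSet, something like the following may help. It is a sketch that assumes the DaemonSet is named `kube-flannel-ds` in the `kube-flannel` namespace and the container is named `kube-flannel`, as the pod names in this issue suggest; the helper name `flannel_iface_flags` is hypothetical.

```shell
#!/bin/sh
# Sketch: list the --iface* arguments configured on the flannel DaemonSet.
# Assumptions: DaemonSet "kube-flannel-ds" in namespace "kube-flannel",
# container named "kube-flannel"; the API server must be reachable.

flannel_iface_flags() {
    kubectl -n kube-flannel get daemonset kube-flannel-ds \
        -o jsonpath='{.spec.template.spec.containers[?(@.name=="kube-flannel")].args}' |
        tr -d '[]"' | tr ',' '\n' | grep -e '^--iface'
}
```

Calling `flannel_iface_flags` prints one matching argument per line and returns a non-zero status if no `--iface` style flag is set.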

Error example (no internet on the node, followed by a reboot):

NAMESPACE      NAME                       READY   STATUS             RESTARTS        AGE
kube-flannel   kube-flannel-ds-xx92g      0/1     CrashLoopBackOff   262 (20s ago)   2d20h
kube-system    coredns-76f75df574-d7r2t   0/1     Unknown            0               2d20h
kube-system    coredns-76f75df574-hdz7c   0/1     Unknown            0               2d20h
kube-system    etcd                       1/1     Running            1 (24h ago)     2d20h

After the reboot, the flannel interface cannot come up until the node regains its internet connection, so the pod status stays the same.

Can you please let us know how we can run our microservices even when the internet goes down and the node is rebooted?
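When reproducing this state, two things that may be worth recording after the reboot are whether the interface given to flannel still has its IPv4 address and whether the node has a default route. The sketch below only reports both; whether either one is the cause here is not established in this thread, and the helper name `check_flannel_prereqs` is hypothetical.

```shell
#!/bin/sh
# Sketch: report basic network state after a reboot without internet.
# Assumption: iproute2 is available; the interface name is passed in.
# This only gathers facts and does not diagnose the failure by itself.

check_flannel_prereqs() {
    iface="$1"
    # Does the interface exist and carry an IPv4 address?
    if ip -4 -o addr show dev "$iface" 2>/dev/null | grep -q 'inet '; then
        echo "iface $iface: has IPv4 address"
    else
        echo "iface $iface: missing or no IPv4 address"
    fi
    # Is there an IPv4 default route on the node?
    if ip -4 route show default | grep -q '^default'; then
        echo "default route: present"
    else
        echo "default route: absent"
    fi
}
```

For example, `check_flannel_prereqs dummyinterface0` on the affected node, with the output attached to the issue alongside the flannel container logs.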

@rbrtbnfgl
Contributor

rbrtbnfgl commented Jul 9, 2024

OK. I'll check whether this can be fixed with a particular configuration or whether it requires a code fix.

@rbrtbnfgl rbrtbnfgl self-assigned this Jul 18, 2024