Issue with Cluster when one of the cluster node is down #5327

nagraj321 · 2024-01-27T14:19:14Z

Environmental Info:
RKE2 Version:
rke2 -v
rke2 version v1.24.10+rke2r1 (1ccdce2)
go version go1.19.5 X:boringcrypto

Node(s) CPU architecture, OS, and Version:

Linux testserver1 6.1.67 #1 SMP PREEMPT_DYNAMIC Tue Dec 19 11:25:42 PST 2023 x86_64 GNU/Linux

Cluster Configuration:

2 Servers and No Agents

Describe the bug:

We followed the steps in https://docs.rke2.io/install/ha, and the cluster works fine:
/var/lib/rancher/rke2/bin/kubectl get nodes --kubeconfig /etc/rancher/rke2/rke2.yaml
NAME          STATUS   ROLES                       AGE   VERSION
testserver1   Ready    control-plane,etcd,master   44m   v1.24.10+rke2r1
testserver2   Ready    control-plane,etcd,master   43m   v1.24.10+rke2r1

When testserver2 goes down, the cluster stops responding:

/var/lib/rancher/rke2/bin/kubectl get nodes --kubeconfig /etc/rancher/rke2/rke2.yaml
Error from server (Timeout): the server was unable to return a response in the time allotted, but may still be processing the request (get nodes)

Steps To Reproduce:

Installed RKE2:

Expected behavior:

We assumed that the cluster would keep working if one of its members goes down.

Actual behavior:

If one of the cluster nodes goes down, we can't access the cluster.

Additional context / logs:


brandond · 2024-01-27T22:56:10Z

A two-node etcd cluster has zero fault tolerance. This is why the RKE2 HA docs note that you need at least 3 nodes.

See also https://etcd.io/docs/v3.5/faq/#what-is-failure-tolerance
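To make the arithmetic concrete, here is a minimal sketch of the majority-quorum rule that the etcd FAQ describes. The function names are illustrative only and are not part of etcd or RKE2; the sketch assumes every server node is a voting etcd member.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def failure_tolerance(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Print the quorum size and tolerated failures for small clusters.
    for n in range(1, 6):
        print(f"{n} members: quorum={quorum(n)}, "
              f"tolerated failures={failure_tolerance(n)}")
```

With 2 members the quorum is 2, so losing either server leaves etcd without a majority and the API server times out, as in the report above. With 3 members the quorum is still 2, so one server can fail.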

brandond closed this as completed Jan 27, 2024
