[Release-1.28] - Stopping rke2-server service on ControlPlane node causes other nodes to go NotReady #5723

Closed
brandond opened this issue Apr 12, 2024 · 1 comment

@brandond
Member

Backport fix for Stopping rke2-server service on ControlPlane node causes other nodes to go NotReady

@fmoral2
Contributor

fmoral2 commented Apr 15, 2024

Validated on Version:

-$ rke2 version v1.28.8+dev.e75a5cb2 (e75a5cb237500d183b0b6d06a114ca60dae80eba)



Environment Details

Infrastructure
Cloud EC2 instance

Node(s) CPU architecture, OS, and Version:
SUSE Linux Enterprise Server 15 SP4

Cluster Configuration:
Split roles:

  • 2 cp only
  • 2 etcd
  • 1 worker

Steps to validate the fix

  1. Create a split-roles cluster.
  2. Stop the rke2-server service on one of the control-plane-only nodes.
  3. Check the status of the other nodes.
  4. Confirm that no other node becomes NotReady or inactive.
  5. Confirm that the pods are still healthy.
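
Steps 3 and 4 can be scripted rather than checked by eye. The following is a minimal sketch, not part of the original validation: `not_ready_nodes` is a hypothetical helper name, and it assumes the input has the column layout of `kubectl get nodes` (header line first, node name in column 1, STATUS in column 2). The `k` used elsewhere in this issue is assumed to be an alias for `kubectl`.

```shell
# Hypothetical helper: print the name of every node whose STATUS is not Ready.
# Reads `kubectl get nodes` style output on stdin and skips the header line.
# A status such as "Ready,SchedulingDisabled" is still treated as Ready.
not_ready_nodes() {
  awk 'NR > 1 && NF >= 2 && $2 !~ /^Ready(,|$)/ { print $1 }'
}

# Assumed usage on a live cluster, after stopping rke2-server on one cp-only node:
#   kubectl get nodes | not_ready_nodes
```

With the fix in place, the only name printed should be the node where the service was stopped.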

Issue reproduction:

 
$ rke2 -v
rke2 version v1.27.11+rke2r1 (6665618680112568f79b1f5992aecf4655e3cf8b)
go version go1.21.7 X:boringcrypto

 
On a control-plane-only node:
$ sudo systemctl stop rke2-server

On another node:



$ k get nodes -o wide
NAME                                          STATUS     ROLES                  AGE   VERSION           INTERNAL-IP     EXTERNAL-IP     OS-IMAGE                              KERNEL-VERSION              CONTAINER-RUNTIME
ip-172-31-1-96.us-east-2.compute.internal     Ready      <none>                 50m   v1.27.11+rke2r1   172.31.1.96     3.128.30.47     SUSE Linux Enterprise Server 15 SP4   5.14.21-150400.22-default   containerd://1.7.11-k3s2
ip-172-31-10-186.us-east-2.compute.internal   NotReady   control-plane,master   59m   v1.27.11+rke2r1   172.31.10.186   3.144.224.153   SUSE Linux Enterprise Server 15 SP4   5.14.21-150400.22-default   containerd://1.7.11-k3s2
ip-172-31-10-201.us-east-2.compute.internal   Ready      control-plane,master   59m   v1.27.11+rke2r1   172.31.10.201   18.224.62.92    SUSE Linux Enterprise Server 15 SP4   5.14.21-150400.22-default   containerd://1.7.11-k3s2
ip-172-31-10-59.us-east-2.compute.internal    NotReady   etcd   
 


Validation Results:

       
       
- Tried from 2 different control-plane nodes. In each case, only the requested node stops.

CP-2
$ k get nodes
NAME                                          STATUS     ROLES                       AGE     VERSION
ip-172-31-0-203.us-east-2.compute.internal    Ready      <none>                      4m43s   v1.28.8+rke2r1
ip-172-31-12-163.us-east-2.compute.internal   Ready      <none>                      5m57s   v1.28.8+rke2r1
ip-172-31-15-161.us-east-2.compute.internal   Ready      etcd                        9m48s   v1.28.8+rke2r1
ip-172-31-2-204.us-east-2.compute.internal    Ready      control-plane,master        9m8s    v1.28.8+rke2r1
ip-172-31-3-191.us-east-2.compute.internal    Ready      control-plane,etcd,master   13m     v1.28.8+rke2r1
ip-172-31-3-246.us-east-2.compute.internal    Ready      etcd                        10m     v1.28.8+rke2r1
ip-172-31-9-194.us-east-2.compute.internal    Ready      <none>                      5m22s   v1.28.8+rke2r1
ip-172-31-9-205.us-east-2.compute.internal    NotReady   control-plane,master        9m18s   v1.28.8+rke2r1


CP-1
$ k get nodes
NAME                                          STATUS     ROLES                       AGE   VERSION
ip-172-31-0-203.us-east-2.compute.internal    Ready      <none>                      19m   v1.28.8+rke2r1
ip-172-31-12-163.us-east-2.compute.internal   Ready      <none>                      20m   v1.28.8+rke2r1
ip-172-31-15-161.us-east-2.compute.internal   Ready      etcd                        24m   v1.28.8+rke2r1
ip-172-31-2-204.us-east-2.compute.internal    NotReady   control-plane,master        23m   v1.28.8+rke2r1
ip-172-31-3-191.us-east-2.compute.internal    Ready      control-plane,etcd,master   27m   v1.28.8+rke2r1
ip-172-31-3-246.us-east-2.compute.internal    Ready      etcd                        25m   v1.28.8+rke2r1
ip-172-31-9-194.us-east-2.compute.internal    Ready      <none>                      20m   v1.28.8+rke2r1
ip-172-31-9-205.us-east-2.compute.internal    Ready      control-plane,master        24m   v1.28.8+rke2r1



@fmoral2 fmoral2 closed this as completed Apr 15, 2024