Failed to get MemberList from server #6872

Closed
cloudcafetech opened this issue Sep 26, 2024 · 4 comments

@cloudcafetech

Environmental Info:

  • RKE2 Version:
rke2 -v
rke2 version v1.30.4+rke2r1 (9517eea519b780e154dd791c555c698e84a0e5cd)
go version go1.22.5 X:boringcrypto

Node(s) CPU architecture, OS, and Version:
Linux 5.14.0-427.33.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 16 10:56:24 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
MASTER1:

token: 225mgm-secret
write-kubeconfig-mode: "0644"
cluster-cidr: 10.244.0.0/14
service-cidr: 192.168.0.0/16
node-label:
- "region=master"
tls-san:
  - "172.27.2.209"
  - "172.27.2.219"
  - "172.27.2.223"
  - "172.27.2.225"
# SELINUX
selinux: true

MASTER2:

server: https://172.27.2.209:9345
token: 225mgm-secret
write-kubeconfig-mode: "0644"
cluster-cidr: 10.244.0.0/14
service-cidr: 192.168.0.0/16
node-label:
- "region=master"
tls-san:
  - "172.27.2.209"
  - "172.27.2.219"
  - "172.27.2.223"
  - "172.27.2.225"
# SELINUX
selinux: true

The first server (172.27.2.209) is reachable on port 9345:

# nc -v 172.27.2.209 9345
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 172.27.2.209:9345.
^C

Describe the bug:

The second server is unable to join the first one; it fails with the errors below:

Sep 26 04:59:10 test rke2[2493800]: {"level":"warn","ts":"2024-09-26T04:59:10.746954+0200","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0011981e0/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused\""}
Sep 26 04:59:10 test rke2[2493800]: time="2024-09-26T04:59:10+02:00" level=error msg="Failed to check local etcd status for learner management: context deadline exceeded"
Sep 26 04:59:11 test rke2[2493800]: time="2024-09-26T04:59:11+02:00" level=info msg="Waiting to retrieve etcd cluster member list: failed to get MemberList from server: Internal error occurred: failed to get etcd MemberList: context deadline exceeded"
Sep 26 04:59:13 test rke2[2493800]: time="2024-09-26T04:59:13+02:00" level=info msg="Waiting to retrieve etcd cluster member list: failed to get MemberList from server: Internal error occurred: failed to get etcd MemberList: context deadline exceeded"
Sep 26 04:59:15 test rke2[2493800]: time="2024-09-26T04:59:15+02:00" level=info msg="Waiting to retrieve etcd cluster member list: failed to get MemberList from server: Internal error occurred: failed to get etcd MemberList: context deadline exceeded"
Sep 26 04:59:16 test rke2[2493800]: {"level":"warn","ts":"2024-09-26T04:59:16.73584+0200","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0011981e0/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused\""}
Sep 26 04:59:16 test rke2[2493800]: time="2024-09-26T04:59:16+02:00" level=info msg="Failed to test data store connection: context deadline exceeded"
Sep 26 04:59:17 test rke2[2493800]: time="2024-09-26T04:59:17+02:00" level=info msg="Waiting to retrieve etcd cluster member list: failed to get MemberList from server: Internal error occurred: failed to get etcd MemberList: context deadline exceeded"
# journalctl -f -u rke2-server | grep "failed to get MemberList"
Sep 26 05:39:10 test rke2[2589133]: time="2024-09-26T05:39:10+02:00" level=info msg="Waiting to retrieve etcd cluster member list: failed to get MemberList from server: Internal error occurred: failed to get etcd MemberList: context deadline exceeded"
Sep 26 05:39:12 test rke2[2589133]: time="2024-09-26T05:39:12+02:00" level=info msg="Waiting to retrieve etcd cluster member list: failed to get MemberList from server: Internal error occurred: failed to get etcd MemberList: context deadline exceeded"
Sep 26 05:39:14 test rke2[2589133]: time="2024-09-26T05:39:14+02:00" level=info msg="Waiting to retrieve etcd cluster member list: failed to get MemberList from server: Internal error occurred: failed to get etcd MemberList: context deadline exceeded"
Sep 26 05:39:16 test rke2[2589133]: time="2024-09-26T05:39:16+02:00" level=info msg="Waiting to retrieve etcd cluster member list: failed to get MemberList from server: Internal error occurred: failed to get etcd MemberList: context deadline exceeded"
@brandond
Member

brandond commented Sep 26, 2024

Figure out why the etcd pod isn't running on the existing server node. Have you checked the kubelet.log, and etcd pod logs under /var/log/pods?

Note that RKE2 does not have "master" nodes, just servers and agents.
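
The checks suggested above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the helper names `kubelet_etcd_lines` and `etcd_pod_logs` are hypothetical, the kubelet log path assumes a default RKE2 install, and the glob assumes the etcd static pod lives in `kube-system` with a container named `etcd`. Adjust the paths if your layout differs.

```shell
#!/bin/sh
# Sketch: find the logs that may explain why etcd is not running on the
# existing server node. All paths below are assumptions (default RKE2 layout).

# Kubelet log written by RKE2 (assumed default location).
KUBELET_LOG="${KUBELET_LOG:-/var/lib/rancher/rke2/agent/logs/kubelet.log}"
# Directory where the kubelet stores per-pod container logs.
POD_LOG_DIR="${POD_LOG_DIR:-/var/log/pods}"

# Print the most recent kubelet log lines that mention etcd.
# $1: number of lines to show (default 50).
kubelet_etcd_lines() {
  if [ ! -r "$KUBELET_LOG" ]; then
    echo "kubelet log not found: $KUBELET_LOG" >&2
    return 1
  fi
  grep -i 'etcd' "$KUBELET_LOG" | tail -n "${1:-50}"
}

# Print the tail of every etcd container log under the pod log directory.
# $1: number of lines to show per file (default 50).
etcd_pod_logs() {
  found=0
  for f in "$POD_LOG_DIR"/kube-system_etcd-*/etcd/*.log; do
    [ -e "$f" ] || continue   # glob matched nothing
    found=1
    echo "==> $f"
    tail -n "${1:-50}" "$f"
  done
  if [ "$found" -eq 0 ]; then
    echo "no etcd pod logs under $POD_LOG_DIR" >&2
    return 1
  fi
}
```

After sourcing these definitions as root on the existing server, `kubelet_etcd_lines 100` and `etcd_pod_logs 100` print the last 100 relevant lines from each source. If `etcd_pod_logs` reports that no logs exist, the etcd pod was probably never created, and the kubelet log is the place to look first.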

@brandond reopened this Sep 26, 2024
@cloudcafetech
Author

It looks like the issue was specific to v1.30.4; after switching to a newer version (v1.30.5), the problem went away.

By the way, do you have any advice on how to find the right stable (bug-free) version?

@brandond
Member

I'm not aware of any issues with etcd in v1.30.4+rke2r1, so it is unlikely that whatever was going on was resolved by the change in version. Without logs I really can't say though.

In general I'd recommend the latest version available. If we're aware of a bug, we either fix it, or call it out in the release notes.

@cloudcafetech
Author

Thank you for the reply.

Based on your issue (#5804), I decided to try another version, and it works.

The biggest challenge is that we are setting this up in an air-gapped environment :)

Once again, thanks for the prompt reply.
