RKE 1.26 fails to start on RHEL 8.7; etcd server doesn't start #5093

Closed
NullOranje opened this issue Dec 1, 2023 · 7 comments

@NullOranje

NullOranje commented Dec 1, 2023

Environmental Info:
RKE2 Version: v1.26.10+rke2r2 (21e3a8c)

Node(s) CPU architecture, OS, and Version: Linux rke-test-03 4.18.0-425.3.1.el8.x86_64 #1 SMP Fri Sep 30 11:45:06 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux, RHEL 8.7

Cluster Configuration: 1 server

Describe the bug:

RKE2 server fails to start. The issue seems to be with the etcd container not starting.

"Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"etcd-rke-test-03_kube-system(cc550ce6fb82de09bc9f6236654e9ea2)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"etcd-rke-test-03_kube-system(cc550ce6fb82de09bc9f6236654e9ea2)\\\": rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: write /proc/self/attr/keycreate: invalid argument: unknown\"" pod="kube-system/etcd-rke-test-03"

Steps To Reproduce:

  • Fresh install of RHEL 8.7 with SELinux enabled
  • Installed RKE2 via RPM method (

Expected behavior:

RKE2 is ready

Actual behavior:

RKE2 fails to start. The etcd container never starts.

Additional context / logs:

Seems to be another issue with the rke2-selinux package

# rpm -qa | grep rke2
rke2-common-1.26.10~rke2r2-0.el8.x86_64
rke2-selinux-0.16-1.el8.noarch
rke2-server-1.26.10~rke2r2-0.el8.x86_64

Running transaction
  Preparing        :                                                                                                                                                         
  Running scriptlet: rke2-selinux-0.16-1.el8.noarch
  Installing       : rke2-selinux-0.16-1.el8.noarch
  Running scriptlet: rke2-selinux-0.16-1.el8.noarch
Failed to resolve typeattributeset statement at /var/lib/selinux/targeted/tmp/modules/400/rke2/cil:63
semodule:  Failed!

# cat /usr/share/selinux/packages/rke2.pp   | /usr/libexec/selinux/hll/pp > rke2.cli
# sed -n 63p rke2.cli
(typeattributeset cil_gen_require container_file_type)
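
The failing line is a `cil_gen_require` statement, meaning the module expects `container_file_type` to be defined by another policy package already on the host. A minimal sketch for listing every such external requirement in the converted file, so each name can be checked against the installed policy. The function name `list_cil_requires` is hypothetical and not part of any RKE2 or SELinux tooling.

```shell
#!/bin/sh
# Hypothetical helper: print every name that a CIL policy file declares
# as an external requirement via "(typeattributeset cil_gen_require NAME)".
# $1: path to a CIL file, such as the rke2.cli produced above.
list_cil_requires() {
    sed -n 's/^[[:space:]]*(typeattributeset cil_gen_require \([^)[:space:]]*\)).*$/\1/p' "$1"
}
```

For example, `list_cil_requires rke2.cli | sort -u` prints one required name per line. Each name can then be looked up in the loaded policy, for instance with `seinfo -a container_file_type` if the setools-console package is installed (an assumption about the host, not something shown in this report).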
@brandond
Member

brandond commented Dec 1, 2023

What version of the container-selinux package do you have on your node?

@NullOranje
Author

container-selinux-2.189.0-1.module+el8.7.0+16772+33343656.noarch

@brandond
Member

brandond commented Dec 5, 2023

@galal-hussein any ideas? This looks like a new container-selinux incompatibility, but not one I've seen before.

@galal-hussein
Contributor

The issue was caused by the container_file() macro, which we call to support new TLS types in rke2-selinux. Apparently container-selinux 2.189 doesn't define this macro, so I reverted to files_type instead. You should be able to test the fix with the testing channel for rke2.

galal-hussein self-assigned this Dec 5, 2023
@NullOranje
Author

NullOranje commented Dec 5, 2023

That seems to have resolved my issue. Thanks @galal-hussein !

ShylajaDevadiga self-assigned this Dec 7, 2023
@ShylajaDevadiga
Contributor

The issue was not reproducible on RHEL 8.7 because the AMI used had a newer version of container-selinux.

Validated install and upgrade on RHEL 8.7 for regression using testing channel. Both scenarios have passed.

[ec2-user@ip-172-31-6-172 ~]$ kubectl get nodes
NAME                                          STATUS     ROLES                       AGE     VERSION
ip-172-31-12-120.us-east-2.compute.internal   NotReady   <none>                      86s     v1.26.10+rke2r2
ip-172-31-6-172.us-east-2.compute.internal    Ready      control-plane,etcd,master   8m44s   v1.26.10+rke2r2
[ec2-user@ip-172-31-6-172 ~]$ rpm -qa|grep container
container-selinux-2.221.0-1.module+el8.9.0+20326+387084d0.noarch
[ec2-user@ip-172-31-6-172 ~]$ 

@brandond
Member

brandond commented Dec 7, 2023

I suspect this will only affect users that have locked their hosts to el8.7 or el8.8. el8.9 and newer have the new type.
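
A rough sketch of how a host could be checked before installing, by comparing the installed container-selinux version against one known to work. The function names are hypothetical, and the known-good version is supplied by the caller: this thread only shows that 2.189.0 fails and 2.221.0 works, not the exact minimum. The comparison relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when version $1 is strictly
# older than version $2, using GNU sort's version ordering.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical helper: report whether the installed container-selinux
# is at least a caller-supplied known-good version.
# $1: installed version, $2: known-good version (an assumption by the caller)
check_container_selinux() {
    if version_lt "$1" "$2"; then
        echo "container-selinux $1 is older than $2; rke2-selinux may fail to load"
        return 1
    fi
    echo "container-selinux $1 is at least $2"
}
```

On an RPM-based host the installed version can be passed in with `check_container_selinux "$(rpm -q --queryformat '%{VERSION}' container-selinux)" 2.221.0`, where 2.221.0 is simply the version reported as working above.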
