Generated node local coredns config has extra newline and is invalid #5156

Closed
bmhughes opened this issue Dec 21, 2023 · 9 comments
@bmhughes

bmhughes commented Dec 21, 2023

Environmental Info:
RKE2 Version:

rke2 version v1.27.8+rke2r1 (77c9470934d7073341fb297aefa2dda0b97909c9)
go version go1.20.11 X:boringcrypto

Node(s) CPU architecture, OS, and Version:

Linux 5.14.0-362.8.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 8 17:36:32 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

Describe the bug:

When enabling node local DNS, an invalid configuration is generated with an additional newline after the forward option, which coredns rejects as an invalid config.

The issue appears to be here, as adding a - to strip the newline fixes the problem. It might have something to do with what the split function returns, as it does not occur with the static definition of 10.43.0.10 nor with the previous behaviour in 4811c82.
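
To illustrate the trim marker mentioned above, here is a minimal sketch using Go's text/template, the engine Helm templates are built on. It is not the chart's actual helper: the helper name dnsIP, the ClusterDNS field, the simplified split function (a stand-in for the Sprig function Helm provides) and the IPv6 address are all assumptions made for the example. It only shows how a newline before an end action leaks into the output, and how "{{-" removes it.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// funcs registers a simplified, hypothetical stand-in for Sprig's split
// (separator first, then the string).
var funcs = template.FuncMap{
	"split": func(sep, s string) []string { return strings.Split(s, sep) },
}

// helperLoose is a hypothetical helper: the newline before {{ end }} is
// ordinary template text, so it becomes part of the helper's output.
const helperLoose = `{{ define "dnsIP" }}{{ $ips := split "," .ClusterDNS }}{{ index $ips 0 }}
{{ end }}`

// helperTrimmed is the same helper with "{{-" on the end action, which
// strips the whitespace (including the newline) immediately before it.
const helperTrimmed = `{{ define "dnsIP" }}{{ $ips := split "," .ClusterDNS }}{{ index $ips 0 }}
{{- end }}`

// corefileLine mimics a config line where "{" has to stay on the same line.
const corefileLine = `forward . {{ template "dnsIP" . }} {`

// render parses the helper together with the config line and executes it.
func render(helper, clusterDNS string) string {
	t := template.Must(template.New("corefile").Funcs(funcs).Parse(helper + corefileLine))
	var sb strings.Builder
	if err := t.Execute(&sb, map[string]string{"ClusterDNS": clusterDNS}); err != nil {
		panic(err)
	}
	return sb.String()
}

func main() {
	dualStack := "10.43.0.10,2001:cafe:42:1::a" // illustrative dual-stack value
	fmt.Printf("%q\n", render(helperLoose, dualStack))   // "forward . 10.43.0.10\n {"
	fmt.Printf("%q\n", render(helperTrimmed, dualStack)) // "forward . 10.43.0.10 {"
}
```

In the untrimmed variant the opening brace lands on its own line, which matches the symptom described above. The real template may reach that state through a different code path.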

Steps To Reproduce:

  • Install RKE2
  • Enable node local DNS

Expected behavior:
It works

Actual behavior:
An invalid config is generated and coredns is stuck in CrashLoopBackOff until the ConfigMap is manually fixed

Additional context / logs:

@brandond
Member

brandond commented Dec 21, 2023

Is this on a dual-stack cluster? Are you overriding the ClusterDNS address? Just from glancing at the template you linked, this would only appear to be a problem when .Values.global.clusterDNS contains multiple entries, which I don't believe it would by default even on a dual-stack cluster. However, you didn't mention either dual-stack OR a custom ClusterDNS address list in your steps to reproduce.

@bmhughes
Author

bmhughes commented Dec 21, 2023

Apologies, my mistake leaving that out. Yes, this is a dual-stack cluster and .Values.global.clusterDNS has both the v4 and v6 address present.

@brandond
Member

brandond commented Dec 21, 2023

Ok. And you manually set it to both addresses? It works fine if you leave it as the default single-stack value?

@bmhughes
Author

No, I haven't manually set the value. I can see it in the set section of the HelmChart resource, but as far as I am aware it's been set automatically. I presume it's been pulled from the service?

@brandond
Member

brandond commented Dec 22, 2023

It is passed through from the --cluster-dns option, and the default comes from the service CIDR config. Did you customize the --cluster-dns option, or did you just customize --cluster-cidr and --service-cidr to enable dual-stack operation?
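
As a rough illustration of how a default derived from the service CIDR config can end up holding two addresses on a dual-stack cluster, here is a small Go sketch. The function names are hypothetical, and the offset of 10 is an assumption inferred from the 10.43.0.10 address mentioned earlier in this thread. It is not taken from the RKE2 source.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// indexedIP returns the address n steps after the first address of cidr.
func indexedIP(cidr string, n int) (netip.Addr, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return netip.Addr{}, err
	}
	addr := p.Masked().Addr()
	for i := 0; i < n; i++ {
		addr = addr.Next()
	}
	if !p.Contains(addr) {
		return netip.Addr{}, fmt.Errorf("offset %d is outside %s", n, cidr)
	}
	return addr, nil
}

// defaultClusterDNS derives one DNS address per service CIDR, using the
// assumed offset of 10, and joins them with commas.
func defaultClusterDNS(serviceCIDRs string) (string, error) {
	var out []string
	for _, cidr := range strings.Split(serviceCIDRs, ",") {
		ip, err := indexedIP(strings.TrimSpace(cidr), 10)
		if err != nil {
			return "", err
		}
		out = append(out, ip.String())
	}
	return strings.Join(out, ","), nil
}

func main() {
	// Service CIDRs taken from the dual-stack config used later in this thread.
	dns, err := defaultClusterDNS("10.43.0.0/16,2001:cafe:42:1::/112")
	if err != nil {
		panic(err)
	}
	fmt.Println(dns) // 10.43.0.10,2001:cafe:42:1::a
}
```

With a single-stack service CIDR the result is one address, so any code path that only matters for a comma-separated list would not be exercised.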

@bmhughes
Author

Just the latter, --cluster-dns is unset.

@brandond
Member

cc @manuelbuil would you mind taking a look at this?

@brandond brandond added this to the v1.29.1+rke2r1 milestone Dec 22, 2023
@bmhughes bmhughes changed the title from "Generated node local coredns config is has extra newline and is invalid" to "Generated node local coredns config has extra newline and is invalid" Dec 23, 2023
@manuelbuil
Contributor

Thanks for reporting it! I was able to reproduce the issue and found out that there was a bug in the helm templating helper code. Here is the fix: rancher/rke2-charts#387

@VestigeJ
Contributor

Environment Details
Reproduced using VERSION=v1.29.0+rke2r1
Validated using VERSION=v1.29.1-rc2+rke2r1

Infrastructure

  • Cloud

Node(s) CPU architecture, OS, and version:

Linux 5.11.0-1022-aws x86_64 GNU/Linux
PRETTY_NAME="Ubuntu 20.04.3 LTS"

Cluster Configuration:

NAME               STATUS   ROLES                       AGE     VERSION
ip-19-18-18-186    Ready    control-plane,etcd,master   2m25s   v1.29.0+rke2r1

Config.yaml:

token: YOUR_TOKEN_HERE
write-kubeconfig-mode: 644
debug: true
cni: multus,cilium
profile: cis
selinux: true
cluster-cidr: 10.42.0.0/16,2001:cafe:42:0::/56
service-cidr: 10.43.0.0/16,2001:cafe:42:1::/112

Reproduction

$ curl https://get.rke2.io --output install-"rke2".sh
$ sudo chmod +x install-"rke2".sh
$ sudo groupadd --system etcd && sudo useradd -s /sbin/nologin --system -g etcd etcd
$ sudo modprobe ip_vs_rr
$ sudo modprobe ip_vs_wrr
$ sudo modprobe ip_vs_sh
$ sudo printf "vm.panic_on_oom=0 \nvm.overcommit_memory=1 \nkernel.panic=10 \nkernel.panic_ps=1 \nkernel.panic_on_oops=1 \n" > ~/60-rke2-cis.conf
$ sudo cp 60-rke2-cis.conf /etc/sysctl.d/
$ sudo systemctl restart systemd-sysctl
$ vim nlocal.yaml
$ sudo mkdir -p /var/lib/rancher/rke2/server/manifests/
$ sudo cp nlocal.yaml /var/lib/rancher/rke2/server/manifests/
$ VERSION=v1.29.0+rke2r1
$ setup_rke2
$ sudo INSTALL_RKE2_VERSION=$VERSION INSTALL_RKE2_EXEC=server ./install-rke2.sh
$ go_rke2
$ set_kubefig
$ kgn
$ kgp -A

Results:

$ kgp -n kube-system

kube-system   node-local-dns-l8jl2                                   0/1     CrashLoopBackOff   1 (15s ago)   109s

Install using VERSION=v1.29.1-rc2+rke2r1
$ kgp -n kube-system

kube-system   node-local-dns-2k2jl                                    1/1     Running     0               2m55s
kube-system   rke2-coredns-rke2-coredns-9849d5ddb-bblgf               1/1     Running     0               2m55s
kube-system   rke2-coredns-rke2-coredns-autoscaler-64b867c686-tw77g   1/1     Running     0               2m55s
