RKE2/Containerd Not Applying Rewrite Rules in /etc/rancher/rke2/registries.yaml #6889
Comments
I am not aware of any issues with rewrites. Can you provide a specific example of registries.yaml content that does not apply rewrites? Note that rewrite rules ONLY apply when pulling images from a mirror endpoint; rewrites are NOT intended to apply when pulling an image directly from the registry itself (ie, when using the registry's default endpoint). If you add a wildcard entry with rewrites, but no endpoints, this is not expected to do anything. |
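The rule described above can be modelled with a small Python sketch. This is a hypothetical, simplified model for illustration, not RKE2 or containerd code: the function name `resolve_pull_targets`, the dictionary layout, the "first matching rule wins" behaviour, and the `https://<registry>` form of the default endpoint are all assumptions.

```python
import re


def resolve_pull_targets(image_repo, registry, mirrors):
    """Return the (endpoint, repository) pairs a pull would try, in order.

    Hypothetical model: rewrites are applied only for mirror endpoints,
    never for the registry's own default endpoint.
    """
    entry = mirrors.get(registry, {})
    targets = []
    for endpoint in entry.get("endpoint", []):
        repo = image_repo
        for pattern, replacement in entry.get("rewrite", {}).items():
            # registries.yaml uses $1-style group references; Python's re uses \1.
            py_replacement = re.sub(r"\$(\d+)", r"\\\1", replacement)
            new_repo, count = re.subn(pattern, py_replacement, repo)
            if count:
                # Assumption of this sketch: the first matching rule wins.
                repo = new_repo
                break
        targets.append((endpoint, repo))
    # Default endpoint (simplified as https://<registry>): no rewrite is applied.
    targets.append((f"https://{registry}", image_repo))
    return targets


rule = {"^rancher/(.*)": "docker-internal/rancher/$1"}

# A mirror entry for docker.io with an endpoint: the rewrite applies to the mirror.
with_endpoint = {
    "docker.io": {"endpoint": ["https://dockerhub.internal.com"], "rewrite": rule}
}
# An entry with a rewrite but no endpoints: nothing is rewritten.
without_endpoint = {"dockerhub.internal.com": {"rewrite": rule}}

print(resolve_pull_targets("rancher/rke2-runtime", "docker.io", with_endpoint))
print(resolve_pull_targets("rancher/rke2-runtime", "dockerhub.internal.com", without_endpoint))
```

In this model the second call returns only the unrewritten default endpoint, which matches the statement that a rewrite without endpoints is not expected to do anything.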
I have a similar issue. I am prepping for upgrades from rancher
In my test environment, I have bootstrapped RKE2 with 1.30.5:
k get nodes
NAME STATUS ROLES AGE VERSION
node001-29ed1474 Ready control-plane,etcd,master,worker 12d v1.30.5+rke2r1
node002-29ed1474 Ready control-plane,etcd,master,worker 7d v1.30.5+rke2r1
node003-29ed1474 Ready control-plane,etcd,master,worker 7d v1.30.5+rke2r1
node-001-fe6af43f Ready worker 12d v1.30.5+rke2r1
node-002-fe6af43f Ready worker 7d v1.30.5+rke2r1
node-003-fe6af43f Ready worker 7d v1.30.5+rke2r1
with this registry configuration:
mirrors:
  dockerhub.internal.com:
    endpoint:
      - "https://dockerhub.internal.com"
    rewrite:
      "^rancher/(.*)": "docker-internal/rancher/$1"
  dockerhub-master.internal.com:
    endpoint:
      - "https://dockerhub-master.internal.com"
    rewrite:
      "^rancher/(.*)": "docker-internal/rancher/$1"
  oci.internal.com:
    endpoint:
      - "https://oci.internal.com"
    rewrite:
      "^rancher/(.*)": "docker-internal/rancher/$1"
configs:
  dockerhub.internal.com:
    auth:
      password: redacted
      username: username
  dockerhub-master.internal.com:
    auth:
      password: redacted
      username: username
  qa-oci.internal.com:
    auth:
      password: redacted
      username: username
With this configuration, image pulls failed with image not found. Since I cannot manually edit and preserve changes in config.toml, which is managed by RKE2, I used my own containerd config template to fix the rewrites. After fixing the rewrites I am able to deploy the Rancher Helm chart version 2.9.3. Creating a downstream cluster then hits the same problem (downstream cluster provisioning, VMs, and other resources are created with Terraform).
In the current live setup I have nearly 250 clusters of different sizes registered in Rancher 2.7.10. On all the nodes, the private registry credentials are rotated often. They are stored in HashiCorp Vault > ExternalSecretsOperator > Rancher fleet-default/ExternalSecret > Rancher fleet-default/Secret, and each cluster's registry config uses this secret to update the credentials on each node automatically. Updating config.toml with config.toml.tmpl in all 250 multi-node clusters is going to be very complex. Is there anything wrong with my registries.yaml? I am not sure why the rewrite is not added to config.toml from /etc/rancher/rke2/registries.yaml. |
I'm really confused by what you're doing here. Why are you trying to apply rewrites when pulling images directly from these registries? Why are you trying to override the desired behavior by providing your own containerd config template? As I said above:
It looks like you're trying to use these private registries as a mirror for docker.io, and apply rewrites when pulling the Rancher images from these registries. In that case, you should actually set these up as mirrors for docker.io, as shown in the RKE2 docs:
mirrors:
  docker.io:
    endpoint:
      - "https://dockerhub.internal.com"
      - "https://dockerhub-master.internal.com"
      - "https://qa-oci.internal.com"
    rewrite:
      "^rancher/(.*)": "docker-internal/rancher/$1"
configs:
  dockerhub.internal.com:
    auth:
      password: redacted
      username: username
  dockerhub-master.internal.com:
    auth:
      password: redacted
      username: username
  qa-oci.internal.com:
    auth:
      password: redacted
      username: username
Specifying mirrors and rewrites in containerd's config.toml has LONG been deprecated. Recent releases of RKE2 now put this configuration where it belongs, in files under |
Thanks @brandond for the reply. I am using rewrites in registries.yaml for pulling images from
Yes, I read in the containerd documentation that using mirrors and rewrites in containerd's config.toml is deprecated. I don't have plans to update the containerd's
In my as you said
I don't need to use mirrors. I just pull images directly from my private registry, but the images are uploaded to my private registry at path
configs:
  dockerhub.internal.com:
    auth:
      password: redacted
      username: username
  dockerhub-master.internal.com:
    auth:
      password: redacted
      username: username
  qa-oci.internal.com:
    auth:
      password: redacted
      username: username
|
Yep - if they're in your private registry under the same name, then you can just set system-default-registry in the config.yaml, and provide creds in registries.yaml. |
Ok. I will need to check with the internal team that maintains the private Artifactory to see if I can get a project named rancher to use as the path /rancher/*. If I do need to use a custom path, what is the correct configuration, for example for /docker-internal/rancher/*? |
Leave |
Ok. I will try and test it Thanks @brandond |
With the configuration below, the RKE2 cluster bootstrapped successfully.
mirrors:
  docker.io:
    endpoint:
      - https://dockerhub-master.internal.com
      - https://dockerhub.internal.com
      - https://oci.internal.com
    rewrite:
      ^rancher/(.*): docker-internal/3rdparty/rancher/$1
configs:
  dockerhub-master.internal.com:
    auth:
      username: username
      password: password
  dockerhub.internal.com:
    auth:
      username: username
      password: password
  oci.internal.com:
    auth:
      username: username
      password: password
I checked the images used in pods:
k get pods --all-namespaces -o jsonpath="{..image}" | tr -s '[[:space:]]' '\n' | sort | uniq
docker.io/rancher/fleet-agent:v0.8.1
docker.io/rancher/hardened-calico:v3.27.2-build20240308
docker.io/rancher/hardened-cluster-autoscaler:v1.8.10-build20240124
docker.io/rancher/hardened-coredns:v1.11.1-build20240305
docker.io/rancher/hardened-etcd:v3.5.9-k3s1-build20230802
docker.io/rancher/hardened-flannel:v0.24.3-build20240307
docker.io/rancher/hardened-k8s-metrics-server:v0.6.3-build20231009
docker.io/rancher/hardened-kubernetes:v1.26.15-rke2r1-build20240314
docker.io/rancher/klipper-helm:v0.8.3-build20240228
docker.io/rancher/kube-api-auth:v0.2.0
docker.io/rancher/mirrored-sig-storage-snapshot-controller:v6.2.1
docker.io/rancher/mirrored-sig-storage-snapshot-validation-webhook:v6.2.2
docker.io/rancher/nginx-ingress-controller:nginx-1.9.3-hardened1
docker.io/rancher/rancher-agent:v2.7.10
docker.io/rancher/rancher-webhook:v0.3.6
docker.io/rancher/rke2-cloud-provider:v1.26.3-build20230406
docker.io/rancher/shell:v0.1.21
docker.io/rancher/system-agent:v0.3.3-suc
docker.io/rancher/system-upgrade-controller:v0.11.0
index.docker.io/rancher/hardened-etcd:v3.5.9-k3s1-build20230802
index.docker.io/rancher/hardened-kubernetes:v1.26.15-rke2r1-build20240314
index.docker.io/rancher/rke2-cloud-provider:v1.26.3-build20230406
rancher/fleet-agent:v0.8.1
rancher/hardened-calico:v3.27.2-build20240308
rancher/hardened-cluster-autoscaler:v1.8.10-build20240124
rancher/hardened-coredns:v1.11.1-build20240305
rancher/hardened-flannel:v0.24.3-build20240307
rancher/hardened-k8s-metrics-server:v0.6.3-build20231009
rancher/klipper-helm:v0.8.3-build20240228
rancher/kube-api-auth:v0.2.0
rancher/mirrored-sig-storage-snapshot-controller:v6.2.1
rancher/mirrored-sig-storage-snapshot-validation-webhook:v6.2.2
rancher/nginx-ingress-controller:nginx-1.9.3-hardened1
rancher/rancher-agent:v2.7.10
rancher/rancher-webhook:v0.3.6
rancher/shell:v0.1.21
rancher/system-agent:v0.3.3-suc
rancher/system-upgrade-controller:v0.11.0
I also verified the containerd certs.d directory, and docker.io/hosts.toml has the proper rewrite directives:
cat /var/lib/rancher/rke2/agent/etc/containerd/certs.d/docker.io/hosts.toml
# File generated by rke2. DO NOT EDIT.
server = "https://registry-1.docker.io/v2"
capabilities = ["pull", "resolve", "push"]
[host."https://dockerhub-master.internal.com/v2"]
capabilities = ["pull", "resolve"]
[host."https://dockerhub-master.internal.com/v2".rewrite]
"^rancher/(.*)" = "docker-internal/3rdparty/rancher/$1"
[host."https://dockerhub.internal.com/v2"]
capabilities = ["pull", "resolve"]
[host."https://dockerhub.internal.com/v2".rewrite]
"^rancher/(.*)" = "docker-internal/3rdparty/rancher/$1"
[host."https://qa-oci.iotcc.internal.com/v2"]
capabilities = ["pull", "resolve"]
[host."https://qa-oci.iotcc.internal.com/v2".rewrite]
"^rancher/(.*)" = "docker-internal/3rdparty/rancher/$1"
Based on the pod images, my understanding is that the images are downloaded from the public internet, not from the private registry. All the Rancher images are uploaded to the internal private registry, stored at
For testing, I completely removed the rewrite directive to make it fail, and then I got this warning message:
92:Nov 08 18:01:21 kbc-001-226c65db.novalocal rancher-system-agent[1079]: time="2024-11-08T18:01:21Z" level=warning msg="Failed to get image from endpoint: GET https://dockerhub-master.internal.com/v2/rancher/system-agent-installer-rke2/manifests/v1.26.15-rke2r1: : Repository 'rancher' not found"
93:Nov 08 18:01:21 kbc-001-226c65db.novalocal rancher-system-agent[1079]: time="2024-11-08T18:01:21Z" level=warning msg="Failed to get image from endpoint: GET https://dockerhub.internal.com/v2/rancher/system-agent-installer-rke2/manifests/v1.26.15-rke2r1: : Repository 'rancher' not found"
However, it continued to download from docker.io and the cluster bootstrapped.
My use case is that I need to download images only internally, from the custom location `dockerhub-master.internal.com/docker-internal/3rdparty/rancher/*` |
The K3s docs are a bit more comprehensive; not all of the content has been migrated over to RKE2 yet:
The image is still from docker.io. The fact that it was actually pulled from an internal mirror, instead of directly from upstream, does not change that. |
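To illustrate why the pod list shows `rancher/...`, `docker.io/rancher/...`, and `index.docker.io/rancher/...` for the same images, here is a simplified sketch of the usual Docker-style image name normalization. The function name `normalize_image` is hypothetical and the rules are a reduced version of the convention (it ignores digests and some edge cases). The point is that the image name identifies the upstream registry regardless of which mirror endpoint served the pull.

```python
def normalize_image(ref):
    """Expand a short image reference to its fully qualified docker.io form.

    Simplified sketch: the first path component counts as a registry host
    only if it contains '.' or ':' or is 'localhost'.
    """
    first, sep, rest = ref.partition("/")
    has_registry = bool(sep) and ("." in first or ":" in first or first == "localhost")
    if has_registry:
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", ref
    # index.docker.io is a legacy alias for docker.io.
    if registry == "index.docker.io":
        registry = "docker.io"
    # Single-component Docker Hub names live under the library/ namespace.
    if registry == "docker.io" and "/" not in remainder:
        remainder = "library/" + remainder
    return f"{registry}/{remainder}"


for ref in (
    "rancher/shell:v0.1.21",
    "docker.io/rancher/shell:v0.1.21",
    "index.docker.io/rancher/hardened-etcd:v3.5.9-k3s1-build20230802",
):
    print(ref, "->", normalize_image(ref))
```

All three spellings resolve to a docker.io name, so the pod spec keeps saying docker.io even when the bytes came from an internal mirror.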
Environmental Info:
RKE2 v1.28.8
Rancher v2.8.3
Describe the bug:
When creating a new cluster via Rancher, RKE2/containerd isn't applying the rewrite rules from /etc/rancher/rke2/registries.yaml, specifically if the file has mirrors.[*].registry..... entries. The docs and examples don't have that wildcard, so it's potentially a formatting change that got missed.
Steps To Reproduce:
For a customer this was reproducible for any cluster they were attempting to create via pipeline as they use the same registries.yaml.
The fix was to edit the /var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl to add in the mirror registry/rewrite rules and rerun the pipeline to create the cluster.
Expected behavior:
For RKE2 / containerd to apply the rewrite rules specified in the registries.yaml file even when they include a wildcard.
Additional context / logs:
Potentially related to #3227