Add support for bare hostname as endpoint, fix unnecessary namespace param inclusion #5516

Closed
caroline-suse-rancher opened this issue Feb 23, 2024 · 1 comment

caroline-suse-rancher commented Feb 23, 2024

An issue to track the work being done in rancher/wharfie#24

Internal Jira Ref: SURE-7640

  • Wharfie does not support a bare hostname as the mirror endpoint in registries.yaml. If the endpoint is a bare hostname, or lacks a URI scheme (http://, https://), the node will not use the endpoint. If the endpoint is necessary to pull the rke2-runtime image, rke2 will fail to start.

  • Wharfie unnecessarily appends the namespace query parameter to the request URI when a registry is listed as its own mirror endpoint. For example, with system-default-registry: registry.example.com and the following content in registries.yaml, the image will fail to pull if the registry is namespace-aware (for example, Sonatype Nexus).

    mirrors:
      registry.example.com:
        endpoint:
          - https://registry.example.com

    When rke2 is running with debug: true, it can be seen that the registry hostname is appended to the request as ?ns=registry.example.com. The namespace query parameter should only be set if the endpoint hostname is not the same as the registry hostname.
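
    The rule stated above (set ns only when the endpoint host differs from the registry host) can be sketched in a few lines. This is a minimal Python illustration, not Wharfie's actual Go implementation; the function name with_namespace and the host mirror.example.com are hypothetical and used only for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_namespace(endpoint_url, registry_host):
    """Append ns=<registry_host> only if the endpoint is a different host.

    Hypothetical helper for illustration; it compares host[:port] as written
    and does not handle userinfo or default-port normalization.
    """
    parts = urlsplit(endpoint_url)
    if parts.netloc.lower() == registry_host.lower():
        # The endpoint is the registry itself, so no namespace hint is needed.
        return endpoint_url
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("ns", registry_host))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Registry listed as its own endpoint: the URL is left untouched.
print(with_namespace("https://registry.example.com/v2/", "registry.example.com"))
# A separate mirror host serving docker.io content: ns is added.
print(with_namespace("https://mirror.example.com/v2/", "docker.io"))
```

    A namespace-aware registry that receives the first URL unchanged no longer sees an unexpected ns value for its own hostname, which is the failure mode described in this issue.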

endawkins commented Feb 28, 2024

Validated on master with 08699df / 1.29

Environment Details

Infrastructure

  • Cloud
  • Hosted

Node(s) CPU architecture, OS, and Version:

Linux ip-172-31-19-58 4.15.0-1051-aws #53-Ubuntu SMP Wed Sep 18 13:35:53 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Cluster Configuration:

4 Servers:
2 Bastion Hosts
2 Airgapped Instances

Config.yaml:

config.yaml 1:
token: test
debug: true
write-kubeconfig-mode: 644
system-default-registry: [REDACTED]

config.yaml 2:
token: test
debug: true
write-kubeconfig-mode: 644
system-default-registry: [REDACTED]

Additional files

registries.yaml
- airgap 1 [registry prefix] w/ URI scheme
---
mirrors:
  docker.io:
    endpoint:
      - "https://[REDACTED]"
  [REDACTED]:
    endpoint:
      - "https://[REDACTED]/testing/v2/"
configs:
  "[REDACTED]":
    tls:
      cert_file: /home/ubuntu/domain.crt
      key_file: /home/ubuntu/domain.key

registries.yaml
- airgap 1 [registry prefix] w/o URI scheme

---
mirrors:
  docker.io:
    endpoint:
      - "[REDACTED]"
  [REDACTED]:
    endpoint:
      - "[REDACTED]/testing/v2/"
configs:
  "[REDACTED]":
    tls:
      cert_file: /home/ubuntu/domain.crt
      key_file: /home/ubuntu/domain.key

registries.yaml
- airgap 2 [no registry prefix] w/o URI scheme

---
mirrors:
  docker.io:
    endpoint:
      - "[REDACTED]"
  [REDACTED]:
    endpoint:
      - "[REDACTED]"
configs:
  "[REDACTED]":
    tls:
      cert_file: /home/ubuntu/domain.crt
      key_file: /home/ubuntu/domain.key
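
To make the difference between the "w/ URI scheme" and "w/o URI scheme" variants above concrete, here is a small Python sketch that finds endpoints lacking a scheme and shows one way to normalize them. It is an illustration only: the helper names are hypothetical, registry.example.com stands in for the redacted hostname, and defaulting to https is an assumption rather than a statement of how Wharfie resolves bare hostnames.

```python
def normalize_endpoint(endpoint, default_scheme="https"):
    """Return the endpoint unchanged if it has a scheme, else prefix one.

    Assumption for illustration: https is the desired default scheme.
    """
    if "://" in endpoint:
        return endpoint
    return f"{default_scheme}://{endpoint}"


def endpoints_missing_scheme(mirrors):
    """List (registry, endpoint) pairs whose endpoint has no URI scheme."""
    missing = []
    for registry, mirror in mirrors.items():
        for endpoint in mirror.get("endpoint", []):
            if "://" not in endpoint:
                missing.append((registry, endpoint))
    return missing


# The "mirrors" section of a registries.yaml, written as a plain dict.
# registry.example.com is a placeholder for the redacted hostname.
mirrors = {
    "docker.io": {"endpoint": ["registry.example.com"]},
    "registry.example.com": {"endpoint": ["registry.example.com/testing/v2/"]},
}

for registry, endpoint in endpoints_missing_scheme(mirrors):
    print(f"{registry}: {endpoint} -> {normalize_endpoint(endpoint)}")
```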

Testing Steps

Note:

Bastion Instance 1 -> Airgap Instance 1
Bastion Instance 2 -> Airgap Instance 2

Air-Gap Setup

  1. Launch two bastion instances from AWS
  2. Launch two airgapped instances from AWS -- disable auto-assign public IP
  3. ssh into the bastion nodes
  4. Copy .pem file to bastion instances
    scp -i "<path_to_pem_file>" <path_to_pem_file> username@<PUBLIC_IP>:~
  5. ssh into airgapped instances

Pull-Through Cache Configuration

  1. Add certificates:
    mkdir -p certs && openssl req -newkey rsa:4096 -nodes -sha256 -keyout certs/domain.key -x509 -days 365 -out certs/domain.crt -subj "/C=US/ST=AZ/O=Rancher QA/CN=[REDACTED]" -addext "subjectAltName = DNS:[REDACTED]"
    mkdir -p certs && openssl req -newkey rsa:4096 -nodes -sha256 -keyout certs/domain.key -x509 -days 365 -out certs/domain.crt -subj "/C=US/ST=AZ/O=Rancher QA/CN=[REDACTED]" -addext "subjectAltName = DNS:[REDACTED]"
  2. Bastion 1:
    sudo docker run -d --restart=always --name registry -v "$(pwd)"/certs:/certs -e REGISTRY_HTTP_ADDR=0.0.0.0:443 -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io -e REGISTRY_HTTP_PREFIX=/testing/ -p 443:443 registry:2.7.1
    Bastion 2:
    sudo docker run -d --restart=always --name registry -v "$(pwd)"/certs:/certs -e REGISTRY_HTTP_ADDR=0.0.0.0:443 -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io -p 443:443 registry:2.7.1
  3. To view the containers running: docker ps -a

Copying Files to Airgapped Instance [Do this for each pair of airgapped instances]

  1. Obtain the rke2 binary and rename it if desired (make sure it's amd64): wget -O <NAME_THE_BINARY_FILE> https://github.com/rancher/rke2/releases/download/<VERSION>/<FILENAME>
  2. ssh into airgapped instance
    ssh -i <file_name>.pem username@<AIRGAP_IP>
  3. Close the connection to the airgapped instance:
    exit
  4. Copy the rke2 binary and certificates to the airgapped instance:
    scp -i <file_name>.pem <rke2_binary_file> username@<AIRGAP_IP>:~
    scp -i <file_name>.pem certs/* username@<AIRGAP_IP>:
  5. ssh to the airgapped instance

Certificates and RKE2 Setup

  1. Update Certificates:
    sudo cp domain.crt /usr/local/share/ca-certificates/ && sudo update-ca-certificates
  2. sudo vi config.yaml (there will be a config.yaml in both airgapped instances - a total of 2)
  3. sudo vi registries.yaml (there will be a registries.yaml in both airgapped instances - a total of 2)
  4. sudo mkdir -p /etc/rancher/rke2/ && sudo cp config.yaml /etc/rancher/rke2/ && cat /etc/rancher/rke2/config.yaml && sudo cp registries.yaml /etc/rancher/rke2/ && sudo cat /etc/rancher/rke2/registries.yaml
    ** Making RKE2 Binaries Executable **
  5. chmod +x <RKE2_BINARY>
  6. sudo mv <RKE2_BINARY> /usr/local/bin/rke2
  7. Check version: rke2 --version
  8. Open two new terminals and ssh into airgap instances 1 and 2
    • In those two terminals, run the following command: sudo rke2 server
  9. source .bashrc
  10. kga
  11. Search the logs for "?ns"
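
The final search step can also be scripted. The Python sketch below assumes the output of sudo rke2 server was saved as text; the helper name find_ns_lines and the sample hostname registry.example.com are hypothetical. It reports every log line whose URL carries an ns query parameter.

```python
import re

# Matches "?ns=" or "&ns=" anywhere in a log line.
NS_PARAM = re.compile(r"[?&]ns=")


def find_ns_lines(log_text):
    """Return (line_number, line) for each line containing an ns parameter."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if NS_PARAM.search(line):
            hits.append((number, line))
    return hits


# Two sample lines in the same shape as the rke2 debug output; the hostname
# is a placeholder for the redacted registry.
sample_log = (
    "DEBU[0002] Registry endpoint URL modified: https://registry.example.com/v2/"
    " => https://registry.example.com/testing/v2/?ns=registry.example.com\n"
    "DEBU[0002] Registry endpoint URL modified: https://registry.example.com/v2/"
    " => https://registry.example.com/testing/v2/\n"
)

for number, line in find_ns_lines(sample_log):
    print(f"line {number}: {line}")
```

An empty result on a build that contains the fix corresponds to the "no ?ns=<registry_hostname> in the logs" outcome.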

Replication Results:

  • rke2 version used for replication:
rke2 --version
rke2 version v1.26.13+rke2r1 (637e8a38334f603b60650b30547252a5c461fa0d)
go version go1.20.13 X:boringcrypto

Observations:

prefix + no URI scheme:

Registry endpoint URL modified: https://[REDACTED]/v2/ => https://[REDACTED]/v2/?ns=[REDACTED]
W0227 22:57:05.649740    2954 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
WARN[0002] Failed to get image from endpoint: GET https://[REDACTED]/v2/?ns=REDACTED: unexpected status code 404 Not Found: 404 page not found
FATA[0002] failed to get runtime image [REDACTED]/rancher/rke2-runtime:v1.26.13-rke2r1: all endpoints failed: GET https://[REDACTED]/v2/?ns=REDACTED: unexpected status code 404 Not Found: 404 page not found

prefix + URI scheme:

INFO[0002] Using private registry config file at /etc/rancher/rke2/registries.yaml
DEBU[0002] Kubelet image credential provider bin directory check failed: stat /var/lib/rancher/credentialprovider/bin: no such file or directory
INFO[0002] Pulling runtime image [REDACTED]/rancher/rke2-runtime:v1.26.13-rke2r1
DEBU[0002] Registry endpoint URL modified: https://[REDACTED]/v2/ => https://[REDACTED]/testing/v2/?ns=[REDACTED]
W0227 22:58:16.196629    2966 logging.go:59] [core] [Channel #3 SubChannel #4] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
DEBU[0002] Registry endpoint URL modified: https://[REDACTED]/v2/rancher/rke2-runtime/manifests/v1.26.13-rke2r1 => https://[REDACTED]/testing/v2/rancher/rke2-runtime/manifests/v1.26.13-rke2r1?ns=[REDACTED]
DEBU[0003] Registry endpoint URL modified: https://[REDACTED]/v2/rancher/rke2-runtime/manifests/sha256:[REDACTED] => https://[REDACTED]/testing/v2/rancher/rke2-runtime/manifests/sha256:[REDACTED]?ns=[REDACTED]
DEBU[0003] Registry endpoint URL modified: https://[REDACTED]/v2/rancher/rke2-runtime/blobs/sha256:[REDACTED] => https://[REDACTED]/testing/v2/rancher/rke2-runtime/blobs/sha256:[REDACTED]?ns=[REDACTED]
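
Each "Registry endpoint URL modified" line above records two changes: the endpoint path prefix is prepended, and the ns parameter is appended. As an aid for reading such logs, the Python sketch below splits one line into those parts. The helper names are hypothetical, registry.example.com stands in for the redacted hostname, and the line format is assumed to match the samples shown here.

```python
from urllib.parse import parse_qs, urlsplit

MARKER = "Registry endpoint URL modified: "


def parse_rewrite(line):
    """Return (original_url, rewritten_url), or None if the line does not match."""
    _, found, rest = line.partition(MARKER)
    if not found:
        return None
    before, arrow, after = rest.partition(" => ")
    if not arrow:
        return None
    return before.strip(), after.strip()


def describe_rewrite(line):
    """Summarize the path change and the ns value (None if absent)."""
    pair = parse_rewrite(line)
    if pair is None:
        return None
    before, after = (urlsplit(url) for url in pair)
    return {
        "path_before": before.path,
        "path_after": after.path,
        "ns": parse_qs(after.query).get("ns", [None])[0],
    }


line = (
    "DEBU[0002] Registry endpoint URL modified: https://registry.example.com/v2/"
    " => https://registry.example.com/testing/v2/?ns=registry.example.com"
)
print(describe_rewrite(line))
```

For the replication logs the ns field equals the registry's own hostname, which is the behavior this issue reports as unnecessary.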

Validation Results:

  • rke2 version used for validation:
rke2 --version
rke2 version v1.29.2-rc3+rke2r1 (08699dfffdf75a61a5e6064f9f8efe8ddae857fe)
go version go1.21.7 X:boringcrypto

  • rke2 starts successfully with no ?ns=<registry_hostname> in the logs

Airgap 1:
INFO[0002] Using private registry config file at /etc/rancher/rke2/registries.yaml
DEBU[0002] Kubelet image credential provider bin directory check failed: stat /var/lib/rancher/credentialprovider/bin: no such file or directory
INFO[0002] Pulling runtime image [REDACTED]/rancher/rke2-runtime:v1.29.2-rc3-rke2r1
DEBU[0002] Registry endpoint URL modified: https://[REDACTED]/v2/ => https://[REDACTED]/testing/v2/
DEBU[0002] Registry endpoint URL modified: https://[REDACTED]/v2/rancher/rke2-runtime/manifests/v1.29.2-rc3-rke2r1 => https://[REDACTED]/testing/v2/rancher/rke2-runtime/manifests/v1.29.2-rc3-rke2r1
DEBU[0003] Registry endpoint URL modified: https://[REDACTED]/v2/rancher/rke2-runtime/manifests/sha256:[REDACTED] => https://[REDACTED]/testing/v2/rancher/rke2-runtime/manifests/sha256:[REDACTED]
DEBU[0003] Registry endpoint URL modified: https://[REDACTED]/v2/rancher/rke2-runtime/blobs/sha256:[REDACTED] => https://[REDACTED]/testing/v2/rancher/rke2-runtime/blobs/sha256:[REDACTED]

NAME                    STATUS   ROLES                       AGE     VERSION          INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
node/ip-172-31-20-242   Ready    control-plane,etcd,master   4m10s   v1.29.2+rke2r1   172.31.20.242   <none>        Ubuntu 18.04.3 LTS   4.15.0-1051-aws   containerd://1.7.11-k3s2

NAMESPACE     NAME                                                       READY   STATUS      RESTARTS   AGE     IP              NODE               NOMINATED NODE   READINESS GATES
kube-system   pod/cloud-controller-manager-ip-172-31-20-242              1/1     Running     0          3m13s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/etcd-ip-172-31-20-242                                  1/1     Running     0          3m15s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/helm-install-rke2-canal-rshrk                          0/1     Completed   0          3m55s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/helm-install-rke2-coredns-6xsh9                        0/1     Completed   0          3m55s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/helm-install-rke2-ingress-nginx-6bftd                  0/1     Completed   0          3m55s   10.42.0.3       ip-172-31-20-242   <none>           <none>
kube-system   pod/helm-install-rke2-metrics-server-mnb9p                 0/1     Completed   0          3m55s   10.42.0.2       ip-172-31-20-242   <none>           <none>
kube-system   pod/helm-install-rke2-snapshot-controller-7sjzk            0/1     Completed   1          3m55s   10.42.0.5       ip-172-31-20-242   <none>           <none>
kube-system   pod/helm-install-rke2-snapshot-controller-crd-hhdhn        0/1     Completed   0          3m55s   10.42.0.4       ip-172-31-20-242   <none>           <none>
kube-system   pod/helm-install-rke2-snapshot-validation-webhook-gt9gm    0/1     Completed   0          3m55s   10.42.0.9       ip-172-31-20-242   <none>           <none>
kube-system   pod/kube-apiserver-ip-172-31-20-242                        1/1     Running     0          3m15s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/kube-controller-manager-ip-172-31-20-242               1/1     Running     0          3m16s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/kube-proxy-ip-172-31-20-242                            1/1     Running     0          3m43s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/kube-scheduler-ip-172-31-20-242                        1/1     Running     0          3m17s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/rke2-canal-g45dc                                       2/2     Running     0          3m35s   172.31.20.242   ip-172-31-20-242   <none>           <none>
kube-system   pod/rke2-coredns-rke2-coredns-559d9cd4-gqsdz               1/1     Running     0          3m36s   10.42.0.7       ip-172-31-20-242   <none>           <none>
kube-system   pod/rke2-coredns-rke2-coredns-autoscaler-6b4d47b94-b4rmm   1/1     Running     0          3m36s   10.42.0.6       ip-172-31-20-242   <none>           <none>
kube-system   pod/rke2-ingress-nginx-controller-ngqg7                    1/1     Running     0          2m43s   10.42.0.13      ip-172-31-20-242   <none>           <none>
kube-system   pod/rke2-metrics-server-6b48d4997b-rbndq                   1/1     Running     0          3m2s    10.42.0.8       ip-172-31-20-242   <none>           <none>
kube-system   pod/rke2-snapshot-controller-5769d9ff85-vqrjx              1/1     Running     0          2m49s   10.42.0.12      ip-172-31-20-242   <none>           <none>
kube-system   pod/rke2-snapshot-validation-webhook-7c7764cf48-zcw5r      1/1     Running     0          2m51s   10.42.0.11      ip-172-31-20-242   <none>           <none>


Airgap 2:
INFO[0002] Using private registry config file at /etc/rancher/rke2/registries.yaml
DEBU[0002] Kubelet image credential provider bin directory check failed: stat /var/lib/rancher/credentialprovider/bin: no such file or directory
INFO[0002] Pulling runtime image [REDACTED]/rancher/rke2-runtime:v1.29.2-rc3-rke2r1
INFO[0003] Creating directory /var/lib/rancher/rke2/data/v1.29.2-rc3-rke2r1-58a7ba207f8b/bin
INFO[0003] Extracting file bin/containerd to /var/lib/rancher/rke2/data/v1.29.2-rc3-rke2r1-58a7ba207f8b/bin/containerd
INFO[0004] Extracting file bin/containerd-shim to /var/lib/rancher/rke2/data/v1.29.2-rc3-rke2r1-58a7ba207f8b/bin/containerd-shim
INFO[0004] Extracting file bin/containerd-shim-runc-v1 to /var/lib/rancher/rke2/data/v1.29.2-rc3-rke2r1-58a7ba207f8b/bin/containerd-shim-runc-v1
INFO[0004] Extracting file bin/containerd-shim-runc-v2 to /var/lib/rancher/rke2/data/v1.29.2-rc3-rke2r1-58a7ba207f8b/bin/containerd-shim-runc-v2
INFO[0005] Extracting file bin/crictl to /var/lib/rancher/rke2/data/v1.29.2-rc3-rke2r1-58a7ba207f8b/bin/crictl
W0228 03:00:59.057045    3149 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"
W0228 03:00:59.162924    3149 logging.go:59] [core] [Channel #3 SubChannel #4] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"


NAME                    STATUS   ROLES                       AGE     VERSION          INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
node/ip-172-31-19-225   Ready    control-plane,etcd,master   3m28s   v1.29.2+rke2r1   172.31.19.225   <none>        Ubuntu 18.04.3 LTS   4.15.0-1051-aws   containerd://1.7.11-k3s2

NAMESPACE     NAME                                                       READY   STATUS      RESTARTS   AGE     IP              NODE               NOMINATED NODE   READINESS GATES
kube-system   pod/cloud-controller-manager-ip-172-31-19-225              1/1     Running     0          3m28s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/etcd-ip-172-31-19-225                                  1/1     Running     0          3m12s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/helm-install-rke2-canal-s2p66                          0/1     Completed   0          3m14s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/helm-install-rke2-coredns-t5xg9                        0/1     Completed   0          3m14s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/helm-install-rke2-ingress-nginx-c5v6p                  0/1     Completed   0          3m14s   10.42.0.3       ip-172-31-19-225   <none>           <none>
kube-system   pod/helm-install-rke2-metrics-server-hpcl2                 0/1     Completed   0          3m14s   10.42.0.7       ip-172-31-19-225   <none>           <none>
kube-system   pod/helm-install-rke2-snapshot-controller-crd-pwxn9        0/1     Completed   0          3m14s   10.42.0.5       ip-172-31-19-225   <none>           <none>
kube-system   pod/helm-install-rke2-snapshot-controller-tgb5c            0/1     Completed   0          3m14s   10.42.0.8       ip-172-31-19-225   <none>           <none>
kube-system   pod/helm-install-rke2-snapshot-validation-webhook-gdhn8    0/1     Completed   0          3m14s   10.42.0.6       ip-172-31-19-225   <none>           <none>
kube-system   pod/kube-apiserver-ip-172-31-19-225                        1/1     Running     0          3m21s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/kube-controller-manager-ip-172-31-19-225               1/1     Running     0          3m28s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/kube-proxy-ip-172-31-19-225                            1/1     Running     0          3m23s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/kube-scheduler-ip-172-31-19-225                        1/1     Running     0          3m28s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/rke2-canal-6b2px                                       2/2     Running     0          2m52s   172.31.19.225   ip-172-31-19-225   <none>           <none>
kube-system   pod/rke2-coredns-rke2-coredns-6f5b5d6dfc-2hndx             1/1     Running     0          2m53s   10.42.0.2       ip-172-31-19-225   <none>           <none>
kube-system   pod/rke2-coredns-rke2-coredns-autoscaler-9d6556995-k8nx5   1/1     Running     0          2m53s   10.42.0.4       ip-172-31-19-225   <none>           <none>
kube-system   pod/rke2-ingress-nginx-controller-vbbtb                    1/1     Running     0          117s    10.42.0.13      ip-172-31-19-225   <none>           <none>
kube-system   pod/rke2-metrics-server-78845947d9-jh92x                   1/1     Running     0          2m9s    10.42.0.11      ip-172-31-19-225   <none>           <none>
kube-system   pod/rke2-snapshot-controller-b7cb6fd4b-7l7vv               1/1     Running     0          2m9s    10.42.0.10      ip-172-31-19-225   <none>           <none>
kube-system   pod/rke2-snapshot-validation-webhook-776f84575c-fpvgf      1/1     Running     0          2m8s    10.42.0.12      ip-172-31-19-225   <none>           <none>


Additional context / logs:

N/A
