k8s 1.17 kubelet doesn't start on coreos #90331

Closed
cann0nf0dder opened this issue Apr 21, 2020 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@cann0nf0dder

cann0nf0dder commented Apr 21, 2020

What happened:
After upgrading to Kubernetes 1.17.5, the kubelet does not start.
I couldn't find any documentation about a change to kubelet startup in the release notes.
A few people on Slack have run into the same issue, but no solution was found.

What you expected to happen:
The kubelet starts normally, as it did on the 1.16 branch.

How to reproduce it (as minimally and precisely as possible):
Upgrade to 1.17.5 on CoreOS

Anything else we need to know?:
A workaround is provided below to share my findings with the community.

Standard CoreOS kubelet.service

[Unit]
Description=kubelet
Wants=rpc-statd.service

[Service]
User=root
EnvironmentFile=/etc/kubernetes/kubelet.env
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/cache/kubelet-pod.uuid \
  --mount volume=resolv,target=/etc/resolv.conf \
  --mount volume=etc-cni-net,target=/etc/cni/net.d \
  --mount volume=var-lib-cni,target=/var/lib/cni \
  --mount volume=opt-cni-bin,target=/opt/cni/bin \
  --mount volume=var-log,target=/var/log \
  --mount volume=root-docker,target=/root/.docker \
  --mount volume=etc-k8s-cfg,target=/etc/kubernetes/config \
  --mount volume=var-lib-calico,target=/var/lib/calico \
  --volume var-lib-calico,kind=host,source=/var/lib/calico \
  --volume resolv,kind=host,source=/etc/resolv.conf \
  --volume etc-cni-net,kind=host,source=/etc/cni/net.d \
  --volume var-lib-cni,kind=host,source=/var/lib/cni \
  --volume opt-cni-bin,kind=host,source=/opt/cni/bin \
  --volume var-log,kind=host,source=/var/log \
  --volume root-docker,kind=host,source=/root/.docker \
  --volume etc-k8s-cfg,kind=host,source=/etc/kubernetes/config \
  --insecure-options=image"

ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /etc/kubernetes/pki
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/cni
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/pki/ca.crt"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
--config=/etc/kubernetes/config/kubelet.yaml \
--cni-conf-dir=/etc/cni/net.d \
--exit-on-lock-contention \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni

ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid

After upgrading to 1.17.5, the kubelet fails to start and logs the following:

Apr 21 08:21:38 nodename kubelet-wrapper[1841]: + exec /usr/bin/rkt run --uuid-file-save=/var/cache/kubelet-pod.uuid --mount volume=resolv,target=/etc/resolv.conf --mount volume=etc-cni-net,target=/etc/cni/net.d --mount>
Apr 21 08:21:40 nodename kubelet-wrapper[1841]: --config=/etc/kubernetes/config/kubelet.yaml: command not supported
Apr 21 09:03:13 nodename kubelet-wrapper[971]: Usage:
Apr 21 09:03:13 nodename kubelet-wrapper[971]:   kubelet [command]
Apr 21 09:03:13 nodename kubelet-wrapper[971]: Available Commands:
Apr 21 09:03:13 nodename kubelet-wrapper[971]:   help                     Help about any command
Apr 21 09:03:13 nodename kubelet-wrapper[971]:   kube-apiserver
Apr 21 09:03:13 nodename kubelet-wrapper[971]:   kube-controller-manager
Apr 21 09:03:13 nodename kubelet-wrapper[971]:   kube-proxy
Apr 21 09:03:13 nodename kubelet-wrapper[971]:   kube-scheduler
Apr 21 09:03:13 nodename kubelet-wrapper[971]:   kubectl                  kubectl controls the Kubernetes cluster manager
Apr 21 09:03:13 nodename kubelet-wrapper[971]:   kubelet

I noticed that the CoreOS-specific kubelet-wrapper expects one of the following commands before the parameters:
kube-apiserver, kube-controller-manager, kube-proxy, kube-scheduler, kubectl, kubelet
I added kubelet as the first argument after kubelet-wrapper, after which the kubelet started and the first API server upgrade succeeded.
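
The "command not supported" error in the logs is what one would expect from a multi-call binary that treats its first argument as a subcommand name. The following is a minimal shell sketch of that behaviour. `multicall` is a hypothetical stand-in, not the real binary; only the subcommand names and the error text are taken from the log output above.

```shell
#!/bin/sh
# Hypothetical stand-in for a multi-call binary that dispatches on its
# first argument. The subcommand names come from the "Available Commands"
# list in the log; the dispatch logic is an illustration only.
multicall() {
  case "${1:-}" in
    kube-apiserver|kube-controller-manager|kube-proxy|kube-scheduler|kubectl|kubelet)
      component=$1
      shift
      printf '%s\n' "would start ${component} with: $*"
      ;;
    *)
      # Anything else, including a leading flag, is rejected.
      printf '%s\n' "${1:-}: command not supported" >&2
      return 1
      ;;
  esac
}

# Flags first, as in the original unit file: rejected.
multicall --config=/etc/kubernetes/config/kubelet.yaml --network-plugin=cni || true

# Subcommand first, as in the workaround: accepted.
multicall kubelet --config=/etc/kubernetes/config/kubelet.yaml --network-plugin=cni
```

Under this model the flags themselves are unchanged; only the position of the component name matters.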

Workaround kubelet.service config:

ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/cache/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
kubelet --config=/etc/kubernetes/config/kubelet.yaml \
--cni-conf-dir=/etc/cni/net.d \
--exit-on-lock-contention \
--kubeconfig=/etc/kubernetes/kubeconfig \
--lock-file=/var/run/lock/kubelet.lock \
--network-plugin=cni
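
To check other nodes for the same problem before upgrading, a small script can report what follows kubelet-wrapper in a unit file. This is a rough sketch under two assumptions: the unit has a single `ExecStart=` entry that calls a path ending in `kubelet-wrapper`, and arguments are separated by whitespace. The function names `first_wrapper_arg` and `check_unit` are made up for this example.

```shell
#!/bin/sh
# Print the first argument after kubelet-wrapper in a unit file's ExecStart,
# joining backslash-continued lines before inspecting them.
first_wrapper_arg() {
  awk '
    { line = line $0 }
    # Continuation line: drop the trailing backslash and keep accumulating.
    /\\$/ { sub(/\\$/, " ", line); next }
    {
      if (line ~ /^ExecStart=.*kubelet-wrapper/) {
        n = split(line, f, /[ \t]+/)
        for (i = 1; i < n; i++)
          if (f[i] ~ /kubelet-wrapper$/) { print f[i + 1]; exit }
      }
      line = ""
    }
  ' "$1"
}

# Return non-zero when the first argument is missing or is a flag.
check_unit() {
  arg=$(first_wrapper_arg "$1")
  case "$arg" in
    ""|-*)
      printf '%s\n' "no subcommand before flags (found: '${arg}')"
      return 1
      ;;
    *)
      printf '%s\n' "subcommand: ${arg}"
      ;;
  esac
}

# Run the check only when a unit file path is passed on the command line.
if [ "$#" -gt 0 ]; then
  check_unit "$1"
fi
```

Saved as a script, it takes the unit file path as its only argument. The location of kubelet.service differs between installations, so no path is assumed here.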

I don't know whether this is missing documentation for a recent kubelet change in Kubernetes or an issue specific to the CoreOS kubelet-wrapper, so I'm sharing it here for comments.

Environment:

@cann0nf0dder cann0nf0dder added the kind/bug Categorizes issue or PR as related to a bug. label Apr 21, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Apr 21, 2020
@cann0nf0dder cann0nf0dder changed the title kubelet --config command not supported kubelet --config command not supported (kubelet-wrapper) Apr 21, 2020
@cann0nf0dder cann0nf0dder changed the title kubelet --config command not supported (kubelet-wrapper) k8s 1.17 kubelet doesn't start on coreos Apr 21, 2020
@athenabot

/sig node
/sig network

These SIGs are my best guesses for this issue. Please comment /remove-sig <name> if I am incorrect about one.

🤖 I am a bot run by vllry. 👩‍🔬

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 21, 2020
@cann0nf0dder
Author

cann0nf0dder commented Apr 21, 2020

/remove-sig network

@k8s-ci-robot
Contributor

@cann0nf0dder: Those labels are not set on the issue: sig/sig/network

In response to this:

/remove-sig sig/network

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the sig/network Categorizes an issue or PR as relevant to SIG Network. label Apr 21, 2020
@liggitt
Member

liggitt commented Apr 21, 2020

It appears the kubelet-wrapper script is using hyperkube, which was deprecated in 1.17 (xref #84662).

Additionally, it appears the kubelet-wrapper script is explicitly unsupported:

This repo is not in alignment with current versions of Kubernetes, and will not be active in the future.

@liggitt
Member

liggitt commented Apr 21, 2020

I think the coreos/coreos-kubernetes#930 issue is probably the right place to track this

@liggitt liggitt closed this as completed Apr 22, 2020