
API server isn't connectable on 10.3.0.1 after manual setup #870

Open
aknuds1 opened this issue Apr 20, 2017 · 1 comment

@aknuds1
Contributor

aknuds1 commented Apr 20, 2017

I've set up Kubernetes on CoreOS beta (version 1353.4.0) according to the official guide, but the API server isn't working properly afterwards. At least, it isn't reachable on https://10.3.0.1, which causes, for example, the DNS addon to fail. I can't connect to https://10.3.0.1 from within pods either, for example via wget (a sketch of that check follows the log excerpt below).

As a result, I'm seeing errors like this in the kube-dns pod:

E0419 22:06:50.952662       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: Get https://10.3.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
I0419 22:07:20.444655       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
F0419 22:07:20.944622       1 dns.go:168] Timeout waiting for initialization
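
The wget check I'm referring to is roughly the following (a sketch only; the exact flags and behaviour depend on the image in the pod, but the point is that the request hangs and times out instead of getting any TLS or HTTP response back):

    # from a shell inside any running pod
    wget -T 5 -O- https://10.3.0.1:443/version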
@trinitronx

We had this issue using the single-node setup script.

We found a couple of problems:

  1. The kube-dns Deployment and its Pods were not running; only kube-dns-autoscaler was.
  2. kube-dns-autoscaler had errors in its log from trying to talk to kube-apiserver (a sketch of pulling that log follows this list).
  3. kube-apiserver was not reachable on the 10.3.0.1 Service IP.
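
The autoscaler log in point 2 can be pulled with something along these lines (a sketch; the autoscaler pod name on your cluster will differ):

    kubectl --namespace=kube-system get pods
    kubectl --namespace=kube-system logs <kube-dns-autoscaler-pod-name>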

To fix it on our cluster (YMMV), here is what we did:

  • Add file /etc/kubernetes/manifests/kube-dns.yaml

      apiVersion: extensions/v1beta1
      kind: Deployment
      metadata:
        generation: 1
        labels:
          k8s-app: kube-dns
          kubernetes.io/cluster-service: "true"
        name: kube-dns
        namespace: kube-system
      spec:
        replicas: 1
        strategy:
          rollingUpdate:
            maxSurge: 10%
            maxUnavailable: 0
          type: RollingUpdate
        template:
          metadata:
            annotations:
              scheduler.alpha.kubernetes.io/critical-pod: ""
            creationTimestamp: null
            labels:
              k8s-app: kube-dns
          spec:
            containers:
            - args:
              - --domain=cluster.local.
              - --dns-port=10053
              - --config-dir=/kube-dns-config
              - --v=2
              env:
              - name: PROMETHEUS_PORT
                value: "10055"
              image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.1
              imagePullPolicy: IfNotPresent
              livenessProbe:
                failureThreshold: 5
                httpGet:
                  path: /healthcheck/kubedns
                  port: 10054
                  scheme: HTTP
                initialDelaySeconds: 60
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 5
              name: kubedns
              ports:
              - containerPort: 10053
                name: dns-local
                protocol: UDP
              - containerPort: 10053
                name: dns-tcp-local
                protocol: TCP
              - containerPort: 10055
                name: metrics
                protocol: TCP
              readinessProbe:
                failureThreshold: 3
                httpGet:
                  path: /readiness
                  port: 8081
                  scheme: HTTP
                initialDelaySeconds: 3
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 5
              resources:
                limits:
                  memory: 170Mi
                requests:
                  cpu: 100m
                  memory: 70Mi
            - args:
              - -v=2
              - -logtostderr
              - -configDir=/etc/k8s/dns/dnsmasq-nanny
              - -restartDnsmasq=true
              - --
              - -k
              - --cache-size=1000
              - --log-facility=-
              - --server=/cluster.local/127.0.0.1#10053
              - --server=/in-addr.arpa/127.0.0.1#10053
              - --server=/ip6.arpa/127.0.0.1#10053
              image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.1
              imagePullPolicy: IfNotPresent
              livenessProbe:
                failureThreshold: 5
                httpGet:
                  path: /healthcheck/dnsmasq
                  port: 10054
                  scheme: HTTP
                initialDelaySeconds: 60
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 5
              name: dnsmasq
              ports:
              - containerPort: 53
                name: dns
                protocol: UDP
              - containerPort: 53
                name: dns-tcp
                protocol: TCP
              resources:
                requests:
                  cpu: 150m
                  memory: 20Mi
            - args:
              - --v=2
              - --logtostderr
              - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,A
              - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
              image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.1
              imagePullPolicy: IfNotPresent
              livenessProbe:
                failureThreshold: 5
                httpGet:
                  path: /metrics
                  port: 10054
                  scheme: HTTP
                initialDelaySeconds: 60
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 5
              name: sidecar
              ports:
              - containerPort: 10054
                name: metrics
                protocol: TCP
              resources:
                requests:
                  cpu: 10m
                  memory: 20Mi
            dnsPolicy: Default
            nodeSelector:
              node-role.kubernetes.io/master: ""
            restartPolicy: Always
            terminationGracePeriodSeconds: 30
    
  • kubectl apply -f /etc/kubernetes/manifests/kube-dns.yaml
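
    A quick sanity check after the apply (our own sketch, not from the guide; pod names will differ):

      kubectl --namespace=kube-system get deployment kube-dns
      kubectl --namespace=kube-system get pods -l k8s-app=kube-dns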

  • Edit file /etc/systemd/system/kubelet.service.
    Add kubelet options: --cloud-provider=aws --node-labels node-role.kubernetes.io/master=,

      [Service]
      ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
      ExecStartPre=/usr/bin/mkdir -p /opt/cni/bin
      Environment=KUBELET_IMAGE_TAG=v1.5.4_coreos.0
      Environment=KUBELET_IMAGE_URL=quay.io/coreos/hyperkube
      Environment="RKT_RUN_ARGS=--uuid-file-save=/var/run/kubelet-pod.uuid   --volume dns,kind=host,source=/etc/resolv.conf   --mount volume=dns,target=/etc/resolv.conf   --volume rkt,kind=host,source=/opt/bin/host-rkt   --mount volume=rkt,target=/usr/bin/rkt   --volume var-lib-rkt,kind=host,source=/var/lib/rkt   --mount volume=var-lib-rkt,target=/var/lib/rkt   --volume stage,kind=host,source=/tmp   --mount volume=stage,target=/tmp   --volume var-log,kind=host,source=/var/log   --mount volume=var-log,target=/var/log   "
      ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
      ExecStartPre=/usr/bin/mkdir -p /var/log/containers
      ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/run/kubelet-pod.uuid
      ExecStart=/usr/lib/coreos/kubelet-wrapper   --api-servers=http://127.0.0.1:8080   --cni-conf-dir=/etc/kubernetes/cni/net.d   --network-plugin=cni   --container-runtime=docker   --rkt-path=/usr/bin/rkt   --rkt-stage1-image=coreos.com/rkt/stage1-coreos   --register-node=true   --allow-privileged=true   --pod-manifest-path=/etc/kubernetes/manifests   --hostname-override=172.17.0.53   --cluster_dns=10.3.0.10   --cluster_domain=cluster.local --cloud-provider=aws --node-labels node-role.kubernetes.io/master=,
      ExecStop=-/usr/bin/rkt stop --uuid-file=/var/run/kubelet-pod.uuid
      Restart=always
      RestartSec=10
      KillMode=process
      [Install]
      WantedBy=multi-user.target
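
    A quick way to double-check what systemd will actually run after the edit (just a sanity check, not part of the fix itself):

      systemctl cat kubelet.service | grep -E 'cloud-provider|node-labels'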
    
  • Edit file /etc/kubernetes/manifests/kube-apiserver.yaml.
    Add kube-apiserver option: --advertise-address=0.0.0.0:

     apiVersion: v1
     kind: Pod
     metadata:
       name: kube-apiserver
       namespace: kube-system
     spec:
       hostNetwork: true
       containers:
       - name: kube-apiserver
         image: quay.io/coreos/hyperkube:v1.5.4_coreos.0
         command:
         - /hyperkube
         - apiserver
         - --bind-address=0.0.0.0
         - --etcd-servers=http://127.0.0.1:2379
         - --allow-privileged=true
         - --service-cluster-ip-range=10.3.0.0/24
         - --secure-port=443
         - --advertise-address=0.0.0.0
         - --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota
         - --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem
         - --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
         - --client-ca-file=/etc/kubernetes/ssl/ca.pem
         - --service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem
         - --runtime-config=extensions/v1beta1/networkpolicies=true
         - --anonymous-auth=false
         livenessProbe:
           httpGet:
             host: 127.0.0.1
             port: 8080
             path: /healthz
           initialDelaySeconds: 15
           timeoutSeconds: 15
         ports:
         - containerPort: 443
           hostPort: 443
           name: https
         - containerPort: 8080
           hostPort: 8080
           name: local
         volumeMounts:
         - mountPath: /etc/kubernetes/ssl
           name: ssl-certs-kubernetes
           readOnly: true
         - mountPath: /etc/ssl/certs
           name: ssl-certs-host
           readOnly: true
       volumes:
       - hostPath:
           path: /etc/kubernetes/ssl
         name: ssl-certs-kubernetes
       - hostPath:
           path: /usr/share/ca-certificates
         name: ssl-certs-host
    
  • systemctl daemon-reload to tell systemd to re-read the kubelet.service file.

  • systemctl restart kubelet to restart it. In some versions the kubelet may not apply the node label correctly, even though it is supposed to; if it does work for you, the last step below may not be necessary.
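
    Once the kubelet is back up, a rough way to confirm the kube-apiserver change took effect (our own check, not from the guide; kubectl get endpoints kubernetes shows where the kubernetes Service currently points, and with --anonymous-auth=false even a 401 Unauthorized reply from curl means the Service IP is reachable again):

      kubectl get endpoints kubernetes
      curl -k https://10.3.0.1:443/version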

  • Fix the node label on the master node.
    This is so the kube-dns pods, which use nodeSelector: node-role.kubernetes.io/master: "", will be scheduled on the master nodes.
    SSH to the node, or replace $(hostname) with your node's name:

     kubectl patch node $(hostname)  -p '{"metadata":{"labels":{"node-role.kubernetes.io/master":""}}}'
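
    To confirm the label is there afterwards (just a check, not part of the original steps):

      kubectl get nodes --show-labels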
    
