Author: Paulo Pires [email protected]
This document describes, step-by-step, how to provision a Kubernetes cluster comprised of:
- One Linux machine acting as Kubernetes master node and OVN central database.
- One Linux machine acting as Kubernetes worker node.
- One Windows machine acting as Kubernetes worker node.
Many thanks to the great people that helped achieve this, namely:
- Alin Serdean (Cloudbase Solutions)
- Alin Balutoiu (Cloudbase Solutions)
- Feng Min (Google)
- Peter Hornyack (Google)
- The authors of https://github.com/openvswitch/ovn-kubernetes
The following items are to be implemented and are not sorted by importance!
- Copy CA files from master.
- Add other cloud providers documentation, e.g. AWS.
- Add gateway node instructions for enabling pod containers Internet access.
- Setup OVS TLS.
- Add Windows node TLS support.
At the time of this writing, the instructions are meant to be run on Google Compute Engine, but apart from gcloud
calls and a few networking details, everything detailed below should work regardless of the adopted cloud-provider.
Having that said, here are the requirements:
- Use Google Cloud Platform (GCP), namely Google Compute Engine (GCE) VMs.
gcloud
CLI tool
- Linux machines(s) run Ubuntu 16.04 with latest updates.
- Windows machine(s) run Windows Server 2016 with latest updates.
- Administrator access to all VMs, i.e.
root
in Linux machines.
Let's provision the master VM:
gcloud compute instances create "sig-windows-master" \
--zone "us-east1-d" \
--machine-type "custom-2-2048" \
--can-ip-forward \
--tags "https-server" \
--image "ubuntu-1604-xenial-v20170125" \
--image-project "ubuntu-os-cloud" \
--boot-disk-size "50" \
--boot-disk-type "pd-ssd" \
When it's ready, SSH into it:
gcloud compute ssh --zone "us-east1-d" "sig-windows-master"
ATTENTION: From now on, it's assumed you're logged-in as root
.
Let's install OVS/OVN:
curl -fsSL https://yum.dockerproject.org/gpg | apt-key add -
echo "deb https://apt.dockerproject.org/repo ubuntu-xenial main" > sudo tee /etc/apt/sources.list.d/docker.list
apt update
apt install -y docker.io dkms
cd ~
git clone https://github.com/apprenda/kubernetes-ovn-heterogeneous-cluster
cd kubernetes-ovn-heterogeneous-cluster/deb
dpkg -i openvswitch-common_2.6.2-1_amd64.deb \
openvswitch-datapath-dkms_2.6.2-1_all.deb \
openvswitch-switch_2.6.2-1_amd64.deb \
ovn-common_2.6.2-1_amd64.deb \
ovn-central_2.6.2-1_amd64.deb \
ovn-docker_2.6.2-1_amd64.deb \
ovn-host_2.6.2-1_amd64.deb \
python-openvswitch_2.6.2-1_all.deb
We'll need to make sure vport_geneve
kernel module is loaded at boot:
echo vport_geneve >> /etc/modules-load.d/modules.conf
Finally, reboot:
reboot
SSH again into the machine and let's proceed.
ATTENTION:
- From now on, it's assumed you're logged-in as
root
. - Pay attention to the environment variables below, particularly:
LOCAL_IP
must be the public IP of this node
Create the OVS bridge interface:
export TUNNEL_MODE=geneve
export LOCAL_IP=10.142.0.2
export MASTER_IP=$LOCAL_IP
ovs-vsctl set Open_vSwitch . external_ids:ovn-remote="tcp:$MASTER_IP:6642" \
external_ids:ovn-nb="tcp:$MASTER_IP:6641" \
external_ids:ovn-encap-ip="$LOCAL_IP" \
external_ids:ovn-encap-type="$TUNNEL_MODE"
ovs-vsctl get Open_vSwitch . external_ids
You should see something like:
{hostname=sig-windows-master.c.apprenda-project-one.internal, ovn-encap-ip="10.142.0.2", ovn-encap-type=geneve, ovn-nb="tcp:10.142.0.2:6641", ovn-remote="tcp:10.142.0.2:6642", system-id="e7af27f6-a218-40bb-8d4f-af67600abd17"}
We are now ready to set-up Kubernetes master node.
ATTENTION:
- From now on, it's assumed you're logged-in as
root
. - Pay attention to the environment variables below, particularly:
MASTER_INTERNAL_IP
must be an IP withinK8S_NODE_POD_SUBNET
subnetK8S_DNS_SERVICE_IP
must be an IP withinK8S_SERVICE_SUBNET
subnet
cd ~/kubernetes-ovn-heterogeneous-cluster/master
rm -rf tmp
mkdir tmp
cp -R make-certs openssl.cnf kubedns-* manifests systemd tmp/
export HOSTNAME=`hostname`
export K8S_VERSION=1.5.3
export K8S_POD_SUBNET=10.244.0.0/16
export K8S_NODE_POD_SUBNET=10.244.1.0/24
export K8S_SERVICE_SUBNET=10.100.0.0/16
export K8S_API_SERVICE_IP=10.100.0.1
export K8S_DNS_VERSION=1.13.0
export K8S_DNS_SERVICE_IP=10.100.0.10
export K8S_DNS_DOMAIN=cluster.local
export ETCD_VERSION=3.1.1
# The following is needed for now, because OVS can't route from pod network to node.
export MASTER_INTERNAL_IP=10.244.1.2
sed -i"*" "s|__K8S_VERSION__|$K8S_VERSION|g" tmp/manifests/*.yaml
sed -i"*" "s|__K8S_VERSION__|$K8S_VERSION|g" tmp/systemd/kubelet.service
sed -i"*" "s|__ETCD_VERSION__|$ETCD_VERSION|g" tmp/systemd/etcd3.service
sed -i"*" "s|__MASTER_IP__|$MASTER_IP|g" tmp/manifests/*.yaml
sed -i"*" "s|__MASTER_IP__|$MASTER_IP|g" tmp/systemd/kubelet.service
sed -i"*" "s|__MASTER_IP__|$MASTER_IP|g" tmp/openssl.cnf
sed -i"*" "s|__MASTER_INTERNAL_IP__|$MASTER_INTERNAL_IP|g" tmp/manifests/*.yaml
sed -i"*" "s|__HOSTNAME__|$HOSTNAME|g" tmp/manifests/proxy.yaml
sed -i"*" "s|__HOSTNAME__|$HOSTNAME|g" tmp/systemd/kubelet.service
sed -i"*" "s|__HOSTNAME__|$HOSTNAME|g" tmp/make-certs
sed -i"*" "s|__HOSTNAME__|$HOSTNAME|g" tmp/openssl.cnf
sed -i"*" "s|__K8S_API_SERVICE_IP__|$K8S_API_SERVICE_IP|g" tmp/openssl.cnf
sed -i"*" "s|__K8S_POD_SUBNET__|$K8S_POD_SUBNET|g" tmp/manifests/*.yaml
sed -i"*" "s|__K8S_SERVICE_SUBNET__|$K8S_SERVICE_SUBNET|g" tmp/manifests/*.yaml
sed -i"*" "s|__K8S_DNS_SERVICE_IP__|$K8S_DNS_SERVICE_IP|g" tmp/systemd/kubelet.service
sed -i"*" "s|__K8S_DNS_DOMAIN__|$K8S_DNS_DOMAIN|g" tmp/systemd/kubelet.service
sed -i"*" "s|__K8S_DNS_SERVICE_IP__|$K8S_DNS_SERVICE_IP|g" tmp/kubedns-service.yaml
sed -i"*" "s|__K8S_DNS_VERSION__|$K8S_DNS_VERSION|g" tmp/kubedns-deployment.yaml
sed -i"*" "s|__K8S_DNS_DOMAIN__|$K8S_DNS_DOMAIN|g" tmp/*.*
cp -R tmp/systemd/*.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable etcd3
systemctl start etcd3
cd tmp
chmod +x make-certs
./make-certs
cd ..
mkdir -p /etc/kubernetes
cp -R tmp/manifests /etc/kubernetes/
systemctl enable kubelet
systemctl start kubelet
curl -Lskj -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/v$K8S_VERSION/bin/linux/amd64/kubectl
chmod +x /usr/bin/kubectl
kubectl config set-cluster default-cluster --server=https://$MASTER_IP --certificate-authority=/etc/kubernetes/tls/ca.pem
kubectl config set-credentials default-admin --certificate-authority=/etc/kubernetes/tls/ca.pem --client-key=/etc/kubernetes/tls/admin-key.pem --client-certificate=/etc/kubernetes/tls/admin.pem
kubectl config set-context local --cluster=default-cluster --user=default-admin
kubectl config use-context local
Attention: the API server will take a minute or more to come up, depending on the CPUs available and networking speed. So, before proceeding make sure the API server came up. Run kubectl get nodes
until you see something like:
NAME STATUS AGE
pires-master Ready,SchedulingDisabled 7s
After making sure the API server is up & running, you need to configure pod networking for this node:
export TOKEN=$(kubectl describe secret $(kubectl get secrets | grep default | cut -f1 -d ' ') | grep -E '^token' | cut -f2 -d':' | tr -d '\t')
ovs-vsctl set Open_vSwitch . \
external_ids:k8s-api-server="https://$MASTER_IP" \
external_ids:k8s-api-token="$TOKEN"
ln -fs /etc/kubernetes/tls/ca.pem /etc/openvswitch/k8s-ca.crt
apt install -y python-pip
pip install --upgrade pip
cd ~
git clone https://github.com/openvswitch/ovn-kubernetes
cd ovn-kubernetes
pip install --upgrade --prefix=/usr/local --ignore-installed .
ovn-k8s-overlay master-init \
--cluster-ip-subnet="$K8S_POD_SUBNET" \
--master-switch-subnet="$K8S_NODE_POD_SUBNET" \
--node-name="$HOSTNAME"
systemctl enable ovn-k8s-watcher
systemctl start ovn-k8s-watcher
And deploy Kubernetes DNS:
cd ~/kubernetes-ovn-heterogeneous-cluster/master
kubectl create -f tmp/kubedns-deployment.yaml
kubectl create -f tmp/kubedns-service.yaml
Note though that Kubernetes DNS will only become available when a schedulable Kubernetes node joins the cluster.
By this time, the master node is ready:
kubectl get nodes
kubectl -n kube-system get pods
You should see something like:
NAME READY STATUS RESTARTS AGE
kube-apiserver-10.142.0.2 1/1 Running 0 9m
kube-controller-manager-10.142.0.2 1/1 Running 0 9m
kube-dns-555682531-5pp48 0/3 Pending 0 1m
kube-proxy-10.142.0.2 1/1 Running 0 9m
kube-scheduler-10.142.0.2 1/1 Running 0 9m
Let's proceed to set-up the worker nodes.
Let's provision the Linux worker VM:
gcloud compute instances create "sig-windows-worker-linux" \
--zone "us-east1-d" \
--machine-type "custom-2-2048" \
--can-ip-forward \
--image "ubuntu-1604-xenial-v20170125" \
--image-project "ubuntu-os-cloud" \
--boot-disk-size "50" \
--boot-disk-type "pd-ssd" \
When it's ready, SSH into it:
gcloud compute ssh --zone "us-east1-d" "sig-windows-worker-linux"
ATTENTION: From now on, it's assumed you're logged-in as root
.
Let's install OVS/OVN:
curl -fsSL https://yum.dockerproject.org/gpg | apt-key add -
echo "deb https://apt.dockerproject.org/repo ubuntu-xenial main" > sudo tee /etc/apt/sources.list.d/docker.list
apt update
apt install -y docker.io dkms
cd ~
git clone https://github.com/apprenda/kubernetes-ovn-heterogeneous-cluster
cd kubernetes-ovn-heterogeneous-cluster/deb
dpkg -i openvswitch-common_2.6.2-1_amd64.deb \
openvswitch-datapath-dkms_2.6.2-1_all.deb \
openvswitch-switch_2.6.2-1_amd64.deb \
ovn-common_2.6.2-1_amd64.deb \
ovn-central_2.6.2-1_amd64.deb \
ovn-docker_2.6.2-1_amd64.deb \
ovn-host_2.6.2-1_amd64.deb \
python-openvswitch_2.6.2-1_all.deb
We'll need to make sure vport_geneve
kernel module is loaded at boot:
echo vport_geneve >> /etc/modules-load.d/modules.conf
Finally, reboot:
reboot
SSH again into the machine, become root and let's proceed.
ATTENTION:
- From now on, it's assumed you're logged-in as
root
. - Pay attention to the environment variables below, particularly:
LOCAL_IP
must be the public IP of this nodeMASTER_IP
must be the remote public IP address of the master node
Create the OVS bridge interface:
export TUNNEL_MODE=geneve
export LOCAL_IP=10.142.0.3
export MASTER_IP=10.142.0.2
ovs-vsctl set Open_vSwitch . external_ids:ovn-remote="tcp:$MASTER_IP:6642" \
external_ids:ovn-nb="tcp:$MASTER_IP:6641" \
external_ids:ovn-encap-ip="$LOCAL_IP" \
external_ids:ovn-encap-type="$TUNNEL_MODE"
ovs-vsctl get Open_vSwitch . external_ids
You should see something like:
{hostname=sig-windows-worker-linux-1.c.apprenda-project-one.internal, ovn-encap-ip="10.142.0.3", ovn-encap-type=geneve, ovn-nb="tcp:10.142.0.2:6641", ovn-remote="tcp:10.142.0.2:6642", system-id="c6364c4c-8069-4bfd-ace8-ec701572feb7"}
We are now ready to set-up Kubernetes Linux worker node.
Attention:
- From now on, it's assumed you're logged-in as
root
. - You must copy the CA keypair that's available in the master node over the following paths:
- /etc/kubernetes/tls/ca.pem
- /etc/kubernetes/tls/ca-key.pem
- Pay attention to the environment variables below
cd ~/kubernetes-ovn-heterogeneous-cluster/worker/linux
rm -rf tmp
mkdir tmp
cp -R ../make-certs ../openssl.cnf ../kubeconfig.yaml manifests systemd tmp/
export HOSTNAME=`hostname`
export K8S_VERSION=1.5.3
export K8S_POD_SUBNET=10.244.0.0/16
export K8S_NODE_POD_SUBNET=10.244.2.0/24
export K8S_DNS_SERVICE_IP=10.100.0.10
export K8S_DNS_DOMAIN=cluster.local
sed -i"*" "s|__K8S_VERSION__|$K8S_VERSION|g" tmp/manifests/proxy.yaml
sed -i"*" "s|__K8S_VERSION__|$K8S_VERSION|g" tmp/systemd/kubelet.service
sed -i"*" "s|__MASTER_IP__|$MASTER_IP|g" tmp/manifests/proxy.yaml
sed -i"*" "s|__MASTER_IP__|$MASTER_IP|g" tmp/systemd/kubelet.service
sed -i"*" "s|__MASTER_IP__|$MASTER_IP|g" tmp/openssl.cnf
sed -i"*" "s|__MASTER_IP__|$MASTER_IP|g" tmp/kubeconfig.yaml
sed -i"*" "s|__LOCAL_IP__|$LOCAL_IP|g" tmp/manifests/proxy.yaml
sed -i"*" "s|__LOCAL_IP__|$LOCAL_IP|g" tmp/systemd/kubelet.service
sed -i"*" "s|__LOCAL_IP__|$LOCAL_IP|g" tmp/openssl.cnf
sed -i"*" "s|__HOSTNAME__|$HOSTNAME|g" tmp/manifests/proxy.yaml
sed -i"*" "s|__HOSTNAME__|$HOSTNAME|g" tmp/systemd/kubelet.service
sed -i"*" "s|__HOSTNAME__|$HOSTNAME|g" tmp/make-certs
sed -i"*" "s|__K8S_POD_SUBNET__|$K8S_POD_SUBNET|g" tmp/manifests/proxy.yaml
sed -i"*" "s|__K8S_DNS_SERVICE_IP__|$K8S_DNS_SERVICE_IP|g" tmp/systemd/kubelet.service
sed -i"*" "s|__K8S_DNS_DOMAIN__|$K8S_DNS_DOMAIN|g" tmp/systemd/kubelet.service
cd tmp
chmod +x make-certs
./make-certs
cd ..
cp -R tmp/manifests /etc/kubernetes/
cp tmp/kubeconfig.yaml /etc/kubernetes/
cp -R tmp/systemd/*.service /etc/systemd/system/
systemctl daemon-reload
curl -Lskj -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/v$K8S_VERSION/bin/linux/amd64/kubectl
chmod +x /usr/bin/kubectl
kubectl config set-cluster default-cluster --server=https://$MASTER_IP --certificate-authority=/etc/kubernetes/tls/ca.pem
kubectl config set-credentials default-admin --certificate-authority=/etc/kubernetes/tls/ca.pem --client-key=/etc/kubernetes/tls/node-key.pem --client-certificate=/etc/kubernetes/tls/node.pem
kubectl config set-context local --cluster=default-cluster --user=default-admin
kubectl config use-context local
Now, let's configure pod networking for this node:
export TOKEN=$(kubectl describe secret $(kubectl get secrets | grep default | cut -f1 -d ' ') | grep -E '^token' | cut -f2 -d':' | tr -d '\t')
ovs-vsctl set Open_vSwitch . \
external_ids:k8s-api-server="https://$MASTER_IP" \
external_ids:k8s-api-token="$TOKEN"
ln -fs /etc/kubernetes/tls/ca.pem /etc/openvswitch/k8s-ca.crt
mkdir -p /opt/cni/bin && cd /opt/cni/bin
curl -Lskj -o cni.tar.gz https://github.com/containernetworking/cni/releases/download/v0.4.0/cni-v0.4.0.tgz
tar zxf cni.tar.gz
rm -f cni.tar.gz
apt install -y python-pip
pip install --upgrade pip
cd ~
git clone https://github.com/openvswitch/ovn-kubernetes
cd ovn-kubernetes
pip install --upgrade --prefix=/usr/local --ignore-installed .
ovn-k8s-overlay minion-init \
--cluster-ip-subnet="$K8S_POD_SUBNET" \
--minion-switch-subnet="$K8S_NODE_POD_SUBNET" \
--node-name="$HOSTNAME"
By this time, your Linux worker node is ready to run Kubernetes workloads:
systemctl enable kubelet
systemctl start kubelet
kubectl get nodes
kubectl -n kube-system get pods
Let's proceed to setup the Windows worker node.
Attention:
- Connectivity may flicker and you may need to reconnect your RDP session. This is expected at least a couple times when setting up networking interfaces.
- Make sure Windows Server 2016 Routing feature is installed. You can find information on how to do it here and if you're following said instructions make sure you just install the feature as there's no need, nor have we tried, enabling LAN Routing, etc.
For the sake of simplicity when setting up Windows node, we will not be relying on TLS for now.
Let's provision the Windows worker VM:
gcloud compute instances create "sig-windows-worker-windows-1" \
--zone "us-east1-d" \
--machine-type "custom-4-4096" \
--can-ip-forward \
--image "windows-server-2016-dc-v20170117" \
--image-project "windows-cloud" \
--boot-disk-size "50" \
--boot-disk-type "pd-ssd" \
After VM is provisioned, establish a new connection to it. How one does this is out of the scope of this document.
Now, start a new Powershell session with administrator privileges and execute:
cd \
mkdir ovs
cd ovs
Start-BitsTransfer https://cloudbase.it/downloads/OpenvSwitch_prerelease.msi
Start-BitsTransfer https://cloudbase.it/downloads/k8s_ovn_service_prerelease.zip
cmd /c 'msiexec /i OpenvSwitch_prerelease.msi /qn'
netsh netkvm setparam 0 *RscIPv4 0
netsh netkvm restart 0
Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
Install-Package -Name docker -ProviderName DockerMsftProvider
A reboot is mandatory:
Restart-Computer -Force
Re-establish connection to the VM.
Now, one needs to set-up the overlay (OVN) network. On a per node basis, copy worker/windows/install_ovn.ps1
over to C:\ovs
on the Windows node and edit its contents accordingly before running the Powershell script.
Then, start a new Powershell session with administrator privileges and execute:
cd C:\ovs
.\install_ovn.ps1
We are now ready to set-up Kubernetes Windows worker node.
On a per node basis, copy worker/windows/install_k8s.ps1
over to the Windows node and edit its contents accordingly before running the Powershell script.
Now, let's install Kubernetes:
.\install_k8s.ps1
Attention: While we don't automate Kubernetes components registration as Windows services, one will need to edit the commands below according to the specifics of the cluster in the making.
Run kube-proxy
:
New-VMSwitch -Name KubeProxySwitch -SwitchType Internal
cd C:\kubernetes
$env:INTERFACE_TO_ADD_SERVICE_IP = "KubeProxySwitch"
.\kube-proxy.exe -v=3 --proxy-mode=userspace --hostname-override=sig-windows-worker-windows-1 --bind-address=10.142.0.5 --master=http://10.142.0.2:8080 --cluster-cidr=10.244.0.0/16
And the kubelet
:
cd C:\kubernetes
$env:CONTAINER_NETWORK = "external"
.\kubelet.exe -v=3 --address=10.142.0.9 --hostname-override=10.142.0.9 --cluster_dns=10.100.0.10 --cluster_domain=cluster.local --pod-infra-container-image="apprenda/pause" --resolv-conf="" --api_servers=http://10.142.0.2:8080
If everything is working, you should see all three nodes and several pods in the output of these kubectl commands:
cd C:\kubernetes
.\kubectl.exe -s 10.142.0.2:8080 get nodes
.\kubectl.exe -s 10.142.0.2:8080 -n kube-system get pods
Look in ovn-kubernetes issues first. Some problems we found are:
- ovn-kubernetes/ovn-kubernetes#79
- ovn-kubernetes/ovn-kubernetes#80
- ovn-kubernetes/ovn-kubernetes#82
- ovn-kubernetes/ovn-kubernetes#85
As root
, run:
apt update
apt install -y build-essential fakeroot dkms \
autoconf automake debhelper dh-autoreconf libssl-dev libtool \
python-all python-twisted-conch python-zopeinterface \
graphviz
cd ~
git clone https://github.com/openvswitch/ovs.git
cd ovs
git checkout branch-2.6
dpkg-checkbuilddeps
DEB_BUILD_OPTIONS='nocheck' fakeroot debian/rules binary