Add CAPI+CAPM3 wf to multi-conductor experiment
mquhuy committed Sep 27, 2023
1 parent b007217 commit 644cb67
Showing 23 changed files with 1,559 additions and 9 deletions.
20 changes: 20 additions & 0 deletions Support/Multitenancy/Multiple-Ironic-conductors/Init-environment-v3.sh
@@ -0,0 +1,20 @@
#!/bin/bash
set -e
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
__dir__=$(realpath "$(dirname "$0")")
# shellcheck disable=SC1091
. ./config.sh
# This is temporarily required since https://review.opendev.org/c/openstack/sushy-tools/+/875366 has not been merged.
./build-sushy-tools-image.sh
sudo ./vm-setup.sh
./configure-minikube.sh
sudo ./handle-images.sh
./generate_unique_nodes.sh
./start_containers.sh
./start-minikube.sh
./install-ironic.sh
./install-bmo.sh
python create_nodes_v3.py
./start_fake_etcd.sh
clusterctl init --infrastructure=metal3
sleep 60
13 changes: 13 additions & 0 deletions Support/Multitenancy/Multiple-Ironic-conductors/README.md
@@ -135,6 +135,19 @@ Now, if you open another terminal and run `kubectl -n metal3 get BMH --watch`, y

Just like before, all of the steps can be run at once by running the `./Init-environment-v2.sh` script. This script also respects the configuration in `config.sh`.

# Multiple ironics - full setup

With BMO already working, we can now make the multiple ironic conductors and fake IPA work with CAPI and CAPM3, i.e. we will aim to "create" clusters with these fake nodes. Since we do not have any real nodes to install the k8s apiserver onto, we will install the apiserver directly on top of the management cluster, building on the research and experiments done by our colleague Lennart Jern, which can be read in full [here](https://github.com/metal3-io/metal3-io.github.io/blob/0592e636bb10b1659437790b38f85cc49c552239/_posts/2023-05-17-Scaling_part_2.md).

In short, for this setup to work, you will need `kubeadm` and `clusterctl` installed on your system. To simulate the `etcd` server, we add the script `start_fake_etcd.sh` into the equation.
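
If `clusterctl` is not already available, it can be fetched from the upstream Cluster API release artifacts. A minimal sketch (the version here is an assumption, picked to match the `v1.3.3` controller images used elsewhere in this commit):

```bash
# Sketch: install clusterctl from the upstream CAPI release artifacts.
# The version is an assumption; adjust to your environment.
CLUSTERCTL_VERSION=v1.3.3
curl -Lo clusterctl "https://github.com/kubernetes-sigs/cluster-api/releases/download/${CLUSTERCTL_VERSION}/clusterctl-linux-amd64"
chmod +x clusterctl
sudo mv clusterctl /usr/local/bin/clusterctl
clusterctl version
```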

All the setup steps can be run at once with the script `Init-environment-v3.sh`. After that, each run of the script `create-cluster.sh` applies a new BMH manifest and creates a new 1-node cluster (the node comes, as usual, with one `KubeadmControlPlane` (kcp) object, one `Machine` object, and one `Metal3Machine` object), as shown in the session sketch below.
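
For reference, a typical session might look like the following sketch (script names as in this directory; the watch command assumes the default `metal3` namespace):

```bash
./Init-environment-v3.sh   # one-time setup: ironics, BMO, fake IPA, fake etcd, CAPI/CAPM3
./create-cluster.sh        # apply a BMH manifest and create a new 1-node cluster
# In another terminal, follow the Machine objects as they come up:
kubectl -n metal3 get machines --watch
```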

Compared to Lennart's setup, ours has a couple of differences worth noting:
- Our BMO doesn't run in test mode. Instead, we use `fake-ipa` to "trick" `ironic` into thinking that it is talking to real nodes.
- We don't reach the apiservers through the domain `test-kube-apiserver.NAMESPACE.svc.cluster.local` (the service still exists, but it doesn't seem to expose anything outside the cluster). Instead, we use the ClusterIP of the apiserver service directly, as shown in the sketch after this list.
- We also run into resource shortages, since the apiservers consume a lot of memory and CPU, so the number of nodes/clusters we can simulate is limited. (So far, we have not been able to try running these apiservers on external VMs.) Another way around this might be some sort of apiserver simulation, similar to what we already did with `fake-ipa`.
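
As an illustration of the ClusterIP workaround, the sketch below rewrites the kubeconfig produced by `clusterctl` to point at the service IP. It assumes the cluster is named `test` in the `metal3` namespace, so the apiserver service is `test-kube-apiserver` (as in `cluster.yaml` below):

```bash
# Sketch: talk to the workload cluster apiserver via its ClusterIP.
clusterctl get kubeconfig test -n metal3 > /tmp/kubeconfig-test.yaml
APISERVER_IP="$(kubectl -n metal3 get svc test-kube-apiserver -o jsonpath='{.spec.clusterIP}')"
# Replace the unreachable DNS endpoint with the ClusterIP.
# (Assumes the ClusterIP is covered by the apiserver cert SANs.)
sed -i "s|server: https://.*:6443|server: https://${APISERVER_IP}:6443|" /tmp/kubeconfig-test.yaml
kubectl --kubeconfig /tmp/kubeconfig-test.yaml get nodes
```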

# Requirements

This study was conducted on a VM with the following specs:
Support/Multitenancy/Multiple-Ironic-conductors/build-sushy-tools-image.sh
Expand Up @@ -4,7 +4,7 @@ SUSHYTOOLS_DIR="$HOME/sushy-tools"
rm -rf "$SUSHYTOOLS_DIR"
git clone https://opendev.org/openstack/sushy-tools.git "$SUSHYTOOLS_DIR"
cd "$SUSHYTOOLS_DIR" || exit
-git fetch https://review.opendev.org/openstack/sushy-tools refs/changes/66/875366/35 && git cherry-pick FETCH_HEAD
+git fetch https://review.opendev.org/openstack/sushy-tools refs/changes/66/875366/36 && git cherry-pick FETCH_HEAD

pip3 install build
python3 -m build
101 changes: 101 additions & 0 deletions Support/Multitenancy/Multiple-Ironic-conductors/capkcp-deploy.yaml
@@ -0,0 +1,101 @@
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
cluster.x-k8s.io/provider: control-plane-kubeadm-NAMESPACE
clusterctl.cluster.x-k8s.io: ""
control-plane: controller-manager
name: capi-kubeadm-control-plane-controller-manager-NAMESPACE
namespace: capi-kubeadm-control-plane-system
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
cluster.x-k8s.io/provider: control-plane-kubeadm-NAMESPACE
control-plane: controller-manager
strategy:
type: Recreate
template:
metadata:
creationTimestamp: null
labels:
cluster.x-k8s.io/provider: control-plane-kubeadm-NAMESPACE
control-plane: controller-manager
spec:
containers:
- args:
- --namespace=NAMESPACE
- --metrics-bind-addr=localhost:8080
- --feature-gates=ClusterTopology=false,KubeadmBootstrapFormatIgnition=false
command:
- /manager
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: POD_UID
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.uid
image: registry.k8s.io/cluster-api/kubeadm-control-plane-controller:v1.3.3
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: healthz
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: manager
ports:
- containerPort: 9443
name: webhook-server
protocol: TCP
- containerPort: 9440
name: healthz
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /readyz
port: healthz
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /tmp/k8s-webhook-server/serving-certs
name: cert
readOnly: true
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: capi-kubeadm-control-plane-manager
serviceAccountName: capi-kubeadm-control-plane-manager
terminationGracePeriodSeconds: 10
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
- effect: NoSchedule
key: node-role.kubernetes.io/control-plane
volumes:
- name: cert
secret:
defaultMode: 420
secretName: capi-kubeadm-control-plane-webhook-service-cert
180 changes: 180 additions & 0 deletions Support/Multitenancy/Multiple-Ironic-conductors/cluster.yaml
@@ -0,0 +1,180 @@
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
name: test
namespace: metal3
spec:
clusterNetwork:
pods:
cidrBlocks:
- 192.168.0.0/18
services:
cidrBlocks:
- 10.96.0.0/12
controlPlaneRef:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
name: test
namespace: metal3
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3Cluster
name: test
namespace: metal3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3Cluster
metadata:
name: test
namespace: metal3
spec:
controlPlaneEndpoint:
host: test-kube-apiserver.metal3.svc.cluster.local
port: 6443
noCloudProvider: true
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: test
namespace: metal3
spec:
kubeadmConfigSpec:
initConfiguration:
nodeRegistration:
kubeletExtraArgs:
node-labels: metal3.io/uuid={{ ds.meta_data.uuid }}
name: "{{ ds.meta_data.name }}"
joinConfiguration:
controlPlane: {}
nodeRegistration:
kubeletExtraArgs:
node-labels: metal3.io/uuid={{ ds.meta_data.uuid }}
name: "{{ ds.meta_data.name }}"
clusterConfiguration:
controlPlaneEndpoint: test-kube-apiserver.metal3.svc.cluster.local:6443
apiServer:
certSANs:
- localhost
- 127.0.0.1
- 0.0.0.0
- test-kube-apiserver.metal3.svc.cluster.local
etcd:
local:
serverCertSANs:
- etcd-server.metal3.cluster.svc.local
peerCertSANs:
- etcd-0.etcd.metal3.cluster.svc.local
machineTemplate:
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
name: test-controlplane
namespace: metal3
nodeDrainTimeout: 0s
replicas: 1
rolloutStrategy:
rollingUpdate:
maxSurge: 1
type: RollingUpdate
version: v1.26.0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
name: test-controlplane
namespace: metal3
spec:
nodeReuse: false
template:
spec:
automatedCleaningMode: metadata
dataTemplate:
name: test-controlplane-template
image:
checksum: 97830b21ed272a3d854615beb54cf004
checksumType: md5
format: raw
url: http://192.168.111.1:9999/images/rhcos-ootpa-latest.qcow2
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
labels:
cluster.x-k8s.io/cluster-name: test
nodepool: nodepool-0
name: test
namespace: metal3
spec:
clusterName: test
replicas: 0
selector:
matchLabels:
cluster.x-k8s.io/cluster-name: test
nodepool: nodepool-0
template:
metadata:
labels:
cluster.x-k8s.io/cluster-name: test
nodepool: nodepool-0
spec:
bootstrap:
configRef:
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
name: test-workers
clusterName: test
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
name: test-workers
nodeDrainTimeout: 0s
version: v1.26.0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
name: test-workers
namespace: metal3
spec:
nodeReuse: false
template:
spec:
automatedCleaningMode: metadata
dataTemplate:
name: test-workers-template
image:
checksum: 97830b21ed272a3d854615beb54cf004
checksumType: md5
format: raw
url: http://192.168.111.1:9999/images/rhcos-ootpa-latest.qcow2
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
name: test-workers
namespace: metal3
spec:
template:
spec:
joinConfiguration:
nodeRegistration:
kubeletExtraArgs:
node-labels: metal3.io/uuid={{ ds.meta_data.uuid }}
name: "{{ ds.meta_data.name }}"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3DataTemplate
metadata:
name: test-controlplane-template
namespace: metal3
spec:
clusterName: test
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3DataTemplate
metadata:
name: test-workers-template
namespace: metal3
spec:
clusterName: test
6 changes: 3 additions & 3 deletions Support/Multitenancy/Multiple-Ironic-conductors/config.sh
@@ -1,6 +1,6 @@
#!/bin/bash
#
-export N_NODES=1000
-export N_SUSHY=30
+export N_NODES=10
+export N_SUSHY=1
# Put the endpoints of different ironics, separated by spaces
-export IRONIC_ENDPOINTS="172.22.0.2 172.22.0.3 172.22.0.4 172.22.0.5"
+export IRONIC_ENDPOINTS="172.22.0.2 172.22.0.3"