This repository has been archived by the owner on Jul 3, 2024. It is now read-only.
CraightonH edited this page Dec 16, 2021 · 6 revisions

Clarification

Secrets

  1. ./cluster/base/cluster-secrets.sops.yaml is used to replace any ${SECRET} variable referenced anywhere in the ./cluster/ folder before the manifests are applied to the cluster. This makes it easy to inject secrets in places where referencing a Kubernetes Secret would otherwise be difficult.
  2. To add secrets to this file, run sops ./cluster/base/cluster-secrets.sops.yaml from the ./cluster/ directory. sops will decrypt the file into your editor, and re-encrypt it when the editor is closed.
  3. Where a secret is easy to reference in a k8s manifest, a dedicated Secret can be created, encrypted, and committed to git; it will be decrypted before being applied to the cluster. Create the secret, then from the ./cluster/ directory run sops --encrypt --in-place path/to/secret.sops.yaml to encrypt the file.
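As a sketch of the pattern above (the key name, value, and namespace here are hypothetical, not taken from this repo), the cluster-secrets file is a plain Kubernetes Secret before encryption:

```yaml
# Illustrative shape of ./cluster/base/cluster-secrets.sops.yaml before encryption.
apiVersion: v1
kind: Secret
metadata:
  name: cluster-secrets
  namespace: flux-system
stringData:
  SECRET_DOMAIN: example.com   # hypothetical key/value
```

A manifest elsewhere under ./cluster/ could then reference the value as, for example, `host: ${SECRET_DOMAIN}`, and it would be substituted before the manifest is applied.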

Cluster Settings

  1. Similar to the cluster-secrets.sops.yaml file, the cluster-settings.yaml file can be used to replace variable references anywhere in the ./cluster/ folder. This is useful for common, non-secret values shared across multiple manifests, such as IP addresses.
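For illustration (the variable name and value are hypothetical), the settings file follows the same pattern as the secrets file, but as an unencrypted ConfigMap:

```yaml
# Illustrative shape of ./cluster/base/cluster-settings.yaml.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-settings
  namespace: flux-system
data:
  LB_GATEWAY_IP: 192.168.1.100   # hypothetical shared value
```

A Service manifest could then use `loadBalancerIP: ${LB_GATEWAY_IP}`, keeping the address defined in one place.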

External-DNS

  1. External-DNS appears to skip records that it did not create. If external-dns is added after records already exist in DNS, delete those records and external-dns will recreate and manage them (after the configured sync interval).
  2. External-DNS determines whether it manages a DNS record by creating a companion TXT record containing breadcrumbs about the ingress, so do not delete those TXT records.
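For illustration, the ownership TXT record typically sits alongside the managed record and looks roughly like the following (the hostname, IP, owner id, and resource path are examples, not values from this cluster):

```
www.example.com  A    203.0.113.10
www.example.com  TXT  "heritage=external-dns,external-dns/owner=default,external-dns/resource=ingress/networking/www"
```

If the TXT record is deleted, external-dns no longer considers the A record its own and will stop updating or cleaning it up.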

Troubleshooting Guide

Ansible

  1. Task times out while verifying that the k8s API is available on a multi-master cluster

    Re-run the playbook; it has consistently succeeded on the second run.

  2. As of this writing, the Rancher interface requires a k3s version no greater than v1.21.5+k3s1, so add the following block to ./provision/ansible/playbooks/k3s-install.yml:

  vars:
    k3s_release_version: v1.21.5+k3s1

Additionally, update the system-upgrade-controller server and agent plans so they do not automatically upgrade the cluster past the pinned version:

spec:
  ...
  version: v1.21.5+k3s1
  ...
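For context, a system-upgrade-controller server plan pinned this way might look roughly like the following; everything other than the `version` field (names, namespace, selector, image) is illustrative and should be checked against the actual plans in this repo:

```yaml
# Illustrative server Plan pinned to a fixed k3s version.
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan        # hypothetical name
  namespace: system-upgrade
spec:
  concurrency: 1
  nodeSelector:
    matchExpressions:
      - { key: node-role.kubernetes.io/master, operator: Exists }
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade
  version: v1.21.5+k3s1    # pinning a version (instead of a channel) prevents automatic upgrades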

Flux

  1. Resources are not being provisioned into the cluster

    Double-check that every kustomization.yaml in that folder references the resources that need to be provisioned - it is easy to overlook a missing entry when doing a quick copy/paste.
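As a reminder of what to check (folder and file names here are hypothetical), every manifest in a folder must be listed under `resources`, or Flux will silently skip it:

```yaml
# ./cluster/apps/example/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - helm-release.yaml    # a newly copied manifest is easy to forget here
  - secret.sops.yaml
```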
