Install the required tools:
- Terraform
- kubectl
- AWS CLI
On macOS, using Homebrew:
brew install terraform kubectl awscli
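After installing, it can help to confirm that the binaries are actually on the PATH. The following is a minimal sketch for a POSIX shell; `check_tools` is a hypothetical helper name and is not part of this repository.

```shell
# Hypothetical helper: report any required CLI that is not on the PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (aws is the binary installed by the awscli formula):
# check_tools terraform kubectl aws
```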
Edit ~/.aws/credentials and add your AWS profile.
Save the file, reload your shell profile, and select the AWS profile:
source ~/.zshrc
export AWS_PROFILE="awsadrian"
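To check that the profile named in AWS_PROFILE really has a section in the credentials file, something like the following could be used. This is only a sketch: `profile_exists` is a hypothetical helper, and the default path assumes the standard credentials location.

```shell
# Hypothetical helper: succeed if a [profile] section exists in the file.
# $1 = profile name, $2 = credentials file (defaults to ~/.aws/credentials).
profile_exists() {
  # -F: fixed string, -x: match the whole line, -q: no output
  grep -Fxq "[$1]" "${2:-$HOME/.aws/credentials}"
}

# Usage: verify the section exists, then ask AWS which identity is in use.
# profile_exists "$AWS_PROFILE" && aws sts get-caller-identity
```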
If no tfstate has been initialized or is available yet, create a backend in AWS to store it.
This adds an S3 bucket for storing the tfstate:
cd eks-setup/backend/
tf init
tf plan
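The commands in this README use `tf`, which is assumed here to be a local shorthand for the `terraform` binary. Note that `tf plan` only previews the changes; the bucket is created by an apply. A sketch of the whole backend step, with `bootstrap_backend` as a hypothetical helper name:

```shell
# Assumption: `tf` is a local shorthand for the terraform binary.
tf() { terraform "$@"; }

# Hypothetical helper: initialise, preview, and create the state backend.
# $1 = backend directory (defaults to the path used in this README).
bootstrap_backend() {
  dir="${1:-eks-setup/backend}"
  (
    # Run in a subshell so the caller's working directory is unchanged.
    cd "$dir" || exit 1
    tf init && tf plan && tf apply
  )
}

# Usage, from the repository root:
# bootstrap_backend
```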
Then apply the main setup (paths are relative to the repository root):
cd eks-setup/
tf init
tf plan -var-file=dev.tfvars
tf apply -var-file=dev.tfvars
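Once the apply has finished, kubectl needs a kubeconfig entry for the new cluster before the commands later in this README will work. A minimal sketch using the AWS CLI; `connect_cluster` is a hypothetical helper, and the cluster name and region are placeholders that this README does not specify.

```shell
# Hypothetical helper: write kubeconfig for the new cluster, then list nodes.
# $1 = cluster name, $2 = AWS region (both are placeholders you must supply).
connect_cluster() {
  aws eks update-kubeconfig --name "$1" --region "$2" &&
    kubectl get nodes
}

# Usage (values are illustrative, not taken from this repository):
# connect_cluster my-cluster eu-west-1
```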
The Terraform configuration creates:
- VPC with its dependencies
- Security groups
- Public and private subnets with a NAT gateway
- EKS Cluster
- EKS CNI and auth
- EKS Node Group: Added only to one subnet for simplicity and cost-efficiency
- IAM Policies:
  - Roles
  - Groups
  - Policies
  - Service Accounts (SA)
  - Policies folder for better visibility
- Cluster Autoscaler
- Storage setup
- Minimum Applications:
  - Cluster Autoscaler
  - ArgoCD
- ArgoCD is applied and configured with:
  - Values for individual setups
  - Config folder containing:
    - Applications: YAML files for deployed apps
    - Projects: Definitions for app separation and deployment locations
All deployments are managed via ArgoCD. Applications include:
- AWS Application Load Balancer
- system_metric-server
To access ArgoCD, forward the server port to localhost:8080:
kubectl port-forward deployment/argo-cd-argocd-server 8080:8080 -n argo-system
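With the ArgoCD port-forward running, logging in as `admin` needs the initial password. The sketch below assumes the installation still has the `argocd-initial-admin-secret` that Argo CD creates by default; if that secret was deleted or the password was changed, this will not apply. `argocd_admin_password` is a hypothetical helper name.

```shell
# Hypothetical helper: print the initial ArgoCD admin password.
# Assumes the default argocd-initial-admin-secret still exists.
# $1 = namespace (defaults to the argo-system namespace used in this README).
argocd_admin_password() {
  kubectl -n "${1:-argo-system}" get secret argocd-initial-admin-secret \
    -o jsonpath='{.data.password}' | base64 --decode
}

# Usage:
# argocd_admin_password
```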
To access Grafana, forward the service port:
kubectl port-forward service/grafana 3000:80 -n monitoring-system
Grafana is then available at http://localhost:3000. The password is stored in the configuration and should be secured. The following dashboards are available:
- Nodes: Node Exporter Dashboard
- Cluster Metrics: Cluster View
- Pod Metrics: Pod View
This layout is not ideal. Normally I would separate the VPC, the EKS cluster, and the next layer (as an ArgoCD deployment), but to keep things simpler:
- We are using the default workspace. With the separation above, we would have multiple workspaces and multiple tfstates, which scales better and makes it easier and safer to change or remove things.
- There is no ideal way to apply everything in one pass, and sometimes we need to wait for resources or use other tricks to make this setup work.
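For the workspace separation mentioned above, each Terraform workspace keeps its own state. A minimal sketch of switching to a workspace and creating it on first use; `use_workspace` is a hypothetical helper and the workspace name is illustrative.

```shell
# Hypothetical helper: switch to a workspace, creating it on first use.
# $1 = workspace name.
use_workspace() {
  # `workspace select` fails if the workspace does not exist yet,
  # in which case `workspace new` creates it and switches to it.
  terraform workspace select "$1" 2>/dev/null ||
    terraform workspace new "$1"
}

# Usage (the name is illustrative, not taken from this repository):
# use_workspace dev
```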
When ArgoCD is applied, we encounter a chicken-and-egg situation where we cannot apply everything in one go. There are four options to make this work:
- Use a targeted apply the first time: Preferable, but this requires an extra step and a separate tfstate (decided against).
- Apply as a shell command instead of a Terraform resource: After the first apply, later changes need to be applied manually or targeted and forced (chosen option).
- Helm install manually outside of Terraform: Similar to option 2 but more manual.
- Use Terragrunt with hooks: Perhaps the ideal way but involves using Terragrunt, which introduces other challenges.
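To illustrate the first option (not the one chosen here), a targeted first pass followed by a full apply could look roughly like this. It uses Terraform's `-target` flag; `staged_apply` is a hypothetical helper, and the resource addresses in the usage line are assumptions, since the module names of this repository are not shown.

```shell
# Hypothetical helper: targeted first pass, then a full apply.
# $1 = var file, remaining arguments = resource addresses to apply first.
staged_apply() {
  var_file="$1"
  shift
  n=$#
  # Append a -target=<address> argument for each address,
  # then drop the original addresses from the argument list.
  for addr in "$@"; do
    set -- "$@" "-target=$addr"
  done
  shift "$n"
  terraform apply -var-file="$var_file" "$@" &&
    terraform apply -var-file="$var_file"
}

# Usage (module names are illustrative, not taken from this repository):
# staged_apply dev.tfvars module.vpc module.eks
```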
Access is open to the public, and permissions are assigned to only two users.
I’ve used Cluster Autoscaler instead of Karpenter. For non-critical environments like monitoring or non-production, Karpenter with spot instances would be cheaper.