https://kubernetes.io/blog/2023/01/12/protect-mission-critical-pods-priorityclass/
check the problem
# k get po -n monitoring
NAME                      READY   STATUS    RESTARTS   AGE
monitoring-system-ggwrg   0/1     Pending   0          3m35s
monitoring-system-v7nb2   1/1     Running   0          3m36s
monitoring-system-xcpq9   1/1     Running   0          3m41s
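One pod is stuck in Pending. If the namespace held many pods, you could list only the unscheduled ones; a small optional sketch using a standard kubectl flag (not part of the original transcript):
# k get po -n monitoring --field-selector=status.phase=Pending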
# k get po -n monitoring -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP            NODE            NOMINATED NODE   READINESS GATES
monitoring-system-d74xv   1/1     Running   0          4m39s   10.0.74.1     ip-10-2-3-254   <none>           <none>
monitoring-system-sxkbj   0/1     Pending   0          4m40s   <none>        <none>          <none>           <none>
monitoring-system-td6wp   1/1     Running   0          4m49s   10.0.194.67   ip-10-2-0-200   <none>           <none>
# k describe po monitoring-system-sxkbj -n monitoring
Name:             monitoring-system-sxkbj
Namespace:        monitoring
Priority:         0
Service Account:  default
Node:             <none>
Labels:           app=monitoring-system
                  controller-revision-hash=7b7f5d5689
                  pod-template-generation=1
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Controlled By:    DaemonSet/monitoring-system
Containers:
  app:
    Image:      viktoruj/ping_pong
    Port:       <none>
    Host Port:  <none>
    Limits:
      memory:  2500Mi
    Requests:
      memory:  2500Mi
    Environment:
      ENABLE_LOAD_MEMORY:      true
      MEMORY_USAGE_PROFILE:    500=60 1024=30 2048=30
      ENABLE_LOG_LOAD_MEMORY:  true
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8g4r8 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  kube-api-access-8g4r8:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node-role.kubernetes.io/control-plane:NoSchedule
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  2m2s (x3 over 7m58s)  default-scheduler  0/3 nodes are available: 1 Insufficient memory, 2 node is filtered out by the prefilter result. preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling.
One of the pods can't be scheduled because no node has enough free memory for its 2500Mi request. Note also Priority: 0 in the describe output: without a PriorityClass, the scheduler finds "No preemption victims", because it will not evict pods of equal or higher priority to make room.
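To confirm it is a capacity problem, you can compare that 2500Mi request with a node's allocatable memory; a quick sketch, not part of the original solution (node name taken from the -o wide output above):
# k describe node ip-10-2-3-254 | grep -A 7 Allocatable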
check the existing PriorityClasses
# k get priorityclasses.scheduling.k8s.io
NAME                      VALUE        GLOBAL-DEFAULT   AGE
system-cluster-critical   2000000000   false            9m30s
system-node-critical      2000001000   false            9m30s
create a PriorityClass named monitoring with a high value (just below the built-in system-critical classes)
k create priorityclass monitoring --value 1000000000
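Equivalently, the same PriorityClass can be created declaratively; a minimal manifest sketch (the description field is an optional addition, not part of the original command):
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: monitoring
value: 1000000000
description: "high priority for monitoring-system pods"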
edit the monitoring-system DaemonSet and add priorityClassName to the pod spec
# k edit daemonset monitoring-system -n monitoring
apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: null
  labels:
    app: monitoring-system
  name: monitoring-system
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: monitoring-system
  template:
    metadata:
      labels:
        app: monitoring-system
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        effect: "NoSchedule"
      priorityClassName: monitoring # add this line
      containers:
      - image: viktoruj/ping_pong
        name: app
        resources:
          limits:
            memory: 2500Mi
          requests:
            memory: 2500Mi
        env:
        - name: ENABLE_LOAD_MEMORY
          value: "true"
        - name: ENABLE_LOG_LOAD_MEMORY
          value: "true"
        - name: MEMORY_USAGE_PROFILE
          value: "500=60 1024=60 2048=60"
check that the problem is fixed
# k get po -n monitoring
NAME                      READY   STATUS    RESTARTS   AGE
monitoring-system-2ss5z   1/1     Running   0          3s
monitoring-system-kdf26   1/1     Running   0          7s
monitoring-system-lc845   1/1     Running   0          6s
Now all monitoring-system pods are Running: with priority 1000000000 the scheduler can preempt lower-priority pods if it needs to free up memory on a node.
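To confirm the new priority actually landed on the pods, you can print it per pod; a sketch using standard custom-columns output:
# k get po -n monitoring -o custom-columns=NAME:.metadata.name,PRIORITY:.spec.priority,CLASS:.spec.priorityClassName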
check results
# check_result
✓ 0 Init
✓ 1.1 PriorityClass
✓ 1.2 DaemonSet PriorityClass
✓ 1.3 monitoring-system pods ready
4 tests, 0 failures
result = 100.00 % ok_points=3 all_points=3
time_left=306 minutes
you spent 53 minutes