Insufficient memory for daemonset - no new node #7406
Comments
Just a daemonset? Karpenter will not create a new node for a daemonset alone; that would be the same as an empty node. Also, I believe daemonsets are scheduled onto a node before any workload, so insufficient memory for one sounds very odd. It would be great if you could capture some logs and events the next time this happens.
I agree that a daemonset on its own is pointless. But the daemonset here collects instance network metrics (it's ethtool-exporter), so when it isn't running on a node we lose those metrics there. I believe Karpenter should calculate the requests required for all pods to be scheduled (daemonsets and others) when sizing the instance it selects. I'm not sure what kind of logs you would like to see. These are my failed pod's events:
Karpenter does not show anything. Those events might not be picked up by Karpenter, so unfortunately I don't have many logs from the Karpenter side.
You could try to add a priorityClassName to the daemonset so it gets scheduled ahead of the regular workloads.
Yes, this is indeed a good workaround, and we implemented it for other, more critical daemonsets. But shouldn't Karpenter be able to handle that? I'm surprised there are no logs in Karpenter; it looks like the event is not processed (or it is, but not verbosely).
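For anyone hitting the same thing, a minimal sketch of that workaround (the DaemonSet name, labels, image, and request values below are placeholders; system-node-critical is a built-in priority class, and if your cluster restricts the system classes, any custom high-value PriorityClass works the same way):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ethtool-exporter                     # placeholder name
spec:
  selector:
    matchLabels:
      app: ethtool-exporter
  template:
    metadata:
      labels:
        app: ethtool-exporter
    spec:
      # A high priority lets the scheduler preempt lower-priority pods instead of
      # leaving the daemonset pod Pending on a node that is already full.
      priorityClassName: system-node-critical
      containers:
        - name: exporter
          image: example/ethtool-exporter:latest   # placeholder image
          resources:
            requests:
              cpu: 50m            # illustrative request values
              memory: 64Mi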
@mariuskimmina Thanks for the fast feedback! I work with @JulesClaussen and we haven't found a good solution for this issue, which has been happening quite often recently. Regarding the solution you mentioned (setting a priorityClassName), I checked the issue you linked before the edit, and multiple people agree that it is not an accurate solution for this.
kubernetes-sigs/karpenter#731 (comment)

I think that comment also applies here: forcing the scheduler to reschedule our pods with a priorityClassName feels like a workaround rather than a fix.

Is it possible that Karpenter incorrectly computes the daemonset resources? Is there a way to check that?

We do have multiple tolerations on the daemonset that triggers the issue:

tolerations:
  - key: some-custom-key
    operator: Equal
    value: some-value
    effect: NoSchedule
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
  - key: node.kubernetes.io/disk-pressure
    operator: Exists
    effect: NoSchedule
  - key: node.kubernetes.io/memory-pressure
    operator: Exists
    effect: NoSchedule
  - key: node.kubernetes.io/pid-pressure
    operator: Exists
    effect: NoSchedule
  - key: node.kubernetes.io/unschedulable
    operator: Exists
    effect: NoSchedule
  - key: node.kubernetes.io/network-unavailable
    operator: Exists
    effect: NoSchedule

Is that something taken into account by Karpenter when it computes the resources required for an upcoming node?

Another thing that could trigger the issue: we are setting a startupTaint when the node is created. Here's a sample of our NodePool spec:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: custom-pool
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: tasks
      expireAfter: Never
      taints:
        - key: some-custom-key
          value: some-value
          effect: NoSchedule
      startupTaints:
        # this taint is removed by nodetaint when critical pods are ready (agent that report logs)
        - key: node-starting
          value: "true"
          effect: NoSchedule

WDYT?
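On "is there a way to check that?": one option might be to look at the NodeClaim Karpenter created for the affected node (kubectl get nodeclaim <name> -o yaml) and compare its requested resources with the sum of the requests of the daemonsets that match the node. If I read the v1 NodeClaim API correctly, spec.resources.requests reflects what Karpenter sized the instance for, daemonset overhead included. The excerpt below is only an illustrative sketch; the name and quantities are made up:

# Illustrative NodeClaim excerpt; name and quantities are placeholders.
apiVersion: karpenter.sh/v1
kind: NodeClaim
metadata:
  name: custom-pool-abc12   # hypothetical, generated from the NodePool name
spec:
  # What Karpenter believed the new node had to accommodate when it picked
  # the instance type; matching daemonset requests should be included here.
  resources:
    requests:
      cpu: 1550m
      memory: 3Gi
      pods: "12"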
I also encountered an issue where pods couldn't be scheduled on nodes after assigning resources to daemonsets. To avoid this problem, I didn't assign resources to the daemonsets and instead allocated slightly more generous resources to deployments and statefulsets. It would be great if Karpenter could handle this.
Description
Observed Behavior:
Sometimes (I can't tell when), some daemonset pods cannot be scheduled due to "Insufficient memory".
Karpenter doesn't start a new (bigger) node that would allow the daemonset to be scheduled.
This is not the same issue as kubernetes-sigs/karpenter#731: it occurs for older daemonsets as well, and for recently created nodes.
Expected Behavior:
Karpenter should start a new node that would allow all daemonsets and workload pods to be scheduled.
Reproduction Steps (Please include YAML):
I don't know how to reproduce it. It happens often, but I can't pinpoint why it happens in some cases and not in others.
An affected node contains some kube-system pods (aws-node, kube-proxy, secrets-store) but also some workload pods (which have PDBs).
Versions:
Kubernetes version (kubectl version): 1.31.0