Karpenter cannot provision node for application when use Volcano #4030

Open
tieungao88 opened this issue Feb 21, 2025 · 9 comments

@tieungao88

Hi everyone,
I have a problem: Karpenter cannot provision a node for my application when I use Volcano.
Details:
Volcano version: volcano-1.11.0
EKS version: 1.30.0

I deployed a Deployment. At startup, Karpenter worked and provisioned one node for me. After 2 minutes, I scaled the Deployment from one replica to two while the node's CPU was at ~100%. The new Pod then stays Pending indefinitely.
Deployment:

apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: cpu-hog-group
  namespace: default
  annotations:
    scheduling.volcano.sh/pod-group-type: "deployment"
spec:
  minMember: 1
  queue: default
  # priorityClassName: high-priority  # Optional
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-hog-active-01
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cpu-hog-active-01
  template:
    metadata:
      labels:
        app: cpu-hog-active-01
      annotations:
        scheduling.k8s.io/group-name: cpu-hog-group
        scheduling.volcano.sh/pod-group-type: "deployment"
    spec:
      schedulerName: volcano    # Use the Volcano scheduler
      # priorityClassName: high-priority    # Optional
      nodeSelector:
        workload-type: app-schplugin
      tolerations:
        - key: workload-type
          operator: Equal
          value: app-schplugin
          effect: NoSchedule
      containers:
      - name: cpu-hog-active-01
        image: busybox
        resources:
          requests:
            cpu: "100m"
          # limits:
          #   cpu: "1500m"
        command: ["/bin/sh", "-c"]
        args:
        - |
          N=$(nproc)
          for i in $(seq 1 $N); do
            yes > /dev/null &
          done
          wait

volcano-scheduler-configmap:

actions: "enqueue, allocate, backfill"  
tiers:
  - plugins:
      - name: priority
      - name: gang
      - name: conformance
      - name: usage  # usage based scheduling plugin
        enablePredicate: true  # If the value is false, new pod scheduling is not disabled when the node load reaches the threshold. If the value is true or left blank, new pod scheduling is disabled.
        arguments:
          usage.weight: 5
          cpu.weight: 1
          memory.weight: 1
          thresholds:
            cpu: 80    # The actual CPU load of a node reaches 80%, and the node cannot schedule new pods.
            prometheusMetrics:
              - name: "cpu_usage"
                query: "100 - (avg by (instance) (irate(node_cpu_seconds_total{mode='idle'}[5m])) * 100)"
                step: 5
  - plugins:
      - name: overcommit
      - name: drf
      - name: predicates
      - name: proportion
      - name: nodeorder
      - name: binpack
metrics:                               # metrics server related configuration
  type: prometheus                     # Optional, The metrics source type, prometheus by default, support "prometheus", "prometheus_adaptor" and "elasticsearch"
  address: http://dev-ext-prometheus-kube-pr-prometheus.prometheus:9090    # Mandatory, The metrics source address
  interval: 30s                        # Optional, The scheduler pull metrics from Prometheus with this interval, 30s by default
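
For readers unfamiliar with the `cpu_usage` query in the config above, here is a rough Python sketch of the arithmetic it performs for one instance. It takes the per-second increase of the idle counter for each core from the two most recent samples (roughly what `irate` does), averages across cores, and subtracts the result from 100. The function name `cpu_usage_percent` and the sample values are made up for illustration; this is not Volcano or Prometheus code.

```python
def cpu_usage_percent(prev_idle, last_idle, seconds_between):
    """Approximate 100 - avg(irate(idle)) * 100 for one instance.

    prev_idle / last_idle: idle-seconds counter per core at the two most
    recent scrapes (hypothetical sample values, not real metrics).
    """
    if len(prev_idle) != len(last_idle) or not prev_idle:
        raise ValueError("need the same, non-zero number of cores in both samples")
    if seconds_between <= 0:
        raise ValueError("seconds_between must be positive")
    # Per-core idle rate: fraction of each second the core spent idle.
    idle_rates = [(b - a) / seconds_between for a, b in zip(prev_idle, last_idle)]
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


# Two cores scraped 30 s apart; each core was idle for 1.5 s of that window.
usage = cpu_usage_percent([1000.0, 2000.0], [1001.5, 2001.5], 30)
print(round(usage, 1))
```

With a busy loop on every core, as in the Deployment above, a value like this stays well above the configured `cpu: 80` threshold.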

Please help me!

@Monokaix
Member

Hi, have you reported this to the Karpenter community or AWS customer service?

@Monokaix
Member

It seems Karpenter has not been adapted to the Volcano scheduler yet.

@Vacant2333
Contributor

@tieungao88 Karpenter provisions nodes based on the resource requests of Pods. In other words, it only adds nodes to your cluster when there are Pods pending due to insufficient resources. Your issue is that you haven't set a limit, and the request values are too small. Karpenter does not scale nodes based on actual resource usage.

@tieungao88
Author

tieungao88 commented Feb 22, 2025

Hi @Vacant2333 ,

I scaled up to the second pod while the CPU of the current node (2 vCPU) was at 100%.

First time:
I tried to adjust the CPU request to 500m. At this level, the system did not trigger any event for Karpenter to scale the node, but only generated this event:

I0222 04:01:41.172368       1 predicate_helper.go:81] Predicates failed: task default/cpu-hog-active-01-585b44bcb7-h677f on node ip-10-26-5-108.ap-southeast-1.compute.internal fit failed: the CPU load of the node exceeds the upper limit.

Second time:
I adjusted the CPU request to 900m. At this level, the system triggered the event Insufficient cpu => Karpenter provisioned a new node:

I0222 04:05:41.591894       1 predicate_helper.go:81] Predicates failed: task default/cpu-hog-active-01-7bf5f5c4d7-dmmm5 on node ip-10-26-18-179.ap-southeast-1.compute.internal fit failed: Insufficient cpu

I do not understand why there is a difference, and why the Insufficient cpu event was not triggered in the first case.

Thanks.

@Vacant2333
Contributor

@tieungao88
Karpenter does not add capacity based on the usage rate of nodes, but based on the allocation rate. Set your requests and limits to reasonable values so that Karpenter can provision new nodes.

@Monokaix
Member

Monokaix commented Feb 24, 2025

You have enabled the usage plugin in Volcano, which schedules pods based on actual node load. Your Deployment consumes a lot of CPU because it runs a busy loop with no CPU limit, so the new pod fails to be scheduled. Karpenter, however, is not aware of Volcano's usage plugin: it only scales nodes when there is not enough CPU to satisfy requests, not when the load is high.
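
The two log lines reported earlier in this thread can be modeled as two independent checks: one on the sum of CPU requests, one on measured load. The Python sketch below is a simplified model and not Volcano or Karpenter code. The allocatable CPU (1930m), the 200m of requests from other pods, and the order of the checks are all assumptions chosen to reproduce the reported behavior on a 2 vCPU node.

```python
from dataclasses import dataclass

CPU_LOAD_THRESHOLD_PCT = 80  # thresholds.cpu from the scheduler config in this issue


@dataclass
class Node:
    allocatable_mcpu: int  # CPU available to pods, in millicores (assumed value)
    requested_mcpu: int    # sum of CPU requests already on the node (assumed value)
    cpu_load_pct: float    # measured CPU load in percent


def fits_by_requests(node: Node, pod_request_mcpu: int) -> bool:
    """Request-based check: do the declared requests still fit?"""
    return node.requested_mcpu + pod_request_mcpu <= node.allocatable_mcpu


def fits_by_load(node: Node) -> bool:
    """Usage-based check: is the measured load below the threshold?"""
    return node.cpu_load_pct < CPU_LOAD_THRESHOLD_PCT


def pending_reason(node: Node, pod_request_mcpu: int):
    """Return why the pod stays pending, or None if it can be placed.

    The order of the two checks is an assumption made to match the logs.
    """
    if not fits_by_requests(node, pod_request_mcpu):
        return "Insufficient cpu"
    if not fits_by_load(node):
        return "the CPU load of the node exceeds the upper limit"
    return None


def karpenter_adds_node(node: Node, pod_request_mcpu: int) -> bool:
    """Simplified: per this thread, Karpenter reacts to requests only."""
    return not fits_by_requests(node, pod_request_mcpu)


# First run: one replica requesting 500m plus 200m of other pods (assumed).
first = Node(allocatable_mcpu=1930, requested_mcpu=700, cpu_load_pct=100.0)
# Second run: one replica requesting 900m plus the same 200m.
second = Node(allocatable_mcpu=1930, requested_mcpu=1100, cpu_load_pct=100.0)

for node, request in ((first, 500), (second, 900)):
    print(request, pending_reason(node, request), karpenter_adds_node(node, request))
```

Under these assumptions, the 500m pod is rejected only by the load check, which Karpenter does not see, while the 900m pod is rejected by the request check, which is the case Karpenter reacts to.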

@tieungao88
Author

tieungao88 commented Feb 24, 2025

Hi,
I want to set it up so that if a Node has 500m CPU left, it won't allow scheduling.

Thanks.

@Monokaix
Member

Monokaix commented Mar 4, 2025

> Hi, I want to set it up so that if a Node has 500m CPU left, it won't allow scheduling.
>
> Thanks.

As I said before, you can construct a case where the new pod fails with insufficient CPU, so that Karpenter becomes aware of it and can scale nodes.
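
One way to read this suggestion is to pick a CPU request large enough that one more replica no longer fits on the node by requests alone. The Python sketch below works out the smallest such request. The helper name `min_request_for_max_pods` and the example numbers (1930m allocatable, 200m requested by other pods) are hypothetical; check your own node's allocatable CPU before relying on any value.

```python
def min_request_for_max_pods(allocatable_mcpu, other_requests_mcpu, max_pods):
    """Smallest per-pod CPU request (millicores) such that max_pods pods fit
    on one node by requests, but max_pods + 1 do not.

    All inputs are assumptions about the node, not values read from a cluster.
    """
    free = allocatable_mcpu - other_requests_mcpu
    if free <= 0 or max_pods < 1:
        raise ValueError("node has no free CPU or max_pods is below 1")
    # (max_pods + 1) * request > free  <=>  request > free / (max_pods + 1)
    request = free // (max_pods + 1) + 1
    if request * max_pods > free:
        raise ValueError("no integer request satisfies both conditions")
    return request


# Example: allow only one replica per node on the assumed 2 vCPU node.
print(min_request_for_max_pods(1930, 200, 1))
```

With these assumed numbers the result falls between the 500m and 900m requests tried earlier in the thread, which would be consistent with only the 900m run producing Insufficient cpu. This approximates the goal through requests; it does not make Karpenter aware of actual CPU load.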

@Monokaix
Member

Monokaix commented Mar 5, 2025

Hi, the Karpenter community is willing to solve gang-related issues and support custom schedulers. Feel free to give feedback to the Karpenter community to help make progress: kubernetes-sigs/karpenter#742 (comment)
