NVIDIA Device Plugin Only Exposes One GPU Out of Two GPUs Installed on Single Node #1079

amir-bialek · 2024-10-29T17:06:45Z

Hey all,

I have an on-premises Kubernetes cluster with multiple nodes. One of these nodes is equipped with two different GPU models:
NVIDIA GeForce RTX 3090 and NVIDIA GeForce RTX 4090

When I SSH into this node and run nvidia-smi, both GPUs are properly detected and displayed.
I have installed the NVIDIA Device Plugin using the gpu-operator Helm chart (https://github.com/NVIDIA/gpu-operator/tree/main/deployments/gpu-operator).
However, only the RTX 4090 is being exposed as a resource to Kubernetes.
Here is my current configuration:

devicePlugin:
  config:
    name: time-slicing-config-all
    create: true
    default: "any"
    data:
      any: |-
        version: v1
        flags:
          migStrategy: none
        sharing:
          timeSlicing:
            resources:
            - name: nvidia.com/gpu
              replicas: 5

I have tried several variations of this configuration, but only one GPU type is ever exposed.
Any help?

klueska · 2024-10-29T17:37:14Z

As mentioned here, the k8s-device-plugin doesn't support multiple GPU types per node:
NVIDIA/k8s-device-plugin#1021 (comment)
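
For anyone checking whether a node shows the same symptom, here is a minimal Python sketch that compares what the node advertises with what the hardware should yield. It assumes the node object comes from `kubectl get node <name> -o json` and that time-slicing advertises `replicas` copies of `nvidia.com/gpu` per physical GPU. The function names and the sample node data are hypothetical and not taken from the cluster in this issue.

```python
def advertised_gpus(node: dict, resource: str = "nvidia.com/gpu") -> int:
    """Read the allocatable count for a resource from a Node object (parsed JSON)."""
    allocatable = node.get("status", {}).get("allocatable", {})
    # Kubernetes reports extended resource quantities as strings, e.g. "5".
    return int(allocatable.get(resource, "0"))


def expected_gpus(physical_gpus: int, replicas: int) -> int:
    """Assumption: each physical GPU is advertised `replicas` times under time-slicing."""
    return physical_gpus * replicas


def missing_physical_gpus(node: dict, physical_gpus: int, replicas: int) -> int:
    """Estimate how many physical GPUs are not reflected in the advertised count."""
    return physical_gpus - advertised_gpus(node) // replicas


# Hypothetical node data: two GPUs installed, replicas: 5, but only 5 slots advertised.
sample_node = {"status": {"allocatable": {"nvidia.com/gpu": "5"}}}
gap = missing_physical_gpus(sample_node, physical_gpus=2, replicas=5)
```

With `replicas: 5` as in the configuration above, a node that advertises only 5 `nvidia.com/gpu` slots while `nvidia-smi` lists two GPUs would give a gap of one physical GPU.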
