Describe the bug
GPUs are not attaching to VMs. The NVIDIA RTX A-Series GPUs in the Flemingsburg zone do not attach to the VMs at all. The other GPUs, namely the NVIDIA Quadro K420 and NVIDIA GRID K1, are recognised by the operating system as PCI devices, but `nvidia-smi` does not recognise them due to a signing-key mismatch.
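For the K420/K1 signing-key mismatch specifically, a quick way to confirm that Secure Boot is rejecting the unsigned module (a sketch, assuming `mokutil` is installed, as it usually is on Ubuntu):

```shell
# Report whether Secure Boot is enabled (it blocks unsigned kernel modules):
command -v mokutil >/dev/null && mokutil --sb-state || echo "mokutil not available"
# The kernel log records rejected module loads with the exact reason:
dmesg 2>/dev/null | grep -iE 'module verification|signature|nvidia' || true
```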
To Reproduce
Steps to reproduce the behaviour:
1. Create an Ubuntu VM in the Flemingsburg zone on the Gold tier.
2. Attach any of the available GPUs (except the NVIDIA Quadro RTX 5000; it was not available during my testing, so its behaviour is inconclusive).
3. After installing the latest production drivers, run `nvidia-smi` to see a failed output.
4. Run `lspci | grep -i vga` and `lspci | grep -i nvidia`: the K420 and K1 show up as connected GPUs, while the A-Series cases produce empty output.
5. Run `dmesg | grep -i nvidia` to see the exact details of the errors.
6. Run `dpkg -l | grep -i nvidia` to confirm the driver packages are installed, yet `lsmod | grep -i nvidia` gives blank output, signalling that the modules are not loaded. Manually loading the modules also fails.
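The checks in steps 4–6 can be bundled into one hypothetical diagnostic script (the commands are standard Ubuntu tooling; the fallback messages are mine):

```shell
#!/bin/sh
# Is the GPU visible on the PCI bus at all? (K420/K1: yes; A-Series: no)
lspci 2>/dev/null | grep -i nvidia || echo "no NVIDIA device on the PCI bus"
# Driver packages installed vs. kernel module actually loaded:
dpkg -l 2>/dev/null | grep -i nvidia || echo "no NVIDIA packages installed"
lsmod 2>/dev/null | grep -i nvidia || echo "nvidia kernel module not loaded"
# The kernel log usually names the exact failure (e.g. signature rejection):
dmesg 2>/dev/null | grep -i nvidia | tail -n 5
```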
Expected behavior
Leasing a GPU through the web interface, installing the drivers, and rebooting should attach the GPU without issue. But I understand each GPU may have a scope of access, and if that is the case, the web interface or the documentation should make clear which tier has access to which GPUs.
System Configurations:
OS: Ubuntu 24.04 LTS (noble)
Kernel version: 6.8.0-36-generic
I'm not sure the K420/K1 cards are supported by the latest drivers. They are rather old and might require an older driver branch.
You could start by trying out the 470 driver and see if that works for those cards.
All in all, this seems like a driver or compatibility issue, since when I try to replicate it I get the correct output from `lspci -nnv | grep -i vga` (go-deploy attaches the GPU successfully).
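One way to test that suggestion (a sketch; the package name is assumed, and the 470 legacy branch may or may not be packaged for 24.04 "noble"):

```shell
# Ask Ubuntu which driver branches it knows for the detected hardware:
ubuntu-drivers devices
# Swap the current driver for the legacy 470 branch and reboot:
sudo apt purge 'nvidia-*'
sudo apt install nvidia-driver-470
sudo reboot
```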