undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2 #43

Open
kinredon opened this issue Feb 19, 2022 · 1 comment

@kinredon

When I run the following code to get GPU process information:

import psutil
import pynvml  # NVML bindings

UNIT = 1024 * 1024

pynvml.nvmlInit()
gpuDriverVersion = pynvml.nvmlSystemGetDriverVersion()

gpuDeviceCount = pynvml.nvmlDeviceGetCount()

for i in range(gpuDeviceCount):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)  # get the handle of GPU i; later calls use this handle
    pidAllInfo = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)  # info on all processes running on this GPU
    for pidInfo in pidAllInfo:
        pidUser = psutil.Process(pidInfo.pid).username()
        print("process pid:", pidInfo.pid, "user:", pidUser,
              "GPU memory used:", pidInfo.usedGpuMemory / UNIT, "MiB")  # memory used by this pid

pynvml.nvmlShutdown()  # shut down NVML when done

but I get the following error:

Traceback (most recent call last):
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/ctypes/__init__.py", line 361, in __getattr__
    func = self.__getitem__(name)
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/nvidia-430/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "gpu_info.py", line 21, in <module>
    pidAllInfo = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)  # info on all processes running on this GPU
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 2223, in nvmlDeviceGetComputeRunningProcesses
    return nvmlDeviceGetComputeRunningProcesses_v2(handle);
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.NVMLError_FunctionNotFound: Function Not Found

Here is the nvidia-smi information:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.64       Driver Version: 430.64       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
| 20%   26C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 20%   28C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:82:00.0 Off |                  N/A |
| 20%   24C    P8     9W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  Off  | 00000000:83:00.0 Off |                  N/A |
| 20%   27C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

The installed version of nvidia-ml-py is 11.495.46. Why does this happen?
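
A minimal caller-side sketch (assuming it is acceptable to simply skip the per-process stats on this driver) is to catch the NVMLError_FunctionNotFound shown in the traceback; all the calls below are the same ones used above:

import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        try:
            procs = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)
        except pynvml.NVMLError_FunctionNotFound:
            # Driver 430.64 does not export nvmlDeviceGetComputeRunningProcesses_v2,
            # which this nvidia-ml-py release requests, so fall back to an empty list.
            procs = []
        print("GPU", i, "compute processes:", len(procs))
finally:
    pynvml.nvmlShutdown()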

@fbcotter

This is also happening on the latest version, which for me (NVIDIA driver 470) now tries to call nvmlDeviceGetComputeRunningProcesses_v3. The older function nvmlDeviceGetComputeRunningProcesses is still available when I try it. Maybe we could add a try/except fallback here?
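
For reference, a small sketch that probes which of these symbols a given driver library exports (ctypes only; the library name/path is an assumption, adjust it for your system):

import ctypes

# Probe the NVML driver library directly: getattr on a missing symbol raises
# AttributeError, exactly as shown in the ctypes frames of the traceback above.
lib = ctypes.CDLL("libnvidia-ml.so.1")
for name in ("nvmlDeviceGetComputeRunningProcesses",
             "nvmlDeviceGetComputeRunningProcesses_v2",
             "nvmlDeviceGetComputeRunningProcesses_v3"):
    try:
        getattr(lib, name)
        print(name, "available")
    except AttributeError:
        print(name, "missing")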
