I brought up a new PyTorch Docker container and followed the steps in https://docs.mlcommons.org/inference/benchmarks/language/llama2-70b/#__tabbed_50_1, but the MLPerf script fails to run because libcuda.so.1 is missing. Surprisingly, that file is not present in "/usr/local/cuda/lib64". Please look into the issue and resolve it.
To confirm the CUDA installation, I ran torch.cuda.is_available() from a Python script, and it identifies the GPU card, so this looks like an issue with MLPerf.
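A possible way to narrow this down, offered as a sketch rather than a confirmed diagnosis: the log below reports a Radeon RX 7900 XT, and ROCm builds of PyTorch expose AMD GPUs through the same torch.cuda API, so torch.cuda.is_available() returning True does not by itself show that the NVIDIA driver library is present. libcuda.so.1 is installed by the NVIDIA driver, not by the CUDA toolkit, which would explain why it is absent from /usr/local/cuda/lib64. The helper names can_load and torch_backend are hypothetical and exist only for this illustration.

```python
import ctypes
import importlib


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False


def torch_backend() -> str:
    """Report which GPU backend the installed PyTorch build targets."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        return "torch not importable"
    # torch.version.hip is set on ROCm builds and is None on CUDA builds.
    if getattr(torch.version, "hip", None):
        return f"ROCm/HIP {torch.version.hip}"
    if getattr(torch.version, "cuda", None):
        return f"CUDA {torch.version.cuda}"
    return "CPU-only"


if __name__ == "__main__":
    print("PyTorch build:", torch_backend())
    # pycuda's import fails in the log because this library cannot be opened.
    print("libcuda.so.1 loadable:", can_load("libcuda.so.1"))
```

If the first line reports a ROCm/HIP build and the second reports False, the failure would come from the get-cuda-devices step importing pycuda on a machine without an NVIDIA driver, not from a broken toolkit install.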
Entire run log
root@509aa35d5a3a:/usr/local/cuda/lib64# cm run script --tags=run-mlperf,inference,_find-performance,_full,_r5.0-dev --model=llama2-70b-99 --implementation=amd --framework=pytorch --category=datacenter --scenario=Offline --execution_mode=test --device=cuda --quiet --test_query_count=50
INFO:root:* cm run script "run-mlperf inference _find-performance _full _r5.0-dev"
INFO:root: * cm run script "detect os"
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/detect-os/run.sh from tmp-run.sh
INFO:root: ! call "postprocess" from /root/CM/repos/mlcommons@mlperf-automations/script/detect-os/customize.py
INFO:root: * cm run script "detect cpu"
INFO:root: * cm run script "detect os"
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/detect-os/run.sh from tmp-run.sh
INFO:root: ! call "postprocess" from /root/CM/repos/mlcommons@mlperf-automations/script/detect-os/customize.py
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/detect-cpu/run.sh from tmp-run.sh
INFO:root: ! call "postprocess" from /root/CM/repos/mlcommons@mlperf-automations/script/detect-cpu/customize.py
INFO:root: * cm run script "get python3"
INFO:root: ! load /root/CM/repos/local/cache/13cba8bb674a4c34/cm-cached-state.json
INFO:root:Path to Python: /root/CM/repos/local/cache/eaf07973e00a4e24/mlperf/bin/python3
INFO:root:Python version: 3.10.15
INFO:root: * cm run script "get mlcommons inference src"
INFO:root: ! load /root/CM/repos/local/cache/f714e4d549a44d0c/cm-cached-state.json
INFO:root: * cm run script "get sut description"
INFO:root: * cm run script "detect os"
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/detect-os/run.sh from tmp-run.sh
INFO:root: ! call "postprocess" from /root/CM/repos/mlcommons@mlperf-automations/script/detect-os/customize.py
INFO:root: * cm run script "detect cpu"
INFO:root: * cm run script "detect os"
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/detect-os/run.sh from tmp-run.sh
INFO:root: ! call "postprocess" from /root/CM/repos/mlcommons@mlperf-automations/script/detect-os/customize.py
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/detect-cpu/run.sh from tmp-run.sh
INFO:root: ! call "postprocess" from /root/CM/repos/mlcommons@mlperf-automations/script/detect-cpu/customize.py
INFO:root: * cm run script "get python3"
INFO:root: ! load /root/CM/repos/local/cache/13cba8bb674a4c34/cm-cached-state.json
INFO:root:Path to Python: /root/CM/repos/local/cache/eaf07973e00a4e24/mlperf/bin/python3
INFO:root:Python version: 3.10.15
INFO:root: * cm run script "get compiler"
INFO:root: ! load /root/CM/repos/local/cache/de6c8fe7c8c54a65/cm-cached-state.json
INFO:root: * cm run script "get cuda-devices _with-pycuda"
INFO:root: * cm run script "get cuda _toolkit"
INFO:root: ! load /root/CM/repos/local/cache/623f91d372e247cd/cm-cached-state.json
INFO:root:ENV[CM_CUDA_PATH_LIB_CUDNN_EXISTS]: no
INFO:root:ENV[CM_CUDA_VERSION]: 11.8
INFO:root:ENV[CM_CUDA_VERSION_STRING]: cu118
INFO:root:ENV[CM_NVCC_BIN_WITH_PATH]: /root/CM/repos/local/cache/ba25462ff8ea4840/install/bin/nvcc
INFO:root:ENV[CUDA_HOME]: /root/CM/repos/local/cache/ba25462ff8ea4840/install
INFO:root: * cm run script "get python3"
INFO:root: ! load /root/CM/repos/local/cache/13cba8bb674a4c34/cm-cached-state.json
INFO:root:Path to Python: /root/CM/repos/local/cache/eaf07973e00a4e24/mlperf/bin/python3
INFO:root:Python version: 3.10.15
INFO:root: * cm run script "get generic-python-lib _package.pycuda"
INFO:root: * cm run script "get python3"
INFO:root: ! load /root/CM/repos/local/cache/13cba8bb674a4c34/cm-cached-state.json
INFO:root:Path to Python: /root/CM/repos/local/cache/eaf07973e00a4e24/mlperf/bin/python3
INFO:root:Python version: 3.10.15
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/get-generic-python-lib/validate_cache.sh from tmp-run.sh
INFO:root: ! call "detect_version" from /root/CM/repos/mlcommons@mlperf-automations/script/get-generic-python-lib/customize.py
Detected version: 2024.1.2
INFO:root: * cm run script "get python3"
INFO:root: ! load /root/CM/repos/local/cache/13cba8bb674a4c34/cm-cached-state.json
INFO:root:Path to Python: /root/CM/repos/local/cache/eaf07973e00a4e24/mlperf/bin/python3
INFO:root:Python version: 3.10.15
INFO:root: ! load /root/CM/repos/local/cache/e3ccae16f73c4c4e/cm-cached-state.json
INFO:root: * cm run script "get generic-python-lib _package.numpy"
INFO:root: * cm run script "get python3"
INFO:root: ! load /root/CM/repos/local/cache/13cba8bb674a4c34/cm-cached-state.json
INFO:root:Path to Python: /root/CM/repos/local/cache/eaf07973e00a4e24/mlperf/bin/python3
INFO:root:Python version: 3.10.15
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/get-generic-python-lib/validate_cache.sh from tmp-run.sh
INFO:root: ! call "detect_version" from /root/CM/repos/mlcommons@mlperf-automations/script/get-generic-python-lib/customize.py
Detected version: 2.2.2
INFO:root: * cm run script "get python3"
INFO:root: ! load /root/CM/repos/local/cache/13cba8bb674a4c34/cm-cached-state.json
INFO:root:Path to Python: /root/CM/repos/local/cache/eaf07973e00a4e24/mlperf/bin/python3
INFO:root:Python version: 3.10.15
INFO:root: ! load /root/CM/repos/local/cache/54187be427df4fad/cm-cached-state.json
INFO:root: ! cd /root/CM/repos/local/cache/ba25462ff8ea4840/install/targets/x86_64-linux/lib
INFO:root: ! call /root/CM/repos/mlcommons@mlperf-automations/script/get-cuda-devices/detect.sh from tmp-run.sh
Traceback (most recent call last):
File "/root/CM/repos/mlcommons@mlperf-automations/script/get-cuda-devices/detect.py", line 1, in <module>
import pycuda.driver as cuda
File "/root/CM/repos/local/cache/eaf07973e00a4e24/mlperf/lib/python3.10/site-packages/pycuda/driver.py", line 66, in <module>
from pycuda._driver import * # noqa
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
CM error: Portable CM script failed (name = get-cuda-devices, return code = 256)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Please file an issue at https://github.com/mlcommons/cm4mlops/issues along with the full CM command being run and the relevant
or full console log.
!
root@509aa35d5a3a:/usr/local/cuda/lib64# python /cuda-device-check.py
Radeon RX 7900 XT
CUDA is available. Using GPU.
root@509aa35d5a3a:/usr/local/cuda/lib64# cat /cuda-device-check.py
import torch
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
else:
    print("No CUDA device found.")
import time
# Check if CUDA is available
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("CUDA is available. Using GPU.")
else:
    device = torch.device("cpu")
    print("CUDA is not available. Using CPU.")
root@509aa35d5a3a:/usr/local/cuda/lib64# ls -lrt
total 4886268
drwxr-xr-x 5 root root 4096 Feb 6 15:37 cmake
-rw-r--r-- 1 root root 1021860 Feb 6 15:37 libcudadevrt.a
-rw-r--r-- 1 root root 1198880 Feb 6 15:37 libcudart_static.a
-rw-r--r-- 1 root root 30922 Feb 6 15:37 libculibos.a
lrwxrwxrwx 1 root root 17 Feb 6 15:37 libcudart.so -> libcudart.so.11.0
lrwxrwxrwx 1 root root 14 Feb 6 15:37 libOpenCL.so -> libOpenCL.so.1
-rw-r--r-- 1 root root 7743142 Feb 6 15:37 libnvrtc-builtins_static.a
-rw-r--r-- 1 root root 71382614 Feb 6 15:37 libnvrtc_static.a
lrwxrwxrwx 1 root root 25 Feb 6 15:37 libnvrtc-builtins.so -> libnvrtc-builtins.so.11.8
lrwxrwxrwx 1 root root 16 Feb 6 15:37 libnvrtc.so -> libnvrtc.so.11.2
lrwxrwxrwx 1 root root 17 Feb 6 15:37 libcublasLt.so -> libcublasLt.so.11
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libcublas.so -> libcublas.so.11
-rw-r--r-- 1 root root 961529764 Feb 6 15:37 libcublasLt_static.a
-rw-r--r-- 1 root root 125068502 Feb 6 15:37 libcublas_static.a
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libnvblas.so -> libnvblas.so.11
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libcufftw.so -> libcufftw.so.10
lrwxrwxrwx 1 root root 14 Feb 6 15:37 libcufft.so -> libcufft.so.10
-rw-r--r-- 1 root root 294190062 Feb 6 15:37 libcufft_static.a
-rw-r--r-- 1 root root 32066 Feb 6 15:37 libcufftw_static.a
-rw-r--r-- 1 root root 308031378 Feb 6 15:37 libcufft_static_nocallback.a
-rwxr-xr-x 1 root root 64350 Feb 6 15:37 libcufile_rdma_static.a
lrwxrwxrwx 1 root root 19 Feb 6 15:37 libcufile_rdma.so -> libcufile_rdma.so.1
lrwxrwxrwx 1 root root 14 Feb 6 15:37 libcufile.so -> libcufile.so.0
-rwxr-xr-x 1 root root 9533296 Feb 6 15:37 libcufile_static.a
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libcurand.so -> libcurand.so.10
-rw-r--r-- 1 root root 101422198 Feb 6 15:37 libcurand_static.a
-rw-r--r-- 1 root root 317318118 Feb 6 15:37 libcusolver_static.a
-rw-r--r-- 1 root root 17535094 Feb 6 15:37 libcusolver_lapack_static.a
lrwxrwxrwx 1 root root 19 Feb 6 15:37 libcusolverMg.so -> libcusolverMg.so.11
lrwxrwxrwx 1 root root 17 Feb 6 15:37 libcusolver.so -> libcusolver.so.11
-rw-r--r-- 1 root root 17535094 Feb 6 15:37 liblapack_static.a
-rw-r--r-- 1 root root 1005514 Feb 6 15:37 libmetis_static.a
-rwxr-xr-x 1 root root 279899536 Feb 6 15:37 libcusparse.so.11.7.5.86
lrwxrwxrwx 1 root root 17 Feb 6 15:37 libcusparse.so -> libcusparse.so.11
-rw-r--r-- 1 root root 322898242 Feb 6 15:37 libcusparse_static.a
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libnppisu.so -> libnppisu.so.11
lrwxrwxrwx 1 root root 14 Feb 6 15:37 libnppim.so -> libnppim.so.11
lrwxrwxrwx 1 root root 16 Feb 6 15:37 libnppidei.so -> libnppidei.so.11
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libnppicc.so -> libnppicc.so.11
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libnppial.so -> libnppial.so.11
lrwxrwxrwx 1 root root 13 Feb 6 15:37 libnppc.so -> libnppc.so.11
lrwxrwxrwx 1 root root 13 Feb 6 15:37 libnpps.so -> libnpps.so.11
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libnppitc.so -> libnppitc.so.11
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libnppist.so -> libnppist.so.11
lrwxrwxrwx 1 root root 14 Feb 6 15:37 libnppig.so -> libnppig.so.11
lrwxrwxrwx 1 root root 14 Feb 6 15:37 libnppif.so -> libnppif.so.11
-rw-r--r-- 1 root root 30678 Feb 6 15:37 libnppc_static.a
-rw-r--r-- 1 root root 17674370 Feb 6 15:37 libnppial_static.a
-rw-r--r-- 1 root root 6658636 Feb 6 15:37 libnppicc_static.a
-rw-r--r-- 1 root root 38671166 Feb 6 15:37 libnppig_static.a
-rw-r--r-- 1 root root 3888178 Feb 6 15:37 libnppitc_static.a
-rw-r--r-- 1 root root 8644640 Feb 6 15:37 libnppim_static.a
-rw-r--r-- 1 root root 19912102 Feb 6 15:37 libnpps_static.a
-rw-r--r-- 1 root root 11352054 Feb 6 15:37 libnppidei_static.a
-rw-r--r-- 1 root root 41275914 Feb 6 15:37 libnppist_static.a
-rw-r--r-- 1 root root 106248726 Feb 6 15:37 libnppif_static.a
-rw-r--r-- 1 root root 11266 Feb 6 15:37 libnppisu_static.a
lrwxrwxrwx 1 root root 15 Feb 6 15:37 libnvjpeg.so -> libnvjpeg.so.11
-rw-r--r-- 1 root root 5914110 Feb 6 15:37 libnvjpeg_static.a
drwxr-xr-x 2 root root 4096 Feb 6 15:37 stubs
lrwxrwxrwx 1 root root 20 Feb 6 15:37 libcudart.so.11.0 -> libcudart.so.11.8.89
lrwxrwxrwx 1 root root 16 Feb 6 15:37 libOpenCL.so.1 -> libOpenCL.so.1.0
-rwxr-xr-x 1 root root 30856 Feb 6 15:37 libOpenCL.so.1.0.0
-rwxr-xr-x 1 root root 679264 Feb 6 15:37 libcudart.so.11.8.89
lrwxrwxrwx 1 root root 18 Feb 6 15:37 libOpenCL.so.1.0 -> libOpenCL.so.1.0.0
lrwxrwxrwx 1 root root 19 Feb 6 15:37 libnvrtc.so.11.2 -> libnvrtc.so.11.8.89
lrwxrwxrwx 1 root root 28 Feb 6 15:37 libnvrtc-builtins.so.11.8 -> libnvrtc-builtins.so.11.8.89
-rwxr-xr-x 1 root root 54409816 Feb 6 15:37 libnvrtc.so.11.8.89
-rwxr-xr-x 1 root root 7718792 Feb 6 15:37 libnvrtc-builtins.so.11.8.89
lrwxrwxrwx 1 root root 26 Feb 6 15:37 libcublasLt.so.11 -> ./libcublasLt.so.11.11.3.6
-rwxr-xr-x 1 root root 574565016 Feb 6 15:37 libcublasLt.so.11.11.3.6
lrwxrwxrwx 1 root root 24 Feb 6 15:37 libcublas.so.11 -> ./libcublas.so.11.11.3.6
-rwxr-xr-x 1 root root 94729912 Feb 6 15:37 libcublas.so.11.11.3.6
-rwxr-xr-x 1 root root 745240 Feb 6 15:37 libnvblas.so.11.11.3.6
lrwxrwxrwx 1 root root 24 Feb 6 15:37 libnvblas.so.11 -> ./libnvblas.so.11.11.3.6
-rwxr-xr-x 1 root root 279161544 Feb 6 15:37 libcufft.so.10.9.0.58
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libcufftw.so.10 -> libcufftw.so.10.9.0.58
-rwxr-xr-x 1 root root 1618440 Feb 6 15:37 libcufftw.so.10.9.0.58
lrwxrwxrwx 1 root root 21 Feb 6 15:37 libcufft.so.10 -> libcufft.so.10.9.0.58
-rwxr-xr-x 1 root root 1480656 Feb 6 15:37 libcufile.so.1.4.0
-rwxr-xr-x 1 root root 39296 Feb 6 15:37 libcufile_rdma.so.1.4.0
lrwxrwxrwx 1 root root 23 Feb 6 15:37 libcufile_rdma.so.1 -> libcufile_rdma.so.1.4.0
lrwxrwxrwx 1 root root 18 Feb 6 15:37 libcufile.so.0 -> libcufile.so.1.4.0
-rwxr-xr-x 1 root root 101334448 Feb 6 15:37 libcurand.so.10.3.0.86
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libcurand.so.10 -> libcurand.so.10.3.0.86
lrwxrwxrwx 1 root root 26 Feb 6 15:37 libcusolverMg.so.11 -> libcusolverMg.so.11.4.1.48
-rwxr-xr-x 1 root root 184705792 Feb 6 15:37 libcusolverMg.so.11.4.1.48
-rwxr-xr-x 1 root root 302702224 Feb 6 15:37 libcusolver.so.11.4.1.48
lrwxrwxrwx 1 root root 24 Feb 6 15:37 libcusolver.so.11 -> libcusolver.so.11.4.1.48
lrwxrwxrwx 1 root root 24 Feb 6 15:37 libcusparse.so.11 -> libcusparse.so.11.7.5.86
lrwxrwxrwx 1 root root 20 Feb 6 15:37 libnppc.so.11 -> libnppc.so.11.8.0.86
-rwxr-xr-x 1 root root 4997744 Feb 6 15:37 libnppitc.so.11.8.0.86
-rwxr-xr-x 1 root root 10822808 Feb 6 15:37 libnppidei.so.11.8.0.86
lrwxrwxrwx 1 root root 20 Feb 6 15:37 libnpps.so.11 -> libnpps.so.11.8.0.86
lrwxrwxrwx 1 root root 21 Feb 6 15:37 libnppif.so.11 -> libnppif.so.11.8.0.86
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libnppicc.so.11 -> libnppicc.so.11.8.0.86
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libnppial.so.11 -> libnppial.so.11.8.0.86
-rwxr-xr-x 1 root root 20182264 Feb 6 15:37 libnpps.so.11.8.0.86
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libnppisu.so.11 -> libnppisu.so.11.8.0.86
-rwxr-xr-x 1 root root 1618416 Feb 6 15:37 libnppc.so.11.8.0.86
lrwxrwxrwx 1 root root 23 Feb 6 15:37 libnppidei.so.11 -> libnppidei.so.11.8.0.86
-rwxr-xr-x 1 root root 40272976 Feb 6 15:37 libnppist.so.11.8.0.86
lrwxrwxrwx 1 root root 21 Feb 6 15:37 libnppim.so.11 -> libnppim.so.11.8.0.86
-rwxr-xr-x 1 root root 103516208 Feb 6 15:37 libnppif.so.11.8.0.86
-rwxr-xr-x 1 root root 16425776 Feb 6 15:37 libnppial.so.11.8.0.86
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libnppist.so.11 -> libnppist.so.11.8.0.86
-rwxr-xr-x 1 root root 9810760 Feb 6 15:37 libnppim.so.11.8.0.86
-rwxr-xr-x 1 root root 695720 Feb 6 15:37 libnppisu.so.11.8.0.86
-rwxr-xr-x 1 root root 38097808 Feb 6 15:37 libnppig.so.11.8.0.86
-rwxr-xr-x 1 root root 7168816 Feb 6 15:37 libnppicc.so.11.8.0.86
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libnppitc.so.11 -> libnppitc.so.11.8.0.86
lrwxrwxrwx 1 root root 21 Feb 6 15:37 libnppig.so.11 -> libnppig.so.11.8.0.86
-rwxr-xr-x 1 root root 5690112 Feb 6 15:37 libnvjpeg.so.11.9.0.86
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libnvjpeg.so.11 -> libnvjpeg.so.11.9.0.86
-rwxr-xr-x 1 root root 2124152 Feb 6 15:37 libaccinj64.so.11.8.87
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libaccinj64.so.11.8 -> libaccinj64.so.11.8.87
lrwxrwxrwx 1 root root 19 Feb 6 15:37 libaccinj64.so -> libaccinj64.so.11.8
-rwxr-xr-x 1 root root 2544608 Feb 6 15:37 libcuinj64.so.11.8.87
lrwxrwxrwx 1 root root 21 Feb 6 15:37 libcuinj64.so.11.8 -> libcuinj64.so.11.8.87
lrwxrwxrwx 1 root root 18 Feb 6 15:37 libcuinj64.so -> libcuinj64.so.11.8
-rwxr-xr-x 1 root root 40136 Feb 6 15:37 libnvToolsExt.so.1.0.0
lrwxrwxrwx 1 root root 22 Feb 6 15:37 libnvToolsExt.so.1 -> libnvToolsExt.so.1.0.0
lrwxrwxrwx 1 root root 18 Feb 6 15:37 libnvToolsExt.so -> libnvToolsExt.so.1
-rw-r--r-- 1 root root 948930 Feb 6 15:38 libcufilt.a
-rw-r--r-- 1 root root 36743962 Feb 6 15:38 libnvptxcompiler_static.a
-rw-r--r-- 1 root root 509 Feb 6 18:11 tmp-state.json
-rwxr-xr-x 1 root root 8840 Feb 6 18:11 tmp-run.sh
root@509aa35d5a3a:/usr/local/cuda/lib64# pip list
Package Version
----------------------- ------------------
absl-py 2.1.0
aiohappyeyeballs 2.4.3
aiohttp 3.10.9
aiosignal 1.3.1
apex 1.3.0
asgiref 3.8.1
astunparse 1.6.3
async-timeout 4.0.3
attrs 24.2.0
audioread 3.0.1
autocommand 2.2.2
backports.tarfile 1.2.0
boto3 1.19.12
botocore 1.22.12
cachetools 5.5.0
certifi 2024.8.30
cffi 1.17.1
charset-normalizer 3.4.0
click 8.1.7
cm4mlops 0.6.25
cmind 4.0.2
colorama 0.4.6
coremltools 5.0b5
Cython 3.0.11
decorator 5.1.1
dill 0.3.7
Django 5.1.2
exceptiongroup 1.2.2
execnet 2.1.1
expecttest 0.1.6
filelock 3.16.1
flatbuffers 2.0
frozenlist 1.4.1
fsspec 2024.9.0
future 1.0.0
geojson 2.5.0
ghstack 0.8.0
giturlparse 0.12.0
google-auth 2.35.0
google-auth-oauthlib 1.0.0
grpcio 1.66.2
huggingface-hub 0.28.1
hypothesis 5.35.1
idna 3.10
image 1.5.33
imageio 2.35.1
importlib_metadata 8.0.0
importlib_resources 6.4.0
inflect 7.3.1
iniconfig 2.0.0
jaraco.collections 5.1.0
jaraco.context 5.3.0
jaraco.functools 4.0.1
jaraco.text 3.12.1
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.4.2
junitparser 2.1.1
lark 0.12.0
lazy_loader 0.4
librosa 0.10.2.post1
lintrunner 0.10.7
llvmlite 0.38.1
lxml 5.0.0
Markdown 3.7
MarkupSafe 3.0.1
more-itertools 10.3.0
mpmath 1.3.0
msgpack 1.1.0
multidict 6.1.0
mypy 1.8.0
mypy-extensions 1.0.0
networkx 2.8.8
numba 0.55.2
numpy 1.21.2
oauthlib 3.2.2
opt-einsum 3.3.0
optionloop 1.0.7
optree 0.9.1
packaging 24.1
pillow 10.2.0
pip 24.2
platformdirs 4.3.6
pluggy 1.5.0
pooch 1.8.2
propcache 0.2.0
protobuf 3.20.2
psutil 6.0.0
pyasn1 0.6.1
pyasn1_modules 0.4.1
pycparser 2.22
Pygments 2.15.0
pytest 7.3.2
pytest-cpp 2.3.0
pytest-flakefinder 1.1.0
pytest-rerunfailures 14.0
pytest-xdist 3.3.1
python-dateutil 2.9.0.post0
PyWavelets 1.4.1
PyYAML 6.0.1
regex 2024.11.6
requests 2.32.3
requests-oauthlib 2.0.0
rockset 1.0.3
rsa 4.9
s3transfer 0.5.2
safetensors 0.5.2
scikit-image 0.20.0
scikit-learn 1.5.2
scipy 1.8.1
setuptools 75.1.0
six 1.16.0
sortedcontainers 2.4.0
soundfile 0.12.1
soxr 0.5.0.post1
sqlparse 0.5.1
sympy 1.12.1
tabulate 0.9.0
tb-nightly 2.13.0a20230426
tensorboard 2.13.0
tensorboard-data-server 0.7.2
threadpoolctl 3.5.0
tifffile 2024.9.20
tlparse 0.3.5
tokenizers 0.21.0
tomli 2.0.2
torch 2.3.0a0+gitd2f9472
torchvision 0.18.0a0+68ba7ec
tqdm 4.66.5
transformers 4.48.2
triton 2.3.0
typeguard 4.3.0
typing_extensions 4.12.2
unittest-xml-reporting 3.2.0
urllib3 1.26.20
Werkzeug 3.0.4
wheel 0.44.0
xdoctest 1.1.0
yarl 1.14.0
z3-solver 4.12.2.0
zipp 3.19.2
root@509aa35d5a3a:/usr/local/cuda/lib64#