RuntimeError: cusolver error: CUSOLVER_STATUS_INTERNAL_ERROR, when calling cusolverDnCreate(handle) #103

Open
superstones opened this issue Dec 7, 2024 · 0 comments


2024-12-07 10:41:36,319 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
2024-12-07 10:41:36,319 - mmdet - INFO - Checkpoints will be saved to /mnt/g/V2X-Kitti-Spd/mmdetection3d/work_dirs/vic3d_latefusion_veh_imvoxelnet by HardDiskBackend.
Traceback (most recent call last):
  File "tools/train.py", line 224, in <module>
    main()
  File "tools/train.py", line 213, in main
    train_model(
  File "/mnt/g/V2X-Kitti-Spd/mmdetection3d/mmdet3d/apis/train.py", line 28, in train_model
    train_detector(
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/root/anaconda3/envs/openmmlablinux/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 171, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/mnt/g/V2X-Kitti-Spd/mmdetection3d/mmdet3d/models/detectors/imvoxelnet.py", line 91, in forward_train
    x = self.extract_feat(img, img_metas)
  File "/mnt/g/V2X-Kitti-Spd/mmdetection3d/mmdet3d/models/detectors/imvoxelnet.py", line 60, in extract_feat
    volume = point_sample(
  File "/mnt/g/V2X-Kitti-Spd/mmdetection3d/mmdet3d/models/fusion_layers/point_fusion.py", line 56, in point_sample
    points = apply_3d_transformation(
  File "/mnt/g/V2X-Kitti-Spd/mmdetection3d/mmdet3d/models/fusion_layers/coord_transform.py", line 69, in apply_3d_transformation
    rotate_func = partial(pcd.rotate, rotation=pcd_rotate_mat.inverse())
RuntimeError: cusolver error: CUSOLVER_STATUS_INTERNAL_ERROR, when calling cusolverDnCreate(handle)

Environment info
(openmmlablinux) root@DESKTOP-46K4CEN:/mnt/g/V2X-Kitti-Spd/mmdetection3d# python tools/train.py configs/imvoxelnet/trainval_config_v.py
2024-12-07 09:56:54,558 - mmdet - INFO - Environment info:

sys.platform: linux
Python: 3.8.0 (default, Nov 6 2019, 21:49:08) [GCC 7.3.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 4080
CUDA_HOME: /usr/local/cuda-11.1
NVCC: Build cuda_11.1.TC455_06.29069683_0
GCC: gcc (Ubuntu 9.5.0-1ubuntu1~22.04) 9.5.0
PyTorch: 1.9.1+cu111
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.1+cu111
OpenCV: 4.10.0
MMCV: 1.4.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.14.0
MMSegmentation: 0.14.1
MMDetection3D: 0.17.1+f110797
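
One thing that may be worth checking from the environment info above: the GPU is an RTX 4080 (compute capability 8.9), while the "NVCC architecture flags" line of this PyTorch 1.9.1+cu111 build stops at sm_86. The sketch below only parses that flags string and reports whether a given capability is listed. The helper names `compiled_archs` and `is_listed` are made up for illustration and are not part of PyTorch or MMDetection3D. A missing entry does not by itself prove this is the cause of the cusolver error, since binaries for the same major version can often still run, but it is a hint that the CUDA 11.1 toolchain predates this GPU.

```python
import re

# Flags copied from the "NVCC architecture flags" line of the environment info.
FLAGS = (
    "-gencode;arch=compute_37,code=sm_37;"
    "-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_60,code=sm_60;"
    "-gencode;arch=compute_70,code=sm_70;"
    "-gencode;arch=compute_75,code=sm_75;"
    "-gencode;arch=compute_80,code=sm_80;"
    "-gencode;arch=compute_86,code=sm_86"
)

# Compute capability of the RTX 4080 (Ada Lovelace).
RTX_4080 = (8, 9)


def compiled_archs(nvcc_flags):
    """Return the (major, minor) pairs named by code=sm_XY entries (hypothetical helper)."""
    digits = re.findall(r"code=sm_(\d+)", nvcc_flags)
    # The last digit is the minor version, everything before it is the major version.
    return {(int(d[:-1]), int(d[-1])) for d in digits}


def is_listed(capability, nvcc_flags):
    """True if the exact capability appears among the compiled architectures."""
    return tuple(capability) in compiled_archs(nvcc_flags)


if __name__ == "__main__":
    print(sorted(compiled_archs(FLAGS)))
    print(is_listed(RTX_4080, FLAGS))  # False: sm_89 is not in the list
```

On a live system, `torch.cuda.get_device_capability(0)` should give the capability tuple directly, so the hard-coded `RTX_4080` value is only there to keep the sketch runnable without a GPU.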
