
assert torch.cuda.is_available(): will it work without a GPU? #20

Open

RedOne88 opened this issue Apr 30, 2021 · 19 comments

Comments

@RedOne88 commented Apr 30, 2021

Hi, and thanks for your code.
I have a question: can your code work without a GPU, i.e. on the CPU?
I can't seem to use the code in its current version; I always get this error:

    assert torch.cuda.is_available()
AssertionError

Regarding the image dataset: is your code able to run on fisheye images in grayscale rather than in color?
Thank you in advance for your reply!

duanzhiihao added a commit that referenced this issue Apr 30, 2021
@duanzhiihao (Owner)

Hi, thank you for your interest.
Actually, CUDA is not required to run RAPiD. I have updated the api.py file; could you check whether you can run on the CPU by passing the use_cuda=False argument to the Detector class?
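
For reference, a minimal usage sketch along these lines; the weights filename and image path are placeholders, and the keyword names follow the description above rather than a verified API:

from api import Detector

# use_cuda=False keeps everything on the CPU, so the
# torch.cuda.is_available() assertion is never hit
detector = Detector(model_name='rapid',
                    weights_path='./weights/rapid_coco.ckpt',  # placeholder filename
                    use_cuda=False)
detector.detect_one(img_path='./images/test.jpg',  # placeholder path
                    visualize=True)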

For training, I do not intend to add CPU support, since training on the CPU would be extremely slow and I doubt anyone will try it.

Regarding the image dataset: is your code able to run on fisheye images in grayscale rather than in color?

I'm afraid I don't understand. I believe our code runs on color images. Could you share the error message, if you are facing one?

@RedOne88 commented May 3, 2021

Thank you for your reply.
I was able to reproduce your results on your test images. Thank you so much.
However, when I tried it on grayscale images, it didn't work. Here is the error:

File "example.py", line 8, in <module>
  detector.detect_one (img_path = '. / images / image1-002.jpg',
File "/home/redmou/Téléchargements/rapid/api.py", line 69, in detect_one
  detections = self._predict_pil (img, ** kwargs)
File "/home/redmou/Téléchargements/rapid/api.py", line 136, in _predict_pil
  dts = self.model (input _). cpu ()
File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward (* input, ** kwargs)
File "/home/redmou/Téléchargements/rapid/models/rapid.py", line 71, in forward
  small, medium, large = self.backbone (x)
File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward (* input, ** kwargs)
File "/home/redmou/Téléchargements/rapid/models/backbones.py", line 80, in forward
  x = self.netlist [i] (x)
File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward (* input, ** kwargs)
File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
  input = module (input)
File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward (* input, ** kwargs)
File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 399, in forward
  return self._conv_forward (input, self.weight, self.bias)
File "/home/redmou/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 395, in _conv_forward
  return F.conv2d (input, weight, bias, self.stride,
RuntimeError: Given groups = 1, weight of size [32, 3, 3, 3], expected input [1, 1, 1024, 1024] to have 3 channels, but got 1 channels instead 

Do you have any idea about this type of error?
Thank you

@duanzhiihao (Owner)

Hello,
Our method is not designed for grayscale images, but here is a workaround: expand (repeat) the grayscale image into an RGB image before feeding it into the CNN:

# im is a torch tensor, and im.shape is (1, 1, h, w)
im = im.expand(-1, 3, -1, -1)  # view the single channel as 3 identical channels -> (1, 3, h, w)
pred = model(im)

@RedOne88 commented Jun 4, 2021

Hello,
and thank you for all your answers.
Could you help me, please? I managed to install a graphics card with 1 GB of GPU memory. I started training, but given the size of my GPU it refused to start, because part of the memory is already taken by PyTorch itself.

Could you help me train either on the CPU (even though it will take a lot of time) or with my current GPU?
Thank you very much.

@duanzhiihao
Copy link
Owner

Hi,

Given that your GPU has only 1 GB of memory, it would be challenging to fit the model into it. I recommend you try Google Colab, which gives you a free 4 GB GPU; please check the tutorial here for Google Colab notebooks.

Alternatively, you can use Kaggle notebooks, which sometimes provide a free P100 GPU, which is powerful enough to train RAPiD.

Training on the CPU could take more than 20 days on the COCO and fisheye datasets. If you want to do it anyway, please let me know and I can provide a CPU training script within several days.
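
For reference, porting a CUDA-only training script to the CPU usually comes down to changes like the following (a generic sketch, not the repo's actual train.py):

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)                         # instead of an unconditional model.cuda()
imgs, targets = imgs.to(device), targets.to(device)
# and load checkpoints with map_location=device instead of the default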

@RedOne88 commented Jun 4, 2021

Thank you so much for your response.
I will try what you suggested.
I have one last question; considering the incredible work you have done, I have asked so many questions, I am sorry.
To train on grayscale images (not color), is it enough to just convert the images to grayscale, or does the code itself need to change?
My idea is to train on grayscale images and then optimize the final model as much as possible (after training), because the one provided is very large (246 MB); I think training with grayscale images alone could reduce the model size. Could you tell me which piece of code to modify, if it is not complicated?
Thank you so much

@duanzhiihao (Owner)

No problem at all.

If you want to train on grayscale images, you need to modify the code here

self.netlist.append(ConvBnLeaky(3, 32, k=3, s=1))

to ConvBnLeaky(1, 32, k=3, s=1).

However, using grayscale instead of RGB barely reduces the model size, because it only affects the very first layer, which is very lightweight compared to the whole model.
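
For scale, a quick back-of-the-envelope check of that claim, using the ConvBnLeaky(3, 32, k=3, s=1) layer above and the backbone size reported later in this thread:

# weights in the first conv = out_channels * in_channels * k * k
first_conv_rgb  = 32 * 3 * 3 * 3  # 864 weights with RGB input
first_conv_gray = 32 * 1 * 3 * 3  # 288 weights with grayscale input
print(first_conv_rgb - first_conv_gray)  # 576 weights saved, out of ~40.6M in the backbone alone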

An effective way to reduce the model size (while sacrificing some accuracy) is to use half precision (i.e., float16) instead of float32. To do this, please try:

model = model.half()  # convert all weights from float32 to float16
x = x.half()          # the input dtype must match the model's
y = model(x)
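
To verify the effect on disk, one can save the converted weights; a minimal sketch, assuming model is the loaded RAPiD network (the output filename is a placeholder):

import torch

model = model.half()  # float32 -> float16
torch.save(model.state_dict(), 'rapid_fp16.pt')  # placeholder filename
# the saved state dict should be roughly half of the original ~246 MB checkpoint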

@RedOne88 commented Jun 4, 2021

Thank you!
:)

@RedOne88 commented Jun 9, 2021 via email

@RedOne88 commented Jun 10, 2021 via email

@RedOne88 commented Jun 11, 2021

Meanwhile, I ran the program on Kaggle and enabled the GPU. It worked at the beginning, then crashed with this error:

effective batch size = 8 * 16
initialing dataloader...
Only train on person images and objects
Loading annotations /kaggle/input/cocods/annotations_trainval2017/annotations/instances_train2017.json into memory...
Training on perspective images; adding angle to BBs
Using backbone Darknet-53. Loading ImageNet weights....
Warning: no ImageNet-pretrained weights found. Please check https://github.com/duanzhiihao/RAPiD for it.
Number of parameters in backbone: 40584928
2021-06-11 12:13:48.188692: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
/opt/conda/conda-bld/pytorch_1603729047590/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [0,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
  File "/kaggle/input/rapid-training/train.py", line 257, in <module>
    loss = model(imgs, targets, labels_cats=cats)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/kaggle/input/rapid-training/rapid/rapid/models/rapid.py", line 80, in forward
    boxes_M, loss_M = self.pred_M(detect_M, self.img_size, labels)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/kaggle/input/rapid-training/rapid/rapid/models/rapid.py", line 282, in forward
    target[b,best_n,truth_j,truth_i,0] = tx_all[b,:n][valid_mask] - tx_all[b,:n][valid_mask].floor()
RuntimeError: CUDA error: device-side assert triggered

I believe a box index is going out of bounds, but I don't know where it comes from.

@duanzhiihao (Owner)

It seems to be related to the COCO dataset format. Please check #11 to see if that solves your problem.

@RedOne88 commented Jun 15, 2021

Indeed, that worked, but after 2 hours of training it crashed. Here is the error message:

Total time: 1:52:05.342283, iter: 0:00:13.397096, epoch: 3:26:20.508855
[Iteration 500] [learning rate 0.001] [Total loss 209.47] [img size 512]
level_16 total 8 objects: xy/gt 1.385, wh/gt 0.143, angle/gt 0.627, conf 44.588
level_32 total 1 objects: xy/gt 1.340, wh/gt 0.016, angle/gt 0.638, conf 13.681
level_64 total 12 objects: xy/gt 1.384, wh/gt 0.215, angle/gt 0.748, conf 105.670
Max GPU memory usage: 6.040322303771973 GigaBytes

Traceback (most recent call last):
  File "/kaggle/input/rapid-training/train.py", line 303, in <module>
    dts = api.detect_once(model, eval_img, conf_thres=0.1, input_size=target_size)
  File "/kaggle/input/rapid-training/rapid/rapid/api.py", line 175, in detect_once
    dts = model(input_img[None]).cpu().squeeze()
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/kaggle/input/rapid-training/rapid/rapid/models/rapid.py", line 71, in forward
    small, medium, large = self.backbone(x)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/kaggle/input/rapid-training/rapid/rapid/models/backbones.py", line 80, in forward
    x = self.netlist[i](x)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[1, 1, 608, 608] to have 3 channels, but got 1 channels instead

Do you have any idea about the source of the error?
Thank you very much.

@duanzhiihao (Owner)

The error says that the input is a grayscale image, but the network expects an RGB image. Did you make any changes to the datasets.py script?

if img.mode == 'L':
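
For context, a hedged sketch of the kind of image-loading code that check belongs to; the surrounding lines are assumptions, not the repo's exact datasets.py:

from PIL import Image

img = Image.open(img_path)
if img.mode == 'L':           # single-channel grayscale image
    img = img.convert('RGB')  # replicate the channel so the first conv sees 3 channels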

@RedOne88 (Author)

Regarding training on COCO, I did not manage to finish it; on Kaggle, the site crashes every time after 25 hours of execution. My idea is to build my own model with your algorithm.
First, I will start from the existing one. I downloaded HABBOF, CEPDOF, and MW-R, and I want to build a dataset that combines all of them by taking, for example, 1000 images from each. Do you think that we could get good results while training the model on a dataset that is not very large?

@duanzhiihao (Owner)

Do you think that we could get good results while training the model on a dataset that is not very large?

Yes, as long as you start from the pre-trained model and use a small learning rate.
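
A generic PyTorch sketch of that advice; the checkpoint filename and state-dict key are assumptions, and this is not the repo's train.py:

import torch

# start from the released pre-trained weights instead of random initialization
checkpoint = torch.load('rapid_pretrained.ckpt', map_location='cpu')  # placeholder filename
model.load_state_dict(checkpoint['model'])  # key layout is an assumption

# a small learning rate fine-tunes the pre-trained weights gently, which matters
# when the combined dataset is only a few thousand images
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)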

@RedOne88 commented Jul 5, 2021

Hello,
I still can't finish training. On Kaggle, I am only allowed 9 hours of continuous use.
I chose the MW-R dataset and reduced the size of the images and annotations, hoping to reduce the training time.
Otherwise, I have a 32-core machine with an old graphics card (Quadro K4200) that is not supported by PyTorch, so I plan to run the training on the CPU. Could you provide a version of your code compatible with the CPU, please?

@RedOne88 commented Jul 7, 2021

Please keep me informed if you have any ideas.

@RedOne88 (Author)

Hello,
can you provide me with a CPU training script?
Thanks
