During training, IndexError occurred. #82

Open
Jucjiaswiss opened this issue Mar 27, 2020 · 8 comments

Comments

@Jucjiaswiss

Jucjiaswiss commented Mar 27, 2020

Shortly after training started, the following error occurred:
[Epoch 0/500][Iter 2350/5649][lr 0.000000][Loss: anchor 12.62, iou 12.75, l1 60.10, conf 1123.60, cls 258.15, imgsize 320, time: 6.73]
[Epoch 0/500][Iter 2360/5649][lr 0.000000][Loss: anchor 15.11, iou 15.44, l1 77.95, conf 3106.67, cls 311.54, imgsize 576, time: 7.44]
Traceback (most recent call last):
  File "main.py", line 486, in <module>
    main()
  File "main.py", line 416, in main
    loss_dict = model(imgs, targets, epoch)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/detection_networks/ASFF/models/yolov3_asff.py", line 149, in forward
    x, anchor_loss, iou_loss, l1_loss, conf_loss, cls_loss = header(fused, targets)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/detection_networks/ASFF/models/yolov3_head.py", line 214, in forward
    pred_anchors[b, self.n_anchors-1, j, i, :4].data.cpu().view(-1,4),xyxy=False) #iou of pred anchor
IndexError: index 19 is out of bounds for dimension 3 with size 18
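
One possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: in `pred_anchors[b, self.n_anchors-1, j, i, :4]`, dimension 3 is indexed by `i`, which appears to be the grid column of a ground-truth box. If the head derives that column by flooring the normalized center x times the grid width (a common YOLOv3 convention, assumed here and not checked against `yolov3_head.py`), then a label whose center lies on or beyond the right image edge produces an index equal to or larger than the grid width. The helper names below are hypothetical.

```python
import math


def grid_index(center_norm: float, grid_size: int) -> int:
    """Cell index for a normalized center coordinate.

    Assumed convention: floor(center * grid_size). The real head may differ.
    """
    return math.floor(center_norm * grid_size)


def in_bounds(center_norm: float, grid_size: int) -> bool:
    """True if the center maps to a valid cell of a grid_size-wide grid."""
    return 0 <= grid_index(center_norm, grid_size) < grid_size


# imgsize 576 with an assumed stride of 32 gives an 18-wide grid,
# which would match "size 18" in the error message.
for cx in (0.5, 1.0, 1.06):
    print(cx, grid_index(cx, 18), in_bounds(cx, 18))
```

Under these assumptions, only centers strictly inside `[0, 1)` map to a valid cell, so a center at exactly 1.0 or slightly beyond it would be enough to trigger the error.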

@Willert98

Same problem here. Is there a solution?

@Willert98

@Jucjiaswiss Do you have a solution? Help...

@Willert98

Could the labels be wrong?

@Jucjiaswiss
Author

I still don't know why, sorry.

@Jucjiaswiss
Author

I switched to a VOC-style dataset, and training now runs without problems.
My previous data was VOC converted to COCO style. Hope it works for you.
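
Since the converted data seemed to be the trigger, a quick check of the converted annotation file may help locate the offending labels. The following is a minimal sketch, assuming the standard COCO layout (`images` with `id`, `width`, `height`; `annotations` with `id`, `image_id`, and `bbox` as `[x_min, y_min, width, height]` in pixels). The function name and the sample data are made up for illustration. A frequent conversion mistake is writing `[xmin, ymin, xmax, ymax]` into `bbox`, which pushes boxes past the image border.

```python
def find_bad_coco_boxes(coco: dict) -> list:
    """Return (annotation id, reason) pairs for boxes that leave the image."""
    sizes = {img["id"]: (img["width"], img["height"]) for img in coco["images"]}
    bad = []
    for ann in coco["annotations"]:
        # COCO boxes are [x_min, y_min, width, height] in pixels.
        x, y, w, h = ann["bbox"]
        img_w, img_h = sizes[ann["image_id"]]
        if w <= 0 or h <= 0:
            bad.append((ann["id"], "non-positive size"))
        elif x < 0 or y < 0 or x + w > img_w or y + h > img_h:
            bad.append((ann["id"], "outside image"))
    return bad


# Hand-made sample: annotation 2 stores [xmin, ymin, xmax, ymax] by mistake.
sample = {
    "images": [{"id": 1, "width": 100, "height": 80}],
    "annotations": [
        {"id": 1, "image_id": 1, "bbox": [10, 10, 30, 20]},
        {"id": 2, "image_id": 1, "bbox": [60, 40, 90, 70]},
    ],
}
print(find_bad_coco_boxes(sample))
```

Running it on the real annotation JSON (loaded with `json.load`) would list the annotations worth inspecting by hand.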

@Willert98

> I switched to a VOC-style dataset, and training now runs without problems.
> My previous data was VOC converted to COCO style. Hope it works for you.

Yes, I suspect the same cause, but I also can't find what exactly is wrong. Thank you~

@Jucjiaswiss
Author

Jucjiaswiss commented May 7, 2020

[Epoch 0/500][Iter 19030/83822][lr 0.000000][Loss: anchor 19.62, iou 20.14, l1 83.97, conf 150.48, cls 2597.47, imgsize 448, time: 9.23]
[Epoch 0/500][Iter 19040/83822][lr 0.000000][Loss: anchor 29.36, iou 30.13, l1 134.33, conf 136.58, cls 4027.54, imgsize 320, time: 7.31]
Traceback (most recent call last):
  File "main.py", line 472, in <module>
    main()
  File "main.py", line 398, in main
    loss_dict = model(imgs, targets, epoch)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/_utils.py", line 385, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in replica 1 on device 1.
Original Traceback (most recent call last):
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/detection_networks/ASFF/models/yolov3_asff.py", line 149, in forward
    x, anchor_loss, iou_loss, l1_loss, conf_loss, cls_loss = header(fused, targets)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/detection_networks/ASFF/models/yolov3_head.py", line 214, in forward
    pred_anchors[b, self.n_anchors-1, j, i, :4].data.cpu().view(-1,4),xyxy=False) #iou of pred anchor
IndexError: index 19 is out of bounds for dimension 3 with size 19

Using a different VOC dataset, this error still occurred after a number of iterations.
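
Because the error also shows up with VOC-style data, the XML annotations themselves may be worth scanning. This is only a sketch, assuming the usual Pascal VOC layout (`size/width`, `size/height`, and one `bndbox` with `xmin`, `ymin`, `xmax`, `ymax` per `object`). The function name and the sample XML are hypothetical, and whether such boxes are the actual cause here is unconfirmed.

```python
import xml.etree.ElementTree as ET


def find_bad_voc_boxes(xml_text: str) -> list:
    """Return (class name, reason) pairs for suspicious boxes in one VOC file."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    if width <= 0 or height <= 0:
        # Some custom datasets store 0 here, which breaks any normalization.
        return [("<image>", "non-positive image size")]
    bad = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if xmin >= xmax or ymin >= ymax:
            bad.append((obj.findtext("name"), "degenerate box"))
        elif xmin < 0 or ymin < 0 or xmax > width or ymax > height:
            bad.append((obj.findtext("name"), "outside image"))
    return bad


# Hand-made sample: the "dog" box extends past the 100-pixel image width.
SAMPLE_XML = """
<annotation>
  <size><width>100</width><height>80</height><depth>3</depth></size>
  <object><name>cat</name>
    <bndbox><xmin>10</xmin><ymin>10</ymin><xmax>40</xmax><ymax>30</ymax></bndbox>
  </object>
  <object><name>dog</name>
    <bndbox><xmin>90</xmin><ymin>20</ymin><xmax>130</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>
"""
print(find_bad_voc_boxes(SAMPLE_XML))
```

Looping this over every annotation file and printing the file name alongside each hit would narrow the search to a handful of images.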

@Jucjiaswiss Jucjiaswiss reopened this May 7, 2020
@Jucjiaswiss
Author

Also, these are custom datasets with 300 classes.
