Bug in rapid.py #49

Open
Alex18947 opened this issue Feb 2, 2025 · 1 comment
Comments

@Alex18947

Hi, while training on a custom fisheye dataset, starting from a pretrained COCO checkpoint, I run into "index out of bounds" errors in rapid.py: here

I am running with CUDA_LAUNCH_BLOCKING=1 to get the exact line. The error is:

./aten/src/ATen/native/cuda/IndexKernel.cu:93: operator(): block: [0,0,0], thread: [0,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.

After adding some prints just before the failing line, I can see that penalty_mask has size torch.Size([4, 3, 136, 136])

Other values:

b: 2
best_n: tensor([0, 0, 0, 0, 2])
truth_j: tensor([141, 99, 47, 53, 97])
truth_i: tensor([ 81, 109, 86, 96, 108])

I haven't been able to find the root cause so far; it seems to happen for some images only. Any ideas?
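For reference, the printed values can be checked against the mask shape directly. This is a minimal sketch in plain Python, assuming the failing line indexes penalty_mask as `[b, best_n, truth_j, truth_i]` (an assumption, since the exact line is not quoted here); `find_out_of_range` is a hypothetical helper, not part of rapid.py.

```python
# Values copied from the debug prints above.
mask_shape = (4, 3, 136, 136)  # penalty_mask.shape
b = 2
best_n = [0, 0, 0, 0, 2]
truth_j = [141, 99, 47, 53, 97]
truth_i = [81, 109, 86, 96, 108]


def find_out_of_range(shape, b, best_n, truth_j, truth_i):
    """Hypothetical helper: list every index that falls outside its dimension.

    Assumes the indexing order [b, best_n, truth_j, truth_i].
    Uses the same rule as the failed assertion: -size <= index < size.
    Returns tuples of (position in the index list, name, index, dimension size).
    """
    batch, n_anchors, n_rows, n_cols = shape
    problems = []
    if not -batch <= b < batch:
        problems.append((None, "b", b, batch))
    for pos, (n, j, i) in enumerate(zip(best_n, truth_j, truth_i)):
        checks = (
            ("best_n", n, n_anchors),
            ("truth_j", j, n_rows),
            ("truth_i", i, n_cols),
        )
        for name, idx, size in checks:
            if not -size <= idx < size:
                problems.append((pos, name, idx, size))
    return problems


problems = find_out_of_range(mask_shape, b, best_n, truth_j, truth_i)
print(problems)  # [(0, 'truth_j', 141, 136)]
```

Under that assumed indexing order, the only offending value is truth_j[0] = 141, which exceeds the dimension of size 136. That would suggest a ground-truth box whose center maps outside the feature-map grid for that image.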

@Alex18947
Author

Looking into this some more, it seems to happen only when images from the WEPDTOF dataset are included in the training set. It does not happen when I include only HABBOF or CEPDOF images. Maybe some unexpected or strange image resolution...
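One way to narrow this down would be to scan the annotations for box centers that lie outside their image. The sketch below is only an illustration: the record layout (`file`, `width`, `height`, `boxes` as `(cx, cy, w, h)` in pixels) and the sample values are made up, and are not the actual annotation format of any of these datasets.

```python
def centers_outside_image(records):
    """Hypothetical check: flag boxes whose center is outside the image.

    Each record is assumed to be a dict with keys "file", "width", "height",
    and "boxes" (a list of (cx, cy, w, h) tuples in pixels).
    Returns tuples of (file, box position, cx, cy).
    """
    flagged = []
    for rec in records:
        for k, (cx, cy, _w, _h) in enumerate(rec["boxes"]):
            inside = 0 <= cx < rec["width"] and 0 <= cy < rec["height"]
            if not inside:
                flagged.append((rec["file"], k, cx, cy))
    return flagged


# Made-up sample data, for illustration only.
records = [
    {"file": "a.jpg", "width": 1024, "height": 1024,
     "boxes": [(512.0, 300.0, 40.0, 80.0)]},
    {"file": "b.jpg", "width": 1024, "height": 768,
     "boxes": [(500.0, 800.0, 40.0, 80.0)]},  # cy exceeds the image height
]

flagged = centers_outside_image(records)
print(flagged)  # [('b.jpg', 0, 500.0, 800.0)]
```

If such boxes show up only for non-square or unusually sized images, that would be consistent with the resolution guess above, though it does not by itself confirm the root cause.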
