I am training on my own dataset. It has six classes, which I split into five base classes and one novel class, and I replaced the class names and class counts in the code with those of my dataset. For base training I use configs/detection/meta_rcnn/voc/split1/meta-rcnn_r101_c4_8xb4_voc-split1_base-training.py.
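Roughly, my edit to the class split looks like the sketch below (the class names here are only placeholders for my real categories, and I also updated the number of classes in the model config to match):

```python
# mmfewshot/detection/datasets/voc.py -- sketch only, class names are placeholders
VOC_SPLIT = dict(
    ALL_CLASSES_SPLIT1=('class_a', 'class_b', 'class_c', 'class_d', 'class_e',
                        'class_f'),
    BASE_CLASSES_SPLIT1=('class_a', 'class_b', 'class_c', 'class_d', 'class_e'),
    NOVEL_CLASSES_SPLIT1=('class_f', ))
```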
I only have a single RTX 3090 GPU, so my settings are as follows:
configs/detection/_base_/datasets/nway_kshot/base_voc.py:

```python
# only the fields I changed are shown; everything else is unchanged
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=1,
    model_init=dict(
        samples_per_gpu=8,
        workers_per_gpu=1))
```
configs/detection/meta_rcnn/voc/split1/meta-rcnn_r101_c4_8xb4_voc-split1_base-training.py:

```python
evaluation = dict(interval=6000)
lr_config = dict(warmup_iters=300, step=[1600])
optimizer = dict(lr=0.00001)
```
During the base-training phase, at around iteration 950-1000, the loss suddenly becomes NaN.
Please help me; I would be very grateful!