YOLOv4_new.cfg didn't work on custom data set. #8264

v-nayjack · 2021-11-30T18:33:34Z

@AlexeyAB I am trying to train YOLOv4 to detect text for OCR in images, and the objects are very small compared to the full image size (I followed all the recommendations in the https://github.com/AlexeyAB/darknet#how-to-improve-object-detection section). I eventually need to train the network to detect about 100 classes, but I first want to make sure that detection works, so this time I used YOLOv4 with yolov4_new.cfg for 3 classes. I changed anchors=, filters=24 and classes=3 for each YOLO layer and trained the network with custom training and validation data sets on Google Colab. According to the avg loss and mAP plot, the training went well, and the avg loss was 0.1395 at 4900 iterations.
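For reference, filters=24 with classes=3 is consistent with the rule in the Darknet README, filters=(classes + 5) x 3, applied to the [convolutional] layer directly before each [yolo] layer. A minimal Python sketch of that arithmetic (the function name is hypothetical, not part of Darknet):

```python
def yolo_filters(classes: int, masks_per_layer: int = 3) -> int:
    """Filters for the [convolutional] layer right before a [yolo] layer.

    Each anchor in the mask predicts 4 box coordinates, 1 objectness
    score, and one score per class.
    """
    return (classes + 5) * masks_per_layer


print(yolo_filters(3))    # the 3-class test run in this thread
print(yolo_filters(100))  # the eventual 100-class network
```

So the value has to change again when moving from the 3-class test to the full 100 classes.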

I have all the weights files. I tried running map on every one of them, and I am not getting any mAP values.
None of the weights files produce any detections during testing.
I also started the training from pre-trained convolutional layers (yolov4.conv.137).
I am at a complete loss as to why I cannot detect any of the objects, even though the training went well (I think).

stephanecharette (Collaborator) · 2021-11-30T19:49:34Z

"object size is very small" doesn't give us enough info. What are your network dimensions, and how small are the objects once the images are resized to the network dimensions? To understand: https://www.ccoderun.ca/darkhelp/api/Tiling.html

If you have 100 classes, you should train for about 200,000 iterations. You shouldn't be looking at results after only 4900 iterations.

Also see https://www.ccoderun.ca/programming/darknet_faq/#how_many_images

1 reply

v-nayjack Nov 30, 2021
Author

My image sizes vary from 1024 x 768 to 6025 x 4015, and the smallest bboxes are 17 x 35 and 26 x 44 respectively. Some objects might be smaller than that.
I am training the network for only 3 classes, not all 100, so I don't need to run the entire 200000 iterations. I am training for only 3 classes because I want to make sure my cfg values are good and work for detection, at least for those 3 classes.

> To understand: https://www.ccoderun.ca/darkhelp/api/Tiling.html

For now, tiling with DarkHelp and DarkMark is a last resort for me. I need to understand what is going wrong in my network before adding more variability to it.

Here are some of the cfg values that might give you more info:
[net]
batch=64
subdivisions=64
width=608
height=608
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.0013
burn_in=1000
max_batches = 6000
policy=steps
steps=4800,5400
scales=.1,.1

mosaic=1

letter_box=1

ema_alpha=0.9998

I calculated the anchor values based on the labeled data-set and used it in my cfg file.

[yolo]
mask = 6,7,8
anchors = 3, 7, 7, 13, 10, 24, 19, 16, 13, 30, 31, 25, 18, 45, 23, 48, 34, 54
classes=3
num=9
jitter=.1
scale_x_y = 2.0
objectness_smooth=1
ignore_thresh = .9
truth_thresh = 1
#random=1
resize=1.5
#iou_thresh=0.2
iou_normalizer=0.05
cls_normalizer=0.5
obj_normalizer=0.4
iou_loss=giou
nms_kind=diounms
beta_nms=0.6
new_coords=1
max_delta=2

I also changed layers=23 on line 899 and stride=4 on lines 896 and 999 of yolov4_new.cfg (I made these changes because the objects will be smaller than 16x16 after the input image is resized).

I also applied the recommended values ignore_thresh = .9, iou_normalizer=0.05 and iou_loss=giou to each [yolo] layer to increase the mAP.
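To make the [yolo] settings above easier to read: the mask= line selects which of the num=9 anchor pairs that particular layer uses. A small Python sketch that pairs up the anchors= values and picks the masked ones, assuming the anchors are (width, height) pairs in network-input pixels (the helper name is hypothetical):

```python
from __future__ import annotations


def anchors_for_mask(anchors: str, mask: str) -> list[tuple[int, int]]:
    """Return the (width, height) anchor pairs selected by a [yolo] mask."""
    values = [int(v) for v in anchors.split(",")]   # int() ignores the spaces
    pairs = list(zip(values[0::2], values[1::2]))   # consecutive values form one anchor
    return [pairs[int(i)] for i in mask.split(",")]


# Values copied from the [yolo] section quoted above.
anchors = "3, 7, 7, 13, 10, 24, 19, 16, 13, 30, 31, 25, 18, 45, 23, 48, 34, 54"
print(anchors_for_mask(anchors, "6,7,8"))  # [(18, 45), (23, 48), (34, 54)]
```

Even the largest anchor here is only 34 x 54, which shows how small the labelled objects are relative to the 608 x 608 network input.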

stephanecharette (Collaborator) · 2021-11-30T22:15:01Z

You need to re-read those pages. I wasn't asking you to try tiling; I needed you to understand the implications of your choices when it comes to sizing.

E.g., take your 6025 x 4015 image, resize it to 608x608, and let us know the size of your objects once resized.

1 reply

v-nayjack Nov 30, 2021
Author

I know that, with the objects being so small after resizing, I need to consider a solution similar to tiling. However, I am trying to understand why the cfg values recommended by @AlexeyAB for small object detection don't work, even after changing the anchor values.

In the 6025 x 4015 image, after resizing to 608 x 608, the smallest object is 2.62 x 6.67.
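The resize arithmetic behind these numbers can be reproduced with a short Python sketch. It assumes a plain stretch to the network dimensions, plus an optional mode in which both axes use the smaller scale factor, which is how letter_box=1 is assumed to behave here. The function name is hypothetical, and the image and bbox sizes are the ones quoted earlier in this thread:

```python
def resized_size(img_w, img_h, obj_w, obj_h, net_w=608, net_h=608, letter_box=False):
    """Object size in pixels after the image is scaled to the network input."""
    sx, sy = net_w / img_w, net_h / img_h
    if letter_box:
        # Assumption: aspect ratio is preserved, so one factor applies to both axes.
        sx = sy = min(sx, sy)
    return obj_w * sx, obj_h * sy


cases = [((1024, 768), (17, 35)), ((6025, 4015), (26, 44))]
for (img_w, img_h), (obj_w, obj_h) in cases:
    for letter_box in (False, True):
        w, h = resized_size(img_w, img_h, obj_w, obj_h, letter_box=letter_box)
        print(f"{img_w}x{img_h} letter_box={letter_box}: {w:.2f} x {h:.2f}")
```

Under the plain stretch, the 26 x 44 box in the 6025 x 4015 image comes out at roughly 2.6 x 6.7 pixels, matching the figure above, and the 17 x 35 box in a 1024 x 768 image is only about 10 pixels wide.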

stephanecharette (Collaborator) · 2021-11-30T22:17:28Z

> I am training the network only for 3 classes and not all 100 classes

Regardless, even ignoring the other problems in the previous comment, the minimum for max batches is 6000, so you still would have to train some more.
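The iteration counts mentioned in this thread follow the rule of thumb in the Darknet README: max_batches is classes x 2000, but not less than 6000 and not less than the number of training images, with steps at 80% and 90% of max_batches. A minimal Python sketch (the function and parameter names are hypothetical):

```python
def training_schedule(classes: int, num_images: int = 0):
    """Return (max_batches, steps) following the Darknet README rule of thumb."""
    max_batches = max(classes * 2000, num_images, 6000)
    # Integer arithmetic avoids floating-point rounding on the 80% / 90% marks.
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    return max_batches, steps


print(training_schedule(3))    # (6000, (4800, 5400))
print(training_schedule(100))  # (200000, (160000, 180000))
```

For 3 classes this gives max_batches=6000 and steps=4800,5400, the same values as in the cfg posted above.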

1 reply

v-nayjack Nov 30, 2021
Author

I did train the network for the full 6000 iterations. I have a weights file for every 1000th iteration, plus one xxxxx_best.weights, one xxxxx_ema.weights, one xxxxx_final.weights, and one xxxxx_last.weights.

I quoted iteration 4900 because that is the last point at which I saved the training chart from Google Colab. Again, I did finish the training for 3 classes (6000 iterations).

When I run detector map on each of these weights files, I don't get mAP values. Sometimes I get a value for one of them, but it is always zero. I don't know why! It seems random: sometimes the mAP value is shown for xxxx_1000.weights, sometimes for xxxx_final.weights, etc. The results are not consistent at all.

stephanecharette (Collaborator) · 2021-11-30T23:06:22Z

> for small object detection doesn't work

It does work. It works very well! See my youtube channel for examples.

> after resizing the smallest object to 608 x 608, the object size is 2.62 x 6.67

You need to re-read my page. After resizing, if the objects are smaller than 16x16, you'll have problems. If they are smaller than 10x10, they cannot be detected.

Since your objects are as small as 3x7, I can guarantee you it won't work without using something like DarkHelp or your own implementation of tiling.
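The 16x16 / 10x10 rule of thumb above, and the effect of tiling, can be sketched in Python. Two assumptions are made here: the rule is applied to the smaller of the two object dimensions, and the tile grid is a naive non-overlapping one sized to the network input. This is not DarkHelp's actual tiling algorithm, and all names are hypothetical:

```python
import math


def detectability(obj_w: float, obj_h: float) -> str:
    """Classify a resized object using the 16x16 / 10x10 rule of thumb."""
    smallest = min(obj_w, obj_h)  # assumption: the smaller side decides
    if smallest < 10:
        return "too small to detect"
    if smallest < 16:
        return "likely to cause problems"
    return "ok"


def naive_tile_grid(img_w: int, img_h: int, net_w: int = 608, net_h: int = 608):
    """Number of (columns, rows) of roughly network-sized tiles covering the image."""
    return math.ceil(img_w / net_w), math.ceil(img_h / net_h)


print(detectability(2.62, 6.67))      # whole 6025x4015 image squeezed into 608x608
print(naive_tile_grid(6025, 4015))    # tiles needed to keep objects near native size
print(detectability(26, 44))          # same object inside a roughly 1:1 tile
```

With tiles that are close to the network dimensions, the 26 x 44 object stays near its original pixel size instead of shrinking to about 3 x 7.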

3 replies

v-nayjack Nov 30, 2021
Author

> It does work. It works very well! See my youtube channel for examples.

I'll check it out.

> Since your objects are as small as 3x7, I can guarantee you it won't work without using something like DarkHelp or your own implementation of tiling.

I understand that I have to use tiling or a similar implementation for larger images with objects smaller than 10x10; however, this still doesn't answer my question.

This time my training data set contained maybe 4-5 extremely large images of size 6025 x 4015; the rest were about 1024 x 768, and their resized objects weren't smaller than 10x10. Since some of them were smaller than 16x16 after resizing, I followed the recommendations given by @AlexeyAB. At the very least, I was expecting the trained network to detect the objects in the 1024x768 images.

RonWP Dec 1, 2021

@v-nayjack I don't have answers for your questions but want to comment and offer some encouragement. My 1st experience with yolo was 2-class custom dataset on 256x256 images with object sizes from 4x4 to 128x128 and larger. The experiment was directed at exploring how small these objects could be and still be detected with acceptable false positives... I started with pre-trained network and subset of my objects whose sizes generally fit the default anchor boxes... intent was to start with something that ought to work and achieve early success. That didn't work as expected and I had a few iterations of learning from mistakes. I suggest you consider doing the same, then gradually make the problem harder and evolve your solution.

For me, overall, much of the documented general advice was quite helpful, and some wasn't; that's to be expected if the problem-at-hand has some uniqueness. Eventually, I ended up using custom/atypical anchor boxes in yolov4-tiny-3l or yolov3-tiny-3l, and with low threshold (e.g., 0.2) got high recall on objects having 30+ pixels; this came with lower precision (more false positives) than desired, but for our application we suppressed those with other methods. This was just a 2-class problem and given all the things learned along-the-way, it was particularly important for me to start with data/network that was close enough to standard that it was supposed to be successful... Good luck!

v-nayjack Dec 1, 2021
Author

@RonWP Thank you! I really appreciate that. Your advice is definitely helpful :)
