
Preprocessing annotations problem #16

Open
MurielleMardenli200 opened this issue Nov 25, 2024 · 4 comments

@MurielleMardenli200 (Contributor)

Description of problem

The current method preprocess_data_coco in the preprocessing file, which creates the COCO annotations for the RetinaNet model (see here), does not produce the correct bounding box dimensions.

Example

This example shows the ground truth generated from the validation annotation file. The problem is that several axons are grouped together inside single bounding boxes:
[image]

@MurielleMardenli200 self-assigned this Nov 25, 2024
@hermancollin changed the title from "COCO Preprocessing annotations problem" to "Preprocessing annotations problem" Dec 3, 2024
@hermancollin (Member) commented Dec 3, 2024

@edgark31 confirmed he has the same problem, so the issue is upstream. The reason why this happens is very simple. Take a look at the top left corner of the image:

[image: object-det-img]

The bboxes were generated by processing the GT segmentation of this image:

[image: object-det-seg]

However, if we do this directly on the semantic segmentation mask of the myelin, this is bound to happen. Both of your preprocessing functions extract the "individual" axons using the utils.find_regions function. But this function was very badly written and I'm pretty annoyed that I didn't catch this earlier.

from skimage import measure

def find_regions(img):
    """Finds connected regions directly from a binary segmentation mask."""
    # Ensure the image is of integer type for labeling
    img = img.astype(int)
    # Label connected regions
    labeled_img = measure.label(img)
    # Extract region properties
    regions = measure.regionprops(labeled_img)
    if len(regions) == 0:
        print("No regions found!")
    return regions

But if we do this on the myelin semantic segmentation mask, obviously the touching myelin sheaths get grouped together. Look at what happens when I select these 4 myelin regions in the image (using diagonal neighbors):

[image: Screenshot_20241203_124248]

We can see that these regions are bundled together, and this is actually 100% consistent with the bboxes.
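As a toy illustration (hypothetical data, not from the repository): two touching ring-shaped "myelin" objects in a binary mask collapse into a single connected component under skimage labeling, which is exactly what find_regions does internally.

import numpy as np
from skimage import measure

mask = np.zeros((7, 12), dtype=int)
mask[1:6, 1:6] = 1    # first "myelin" blob (filled square)
mask[2:5, 2:5] = 0    # hollow it out into a ring
mask[1:6, 6:11] = 1   # second ring, touching the first on its right edge
mask[2:5, 7:10] = 0

labeled = measure.label(mask)
print(labeled.max())  # prints 1: the two touching rings form a single region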

The simplest fix would be to use the axon mask instead of the myelin mask for the preprocessing, because the axons are naturally disjoint objects and this grouping would never happen.
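A minimal sketch of that fix (hypothetical path, reusing the existing utils.find_regions unchanged); the only change is which mask gets fed in:

import cv2
import utils  # repository module providing find_regions

axon_seg_path = "axon_mask.png"  # hypothetical path to the axon mask (not the myelin mask)
axon_seg = cv2.imread(axon_seg_path, cv2.IMREAD_GRAYSCALE)

# Axons are naturally disjoint, so each region corresponds to a single axon
for region in utils.find_regions(axon_seg):
    minr, minc, maxr, maxc = region.bbox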


For @edgark31, the preprocessing function uses the myelin mask. It does read the axon mask but does nothing with it:

for i, region in enumerate(axon_seg_regions):
    minr, minc, maxr, maxc = region.bbox
    bbox_data.append({"image_name": f"{subject}_{sample}.png", "xmin": minc, "ymin": minr, "xmax": maxc, "ymax": maxr, "class": "axon"})

# Process Myelin
myelin_seg = cv2.imread(myelin_seg_path, cv2.IMREAD_GRAYSCALE)
myelin_seg_regions = utils.find_regions(myelin_seg)

with open(os.path.join(processed_masks_dir, label_name), "w") as file:
    for i, region in enumerate(myelin_seg_regions):
        minr, minc, maxr, maxc = region.bbox
        width, height = region.axis_major_length, region.axis_minor_length
        bbox_data.append({"image_name": f"{subject}_{sample}.png", "xmin": minc, "ymin": minr, "xmax": maxr, "ymax": maxc, "class": "myelin"})

        # Normalize coordinates
        img_height, img_width = img.shape[:2]
        x_center = (minc + maxc) / 2 / img_width
        y_center = (minr + maxr) / 2 / img_height
        width = (maxc - minc) / img_width
        height = (maxr - minr) / img_height

        # Write axonmyelin class (0) to the label file
        file.write('0 {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(x_center, y_center, width, height))
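
For comparison, here is a hedged sketch of what the label-writing loop could look like if it iterated over the axon regions instead (variable names reused from the snippet above; this is an illustration, not code from the repository):

img_height, img_width = img.shape[:2]
with open(os.path.join(processed_masks_dir, label_name), "w") as file:
    for region in axon_seg_regions:  # axon regions are disjoint, unlike the myelin ones
        minr, minc, maxr, maxc = region.bbox
        # Normalize bbox coordinates to the image size (YOLO-style)
        x_center = (minc + maxc) / 2 / img_width
        y_center = (minr + maxr) / 2 / img_height
        width = (maxc - minc) / img_width
        height = (maxr - minr) / img_height
        file.write('0 {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(x_center, y_center, width, height))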


For COCO, I'm not entirely sure why this is happening, because the preprocessing function does use the axon mask. However, this does not make sense, because your bboxes are the same as Edgar's (and we can see it visually in your visualization as well). @MurielleMardenli200 are you using a different implementation than what is on master, by any chance?

axon_seg = cv2.imread(axon_seg_path, cv2.IMREAD_GRAYSCALE)
axon_seg_regions = utils.find_regions(axon_seg)
axon_annotations = []
for region in axon_seg_regions:
    minr, minc, maxr, maxc = region.bbox
    bbox_width = maxc - minc
    bbox_height = maxr - minr
    bbox_area = bbox_width * bbox_height
    axon_annotations.append({
        "id": annotation_id,
        "image_id": image_id,
        "category_id": 0,  # Axon category
        "bbox": [minc, minr, bbox_width, bbox_height],
        "area": bbox_area,
        "iscrowd": 0,
    })
    annotation_id += 1

@hermancollin (Member)

Ahh I see, @MurielleMardenli200, you are working on dev/retinaNet. I see you changed this part of the code to use the axonmyelin mask instead of the axon mask, which explains why you have the same problem.

@MurielleMardenli200 (Contributor, Author) commented Dec 6, 2024

Thank you for the help with this! Like you said, Edgar and I were using the same preprocessing, combining the axon and myelin classes, as we were told to do at the beginning. I changed the preprocessing to use only axons and I am getting more precise results. Here is what the ground truth looks like:

[image]

From what I understand, we should be using instance segmentation and not semantic segmentation due to the overlapping. Even with the use of only axon classes, I see that some boxes still combine multiple axons, like at the bottom left corner:

[image: Screenshot 2024-12-06 at 10 21 07 AM]

I can take a look at the utils.find_regions method to see how we can use instance segmentation to fix this.
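
For instance, one standard (purely illustrative, not the ADS implementation) way to split touching objects into separate instances is a distance-transform + watershed pass over the binary mask:

import cv2
import numpy as np
from scipy import ndimage as ndi
from skimage import measure
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

binary_mask = cv2.imread("axon_mask.png", cv2.IMREAD_GRAYSCALE) > 0  # hypothetical path

# Seed one marker per local maximum of the distance transform, then flood with watershed
distance = ndi.distance_transform_edt(binary_mask)
peak_coords = peak_local_max(distance, labels=measure.label(binary_mask), min_distance=5)
markers = np.zeros(distance.shape, dtype=int)
markers[tuple(peak_coords.T)] = np.arange(1, len(peak_coords) + 1)
instances = watershed(-distance, markers, mask=binary_mask)  # one integer label per object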

@hermancollin (Member)

> From what I understand, we should be using instance segmentation and not semantic segmentation due to the overlapping. Even with the use of only axon classes, I see that some boxes still combine multiple axons, like at the bottom left corner:
>
> [image: Screenshot 2024-12-06 at 10 21 07 AM]

Ok, so I think the GT you show here looks basically perfect. The problem was not the "overlap" itself (this is to be expected with high axon density), but the fact that the bboxes contained multiple instances. In this cropped GT, the overlaps are simply due to the fact that we are putting circular shapes inside rectangular boxes.

> I can take a look at the utils.find_regions method to see how we can use instance segmentation to fix this.

I think we should do this together because I don't want you to re-invent the wheel. We already have features in ADS to produce these bboxes correctly.
