Focal loss value is very less #2

prakashjayy · 2018-02-28T13:24:02Z

Hi,

I have implemented your code in PyTorch and it runs properly, but I have the following concerns.

My pseudocode works like this:
cls_preds = [batch_size, anchor_boxes, classes] # classes is 21 (voc_labels+background) [16, 67995, 21]
cls_targets = [batch_size, anchor_boxes] # each anchor box label ranges from -1 to 20 [16, 67995]

Now I remove all the anchor boxes with label -1 (ignore_boxes):
cls_preds = [batch_size * valid_anchor_boxes, classes] # [54933, 21]
cls_targets = [batch_size * valid_anchor_boxes] # [54933], turned into one-hot vectors inside focal_loss
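
The filtering step above can be sketched in plain Python. This is only an illustration under assumptions: the helper names `drop_ignored` and `one_hot_without_background` are hypothetical, label 0 is taken to be background, and -1 marks ignored anchors, as in the post.

```python
def drop_ignored(cls_preds, cls_targets, ignore_label=-1):
    """Keep only the anchors whose target label is not the ignore label."""
    kept = [(p, t) for p, t in zip(cls_preds, cls_targets) if t != ignore_label]
    return [p for p, _ in kept], [t for _, t in kept]


def one_hot_without_background(label, num_classes):
    """Assumption: label 0 is background and maps to an all-zero row;
    label k in 1..num_classes sets index k - 1."""
    row = [0.0] * num_classes
    if label > 0:
        row[label - 1] = 1.0
    return row


# Three anchors, three foreground classes (toy numbers, not from the post).
preds = [[0.1, -2.0, 0.3], [1.5, 0.0, -0.7], [0.2, 0.2, 0.2]]
targets = [2, -1, 0]  # class 2, ignored, background

kept_preds, kept_targets = drop_ignored(preds, targets)
one_hot = [one_hot_without_background(t, 3) for t in kept_targets]
print(kept_targets)  # [2, 0]
print(one_hot)       # [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
```

A background anchor ends up as an all-zero target row, so it contributes only negative terms to a sigmoid focal loss.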

Now, I followed your code and implemented focal loss as is, but my loss values come out very low. With random values the loss is 0.12, and it quickly drops to 0.0012 and smaller.

Is there something I am missing?

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable


class FocalLoss_tensorflow(nn.Module):
    # num_classes must match the width of cls_preds (21 when a background column is included)
    def __init__(self, num_classes=20,
                 focusing_param=2.0,
                 balance_param=0.25):
        # the super() call must name this class (the original referred to FocalLoss_2)
        super(FocalLoss_tensorflow, self).__init__()
        self.num_classes = num_classes
        self.focusing_param = focusing_param
        self.balance_param = balance_param

    def focal_loss(self, x, y):
        """ https://github.com/ailias/Focal-Loss-implement-on-Tensorflow/blob/master/focal_loss.py
        everywhere people are just talking about num_classes. So lets remove the background class from focal loss calculation.
        """
        # drop the background column (index 0) from the logits
        x = x[:, 1:]
        sigmoid_p = torch.sigmoid(x)
        anchors, classes = x.shape

        # one-hot encode the labels, then drop the background column as well
        t = torch.FloatTensor(anchors, classes + 1)
        t.zero_()
        t.scatter_(1, y.data.cpu().view(-1, 1), 1)
        t = Variable(t[:, 1:]).cuda()

        zeros = Variable(torch.zeros(sigmoid_p.size())).cuda()
        # (1 - p) where the target is 1, else 0
        pos_p_sub = ((t >= sigmoid_p).float() * (t - sigmoid_p)) + ((t < sigmoid_p).float() * zeros)
        # p where the target is 0, else 0
        neg_p_sub = ((t >= zeros).float() * zeros) + ((t <= zeros).float() * sigmoid_p)

        per_entry_cross_ent = (
            (-1) * self.balance_param * (pos_p_sub ** self.focusing_param)
            * torch.log(torch.clamp(sigmoid_p, 1e-8, 1.0))
            - (1 - self.balance_param) * (neg_p_sub ** self.focusing_param)
            * torch.log(torch.clamp(1.0 - sigmoid_p, 1e-8, 1.0))
        )
        # mean over every (anchor, class) entry; see Question2 below
        return per_entry_cross_ent.mean()

    def forward(self, loc_preds, loc_targets, cls_preds, cls_targets):
        batch_size, num_boxes = cls_targets.size()
        pos = cls_targets > 0  # positive (non-background) anchors
        num_pos = pos.data.long().sum()

        # localization loss on positive anchors only
        mask = pos.unsqueeze(2).expand_as(loc_preds)
        masked_loc_preds = loc_preds[mask].view(-1, 4)
        masked_loc_targets = loc_targets[mask].view(-1, 4)
        loc_loss = F.smooth_l1_loss(masked_loc_preds, masked_loc_targets, size_average=False)
        loc_loss = loc_loss / num_pos

        # classification loss on all anchors except the ignored ones (label -1)
        pos_neg = cls_targets > -1
        mask = pos_neg.unsqueeze(2).expand_as(cls_preds)
        masked_cls_preds = cls_preds[mask].view(-1, self.num_classes)
        cls_loss = self.focal_loss(masked_cls_preds, cls_targets[pos_neg])
        return loc_loss, cls_loss

Question1:
I am still not getting this quite right: should I use 0 as my background class, and how is normalization done when focal loss is applied?

Question2:
I see you have taken mean(), but the paper says we need to sum and normalize by the number of positive anchors. Does "positive anchors" mean only the positive anchor boxes, or all valid anchor boxes?

Question3:
The graphs you presented show the loss starting from 6.45... and decreasing, but mine starts from 0.12 and quickly drops to small decimals.

ChiefGodMan · 2018-03-03T15:07:35Z

Hi.
Q1: Ask which label the background should get, positive or negative, where the background boxes should be chosen, and how many of them there should be. Thinking about what the loss function means answers this.
Q2: Yes, you are right. My results also used sum() instead. The positive examples only consider the losses of the positive anchors.
Q3: Use reduce_sum() instead.
Thanks for your attention.
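
To make the mean() versus sum() difference concrete, here is a minimal pure-Python sketch, not the repository's code. It assumes the standard sigmoid focal loss term with alpha = 0.25 and gamma = 2 (the defaults in the posted class), and the toy sizes (1000 anchors, 20 classes, 10 positives, all logits at zero) are made up for illustration. The helper names `focal_term` and `focal_sum` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def focal_term(logit, target, alpha=0.25, gamma=2.0):
    """Sigmoid focal loss for one (anchor, class) entry; target is 0.0 or 1.0."""
    p = sigmoid(logit)
    p_t = p if target == 1.0 else 1.0 - p
    alpha_t = alpha if target == 1.0 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-8))


def focal_sum(logits, targets):
    """Sum of the focal terms over every (anchor, class) entry."""
    return sum(
        focal_term(x, t)
        for row_x, row_t in zip(logits, targets)
        for x, t in zip(row_x, row_t)
    )


# Toy setup (assumed numbers): all logits are zero, 10 anchors are positive.
num_anchors, num_classes, num_pos = 1000, 20, 10
logits = [[0.0] * num_classes for _ in range(num_anchors)]
targets = [[0.0] * num_classes for _ in range(num_anchors)]
for i in range(num_pos):
    targets[i][0] = 1.0

total = focal_sum(logits, targets)
loss_mean = total / (num_anchors * num_classes)  # what .mean() computes
loss_by_pos = total / max(num_pos, 1)            # sum normalized by positive anchors
print(loss_mean, loss_by_pos)
```

With all logits at zero the mean comes out near 0.13, close to the 0.12 reported in the first post, while the sum divided by the number of positive anchors is larger by the factor num_anchors * num_classes / num_pos (2000 in this toy setup). The two reductions differ only by that constant, so the small values may simply reflect the choice of normalizer.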

prakashjayy · 2018-03-05T07:40:29Z

Thanks for your response.

One more question:

When assigning anchor boxes, should we compute max_iou over each pyramid level separately, or do we need to compute all the anchor boxes and then check for max_iou?

In the first case I am getting around 9000 positive anchor boxes, and in the second case I am getting 200 positive anchor boxes.

Which one do you think is correct?

ChiefGodMan · 2018-03-05T14:29:53Z

It depends on which network you choose. For example, if you implement SSD, you may need to search anchor boxes on multi-layer features using a max_iou threshold. If you implement the Faster-RCNN series or YOLO, you only need to compute max_iou on the last-level feature. Yes, you need to compute all anchor boxes and then check for max_iou.
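
One possible reading of "compute all anchor boxes and then check for max_iou" is sketched below in plain Python. It is an illustration under assumptions, not the repository's assignment code: the function names `iou` and `assign_labels` are hypothetical, boxes are (x1, y1, x2, y2) tuples, and the 0.5 / 0.4 thresholds are assumed values (the ones commonly used for RetinaNet-style assignment). Label 0 is background and -1 is ignored, as in the first post.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(anchors_per_level, gt_boxes, gt_labels, pos_thr=0.5, neg_thr=0.4):
    """Pool the anchors of all pyramid levels, then label each anchor by its
    best-matching ground-truth box. Thresholds are assumptions."""
    all_anchors = [a for level in anchors_per_level for a in level]
    labels = []
    for anchor in all_anchors:
        ious = [iou(anchor, gt) for gt in gt_boxes]
        best = max(range(len(ious)), key=ious.__getitem__) if ious else None
        if best is None or ious[best] < neg_thr:
            labels.append(0)                 # background
        elif ious[best] >= pos_thr:
            labels.append(gt_labels[best])   # positive: class of the best match
        else:
            labels.append(-1)                # between thresholds: ignored
    return labels


# Two pyramid levels with two anchors each, one ground-truth box of class 7.
levels = [
    [(0, 0, 10, 10), (0, 0, 10, 22)],
    [(0, 0, 20, 20), (50, 50, 60, 60)],
]
labels = assign_labels(levels, gt_boxes=[(0, 0, 10, 10)], gt_labels=[7])
print(labels)  # [7, -1, 0, 0]
```

The anchors from every level go into one pool before thresholding, so the number of positives does not depend on how the levels are grouped.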

prakashjayy · 2018-03-05T17:06:59Z

Thanks, man, for your time. I am using RetinaNet. Regards, Prakash V

ChiefGodMan · 2018-03-07T01:32:46Z

You are welcome.
I'm curious about your final training result. Did it improve significantly?

prakashjayy · 2018-03-07T02:40:50Z

Yup, everything is working now as desired. I will be releasing the blog post soon. Thanks for your response, man. It really helped me :)
