-
I am working on adding an L1-norm loss term during training, aiming to obtain sparse network weights. Could you give me some advice?
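(For reference, adding the L1 penalty directly to the training loss could look like the minimal sketch below. All names and values here, such as `model` and `l1_lambda`, are illustrative assumptions, not code from this project:)

```python
import torch
import torch.nn as nn

# Illustrative setup; names and values are assumptions, not from this thread.
model = nn.Linear(10, 2)
l1_lambda = 1e-4  # strength of the sparsity penalty

x, y = torch.randn(4, 10), torch.randn(4, 2)
task_loss = nn.functional.mse_loss(model(x), y)

# L1 term: sum of absolute values of all trainable parameters.
l1_term = sum(p.abs().sum() for p in model.parameters())
loss = task_loss + l1_lambda * l1_term
loss.backward()
```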
Replies: 3 comments
-
Since we calculate the loss in the HEAD, I think the best way is to modify the optimizer. For example, `torch.optim.SGD` directly computes the gradient of L2 regularization (½w² → w) in its weight-decay step. Maybe you can do the same for L1, whose (sub)gradient is sign(w).
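(A minimal sketch of the gradient-level approach suggested above: after backprop, add the L1 (sub)gradient λ·sign(w) to each parameter's gradient, mirroring how SGD's `weight_decay` adds λ·w for L2. The setup below is an illustrative assumption, not code from this project:)

```python
import torch
import torch.nn as nn

# Illustrative setup; names and values are assumptions, not from the thread.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
l1_lambda = 1e-4

x, y = torch.randn(4, 10), torch.randn(4, 2)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Mirror SGD's weight_decay step (which adds lambda * w, the gradient of
# (lambda/2) * w^2) by adding the L1 (sub)gradient lambda * sign(w).
with torch.no_grad():
    for p in model.parameters():
        if p.grad is not None:
            p.grad.add_(l1_lambda * torch.sign(p))

optimizer.step()
optimizer.zero_grad()
```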
-
It has been answered by @rogercmq.
-
Problem solved. Thx!