How to add a sparse loss during training #632

Answered by congee524
rogercmq asked this question in General
Since the loss is calculated in the head, I think the best way is to modify the optimizer.

For example, torch.optim.SGD does not add an L2 term to the loss; it adds the gradient of the L2 penalty directly to the parameter gradient (weight_decay * 1/2 w^2 ---> weight_decay * w):

if weight_decay != 0:
    d_p = d_p.add(p, alpha=weight_decay)

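Maybe you can add the subgradient of an L1 penalty in the same place (weight_decay_l1 * |w| ---> weight_decay_l1 * sign(w)), where weight_decay_l1 is a new, hypothetical option:

if weight_decay_l1 != 0:
    d_p = d_p.add(torch.sign(p), alpha=weight_decay_l1)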
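
To make the effect of the two penalty terms concrete, here is a minimal, framework-free sketch of a single SGD step on plain Python floats. It is not the torch.optim.SGD implementation. The function name `sgd_step`, the helper `sign`, and the `weight_decay_l1` parameter are assumptions made up for illustration, and momentum and the other SGD options are left out.

```python
def sign(x):
    """Return -1.0, 0.0, or 1.0 (0.0 is a common subgradient choice at x == 0)."""
    return float((x > 0) - (x < 0))


def sgd_step(params, grads, lr, weight_decay=0.0, weight_decay_l1=0.0):
    """One plain SGD step on lists of floats with optional L2 and L1 penalties.

    Hypothetical helper for illustration only; it mirrors the d_p update
    quoted above but does not use or modify PyTorch.
    """
    new_params = []
    for p, d_p in zip(params, grads):
        if weight_decay != 0:
            # Gradient of (weight_decay / 2) * p**2 is weight_decay * p.
            d_p = d_p + weight_decay * p
        if weight_decay_l1 != 0:
            # Subgradient of weight_decay_l1 * |p| is weight_decay_l1 * sign(p).
            d_p = d_p + weight_decay_l1 * sign(p)
        new_params.append(p - lr * d_p)
    return new_params


# With zero data gradients, only the L1 term moves the weights:
# each nonzero weight shifts toward zero by lr * weight_decay_l1.
weights = [0.5, -0.2, 0.0]
zero_grads = [0.0, 0.0, 0.0]
print(sgd_step(weights, zero_grads, lr=0.1, weight_decay_l1=0.1))
```

Unlike the L2 term, whose pull is proportional to the weight, the L1 term pulls every nonzero weight toward zero by the same amount per step, which is what encourages sparsity. In PyTorch the sign would come from `torch.sign(p)` rather than the helper above.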

Replies: 3 comments

Answer selected by innerlee

This discussion was converted from issue #622 on February 23, 2021 12:05.