Which knowledge-distillation loss is best in "kd_losses"? #3

Open
yang0817manman opened this issue Sep 16, 2020 · 1 comment

@yang0817manman
Thanks for your sharing. Can you tell me which loss function in "kd_losses" is best for a classification task?

@AberHu
Owner

AberHu commented Oct 14, 2020

Sorry for the late reply. From my perspective, different KD losses are suitable for different tasks. For classification, the original KD (soft target) is fine, because it can be treated as a variation of label smoothing regularization. You may tune the temperature and the trade-off parameters in soft target. Two other KD losses I recommend for classification are SP and CC. Hope this helps.
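
For reference, here is a minimal sketch of the soft-target (Hinton et al.) KD loss the reply refers to, with the temperature `T` and trade-off weight exposed for tuning. The class and argument names are illustrative assumptions, not necessarily the ones used in this repository.

```python
import torch.nn as nn
import torch.nn.functional as F

class SoftTarget(nn.Module):
    """Soft-target KD loss: KL divergence between temperature-softened
    student and teacher logits, scaled by T^2 (Hinton et al.)."""
    def __init__(self, T=4.0):
        super().__init__()
        self.T = T

    def forward(self, student_logits, teacher_logits):
        log_p_s = F.log_softmax(student_logits / self.T, dim=1)
        p_t = F.softmax(teacher_logits / self.T, dim=1)
        # 'batchmean' matches the mathematical definition of KL divergence
        return F.kl_div(log_p_s, p_t, reduction='batchmean') * (self.T ** 2)

# Typical usage (lambda_kd is the trade-off parameter mentioned above):
# loss = F.cross_entropy(student_logits, labels) \
#        + lambda_kd * SoftTarget(T=4.0)(student_logits, teacher_logits)
```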
