Hi @AberHu,
Thank you very much for your work.
I am currently working on using CRD loss to train my face recognition model, since your work and the original paper show that CRD loss combined with KL-divergence distillation currently outperforms the other methods. However, I found that it requires two memory buffers (one each for the student and teacher embeddings of the whole dataset), which makes it infeasible when the dataset is really large. So I wonder if there is a lighter, more memory-efficient way to implement it.
Looking forward to your reply. Thanks.
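For reference, below is a minimal sketch of one possible memory-buffer-free alternative: an InfoNCE-style contrastive distillation loss that uses only in-batch negatives instead of the CRD memory banks, so its memory cost scales with the batch size rather than the dataset size. The class name `InBatchCRDLoss`, the feature dimensions, and the temperature value are illustrative assumptions, not part of the original CRD implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InBatchCRDLoss(nn.Module):
    """Contrastive distillation with in-batch negatives only (no memory banks).

    For each student embedding, the matching teacher embedding in the same
    mini-batch is the positive; all other teacher embeddings in the batch
    serve as negatives (InfoNCE over the batch). This is a simplified,
    hypothetical variant, not the original CRD formulation.
    """

    def __init__(self, student_dim, teacher_dim, feat_dim=128, temperature=0.07):
        super().__init__()
        # Project both networks into a shared embedding space.
        self.embed_s = nn.Linear(student_dim, feat_dim)
        self.embed_t = nn.Linear(teacher_dim, feat_dim)
        self.temperature = temperature

    def forward(self, feat_s, feat_t):
        # feat_s: (B, student_dim) student features
        # feat_t: (B, teacher_dim) teacher features (teacher is frozen)
        z_s = F.normalize(self.embed_s(feat_s), dim=1)
        z_t = F.normalize(self.embed_t(feat_t.detach()), dim=1)

        # (B, B) similarity matrix; diagonal entries are the positive pairs.
        logits = z_s @ z_t.t() / self.temperature
        targets = torch.arange(logits.size(0), device=logits.device)
        return F.cross_entropy(logits, targets)


# Example usage (dimensions are illustrative):
# criterion = InBatchCRDLoss(student_dim=512, teacher_dim=2048)
# loss = criterion(student_features, teacher_features)
```

This trades some of the benefit of CRD's large negative pool for a fixed memory footprint; using a larger batch size (or a small FIFO queue of recent teacher embeddings) can partly compensate.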