
# Data-Free Adversarial Distillation (DFAD)

> [Data-Free Adversarial Distillation](https://arxiv.org/abs/1912.11006)

## Abstract

Knowledge Distillation (KD) has made remarkable progress in the last few years and has become a popular paradigm for model compression and knowledge transfer. However, almost all existing KD algorithms are data-driven, i.e., they rely on a large amount of original training data or alternative data, which is usually unavailable in real-world scenarios. In this paper, we address this challenging problem and propose a novel adversarial distillation mechanism to craft a compact student model without any real-world data. We introduce a model discrepancy to quantitatively measure the difference between the student and teacher models and construct an optimizable upper bound on it. In our work, the student and the teacher jointly play the role of the discriminator to reduce this discrepancy, while a generator adversarially produces "hard samples" to enlarge it. Extensive experiments demonstrate that the proposed data-free method achieves performance comparable to existing data-driven methods. More strikingly, our approach can be directly extended to semantic segmentation, which is more complicated than classification, and achieves state-of-the-art results there.

*Figure: DFAD pipeline.*
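
To make the mechanism above concrete, the sketch below alternates a generator step that enlarges the student-teacher discrepancy with a student step that reduces it, using an L1 distance between logits as the discrepancy. This is a minimal PyTorch-style sketch, not the repository's implementation: `dfad_step`, `z_dim`, `batch`, and the one-step-each schedule are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def dfad_step(generator, teacher, student, g_opt, s_opt,
              z_dim=256, batch=128, device="cuda"):
    """One adversarial round (illustrative only, not the repo's API):
    the generator enlarges the student-teacher discrepancy,
    then the student reduces it."""
    teacher.eval()  # the pretrained teacher stays frozen throughout

    # Generator step: synthesize "hard samples" on which student and teacher disagree.
    z = torch.randn(batch, z_dim, device=device)
    fake = generator(z)
    with torch.no_grad():
        t_logits = teacher(fake)
    s_logits = student(fake)
    # Negative L1 discrepancy: the generator maximizes what the student will minimize.
    g_loss = -F.l1_loss(s_logits, t_logits)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    # Student step: imitate the teacher on freshly generated samples.
    with torch.no_grad():
        fake = generator(torch.randn(batch, z_dim, device=device))
        t_logits = teacher(fake)
    s_logits = student(fake)
    s_loss = F.l1_loss(s_logits, t_logits)
    s_opt.zero_grad()
    s_loss.backward()
    s_opt.step()
    return g_loss.item(), s_loss.item()
```

The paper measures the discrepancy with a mean absolute error between logits; schedules such as several student updates per generator update are common in data-free distillation but are omitted here for brevity.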

## Results and models

### Classification

| Location | Dataset | Teacher  | Student  |  Acc  | Acc(T) | Acc(S) | Config | Download                |
| :------: | :-----: | :------: | :------: | :---: | :----: | :----: | :----: | :---------------------- |
|  logits  | Cifar10 | resnet34 | resnet18 | 92.80 | 95.34  | 94.82  | config | teacher \| model \| log |

## Citation

```bibtex
@article{fang2019data,
  title={Data-free adversarial distillation},
  author={Fang, Gongfan and Song, Jie and Shen, Chengchao and Wang, Xinchao and Chen, Da and Song, Mingli},
  journal={arXiv preprint arXiv:1912.11006},
  year={2019}
}
```