Knowledge Diffusion for Distillation (DiffKD)

Official implementation of the paper "Knowledge Diffusion for Distillation" (DiffKD), NeurIPS 2023.


Reproducing our results

git clone https://github.com/hunto/DiffKD.git --recurse-submodules
cd DiffKD

The implementation of DiffKD is in classification/lib/models/losses/diffkd.
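To get a feel for what the loss does before reading the code, the sketch below illustrates the general idea in plain PyTorch: student features are treated as noisy versions of teacher features, a small denoising module (trained here with a one-step denoising objective on teacher features) refines them, and distillation is computed between the denoised student features and the teacher features. This is a conceptual sketch only, not the repository's API; the names SimpleDenoiser and diffkd_style_loss, the single-step noise model, and the equal loss weighting are assumptions made for illustration.

# Conceptual sketch only -- NOT the repository's API. Names and the
# one-step denoising surrogate are hypothetical; see
# classification/lib/models/losses/diffkd for the official implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleDenoiser(nn.Module):
    """Tiny stand-in for the denoising module that refines noisy
    (student) features toward the teacher feature distribution."""
    def __init__(self, channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def diffkd_style_loss(student_feat, teacher_feat, denoiser, noise_std=0.1):
    """Illustrative two-part objective:
    1) train the denoiser to reconstruct teacher features from noised
       copies (a one-step surrogate for the diffusion process), and
    2) distill by matching denoised student features to the teacher.
    """
    # (1) denoising objective on teacher features
    noised_teacher = teacher_feat + noise_std * torch.randn_like(teacher_feat)
    denoise_loss = F.mse_loss(denoiser(noised_teacher), teacher_feat.detach())

    # (2) distillation on denoised student features
    denoised_student = denoiser(student_feat)
    distill_loss = F.mse_loss(denoised_student, teacher_feat.detach())
    return denoise_loss + distill_loss

# Toy usage with random feature maps of matching shape.
if __name__ == "__main__":
    s_feat = torch.randn(4, 256, 7, 7)   # student feature map
    t_feat = torch.randn(4, 256, 7, 7)   # teacher feature map (same shape assumed)
    denoiser = SimpleDenoiser(256)
    loss = diffkd_style_loss(s_feat, t_feat, denoiser)
    loss.backward()

Refer to the module in classification/lib/models/losses/diffkd for the exact formulation used in the paper.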

ImageNet

cd classification
sh tools/dist_train.sh 8 ${CONFIG} ${MODEL} --teacher-model ${T_MODEL} --experiment ${EXP_NAME}

Here, ${CONFIG} is the distillation config, ${MODEL} is the student model, ${T_MODEL} is the teacher model, and ${EXP_NAME} is the experiment name. For example, to reproduce DiffKD with a ResNet-34 teacher and a ResNet-18 student under the B1 baseline setting:

sh tools/dist_train.sh 8 configs/strategies/distill/diffkd/diffkd_b1.yaml tv_resnet18 --teacher-model tv_resnet34 --experiment diffkd_res34_res18

License

This project is released under the Apache 2.0 license.

Citation

@article{huang2023knowledge,
  title={Knowledge Diffusion for Distillation},
  author={Huang, Tao and Zhang, Yuan and Zheng, Mingkai and You, Shan and Wang, Fei and Qian, Chen and Xu, Chang},
  journal={arXiv preprint arXiv:2305.15712},
  year={2023}
}