Awesome Knowledge Distillation Papers
Early Papers
- Model Compression, Rich Caruana, 2006
- Distilling the Knowledge in a Neural Network, Geoffrey Hinton, Oriol Vinyals, Jeff Dean, 2015 (soft-target loss sketched after this list)
- Knowledge Acquisition from Examples Via Multiple Models, Pedro Domingos, 1997
- Combining labeled and unlabeled data with co-training, A. Blum, T. Mitchell, 1998
- Using A Neural Network to Approximate An Ensemble of Classifiers, Xinchuan Zeng and Tony R. Martinez, 2000
- Do Deep Nets Really Need to be Deep?, Lei Jimmy Ba, Rich Caruana, 2014
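For readers new to the topic, below is a minimal sketch of the temperature-scaled soft-target loss popularized by Distilling the Knowledge in a Neural Network (Hinton et al., 2015). The PyTorch-style helper, its name, and the default temperature/weighting values are illustrative assumptions, not reference code from the paper.

```python
# Hedged sketch: classic soft-target distillation loss in PyTorch.
# `distillation_loss`, T=4.0 and alpha=0.9 are illustrative choices, not
# values prescribed by Hinton et al. (2015).
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft part: KL divergence between temperature-softened teacher and student
    # distributions, scaled by T^2 so its gradient magnitude matches the hard loss.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard part: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```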
Recommended Papers
- FitNets: Hints for Thin Deep Nets, Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2015 (hint-layer loss sketched after this list)
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016
- A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, Junho Yim, Donggyu Joo, Jihoon Bae, Junmo Kim, 2017
- Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang
- Born Again Neural Networks, Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar, 2018
- Net2Net: Accelerating Learning Via Knowledge Transfer, Tianqi Chen, Ian Goodfellow, Jonathon Shlens, 2016
- Unifying distillation and privileged information, David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, Vladimir Vapnik, 2015
- Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks, Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, Ananthram Swami, 2016
- Large scale distributed neural network training through online distillation, Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl, Geoffrey E. Hinton, 2018
- Deep Mutual Learning, Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu, 2017
- Learning Loss for Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang, 2017
- Data-Free Knowledge Distillation for Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, Thad Starner, 2017
- Quantization Mimic: Towards Very Tiny CNN for Object Detection, Yi Wei, Xinyu Pan, Hongwei Qin, Wanli Ouyang, Junjie Yan, 2018
- Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017
- Moonshine: Distilling with Cheap Convolutions, Elliot J. Crowley, Gavin Gray, Amos Storkey, 2017
- Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving, Jiaolong Xu, Peng Wang, Heng Yang and Antonio M. López, 2018
- Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2017
- Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher, Seyed-Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Hassan Ghasemzadeh, 2019
- ResKD: Residual-Guided Knowledge Distillation, Xuewei Li, Songyuan Li, Bourahla Omar, and Xi Li, 2020
- Rethinking Data Augmentation: Self-Supervision and Self-Distillation, Hankook Lee, Sung Ju Hwang, Jinwoo Shin, 2019
- MSD: Multi-Self-Distillation Learning via Multi-classifiers within Deep Neural Networks, Yunteng Luan, Hanyu Zhao, Zhi Yang, Yafei Dai, 2019
- Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation, Linfeng Zhang, Jiebo Song, Anni Gao, Jingwei Chen, Chenglong Bao, Kaisheng Ma, 2019
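Several of the papers above (FitNets, attention transfer, and their successors) distill intermediate features rather than logits. Below is a minimal sketch of a FitNets-style hint loss, assuming a convolutional student/teacher pair; the module name, the 1x1 regressor, and the layer choice are illustrative assumptions rather than the paper's official code.

```python
# Hedged sketch: FitNets-style hint loss (Romero et al., 2015).
# A learned 1x1 regressor maps student features to the teacher's channel
# dimension before an L2 comparison; all names/shapes are illustrative.
import torch.nn as nn
import torch.nn.functional as F

class HintLoss(nn.Module):
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # The student is usually thinner, so project it up to the teacher width.
        self.regressor = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat, teacher_feat):
        # Teacher features are treated as fixed targets (no gradient to the teacher).
        return F.mse_loss(self.regressor(student_feat), teacher_feat.detach())
```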
2016
- Cross Modal Distillation for Supervision Transfer, Saurabh Gupta, Judy Hoffman, Jitendra Malik, CVPR 2016
- Deep Model Compression: Distilling Knowledge from Noisy Teachers, Bharat Bhusan Sau, Vineeth N. Balasubramanian, 2016
- Knowledge Distillation for Small-footprint Highway Networks, Liang Lu, Michelle Guo, Steve Renals, 2016
- Recurrent Neural Network Training with Dark Knowledge Transfer, Zhiyuan Tang, Dong Wang, Zhiyong Zhang, 2016
- Face Model Compression by Distilling Knowledge from Neurons, Ping Luo, Zhenyao Zhu, Ziwei Liu, Xiaogang Wang, and Xiaoou Tang, 2016
- Sequence-Level Knowledge Distillation, Yoon Kim, Alexander M. Rush, EMNLP 2016
- Distilling Word Embeddings: An Encoding Approach, Lili Mou, Ran Jia, Yan Xu, Ge Li, Lu Zhang, Zhi Jin, CIKM 2016
2017
- Data Distillation: Towards Omni-Supervised Learning, Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He, CVPR 2017
- Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer, Zehao Huang, Naiyan Wang, 2017
- Data-Free Knowledge Distillation For Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, Thad Starner, 2017
- DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, 2017
- Adapting Models to Signal Degradation using Distillation, Jong-Chyi Su, Subhransu Maji, BMVC 2017
- Cross-lingual Distillation for Text Classification, Ruochen Xu, Yiming Yang, ACL 2017, code
2018
- Learning Global Additive Explanations for Neural Nets Using Model Distillation, Sarah Tan, Rich Caruana, Giles Hooker, Paul Koch, Albert Gordo, 2018
- YASENN: Explaining Neural Networks via Partitioning Activation Sequences, Yaroslav Zharov, Denis Korzhenkov, Pavel Shvechikov, Alexander Tuzhilin, 2018
- Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, Antti Tarvainen, Harri Valpola, 2018
- Local Affine Approximators for Improving Knowledge Transfer, Suraj Srinivas & François Fleuret, 2018
- Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?, Shilin Zhu, Xin Dong, Hao Su, 2018
- Probabilistic Knowledge Transfer for deep representation learning, Nikolaos Passalis, Anastasios Tefas, 2018
- Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons, Byeongho Heo, Minsik Lee, Sangdoo Yun, Jin Young Choi, 2018
- Paraphrasing Complex Network: Network Compression via Factor Transfer, Jangho Kim, SeongUk Park, Nojun Kwak, NIPS, 2018
- KDGAN: Knowledge Distillation with Generative Adversarial Networks, Xiaojie Wang, Rui Zhang, Yu Sun, Jianzhong Qi, NeurIPS 2018
- Distilling Knowledge for Search-based Structured Prediction, Yijia Liu, Wanxiang Che, Huaipeng Zhao, Bing Qin, Ting Liu, ACL 2018
2019
- Learning Efficient Detector with Semi-supervised Adaptive Distillation, Shitao Tang, Litong Feng, Zhanghui Kuang, Wenqi Shao, Quanquan Li, Wei Zhang, Yimin Chen, 2019
- Dataset Distillation, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros, 2019
- Relational Knowledge Distillation, Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho, 2019
- Knowledge Adaptation for Efficient Semantic Segmentation, Tong He, Chunhua Shen, Zhi Tian, Dong Gong, Changming Sun, Youliang Yan, 2019
- A Comprehensive Overhaul of Feature Distillation, Byeongho Heo, Jeesoo Kim, Sangdoo Yun, Hyojin Park, Nojun Kwak, Jin Young Choi, 2019, code
- Towards Understanding Knowledge Distillation, Mary Phuong, Christoph Lampert, ICML, 2019
- Knowledge Distillation from Internal Representations, Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Edward Guo, 2019
- Knowledge Flow: Improve Upon Your Teachers, Iou-Jen Liu, Jian Peng, Alexander G. Schwing, 2019
- Similarity-Preserving Knowledge Distillation, Frederick Tung, Greg Mori, 2019
- Correlation Congruence for Knowledge Distillation, Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang, 2019
- Variational Information Distillation for Knowledge Transfer, Sungsoo Ahn, Shell Xu Hu, Andreas Damianou, Neil D. Lawrence, Zhenwen Dai, 2019
- Knowledge Distillation via Instance Relationship Graph, Yufan Liu, Jiajiong Cao, Bing Li, Chunfeng Yuan, Weiming Hu, Yangxi Li, Yunqiang Duan, CVPR 2019
- Structured Knowledge Distillation for Semantic Segmentation, Yifan Liu, Changyong Shu, Jingdong Wang, Chunhua Shen, CVPR 2019
- Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention, Xiangyu Duan, Mingming Yin, Min Zhang, Boxing Chen, Weihua Luo, ACL 2019, code
- Distilling Task-Specific Knowledge from BERT into Simple Neural Networks, Raphael Tang, Yao Lu, Linqing Liu, Lili Mou, Olga Vechtomova, Jimmy Lin, arXiv, 2019
- Multilingual Neural Machine Translation with Knowledge Distillation, Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu, ICLR 2019
- BAM! Born-Again Multi-Task Networks for Natural Language Understanding, Kevin Clark, Minh-Thang Luong, Urvashi Khandelwal, Christopher D. Manning, Quoc V. Le, ACL 2019
- Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, arXiv 2019
- Exploiting the Ground-Truth: An Adversarial Imitation Based Knowledge Distillation Approach for Event Detection, AAAI 2019
2020
- Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion, Hongxu Yin, Pavlo Molchanov, Zhizhong Li, Jose M. Alvarez, Arun Mallya, Derek Hoiem, Niraj K. Jha, Jan Kautz, 2020
- Reducing the Teacher-Student Gap via Spherical Knowledge Distillation, Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai, 2020
- Data-Free Adversarial Distillation, Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, Mingli Song, 2020
- Contrastive Representation Distillation, Yonglong Tian, Dilip Krishnan, Phillip Isola, ICLR 2020, code
- StyleGAN2 Distillation for Feed-forward Image Manipulation, Yuri Viazovetskyi, Vladimir Ivashkin, and Evgeny Kashin, ECCV 2020
- Distilling Knowledge from Graph Convolutional Networks, Yiding Yang, Jiayan Qiu, Mingli Song, Dacheng Tao, Xinchao Wang, CVPR 2020
- Self-supervised Knowledge Distillation for Few-shot Learning, Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah, 2020, code
- Online Knowledge Distillation with Diverse Peers, Defang Chen, Jian-Ping Mei, Can Wang, Yan Feng and Chun Chen, AAAI, 2020
- Intra-class Feature Variation Distillation for Semantic Segmentation, Yukang Wang, Wei Zhou, Tao Jiang, Xiang Bai, and Yongchao Xu, ECCV 2020
- Exclusivity-Consistency Regularized Knowledge Distillation for Face Recognition, Xiaobo Wang, Tianyu Fu, Shengcai Liao, Shuo Wang, Zhen Lei, and Tao Mei, ECCV 2020
- Improving Face Recognition from Hard Samples via Distribution Distillation Loss, Yuge Huang, Pengcheng Shen, Ying Tai, Shaoxin Li, Xiaoming Liu, Jilin Li, Feiyue Huang, Rongrong Ji, ECCV 2020
- Distilling Knowledge Learned in BERT for Text Generation, Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu, ACL 2020, code
2021
- Dataset Distillation with Infinitely Wide Convolutional Networks, Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee, 2021
- Dataset Meta-Learning from Kernel Ridge-Regression, Timothy Nguyen, Zhourong Chen, Jaehoon Lee, 2021
- Up to 100× Faster Data-free Knowledge Distillation, Gongfan Fang, Kanya Mo, Xinchao Wang, Jie Song, Shitao Bei, Haofei Zhang, Mingli Song, 2021
- Robustness and Diversity Seeking Data-Free Knowledge Distillation, Pengchao Han, Jihong Park, Shiqiang Wang, Yejun Liu, 2021
- Data-Free Knowledge Transfer: A Survey, Yuang Liu, Wei Zhang, Jun Wang, Jianyong Wang, 2021
- Undistillable: Making A Nasty Teacher That CANNOT teach students, Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang, ICLR 2021
- QuPeD: Quantized Personalization via Distillation with Applications to Federated Learning, Kaan Ozkara, Navjot Singh, Deepesh Data, Suhas Diggavi, NeurIPS 2021
- KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation, Yongfei Liu, Chenfei Wu, Shao-yen Tseng, Vasudev Lal, Xuming He, Nan Duan
- Online Knowledge Distillation for Efficient Pose Estimation, Zheng Li, Jingwen Ye, Mingli Song, Ying Huang, Zhigeng Pan, ICCV 2021
- Does Knowledge Distillation Really Work?, Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, Andrew Gordon Wilson, NeurIPS 2021
- Hierarchical Self-supervised Augmented Knowledge Distillation, Chuanguang Yang, Zhulin An, Linhang Cai, Yongjun Xu, IJCAI 2021
- DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis With GANs, Javier Nistal, Stefan Lattner, Gaël Richard, ISMIR 2021
- On Self-Distilling Graph Neural Network, Yuzhao Chen, Yatao Bian, Xi Xiao, Yu Rong, Tingyang Xu, Junzhou Huang, IJCAI 2021
- Graph-Free Knowledge Distillation for Graph Neural Networks, Xiang Deng, Zhongfei Zhang, IJCAI 2021
- Self Supervision to Distillation for Long-Tailed Visual Recognition, Tianhao Li, Limin Wang, Gangshan Wu, ICCV 2021
- Cross-Layer Distillation with Semantic Calibration, Defang Chen, Jian-Ping Mei, Yuan Zhang, Can Wang, Zhe Wang, Yan Feng, Chun Chen, AAAI 2021
- Channel-wise Knowledge Distillation for Dense Prediction, Changyong Shu, Yifan Liu, Jianfei Gao, Zheng Yan, Chunhua Shen, ICCV 2021
- Training data-efficient image transformers & distillation through attention, Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou, ICML 2021
- Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation, Li Liu, Qingle Huang, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Xiaodan Liang, ICCV 2021, code
- torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation, Yoshitomo Matsubara, International Workshop on Reproducible Research in Pattern Recognition 2021, code
2022
- LGD: Label-guided Self-distillation for Object Detection, Peizhen Zhang, Zijian Kang, Tong Yang, Xiangyu Zhang, Nanning Zheng, Jian Sun, AAAI 2022
- MonoDistill: Learning Spatial Features for Monocular 3D Object Detection, Anonymous, ICLR 2022
- Bag of Instances Aggregation Boosts Self-supervised Distillation, Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian, ICLR 2022
- Meta Learning for Knowledge Distillation, Wangchunshu Zhou, Canwen Xu, Julian McAuley, 2022
- Focal and Global Knowledge Distillation for Detectors, Zhendong Yang, Zhe Li, Xiaohu Jiang, Yuan Gong, Zehuan Yuan, Danpei Zhao, Chun Yuan, CVPR 2022
- Self-Distilled StyleGAN: Towards Generation from Internet Photos, Ron Mokady, Michal Yarom, Omer Tov, Oran Lang, Daniel Cohen-Or, Tali Dekel, Michal Irani, Inbar Mosseri, 2022
- Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-guided Feature Imitation, Gang Li, Xiang Li, Yujie Wang, Shanshan Zhang, Yichao Wu, Ding Liang, AAAI 2022
- Decoupled Knowledge Distillation, Borui Zhao, Quan Cui, Renjie Song, Yiyu Qiu, Jiajun Liang, CVPR 2022, code
- Graph Flow: Cross-layer Graph Flow Distillation for Dual-Efficient Medical Image Segmentation, Wenxuan Zou, Muyi Sun, 2022
- Dataset Distillation by Matching Training Trajectories, George Cazenavette, Tongzhou Wang, Antonio Torralba, Alexei A. Efros, Jun-Yan Zhu, CVPR 2022
- Knowledge Distillation with the Reused Teacher Classifier, Defang Chen, Jian-Ping Mei, Hailin Zhang, Can Wang, Yan Feng, Chun Chen, CVPR 2022
- Self-Distillation from the Last Mini-Batch for Consistency Regularization, Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo, CVPR 2022, code
- DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers, Xianing Chen, Qiong Cao, Yujie Zhong, Shenghua Gao, CVPR 2022
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning, Lin Zhang, Li Shen, Liang Ding, Dacheng Tao, Ling-Yu Duan, CVPR 2022
- LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection, Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jiwen Lu, Jie Zhou, 2022
- Localization Distillation for Dense Object Detection, Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, Wangmeng Zuo, Qibin Hou, Ming-Ming Cheng, CVPR 2022, code
- Localization Distillation for Object Detection, Zhaohui Zheng, Rongguang Ye, Qibin Hou, Dongwei Ren, Ping Wang, Wangmeng Zuo, Ming-Ming Cheng, 2022, code
- Cross-Image Relational Knowledge Distillation for Semantic Segmentation, Chuanguang Yang, Helong Zhou, Zhulin An, Xue Jiang, Yongjun Xu, Qian Zhang, CVPR 2022, code
- Knowledge distillation: A good teacher is patient and consistent, Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil, Alexander Kolesnikov, CVPR 2022
- Spot-adaptive Knowledge Distillation, Jie Song, Ying Chen, Jingwen Ye, Mingli Song, TIP 2022, code
- MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning, Shiming Chen, Ziming Hong, Guo-Sen Xie, Wenhan Yang, Qinmu Peng, Kai Wang, Jian Zhao, Xinge You, CVPR 2022
- Knowledge Distillation via the Target-aware Transformer, Sihao Lin, Hongwei Xie, Bing Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang, Gang Wang, CVPR 2022
- PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection, Linfeng Zhang, Runpei Dong, Hung-Shuo Tai, Kaisheng Ma, arXiv 2022, code
- Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation, Linfeng Zhang, Xin Chen, Xiaobing Tu, Pengfei Wan, Ning Xu, Kaisheng Ma, CVPR 2022
- Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation, Yixuan Wei, Han Hu, Zhenda Xie, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo, Tech Report 2022, code
- BERT Learns to Teach: Knowledge Distillation with Meta Learning, Wangchunshu Zhou, Canwen Xu, Julian McAuley, ACL 2022, code
- Nearest Neighbor Knowledge Distillation for Neural Machine Translation, Zhixian Yang, Renliang Sun, Xiaojun Wan, NAACL 2022
- Knowledge Condensation Distillation, Chenxin Li, Mingbao Lin, Zhiyuan Ding, Nie Lin, Yihong Zhuang, Yue Huang, Xinghao Ding, Liujuan Cao, ECCV 2022, code
- Masked Generative Distillation, Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, Chun Yuan, ECCV 2022, code
- DTG-SSOD: Dense Teacher Guidance for Semi-Supervised Object Detection, Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang
- Distilled Dual-Encoder Model for Vision-Language Understanding, Zekun Wang, Wenhui Wang, Haichao Zhu, Ming Liu, Bing Qin, Furu Wei, code
- Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection, Hongyu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun, ECCV 2022, code
- Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation, Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Han Hu, Yunhe Wang
- TinyViT: Fast Pretraining Distillation for Small Vision Transformers, Kan Wu, Jinnian Zhang, Houwen Peng, Mengchen Liu, Bin Xiao, Jianlong Fu, Lu Yuan, ECCV 2022
- Self-slimmed Vision Transformer, Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu, ICLR 2022
- KD-MVS: Knowledge Distillation Based Self-supervised Learning for MVS, Yikang Ding, Qingtian Zhu, Xiangyue Liu, Wentao Yuan, Haotian Zhang, Chi Zhang, ECCV 2022, code
- Rethinking Data Augmentation for Robust Visual Question Answering, Long Chen, Yuhang Zheng, Jun Xiao, ECCV 2022, code
- ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval, Yuxiang Lu, Yiding Liu, Jiaxiang Liu, Yunsheng Shi, Zhengjie Huang, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Shuaiqiang Wang, Dawei Yin, Haifeng Wang
- Prune Your Model Before Distill It, Jinhyuk Park, Albert No, ECCV 2022, code
- Efficient One Pass Self-distillation with Zipf's Label Smoothing, Jiajun Liang, Linze Li, Zhaodong Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan, ECCV 2022, code
- R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis, ECCV 2022, code
- D3Former: Debiased Dual Distilled Transformer for Incremental Learning, Abdelrahman Mohamed, Rushali Grandhe, KJ Joseph, Salman Khan, Fahad Khan, code
- SdAE: Self-distillated Masked Autoencoder, Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian, ECCV 2022, code
- MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition, Chuanguang Yang, Zhulin An, Helong Zhou, Linhang Cai, Xiang Zhi, Jiwen Wu, Yongjun Xu, Qian Zhang, ECCV 2022, code
- Mind the Gap in Distilling StyleGANs, Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy, ECCV 2022, code
- HIRE: Distilling high-order relational knowledge from heterogeneous graph neural networks, Jing Liu, Tongya Zheng, Qinfen Hao, Neurocomputing
- A Fast Knowledge Distillation Framework for Visual Recognition, Zhiqiang Shen, Eric Xing, ECCV 2022, code
- Knowledge Distillation from A Stronger Teacher, Tao Huang, Shan You, Fei Wang, Chen Qian, Chang Xu, NeurIPS 2022, code
- ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval, Nicola Messina, Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Fabrizio Falchi, Giuseppe Amato, Rita Cucchiara, CBMI 2022, code
- Towards Efficient 3D Object Detection with Knowledge Distillation, Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi, NeurIPS 2022, code
- Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher, Mehdi Rezagholizadeh, Aref Jafari, Puneeth Salad, Pranav Sharma, Ali Saheb Pasand, Ali Ghodsi, COLING 2022
- Noisy Self-Knowledge Distillation for Text Summarization, Yang Liu, Sheng Shen, Mirella Lapata, arXiv 2021
- On Distillation of Guided Diffusion Models, Chenlin Meng, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans, arXiv 2022
- ViTKD: Practical Guidelines for ViT feature knowledge distillation, Zhendong Yang, Zhe Li, Ailing Zeng, Zexian Li, Chun Yuan, Yu Li, arXiv 2022, code
- Self-Regulated Feature Learning via Teacher-free Feature Distillation, Lujun Li, ECCV 2022, code
- DETRDistill: A Universal Knowledge Distillation Framework for DETR-families, Jiahao Chang, Shuo Wang, Guangkai Xu, Zehui Chen, Chenhongyi Yang, Feng Zhao, arXiv 2022
- Learning to Explore Distillability and Sparsability: A Joint Framework for Model Compression, Yufan Liu, Jiajiong Cao, Bing Li, Weiming Hu, Stephen Maybank, TPAMI 2022
- Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing?, Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Yunqing Zhao, Ngai-Man Cheung, ICML 2022
2023
- Curriculum Temperature for Knowledge Distillation, Zheng Li, Xiang Li, Lingfeng Yang, Borui Zhao, Renjie Song, Lei Luo, Jun Li, Jian Yang, AAAI 2023, code
- Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling, Xin Ma, Chang Liu, Chunyu Xie, Long Ye, Yafeng Deng, Xiangyang Ji, arXiv 2023, code