Self-supervised Knowledge Distillation using Singular Value Decomposition
Cleaner code is available at https://github.com/sseung0703/KD_methods_with_TF.
Features
- Defines knowledge via singular value decomposition (SVD); see the sketch after this list.
- Fast and efficient training via multi-task learning
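Below is a minimal TensorFlow sketch of the general idea behind SVD-defined knowledge, not the paper's exact formulation: a feature map is compressed into its top-k singular vectors, and the student is trained to match the teacher's compressed representation. The helper names (svd_knowledge, distill_loss) and the choice of k are illustrative assumptions.

```python
import tensorflow as tf

def svd_knowledge(feature_map, k=4):
    # Illustrative helper (not the paper's exact post-processing):
    # compress a [batch, H, W, C] feature map into its top-k right
    # singular vectors, weighted by the singular values.
    b, h, w, c = feature_map.get_shape().as_list()
    x = tf.reshape(feature_map, [b, h * w, c])
    s, _, v = tf.svd(x, full_matrices=False)          # v: [batch, C, p]
    return v[:, :, :k] * tf.expand_dims(s[:, :k], 1)  # weight by singular values

def distill_loss(teacher_fm, student_fm, k=4):
    # The teacher's knowledge is a fixed target, so stop gradients through it.
    kt = tf.stop_gradient(svd_knowledge(teacher_fm, k))
    ks = svd_knowledge(student_fm, k)
    return tf.reduce_mean(tf.square(kt - ks))
```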
Requirements
- Tensorflow
- Scipy
Unfortunately, SVD is very slow on GPUs, so one of the following installation methods is recommended (a CPU-pinning sketch follows this list):
- Build TensorFlow from source with the SVD GPU op removed (recommended).
- Install stock TensorFlow and run the SVD op on the CPU.
- Install a TensorFlow version earlier than 1.2.
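If you use stock TensorFlow (the second option), one way to avoid the slow GPU kernel is to pin the SVD op to the CPU with a device scope. A minimal sketch, where feature_map stands in for any feature matrix produced by the network:

```python
import tensorflow as tf

feature_map = tf.random_normal([64, 128])  # stand-in for a real feature matrix

# Run only the SVD op on the CPU; the rest of the graph stays on the GPU.
with tf.device('/cpu:0'):
    s, u, v = tf.svd(feature_map, full_matrices=False)
```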
How to Use
The code is based on the TensorFlow-Slim example code, so it should be easy to follow if you have used that before. (If you want the recorded dataset and the trained teacher weights, download them from https://drive.google.com/open?id=1sNPkr-hOy3qKNUE8bu5qJpvCGFI7VOJN.)
1. Record the CIFAR-100 dataset to a TFRecord file.
2. Train the teacher network with train_teacher.py; the trained parameters are saved as trained_param.mat in the training path (see the sketch after this list).
3. Train the student network using either the teacher knowledge trained in step 2 or the default knowledge, which is an ImageNet-pretrained VGG16.
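Since the teacher parameters are stored as a .mat file, SciPy is needed to read them back. A minimal sketch for inspecting the saved weights (the key layout inside trained_param.mat is an assumption):

```python
import scipy.io as sio

# Load the teacher parameters saved by train_teacher.py.
params = sio.loadmat('trained_param.mat')
for name, value in params.items():
    if not name.startswith('__'):  # skip MATLAB metadata entries
        print(name, getattr(value, 'shape', None))
```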
Results
Paper
This research was accepted to the ECCV 2018 poster session. The arXiv version of the paper is available at https://arxiv.org/abs/1807.06819, and the ECCV-format paper is available at http://openaccess.thecvf.com/content_ECCV_2018/papers/SEUNG_HYUN_LEE_Self-supervised_Knowledge_Distillation_ECCV_2018_paper.pdf.