Knowledge Distillation via the Target-aware Transformer (CVPR2022)
This repository contains the semantic segmentation experiments of our work. See this link for the experiments on ImageNet.
Requirements
- python 3.8
- pytorch >= 1.9.0
- torchvision 0.11.1
- einops
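As a quick sanity check of the environment, a minimal sketch (this helper is ours and not part of the repo):

import sys
import torch
import torchvision
import einops

# Print the versions the experiments assume; adjust if your setup differs.
print("python:", sys.version.split()[0])           # expect 3.8.x
print("pytorch:", torch.__version__)               # expect >= 1.9.0
print("torchvision:", torchvision.__version__)     # expect 0.11.1
print("einops:", einops.__version__)
print("cuda available:", torch.cuda.is_available())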
Note
All experiments are conducted on a single Nvidia A100 (40 GB). Multi-GPU training has not been tested.
Overview
Before getting started
Please modify the dataset paths in mypath.py according to your system; a sketch of the expected layout follows.
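For illustration only, a minimal sketch of how such a path file is commonly laid out; the function name and dataset keys below are assumptions, so keep whatever interface mypath.py actually exposes:

# Hypothetical layout of a dataset-path helper; edit the paths for your system.
def db_root_dir(dataset):
    if dataset == 'pascal':
        return '/path/to/VOCdevkit/VOC2012'   # Pascal VOC root
    elif dataset == 'cocostuff10k':
        return '/path/to/cocostuff-10k'       # COCOStuff-10k root
    else:
        raise NotImplementedError('Unknown dataset: {}'.format(dataset))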
Implementation
Our model is implemented in ./distiller_tat.
We also provide an implementation of ReviewKD in ./distiller_reveiwkd and of other comparison methods (KD/FitNet/AT/ICKD) in ./distiller_comp.
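As a reference point for the comparison methods, here is a minimal sketch of the vanilla logit-based KD loss (Hinton et al.); the function below is ours and is not the interface used in ./distiller_comp:

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    # Soften both distributions and match them with KL divergence,
    # scaled by T^2 to keep gradient magnitudes comparable.
    # Works for logits of shape (N, C) or (N, C, H, W).
    log_p_s = F.log_softmax(student_logits / temperature, dim=1)
    p_t = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_s, p_t, reduction='batchmean') * temperature ** 2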
Execution
The training entry point is ./train_with_distillation_tat.
Training a teacher model
ResNet101 is used as the teacher backbone.
Pascal VOC
We use the official model. Please download the checkpoint from here and put it in ./pretrained/.
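For orientation, a minimal sketch of loading such a checkpoint into a teacher network; the file name is a placeholder and the torchvision DeepLabV3 model stands in for the repo's actual teacher architecture (the training scripts handle this for you):

import torch
from torchvision.models.segmentation import deeplabv3_resnet101

# Placeholder teacher; the repo's teacher architecture and checkpoint keys may differ.
teacher = deeplabv3_resnet101(num_classes=21)  # 21 classes for Pascal VOC

# Hypothetical file name; use the checkpoint you downloaded into ./pretrained/.
checkpoint = torch.load('./pretrained/teacher_resnet101.pth', map_location='cpu')
state_dict = checkpoint.get('state_dict', checkpoint)  # some checkpoints wrap the weights
teacher.load_state_dict(state_dict, strict=False)      # strict=False since the architectures may not match exactly
teacher.eval()                                          # the teacher stays frozen during distillation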
COCOStuff-10k
We train the teacher ourselves. You may download the checkpoint from here, or simply run:
sh ./train_cocostuff10k_baseline.sh
Training with distillation
Please refer to the shell scripts. For instance, to distill ResNet101 into ResNet18 on Pascal VOC:
sh ./train_voc_resnet18.sh
TO-DO
- Upload the pre-trained COCOStuff-10k teacher model
- Upload training logs
- Add dataset preparation instructions