Knowledge Distillation via the Target-aware Transformer (CVPR2022)

This repository contains the semantic segmentation experiments of our work. See this link for the experiments on ImageNet.

Requirements

Note

All experiments are conducted on a single NVIDIA A100 (40 GB). Multi-GPU environments have not been tested.
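
As a quick sanity check of your environment, the snippet below (not part of this repository) prints the name and memory of the visible GPU using standard PyTorch calls:

import torch

# Print the name and memory of the visible GPU as a quick environment check.
assert torch.cuda.is_available(), "No CUDA GPU found"
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB")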

Overview

Before getting started

Please modify the dataset paths in mypath.py according to your system.
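
For reference, the sketch below shows what mypath.py may look like, following the Path class used by pytorch-deeplab-xception (acknowledged below); the dataset keys and returned paths are illustrative and may differ from the actual file in this repository:

# mypath.py -- illustrative sketch; the actual file may differ.
class Path(object):
    @staticmethod
    def db_root_dir(dataset):
        # Return the root directory of each dataset on your machine.
        if dataset == 'pascal':
            return '/path/to/VOCdevkit/VOC2012/'
        elif dataset == 'cocostuff10k':
            return '/path/to/cocostuff-10k/'
        else:
            raise NotImplementedError(f'Dataset {dataset} is not available.')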

Implementation

Our model is implemented in ./distiller_tat.

We also provide an implementation of ReviewKD in ./distiller_reveiwkd, and of the other compared methods (KD/FitNet/AT/ICKD) in ./distiller_comp.
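
For orientation, the sketch below shows what the simplest of these baselines, vanilla logit KD, computes. It is only an illustration and is not the target-aware transformer distiller in ./distiller_tat, which matches intermediate feature maps instead:

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    # KL divergence between temperature-softened teacher and student
    # predictions (Hinton et al.), scaled by T^2 as usual.
    log_p_s = F.log_softmax(student_logits / temperature, dim=1)
    p_t = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_s, p_t, reduction='batchmean') * temperature ** 2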

Execution

The training entry point is ./train_with_distillation_tat.

Training a teacher model

ResNet101 is used as the teacher backbone.

Pascal VOC

We use the official model. Please download the checkpoint from here and put it in ./pretrained/.
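
Once downloaded, the checkpoint can be restored with standard PyTorch calls. The file name and the commented-out model constructor below are placeholders, not the actual names used in this repository:

import torch

# File name is a placeholder for the checkpoint placed in ./pretrained/.
checkpoint = torch.load('./pretrained/teacher_voc_resnet101.pth', map_location='cpu')
# Checkpoints are often wrapped in a dict such as {'state_dict': ...}.
state_dict = checkpoint['state_dict'] if isinstance(checkpoint, dict) and 'state_dict' in checkpoint else checkpoint
# teacher = build_deeplab_resnet101(num_classes=21)  # hypothetical constructor
# teacher.load_state_dict(state_dict)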

COCOStuff-10k

We train the teacher ourselves. You may download the checkpoint from here, or simply run:

sh ./train_cocostuff10k_baseline.sh

Training with distillation

Please refer to the shell scripts. For instance, to distill ResNet101 into ResNet18 on Pascal VOC:

sh ./train_voc_resnet18.sh

TO-DO

Acknowledgments

pytorch-deeplab-xception

DeepLab with PyTorch

Overhaul distillation

RepDistiller

ReviewKD