Official PyTorch Implementation of DToP

Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation

Quan Tang, Bowen Zhang, Jiajun Liu, Fagui Liu, Yifan Liu

ICCV 2023. [arxiv]

This repository contains the official PyTorch implementation of DToP, including the training and evaluation code and the pretrained models.

As shown in the following figure, the network is naturally split into stages using inherent auxiliary blocks.

<img src="./resources/fig-1-1.png">
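The stage-wise idea can be illustrated with a toy sketch. It assumes, beyond what this README states, that the auxiliary block at the end of a stage yields per-token class probabilities, and that tokens whose top probability reaches a threshold are finalized and dropped from later stages. The function name `prune_confident_tokens`, the `threshold` parameter, and the value 0.95 are hypothetical and are not taken from the repository code.

```python
from typing import Dict, List, Tuple


def prune_confident_tokens(
    probs: Dict[int, List[float]],
    threshold: float = 0.95,  # hypothetical value, not from the repository
) -> Tuple[Dict[int, int], List[int]]:
    """Split tokens into finalized (early-exit) and remaining ones.

    `probs` maps a token index to the class probabilities predicted by an
    auxiliary block at the end of a stage (an assumption of this sketch).
    """
    finalized: Dict[int, int] = {}  # token index -> predicted class
    remaining: List[int] = []       # tokens passed on to the next stage
    for idx, p in probs.items():
        best_class = max(range(len(p)), key=p.__getitem__)
        if p[best_class] >= threshold:
            finalized[idx] = best_class
        else:
            remaining.append(idx)
    return finalized, remaining


# Toy auxiliary predictions for three tokens and three classes.
stage_probs = {
    0: [0.98, 0.01, 0.01],
    1: [0.40, 0.35, 0.25],
    2: [0.02, 0.97, 0.01],
}
finalized, remaining = prune_confident_tokens(stage_probs)
print(finalized, remaining)
```

In this sketch only the `remaining` tokens would be processed by the next stage, which is where a compute saving would come from.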

Highlights

Getting started

  1. Requirements
torch==2.0.0, mmcls==1.0.0rc5, mmcv==2.0.0, mmengine==0.7.0, mmsegmentation==1.0.0rc6

Alternatively, use the latest releases of the OpenMMLab (mm*) packages available as of 9 Aug 2023.
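The pinned versions can be checked against the local environment before training. The following is a minimal sketch, not part of the repository. It assumes that the pip distribution names match the package names listed above, and the helper names `installed_version` and `check_pins` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, Optional

# Pins copied from the requirements list. The mmcls pin is written in its
# normalized form (PEP 440 treats 1.0.0.rc5 and 1.0.0rc5 as the same version).
PINNED = {
    "torch": "2.0.0",
    "mmcls": "1.0.0rc5",
    "mmcv": "2.0.0",
    "mmengine": "0.7.0",
    "mmsegmentation": "1.0.0rc6",
}


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist)
    except PackageNotFoundError:
        return None


def check_pins(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, str]:
    """Return a message for every distribution that does not match its pin."""
    problems: Dict[str, str] = {}
    for dist, wanted in pins.items():
        found = lookup(dist)
        if found is None:
            problems[dist] = "not installed"
        # Drop a local build tag such as "+cu117" before comparing.
        elif found.split("+")[0] != wanted:
            problems[dist] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    for dist, message in check_pins(PINNED).items():
        print(f"{dist}: {message}")
```

An empty result means every pinned package is installed at the listed version. A mismatch is not necessarily an error, since newer mm* releases up to 9 Aug 2023 are also stated to work.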

Training

To obtain the base model

bash tools/dist_train.sh config/prune/BASE_segvit_ade20k_large.py $work_dirs$

To prune the base model

bash tools/dist_train_load.sh config/prune/prune_segvit_ade20k_large.py $work_dirs$ $path_to_ckpt$

Evaluation

bash tools/dist_test.sh config/prune/prune_segvit_ade20k_large.py $path_to_ckpt$

Datasets

Please follow the MMSegmentation data preparation instructions.

Results

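ADE20K

| Method | Backbone | mIoU | GFLOPs | config | ckpt |
| --- | --- | --- | --- | --- | --- |
| Segvit | Vit-base | 49.6 | 109.9 | config | |
| Segvit-prune | Vit-base | 49.8 | 86.8 | config | |
| Segvit | Vit-large | 53.3 | 617.0 | config | |
| Segvit-prune | Vit-large | 52.8 | 412.8 | config | |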

Pascal Context

| Method | Backbone | mIoU | GFLOPs | config | ckpt |
| --- | --- | --- | --- | --- | --- |
| Segvit | Vit-large | 63.0 | 315.4 | config | |
| Segvit-prune | Vit-large | 62.7 | 224.3 | config | |

COCO-Stuff-10K

| Method | Backbone | mIoU | GFLOPs | config | ckpt |
| --- | --- | --- | --- | --- | --- |
| Segvit | Vit-large | 47.4 | 366.9 | config | |
| Segvit-prune | Vit-large | 47.1 | 276.2 | config | |
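To make the trade-off in the tables above easier to read, the short script below computes the relative GFLOPs saving and the mIoU change for each Segvit / Segvit-prune pair. All numbers are copied from the tables. The helper name `summarize` is hypothetical and not part of the repository.

```python
from typing import Tuple

Row = Tuple[float, float]  # (mIoU, GFLOPs)

# (dataset, backbone) -> (Segvit row, Segvit-prune row), copied from the tables.
RESULTS = {
    ("ADE20K", "Vit-base"): ((49.6, 109.9), (49.8, 86.8)),
    ("ADE20K", "Vit-large"): ((53.3, 617.0), (52.8, 412.8)),
    ("Pascal Context", "Vit-large"): ((63.0, 315.4), (62.7, 224.3)),
    ("COCO-Stuff-10K", "Vit-large"): ((47.4, 366.9), (47.1, 276.2)),
}


def summarize(base: Row, pruned: Row) -> Tuple[float, float]:
    """Return (GFLOPs saving in percent, mIoU change), rounded to one decimal."""
    (miou_base, gflops_base), (miou_pruned, gflops_pruned) = base, pruned
    saving = 100.0 * (gflops_base - gflops_pruned) / gflops_base
    return round(saving, 1), round(miou_pruned - miou_base, 1)


for (dataset, backbone), (base, pruned) in RESULTS.items():
    saving, delta = summarize(base, pruned)
    print(f"{dataset:15s} {backbone:10s} GFLOPs -{saving:.1f}%  mIoU {delta:+.1f}")
```

With the reported numbers, pruning saves roughly 21% to 33% of the GFLOPs, while mIoU changes by between -0.5 and +0.2.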

License

For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact the authors.

Citation