Home

Awesome

PTSEFormer

This repo is an official implementation of PTSEFormer, which is accepted by ECCV 2022.

Citing PTSEFormer

Please consider citing our paper if you find it useful:

@article{ptseformer,
    Author = {Han, Wang and Jun, Tang and Xiaodong, Liu and Shanyan, Guan and Rong, Xie and Li, Song},
    Title = {PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection},
    Conference = {ECCV},
    Year = {2022}
}

Installation

Requirements

Data preparation

Please download the ILSVRC2015 DET and ILSVRC2015 VID dataset from here and organize them as following.

data_root/
	└──ILSVRC2015/
		├──ImageSets/
		├──Annotations/
		├──Data/

Evaluation

The inference command line for testing on the validation dataset:

python -m torch.distributed.launch --nproc_per_node=8 tools/test.py --config-file experiments/PTSEFormer_r101_8gpus.yaml

Pretrained model can be found here.

Training

The training command line for training on a combined dataset of VID and DET:

python -m torch.distributed.launch --nproc_per_node=8 tools/train.py --config-file experiments/PTSEFormer_r101_8gpus.yaml