Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation

This is the official implementation of the PCAN paper for multiple object tracking and segmentation (MOTS). PCAN also serves as the baseline method in the BDD100K tracking challenges at CVPR 2022.

We also provide a trailer with method illustrations and tracking & segmentation visualizations. Our project website contains more information: vis.xyz/pub/pcan.

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
NeurIPS 2021, Spotlight
Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

<div align="center"> <img src="./figures/pcan_banner_new.gif" width="100%" /> </div>

Abstract

Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes. Most approaches only exploit the temporal dimension to address the association problem, while relying on single frame predictions for the segmentation mask itself. We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation. PCAN first distills a space-time memory into a set of prototypes and then employs cross-attention to retrieve rich information from the past frames. To segment each object, PCAN adopts a prototypical appearance module to learn a set of contrastive foreground and background prototypes, which are then propagated over time. Extensive experiments demonstrate that PCAN outperforms current video instance tracking and segmentation competition winners on both Youtube-VIS and BDD100K datasets, and shows efficacy to both one-stage and two-stage segmentation frameworks.
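
To make the two core ideas above concrete, below is a minimal, self-contained PyTorch sketch of (1) distilling a space-time memory into a small set of prototypes and (2) cross-attending from current-frame features to those prototypes. This is an illustrative toy, not the repository's actual code: all names (`distill_prototypes`, `num_prototypes`, `em_iters`, etc.) and the soft k-means distillation are assumptions made for clarity.

```python
# Illustrative sketch of prototypical cross-attention; not the repo's API.
import torch
import torch.nn.functional as F


def distill_prototypes(memory, num_prototypes=16, em_iters=3):
    """Condense a space-time memory (N x C) into K prototypes via soft k-means."""
    n, c = memory.shape
    # Initialize prototypes from randomly chosen memory entries.
    idx = torch.randperm(n)[:num_prototypes]
    prototypes = memory[idx].clone()                          # (K, C)
    for _ in range(em_iters):
        # E-step: soft-assign each memory feature to the prototypes.
        assign = F.softmax(memory @ prototypes.t(), dim=1)    # (N, K)
        # M-step: update prototypes as assignment-weighted means.
        prototypes = (assign.t() @ memory) / (assign.sum(0, keepdim=True).t() + 1e-6)
    return prototypes


def prototypical_cross_attention(query, prototypes):
    """Retrieve past-frame information by attending from current-frame
    features (HW x C) to the distilled prototypes instead of the full memory."""
    attn = F.softmax(query @ prototypes.t() / query.shape[-1] ** 0.5, dim=1)  # (HW, K)
    return attn @ prototypes                                  # (HW, C)


# Toy usage: a 2-frame memory of 7x7 feature maps with 256 channels.
memory = torch.randn(2 * 7 * 7, 256)
protos = distill_prototypes(memory)
out = prototypical_cross_attention(torch.randn(7 * 7, 256), protos)
print(out.shape)  # torch.Size([49, 256])
```

Because attention runs over K prototypes rather than all N memory entries, the retrieval cost stays constant as more frames are added to the memory, which is what makes this kind of design attractive for online tracking.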

Prototypical Cross-Attention Networks (PCAN)

<img src="figures/pcan-banner-final.png" width="800">

Main results

Results on BDD100K Benchmark

| Detector | mMOTSA-val | mIDF1-val | ID Sw.-val | Scores-val | mMOTSA-test | mIDF1-test | ID Sw.-test | Scores-test | Config | Weights | Preds | Visuals |
| :------: | :--------: | :-------: | :--------: | :--------: | :---------: | :--------: | :---------: | :---------: | :----: | :-----: | :---: | :-----: |
| ResNet-50 | 28.1 | 45.4 | 874 | scores | 31.9 | 50.4 | 845 | scores | config | model \| MD5 | preds | visuals |

Installation

Please refer to INSTALL.md for installation instructions.

Usage

Please refer to GET_STARTED.md for dataset preparation and detailed running (training, testing, visualization, etc.) instructions.

Related links

YouTube Video | Bilibili Video | Zhihu Reading

Citation

If you find PCAN useful in your research or refer to the provided baseline results, please star :star: this repository and consider citing :pencil::

```bibtex
@inproceedings{pcan,
  title={Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation},
  author={Ke, Lei and Li, Xia and Danelljan, Martin and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
  booktitle={Advances in Neural Information Processing Systems},
  year={2021}
}
```