Beyond MOT: Semantic Multi-Object Tracking

Yunhao Li, Qin Li, Hao Wang, Xue Ma, Jiali Yao, Shaohua Dong, Heng Fan*, Libo Zhang*
European Conference on Computer Vision (ECCV), 2024. (*equal advising and co-last author)

Semantic Multi-Object Tracking


Figure: Illustration of the proposed semantic multi-object tracking (SMOT). Existing multi-object tracking (MOT) focuses on predicting trajectories only (see (a)), whereas our SMOT aims at estimating trajectories and understanding their semantics (see (b)). Best viewed in color for all figures.

Framework


Figure: Illustration of the proposed approach SMOTer, which contains three components: trajectory estimation for tracking, feature fusion, and trajectory-associated semantic understanding.

Implementation


Installation

Requirements
Installation Example
conda create -n somter python=3.8.0
conda activate somter
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

# under your working directory
git clone https://github.com/facebookresearch/detectron2.git
cd detectron2
pip install -e .

cd ..
# or git clone https://github.com/Nathan-Li123/SMOTer
git clone https://github.com/HengLan/SMOT
# or cd SMOTer
cd SMOT 
pip install -r requirements.txt
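
After the steps above, it can help to confirm that the key packages are visible to the new environment before launching a long training job. The following is a minimal sketch and not part of the repository: `check_environment` is a hypothetical helper, and the default package list is an assumption based on the installation commands above. It only looks the packages up and does not import them.

```python
import importlib.util


def check_environment(packages=("torch", "torchvision", "detectron2")):
    """Return {package_name: bool} telling whether each package can be found.

    Hypothetical helper for illustration; not part of the SMOT repository.
    Only top-level package names are supported.
    """
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it inside the activated conda environment; any package reported as missing points back to the corresponding install step.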

Dataset preparation

  1. Before processing, download BenSMOT from here (Baidu: yb2d, OneDrive) and place it anywhere you like. For more details about BenSMOT, please refer to BenSMOT.md.
  2. In the BenSMOT dataset folder, we provide semantic annotation files for each sequence, including video captions, trajectory captions, and trajectory interactions. For convenience, we recommend downloading the combined annotation files from here (Baidu: 1b2h, OneDrive).
  3. Symlink the test set of BenSMOT to datasets/bensmot/BenSMOT-val/ and organize the directory as follows.
datasets
└── bensmot
    ├── annotations
    ├── seqmaps
    ├── BenSMOT-val
    ├── instance_captioin.json
    ├── video_summary.json
    └── relation.json
  4. Set DATA_PATH in tools/convert_bensmot2coco.py to the BenSMOT root directory you are using.
  5. Run tools/convert_bensmot2coco.py to create the train.json and test.json files in the annotations folder and a test.txt file in the seqmaps folder.
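
The symlink and layout steps above can be sketched in Python as follows. This is an illustrative sketch, not a script shipped with the repository: `link_test_set` and `missing_entries` are hypothetical helpers, and the expected entries are copied from the directory layout and the conversion step described above (including the file name `instance_captioin.json` as it is spelled there).

```python
import os
from pathlib import Path

# Entries shown under datasets/bensmot in the layout above.
EXPECTED_BEFORE = [
    "annotations",
    "seqmaps",
    "BenSMOT-val",
    "instance_captioin.json",  # spelled as in the layout above
    "video_summary.json",
    "relation.json",
]
# Files that tools/convert_bensmot2coco.py is described as creating.
EXPECTED_AFTER = [
    "annotations/train.json",
    "annotations/test.json",
    "seqmaps/test.txt",
]


def link_test_set(test_set_dir, bensmot_dir="datasets/bensmot"):
    """Symlink the downloaded BenSMOT test set to <bensmot_dir>/BenSMOT-val."""
    bensmot_dir = Path(bensmot_dir)
    bensmot_dir.mkdir(parents=True, exist_ok=True)
    link = bensmot_dir / "BenSMOT-val"
    # Leave an existing link or directory untouched.
    if not link.exists() and not link.is_symlink():
        os.symlink(Path(test_set_dir).resolve(), link, target_is_directory=True)
    return link


def missing_entries(bensmot_dir, expected):
    """Return the expected relative paths that do not exist under bensmot_dir."""
    root = Path(bensmot_dir)
    return [rel for rel in expected if not (root / rel).exists()]
```

For example, calling `missing_entries("datasets/bensmot", EXPECTED_BEFORE)` before conversion and `missing_entries("datasets/bensmot", EXPECTED_AFTER)` afterwards should both return an empty list when the preparation is complete.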

Training and Evaluation

Please use the scripts provided in scripts/bensmot.sh for training and evaluation. The weight files used in the process can be downloaded here (OneDrive).

# train
CUDA_VISIBLE_DEVICES=0,1,2,3 python train_net.py --num-gpus 4 --config-file configs/BYTE_BENSMOT_FPN.yaml
# evaluation
CUDA_VISIBLE_DEVICES=0 python train_net.py --num-gpus 1 --config-file configs/BYTE_BENSMOT_FPN.yaml --eval-only path/to/weight
# count metrics
python eval_vu.py
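
In the commands above, the number of IDs in CUDA_VISIBLE_DEVICES matches the value passed to --num-gpus. A small sketch of keeping the two in sync when launching from Python is given below; `build_command` is a hypothetical helper, not part of the repository, and it only uses the flags that appear in the commands above.

```python
import os


def build_command(gpu_ids, config_file, weights=None):
    """Return (env, argv) for train_net.py.

    Hypothetical helper for illustration. If `weights` is given, the
    evaluation form of the command shown above is produced instead.
    """
    gpu_ids = [str(g) for g in gpu_ids]
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    # Copy the current environment and restrict the visible GPUs.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(gpu_ids))
    argv = [
        "python", "train_net.py",
        "--num-gpus", str(len(gpu_ids)),  # derived from the GPU list
        "--config-file", config_file,
    ]
    if weights is not None:
        argv += ["--eval-only", weights]
    return env, argv
```

The returned pair can be passed to `subprocess.run(argv, env=env, check=True)` from the repository root.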

Acknowledgement

Our code repository is built upon xingyizhou/GTR. Thanks for their wonderful work.

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{li2024beyond,
  title={Beyond MOT: Semantic Multi-Object Tracking},
  author={Li, Yunhao and Li, Qin and Wang, Hao and Ma, Xue and Yao, Jiali and Dong, Shaohua and Fan, Heng and Zhang, Libo},
  booktitle={ECCV},
  year={2024}
}