Delving into the Trajectory Long-tail Distribution for Muti-object Tracking

【CVPR 2024】Delving into the Trajectory Long-tail Distribution for Muti-object Tracking
Sijia Chen, En Yu, Jinyang Li, Wenbing Tao
[ArXiv] Paper (http://arxiv.org/abs/2403.04700)
[CVPR] Paper (Delving_into_the_Trajectory_Long-tail_Distribution_for_Muti-object_Tracking_CVPR_2024_paper)
YouTube (https://www.youtube.com/watch?v=ohgIesSNgaQ)

If you have any problems with our work, please open an issue. We will reply promptly.

If you cite our method for experimental comparison, you can use the method name TLTDMOT.

Thanks for your attention! If you are interested in our work, please give us a star ⭐️.

Poster

Abstract

Multiple Object Tracking (MOT) is a critical area within computer vision, with a broad spectrum of practical implementations. Current research has primarily focused on the development of tracking algorithms and enhancement of post-processing techniques. Yet, there has been a lack of thorough examination concerning the nature of tracking data itself. In this study, we pioneer an exploration into the distribution patterns of tracking data and identify a pronounced long-tail distribution issue within existing MOT datasets. We note a significant imbalance in the distribution of trajectory lengths across different pedestrians, a phenomenon we refer to as “pedestrians trajectory long-tail distribution”. Addressing this challenge, we introduce a bespoke strategy designed to mitigate the effects of this skewed distribution. Specifically, we propose two data augmentation strategies, Stationary Camera View Data Augmentation (SVA) and Dynamic Camera View Data Augmentation (DVA), designed for viewpoint states, and the Group Softmax (GS) module for Re-ID. SVA backtracks and predicts the pedestrian trajectories of tail classes, and DVA uses a diffusion model to change the background of the scene. GS divides the pedestrians into unrelated groups and performs a softmax operation on each group individually. Our proposed strategies can be integrated into numerous existing tracking systems, and extensive experimentation validates the efficacy of our method in reducing the influence of long-tail distribution on multi-object tracking performance.
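
To make the Group Softmax (GS) idea concrete, here is a minimal pure-Python sketch. It assumes the identity logits are given as a flat list and that the groups are supplied as disjoint lists of class indices. How the repository actually forms the groups is not shown here, and `group_softmax` is a hypothetical name, not a function from this codebase.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def group_softmax(logits, groups):
    """Apply softmax independently inside each group of class indices.

    logits: one float per identity class.
    groups: lists of class indices, assumed disjoint (hypothetical input format).
    Returns probabilities that sum to 1 within each group.
    """
    probs = [0.0] * len(logits)
    for group in groups:
        group_probs = softmax([logits[i] for i in group])
        for idx, p in zip(group, group_probs):
            probs[idx] = p
    return probs

# Illustrative values only: classes 0-1 form one group, classes 2-4 another.
logits = [4.0, 3.0, 0.5, 0.2, 0.1]
probs = group_softmax(logits, groups=[[0, 1], [2, 3, 4]])
```

Because each group is normalized separately, the large logits of one group do not suppress the probabilities of classes in another group.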

Apology letter

I'm Sijia Chen, and I'm very sorry: there is a small error in Figure 1 of the official CVPR version of the paper. Figure 1 in the ArXiv version is correct.

We made a mistake when submitting the camera-ready version to CVPR. Although we found this error in May 2024 and contacted the publisher immediately, we were unable to correct it because the camera-ready deadline had already passed.

News

Installation

conda create -n TLDTMOT python=3.8
conda activate TLDTMOT
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
cd ${Trajectory-Long-tail-Distribution-for-MOT_ROOT}
pip install cython # Optional addition: -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install -r requirements.txt # Optional addition: -i https://pypi.tuna.tsinghua.edu.cn/simple/
git clone -b pytorch_1.7 https://github.com/ifzhang/DCNv2.git
cd DCNv2
./make.sh
conda install ffmpeg
pip install ffmpy
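
After installation, a quick check like the following can confirm that the pinned packages are visible in the active environment. This is only a sketch: it reads package metadata through the standard library, `installed_versions` is a hypothetical helper, and it assumes the distributions are registered under the names `torch`, `torchvision`, `torchaudio`, and `ffmpy`.

```python
from importlib import metadata

def installed_versions(packages):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Versions pinned by the conda command above; ffmpy is left unpinned.
expected = {"torch": "1.7.1", "torchvision": "0.8.2", "torchaudio": "0.7.2", "ffmpy": None}
found = installed_versions(expected)
for name, want in expected.items():
    have = found[name]
    if have is None:
        status = "missing"
    elif want is None or have.startswith(want):
        status = "ok"
    else:
        status = "expected " + want
    print(f"{name}: {have} ({status})")
```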

Data preparation

dataset
   |
   |
   |——————MOT15
   |        |——————images
   |        |        └——————train
   |        |        └——————test
   |        └——————labels_with_ids
   |                 └——————train(empty)
   |——————MOT16
   |        |——————images
   |        |        └——————train
   |        |        └——————test
   |        └——————labels_with_ids
   |                 └——————train(empty)
   |——————MOT17
   |        |——————images
   |        |        └——————train
   |        |        └——————test
   |        └——————labels_with_ids
   |                 └——————train(empty)
   |——————MOT20
            |——————images
            |        └——————train
            |        └——————test
            └——————labels_with_ids
                     └——————train(empty)

Then, you can change the seq_root and label_root in src/gen_labels_15.py, src/gen_labels_16.py, src/gen_labels_17.py and src/gen_labels_20.py and run:

cd src
python gen_labels_15.py
python gen_labels_16.py
python gen_labels_17.py
python gen_labels_20.py

to generate the labels of 2DMOT15, MOT16, MOT17 and MOT20. The seqinfo.ini files of 2DMOT15 can be downloaded here: [Google], [Baidu, code:8o0w].
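
As a rough illustration of why the seqinfo.ini files are needed, the sketch below reads the image size from a seqinfo.ini and turns one gt.txt row into a normalized label line. The output layout (class, track id, then box center and size divided by the image size) is an assumption based on JDE/FairMOT-style label files, not a copy of the gen_labels scripts, and `gt_row_to_label` is a hypothetical helper.

```python
import configparser

def read_image_size(seqinfo_text):
    """Parse imWidth/imHeight from the text of a MOTChallenge seqinfo.ini."""
    parser = configparser.ConfigParser()
    parser.read_string(seqinfo_text)
    seq = parser["Sequence"]
    return int(seq["imWidth"]), int(seq["imHeight"])

def gt_row_to_label(row, track_id, img_w, img_h):
    """Convert one gt.txt row (frame, id, left, top, w, h, ...) to a label line.

    Assumed output layout: class, track id, then the box center and size
    normalized by the image size.
    """
    _frame, _tid, left, top, w, h = [float(v) for v in row.split(",")[:6]]
    cx = left + w / 2
    cy = top + h / 2
    return "0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}".format(
        track_id, cx / img_w, cy / img_h, w / img_w, h / img_h)

# Illustrative inputs only, not taken from a real sequence.
seqinfo = "[Sequence]\nname=example\nimWidth=1920\nimHeight=1080\n"
width, height = read_image_size(seqinfo)
line = gt_row_to_label("1,1,912,484,97,109,1,1,1", track_id=1, img_w=width, img_h=height)
```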

Note: Before each re-run of these scripts, delete the existing labels_with_ids folder.
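
If you regenerate labels often, the cleanup can be scripted. The helper below is a minimal sketch with a hypothetical name (`remove_stale_labels`). It only deletes `labels_with_ids` under the dataset folder you pass in, and it is demonstrated on a temporary directory rather than on real data.

```python
import shutil
import tempfile
from pathlib import Path

def remove_stale_labels(dataset_dir):
    """Delete dataset_dir/labels_with_ids if it exists; return True if removed."""
    labels_dir = Path(dataset_dir) / "labels_with_ids"
    if labels_dir.is_dir():
        shutil.rmtree(labels_dir)
        return True
    return False

# Demonstration on a throwaway directory that mimics the MOT17 layout.
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "MOT17"
    stale = dataset / "labels_with_ids" / "train"
    stale.mkdir(parents=True)
    (stale / "000001.txt").write_text("0 1 0.5 0.5 0.1 0.1\n")
    removed = remove_stale_labels(dataset)
    still_there = (dataset / "labels_with_ids").exists()
    removed_again = remove_stale_labels(dataset)
```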

dataset
   |
   |
   |——————crowdhuman
            |——————images
            |        └——————train
            |        └——————val
            |——————labels_with_ids
            |        └——————train(empty)
            |        └——————val(empty)
            |——————annotation_train.odgt
            └——————annotation_val.odgt

If you want to pretrain on CrowdHuman (we train Re-ID on CrowdHuman), you can change the paths in src/gen_labels_crowd_id.py and run:

cd src
python gen_labels_crowd_id.py

If you want to add CrowdHuman to the MIX dataset (we do not train Re-ID on CrowdHuman), you can change the paths in src/gen_labels_crowd_det.py and run:

cd src
python gen_labels_crowd_det.py
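
To inspect the CrowdHuman annotations before running either script, the sketch below parses .odgt content, which stores one JSON object per line. It assumes the usual CrowdHuman fields (`ID`, `gtboxes`, `tag`, `fbox`); the sample record is synthetic, and the helper names are hypothetical.

```python
import json

def load_odgt(lines):
    """Parse an iterable of .odgt lines (one JSON object per line)."""
    return [json.loads(line) for line in lines if line.strip()]

def person_boxes(record):
    """Return the full-body boxes [x, y, w, h] of entries tagged 'person'."""
    return [box["fbox"] for box in record.get("gtboxes", []) if box.get("tag") == "person"]

# Synthetic record for illustration; real files contain many more fields.
sample = [
    '{"ID": "example_image", "gtboxes": ['
    '{"tag": "person", "fbox": [10, 20, 50, 120]}, '
    '{"tag": "mask", "fbox": [0, 0, 5, 5]}]}',
]
records = load_odgt(sample)
boxes = person_boxes(records[0])
```

With a real file, pass the open file object instead of the list, for example `load_odgt(open("annotation_train.odgt"))`.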

Pretrained models and baseline model

DLA-34 official COCO pretrained model: ctdet_coco_dla_2x.pth can be downloaded here [Baidu, code:hust], [Google]. HRNetV2 ImageNet pretrained model: HRNetV2-W18 official, HRNetV2-W32 official. After downloading, you should put the pretrained models in the following structure:

${Trajectory-Long-tail-Distribution-for-MOT_ROOT}
   └——————models
           └——————ctdet_coco_dla_2x.pth
           └——————hrnetv2_w32_imagenet_pretrained.pth
           └——————hrnetv2_w18_imagenet_pretrained.pth

Our baseline FairMOT model (DLA-34 backbone) is pretrained on CrowdHuman for 60 epochs with the self-supervised learning approach and then trained on the MIX dataset for 30 epochs. The models can be downloaded here: crowdhuman_dla34.pth [Google] [Baidu, code:ggzx] [Onedrive]. fairmot_dla34.pth [Google] [Baidu, code:uouv] [Onedrive]. After downloading, you should put the baseline model in the following structure:

${Trajectory-Long-tail-Distribution-for-MOT_ROOT}
   └——————models
           └——————fairmot_dla34.pth
           └——————...
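
A small check such as the one below can confirm that the downloaded weights sit where the layouts above expect them. It is only a sketch: `missing_models` is a hypothetical helper, the file list simply mirrors the names shown above, and the demonstration uses a temporary directory.

```python
import tempfile
from pathlib import Path

# File names taken from the directory layouts above.
EXPECTED_MODELS = [
    "ctdet_coco_dla_2x.pth",
    "hrnetv2_w32_imagenet_pretrained.pth",
    "hrnetv2_w18_imagenet_pretrained.pth",
    "fairmot_dla34.pth",
]

def missing_models(project_root, names=EXPECTED_MODELS):
    """Return the expected model files that are absent from project_root/models."""
    models_dir = Path(project_root) / "models"
    return [name for name in names if not (models_dir / name).is_file()]

# Demonstration: only one of the four files is present.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "models").mkdir()
    (Path(tmp) / "models" / "fairmot_dla34.pth").write_bytes(b"")
    missing = missing_models(tmp)
```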

Important notes:

The MOT17 dataset processed with SVA and DVA can be downloaded here [Baidu, code:hust].

Our models can be downloaded here [Baidu, code:hust].

Training

Baseline(+Ours):

bash experiments/MOT15_add_our_method_dla34.sh

Baseline:

bash experiments/MOT15_baseline.sh

Baseline(+Ours):

bash experiments/MOT16_add_our_method_dla34.sh

Baseline:

bash experiments/MOT16_baseline.sh

Baseline(+Ours):

bash experiments/MOT17_add_our_method_dla34.sh

Baseline:

bash experiments/MOT17_baseline.sh

The MOT20 annotations differ slightly from MOT17: the bounding-box coordinates all lie inside the image. Before training on MOT20, uncomment lines 313 to 316 in the dataset file src/lib/datasets/dataset/jde.py:

#np.clip(xy[:, 0], 0, width, out=xy[:, 0])
#np.clip(xy[:, 2], 0, width, out=xy[:, 2])
#np.clip(xy[:, 1], 0, height, out=xy[:, 1])
#np.clip(xy[:, 3], 0, height, out=xy[:, 3])
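
For readers unfamiliar with these lines, the following pure-Python sketch shows the effect the four `np.clip` calls have once uncommented: x coordinates are clamped to [0, width] and y coordinates to [0, height]. The `clip_boxes` helper and the numbers are illustrative assumptions, not code from jde.py.

```python
def clip_boxes(boxes, width, height):
    """Clamp [x1, y1, x2, y2] boxes to the image rectangle [0, width] x [0, height]."""
    clipped = []
    for x1, y1, x2, y2 in boxes:
        clipped.append([
            min(max(x1, 0), width),   # mirrors np.clip(xy[:, 0], 0, width, ...)
            min(max(y1, 0), height),  # mirrors np.clip(xy[:, 1], 0, height, ...)
            min(max(x2, 0), width),   # mirrors np.clip(xy[:, 2], 0, width, ...)
            min(max(y2, 0), height),  # mirrors np.clip(xy[:, 3], 0, height, ...)
        ])
    return clipped

# Illustrative boxes: one starts left of the image, one extends past the bottom-right.
boxes = [[-12.0, 30.0, 80.0, 200.0], [1000.0, 900.0, 1200.0, 1100.0]]
clipped = clip_boxes(boxes, width=1088, height=608)
```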

Then, we can train on MOT20:

Baseline(+Ours):

bash experiments/MOT20_add_our_method_dla34.sh

Baseline:

bash experiments/MOT20_baseline.sh

Baseline(+Ours):

bash experiments/MOT20_ft_mix_add_our_method_dla34.sh
bash experiments/ablation_study.sh

Tracking

MOT15:

bash experiments/MOT15_track.sh

MOT16:

bash experiments/MOT16_track.sh

MOT17:

bash experiments/MOT17_track.sh

MOT20:

bash experiments/MOT20_track.sh

We evaluate on the other half of the MOT17 training set. You can run:

All classes(default):

bash experiments/ablation_study_track.sh

If you want to evaluate head classes and tail classes separately, first run tackle_module/head_tail_classes_division/val_id_num_count.py. Then place the generated gt_headclasses.txt and gt_tailclasses.txt files in the corresponding gt folder of each MOT17 training sequence, as shown below:

dataset
   |
   |
   |——————MOT17
            |
            |——————images
                     |
                     |——————train
                              |
                              |——————MOT17-02-SDP
                              |            |
                              |            |——————gt
                              |                   └——————gt.txt
                              |                   └——————gt_headclasses.txt
                              |                   └——————gt_tailclasses.txt
                              |——————MOT17-04-SDP
                              |            |
                              |            |——————gt
                              |                   └——————gt.txt
                              |                   └——————gt_headclasses.txt
                              |                   └——————gt_tailclasses.txt
                              |——————MOT17-05-SDP
                              |            |
                              |            |——————gt
                              |                   └——————gt.txt
                              |                   └——————gt_headclasses.txt
                              |                   └——————gt_tailclasses.txt
                              |——————MOT17-09-SDP
                              |            |
                              |            |——————gt
                              |                   └——————gt.txt
                              |                   └——————gt_headclasses.txt
                              |                   └——————gt_tailclasses.txt
                              |——————MOT17-10-SDP
                              |            |
                              |            |——————gt
                              |                   └——————gt.txt
                              |                   └——————gt_headclasses.txt
                              |                   └——————gt_tailclasses.txt
                              |——————MOT17-11-SDP
                              |            |
                              |            |——————gt
                              |                   └——————gt.txt
                              |                   └——————gt_headclasses.txt
                              |                   └——————gt_tailclasses.txt
                              |——————MOT17-13-SDP
                                           |
                                           |——————gt
                                                  └——————gt.txt
                                                  └——————gt_headclasses.txt
                                                  └——————gt_tailclasses.txt

Then you can run:

Head classes or tail classes:

bash experiments/ablation_study_classes_track.sh
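
The head/tail division can be pictured with the sketch below, which counts how many frames each identity appears in and splits identities by a length threshold. It assumes the standard MOTChallenge gt.txt row layout (frame, id, left, top, width, height, ...). The threshold value and helper names are hypothetical, and this is not the logic of val_id_num_count.py.

```python
from collections import Counter

def trajectory_lengths(gt_lines):
    """Count annotated frames per track id from gt.txt-style rows."""
    counts = Counter()
    for line in gt_lines:
        if line.strip():
            track_id = int(float(line.split(",")[1]))
            counts[track_id] += 1
    return counts

def split_head_tail(lengths, threshold):
    """Ids with at least `threshold` frames are head classes, the rest tail classes."""
    head = sorted(t for t, n in lengths.items() if n >= threshold)
    tail = sorted(t for t, n in lengths.items() if n < threshold)
    return head, tail

# Synthetic rows: id 1 spans three frames, id 3 spans two, id 2 spans one.
rows = [
    "1,1,100,100,40,90,1,1,1",
    "2,1,102,100,40,90,1,1,1",
    "3,1,104,100,40,90,1,1,1",
    "1,2,300,120,35,80,1,1,1",
    "2,3,500,110,38,85,1,1,1",
    "3,3,503,110,38,85,1,1,1",
]
lengths = trajectory_lengths(rows)
head_ids, tail_ids = split_head_tail(lengths, threshold=2)  # threshold is illustrative
```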

Demo

You can pass a raw video to src/demo.py to produce a demo video in mp4 format:

cd src
python demo.py mot --load_model ../models/fairmot_dla34.pth --conf_thres 0.4

You can change --input-video and --output-root to get the demos of your own videos. --conf_thres can be set from 0.3 to 0.7 depending on your own videos.
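
When processing several videos, it can help to assemble the command programmatically. The sketch below only builds the argument list from the flags mentioned above and does not run anything. `build_demo_command` and the example paths are hypothetical.

```python
def build_demo_command(load_model, conf_thres=0.4, input_video=None, output_root=None):
    """Build the argument list for demo.py using the flags described above."""
    if not 0.0 < conf_thres < 1.0:
        raise ValueError("conf_thres must be between 0 and 1")
    cmd = ["python", "demo.py", "mot",
           "--load_model", load_model,
           "--conf_thres", str(conf_thres)]
    if input_video is not None:
        cmd += ["--input-video", input_video]
    if output_root is not None:
        cmd += ["--output-root", output_root]
    return cmd

cmd = build_demo_command(
    "../models/fairmot_dla34.pth",
    conf_thres=0.4,
    input_video="../videos/my_clip.mp4",  # hypothetical path
    output_root="../demos",               # hypothetical path
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, cwd="src", check=True)` from the repository root.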

Acknowledgement

Parts of the code are borrowed from the following works:

Thanks for their wonderful work.

Citation

@InProceedings{Chen_2024_CVPR,
    author    = {Chen, Sijia and Yu, En and Li, Jinyang and Tao, Wenbing},
    title     = {Delving into the Trajectory Long-tail Distribution for Muti-object Tracking},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {19341-19351}
}