MMPedestron
[ECCV2024] This is the official implementation of the paper "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset".
Authors: Yi Zhang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu
<img src="./images/mmpd.png"/>
MMPedestron Examples
<img src="./images/vis.png"/>
Configs and Models
Region proposal performance
1.Pretrained Stage
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
2.CrowdHuman
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
3.COCO-Person
Method&Config | Backbone | Download |
---|---|---|
MMPedestron finetune | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
4.FLIR
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
5.PEDRo
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
MMPedestron (10% train data) | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Co-Dino | Res50 | - |
YOLOX | CSPDarknet | - |
Meta Transformer | ViTAdapter | - |
Faster R-CNN | Res50 | - |
6.LLVIP Datasets
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Co-Dino RGB, Co-Dino IR | Res50 | - |
YOLOX RGB, YOLOX IR | CSPDarknet | - |
Meta Transformer RGB, Meta Transformer IR | ViTAdapter | - |
Faster R-CNN RGB, Faster R-CNN IR | Res50 | - |
7.InOutDoor Datasets
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Co-Dino RGB, Co-Dino Depth | Res50 | - |
YOLOX RGB, YOLOX Depth | CSPDarknet | - |
Meta Transformer RGB, Meta Transformer Depth | ViTAdapter | - |
Faster R-CNN RGB, Faster R-CNN Depth | Res50 | - |
8.STCrowd Datasets
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Co-Dino RGB, Co-Dino Lidar | Res50 | - |
YOLOX RGB, YOLOX Lidar | CSPDarknet | - |
Meta Transformer RGB, Meta Transformer Lidar | ViTAdapter | - |
Faster R-CNN RGB, Faster R-CNN Lidar | Res50 | - |
9.EventPed Datasets
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Co-Dino RGB, Co-Dino Event | Res50 | - |
YOLOX RGB, YOLOX Event | CSPDarknet | - |
Meta Transformer RGB, Meta Transformer Event | ViTAdapter | - |
Faster R-CNN RGB, Faster R-CNN Event | Res50 | - |
10.Fusion Exp
10-1 LLVIP
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Early-Fusion | Res50 | - |
FPN-Fusion | Res50 | - |
ProbEN RGB, ProbEN IR | Res50 | - |
CMX | SwinTransformer | - |
10-2 InOutDoor
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Early-Fusion | UNIXViT | - |
FPN-Fusion | Res50 | - |
ProbEN RGB, ProbEN Depth | Res50 | - |
CMX | SwinTransformer | - |
10-3 STCrowd
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Early-Fusion | Res50 | - |
FPN-Fusion | Res50 | - |
ProbEN RGB, ProbEN Lidar | Res50 | - |
CMX | SwinTransformer | - |
10-4 EventPed
Method&Config | Backbone | Download |
---|---|---|
MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
Early-Fusion | Res50 | - |
FPN-Fusion | Res50 | - |
ProbEN RGB, ProbEN Event | Res50 | - |
CMX | SwinTransformer | - |
Compared with SOTA
<img src="./images/compare.png" width=800>
Installation
Prepare environment
- Create a conda virtual environment and activate it.
conda create -n mmpedestron python=3.6
conda activate mmpedestron
- Install the requirements. We recommend installing them with env_deploy.sh:
conda install cudatoolkit=10.1
sh env_deploy.sh
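Before running env_deploy.sh, it can help to confirm that the new environment is the active one. The snippet below is a minimal sketch, not part of the repository: the helper name `in_expected_env` is hypothetical, and it relies only on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets.

```shell
#!/bin/sh
# Hypothetical helper (not provided by MMPedestron): succeeds only when the
# active conda environment has the expected name. CONDA_DEFAULT_ENV is set
# by `conda activate`; it is empty or unset when no environment is active.
in_expected_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if in_expected_env mmpedestron; then
    echo "conda environment 'mmpedestron' is active"
else
    echo "activate the environment first: conda activate mmpedestron" >&2
fi
```

If the check reports that the environment is not active, rerun `conda activate mmpedestron` in the same shell before installing anything.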
Data Preparation
Please obtain the datasets from the MMPD-Dataset repository.
Training
Manage training jobs with Slurm
sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} ${GPUS}
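As a rough illustration of how the placeholders might be filled in, the sketch below builds the training command and prints it for inspection. Every value (partition, job name, config path, work directory, GPU count) is a hypothetical stand-in and must be replaced with one that is valid for your cluster and checkout.

```shell
#!/bin/sh
# All values below are assumptions for illustration only.
PARTITION=my_partition                  # Slurm partition name (hypothetical)
JOB_NAME=mmpedestron_train              # free-form job label (hypothetical)
CONFIG_FILE=configs/your_config.py      # placeholder path, not a real repo file
WORK_DIR=work_dirs/mmpedestron_train    # output directory (hypothetical)
GPUS=8                                  # number of GPUs to request (hypothetical)

# Assemble the command in the argument order shown above and print it.
TRAIN_CMD="sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} ${GPUS}"
echo "${TRAIN_CMD}"

# Once the values are correct, uncomment the next line to submit the job:
# ${TRAIN_CMD}
```

Printing the command first is a cheap way to catch a swapped argument before the job reaches the queue.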
Testing
Manage testing jobs with Slurm
sh tools/slurm_test.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${CHECKPOINT} ${GPUS}
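The sketch below shows one possible way to guard a test run against a missing checkpoint file before submitting it. The helper `build_test_cmd` and all values are hypothetical and not part of the repository; only the argument order of tools/slurm_test.sh is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: prints the slurm_test.sh command for the five
# arguments PARTITION JOB_NAME CONFIG_FILE CHECKPOINT GPUS, in that order.
build_test_cmd() {
    echo "sh tools/slurm_test.sh $1 $2 $3 $4 $5"
}

# All values below are assumptions for illustration only.
PARTITION=my_partition
JOB_NAME=mmpedestron_test
CONFIG_FILE=configs/your_config.py            # placeholder path
CHECKPOINT=checkpoints/your_checkpoint.pth    # placeholder path
GPUS=8

# Only print the command when the checkpoint file is actually present.
if [ -f "${CHECKPOINT}" ]; then
    build_test_cmd "${PARTITION}" "${JOB_NAME}" "${CONFIG_FILE}" "${CHECKPOINT}" "${GPUS}"
else
    echo "checkpoint not found: ${CHECKPOINT}" >&2
fi
```

The printed line can then be copied into the shell, or the `echo` inside the helper can be dropped to run the script directly.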
License
The code and data are freely available for non-commercial use and may be redistributed under these conditions. For commercial inquiries, please contact Mr. Sheng Jin (jinsheng13[at]foxmail[dot]com); we will send you the detailed agreement.
Citation
If you find our paper and code useful in your research, please consider giving the repository a star and citing our paper :)
@inproceedings{zhang2024when,
title={When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset},
author={Zhang, Yi and Zeng, Wang and Jin, Sheng and Qian, Chen and Luo, Ping and Liu, Wentao},
booktitle={European Conference on Computer Vision (ECCV)},
year={2024},
month={September}
}