MMPedestron

[ECCV2024] This is the official implementation of the paper "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset".

Authors: Yi Zhang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu

<img src="./images/mmpd.png"/>

MMPedestron Examples

<img src="./images/vis.png"/>

Configs and Models

Region proposal performance

1. Pretrained Stage

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |

2. CrowdHuman

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |

3. COCO-Person

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron finetune | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |

4. FLIR

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |

5. PEDRo

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| MMPedestron (10% train data) | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Co-Dino | Res50 | - |
| YOLOX | CSPDarknet | - |
| Meta Transformer | ViTAdapter | - |
| Faster R-CNN | Res50 | - |

6. LLVIP Datasets

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Co-Dino RGB, Co-Dino IR | Res50 | - |
| YOLOX RGB, YOLOX IR | CSPDarknet | - |
| Meta Transformer RGB, Meta Transformer IR | ViTAdapter | - |
| Faster R-CNN RGB, Faster R-CNN IR | Res50 | - |

7. InOutDoor Datasets

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Co-Dino RGB, Co-Dino Depth | Res50 | - |
| YOLOX RGB, YOLOX Depth | CSPDarknet | - |
| Meta Transformer RGB, Meta Transformer Depth | ViTAdapter | - |
| Faster R-CNN RGB, Faster R-CNN Depth | Res50 | - |

8. STCrowd Datasets

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Co-Dino RGB, Co-Dino Lidar | Res50 | - |
| YOLOX RGB, YOLOX Lidar | CSPDarknet | - |
| Meta Transformer RGB, Meta Transformer Lidar | ViTAdapter | - |
| Faster R-CNN RGB, Faster R-CNN Lidar | Res50 | - |

9. EventPed Datasets

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Co-Dino RGB, Co-Dino Event | Res50 | - |
| YOLOX RGB, YOLOX Event | CSPDarknet | - |
| Meta Transformer RGB, Meta Transformer Event | ViTAdapter | - |
| Faster R-CNN RGB, Faster R-CNN Event | Res50 | - |

10. Fusion Experiments

10-1 LLVIP

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Early-Fusion | Res50 | - |
| FPN-Fusion | Res50 | - |
| ProbEN RGB, ProbEN IR | Res50 | - |
| CMX | SwinTransformer | - |

10-2 InOutDoor

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Early-Fusion | UNIXViT | - |
| FPN-Fusion | Res50 | - |
| ProbEN RGB, ProbEN Depth | Res50 | - |
| CMX | SwinTransformer | - |

10-3 STCrowd

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Early-Fusion | Res50 | - |
| FPN-Fusion | Res50 | - |
| ProbEN RGB, ProbEN Lidar | Res50 | - |
| CMX | SwinTransformer | - |

10-4 EventPed

| Method & Config | Backbone | Download |
| --- | --- | --- |
| MMPedestron | UNIXViT | Google Drive, Baidu Yun (Code: mmpd) |
| Early-Fusion | Res50 | - |
| FPN-Fusion | Res50 | - |
| ProbEN RGB, ProbEN Event | Res50 | - |
| CMX | SwinTransformer | - |

Compared with SOTA

<img src="./images/compare.png" width=800>

Installation

Prepare environment

  1. Create a conda virtual environment and activate it.
conda create -n mmpedestron python=3.6
conda activate mmpedestron
  2. Install the requirements. We recommend installing them with env_deploy.sh:
conda install cudatoolkit=10.1

sh env_deploy.sh

Data Preparation

Please obtain the datasets from the following repository: MMPD-Dataset

Training

Manage training jobs with Slurm

sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} ${GPUS}
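As a rough illustration of how the placeholders above might be filled in, the sketch below assigns example values and prints the resulting command instead of submitting it. Every value (partition name, job name, config path, work directory, GPU count) is a hypothetical placeholder, not a setting taken from this repository; substitute your own cluster settings and one of the released configs.

```shell
# Hypothetical values: replace each one with your own cluster and experiment settings.
PARTITION=my_partition                 # Slurm partition name (assumption)
JOB_NAME=mmpedestron_train             # any job label (assumption)
CONFIG_FILE=path/to/config.py          # placeholder path, not a real config in the repo
WORK_DIR=work_dirs/mmpedestron_train   # output directory (assumption)
GPUS=8                                 # number of GPUs (assumption)

# Assemble the command and print it as a dry run.
TRAIN_CMD="sh tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} ${GPUS}"
echo "${TRAIN_CMD}"
```

Printing the command first makes it easy to confirm the argument order before the job reaches the scheduler; once it looks right, run the printed line directly.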

Testing

Manage testing jobs with Slurm

sh tools/slurm_test.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${CHECKPOINT} ${GPUS}
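Because a test job fails only after it is scheduled if the config or checkpoint path is wrong, it can help to check both files beforehand. The sketch below is one possible way to do that; `build_test_cmd` is a hypothetical helper that is not part of this repository, and all argument values are placeholders.

```shell
# Hypothetical helper (not part of the repository): build the Slurm test command
# only after confirming that the config and checkpoint files exist.
build_test_cmd() {
    # Arguments: partition, job name, config file, checkpoint, number of GPUs.
    if [ "$#" -ne 5 ]; then
        echo "usage: build_test_cmd PARTITION JOB_NAME CONFIG_FILE CHECKPOINT GPUS" >&2
        return 2
    fi
    # Stop early if either input file is missing.
    for f in "$3" "$4"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done
    echo "sh tools/slurm_test.sh $1 $2 $3 $4 $5"
}

# Placeholder values; the paths below are assumptions and will not exist as written.
build_test_cmd my_partition mmpedestron_test path/to/config.py path/to/checkpoint.pth 8 \
    || echo "Fix the paths above, then run the printed command."
```

With real paths, the helper prints the full `slurm_test.sh` command, which can then be run as is.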

License

The code and data are freely available for non-commercial use and may be redistributed under these conditions. For commercial queries, please contact Mr. Sheng Jin (jinsheng13[at]foxmail[dot]com). We will send you the detailed agreement.

Citation

If you find our paper and code useful in your research, please consider giving the repository a star and citing the paper :)

@inproceedings{zhang2024when,
  title={When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset},
  author={Zhang, Yi and Zeng, Wang and Jin, Sheng and Qian, Chen and Luo, Ping and Liu, Wentao},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2024},
  month={September}
}