P2PNet (ICCV2021 Oral Presentation)

This repository contains the official PyTorch implementation of P2PNet, as described in Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework.

A brief introduction to P2PNet can be found at 机器之心 (almosthuman).

The code has been tested with PyTorch 1.5.0 and may not run with other versions.

Visualized demos for P2PNet

<img src="vis/congested1.png" width="1000"/> <img src="vis/congested2.png" width="1000"/> <img src="vis/congested3.png" width="1000"/>

The network

The overall architecture of P2PNet. Built upon a VGG16 backbone, it first introduces an upsampling path to obtain a fine-grained feature map, and then exploits two branches to simultaneously predict a set of point proposals and their confidence scores.

<img src="vis/net.png" width="1000"/>
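For readers who prefer code, the following is a minimal PyTorch sketch of this two-branch design. It is only illustrative: the module names, channel widths, the single upsampling step, and the number of proposals per feature-map location (K) are assumptions and do not reproduce the exact layers of the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16


class P2PNetSketch(nn.Module):
    """Illustrative two-branch point-prediction head on a VGG16 backbone.

    Channel widths and the number of proposals per location (K) are
    assumptions for illustration, not the released configuration.
    """

    def __init__(self, num_proposals_per_cell: int = 4):
        super().__init__()
        self.K = num_proposals_per_cell
        # VGG16 convolutional backbone (randomly initialised here).
        self.backbone = vgg16().features
        # Upsampling path to recover a finer feature map.
        self.reduce = nn.Conv2d(512, 256, kernel_size=1)
        # Regression branch: 2 offsets (dx, dy) per proposal.
        self.reg_head = nn.Conv2d(256, 2 * self.K, kernel_size=3, padding=1)
        # Classification branch: 1 confidence score per proposal.
        self.cls_head = nn.Conv2d(256, self.K, kernel_size=3, padding=1)

    def forward(self, x):
        feat = self.backbone(x)                      # (B, 512, H/32, W/32)
        feat = F.interpolate(feat, scale_factor=2,   # coarse upsampling step
                             mode="bilinear", align_corners=False)
        feat = F.relu(self.reduce(feat))
        offsets = self.reg_head(feat)                # (B, 2K, h, w)
        scores = torch.sigmoid(self.cls_head(feat))  # (B, K, h, w)
        return offsets, scores


if __name__ == "__main__":
    model = P2PNetSketch()
    offsets, scores = model(torch.randn(1, 3, 256, 256))
    print(offsets.shape, scores.shape)
```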

Comparison with state-of-the-art methods

P2PNet achieves state-of-the-art performance on several challenging datasets with varying crowd densities.

| Methods | Venue | SHTechPartA <br> MAE/MSE | SHTechPartB <br> MAE/MSE | UCF_CC_50 <br> MAE/MSE | UCF_QNRF <br> MAE/MSE |
|---|---|---|---|---|---|
| CAN | CVPR'19 | 62.3/100.0 | 7.8/12.2 | 212.2/243.7 | 107.0/183.0 |
| Bayesian+ | ICCV'19 | 62.8/101.8 | 7.7/12.7 | 229.3/308.2 | 88.7/154.8 |
| S-DCNet | ICCV'19 | 58.3/95.0 | 6.7/10.7 | 204.2/301.3 | 104.4/176.1 |
| SANet+SPANet | ICCV'19 | 59.4/92.5 | 6.5/9.9 | 232.6/311.7 | -/- |
| DUBNet | AAAI'20 | 64.6/106.8 | 7.7/12.5 | 243.8/329.3 | 105.6/180.5 |
| SDANet | AAAI'20 | 63.6/101.8 | 7.8/10.2 | 227.6/316.4 | -/- |
| ADSCNet | CVPR'20 | <u>55.4</u>/97.7 | <u>6.4</u>/11.3 | 198.4/267.3 | 71.3/132.5 |
| ASNet | CVPR'20 | 57.78/<u>90.13</u> | -/- | <u>174.84</u>/<u>251.63</u> | 91.59/159.71 |
| AMRNet | ECCV'20 | 61.59/98.36 | 7.02/11.00 | 184.0/265.8 | 86.6/152.2 |
| AMSNet | ECCV'20 | 56.7/93.4 | 6.7/10.2 | 208.4/297.3 | 101.8/163.2 |
| DM-Count | NeurIPS'20 | 59.7/95.7 | 7.4/11.8 | 211.0/291.5 | 85.6/<u>148.3</u> |
| Ours | - | 52.74/85.06 | 6.25/9.9 | 172.72/256.18 | <u>85.32</u>/154.5 |

Comparison on the NWPU-Crowd dataset.

| Methods | MAE[O] | MSE[O] | MAE[L] | MAE[S] |
|---|---|---|---|---|
| MCNN | 232.5 | 714.6 | 220.9 | 1171.9 |
| SANet | 190.6 | 491.4 | 153.8 | 716.3 |
| CSRNet | 121.3 | 387.8 | 112.0 | <u>522.7</u> |
| PCC-Net | 112.3 | 457.0 | 111.0 | 777.6 |
| CANNet | 110.0 | 495.3 | 102.3 | 718.3 |
| Bayesian+ | 105.4 | 454.2 | 115.8 | 750.5 |
| S-DCNet | 90.2 | 370.5 | 82.9 | 567.8 |
| DM-Count | <u>88.4</u> | 388.6 | 88.0 | 498.0 |
| Ours | 77.44 | 362 | <u>83.28</u> | 553.92 |

The overall performance for both counting and localization.

| nAP$_{\delta}$ | SHTechPartA | SHTechPartB | UCF_CC_50 | UCF_QNRF | NWPU_Crowd |
|---|---|---|---|---|---|
| $\delta=0.05$ | 10.9% | 23.8% | 5.0% | 5.9% | 12.9% |
| $\delta=0.25$ | 70.3% | 84.2% | 54.5% | 55.4% | 71.3% |
| $\delta=0.50$ | 90.1% | 94.1% | 88.1% | 83.2% | 89.1% |
| $\delta=\{0.05:0.05:0.50\}$ | 64.4% | 76.3% | 54.3% | 53.1% | 65.0% |

Comparison for the localization performance in terms of F1-Measure on NWPU.

| Method | F1-Measure | Precision | Recall |
|---|---|---|---|
| FasterRCNN | 0.068 | 0.958 | 0.035 |
| TinyFaces | 0.567 | 0.529 | 0.611 |
| RAZ | 0.599 | 0.666 | 0.543 |
| Crowd-SDNet | 0.637 | 0.651 | 0.624 |
| PDRNet | 0.653 | 0.675 | 0.633 |
| TopoCount | 0.692 | 0.683 | 0.701 |
| D2CNet | <u>0.700</u> | 0.741 | 0.662 |
| Ours | 0.712 | <u>0.729</u> | <u>0.695</u> |

Installation

pip install -r requirements.txt

Organize the counting dataset

We use a list file to collect all the images and their ground-truth annotations in a counting dataset. If your dataset is organized as recommended below, the list file has the following format (a small sanity-check sketch follows the directory layout):

train/scene01/img01.jpg train/scene01/img01.txt
train/scene01/img02.jpg train/scene01/img02.txt
...
train/scene02/img01.jpg train/scene02/img01.txt

Dataset structures:

DATA_ROOT/
        |->train/
        |    |->scene01/
        |    |->scene02/
        |    |->...
        |->test/
        |    |->scene01/
        |    |->scene02/
        |    |->...
        |->train.list
        |->test.list

DATA_ROOT is the path that contains your counting datasets.
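As a quick sanity check of this layout, a few lines of Python can verify that every entry in the list file points to an existing image/annotation pair. The helper below is only an illustrative sketch, not part of the repository; it assumes the whitespace-separated format and DATA_ROOT-relative paths shown above.

```python
import os


def read_list_file(data_root: str, list_name: str = "train.list"):
    """Parse a list file into (image_path, annotation_path) pairs.

    Each non-empty line is expected to contain an image path and an
    annotation path separated by whitespace, both relative to DATA_ROOT.
    """
    pairs = []
    with open(os.path.join(data_root, list_name)) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            img_rel, ann_rel = line.split()
            img_path = os.path.join(data_root, img_rel)
            ann_path = os.path.join(data_root, ann_rel)
            assert os.path.isfile(img_path), f"missing image: {img_path}"
            assert os.path.isfile(ann_path), f"missing annotation: {ann_path}"
            pairs.append((img_path, ann_path))
    return pairs
```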

Annotations format

For each image, we use a single txt file containing one annotation per line. Note that pixel indexing starts at 0. The expected format of each line is:

x1 y1
x2 y2
...
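As a concrete illustration of this format, a hypothetical helper could read such a file into an (N, 2) array of head positions. This is only a sketch of the format described above, not code from the repository.

```python
import numpy as np


def load_points(annotation_path: str) -> np.ndarray:
    """Read one 'x y' pair per line into an (N, 2) float array.

    Coordinates are 0-indexed pixel positions, as noted above.
    """
    points = []
    with open(annotation_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            x, y = map(float, line.split())
            points.append((x, y))
    return np.asarray(points, dtype=np.float32).reshape(-1, 2)
```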

Training

The network can be trained using the train.py script. For training on SHTechPartA, use

CUDA_VISIBLE_DEVICES=0 python train.py --data_root $DATA_ROOT \
    --dataset_file SHHA \
    --epochs 3500 \
    --lr_drop 3500 \
    --output_dir ./logs \
    --checkpoints_dir ./weights \
    --tensorboard_dir ./logs \
    --lr 0.0001 \
    --lr_backbone 0.00001 \
    --batch_size 8 \
    --eval_freq 1 \
    --gpu_id 0

By default, a periodic evaluation will be conducted on the validation set.

Testing

A model trained on SHTechPartA (with an MAE of 51.96) is available in "./weights". Run the following command to launch a visualization demo:

CUDA_VISIBLE_DEVICES=0 python run_test.py --weight_path ./weights/SHTechA.pth --output_dir ./logs/
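Independently of the demo script, a typical post-processing step for P2PNet-style outputs is to keep only the proposals whose confidence exceeds a threshold and report their number as the count. Below is a minimal sketch assuming flattened (N, 2) point coordinates and (N,) scores; the 0.5 threshold is an assumption for illustration, not a value taken from run_test.py.

```python
import torch


def select_points(points: torch.Tensor, scores: torch.Tensor, threshold: float = 0.5):
    """Keep predicted points whose confidence exceeds `threshold`.

    points: (N, 2) tensor of (x, y) coordinates.
    scores: (N,) tensor of confidence scores in [0, 1].
    Returns the selected points and the predicted count (their number).
    """
    keep = scores > threshold
    return points[keep], int(keep.sum().item())
```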

Acknowledgements

Citing P2PNet

If you find P2PNet useful in your project, please consider citing us:

@inproceedings{song2021rethinking,
  title={Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework},
  author={Song, Qingyu and Wang, Changan and Jiang, Zhengkai and Wang, Yabiao and Tai, Ying and Wang, Chengjie and Li, Jilin and Huang, Feiyue and Wu, Yang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Related works from Tencent Youtu Lab