ECCV 2024: Rethinking Features-Fused-Pyramid-Neck for Object Detection

English | Simplified Chinese | Paper PDF

(I would like to call it Slim Neck V2. V1 is the Slim Neck by GSConv. 😀)

Balltze's birthday is November 6th. We plan to release the code on November 16th, ten days after its birthday.

A little easter egg: Cheems (Balltze). At the beginning of 2023, when I was reflecting on the "feature fusion" paradigm and planning in-depth research, Cheems started appearing frequently on my social media. I really liked it, and every time I saw it, I felt a surge of joy. However, it left this world before I could finish this paper. To commemorate it, I included its most memorable image in the main illustration of my paper. I am grateful for cute little animals like Cheems, who heal our hearts.

<p align="center"> <img src="https://github.com/AlanLi1997/rethinking-fpn/blob/main/figs/sni.png" alt="" width="500" /> </p>

Abstract<br /> Multi-head detectors typically employ a features-fused-pyramid-neck for multi-scale detection and are widely adopted in the industry. However, this approach faces feature misalignment when representations from different hierarchical levels of the feature pyramid are forcibly fused point-to-point. To address this issue, we designed an independent hierarchy pyramid (IHP) architecture to evaluate the effectiveness of the features-unfused-pyramid-neck for multi-head detectors. Subsequently, we introduced soft nearest neighbor interpolation (SNI) with a weight-downscaling factor to mitigate the impact of feature fusion at different hierarchies while preserving key textures. Furthermore, we present a feature adaptive selection method for downsampling in extended spatial windows (ESD) to retain spatial features and enhance lightweight convolutional techniques (GSConvE). These advancements culminate in our secondary features alignment solution (SA) for real-time detection, achieving state-of-the-art results on Pascal VOC and MS COCO.
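The abstract describes SNI as nearest neighbor interpolation combined with a weight-downscaling factor. A minimal pure-Python sketch of that idea on a single 2D feature map follows. The function names and the default factor `1 / scale**2` are assumptions of this sketch and are not taken from the repository; the paper defines the actual factor.

```python
from typing import List, Optional

Grid = List[List[float]]


def nearest_upsample(x: Grid, scale: int) -> Grid:
    """Nearest neighbor upsampling: repeat each value scale x scale times."""
    out: Grid = []
    for row in x:
        # Repeat every value `scale` times along the width.
        wide = [v for v in row for _ in range(scale)]
        # Repeat the widened row `scale` times along the height.
        out.extend(list(wide) for _ in range(scale))
    return out


def soft_nearest_upsample(x: Grid, scale: int = 2,
                          alpha: Optional[float] = None) -> Grid:
    """Upsample with nearest neighbor, then scale the result down by alpha.

    Assumption of this sketch: alpha defaults to 1 / scale**2, so the sum of
    the upsampled map equals the sum of the input map.
    """
    if alpha is None:
        alpha = 1.0 / (scale * scale)
    return [[alpha * v for v in row] for row in nearest_upsample(x, scale)]


if __name__ == "__main__":
    print(soft_nearest_upsample([[4.0, 8.0]], scale=2))
    # [[1.0, 1.0, 2.0, 2.0], [1.0, 1.0, 2.0, 2.0]]
```

With this default, the upsampled low-resolution map contributes less per pixel when it is later fused with a higher-resolution map, which is one way to read "mitigate the impact of feature fusion" in the abstract.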

Prepare datasets & environment

1. datasets

├── rethinking-fpn
│   ├── datasets
│   │   ├── coco
│   │   │   ├──images
│   │   │   │  ├──1.jpg
│   │   │   │  ├──...
│   │   │   ├──labels
│   │   │   │  ├──1.txt
│   │   │   │  ├──...
│   │   ├──VOC...
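Before training, it can help to confirm that the `images` and `labels` folders in the layout above line up. The following is a rough sketch, assuming that an image and its label share the same file stem (as `1.jpg` and `1.txt` do above). The helper name `find_unpaired` and the accepted image suffixes are hypothetical and not part of this repository.

```python
from pathlib import Path
from typing import List, Tuple

# Assumption: only .jpg appears in the layout above; the other suffixes are a guess.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_unpaired(dataset_dir: str) -> Tuple[List[str], List[str]]:
    """Return (image stems without a label file, label stems without an image)."""
    root = Path(dataset_dir)
    images = {p.stem for p in (root / "images").glob("*")
              if p.suffix.lower() in IMAGE_SUFFIXES}
    labels = {p.stem for p in (root / "labels").glob("*.txt")}
    return sorted(images - labels), sorted(labels - images)


if __name__ == "__main__":
    no_label, no_image = find_unpaired("datasets/coco")
    print(f"images without labels: {len(no_label)}")
    print(f"labels without images: {len(no_image)}")
```

An image without a label file is not necessarily an error, since it may be an intentional background sample; a label without an image usually is.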

2. environment

pip install -r requirements.txt

Tested working environment:<br /> python==3.8.16<br /> pytorch==1.12.0(py3.8_cuda11.3_cudnn8.3.2_0)<br /> torchvision==0.13.0(py38_cu113)<br />
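To compare a local setup against the tested versions listed above, a small check along these lines may be useful. It is only a sketch: the helper names are hypothetical, and it reads installed versions through the standard `importlib.metadata` module (PyTorch is distributed under the package name `torch`). A mismatch does not necessarily mean the code will fail.

```python
import platform
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions copied from the tested environment listed above.
TESTED_PYTHON = "3.8.16"
TESTED_PACKAGES = {"torch": "1.12.0", "torchvision": "0.13.0"}


def base_version(version: str) -> str:
    """Drop a local build tag such as '+cu113' from a version string."""
    return version.split("+")[0]


def compare_environment(
    tested: Dict[str, str] = TESTED_PACKAGES,
) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Map each package to (tested version, installed version or None, match)."""
    report = {}
    for name, wanted in tested.items():
        try:
            installed: Optional[str] = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        matches = installed is not None and base_version(installed) == wanted
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    print(f"python: tested {TESTED_PYTHON}, running {platform.python_version()}")
    for name, (wanted, installed, ok) in compare_environment().items():
        status = "match" if ok else "differs"
        print(f"{name}: tested {wanted}, installed {installed} ({status})")
```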

Train_sn2yolo models

python ./slimneck_v2/for_yolo/sn2-yolov5-v8/train_sn2yolo.py

Train_sn2fpn(R-CNN) models

python train_sn2fpn.py

Validation_sn2yolo models

python ./slimneck_v2/for_yolo/sn2-yolov5-v8/val_sn2yolo.py

References

Citation

@inproceedings{re-fpn,<br /> title={Rethinking Features-Fused-Pyramid-Neck for Object Detection},<br /> author={Li, Hulin},<br /> editor={Leonardis, A. and Ricci, E. and Roth, S. and Russakovsky, O. and Sattler, T. and Varol, G.},<br /> booktitle={Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15125.},<br /> pages={74-90},<br /> year={2024},<br /> publisher={Springer, Cham.}, <br /> doi={10.1007/978-3-031-72855-6_5}, <br /> }