SPANet Official (ongoing)

<p align="left"> <a href="https://arxiv.org/abs/2308.11568" alt="arXiv"> <img src="https://img.shields.io/badge/arXiv-2308.11568-b31b1b.svg?style=flat" /></a> <a href="https://openaccess.thecvf.com/content/ICCV2023/html/Yun_SPANet_Frequency-balancing_Token_Mixer_using_Spectral_Pooling_Aggregation_Modulation_ICCV_2023_paper.html" alt="ICCV 2023"> <img src="https://img.shields.io/badge/ICCV_2023-open_access-blue" /></a> <a href="https://doranlyong.github.io/projects/spanet/"> <img src="https://img.shields.io/badge/project-page-blue"></a> </p>

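💬 This repo is the official implementation of:

- SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation (ICCV 2023)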

🤖 It currently includes code and models for the following tasks:

- Image classification
- Object detection and instance segmentation
- Semantic segmentation

📖 Introduction

SPANet is a backbone network that balances the high- and low-frequency components of visual features to obtain better feature representations.
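To make "high- and low-frequency components" concrete, here is a minimal NumPy sketch that splits a 2D feature map into a low-frequency part and a high-frequency residual using a Fourier-domain mask. It is an illustration only and is not the SPANet token mixer: the function name `split_frequencies` and the `cutoff` parameter are hypothetical and do not come from this repository.

```python
import numpy as np


def split_frequencies(x, cutoff=0.25):
    """Split a real 2D map into low- and high-frequency parts.

    `cutoff` is a hypothetical threshold in cycles per sample (0 to 0.5);
    it is an assumption for this sketch, not a SPANet hyperparameter.
    """
    h, w = x.shape
    # Absolute frequency of every FFT bin along each axis.
    fy = np.abs(np.fft.fftfreq(h))[:, None]
    fx = np.abs(np.fft.fftfreq(w))[None, :]
    # Keep only bins whose frequency is at or below the cutoff on both axes.
    mask = (fy <= cutoff) & (fx <= cutoff)
    # The mask is symmetric in frequency, so the result is real up to rounding.
    low = np.fft.ifft2(np.fft.fft2(x) * mask).real
    # The high-frequency part is whatever the low-pass filter removed.
    high = x - low
    return low, high


rng = np.random.default_rng(0)
feature = rng.standard_normal((8, 8))
low, high = split_frequencies(feature)
print(np.allclose(low + high, feature))  # the two parts sum back to the input
```

A smooth (for example, constant) map ends up almost entirely in `low`, while a pixel-level checkerboard ends up in `high`. A frequency-balancing mixer would then reweight the two bands, which this sketch does not attempt.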

Main results on ImageNet-1K

Please see image_classification for more details.

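| Model | Pretrain | Resolution | Top-1 (%) | #Params | FLOPs |
| --- | --- | --- | --- | --- | --- |
| SPANet-S | ImageNet-1K | 224x224 | 83.1 | 28.7M | 4.6G |
| SPANet-M | ImageNet-1K | 224x224 | 83.5 | 41.8M | 6.8G |
| SPANet-MX | ImageNet-1K | 224x224 | 83.8 | 54.9M | 9.0G |
| SPANet-B | ImageNet-1K | 224x224 | 84.0 | 75.9M | 12.0G |
| SPANet-BX | ImageNet-1K | 224x224 | 84.4 | 99.8M | 15.8G |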

Main results on COCO object detection and instance segmentation

Please see object_detection for more details.

RetinaNet 1x

| Backbone | LR schedule | Box mAP | #Params |
| --- | --- | --- | --- |
| SPANet-S | 1x | 43.3 | 38M |
| SPANet-M | 1x | 44.0 | 51M |

Mask R-CNN 1x

| Backbone | LR schedule | Box mAP | Mask mAP | #Params |
| --- | --- | --- | --- | --- |
| SPANet-S | 1x | 44.7 | 40.6 | 48M |
| SPANet-M | 1x | 45.2 | 41.0 | 61M |

Main results on ADE20K semantic segmentation

Please see semantic_segmentation for more details.

Semantic FPN

| Backbone | LR schedule | mIoU (%) | #Params | FLOPs |
| --- | --- | --- | --- | --- |
| SPANet-S | 80K | 45.4 | 32M | 46G |
| SPANet-M | 80K | 46.2 | 45M | 57G |

⭐ Cite SPANet

If you find this repository useful, please consider giving it a star and citing the paper with the following BibTeX entry.

@inproceedings{yun2023spanet,
  title={SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation},
  author={Yun, Guhnoo and Yoo, Juhan and Kim, Kijung and Lee, Jeongho and Kim, Dong Hwan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6113--6124},
  year={2023}
}

License

This project is released under the MIT license. Please see the LICENSE file for more information.