<div align="center"> <img src="assets/caption.png" width="1000"> <h3>MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection</h3>

Haoyang He<sup>1*</sup>, Yuhu Bai<sup>1*</sup>, Jiangning Zhang<sup>2</sup>, Qingdong He<sup>2</sup>, Hongxu Chen<sup>1</sup>

Zhenye Gan<sup>2</sup>, Chengjie Wang<sup>2</sup>, Xiangtai Li<sup>3</sup>, Guanzhong Tian<sup>1</sup>, Lei Xie<sup>1</sup>

<sup>1</sup>College of Control Science and Engineering, Zhejiang University, <sup>2</sup>Youtu Lab, Tencent, <sup>3</sup>Nanyang Technological University, Singapore

[Paper] [Project Page]

Our MambaAD is based on ADer.

</div>

Abstract

Recent advancements in anomaly detection have demonstrated the efficacy of CNN- and transformer-based approaches. However, CNNs struggle with long-range dependencies, while transformers are burdened by quadratic computational complexity. Mamba-based models, with their superior long-range modeling and linear efficiency, have garnered substantial attention. This study pioneers the application of Mamba to multi-class unsupervised anomaly detection, presenting MambaAD, which consists of a pre-trained encoder and a Mamba decoder featuring Locality-Enhanced State Space (LSS) modules at multiple scales. The proposed LSS module, integrating parallel cascaded Hybrid State Space (HSS) blocks and multi-kernel convolution operations, effectively captures both long-range and local information. The HSS block, utilizing Hybrid Scanning (HS) encoders, encodes feature maps using five scanning methods and eight directions, thereby strengthening global connections through the State Space Model (SSM). The use of Hilbert scanning and eight directions significantly improves feature sequence modeling. Comprehensive experiments on six diverse anomaly detection datasets and seven metrics demonstrate SoTA performance, substantiating the method's effectiveness.
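
To make the idea of Hilbert scanning concrete, here is a minimal, dependency-free sketch of how a square feature map could be flattened into a sequence along a Hilbert curve, so that neighbouring sequence elements are also spatial neighbours. It is an illustration only, not the HS encoder from this repository: the function names `hilbert_d2xy` and `hilbert_scan` are hypothetical, the map is assumed to be a `2**order x 2**order` nested list, and the five scanning methods and eight directions described in the abstract are not reproduced.

```python
def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the current sub-square so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(feature_map, order):
    """Flatten a square map (list of rows) into a list in Hilbert-curve order.

    Assumes len(feature_map) == len(feature_map[0]) == 2**order.
    """
    n = 1 << order
    coords = [hilbert_d2xy(order, d) for d in range(n * n)]
    return [feature_map[y][x] for x, y in coords]


if __name__ == "__main__":
    # Toy 4x4 "feature map" whose entries are their row-major indices.
    fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(hilbert_scan(fmap, order=2))
```

Reversing the returned list gives the same path traversed backwards, which is one simple way to obtain an additional scan direction from a single curve.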

Overview

<p align="center"> <img src="assets/mambaad.png" alt="accuracy" width="100%"> </p>

🛠️ Getting Started

Installation

📜 Multi-class Results on Popular AD Datasets

Subscripts I, R, and P represent image-level, region-level, and pixel-level, respectively.

MambaAD Results

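| Dataset | mAU-ROC<sub>I</sub> | mAP<sub>I</sub> | mF1-max<sub>I</sub> | mAU-ROC<sub>P</sub> | mAP<sub>P</sub> | mF1-max<sub>P</sub> | mAU-PRO<sub>R</sub> | <span style="color:blue">Download</span> |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MVTec-AD | 98.6 | 99.6 | 97.8 | 97.7 | 56.3 | 59.2 | 93.1 | log & weight |
| VisA | 94.3 | 94.5 | 89.4 | 98.5 | 39.4 | 44.0 | 91.0 | log & weight |
| Real-IAD | 86.3 | 84.6 | 77.0 | 98.5 | 33.0 | 38.7 | 90.5 | log & weight |
| Uni-Medical | 83.7 | 80.1 | 82.0 | 96.9 | 45.4 | 47.3 | 87.5 | log & weight |
| COCO-AD | 63.9 | 56.2 | 63.2 | 69.3 | 16.9 | 22.2 | 40.5 | log & weight |
| MVTec-3D | 86.2 | 95.8 | 92.8 | 98.6 | 37.5 | 41.1 | 93.6 | log & weight |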
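
For readers unfamiliar with the F1-max columns, the sketch below shows one common way such a score can be computed: sweep every threshold over the anomaly scores and keep the best F1. It is a plain-Python illustration under stated assumptions, not the evaluation code used to produce the table: `f1_max` and `mean_over_classes` are hypothetical helpers, labels are assumed to be 1 for anomalous and 0 for normal, and the `m` prefix is assumed to denote the mean over the categories of a dataset.

```python
def f1_max(scores, labels):
    """Best F1 over all thresholds, predicting 'anomalous' when score >= t.

    scores: anomaly scores (higher means more anomalous).
    labels: ground truth, 1 = anomalous, 0 = normal.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0  # F1 is undefined without positives; return 0 by convention
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    best = 0.0
    tp = fp = 0
    i = 0
    while i < len(pairs):
        # Consume every sample tied at this score before evaluating F1,
        # so that each distinct score acts as exactly one threshold.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        fn = total_pos - tp
        best = max(best, 2 * tp / (2 * tp + fp + fn))
        i = j
    return best


def mean_over_classes(per_class_values):
    """Average a per-category metric (assumed meaning of the 'm' prefix)."""
    return sum(per_class_values) / len(per_class_values)


if __name__ == "__main__":
    class_a = f1_max([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
    class_b = f1_max([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
    print(class_a, class_b, mean_over_classes([class_a, class_b]))
```

The same function applies at image level (one score per image) or pixel level (one score per pixel, flattened), which is what the I and P subscripts distinguish.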

Citation

If you find this code useful, don't forget to star the repo and cite the paper:

@article{he2024mambaad,
      title={MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection}, 
      author={Haoyang He and Yuhu Bai and Jiangning Zhang and Qingdong He and Hongxu Chen and Zhenye Gan and Chengjie Wang and Xiangtai Li and Guanzhong Tian and Lei Xie},
      journal={arXiv preprint arXiv:2404.06564},
      year={2024},
}

Acknowledgements

We thank the authors of ADer and VMamba, whose work supported our research.