<div align="center"> <img src="./figure/logo.png" width = "100" align=center /> </div> <div align="center"> <h1>PointMamba</h1> <h3>A Simple State Space Model for Point Cloud Analysis</h3>

Dingkang Liang<sup>1</sup> *, Xin Zhou<sup>1</sup> *, Wei Xu<sup>1</sup>, Xingkui Zhu<sup>1</sup>, Zhikang Zou<sup>2</sup>, Xiaoqing Ye<sup>2</sup>, Xiao Tan<sup>2</sup> and Xiang Bai<sup>1†</sup>

<sup>1</sup> Huazhong University of Science & Technology, <sup>2</sup> Baidu Inc.

(\*) Equal contribution. (†) Corresponding author.


</div>

📣 News

Abstract

Transformers have become one of the foundational architectures in point cloud analysis tasks due to their excellent global modeling ability. However, the attention mechanism has quadratic complexity, making the design of a linear complexity method with global modeling appealing. In this paper, we propose PointMamba, transferring the success of Mamba, a recent representative state space model (SSM), from NLP to point cloud analysis tasks. Unlike traditional Transformers, PointMamba employs a linear complexity algorithm, presenting global modeling capacity while significantly reducing computational costs. Specifically, our method leverages space-filling curves for effective point tokenization and adopts an extremely simple, non-hierarchical Mamba encoder as the backbone. Comprehensive evaluations demonstrate that PointMamba achieves superior performance across multiple datasets while significantly reducing GPU memory usage and FLOPs. This work underscores the potential of SSMs in 3D vision-related tasks and presents a simple yet effective Mamba-based baseline for future research.
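To make the tokenization idea concrete, here is a minimal sketch of ordering points along a space-filling curve. It uses a Z-order (Morton) curve for simplicity; the actual serialization strategy in this repository may differ, and the function name and parameters below are illustrative only.

```python
import numpy as np

def morton_order(points, bits=10):
    """Order an (N, 3) point cloud along a Z-order (Morton) space-filling curve.

    Neighbors in 3D tend to stay adjacent in the resulting 1D sequence,
    which is what makes the serialized points suitable for an SSM encoder.
    """
    # Normalize into the unit cube, then quantize to a (2**bits)^3 grid.
    mins, maxs = points.min(axis=0), points.max(axis=0)
    grid = ((points - mins) / (maxs - mins + 1e-9) * (2**bits - 1)).astype(np.int64)

    # Interleave the bits of x, y, z into a single Morton code per point.
    codes = np.zeros(len(points), dtype=np.int64)
    for b in range(bits):
        for axis in range(3):
            codes |= ((grid[:, axis] >> b) & 1) << (3 * b + axis)

    # Sort points by their Morton code to obtain the 1D token order.
    return points[np.argsort(codes)]
```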

Overview

<div align="center"> <img src="./figure/pipeline.png" width = "888" align=center /> </div>

Main Results

<div align="center"> <img src="./figure/scanobj.png" width = "888" align=center /> </div>
| Task | Dataset | Config | Acc. (Scratch) | Download (Scratch) | Acc. (Pretrain) | Download (Finetune) |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Pre-training | ShapeNet | pretrain.yaml | N.A. | \ | \ | here |
| Classification | ModelNet40 | finetune_modelnet.yaml | 92.4% | here | 93.6% | here |
| Classification | ScanObjectNN | finetune_scan_objbg.yaml | 88.30% | here | 90.71% | here |
| Classification\* | ScanObjectNN | finetune_scan_objbg.yaml | \ | \ | 93.29% | here |
| Classification | ScanObjectNN | finetune_scan_objonly.yaml | 87.78% | here | 88.47% | here |
| Classification\* | ScanObjectNN | finetune_scan_objonly.yaml | \ | \ | 91.91% | here |
| Classification | ScanObjectNN | finetune_scan_hardest.yaml | 82.48% | here | 84.87% | here |
| Classification\* | ScanObjectNN | finetune_scan_hardest.yaml | \ | \ | 88.17% | here |
| Part Segmentation | ShapeNetPart | part segmentation | 85.8% mIoU | here | 86.0% mIoU | here |

\* indicates additionally using simple rotational augmentation during training.
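The simple rotational augmentation referenced above can be sketched as a random rotation about the gravity (z) axis. This is a minimal illustration only; the exact augmentation used for the \* results may differ.

```python
import numpy as np

def random_z_rotation(points, rng=None):
    """Rotate an (N, 3) point cloud by a random angle about the z-axis."""
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    # Rotation about z: x/y coordinates mix, z stays unchanged.
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T
```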

Getting Started

Datasets

See DATASET.md for details.

Usage

See USAGE.md for details.

To Do

Acknowledgement

This project is based on Point-BERT (paper, code), Point-MAE (paper, code), Mamba (paper, code), and Causal-Conv1d (code). Thanks for their wonderful work.

Citation

If you find this repository useful in your research, please consider giving it a star ⭐ and a citation:

@inproceedings{liang2024pointmamba,
      title={PointMamba: A Simple State Space Model for Point Cloud Analysis}, 
      author={Liang, Dingkang and Zhou, Xin and Xu, Wei and Zhu, Xingkui and Zou, Zhikang and Ye, Xiaoqing and Tan, Xiao and Bai, Xiang},
      booktitle={Advances in Neural Information Processing Systems},
      year={2024}
}