Awesome-Vision-Mamba-Models

License: MIT

[NEWS.2024/04/29] Our paper is released!

[NEWS.2024/05/02] 🎉🎉🎉Congratulations to Vision Mamba on being accepted to ICML 2024.

[NEWS.2024/07/06] The updated version of our paper is now available!

[NEWS.2024/09/26] 🎉🎉🎉Congratulations to VMamba on being accepted to NeurIPS 2024.

📢NOTE: If you have any questions, please don't hesitate to contact us at any of the following emails: cseruixu@ust.hk, syangcw@connect.ust.hk, ywangrm@connect.ust.hk, yu.cai@connect.ust.hk.

Mamba, a novel state space model, has gained recognition across diverse domains for its strong performance and linear-time scaling with sequence length. By addressing limitations inherent in traditional visual foundation architectures, Mamba has emerged as a promising candidate for advancing computer vision.

:star: This repository hosts a curated collection of literature associated with Mamba models in computer vision. Feel free to star and fork. For further details, refer to the following paper:

Visual Mamba: A Survey and New Outlooks<br/> Rui Xu, Shu Yang, Yihui Wang, Yu Cai, Bo Du, Hao Chen<br/> SMART Lab, The Hong Kong University of Science and Technology

If you find this repository useful, please cite our paper:

@misc{2024visual_mamba,
      title={Visual Mamba: A Survey and New Outlooks}, 
      author={Rui Xu and Shu Yang and Yihui Wang and Yu Cai and Bo Du and Hao Chen},
      year={2024},
      eprint={2404.18861},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contents

Mamba

| Date | Paper | Figure | Link | Code |
| --- | --- | --- | --- | --- |
| Arxiv 23.12.01 (COLM 2024) | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | image image | Link | Code |
| Arxiv 24.05.31 (ICML 2024) | Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality | image image | Link | Code |

Related Survey

| Date | Paper | Link |
| --- | --- | --- |
| Arxiv 24.04.15 | State Space Model for New-Generation Network Alternative to Transformers: A Survey | Link |
| Arxiv 24.04.24 | A Survey on Visual Mamba | Link |
| Arxiv 24.04.24 | Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges | Link |
| Arxiv 24.05.07 | Vision Mamba: A Comprehensive Survey and Taxonomy | Link |
| Arxiv 24.06.05 | Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis | Link |
| Arxiv 24.06.24 | Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba | Link |
| Arxiv 24.08.02 | A Survey of Mamba | Link |
| Arxiv 24.10.03 | A Comprehensive Survey of Mamba Architectures for Medical Image Analysis: Classification, Segmentation, Restoration and Beyond | Link |
| Arxiv 24.10.04 | Mamba in Vision: A Comprehensive Survey of Techniques and Applications | Link |

Visual Mamba Backbone Networks

<img width="600" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/57466105/4843bead-14cd-4aa6-aecf-af9411defc49">

Detailed Performance Comparison

| Date | Paper | Figure | Link | Code |
| --- | --- | --- | --- | --- |
| Arxiv 24.01.17 (ICML 2024) | Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | <img width="684" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/6d32c807-3d2f-457e-8927-fa4bbe595064"> | Link | Code |
| Arxiv 24.01.18 (NeurIPS 2024) | VMamba: Visual State Space Model | <img width="806" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/039e24f6-5f89-4772-bb84-7409aeef4da0"> <img width="833" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/75158bbf-18e9-45fc-93e0-7d84c062ed0d"> | Link | Code |
| Arxiv 24.02.08 (ECCV 2024 Oral) | Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data | <img width="712" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/0ee52771-63ec-4e1d-bd24-aff9fe83c8e6"> | Link | Code |
| Arxiv 24.03.14 | LocalMamba: Visual State Space Model with Windowed Selective Scan | <img width="710" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/1c2bcfb8-72d0-4f33-b561-f926952455ff"> | Link | Code |
| Arxiv 24.03.15 | EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba | <img width="719" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/7e034c04-3359-456e-a2b3-720b4b37e975"> | Link | Code |
| Arxiv 24.03.22 | SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series | <img width="622" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/bef43ea0-0d1e-4c2f-93e1-231b41394195"> | Link | Code |
| Arxiv 24.03.26 (BMVC 2024) | PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | <img width="713" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/d1170f9a-b9b2-4c4d-ab44-cd6c52a07c8d"> | Link | Code |
| Arxiv 24.05.23 (NeurIPS 2024) | Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model | image image | Link | Code |
| Arxiv 24.05.23 | Scalable Visual State Space Model with Fractal Scanning | image | Link | |
| Arxiv 24.05.23 | Mamba-R: Vision Mamba ALSO Needs Registers | image | Link | Code |
| Arxiv 24.05.29 | Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain | image | Link | Code |
| Arxiv 24.06.11 | Autoregressive Pretraining with Mamba in Vision | image image | Link | Code |
| Arxiv 24.07.10 | MambaVision: A Hybrid Mamba-Transformer Vision Backbone | image | Link | Code |
| Arxiv 24.07.18 | GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model | image | Link | Code |
| Arxiv 24.07.26 | VSSD: Vision Mamba with Non-Causal State Space Duality | image image | Link | Code |
| Arxiv 24.08.30 | Stochastic Layer-Wise Shuffle: A Good Practice to Improve Vision Mamba Training | image | Link | Code |
| Arxiv 24.09.15 | SparX: A Sparse Cross-Layer Connection Mechanism for Hierarchical Vision Mamba and Transformer Networks | image image | Link | Code |
| Arxiv 24.09.18 | Distillation-free Scaling of Large SSMs for Images and Videos | image | Link | |
| Arxiv 24.09.27 (NeurIPS 2024) | Exploring Token Pruning in Vision State Space Models | image | Link | |
| Arxiv 24.10.01 | MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining | image | Link | |
| Arxiv 24.10.04 | HRVMamba: High-Resolution Visual State Space Model for Dense Prediction | image | Link | Code |
| Arxiv 24.10.09 (NeurIPS 2024) | QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model | image image | Link | Code |
| Arxiv 24.10.14 | GlobalMamba: Global Image Serialization for Vision Mamba | image image | Link | Code |
| Arxiv 24.10.14 | V2M: Visual 2-Dimensional Mamba for Image Representation Learning | image | Link | Code |
| Arxiv 24.10.19 | Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion | image image | Link | Code |

Vision Application

Image

Natural Image

| Date | Paper | Figure | Link | Code | Task |
| --- | --- | --- | --- | --- | --- |
| Arxiv 24.02.06 | U-shaped Vision Mamba for Single Image Dehazing | <img width="848" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/3ca0831b-711c-4073-841e-2eba4f2e718d"> | Link | Code | Dehazing/Low Light Enhancement/Deraining |
| Arxiv 24.02.08 | Scalable Diffusion Models with State Space Backbone | <img width="588" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/9d900e4b-4c3c-427a-a857-681a3f3470dd"> | Link | Code | Image Generation |
| Arxiv 24.02.23 (ECCV 2024) | MambaIR: A Simple Baseline for Image Restoration with State-Space Model | <img width="708" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/22041ebc-cae7-4e72-a537-a7af3429b6d8"> | Link | Code | Super-resolution/Denoising |
| Arxiv 24.03.04 | MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection | <img width="847" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/c38ffac7-65b7-452c-b0ec-c3a17f8de860"> | Link | Code | Infrared Image Segmentation |
| Arxiv 24.03.13 | Activating Wider Areas in Image Super-Resolution | <img width="700" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/dfaf5b9a-19e0-4058-a4aa-a9af26df6334"> | Link | | Super-resolution |
| Arxiv 24.03.18 | VmambaIR: Visual State Space Model for Image Restoration | <img width="485" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/1126cd48-1c85-4c09-883f-c4a50b922fd0"> | Link | Code | Image Restoration |
| Arxiv 24.03.20 (ECCV 2024) | ZigMa: A DiT-style Zigzag Mamba Diffusion Model | <img width="702" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/3a14b8da-188b-4c00-a054-c4cb47562f9e"> | Link | Code | Generation |
| Arxiv 24.03.27 | Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction | <img width="564" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/361b14b9-6291-47d1-b8e0-ae667db5aa22"> | Link | | 3D Reconstruction |
| Arxiv 24.03.29 | Learning Enriched Features via Selective State Spaces Model for Efficient Image Deblurring | <img width="730" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/a84a1311-1ed4-4c51-828c-94b9e5b95578"> | Link | | Image Deblurring |
| Arxiv 24.04.04 | InsectMamba: Insect Pest Classification with State Space Model | <img width="554" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/2bb11b9a-c952-4f12-afd9-1ba2bce3ce9c"> | Link | | Image Classification |
| Arxiv 24.04.09 (NeurIPS 2024) | MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection | <img width="793" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/0d601311-b0bf-48e1-b0d6-17ee9c3101d0"> | Link | Code | Anomaly Detection |
| Arxiv 24.04.11 (ACM MM 2024) | DGMamba: Domain Generalization via Generalized State Space Model | <img width="720" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/50d5a8bb-d701-40a1-9f17-52a7d9c96221"> | Link | Code | Domain Generalization |
| Arxiv 24.04.15 (ACM MM 2024) | FreqMamba: Viewing Mamba from a Frequency Perspective for Image Deraining | <img width="798" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/430aabed-9c3f-40f5-b062-d748829d20fa"> | Link | Code | Deraining |
| Arxiv 24.04.17 | CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration | <img width="1102" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/d4c0ac33-8541-4cf0-84aa-bd5b0958516f"> | Link | | Denoising/Deblurring |
| Arxiv 24.04.22 | MambaUIE: Unraveling the Ocean's Secrets with Only 2.8 FLOPs | <img width="687" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/7e84a297-ea0f-4e27-b9cc-cdb36b2b0e6f"> | Link | Code | Image Enhancement |
| Arxiv 24.05.03 | FER-YOLO-Mamba: Facial Expression Detection and Classification Based on Selective State Space | image | Link | Code | Emotion recognition & Facial Expression Recognition & Detection |
| Arxiv 24.05.05 (CVPR 2024 Workshop) | DVMSR: Distillated Vision Mamba for Efficient Super-Resolution | image | Link | Code | Super-Resolution |
| Arxiv 24.05.05 | SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion | image | Link | | Motion Style Transfer |
| Arxiv 24.05.06 | Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement | image | Link | Code | Image Enhancement |
| Arxiv 24.05.07 | VMambaCC: A Visual State Space Model for Crowd Counting | image | Link | | Crowd Counting |
| Arxiv 24.05.14 | WaterMamba: Visual State Space Model for Underwater Image Enhancement | <img width="551" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/2307c102-8490-44d7-bf66-add22160739d"> | Link | | Image Enhancement |
| Arxiv 24.05.16 | IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model | image | Link | Code | Infrared Image Super-resolution |
| Arxiv 24.05.23 | Efficient Visual State Space Model for Image Deblurring | <img width="540" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/a1f85d94-136f-432b-ba51-de322b500539"> | Link | Code | Image Deblurring |
| Arxiv 24.05.23 | DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis | image | Link | Code | Generation |
| Arxiv 24.05.25 | Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation | <img width="548" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/e904635f-26e9-4690-b215-f9ea741bb354"> | Link | | Generation |
| Arxiv 24.05.25 (NeurIPS 2024) | MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space | image | Link | Code | Image Enhancement |
| Arxiv 24.05.26 | Image Deraining with Frequency-Enhanced State Space Model | <img width="439" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/8fd63341-7281-4911-a506-a814f5fb56df"> | Link | | Image Deraining |
| Arxiv 24.05.28 | MambaVC: Learned Visual Compression with Selective State Spaces | <img width="533" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/9591e9ef-7984-4851-bf09-dee72676c5e4"> | Link | Code | Visual Compression |
| Arxiv 24.05.29 | FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining | image | Link | | Image Deraining |
| Arxiv 24.06.03 | LLEMamba: Low-Light Enhancement via Relighting-Guided Mamba with Deep Unfolding Network | image | Link | | Low-Light Enhancement |
| Arxiv 24.06.06 | MambaDepth: Enhancing Long-range Dependency for Self-Supervised Fine-Structured Monocular Depth Estimation | image | Link | | Depth Estimation |
| Arxiv 24.06.09 | Mamba YOLO: SSMs-Based YOLO For Object Detection | image | Link | Code | Object Detection |
| Arxiv 24.06.12 | PixMamba: Leveraging State Space Models in a Dual-Level Architecture for Underwater Image Enhancement | image | Link | Code | Image Enhancement |
| Arxiv 24.06.18 | LFMamba: Light Field Image Super-Resolution with State Space Model | image | Link | Code | Super-resolution |
| Arxiv 24.06.13 | Q-Mamba: On First Exploration of Vision Mamba for Image Quality Assessment | image | Link | | Image Quality Assessment |
| Arxiv 24.06.23 | Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning | image | Link | | Super-resolution |
| Arxiv 24.06.24 | Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces | image | Link | | Crack Segmentation |
| Arxiv 24.06.25 (WACV 2025) | SUM: Saliency Unification through Mamba for Visual Attention Modeling | image | Link | Code | Visual Saliency Prediction |
| Arxiv 24.07.02 (ECCV 2024) | MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders | image | Link | Code | Multi-Task Dense Scene Understanding |
| Arxiv 24.07.08 | Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning | image | Link | Code | Few-Shot Class-Incremental Learning |
| Arxiv 24.07.11 (ICML 2024 Workshop) | Parallelizing Autoregressive Generation with Variational State Space Models | image | Link | | Generation |
| Arxiv 24.07.12 (NeurIPS 2024) | Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba | image | Link | Code | 3D Hand Reconstruction |
| Arxiv 24.07.16 | PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer | image | Link | | Image Classification/Object Detection/Point Cloud Object Detection |
| Arxiv 24.07.22 | Mamba meets crack segmentation | image | Link | Code | Segmentation |
| Arxiv 24.07.23 | MxT: Mamba x Transformer for Image Inpainting | image | Link | | Image Inpainting |
| Arxiv 24.07.25 | ALMRR: Anomaly Localization Mamba on Industrial Textured Surface with Feature Reconstruction and Refinement | image | Link | Code | Anomaly Localization |
| Arxiv 24.07.27 | Mamba-UIE: Enhancing Underwater Images with Physical Model Constraint | image | Link | Code | Image Enhancement |
| Arxiv 24.07.27 (WBIR 2024 Workshop) | Mamba? Catch The Hype Or Rethink What Really Helps for Image Registration | image | Link | Code | Image Registration |
| Arxiv 24.08.01 | MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection | image | Link | | Monocular 3D Object Detection |
| Arxiv 24.08.02 (ACM MM 2024) | Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement | image | Link | Code | Image Enhancement |
| Arxiv 24.08.04 | DeMansia: Mamba Never Forgets Any Tokens | | Link | Code | Classification |
| Arxiv 24.08.05 | LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba | image | Link | | Generation |
| Arxiv 24.08.06 | Pose Magic: Efficient and Temporally Consistent Human Pose Estimation with a Hybrid Mamba-GCN Network | image | Link | | Human Pose Estimation |
| Arxiv 24.08.07 | PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model | image | Link | | Human Pose Estimation |
| Arxiv 24.08.11 | Neural Architecture Search based Global-local Vision Mamba for Palm-Vein Recognition | image | Link | | Palm-Vein Recognition |
| Arxiv 24.08.16 | QMambaBSR: Burst Image Super-Resolution with Query State Space Model | image | Link | | Super-Resolution |
| Arxiv 24.08.19 | Multi-Scale Representation Learning for Image Restoration with State-Space Model | image | Link | | Image Restoration |
| Arxiv 24.08.21 | MambaOcc: Visual State Space Model for BEV-based Occupancy Prediction with Local Adaptive Reordering | image | Link | Code | Occupancy Prediction |
| Arxiv 24.08.21 | MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs | image | Link | Code | Super-resolution |
| Arxiv 24.08.22 | Scalable Autoregressive Image Generation with Mamba | image | Link | Code | Generation |
| Arxiv 24.08.23 | O-Mamba: O-shape State-Space Model for Underwater Image Enhancement | image | Link | Code | Image Enhancement |
| Arxiv 24.08.27 | ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning | image | Link | Code | Zero-Shot Learning |
| Arxiv 24.08.27 | MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders | image | Link | Code | Multi-Task Dense Scene Understanding |
| Arxiv 24.08.31 | A Hybrid Transformer-Mamba Network for Single Image Deraining | image | Link | Code | Deraining |
| Arxiv 24.09.02 (ICPR) | DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios | image | Link | | Object Detection |
| Arxiv 24.09.09 | DSDFormer: An Innovative Transformer-Mamba Framework for Robust High-Precision Driver Distraction Identification | image | Link | | Driver Distraction Identification |
| Arxiv 24.09.11 | Retinex-RAWMamba: Bridging Demosaicing and Denoising for Low-Light RAW Image Enhancement | image | Link | Code | Image Enhancement |
| Arxiv 24.09.15 (ECCV 2024 Workshop) | Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion | image | Link | Code | Efficiency |
| Arxiv 24.09.16 | Mamba-ST: State Space Model for Efficient Style Transfer | image | Link | Code | Style Transfer |
| Arxiv 24.09.20 (ACCV 2024) | OneBEV: Using One Panoramic Image for Bird's-Eye-View Semantic Mapping | image | Link | Code | Bird's-Eye-View Semantic Mapping |
| Arxiv 24.09.25 | Semi-LLIE: Semi-supervised Contrastive Learning with Mamba-based Low-light Image Enhancement | image | Link | Code | Image Enhancement |
| Neurocomputing 24.09.28 | MambaTSR: You only need 90k parameters for traffic sign recognition | image | Link | Code | Traffic Sign Recognition |
| Arxiv 24.09.29 (NeurIPS 2024) | Hybrid Mamba for Few-Shot Segmentation | image | Link | Code | Few-Shot Segmentation |
| Arxiv 24.10.05 | Mamba Capsule Routing Towards Part-Whole Relational Camouflaged Object Detection | image | Link | Code | Camouflaged Object Detection |
| Arxiv 24.10.14 | Hi-Mamba: Hierarchical Mamba for Efficient Image Super-Resolution | image | Link | | Super-resolution |
| Arxiv 24.10.16 | MambaBEV: An efficient 3D detection model with Mamba2 | image | Link | | 3D Object Detection |
| Arxiv 24.10.21 (NeurIPS 2024) | START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation | image | Link | Code | Domain Generalization |
| Arxiv 24.10.25 (JAC 2024) | Topology-aware Mamba for Crack Segmentation in Structures | image | Link | Code | Crack Segmentation |
| Arxiv 24.10.27 (ACCV 2024) | Wavelet-based Mamba with Fourier Adjustment for Low-light Image Enhancement | image | Link | Code | Image Enhancement |
| Arxiv 24.10.28 (NeurIPS 2024) | ECMamba: Consolidating Selective State Space Model with Retinex Guidance for Efficient Multiple Exposure Correction | image | Link | Code | Multiple Exposure Correction |
| ACM MM 24.10.28 | Realistic Full-Body Motion Generation from Sparse Tracking with State Space Model | image | Link | | Motion Generation |
| Arxiv 24.10.30 | Adaptive Multi Scale Document Binarisation Using Vision Mamba | image | Link | | Document Binarisation |

Remote Sensing Image

| Date | Paper | Figure | Link | Code | Task |
| --- | --- | --- | --- | --- | --- |
| Arxiv 24.03.28 (GRSL 2024) | RSMamba: Remote Sensing Image Classification with State Space Model | image | Link | Code | Remote Sensing Images Classification |
| Arxiv 24.04.02 | Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model | <img width="402" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/c64ee6cc-ced1-4d27-b1af-27582f089fb0"> | Link | Code | Semantic Segmentation |
| Arxiv 24.04.03 (GRSL 2024) | RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation | <img width="502" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/1767d964-15ea-4085-a1f0-937cba3cf915"> | Link | Code | Semantic Segmentation |
| Arxiv 24.04.03 | RS-Mamba for Large Remote Sensing Image Dense Prediction | <img width="942" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/4c47c3b7-5df8-4d77-93ba-d35263916f03"> | Link | Code | Semantic Segmentation/Change Detection |
| Arxiv 24.04.04 (TGRS 2024) | ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model | <img width="1023" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/476bf8d5-625e-4e65-ad19-bc14817c9a58"> | Link | Code | Change Detection/Building Damage Assessment |
| Arxiv 24.04.12 | SpectralMamba: Efficient Mamba for Hyperspectral Image Classification | image | Link | Code | Hyperspectral Image Classification |
| Arxiv 24.04.15 | HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising | <img width="947" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/91ebd913-ef36-400e-a83c-8d24fc5536b3"> | Link | | Hyperspectral Denoising |
| Arxiv 24.04.28 | S2Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification | image | Link | Code | Hyperspectral Image Classification |
| Arxiv 24.04.29 | Spectral-Spatial Mamba for Hyperspectral Image Classification | image | Link | | Hyperspectral Image Classification |
| Arxiv 24.05.02 | SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients | image | Link | Code | Detection |
| Arxiv 24.05.02 (TGRS 2024) | SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising | image | Link | Code | Hyperspectral Image Denoising |
| Arxiv 24.05.08 (TMM 2024) | Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution | <img width="745" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/51b4b08f-8086-4e57-8006-3e9ba06ff205"> | Link | Code | Super-resolution |
| Arxiv 24.05.13 | GMSR: Gradient-Guided Mamba for Spectral Reconstruction from RGB Images | image | Link | Code | Spectral Reconstruction from RGB Images |
| Arxiv 24.05.14 | Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study | image | Link | | Semantic Segmentation |
| Arxiv 24.05.16 | RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing | image | Link | | Dehazing |
| Arxiv 24.05.17 | CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation | image | Link | Code | Semantic Segmentation |
| Arxiv 24.05.20 | Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification | image | Link | Code | Hyperspectral Image Classification |
| Arxiv 24.05.21 | 3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification | image | Link | | Hyperspectral Image Classification |
| Arxiv 24.06.01 | Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging | image | Link | Code | Spectral Compressive Imaging |
| Arxiv 24.06.06 | CDMamba: Remote Sensing Image Change Detection with Mamba | image | Link | Code | Change Detection |
| Arxiv 24.06.09 | HDMba: Hyperspectral Remote Sensing Imagery Dehazing with State Space Model | image | Link | Code | Dehazing |
| Arxiv 24.06.11 | DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification | image | Link | | Hyperspectral Image Classification |
| Arxiv 24.06.16 | PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery | image | Link | Code | Semantic Segmentation |
| Arxiv 24.07.08 | A Mamba-based Siamese Network for Remote Sensing Change Detection | image | Link | Code | Change Detection |
| Arxiv 24.07.09 | HTD-Mamba: Efficient Hyperspectral Target Detection with Pyramid State Space Model | image | Link | Code | Hyperspectral Target Detection |
| Arxiv 24.07.11 | DMM: Disparity-guided Multispectral Mamba for Oriented Object Detection in Remote Sensing | image | Link | Code | Oriented Object Detection |
| Arxiv 24.07.11 | GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification | image | Link | Code | Hyperspectral Image Classification |
| TGRS 24.07.19 | MambaHSI: Spatial–Spectral Mamba for Hyperspectral Image Classification | image | Link | Code | Hyperspectral Image Classification |
| Arxiv 24.08.01 | Empowering Snapshot Compressive Imaging: Spatial-Spectral State Space Model with Across-Scanning and Local Enhancement | image | Link | | Snapshot Compressive Imaging |
| Arxiv 24.08.02 | Multi-head Spatial-Spectral Mamba for Hyperspectral Image Classification | image | Link | Code | Hyperspectral Image Classification |
| Arxiv 24.08.02 | WaveMamba: Spatial-Spectral Wavelet Mamba for Hyperspectral Image Classification | image | Link | Code | Hyperspectral Image Classification |
| Arxiv 24.08.02 | Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification | image | Link | Code | Hyperspectral Image Classification |
| Arxiv 24.08.21 | UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images | image | Link | Code | Semantic Segmentation |
| Arxiv 24.08.26 | MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification | image | Link | Code | Hyperspectral Image Classification |
| GRSL 24.09.02 | MambaFormerSR: A Lightweight model for Remote-Sensing Image Super-Resolution | image | Link | | Super-resolution |
| Arxiv 24.09.05 | UV-Mamba: A DCN-Enhanced State Space Model for Urban Village Boundary Identification in High-Resolution Remote Sensing Images | image | Link | | Segmentation |
| Arxiv 24.09.10 | PPMamba: A Pyramid Pooling Local Auxiliary SSM-Based Model for Remote Sensing Image Semantic Segmentation | image image | Link | | Semantic Segmentation |
| Arxiv 24.09.15 | SITSMamba for Crop Classification based on Satellite Image Time Series | image | Link | Code | SITS Classification |
| Arxiv 24.10.07 | IGroupSS-Mamba: Interval Group Spatial-Spectral Mamba for Hyperspectral Image Classification | image | Link | | Hyperspectral Image Classification |
| Arxiv 24.10.07 (ECML/PKDD 2024 Workshop) | A Deep Learning-Based Approach for Mangrove Monitoring | | Link | Code | Segmentation |
| Arxiv 24.10.08 | Remote Sensing Image Segmentation Using Vision Mamba and Multi-Scale Multi-Frequency Feature Fusion | image image | Link | | Segmentation |
| Arxiv 24.10.17 | RemoteDet-Mamba: A Hybrid Mamba-CNN Network for Multi-modal Object Detection in Remote Sensing Images | image | Link | | Object Detection |
| ACM MM 24.10.28 | VmambaSCI: Dynamic Deep Unfolding Network with Mamba for Compressive Spectral Imaging | image | Link | | Compressive Spectral Imaging |

Medical Image

DatePaperFigureLinkCodeTask
Arxiv 24.01.09U-Mamba: Enhancing Long-range Dependency for Biomedical Image SegmentationimageLinkCode2D Medical Segmentation/ </br> 3D Medical Segmentation
Arxiv 24.01.24 (MICCAI 2024)SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation<img width="635" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/690c8341-1f17-4f8f-929a-d7f31094ad64">LinkCode3D Medical Segmentation
Arxiv 24.02.04VM-UNet: Vision Mamba UNet for Medical Image Segmentation<img width="544" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/320dda01-12dc-4e37-992d-8551c99b475a">LinkCode2D Medical Segmentation
Arxiv 24.02.05nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model<img width="949" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/2b1669ec-d1f5-4c6c-a743-1620ab83fef3">LinkCode3D Medical Segmentation
Arxiv 24.02.05 (MICCAI 2024)Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining<img width="711" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/2b1c8b89-2b8c-4273-ae25-833f87fc97c2">LinkCode2D Medical Segmentation
Arxiv 24.02.07Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation<img width="698" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/0a09dda6-986b-480d-8445-1db4a02f16f1">LinkCode2D Medical Segmentation
Arxiv 24.02.09FD-Vision Mamba for Endoscopic Exposure Correction<img width="666" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/f87cc9c6-efa9-40ca-bf76-af7dd19b2277">LinkCodeEndoscopic Exposure Correction
Arxiv 24.02.11 (KBS 2024)Semi-Mamba-UNet: Pixel-Level Contrastive Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation<img width="623" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/e702b599-682b-42b0-8477-a72972843803">LinkCode2D Medical Segmentation
Arxiv 24.02.13P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation<img width="717" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/cbbc8a01-b1bb-44bf-8954-4485605a8326">Link2D Medical Segmentation
Arxiv 24.02.16Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation<img width="706" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/909947ae-9ba2-47f9-b257-620663d55820">LinkCode2D Medical Segmentation
Arxiv 24.02.28MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation<img width="733" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/ea55e5c2-27bb-4155-a1b3-769fbb46c1f3">LinkCodeMedical Image Reconstruction/Uncertainty Estimation
Arxiv 24.03.06MedMamba: Vision Mamba for Medical Image ClassificationimageLinkCode2D Medical Classification
Arxiv 24.03.08LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation<img width="587" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/20237455-6ec6-49a1-81ba-05553c69910d">LinkCode2D Medical Segmentation/ </br> 3D Medical Segmentation
Arxiv 24.03.08 (BIBM 2024)MamMIL: Multiple Instance Learning for Whole Slide Images with State Space ModelsimageLinkCancer Subtyping
Arxiv 24.03.11 (MICCAI 2024)MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology<img width="516" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/fc147a4e-8a81-4862-b222-b52929def042">LinkCodeCancer Subtyping/ </br> Survival Prediction
Arxiv 24.03.12Large Window-based Mamba UNet for Medical Image Segmentation: Beyond Convolution and Self-attention<img width="848" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/2e91f2b6-e13d-48b1-9012-91b8ce5f1f43">LinkCode2D Medical Segmentation/ </br> 3D Medical Segmentation
Arxiv 24.03.12 (MICCAI 2024)LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.03.13MD-Dose: A Diffusion Model based on the Mamba for Radiotherapy Dose Prediction<img width="683" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/4440c5f1-1197-4295-b585-52314a144539">LinkCodeRadiation Dose Prediction (Segmentation)
Arxiv 24.03.14 (ISBRA 2024)VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation<img width="702" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/3f285231-30db-4737-a790-e69f0646d155">LinkCode2D Medical Segmentation
Arxiv 24.03.20H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation<img width="748" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/88ae6463-46e9-4c84-a658-160bbbf4d9cf">LinkCode2D Medical Segmentation
Arxiv 24.03.20ProMamba: Prompt-Mamba for polyp segmentation<img width="741" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/56106afc-1d80-42db-bb9f-a6daaed7abc8">Link2D Medical Segmentation
Arxiv 24.03.25CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification<img width="707" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/c71f93de-e730-40b2-bb23-74a07c868ab7">LinkAlzheimer’s disease Classification (CT/MRI)
Arxiv 24.03.26Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Legion<img width="622" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/c0b134c5-f7e4-4794-86e2-b20ddca84469">Link2D Medical Segmentation (2D MRI)
Arxiv 24.03.26Serpent: Scalable and Efficient Image Restoration via Multi-scale Structured State Space Models<img width="633" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/5d4acab4-a1bb-4563-8551-151295f08bf2">LinkImage Restoration
Arxiv 24.03.26Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation<img width="830" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/100bc4e7-65e1-43d9-815d-99b394e12b4f">Link2D Medical Segmentation
Arxiv 24.03.29UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation<img width="725" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/14d04b8b-cbe6-429d-984d-3ac7dd894bf3">LinkCode2D Medical Segmentation
Arxiv 24.04.01T-Mamba: Frequency-Enhanced Gated Long-Range Dependency for Tooth 3D CBCT Segmentation<img width="603" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/b01c38ef-6623-4e9b-867b-ba6f39575b5c">LinkCode3D Medical Segmentation (Tooth)
Arxiv 24.04.10 (MIDL 2024)ViM-UNet: Vision Mamba for Biomedical Segmentation<img width="581" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/1c32deb4-7695-4cb6-bbc7-5912a69bed98">LinkCode2D Medical Segmentation (Cell/Neurite)
Arxiv 24.04.15 (MICCAI 2024)nnU-Net Revisited: A Call for Rigorous Validation in 3D Medical Image SegmentationLinkCode3D Medical Segmentation
Arxiv 24.04.19 (CVPR 2024 Workshop)Vim4Path: Self-Supervised Vision Mamba for Histopathology Images<img width="939" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/9e5d4ef7-89f5-47da-bf5e-38367997f54f">LinkCodeCancer Subtyping
Arxiv 24.04.26Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance AdjustmentimageLinkUniversal Lesion Segmentation
Arxiv 24.04.26Sparse Reconstruction of Optical Doppler Tomography Based on State Space ModelimageLinkODT Sparse Reconstruction
Arxiv 24.05.05AC-MAMBASEG: An adaptive convolution and Mamba-based architecture for enhanced skin lesion segmentationimageLinkCodeSkin Lesion Segmentation
Arxiv 24.05.08HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image Segmentation<img width="689" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/dd95c6d9-9d34-4460-9936-e6de2971dab8">Link2D Medical Segmentation
Arxiv 24.05.09VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis<img width="724" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/fe63dac2-8500-48cb-8237-2b0c02f5be38">LinkMedical Image Generation
Arxiv 24.05.24MUCM-Net: A Mamba Powered UCM-Net for Skin Lesion SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.05.25UU-Mamba: Uncertainty-aware U-Mamba for Cardiac Image SegmentationimageLink LinkMedical Image Segmentation
Arxiv 24.05.27TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token PredictionimageLinkCodePre-training/Medical Image Segmentation
Arxiv 24.05.27Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked MambaimageLinkMedical Image Reconstruction
Arxiv 24.05.28 (MICCAI 2024 Oral)Cardiovascular Disease Detection from Multi-View Chest X-rays with BI-MambaimageLinkCodeCVD Risk Prediction
Arxiv 24.06.01SAM-VMNet: Deep Neural Networks For Coronary Angiography Vessel Segmentation<img width="602" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/b5d17a68-7e29-4012-9d2e-449bba21745d">LinkMedical Image Segmentation
Arxiv 24.06.05Combining Graph Neural Network and Mamba to Capture Local and Global Tissue Spatial Relationships in Whole Slide ImagesimageLinkCodeCancer Subtyping/Survival Prediction
Arxiv 24.06.09Vision Mamba: Cutting-Edge Classification of Alzheimer's Disease with 3D MRI ScansimageLink3D Medical Classification
Arxiv 24.06.09 (WACV 2025)Convolution and Attention-Free Mamba-based Cardiac Image SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.06.10MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision MambaimageLinkCodeMedical Image Segmentation
Arxiv 24.06.12 (BMVC 2024)On Evaluating Adversarial Robustness of Volumetric Medical Segmentation ModelsLinkCodeMedical Image Segmentation
Arxiv 24.06.22Soft Masked Mamba Diffusion Model for CT to MRI ConversionimageLinkCodeCT to MRI Conversion
Arxiv 24.07.04 (MICCAI 2024 Workshop)Vision Mamba for Classification of Breast Ultrasound ImagesimageLinkClassification
Arxiv 24.07.08 (MICCAI 2024)Deform-Mamba Network for MRI Super-ResolutionimageLinkSuper-resolution
Arxiv 24.07.08Self-Prior Guided Mamba-UNet Networks for Medical Image Super-ResolutionimageLinkSuper-resolution
Arxiv 24.07.11SR-Mamba: Effective Surgical Phase Recognition with State Space ModelimageLinkCodeSurgical Phase Recognition
Arxiv 24.07.11SliceMamba for Medical Image SegmentationimageLinkMedical Image Segmentation
Arxiv 24.08.14Costal Cartilage Segmentation with Topology Guided Deformable Mamba: Method and BenchmarkimageLinkMedical Image Segmentation
Arxiv 24.08.15MambaMIM: Pre-training Mamba with State Space Token-interpolationimageLinkCodeMedical Image Segmentation
Arxiv 24.08.21HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.08.23Hierarchical Spatio-Temporal State-Space Modeling for fMRI AnalysisimageLinkMedical Image Classification and Regression
Arxiv 24.08.25MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image SegmentationimageLinkCodeMedical Image Segmentation
KDD Workshop 24.08.25State Space Model-based Classification of Major Depressive Disorder Across Multiple Imaging SitesimageLinkMedical Image Classification
Arxiv 24.08.26 (MICCAI 2024)ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image SegmentationimageLinkMedical Image Segmentation
Arxiv 24.08.26LoG-VMamba: Local-Global Vision Mamba for Medical Image SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.08.28SpineMamba: Enhancing 3D Spinal Segmentation in Clinical Imaging through Residual Visual Mamba Layers and Shape PriorsimageLinkMedical Image Segmentation
Scientific Reports 24.08.28A mixed Mamba U-net for prostate segmentation in MR imagesimageLinkMedical Image Segmentation
Arxiv 24.09.06MpoxMamba: A Grouped Mamba-based Lightweight Hybrid Network for Mpox DetectionimageLinkCodeMedical Image Classification
Arxiv 24.09.06Serp-Mamba: Advancing High-Resolution Retinal Vessel Segmentation with Selective State-Space ModelimageLinkMedical Image Segmentation
Arxiv 24.09.09SX-Stitch: An Efficient VMS-UNet Based Framework for Intraoperative Scoliosis X-Ray Image StitchingimageLinkMedical Image Stitching
Arxiv 24.09.12Microscopic-Mamba: Revealing the Secrets of Microscopic Images with Just 4M ParametersimageLinkCodeMedical Image Classification
Arxiv 24.09.12OCTAMamba: A State-Space Model Approach for Precision OCTA Vasculature SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.09.12MedSegMamba: 3D CNN-Mamba Hybrid Architecture for Brain SegmentationimageLinkMedical Image Segmentation
Arxiv 24.09.13 (MICCAI 2024)Tri-Plane Mamba: Efficiently Adapting Segment Anything Model for 3D Medical ImagesimageLinkCodeMedical Image Segmentation
Arxiv 24.09.17 (ACCV 2024 Workshop)SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary GuidanceimageLinkCodeMedical Image Segmentation
Arxiv 24.09.18SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with MambaimageLinkSurgical Phase Recognition
Arxiv 24.09.19MambaRecon: MRI Reconstruction with Structured State Space ModelsimageLinkCodeMedical Image Reconstruction
Arxiv 24.09.19MambaClinix: Hierarchical Gated Convolution and Mamba-Based U-Net for Enhanced 3D Medical Image SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.09.24Segmentation Strategies in Deep Learning for Prostate Cancer Diagnosis: A Comparative Study of Mamba, SAM, and YOLOimageLinkCodeMedical Image Segmentation
Arxiv 24.09.25Classification of Gleason Grading in Prostate Cancer Histopathology Images Using Deep Learning Techniques: YOLO, Vision Transformers, and Vision MambaimageLinkCodeMedical Image Classification
Arxiv 24.09.26 (MICCAI 2024)EM-Net: Efficient Channel and Frequency Learning with Mamba for 3D Medical Image SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.09.28MambaEviScrib: Mamba and Evidence-Guided Consistency Make CNN Work Robustly for Scribble-Based Weakly Supervised Ultrasound Image SegmentationimageLinkCodeMedical Image Segmentation
MICCAI 24.10.06PathMamba: Weakly Supervised State Space Model for Multi-class Segmentation of Pathology ImagesimageLinkCodeMedical Image Segmentation
MICCAI 24.10.06Efficient and Gender-adaptive Graph Vision Mamba for Pediatric Bone Age AssessmentimageLinkCodeBone Age Assessment
MICCAI 24.10.06Polyp-Mamba: Polyp Segmentation with Visual MambaimageLinkMedical Image Segmentation
Arxiv 24.10.20Taming Mambas for Voxel Level 3D Medical Image SegmentationimageLinkCodeMedical Image Segmentation
Arxiv 24.10.29Advancing Efficient Brain Tumor Multi-Class Classification -- New Insights from the Vision Mamba Model in Transfer LearningimageLinkMulti-Class Classification
Arxiv 24.10.31MLLA-UNet: Mamba-like Linear Attention in an Efficient U-Shape Model for Medical Image SegmentationimageLinkCodeMedical Image Segmentation

Video

DatePaperFigureLinkCodeTask
Arxiv 24.01.25Vivim: a Video Vision Mamba for Medical Video Object Segmentation<img width="596" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/e30c0ceb-5399-44b5-99b7-65ada043c87c">LinkCodeMedical Video Segmentation
Arxiv 24.03.11 (ECCV 2024)VideoMamba: State Space Model for Efficient Video Understanding<img width="728" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/08797465-f93f-49ce-b724-91b67fabbabd">LinkCodeAction Recognition/Video Understanding/Text-to-video Retrieval
Arxiv 24.03.12 (ICLR 2024 Workshop)SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces<img width="655" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/1b1ce7b5-392c-46dd-b4e2-d6e03f6af1ab">LinkCodeVideo Generation
Arxiv 24.03.14Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding<img width="704" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/70fb7829-d28e-4bbc-b326-fcb167dad979">LinkCodeAction Recognition/Action Localization/...
Arxiv 24.03.25 (CVPR 2024 Workshop)VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal ForecastingimageLinkCodeSpatiotemporal Forecasting
Arxiv 24.04.09RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos<img width="881" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/f1b0f8a1-f10f-43c6-8203-701ae0376af2">LinkCodeRemote Photoplethysmography Prediction
Arxiv 24.04.11Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos<img width="697" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/ea35cf6a-e2a6-4eab-8da7-2cb7cd098507">LinkSkeleton Action Recognition
Arxiv 24.05.05Matten: Video Generation with Mamba-AttentionimageLinkVideo Generation
Arxiv 24.05.30DeMamba: AI-Generated Video Detection on Million-Scale GenVideo BenchmarkimageLinkCodeAI-Generated Video Detection
Arxiv 24.06.18Slot State Space ModelsimageLinkObject-centric Video Understanding/3D Visual Reasoning/Video Prediction
Arxiv 24.06.27VideoMambaPro: A Leap Forward for Mamba in Video UnderstandingimageLinkCodeVideo Understanding
Arxiv 24.07.02 (NeurIPS 2024)VFIMamba: Video Frame Interpolation with State Space ModelsimageLinkCodeVideo Frame Interpolation
Arxiv 24.07.03BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video EnhancementimageLinkCodeLow-Light Video Enhancement
Arxiv 24.07.04QueryMamba: A Mamba-Based Encoder-Decoder Architecture with a Statistical Verb-Noun Interaction Module for Video Action Forecasting @ Ego4D Long-Term Action Anticipation Challenge 2024imageLinkVideo Action Forecasting
Arxiv 24.07.11 (ECCV 2024)VideoMamba: Spatio-Temporal Selective State Space ModelimageLinkCodeAction Recognition
Arxiv 24.07.25Harnessing Temporal Causality for Advanced Temporal Action DetectionimageLinkCodeMoment Queries/Action Recognition/Action Detection/Audio-Based Interaction Detection
Arxiv 24.07.31 (ACM MM 2024 Oral)RainMamba: Enhanced Locality Learning with State Space Models for Video DerainingimageLinkCodeDeraining
Arxiv 24.08.15MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T TrackingimageLinkRGB-T Tracking
Arxiv 24.08.17 (ACM MM 2024 Oral)MambaTrack: A Simple Baseline for Multiple Object Tracking with State Space ModelimageLinkMultiple Object Tracking
Arxiv 24.08.20DemMamba: Alignment-free Raw Video Demoireing with Frequency-assisted Spatio-Temporal MambaimageLinkVideo Demoireing
Arxiv 24.08.31TrackSSM: A General Motion Predictor by State-Space ModelimageLinkMotion Prediction
Arxiv 24.09.02FMRFT: Fusion Mamba and DETR for Query Time Sequence Intersection Fish TrackingimageLinkFish Tracking
Arxiv 24.09.04MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric VideosimageLinkCodeHand Trajectory Prediction
Arxiv 24.09.18 (CCBR 2024)PhysMamba: Efficient Remote Physiological Measurement with SlowFast Temporal Difference MambaimageLinkCodeRemote Photoplethysmography
Arxiv 24.10.18MambaSCI: Efficient Mamba-UNet for Quad-Bayer Patterned Video Snapshot Compressive ImagingimageLinkCodeVideo Snapshot Compressive Imaging
ACM MM 24.10.28Object-Level Pseudo-3D Lifting for Distance-Aware TrackingimageLinkTracking

Point Cloud

DatePaperFigureLinkCodeTask
Arxiv 24.02.16 (NeurIPS 2024)PointMamba: A Simple State Space Model for Point Cloud Analysis<img width="718" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/e252e787-0189-4f94-bea1-2944d50b18f4">LinkCodeClassification, Part Segmentation
Arxiv 24.02.23 (CVPR 2024 Spotlight, SSM)State Space Models for Event CamerasimageLinkCodeObject Detection
Arxiv 24.03.01Point Cloud Mamba: Point Cloud Learning via State Space Model<img width="692" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/6a315d04-afe6-41d1-b8d5-d931a891a681">LinkCodeClassification, Part Segmentation, Semantic Segmentation
Arxiv 24.03.11Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy<img width="882" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/c1c7c020-28cc-4ca6-b271-1d3cf665243f">LinkCodeClassification, Semantic Segmentation
Arxiv 24.04.083DMambaIPF: A State Space Model for Iterative Point Cloud Filtering via Differentiable Rendering<img width="1028" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/ab137b17-85c1-4b6c-96d4-9ae5bfd45a1b">LinkPoint Cloud Filtering
Arxiv 24.04.103DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion<img width="1020" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/da19fb01-52bd-4a55-b0ca-9681fdaef9ed">LinkPoint Cloud Completion
Arxiv 24.04.19 (ACM MM 2024)MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space ModelimageLinkCodeObject Segmentation
Arxiv 24.04.23 (ACM MM 2024)Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model<img width="959" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/6b565138-1c2d-4201-bd34-8b4343a62ec9">LinkCodeClassification, Part Segmentation
Arxiv 24.05.09Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba<img width="1528" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/57466105/2a7422a9-9483-4b5c-be57-5b2c04f4b614">LinkClassification, Regression
Arxiv 24.05.13OverlapMamba: Novel Shift State Space Model for LiDAR-based Place RecognitionimageLinkCodeLiDAR Place Recognition
Arxiv 24.05.23MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space ModelsimageLinkPoint Cloud Video Understanding
Arxiv 24.05.24PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud AnalysisimageLinkCodeClassification, Part Segmentation
Arxiv 24.05.27 (NeurIPS 2024)LCM: Locally Constrained Compact Point Cloud Model for Masked Point ModelingimageLinkClassification, Part Segmentation, Object Detection
Arxiv 24.06.07Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMsimageLinkCodeGeneration
Arxiv 24.06.10PointABM: Integrating Bidirectional State Space Model with Multi-Head Self-Attention for Point Cloud AnalysisimageLinkClassification
Arxiv 24.06.15 (NeurIPS 2024)Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object DetectionimageLinkCodeObject Detection
Arxiv 24.06.25Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space ModelimageLinkSemantic Segmentation
Arxiv 24.07.15Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation ModelimageLinkSemantic Segmentation, Instance Segmentation
Arxiv 24.07.25LION: Linear Group RNN for 3D Object Detection in Point Cloudsimage imageLinkCodeObject Detection
Arxiv 24.08.19Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and AlgorithmsimageLinkCodeAction Recognition
Arxiv 24.08.20MambaEVT: Event Stream based Visual Object Tracking using State Space ModelimageLinkCodeObject Tracking
Arxiv 24.08.20MV-MOS: Multi-View Feature Fusion for 3D Moving Object SegmentationimageLinkCodeObject Segmentation
Arxiv 24.08.20OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space ModelimageLinkCodeSemantic Prediction/Scene Completion
Arxiv 24.09.17Unleashing the Potential of Mamba: Boosting a LiDAR 3D Sparse Detector by Using Cross-Model Knowledge DistillationimageLinkObject Detection
Arxiv 24.09.24FSF-Net: Enhance 4D Occupancy Forecasting with Coarse BEV Scene Flow for Autonomous DrivingimageLink4D Occupancy Forecasting
Arxiv 24.10.21MBPU: A Plug-and-Play State Space Model for Point Cloud Upsamping with Fast Point RenderingimageLinkUpsampling
Arxiv 24.10.22SpikMamba: When SNN meets Mamba in Event-based Human Action RecognitionimageLinkCodeAction Recognition
Arxiv 24.10.24Bio2Token: All-atom tokenization of any biomolecular structure with MambaimageLinkTokenization
Arxiv 24.10.28Exploring contextual modeling with linear complexity for point cloud segmentationimageLinkSemantic Segmentation
Arxiv 24.10.31NIMBA: Towards Robust and Principled Processing of Point Clouds With SSMsimageLinkClassification, Part Segmentation

Multi-Modal

DatePaperFigureLinkCodeTaskModality
Arxiv 24.01.25MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration<img width="705" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/0584bfee-1ed2-4d5b-984e-c374491adab9">LinkCodeRegistrationMRI & CT
Arxiv 24.02.19Pan-Mamba: Effective pan-sharpening with State Space Model<img width="716" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/9cead6ad-ce09-4597-a985-8181b407523d">LinkCodePansharpeningHISR Images & LRMS Images
Arxiv 24.03.07 (ECCV 2024)InstructGIE: Towards Generalizable Image Editing<img width="912" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/45b0c86f-f473-4eb7-a821-7be8e3be417d">LinkCodeImage EditingImage & Text
Arxiv 24.03.12 (ECCV 2024)Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM<img width="910" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/9ef9c705-657d-4b6d-b229-6e2e4270682f">LinkCodeText-to-Motion GenerationMotion & Text
Arxiv 24.03.14 (NeurIPS 2024)MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space ModelsimageLinkGesture Synthesis
Arxiv 24.03.20VL-Mamba: Exploring State Space Models for Multimodal Learning<img width="718" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/aa912eb8-13a7-488f-9601-d298ed6796e2">LinkCodeMLLM tasksImage & Text
Arxiv 24.03.21Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference<img width="626" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/df845d03-3739-4b78-9328-6c2df2e98aad">LinkCodeMLLM tasksImage & Text
Arxiv 24.03.26 (ECCV 2024)ReMamber: Referring Image Segmentation with Mamba Twister<img width="715" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/08c3e6e4-49ca-4081-bea6-ed4c7b046c0b">LinkCodeReferring Image SegmentationImage & Text
Arxiv 24.04.01SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding<img width="727" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/04cba5ac-b2f0-4357-b447-1e14a1d2617b">LinkTemporal Video GroundingVideo & Text
Arxiv 24.04.05 (WACV 2025)Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation<img width="702" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/a972740b-774c-4e14-a914-791aa5f519b8">LinkCodeSemantic SegmentationRGB Images & Depth/Thermal Images
Arxiv 24.04.07VMambaMorph: a Multi-Modality Deformable Image Registration Framework based on Visual State Space Model with Cross-Scan Module<img width="711" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/6d47fc18-f044-49ed-8724-941a4fe46ebc">LinkCodeRegistrationMRI & CT
Arxiv 24.04.11SurvMamba: State Space Model with Multi-grained Multi-modal Interaction for Survival Prediction<img width="813" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/c9307069-4672-47ed-9706-1003a5ad5eff">LinkCancer Subtyping/Survival PredictionWSIs & Gene
Arxiv 24.04.11FusionMamba: Efficient Image Fusion with State Space Model<img width="816" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/2182cfb2-fa6f-4dea-ab2b-d21b906a683f">LinkCodePansharpeningHISR Images & LRMS Images
Arxiv 24.04.12MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion<img width="1035" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/5921c2d4-d50d-48a3-928a-1dc69a60deb6">LinkMulti-modality Image FusionRGB & Thermal Images, MRI & CT/PET/SPECT
Arxiv 24.04.14Fusion-Mamba for Cross-modality Object Detection<img width="902" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/97b716a5-f647-43d9-a1fe-dc8b2b02670d">LinkVisible-infrared Images FusionRGB Images & Infrared Images
Arxiv 24.04.14A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion<img width="1013" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/df942cab-7802-4314-b7a7-549439b74f06">LinkPansharpeningHISR Images & LRMS Images
Arxiv 24.04.15FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba<img width="906" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/87093676-286c-4e35-89b7-7e573679cc67">LinkCodeImage FusionRGB & Infrared Images, MRI & CT/PET/SPECT, PC & GFP
Arxiv 24.04.17Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion<img width="810" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/f27aa176-65e2-44db-a172-56712e789729">LinkTemporal GroundingMotion & Text
Arxiv 24.04.25CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather ConditionsimageLinkCodeVisible-infrared Images FusionRGB Images & Infrared Images
Arxiv 24.04.27Revisiting Multi-modal Emotion Learning with Broad State Space Models and Probability-guidance FusionimageLinkMulti-modal Emotion RecognitionText & Video & Audio
Arxiv 24.04.28 (PRCV 2024)Mamba-FETrack: Frame-Event Tracking via State Space ModelimageLinkCodeRGB-Event TrackingRGB Frames & Event
Arxiv 24.04.29 (GRSL 2024)RSCaMa: Remote Sensing Image Change Captioning with State Space ModelimageLinkCodeImage CaptioningRemote Sensing Image & Text
Arxiv 24.04.30CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian EvaluationimageLinkCodeOODImage & Text
Arxiv 24.05.22I2I-Mamba: Multi-modal medical image synthesis via selective state space modelingimageLinkCodeMedical Image GenerationMRI/CT
Arxiv 24.05.24 (NeurIPS 2024)Meteor: Mamba-based Traversal of Rationale for Large Language and Vision ModelsimageLinkCodeLarge Language and Vision ModelImage & Text (Question/Rationale)
Arxiv 24.05.29 (NeurIPS 2024)Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space ModelimageLinkMulti-modal Sentiment AnalysisText & Video & Audio
Arxiv 24.05.31S4Fusion: Saliency-aware Selective State Space Model for Infrared Visible Image Fusion<img width="539" alt="image" src="https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/74030174/15ab4235-93bd-4e6d-b836-79142bfa84ec">LinkImage FusionRGB Images & Infrared Images
Arxiv 24.06.02MGI: Multimodal Contrastive Pre-training of Genomic and Medical ImagingimageLinkMultimodal Contrastive Pre-trainingMedical Image & Genomic
Arxiv 24.06.03Dimba: Transformer-Mamba Diffusion ModelsimageLinkCodeText to Image GenerationImage & Text
Arxiv 24.06.06 (NeurIPS 2024)RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and ManipulationimageLinkCodeRobot Reasoning and ManipulationImage & Text
Arxiv 24.06.10MVGamba: Unify 3D Content Generation as State Space Sequence ModelingimageLink3D GenerationImage & Text
Arxiv 24.07.02MMR-Mamba: Multi-Contrast MRI Reconstruction with Mamba and Spatial-Frequency Information FusionimageLinkImage FusionMulti-Contrast MRI
Arxiv 24.07.14InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion GenerationimageLinkCodeText-to-Motion GenerationMotion & Text
Arxiv 24.07.15An Empirical Study of Mamba-based Pedestrian Attribute RecognitionimageLinkCodePedestrian Attribute RecognitionImage & Text
Arxiv 24.07.15OPa-Ma: Text Guided Mamba for 360-degree Image Out-paintingimageLink360-degree Image Out-paintingImage & Text
Arxiv 24.07.22GFE-Mamba: Mamba-based AD Multi-modal Progression Assessment via Generative Feature Extraction from MCIimageLinkCodeAD Progression AssessmentMRI & PET
Arxiv 24.07.29ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2imageLinkCodeMLLM TasksImage & Text
Arxiv 24.07.29 (ACM MM 2024)MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and Disentangled Multi-Modality FusionimageLinkCo-Speech Gesture GenerationMotion & Audio
Arxiv 24.08.01DiM-Gesture: Co-Speech Gesture Generation with Adaptive Layer Normalization Mamba-2 frameworkimageLinkCodeCo-Speech Gesture GenerationMotion & Audio
Arxiv 24.08.02 (ITSC 2024)MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian DetectionimageLinkCodePedestrian DetectionRGB & Thermal Images
Arxiv 24.08.02PhysMamba: Leveraging Dual-Stream Cross-Attention SSD for Remote Physiological MeasurementimageLinkRemote Physiological MeasurementVideo & rPPG
Arxiv 24.08.03JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Language ModelimageLinkMotion & Audio
Arxiv 24.08.07DRAMA: An Efficient End-to-end Motion Planner for Autonomous Driving with MambaimageLinkDriver Motion PlanImage & Text
Arxiv 24.08.15ColorMamba: Towards High-quality NIR-to-RGB Spectral Translation with MambaimageLinkCodeNIR-to-RGB TranslationNIR Images & RGB Images
Arxiv 24.08.16RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion MambaimageLinkRGBT TrackingRGB Videos & TIR Videos
Arxiv 24.08.19R2GenCSR: Retrieving Context Samples for Large Language Model based X-ray Medical Report GenerationimageLinkCodeMedical Report GenerationImage & Text
Arxiv 24.08.19OccMamba: Semantic Occupancy Prediction with State Space ModelsimageLinkSemantic Occupancy PredictionLiDAR Points & RGB Images
Arxiv 24.08.20Event Stream based Sign Language Translation: A High-Definition Benchmark Dataset and A New AlgorithmimageLinkCodeEvent Stream based Sign Language TranslationEvent & Text
Arxiv 24.08.20MUSE: Mamba is Efficient Multi-scale Learner for Text-video RetrievalimageLinkCodeText-video RetrievalVideo & Text
Arxiv 24.08.22Adapt CLIP as Aggregation Instructor for Image DehazingimageLinkDehazingImage & Text
Arxiv 24.08.27DualKanbaFormer: Kolmogorov-Arnold Networks and State Space Model Transformer for Multimodal Aspect-based Sentiment AnalysisimageLinkMulti-modal Sentiment AnalysisImage & Text
Arxiv 24.08.28MambaPlace: Text-to-Point-Cloud Cross-Modal Place Recognition with Attention Mamba MechanismsimageLinkCodeCross-Modal Place RecognitionPoint Cloud & Text
TGRS 24.08.30Mask-Guided Mamba Fusion for Drone-based Visible-Infrared Vehicle DetectionimageLinkCross-Modal DetectionRGB Images & Infrared Images
Arxiv 24.09.03PixelBytes: Catching Unified Embedding for Multimodal GenerationimageLinkCodeMulti-Modal GenerationImage & Text
Arxiv 24.09.03Shuffle Mamba: State Space Models with Random Shuffle for Multi-Modal Image FusionimageLinkMulti-Modality Image FusionHISR Images & LRMS Images, MRI & CT/PET/SPECT
Arxiv 24.09.04LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid ArchitectureimageLinkCodeMLLM TasksImage & Text
Arxiv 24.09.05Why mamba is effective? Exploit Linear Transformer-Mamba Network for Multi-Modality Image FusionimageLinkMulti-Modality Image FusionRGB & Thermal Images, MRI & CT/PET/SPECT
Arxiv 24.09.08Mamba-Enhanced Text-Audio-Video Alignment Network for Emotion Recognition in ConversationsimageLinkCodeMulti-modal Emotion RecognitionText & Audio & Video
Arxiv 24.09.09Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language ModelingimageLinkCodeMLLM TasksImage & Text
Arxiv 24.09.11Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State ModelsimageLinkCodeAction PredictionPoint Cloud & Robot State
Arxiv 24.09.13Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary DetectionimageLinkCodeOpen-Vocabulary DetectionImage & Text
Arxiv 24.09.17Mamba Fusion: Learning Actions Through QuestioningimageLinkCodeAction Prediction/Action AnticipationVideo & Text
Arxiv 24.09.22GraspMamba: A Mamba-based Language-driven Grasp Detection Framework with Hierarchical Feature LearningimageLinkCodeGrasp DetectionImage & Text
Arxiv 24.09.24DepMamba: Progressive Fusion Mamba for Multimodal Depression DetectionimageLinkCodeMulti-modal Depression DetectionVideo & Audio
Arxiv 24.09.30MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image GenerationimageLinkImage GenerationImage & Text
Arxiv 24.10.01CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus DatasetimageLinkCodeMedical Report GenerationImage & Text
Arxiv 24.10.04HMT-Grasp: A Hybrid Mamba-Transformer Approach for Robot Grasping in Cluttered EnvironmentsimageLinkRobot GraspingRGB-D Image & Grasp
Arxiv 24.10.08EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical AlignmentimageLinkMLLM TasksImage & Text
Arxiv 24.10.10Moyun: A Diffusion-Based Model for Style-Specific Chinese Calligraphy GenerationimageLinkStyle-Specific Chinese Calligraphy GenerationImage & Text
Arxiv 24.10.17RemoteDet-Mamba: A Hybrid Mamba-CNN Network for Multi-modal Object Detection in Remote Sensing ImagesimageLinkObject DetectionRGB Images & Infrared Images
Arxiv 24.10.19MambaSOD: Dual Mamba-Driven Cross-Modal Fusion Network for RGB-D Salient Object DetectionimageLinkCodeSalient Object DetectionRGB Images & Depth Images
Arxiv 24.10.21LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze DatasetimageLinkDehazingImage & Text
Arxiv 24.10.21 (ISBI 2025)R2Gen-Mamba: A Selective State Space Model for Radiology Report GenerationimageLinkCodeRadiology Report GenerationImage & Text

Others

| Date | Paper | Figure | Link | Code | Task |
| --- | --- | --- | --- | --- | --- |
| Arxiv 24.02.24 | Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning | ![image](https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/e8ed3e23-e305-4b8a-a706-0601c1ef3b1b) | Link | Code | Food Classification |
| Arxiv 24.03.08 | Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy | ![image](https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/4cce4533-8d35-4acc-8cb3-6ad44603dc04) | Link | Code | Endoscope Tip Tracking |
| Arxiv 24.03.22 | Music to Dance as Language Translation using Sequence Models | ![image](https://github.com/Ruixxxx/Awesome-Vision-Mamba-Models/assets/88369000/3e647680-22c7-4885-9ada-f32d9288f1ba) | Link | Code | Music-to-Dance |

Valuable Insights

| Date | Paper | Link |
| --- | --- | --- |
| Arxiv 24.03.03 | The Hidden Attention of Mamba Models | Link |
| Arxiv 24.03.15 | On the low-shot transferability of [V]-Mamba? | Link |
| Arxiv 24.03.16 | Understanding Robustness of Visual State Space Models for Image Classification | Link |
| Arxiv 24.05.13 | MambaOut: Do We Really Need Mamba for Vision? | Link |
| Arxiv 24.05.26 (NeurIPS 2024) | Demystify Mamba in Vision: A Linear Attention Perspective | Link |
| Arxiv 24.05.26 | A Unified Implicit Attention Formulation for Gated-Linear Recurrent Sequence Models | Link |
| Arxiv 24.06.11 (NeurIPS 2024) | MambaLRP: Explaining Selective State Space Sequence Models | Link |
| Arxiv 24.06.13 | Towards Evaluating the Robustness of Visual State Space Models | Link |

Other Domains

Reinforcement Learning

| Date | Paper | Figure | Link | Code |
| --- | --- | --- | --- | --- |
| Arxiv 24.03.25 (IROS 2024) | Proprioception Is All You Need: Terrain Classification for Boreal Forests | image | Link | Code |
| Arxiv 24.05.20 (NeurIPS 2024) | Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | image | Link | |
| Arxiv 24.05.31 | Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling | image | Link | |
| Arxiv 24.06.04 | Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning | image | Link | Code |
| Arxiv 24.06.08 (NeurIPS 2024) | Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL | image | Link | |
| Arxiv 24.06.12 | MaIL: Improving Imitation Learning with Mamba | image image | Link | |
| Arxiv 24.06.21 | KalMamba: Towards Efficient Probabilistic State Space Models for RL under Uncertainty | image | Link | |
| Arxiv 24.08.05 | Context-aware Mamba-based Reinforcement Learning for social robot navigation | image | Link | |
| Arxiv 24.08.20 | Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba | image | Link | Code |
| Arxiv 24.09.04 | Mamba as a motion encoder for robotic imitation learning | image | Link | |
| Arxiv 24.09.23 | DiSPo: Diffusion-SSM based Policy Learning for Coarse-to-Fine Action Discretization | image | Link | |
| Arxiv 24.10.11 | Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient | image | Link | Code |
| Arxiv 24.10.25 | Multi-Agent Reinforcement Learning with Selective State-Space Models | image | Link | Code |
| Arxiv 24.10.29 | A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks | image | Link | Code |

Graph Learning

| Date | Paper | Figure | Link | Code |
| --- | --- | --- | --- | --- |
| Arxiv 24.02.13 (KDD 2024) | Graph Mamba: Towards Learning on Graphs with State Space Models | image | Link | Code |
| Arxiv 24.05.22 | HeteGraph-Mamba: Heterogeneous Graph Learning via Selective State Space Model | image | Link | |
| Arxiv 24.08.08 | DyGMamba: Efficiently Modeling Long-Term Temporal Dependency on Continuous-Time Dynamic Graphs with State Space Models | image | Link | |
| Arxiv 24.08.13 | DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs | image | Link | |
| Arxiv 24.09.18 | Topological Deep Learning with State-Space Models: A Mamba Approach for Simplicial Complexes | image | Link | |
| KDD 2024 Workshop | Identifying Subphenotypes for Sepsis with Acute Kidney Injury via Multimodal Graph State Space Models | image | Link | |

Audio

| Date | Paper | Figure | Link | Code |
| --- | --- | --- | --- | --- |
| Arxiv 24.03.12 (IEEE SPL 2024) | Multichannel Long-Term Streaming Neural Speech Enhancement for Static and Moving Speakers | image | Link | Code |
| Arxiv 24.04.02 | SPMamba: State-space model is all you need in speech separation | image | Link | |
| Arxiv 24.05.02 | TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms | image | Link | |
| Arxiv 24.05.10 | An Investigation of Incorporating Mamba for Speech Enhancement | image | Link | |
| Arxiv 24.05.20 | SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | image | Link | Code |
| Arxiv 24.05.21 | Mamba in Speech: Towards an Alternative to Self-Attention | image | Link | |
| Arxiv 24.05.22 | Audio Mamba: Pretrained Audio State Space Model For Audio Tagging | | Link | Code |
| Arxiv 24.06.04 (Interspeech 2024) | Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations | image | Link | Code |
| Arxiv 24.06.05 | Audio Mamba: Bidirectional State Space Model for Audio Representation Learning | image | Link | Code |
| Arxiv 24.06.10 (Interspeech 2024) | RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection | image | Link | Code |
| Arxiv 24.06.24 (Interspeech 2024) | Exploring the Capability of Mamba in Speech Applications | image | Link | |
| Arxiv 24.07.13 | Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis | image | Link | Mamba-TasNet Code ConMamba Code |
| Arxiv 24.08.09 | SELD-Mamba: Selective State-Space Model for Sound Event Localization and Detection with Source Distance Estimation | image | Link | |
| Arxiv 24.09.04 | MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision | image | Link | |
| Arxiv 24.09.04 (SLT 2024 Workshop) | An Analysis of Linear Complexity Attention Substitutes with BEST-RQ | image | Link | |
| Arxiv 24.09.07 | Cross-attention Inspired Selective State Space Models for Target Sound Extraction | image image | Link | |
| Arxiv 24.09.08 | TF-Mamba: A Time-Frequency Network for Sound Source Localization | image | Link | |
| Arxiv 24.09.09 | Vector Quantized Diffusion Model Based Speech Bandwidth Extension | image | Link | |
| Arxiv 24.09.10 | A Two-Stage Band-Split Mamba-2 Network for Music Separation | image | Link | Code |
| Arxiv 24.09.11 | Rethinking Mamba in Speech Processing by Self-Supervised Models | image | Link | Code |
| Arxiv 24.09.13 | MambaFoley: Foley Sound Generation using Selective State-Space Models | image | Link | Code |
| Arxiv 24.09.14 | Wave-U-Mamba: An End-To-End Framework For High-Quality And Efficient Speech Super Resolution | image | Link | |
| Arxiv 24.09.15 | Self-supervised Learning for Acoustic Few-Shot Classification | image | Link | |
| Arxiv 24.09.16 | Ultra-Low Latency Speech Enhancement - A Comprehensive Study | | Link | |
| Arxiv 24.09.16 | Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement | image | Link | |
| Arxiv 24.09.18 | Dense-TSNet: Dense Connected Two-Stage Structure for Ultra-Lightweight Speech Enhancement | image | Link | Code |
| Arxiv 24.09.19 | DeFT-Mamba: Universal Multichannel Sound Separation and Polyphonic Audio Classification | image | Link | |
| Arxiv 24.09.26 | MC-SEMamba: A Simple Multi-channel Extension of SEMamba | image | Link | |
| Arxiv 24.09.27 (SLT 2024) | Speech-Mamba: Long-Context Speech Recognition with Selective State Spaces Models | image | Link | Code |
| Arxiv 24.09.30 | Mamba for Streaming ASR Combined with Unimodal Aggregation | image | Link | Code |
| Arxiv 24.10.01 | Zero-Shot Text-to-Speech from Continuous Text Streams | image | Link | Code |
| Arxiv 24.10.09 (ICASSP 2025) | Mamba-based Segmentation Model for Speaker Diarization | image | Link | Code |
| Arxiv 24.10.09 | Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity | image | Link | |
| Arxiv 24.10.14 | CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning | image | Link | Code |
| Arxiv 24.10.28 | SepMamba: State-space models for speaker separation using Mamba | image | Link | Code |
| Expert Systems with Applications 2024 | A barking emotion recognition method based on Mamba and Synchrosqueezing Short-Time Fourier Transform | image | Link | Code |

Time Series

| Date | Paper | Figure | Link | Code |
| --- | --- | --- | --- | --- |
| Arxiv 24.03.14 (ECAI 2024) | TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting | image | Link | Code |
| Arxiv 24.04.23 | SST: Multi-Scale Hybrid Mamba-Transformer Experts for Long-Short Range Time Series Forecasting | image | Link | Code |
| Arxiv 24.04.23 | Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting | image | Link | |
| Arxiv 24.04.24 | Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting | image | Link | |
| Arxiv 24.05.11 | DTMamba : Dual Twin Mamba for Time Series Forecasting | image | Link | |
| Arxiv 24.05.25 | Time-SSM: Simplifying and Unifying State Space Models for Time Series Forecasting | image | Link | |
| Arxiv 24.05.26 | MambaTS: Improved Selective State Space Models for Long-term Time Series Forecasting | image | Link | |
| Arxiv 24.06.06 | Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models | image | Link | |
| Arxiv 24.06.06 | TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification | image | Link | |
| Arxiv 24.06.08 | C-Mamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting | image | Link | Code |
| Arxiv 24.06.17 (IJCAI 2024 Workshop) | SpoT-Mamba: Learning Long-Range Dependency on Spatio-Temporal Graphs with Selective State Spaces | image | Link | Code |
| Arxiv 24.07.15 | MSegRNN:Enhanced SegRNN Model with Mamba for Long-Term Time Series Forecasting | image | Link | |
| Arxiv 24.07.20 | FMamba: Mamba based on Fast-attention for Multivariate Time-series Forecasting | image | Link | |
| Arxiv 24.08.04 | Mamba-Spike: Enhancing the Mamba Architecture with a Spiking Front-End for Efficient Temporal Data Processing | image | Link | Code |
| Arxiv 24.08.22 (CGI24) | Simplified Mamba with Disentangled Dependency Encoding for Long-Term Time Series Forecasting | image | Link | |
| Arxiv 24.08.27 | Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need | image image | Link | Code |
| Arxiv 24.09.13 (ICECCE 2024) | Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics | image | Link | |
| Arxiv 24.09.21 | Test Time Learning for Time Series Forecasting | image | Link | |
| Arxiv 24.09.30 | A SSM is Polymerized from Multivariate Time Series | image | Link | Code |
| Arxiv 24.09.30 (SLT 2024) | SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding | image | Link | Code |
| Arxiv 24.10.08 | TIMBA: Time series Imputation with Bi-directional Mamba Blocks and Diffusion models | image | Link | |
| Arxiv 24.10.12 | Mamba4Cast: Efficient Zero-Shot Time Series Forecasting with State Space Models | image | Link | Code |
| Arxiv 24.10.13 | SlimSeiz: Efficient Channel-Adaptive Seizure Prediction Using a Mamba-Enhanced Network | image | Link | Code |
| Arxiv 24.10.15 | UmambaTSF: A U-shaped Multi-Scale Long-Term Time Series Forecasting Method Using Mamba | image | Link | |
| Arxiv 24.10.17 | DiffImp: Efficient Diffusion Model for Probabilistic Time Series Imputation with Bidirectional Mamba Backbone | image | Link | |
| Arxiv 24.10.28 | FACTS: A Factored State-Space Framework For World Modelling | image | Link | Code |
| Arxiv 24.10.28 | Neural Hamilton: Can A.I. Understand Hamiltonian Mechanics? | image | Link | Code |
| Arxiv 24.10.30 (NeurIPS 2024 Workshop) | Sequential Order-Robust Mamba for Time Series Forecasting | image | Link | Code |