# Awesome State-Space Resources for ML
Contributions are welcome! Please read the contribution guidelines before contributing.
## Table of Contents
- Tutorials
- Surveys
- Books
- Foundation
- Architecture
- Language
- Audio
- Vision
- Time-Series
- Medical
- Tabular
- Reinforcement Learning
- SSM Parameterization and Initialization
- Miscellaneous
## Tutorials <a name="tutorials"></a>
### Blogposts
- S4 Series
- The Annotated S4
- The Annotated S4D
- The Annotated Mamba [code]
- Mamba: The Easy Way
- Mamba: The Hard Way
- A Visual Guide to Mamba and State Space Models
- State Space Models: A Modern Approach
- Mamba No. 5 (A Little Bit Of...)
- Mamba: SSM, Theory, and Implementation in Keras and TensorFlow
### Videos
- Efficiently Modeling Long Sequences with Structured State Spaces
- Do we need Attention? A Mamba Primer
- Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math
- MAMBA from Scratch
- Yannic Kilcher's Video
## Surveys (Structured State Space Models) <a name="surveys"></a>
- Modeling Sequences with Structured State Spaces
- State Space Model for New-Generation Network Alternative to Transformers
- Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
- A Survey on Visual Mamba
## Books (Classical State Space Models) <a name="books"></a>
## Foundation
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces [code]
- Structured state-space models are deep Wiener models
- State-space Models with Layer-wise Nonlinearity are Universal Approximators with Exponential Decaying Memory
- Repeat After Me: Transformers are Better than State Space Models at Copying
- Theoretical Foundations of Deep Selective State-Space Models
- The Hidden Attention of Mamba Models
- The Expressive Capacity of State Space Models: A Formal Language Perspective
- Simplifying and Understanding State Space Models with Diagonal Linear RNNs
## Architecture
- Jamba: A Hybrid Transformer-Mamba Language Model
- Jamba-1.5: Hybrid Transformer-Mamba Models at Scale
- Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models [code]
- S5: Simplified State Space Layers for Sequence Modeling (ICLR 2023) [code]
- Long range language modeling via gated state spaces (ICLR 2023)
- Pretraining Without Attention [code]
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts [code]
- LOCOST: State-Space Models for Long Document Abstractive Summarization [code]
- BlackMamba: Mixture of Experts for State-Space Models [code]
- DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models [code]
- ZigMa: Zigzag Mamba Diffusion Model (ECCV 2024) [code] [website]
- Block-State Transformers
- Efficient Long Sequence Modeling via State Space Augmented Transformer
- S7: Selective and Simplified State Space Layers for Sequence Modeling
## Language
- Hungry Hungry Hippos: Towards Language Modeling with State Space Models (ICLR 2023) [code]
- Long range language modeling via gated state spaces (ICLR 2023) [code]
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces [code]
- MambaByte: Token-free Selective State Space Model [code]
- Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
- Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models [code]
## Audio
- It's Raw! Audio Generation with State-Space Models (ICML 2022) [code]
- Augmenting conformers with structured state space models for online speech recognition
- Diagonal State Space Augmented Transformers for Speech Recognition
- Structured State Space Decoder for Speech Recognition and Synthesis
- Spiking Structured State Space Model for Monaural Speech Enhancement
- A Neural State-Space Model Approach to Efficient Speech Separation
- Multi-Head State Space Model for Speech Recognition
- Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation [code]
- SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model [code]
- Audio Mamba: Bidirectional State Space Model for Audio Representation Learning [code]
- Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis [code]
## Vision
- S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces (NeurIPS 2022)
- Long movie clip classification with state-space video models (ECCV 2022) [code]
- Efficient Movie Scene Detection using State-Space Transformers (CVPR 2023)
- Selective Structured State-Spaces for Long-Form Video Understanding (CVPR 2023)
- 2-D SSM: A General Spatial Layer for Visual Transformers [code]
- Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model [code]
- VMamba: Visual State Space Model [code]
- U-shaped Vision Mamba for Single Image Dehazing [code]
- Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning [code]
- Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation [code]
- LocalMamba: Visual State Space Model with Windowed Selective Scan [code]
- Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM [code]
- A Survey on Visual Mamba
- SUM: Saliency Unification through Mamba for Visual Attention Modeling [code]
- State Space Models for Event Cameras (CVPR 2024 Spotlight) [code]
## Time-Series
- Deep State Space Models for Time Series Forecasting (NeurIPS 2018)
- FiLM: Frequency improved Legendre Memory Model for Long-term Time Series Forecasting (NeurIPS 2022)
- Effectively modeling time series with simple discrete state spaces (ICLR 2023)
- Deep Latent State Space Models for Time-Series Generation (ICML 2023)
- Generative AI for End-to-End Limit Order Book Modelling (ICAIF 2023)
- On the Performance of Legendre State-Space Models in Short-Term Time Series Forecasting (CCECE 2023)
- Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
- Diffusion-based Time Series Imputation and Forecasting with Structured State Space Models
## Medical
- Structured State Space Models for Multiple Instance Learning in Digital Pathology
- Modeling Multivariate Biosignals with Graph Neural Networks and Structured State Space
- Diffusion-based conditional ECG generation with structured state space models
- Improving the Diagnosis of Psychiatric Disorders with Self-Supervised Graph State Space Models
- fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models
- Vivim: a Video Vision Mamba for Medical Video Object Segmentation [code]
- MambaMorph: a Mamba-based Backbone with Contrastive Feature Learning for Deformable MR-CT Registration [code]
- SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation [code]
- U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation [code]
- nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model
- VM-UNet: Vision Mamba UNet for Medical Image Segmentation
- MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation
- ViM-UNet: Vision Mamba for Biomedical Segmentation (MIDL 2024)
- I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling [code]
- BioMamba: A Pre-trained Biomedical Language Representation Model Leveraging Mamba
## Tabular
## Reinforcement Learning
- Decision S4: Efficient Sequence-Based RL via State Spaces Layers (ICLR 2023)
- Structured State Space Models for In-Context Reinforcement Learning (NeurIPS 2023)
- Mastering Memory Tasks with World Models (ICLR 2024 oral)
## SSM Parameterization and Initialization
- Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers (NeurIPS 2021)
- Efficiently Modeling Long Sequences with Structured State Spaces (ICLR 2022)
- On the Parameterization and Initialization of Diagonal State Space Models (NeurIPS 2022)
- Diagonal State Spaces are as Effective as Structured State Spaces (NeurIPS 2022) [code]
- How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections (ICLR 2023)
- Robustifying State-space Models for Long Sequences via Approximate Diagonalization
- StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
- Spectral State Space Models
- From Generalization Analysis to Optimization Designs for State Space Models (ICML 2024)
## Miscellaneous
- Variational learning for switching state-space models (Neural Computation 2000)
- Liquid structural state-space models (ICLR 2023)
- Resurrecting Recurrent Neural Networks for Long Sequences (ICML 2023)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets (ICLR 2023)
- Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors
- Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks (NeurIPS 2019)
## Contributions
🎉 Thank you for considering contributing to our Awesome State Space Models for Machine Learning repository! 🚀
### Contribute in 3 Steps

1. **Fork the Repo**: Fork this repo to your GitHub account.
2. **Edit Content**: Contribute by adding new resources or improving existing content in the `README.md` file.
3. **Create a Pull Request**: Open a pull request (PR) from your branch to the main repository.
### Guidelines
- Follow the existing structure and formatting (see the example entry after this list).
- Ensure added resources are relevant to State Space Models in Machine Learning.
- Verify that links work correctly.
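For example, a new entry could follow the same bullet style used throughout this list (placeholder title, venue, and code marker):

```markdown
- Paper Title (Venue Year) [code]
```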
### Reporting Issues
If you encounter issues or have suggestions, open an issue on the GitHub repository.
Your contributions make this repository awesome! Thank you! 🙌
## License
This project is licensed under the MIT License.