
<div align="center"> <h1> 🐍 The Hidden Attention of Mamba Models 🐍 </h1>

Ameen Ali<sup>1</sup>*, Itamar Zimerman<sup>1</sup>* and Lior Wolf<sup>1</sup> <br> ameenali023@gmail.com, itamarzimm@gmail.com, liorwolf@gmail.com <br> <sup>1</sup> Tel Aviv University (*) equal contribution

</div>

Official PyTorch Implementation of "The Hidden Attention of Mamba Models"

The Mamba layer offers an efficient state space model (SSM) that is highly effective in modeling multiple domains, including long-range sequences and images. SSMs are usually viewed as dual models: they are trained in parallel on the entire sequence using convolutions and deployed in an autoregressive manner. We add a third view and show that such models can be viewed as attention-driven models. This new perspective enables us to compare the underlying mechanisms to those of the self-attention layers in transformers and allows us to peer inside the inner workings of the Mamba model with explainability methods. <br> The paper is available on arXiv: <a href="https://arxiv.org/pdf/2403.01590.pdf">The Hidden Attention of Mamba Models</a>
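To make the attention view concrete, the sketch below unrolls a selective SSM recurrence for a single channel and checks that the output equals a lower-triangular "hidden attention" matrix applied to the input. It is a minimal pure-Python illustration, not the code in this repository: it assumes a diagonal discretized state matrix, and the names `ssm_recurrent`, `hidden_attention`, `A_bar`, `B_bar`, and `C` are chosen here for illustration only.

```python
import random

def ssm_recurrent(x, A_bar, B_bar, C):
    """Recurrent view: h_t = A_bar_t * h_{t-1} + B_bar_t * x_t, y_t = <C_t, h_t>."""
    n_state = len(A_bar[0])
    h = [0.0] * n_state
    ys = []
    for t in range(len(x)):
        h = [A_bar[t][n] * h[n] + B_bar[t][n] * x[t] for n in range(n_state)]
        ys.append(sum(C[t][n] * h[n] for n in range(n_state)))
    return ys

def hidden_attention(A_bar, B_bar, C):
    """Attention view: alpha[i][j] = C_i . (prod_{k=j+1..i} A_bar_k) . B_bar_j for j <= i."""
    L, n_state = len(A_bar), len(A_bar[0])
    alpha = [[0.0] * L for _ in range(L)]
    for i in range(L):
        for j in range(i + 1):
            total = 0.0
            for n in range(n_state):
                decay = 1.0  # empty product when j == i
                for k in range(j + 1, i + 1):
                    decay *= A_bar[k][n]
                total += C[i][n] * decay * B_bar[j][n]
            alpha[i][j] = total
    return alpha

# Random input-dependent parameters for a toy sequence (assumed sizes).
random.seed(0)
L, N = 6, 4
x = [random.gauss(0, 1) for _ in range(L)]
A_bar = [[random.uniform(0.5, 1.0) for _ in range(N)] for _ in range(L)]
B_bar = [[random.gauss(0, 1) for _ in range(N)] for _ in range(L)]
C = [[random.gauss(0, 1) for _ in range(N)] for _ in range(L)]

alpha = hidden_attention(A_bar, B_bar, C)
y_attn = [sum(alpha[i][j] * x[j] for j in range(L)) for i in range(L)]
y_rec = ssm_recurrent(x, A_bar, B_bar, C)

# Both views should agree up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(y_attn, y_rec))
print("max |y_attn - y_rec| =", max_diff)
```

The matrix `alpha` is causal (zero above the diagonal), so each row can be read like an attention row: entry `alpha[i][j]` says how much input position `j` contributes to output position `i`.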

Set Up Environment

Python 3.10.13
- conda create -n your_env_name python=3.10.13
Activate the environment
- conda activate your_env_name
CUDA Toolkit 11.8
- conda install nvidia/label/cuda-11.8.0::cuda-toolkit
torch 2.1.1 + cu118
- pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
Requirements: vim_requirements.txt
- pip install -r vim/vim_requirements.txt
Install Jupyter
- pip install jupyter
Install causal_conv1d and mamba from <b>our source</b> (the copies bundled in this repository)
- cd causal-conv1d
- pip install --editable .
- cd ..
- pip install --editable mamba-1p1p1
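
After the steps above, a quick sanity check of the installed versions can save debugging time later. The following is a small sketch using only the standard library. The pinned versions come from the commands above, while the distribution names `causal_conv1d` and `mamba_ssm` for the two locally installed packages are assumptions, and `check_environment` is a hypothetical helper rather than part of this repository.

```python
from importlib import metadata

# Versions pinned in the setup steps. The distribution names of the two
# editable installs (causal_conv1d, mamba_ssm) are assumptions.
EXPECTED = {
    "torch": "2.1.1",
    "torchvision": "0.16.1",
    "torchaudio": "2.1.1",
    "causal_conv1d": None,  # None: any installed version is accepted
    "mamba_ssm": None,
}

def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, ok)} without importing the packages."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        # Ignore local build tags such as "+cu118" when comparing.
        base = installed.split("+")[0]
        report[name] = (installed, wanted is None or base == wanted)
    return report

if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH or missing"
        print(f"{name:15s} {installed or '-':15s} {status}")
```

Run it inside the activated conda environment. A missing or mismatched entry points to the setup step that needs to be repeated.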

Pre-Trained Weights

We use the official weights provided by Vim, which can be downloaded from the repositories below:

Model	#param.	Top-1 Acc.	Top-5 Acc.	Hugging Face Repo
Vim-tiny	7M	76.1	93.0	https://huggingface.co/hustvl/Vim-tiny-midclstok
Vim-tiny<sup>+</sup>	7M	78.3	94.2	https://huggingface.co/hustvl/Vim-tiny-midclstok
Vim-small	26M	80.5	95.1	https://huggingface.co/hustvl/Vim-small-midclstok
Vim-small<sup>+</sup>	26M	81.6	95.4	https://huggingface.co/hustvl/Vim-small-midclstok
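
The repositories in the table can also be fetched programmatically. The sketch below assumes the `huggingface_hub` package is installed (`pip install huggingface_hub`, which is not part of the setup steps above) and uses its `snapshot_download` function to fetch a whole repository. The helper name `download_vim_weights` and the `checkpoints` directory are hypothetical, and the checkpoint file names inside each repository are not assumed here.

```python
# Repository IDs taken from the table above.
VIM_REPOS = {
    "vim-tiny": "hustvl/Vim-tiny-midclstok",
    "vim-small": "hustvl/Vim-small-midclstok",
}

def download_vim_weights(model="vim-small", local_dir="checkpoints"):
    """Download every file of the chosen repository and return the local folder path."""
    # Imported lazily so the mapping above is usable without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    repo_id = VIM_REPOS[model]
    return snapshot_download(repo_id=repo_id, local_dir=f"{local_dir}/{model}")

if __name__ == "__main__":
    # Requires network access; prints the folder that holds the downloaded files.
    print(download_vim_weights("vim-small"))
```

Inspect the returned folder to find the checkpoint file, then point the notebook at that path.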

Notes:

<b>All of our experiments use Vim-small.</b>

Vision-Mamba Explainability Notebook:

<div align="center"> <img src="assets/xai_gradmethod.jpg" alt="Left Image" align="center" width="1000" height="300"> </div> <br> Follow the instructions in the <b>vim/vmamba_xai.ipynb</b> notebook to run single-image inference with the three methods introduced in the paper. <br> <div align="center"> <img src="assets/notebook.png" alt="Left Image" align="center" width="600" height="600"> </div>

To-Do

For the segmentation experiment, please check out our follow-up work. <br>

<ul> <strike><li><input type="checkbox" id="task1" checked disabled><label for="task1">XAI - Single Image Inference Notebook</label></li></strike> <strike><li><input type="checkbox" id="task2" checked disabled><label for="task2">XAI - Segmentation Experiments</label></li></strike> </ul>

Citation

If you find our work useful, please consider citing us:

@misc{ali2024hidden,
      title={The Hidden Attention of Mamba Models}, 
      author={Ameen Ali and Itamar Zimerman and Lior Wolf},
      year={2024},
      eprint={2403.01590},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Acknowledgement

This repository is heavily based on Vim, Mamba, and Transformer-Explainability. We thank the authors for their wonderful work.