Awesome

A Mamba-Diffusion Framework for Multimodal Remote Sensing Image Semantic Segmentation

Wen-Liang Du, Yang Gu, Jiaqi Zhao, Hancheng Zhu, Rui Yao and Yong Zhou

Overview

Abstract

We propose a mamba-diffusion framework to preserve geometric consistency in segmentation masks. This framework preserves geometric consistency by introducing a generative diffusion-based semantic segmentation pipeline and developing a Mamba-based multimodal fusion model. The fusion model fuses the multimodal images in multiple scales and scanning mechanisms by a double cross-fusion (DCF) module. Then, the cross-modal information is further integrated by a dual-splitting structured state-space (DS-S4) model. Finally, the diffusion-based segmentation pipeline predicts semantic masks by progressively refining random Gaussian noise, guided by fused multimodal features. Our experimental results, verified on WHU-OPT-SAR and Hunan datasets, demonstrate that the proposed framework surpasses state-of-the-art (SOTA) methods by a considerable margin.

Framework

Getting Started

Experientially, we recommended to configure mamba environment before installing mmseg framework

# recommended to create a new environment with torch1.13.0 + cuda11.7
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 
--extra-index-url https://download.pytorch.org/whl/cu117

Step1: Vmamba

# Mamba-ssm
pip install causal-conv1d==1.1.1
pip install mamba-ssm==1.1.1

# Vmamba
git clone https://github.com/MzeroMiko/VMamba.git
cd VMamba
pip install -r requirements.txt
cd kernels/selective_scan && pip install .

Step2: MMSegmentation

The code is based on the MMSegmentation v0.30.0.

See MMSegmentation for more details on how to install the MMSegmentation framework

pip install mmcv-full==1.7.2 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13.0/index.html

cd Mamba-Diffusion
pip install -v -e .

Results

Checkpoints

https://pan.baidu.com/s/14IloqNUx746n8GjSZj0odA?pwd=pu62 
password：pu62

Training

Multi-gpu training

bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}

For example, To train Our model on whu-opt-sar with 4 gpus run:

bash tools/dist_train.sh configs/whu/ddp_fuse-mamba_4x4_256x256_160k_whu-fianl.py 4

Evaluation

Single-gpu testing

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --eval mIoU

Multi-gpu testing

bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} --eval mIoU