Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images
This is the official code repository for "Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images" (arXiv: https://arxiv.org/abs/2406.14086).
Abstract
Recent advancements in autoregressive networks with linear complexity have driven significant research progress, demonstrating exceptional performance in large language models. A representative model is the Extended Long Short-Term Memory (xLSTM), which incorporates gating mechanisms and memory structures, performing comparably to Transformer architectures in long-sequence language tasks. Autoregressive networks such as xLSTM can utilize image serialization to extend their application to visual tasks such as classification and segmentation. Although existing studies have demonstrated Vision-LSTM's impressive results in image classification, its performance in image semantic segmentation remains unverified. Our study represents the first attempt to evaluate the effectiveness of Vision-LSTM in the semantic segmentation of remotely sensed images. This evaluation is based on a specifically designed encoder-decoder architecture named Seg-LSTM and on comparisons with state-of-the-art segmentation networks. Our study found that Vision-LSTM's performance in semantic segmentation was limited and generally inferior to Vision-Transformer-based and Vision-Mamba-based models in most comparative tests. Future research directions for enhancing Vision-LSTM are recommended. The source code is available from https://github.com/zhuqinfeng1999/Seg-LSTM.
Installation
Requirements
- Ubuntu 20.04, CUDA 12.4
- Set up the MMSegmentation environment; our experiments are conducted within the MMSegmentation framework. Please refer to https://github.com/open-mmlab/mmsegmentation; a minimal setup sketch is shown below.
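
A minimal environment-setup sketch following MMSegmentation's installation guide (package versions are illustrative; choose PyTorch/mmcv builds that match CUDA 12.4):

```bash
# Install MMSegmentation's dependencies via OpenMIM (per the official guide)
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"

# Get MMSegmentation itself and install it in editable mode
git clone https://github.com/open-mmlab/mmsegmentation.git
cd mmsegmentation
pip install -v -e .
```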
LoveDA datasets
- The LoveDA dataset can be found at https://github.com/Junjue-Wang/LoveDA.
- After downloading the dataset, place it under '/mmsegmentation/data/loveDA/' with the layout below (a setup sketch follows the list):
- '/mmsegmentation/data/loveDA/'
  - ann_dir
    - train
      - *.png
    - val
      - *.png
  - img_dir
    - train
      - *.png
    - val
      - *.png
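
A minimal sketch for preparing this layout (paths are illustrative; how the source files map into these folders depends on how the downloaded LoveDA archives are organized):

```bash
# Create the directory skeleton expected by the Seg-LSTM configs
mkdir -p /mmsegmentation/data/loveDA/img_dir/{train,val}
mkdir -p /mmsegmentation/data/loveDA/ann_dir/{train,val}

# Then copy the downloaded LoveDA images into img_dir/ and the annotation
# masks into ann_dir/, keeping the train/val splits separate.
```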
Model file and config file
- The model file zqf_seglstm.py can be found in /mmsegmentation/mmseg/models/backbones/.
- The config files zqf_seglstm_'decoder'.py, which combine the backbone with a decoder head, can be found in /mmsegmentation/configs/_base_/models/.
- The config files for training can be found in /mmsegmentation/configs/zqf_seglstm/.
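
A quick sanity check from the repository root that these files are in place (illustrative only; adjust paths if your checkout differs):

```bash
# Verify the Seg-LSTM backbone and config files are where the configs expect them
ls mmseg/models/backbones/zqf_seglstm.py
ls configs/_base_/models/ | grep -i seglstm   # backbone + decoder-head base configs
ls configs/zqf_seglstm/                       # training configs
```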
Training Seg-LSTM
```bash
bash tools/dist_train.sh 'configfile' 2 --work-dir /mmsegmentation/output/seglstm
```
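
If only one GPU is available, the same config can be run through MMSegmentation's standard non-distributed entry point (a sketch; 'configfile' is a training config from /mmsegmentation/configs/zqf_seglstm/):

```bash
# Single-GPU training with the standard MMSegmentation trainer
python tools/train.py 'configfile' --work-dir /mmsegmentation/output/seglstm
```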
Testing Seg-LSTM
```bash
bash tools/dist_test.sh 'configfile' /mmsegmentation/output/seglstm/iter_15000.pth 2 --out /mmsegmentation/visout/seglstm
```
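
For single-GPU evaluation, the non-distributed test script can be used analogously (a sketch based on standard MMSegmentation usage):

```bash
# Single-GPU testing; --out stores the predicted segmentation maps for offline inspection
python tools/test.py 'configfile' /mmsegmentation/output/seglstm/iter_15000.pth --out /mmsegmentation/visout/seglstm
```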
Citation
If you find this work useful in your research, please consider citing it:
```bibtex
@article{zhu2024seglstm,
  title={Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images},
  author={Qinfeng Zhu and Yuanzhi Cai and Lei Fan},
  year={2024},
  eprint={2406.14086},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
License
This project is released under the Apache 2.0 license.
Acknowledgement
We thank the authors of the public datasets used in this work for making these valuable resources available to the research community. We also thank the authors of xLSTM, Vision-LSTM, and MMSegmentation for making their code publicly available.