<h1 align="center">A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning</h1>

This is the official PyTorch implementation of A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning.

Preparation

├─/root/Data/LEVIR_CC/
        ├─LevirCCcaptions.json
        ├─images
             ├─train
             │  ├─A
             │  ├─B
             ├─val
             │  ├─A
             │  ├─B
             ├─test
             │  ├─A
             │  ├─B

Folder A contains the pre-phase (before-change) images, and folder B contains the post-phase (after-change) images.
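Before running the preprocessing step, it can help to confirm that the dataset matches the layout above. The following is a minimal sketch, not part of the repository: the helper name `find_missing_paths` is hypothetical, and the root path is simply the one shown in the tree.

```python
from pathlib import Path

# Splits and phase folders taken from the directory tree above.
SPLITS = ("train", "val", "test")
PHASES = ("A", "B")


def find_missing_paths(root):
    """Return the expected LEVIR-CC paths that do not exist under `root`."""
    root = Path(root)
    expected = [root / "LevirCCcaptions.json"]
    expected += [root / "images" / s / p for s in SPLITS for p in PHASES]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    # Dataset root as shown in the tree above; adjust it to your setup.
    missing = find_missing_paths("/root/Data/LEVIR_CC")
    if missing:
        print("Missing paths:")
        for path in missing:
            print("  " + path)
    else:
        print("Dataset layout looks complete.")
```

The check only looks at the folder skeleton and the caption file; it does not verify that every image in A has a counterpart in B.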

$ python preprocess_data.py

!NOTE: When preparing the text token files, we suggest setting the word count threshold of LEVIR-CC to 5 and Dubai_CC to 0 for fair comparisons.
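To illustrate what a word count threshold does, here is a minimal sketch of vocabulary building. It is not the code in preprocess_data.py: the function names, the special tokens, and the rule that a word is kept only when its count is strictly greater than the threshold are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical special tokens; the repository's own token names may differ.
SPECIAL_TOKENS = ["<NULL>", "<UNK>", "<START>", "<END>"]


def build_vocab(tokenized_captions, min_count):
    """Map words to indices, dropping words that are too rare."""
    counts = Counter(word for caption in tokenized_captions for word in caption)
    # Assumption: keep a word only if it occurs strictly more than min_count times.
    kept = sorted(word for word, n in counts.items() if n > min_count)
    return {word: idx for idx, word in enumerate(SPECIAL_TOKENS + kept)}


def encode(caption, vocab):
    """Replace each word by its index, using <UNK> for words outside the vocabulary."""
    unk = vocab["<UNK>"]
    return [vocab.get(word, unk) for word in caption]


captions = [
    ["a", "road", "is", "built"],
    ["a", "house", "is", "built"],
    ["a", "road", "appears"],
]
vocab = build_vocab(captions, min_count=1)
print(encode(["a", "house", "is", "built"], vocab))  # "house" maps to <UNK>
```

Under this assumption, a threshold of 0 keeps every word that appears in the captions, while a higher threshold maps rare words to the unknown token. Because the threshold changes the vocabulary size, using the same value across methods keeps caption metrics comparable.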

Training

$ python train.py

!NOTE: If the program fails with the error "'Meteor' object has no attribute 'lock'", we recommend installing a Java runtime with sudo apt install openjdk-11-jdk to resolve the issue.
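Since the suggested fix installs OpenJDK, the error most likely means that no Java runtime was found; this reading is an assumption, as is the idea that the METEOR scorer is launched as a Java program. A minimal sketch for checking whether `java` is on PATH before training (the helper name `java_available` is hypothetical):

```python
import shutil
import subprocess


def java_available():
    """Return True if a `java` executable is found on PATH."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    if java_available():
        # `java -version` typically writes its report to stderr.
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
        print(result.stderr.strip() or result.stdout.strip())
    else:
        print("No Java runtime found on PATH; try: sudo apt install openjdk-11-jdk")
```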

Testing

$ python test.py

Caption Generation

$ python caption.py

Visual Examples

Here are some visualized examples of the generated captions in LEVIR-CC:

Paper

A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning

Please cite the following paper if you find it useful for your research:

@ARTICLE{10700970,
  author={Sun, Dongwei and Bao, Yajie and Liu, Junmin and Cao, Xiangyong},
  journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing}, 
  title={A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning}, 
  year={2024},
  volume={17},
  number={},
  pages={18727-18738},
  keywords={Transformers;Feature extraction;Remote sensing;Kernel;Attention mechanisms;Accuracy;Sensors;Convolutional neural networks;Computational modeling;Visualization;Change captioning;remote sensing image change detection;sparse attention;transformer encoder},
  doi={10.1109/JSTARS.2024.3471625}}

Acknowledgement

License

This repo is distributed under the MIT License. The code can be used for academic purposes only.