Single-stream Extractor Network with Contrastive Pre-training for Remote Sensing Change Captioning

Authors: Qing Zhou, Junyu Gao, Yuan Yuan, Qi Wang☨

This repository is the official implementation of SEN. It also supports RSICCformer and MCCFormer.

Requirements

To install requirements:

pip install -r requirements.txt

Download the data from LEVIR-CC and put it in ./LEVIR_CC_dataset/.

Then preprocess the dataset for training as follows:

python create_input_files.py --min_word_freq 5
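
To illustrate what a minimum word frequency threshold typically does during caption preprocessing, here is a minimal sketch. It is not taken from create_input_files.py: the function names `build_word_map` and `encode_caption`, the special tokens, and the strict "greater than" comparison are all assumptions, chosen to mirror common image-captioning preprocessing code.

```python
from collections import Counter

def build_word_map(tokenized_captions, min_word_freq=5):
    """Build a word-to-index map, keeping only sufficiently frequent words.

    Assumption: a word is kept when it occurs MORE than min_word_freq times.
    The real script may use a different comparison or different special tokens.
    """
    freq = Counter(word for caption in tokenized_captions for word in caption)
    kept = sorted(word for word, count in freq.items() if count > min_word_freq)
    # Index 0 is reserved for padding; regular words start at 1.
    word_map = {word: idx + 1 for idx, word in enumerate(kept)}
    word_map["<unk>"] = len(word_map) + 1
    word_map["<start>"] = len(word_map) + 1
    word_map["<end>"] = len(word_map) + 1
    word_map["<pad>"] = 0
    return word_map

def encode_caption(caption, word_map):
    """Map tokens to indices, replacing rare words with <unk>."""
    unk = word_map["<unk>"]
    body = [word_map.get(word, unk) for word in caption]
    return [word_map["<start>"]] + body + [word_map["<end>"]]

if __name__ == "__main__":
    captions = [["the", "road"]] * 6 + [["a", "house"]]
    word_map = build_word_map(captions, min_word_freq=5)
    print(word_map)
    print(encode_caption(["a", "road"], word_map))
```

Under these assumptions, a higher threshold gives a smaller vocabulary and maps more rare words to the unknown token.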

Pre-trained models

You can download the pre-trained models from Baidu Pan. The download includes the following weights:

Training

To train the SEN model, run this command:

CUDA_VISIBLE_DEVICES=0 python train.py \
  --more_reproducibility \
  --savepath model_checkpoints/SEN --model SEN \
  --batch_size 128 --proj_channel 512 \
  --encoder_n_layers 2 --ft_layer 4 --model_stage 4 \
  --weight_path pretrain_ckpt/rn50.pth.tar

To train the RSICCformer model, run this command:

CUDA_VISIBLE_DEVICES=0 python train.py \
  --more_reproducibility \
  --savepath model_checkpoints/RSICCformer --model RSICCformer \
  --batch_size 128 --encoder_image resnet101 \
  --encoder_feat MCCFormers_diff_as_Q --decoder trans

To train the MCCFormer-S/D model, run this command:

CUDA_VISIBLE_DEVICES=0 python train.py \
  --more_reproducibility \
  --savepath model_checkpoints/MCCFormer-S --model RSICCformer \
  --batch_size 128 --encoder_image resnet101 \
  --encoder_feat MCCFormers-S --decoder trans \
  --n_layer 2 --n_heads 4 --decoder_n_layers 2
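
If you want to launch the three configurations above from one script, the following sketch assembles the same command lines in Python. The flag values are copied from this README. The `CONFIGS` dictionary and the `build_command` helper are hypothetical and not part of the repository. The sketch only prints the commands and does not run them.

```python
import shlex

# Flag sets copied verbatim from the training commands in this README.
CONFIGS = {
    "SEN": [
        "--savepath", "model_checkpoints/SEN", "--model", "SEN",
        "--batch_size", "128", "--proj_channel", "512",
        "--encoder_n_layers", "2", "--ft_layer", "4", "--model_stage", "4",
        "--weight_path", "pretrain_ckpt/rn50.pth.tar",
    ],
    "RSICCformer": [
        "--savepath", "model_checkpoints/RSICCformer", "--model", "RSICCformer",
        "--batch_size", "128", "--encoder_image", "resnet101",
        "--encoder_feat", "MCCFormers_diff_as_Q", "--decoder", "trans",
    ],
    "MCCFormer-S": [
        "--savepath", "model_checkpoints/MCCFormer-S", "--model", "RSICCformer",
        "--batch_size", "128", "--encoder_image", "resnet101",
        "--encoder_feat", "MCCFormers-S", "--decoder", "trans",
        "--n_layer", "2", "--n_heads", "4", "--decoder_n_layers", "2",
    ],
}

def build_command(name, gpu="0"):
    """Return the shell command string for one named configuration (hypothetical helper)."""
    argv = ["python", "train.py", "--more_reproducibility", *CONFIGS[name]]
    return f"CUDA_VISIBLE_DEVICES={gpu} " + shlex.join(argv)

if __name__ == "__main__":
    for name in CONFIGS:
        print(build_command(name))
```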

Evaluation

To evaluate the SEN model, run:

python eval.py --path ./model_checkpoints/SEN/ --model SEN

To evaluate the RSICCformer or MCCFormer-S/D model, run:

python eval.py --path ./model_checkpoints/RSICCformer/ --model RSICCformer

Result

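| Method | B@1 | B@2 | B@3 | B@4 | M | R | C | S_m | P | FPS |
|---|---|---|---|---|---|---|---|---|---|---|
| Capt-Rep-Diff | 72.90 | 61.98 | 53.62 | 47.41 | 34.47 | 65.64 | 110.57 | 64.52 | - | - |
| Capt-Att | 77.64 | 67.40 | 59.24 | 53.15 | 36.58 | 69.73 | 121.22 | 70.17 | - | - |
| Capt-Dual-Att | 79.51 | 70.57 | 63.23 | 57.46 | 36.56 | 70.69 | 124.42 | 72.28 | - | - |
| DUDA | 81.44 | 72.22 | 64.24 | 57.79 | 37.15 | 71.04 | 124.32 | 72.58 | - | - |
| MCCFormer-S | 79.90 | 70.26 | 62.68 | 56.68 | 36.17 | 69.46 | 120.39 | 70.68 | 69.0 | 12.9 |
| MCCFormer-D | 80.42 | 70.87 | 62.86 | 56.38 | 37.29 | 70.32 | 124.44 | 72.11 | 69.0 | 12.4 |
| RSICCformer_c | 83.09 | 74.32 | 66.66 | 60.44 | 38.76 | 72.63 | 130.00 | 75.46 | 56.2 | 15.0 |
| PSNet | 83.86 | 75.13 | 67.89 | 62.11 | 38.80 | 73.60 | 132.62 | 76.78 | - | - |
| Δ | +1.24 | +1.92 | +2.12 | +1.98 | +0.79 | +0.97 | +3.40 | +1.79 | -16.3 | +8.7 |
| SEN (ours) | 85.10 | 77.05 | 70.01 | 64.09 | 39.59 | 74.57 | 136.02 | 78.57 | 39.9 | 23.7 |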
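
The reported numbers suggest that the S_m column is the arithmetic mean of B@4, M, R, and C. The README does not state this definition, so treat it as an assumption inferred from the table. The sketch below recomputes that mean for a few rows and prints it next to the reported value. The helper name `mean_score` is hypothetical.

```python
def mean_score(b4, m, r, c):
    """Arithmetic mean of the B@4, M, R and C scores (assumed definition of S_m)."""
    return (b4 + m + r + c) / 4.0

# (B@4, M, R, C, reported S_m) copied from the results table.
ROWS = {
    "Capt-Rep-Diff": (47.41, 34.47, 65.64, 110.57, 64.52),
    "RSICCformer_c": (60.44, 38.76, 72.63, 130.00, 75.46),
    "PSNet": (62.11, 38.80, 73.60, 132.62, 76.78),
    "SEN (ours)": (64.09, 39.59, 74.57, 136.02, 78.57),
}

if __name__ == "__main__":
    for name, (b4, m, r, c, reported) in ROWS.items():
        computed = mean_score(b4, m, r, c)
        # Allow for rounding of the reported value to two decimals.
        status = "matches" if abs(computed - reported) < 0.01 else "differs"
        print(f"{name}: computed {computed:.4f}, reported {reported:.2f} ({status})")
```

By the same arithmetic, the Δ row matches SEN minus PSNet for the caption metrics (for example 85.10 - 83.86 = 1.24 for B@1) and SEN minus RSICCformer_c for P and FPS (39.9 - 56.2 = -16.3 and 23.7 - 15.0 = 8.7).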

Citation

@ARTICLE{10530145,
  author={Zhou, Qing and Gao, Junyu and Yuan, Yuan and Wang, Qi},
  journal={IEEE Transactions on Geoscience and Remote Sensing}, 
  title={Single-Stream Extractor Network With Contrastive Pre-Training for Remote-Sensing Change Captioning}, 
  year={2024},
  volume={62},
  number={},
  pages={1-14},
}

Reference

Thanks to the following repositories: RSICCformer, SSL4EO-S12.