TR-MISR: Multi-Image Super-Resolution
Ranked #1 on Multi-Frame Super-Resolution on PROBA-V (Papers With Code)
Multi-image super-resolution (MISR) is a challenging problem in remote sensing, and the release of the PROBA-V Kelvin dataset has drawn considerable interest to it. We believe that multiple images contain more information than a single image, so the utilization of each image should be improved significantly. Moreover, because the images of a PROBA-V scene are taken over a long time span, the impact of image misalignment needs to be reduced.
In this repository, we present TR-MISR, a novel Transformer-based MISR framework that achieves state-of-the-art performance on the PROBA-V Kelvin dataset. TR-MISR does not pursue complexity in the encoder and decoder, but rather the fusion capability of the fusion module. Specifically, we rearrange the feature maps encoded from the low-resolution images into a set of feature vectors. By adding a learnable embedding vector, these feature vectors can be fused through multiple Transformer layers with self-attention. The output at the embedding vector's position is then decoded into a patch of the high-resolution image.
TR-MISR can accept additional unclear input images without performance degradation, and it does not require pre-training. Overall, TR-MISR is an attempt to apply Transformers to a specific low-level vision task.
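The fusion step described above can be sketched in PyTorch. This is a minimal illustration under our own assumptions (layer sizes, token layout, and the 3x3 sub-pixel decoding are illustrative choices, not the repository's exact implementation):

```python
import torch
import torch.nn as nn

class FusionSketch(nn.Module):
    """Sketch of Transformer-based fusion: K encoded low-resolution
    feature maps are rearranged into per-position token sequences, a
    learnable embedding vector is prepended, and the Transformer output
    at that position is decoded into a x3 super-resolved patch."""
    def __init__(self, num_feats=64, layers=4, heads=8):
        super().__init__()
        dim = num_feats  # one token per frame at each spatial position
        self.embed = nn.Parameter(torch.zeros(1, 1, dim))  # learnable embedding vector
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.decode = nn.Linear(dim, 9)  # 3x3 sub-pixel block per position (x3 upscaling)

    def forward(self, feats):
        # feats: (B, K, C, H, W) -> one sequence of K tokens per spatial position
        B, K, C, H, W = feats.shape
        tokens = feats.permute(0, 3, 4, 1, 2).reshape(B * H * W, K, C)
        tokens = torch.cat([self.embed.expand(B * H * W, 1, C), tokens], dim=1)
        fused = self.encoder(tokens)[:, 0]  # output at the embedding position
        out = self.decode(fused).reshape(B, H, W, 3, 3)
        # interleave the 3x3 blocks into a (3H, 3W) high-resolution patch
        return out.permute(0, 1, 3, 2, 4).reshape(B, 1, H * 3, W * 3)
```

Note how self-attention runs over the K frames independently at each spatial position, which is what makes the fusion order-invariant with respect to the input images.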
- Recommended GPU platform: an NVIDIA® Tesla® V100 GPU.
- If you are using another GPU and run out of memory, reduce the batch size or choose smaller model hyperparameters as appropriate.
0. Setup the environment
- Set up a Python environment and install the dependencies. Our Python version is 3.6.12.
pip install -r requirements.txt
1. Prepare the data set
- Download the training/validation set provided by RAMS: https://github.com/EscVM/RAMS/tree/master/probav_data
- Run the split_data_fit script to crop images in each scene for the training set.
python ./split_data_fit.py
- Run the save_clearance script to precompute clearance scores for low-resolution images.
python ./save_clearance.py
- You can easily get the complete preprocessed dataset on Google Drive or Baidu Cloud (code:gflb).
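The save_clearance step precomputes, for each low-resolution image, how clear it is. A minimal sketch of such a score, assuming the QM status masks are 2-D arrays in which nonzero means the pixel is clear (the repository's exact convention may differ):

```python
import numpy as np

def clearance_score(status_mask):
    """Fraction of pixels marked clear (nonzero) in a status map.

    status_mask: 2-D array loaded from a quality-map image, where
    nonzero means the pixel is not obscured (e.g. by clouds or ice).
    """
    mask = np.asarray(status_mask) != 0
    return float(mask.mean())

# Example: a 4x4 mask with one obscured row -> 12/16 = 0.75
qm = np.ones((4, 4))
qm[0, :] = 0
score = clearance_score(qm)  # 0.75
```

Precomputing these scores once makes it cheap to rank and select the clearest low-resolution frames of each scene during training.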
2. Complete the config file
The main settings in the config file are shown in the following table.
Item | Description |
---|---|
prefix | Path of the dataset. |
use_all_bands | Whether to use all bands; if False, set use_band. |
use_all_data_to_fight_leaderboard | If True, train on the full training set and skip validation. |
strategy | Learning-rate decay strategy; set to Manual by default. |
pth_epoch_num | Load the model saved at the corresponding epoch number. |
truncate_values | Whether to truncate values that exceed the valid range. |
data_arguments | Please set to False. |
all_loss_depend | If True, set the ratio of the three losses. |
model_path_band | Path of the model. |
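To make the interaction between these settings concrete, here is a hypothetical snippet showing how the keys from the table might be filled in and sanity-checked. The JSON layout and the example values are assumptions for illustration, not the repository's actual config file:

```python
import json

# Hypothetical config fragment; key names follow the table above,
# values are example placeholders.
example = """
{
  "prefix": "/data/probav",
  "use_all_bands": false,
  "use_band": "NIR",
  "use_all_data_to_fight_leaderboard": false,
  "strategy": "Manual",
  "pth_epoch_num": 200,
  "truncate_values": true,
  "data_arguments": false,
  "all_loss_depend": false
}
"""

cfg = json.loads(example)

# If not all bands are used, a specific band must be chosen.
if not cfg["use_all_bands"]:
    assert cfg["use_band"] in ("NIR", "RED")
```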
3. Train the model
Once the above preparations are complete, it's time for training.
python ./src/train.py
If you need to record the training log, run
python ./src/train.py 2>&1 | tee re.log
The re.log file records the training details of each epoch; the following scripts parse and print them.
python ./train_all_read_log.py # for training all data
python ./train_read_log.py # for training and evaluation
You can also view the training logs with tensorboardX.
tensorboard --logdir='tb_logs/'
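As an illustration of what the log-reading scripts do, here is a hedged sketch of scanning a training log for per-epoch metrics. The line format below is an assumption, not the actual format of re.log:

```python
import re

# Assumed line format: "Epoch <n> ... val score: <x>" (illustrative only).
pattern = re.compile(r"Epoch\s+(\d+).*?val score:\s*([0-9.]+)")

def best_epoch(log_text):
    """Return (epoch, score) for the best (lowest) validation score,
    or None if no matching lines are found."""
    hits = [(int(e), float(s)) for e, s in pattern.findall(log_text)]
    return min(hits, key=lambda t: t[1]) if hits else None

sample = "Epoch 1 ... val score: 0.995\nEpoch 2 ... val score: 0.987\n"
best = best_epoch(sample)  # (2, 0.987)
```

On the PROBA-V benchmark the score is a ratio against the baseline, so lower is better; that is why the sketch takes the minimum.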
4. Validate the model
- Set the paths of the models trained on the NIR band and the RED band, respectively.
- The val script outputs val_plot.png, which visualizes the results of each scene obtained by TR-MISR compared to the baseline.
python ./src/val.py
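Validation on PROBA-V uses a clearance-corrected PSNR (cPSNR). A simplified sketch of the metric is shown below: it corrects for a constant brightness bias over clear pixels; the full challenge metric additionally maximizes the score over small translations of the predicted image, which is omitted here:

```python
import numpy as np

def cpsnr(sr, hr, clear):
    """Brightness-bias-corrected PSNR over clear pixels (simplified).

    sr, hr: super-resolved and ground-truth images in [0, 1].
    clear:  boolean mask of clear pixels in the ground truth.
    """
    sr, hr = np.asarray(sr, float), np.asarray(hr, float)
    clear = np.asarray(clear, bool)
    b = (hr[clear] - sr[clear]).mean()              # brightness bias
    mse = ((hr[clear] - sr[clear] - b) ** 2).mean()  # bias-corrected MSE
    return 10.0 * np.log10(1.0 / mse)
```

The bias correction makes the score invariant to a constant intensity offset, so a prediction is not penalized for being uniformly brighter or darker than the ground truth.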
5. Test the model
The test script is mainly used to produce submissions for the leaderboard, since the ground truth of the testing set is not released. It outputs a submission zip with 16-bit images (located in './submission/') and visual results with 8-bit images (located in './submission_vis/'). The test platform is still open, allowing more methods to challenge their performance limits.
python ./src/test.py
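The two output formats differ only in bit depth. A minimal sketch of the conversion, assuming predictions are floats in [0, 1] (the value ranges and rounding are assumptions; the repository may scale differently):

```python
import numpy as np

def to_submission(sr_float):
    """Convert a super-resolved image in [0, 1] to the two formats
    described above: 16-bit for the submission zip and 8-bit for
    quick visual inspection."""
    sr = np.clip(np.asarray(sr_float, float), 0.0, 1.0)
    img16 = np.round(sr * 65535).astype(np.uint16)  # for ./submission/
    img8 = np.round(sr * 255).astype(np.uint8)      # for ./submission_vis/
    return img16, img8
```

The leaderboard expects 16-bit images because PROBA-V radiometry exceeds 8-bit precision; the 8-bit copies are only for eyeballing results.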
The leaderboard is shown as follows:
PROBA-V Benchmark (Multi-Frame Super-Resolution) | Papers With Code
If this work helps you, please cite:
@article{an2022tr,
title={TR-MISR: Multiimage Super-Resolution Based on Feature Fusion With Transformers},
author={An, Tai and Zhang, Xin and Huo, Chunlei and Xue, Bin and Wang, Lingfeng and Pan, Chunhong},
journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
volume={15},
pages={1373--1388},
year={2022},
publisher={IEEE}
}