TR-MISR: Multi-Image Super-Resolution
Ranked #1 on Multi-Frame Super-Resolution on PROBA-V (Papers With Code)
Multi-image super-resolution (MISR) is a challenging problem in remote sensing, and the release of the PROBA-V Kelvin dataset has drawn considerable interest to it. We believe that multiple images contain more information than a single image, so the utilization of each image should be improved significantly. Moreover, because the images of a PROBA-V scene are taken over a long time span, the impact of image misalignment needs to be reduced.
In this repository, we present TR-MISR, a novel Transformer-based MISR framework that achieves state-of-the-art performance on the PROBA-V Kelvin dataset. TR-MISR does not pursue complexity in the encoder and decoder, but rather the fusion capability of the fusion module. Specifically, we rearrange the feature maps encoded from the low-resolution images into a set of feature vectors. By adding a learnable embedding vector, these feature vectors can be fused through multiple Transformer layers with self-attention. The output at the embedding vector's position is then decoded into a patch of the high-resolution image.
TR-MISR can accept additional unclear input images without performance degradation, and it does not require pre-training. Overall, TR-MISR is an attempt to apply Transformers to a specific low-level vision task.
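The fusion step described above can be sketched in PyTorch. This is a minimal illustration under our own assumptions (layer sizes, token layout, and the 3x3 sub-pixel decoding are illustrative choices, not the repository's exact implementation):

```python
import torch
import torch.nn as nn

class FusionSketch(nn.Module):
    """Sketch of Transformer-based fusion: K encoded low-resolution
    feature maps are rearranged into per-position token sequences, a
    learnable embedding vector is prepended, and the Transformer output
    at that position is decoded into a x3 super-resolved patch."""
    def __init__(self, num_feats=64, layers=4, heads=8):
        super().__init__()
        dim = num_feats  # one token per frame at each spatial position
        self.embed = nn.Parameter(torch.zeros(1, 1, dim))  # learnable embedding vector
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.decode = nn.Linear(dim, 9)  # 3x3 sub-pixel block per position (x3 upscaling)

    def forward(self, feats):
        # feats: (B, K, C, H, W) -> one sequence of K tokens per spatial position
        B, K, C, H, W = feats.shape
        tokens = feats.permute(0, 3, 4, 1, 2).reshape(B * H * W, K, C)
        tokens = torch.cat([self.embed.expand(B * H * W, 1, C), tokens], dim=1)
        fused = self.encoder(tokens)[:, 0]  # output at the embedding position
        out = self.decode(fused).reshape(B, H, W, 3, 3)
        # interleave the 3x3 blocks into a (3H, 3W) high-resolution patch
        return out.permute(0, 1, 3, 2, 4).reshape(B, 1, H * 3, W * 3)
```

Note how self-attention runs over the K frames independently at each spatial position, which is what makes the fusion order-invariant with respect to the input images.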
- Recommended GPU platform: an NVIDIA® Tesla® V100 GPU.
- If you are using another GPU and run out of memory, reduce the batch size or choose smaller model hyperparameters as appropriate.
0. Setup the environment
- Set up a Python environment and install the dependencies. Our Python version is 3.6.12.
pip install -r requirements.txt
1. Prepare the data set
- Download the training/validation set provided by RAMS: https://github.com/EscVM/RAMS/tree/master/probav_data
- Run the split_data_fit script to crop images in each scene for the training set.
python ./split_data_fit.py
- Run the save_clearance script to precompute clearance scores for low-resolution images.
python ./save_clearance.py
- You can easily get the complete preprocessed dataset on Google Drive or Baidu Cloud (code:gflb).
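The save_clearance step precomputes, for each low-resolution image, how clear it is. A minimal sketch of such a score, assuming the QM status masks are 2-D arrays in which nonzero means the pixel is clear (the repository's exact convention may differ):

```python
import numpy as np

def clearance_score(status_mask):
    """Fraction of pixels marked clear (nonzero) in a status map.

    status_mask: 2-D array loaded from a quality-map image, where
    nonzero means the pixel is not obscured (e.g. by clouds or ice).
    """
    mask = np.asarray(status_mask) != 0
    return float(mask.mean())

# Example: a 4x4 mask with one obscured row -> 12/16 = 0.75
qm = np.ones((4, 4))
qm[0, :] = 0
score = clearance_score(qm)  # 0.75
```

Precomputing these scores once makes it cheap to rank and select the clearest low-resolution frames of each scene during training.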
2. Complete the config file
The main settings in the config file are shown in the following table.
Item | Description |
---|---|
prefix | Path of the dataset. |
use_all_bands | Whether to use all bands; if False, set use_band. |
use_all_data_to_fight_leaderboard | If True, train on the full training set and skip validation. |
strategy | Learning-rate decay strategy; set to Manual by default. |
pth_epoch_num | Load the model saved at the corresponding epoch number. |
truncate_values | Whether to truncate values that exceed the valid range. |
data_arguments | Please set to False. |
all_loss_depend | If True, set the ratio of the three losses. |
model_path_band | Path of the model. |
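To make the interaction between these settings concrete, here is a hypothetical snippet showing how the keys from the table might be filled in and sanity-checked. The JSON layout and the example values are assumptions for illustration, not the repository's actual config file:

```python
import json

# Hypothetical config fragment; key names follow the table above,
# values are example placeholders.
example = """
{
  "prefix": "/data/probav",
  "use_all_bands": false,
  "use_band": "NIR",
  "use_all_data_to_fight_leaderboard": false,
  "strategy": "Manual",
  "pth_epoch_num": 200,
  "truncate_values": true,
  "data_arguments": false,
  "all_loss_depend": false
}
"""

cfg = json.loads(example)

# If not all bands are used, a specific band must be chosen.
if not cfg["use_all_bands"]:
    assert cfg["use_band"] in ("NIR", "RED")
```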
3. Train the model
Once the above preparations are complete, it's time for training.
python ./src/train.py
If you need to record the training log, run
python ./src/train.py 2>&1 | tee re.log
The re.log file records the training details of each epoch; the following scripts parse and print them.
python ./train_all_read_log.py # for training all data
python ./train_read_log.py # for training and evaluation
You can also view the training logs with tensorboardX.
tensorboard --logdir='tb_logs/'
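As an illustration of what the log-reading scripts do, here is a hedged sketch of scanning a training log for per-epoch metrics. The line format below is an assumption, not the actual format of re.log:

```python
import re

# Assumed line format: "Epoch <n> ... val score: <x>" (illustrative only).
pattern = re.compile(r"Epoch\s+(\d+).*?val score:\s*([0-9.]+)")

def best_epoch(log_text):
    """Return (epoch, score) for the best (lowest) validation score,
    or None if no matching lines are found."""
    hits = [(int(e), float(s)) for e, s in pattern.findall(log_text)]
    return min(hits, key=lambda t: t[1]) if hits else None

sample = "Epoch 1 ... val score: 0.995\nEpoch 2 ... val score: 0.987\n"
best = best_epoch(sample)  # (2, 0.987)
```

On the PROBA-V benchmark the score is a ratio against the baseline, so lower is better; that is why the sketch takes the minimum.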
4. Validate the model
- Set the paths of the models trained on the NIR band and the RED band, respectively.
- The val script outputs val_plot.png, which visualizes the results of each scene obtained by TR-MISR compared to the baseline.
python ./src/val.py
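Validation on PROBA-V uses a clearance-corrected PSNR (cPSNR). A simplified sketch of the metric is shown below: it corrects for a constant brightness bias over clear pixels; the full challenge metric additionally maximizes the score over small translations of the predicted image, which is omitted here:

```python
import numpy as np

def cpsnr(sr, hr, clear):
    """Brightness-bias-corrected PSNR over clear pixels (simplified).

    sr, hr: super-resolved and ground-truth images in [0, 1].
    clear:  boolean mask of clear pixels in the ground truth.
    """
    sr, hr = np.asarray(sr, float), np.asarray(hr, float)
    clear = np.asarray(clear, bool)
    b = (hr[clear] - sr[clear]).mean()              # brightness bias
    mse = ((hr[clear] - sr[clear] - b) ** 2).mean()  # bias-corrected MSE
    return 10.0 * np.log10(1.0 / mse)
```

The bias correction makes the score invariant to a constant intensity offset, so a prediction is not penalized for being uniformly brighter or darker than the ground truth.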
5. Test the model
The test script is mainly used to produce submissions for the leaderboard, since the ground truth of the testing set is not released. It outputs a submission zip with 16-bit images (located in './submission/') and visual results with 8-bit images (located in './submission_vis/'). The test platform is still open, allowing more methods to challenge their performance limits.
python ./src/test.py
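The two output formats differ only in bit depth. A minimal sketch of the conversion, assuming predictions are floats in [0, 1] (the value ranges and rounding are assumptions; the repository may scale differently):

```python
import numpy as np

def to_submission(sr_float):
    """Convert a super-resolved image in [0, 1] to the two formats
    described above: 16-bit for the submission zip and 8-bit for
    quick visual inspection."""
    sr = np.clip(np.asarray(sr_float, float), 0.0, 1.0)
    img16 = np.round(sr * 65535).astype(np.uint16)  # for ./submission/
    img8 = np.round(sr * 255).astype(np.uint8)      # for ./submission_vis/
    return img16, img8
```

The leaderboard expects 16-bit images because PROBA-V radiometry exceeds 8-bit precision; the 8-bit copies are only for eyeballing results.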
The leaderboard is shown as follows:
PROBA-V Benchmark (Multi-Frame Super-Resolution) | Papers With Code
If this work helps you, please cite:
@article{an2022tr,
title={TR-MISR: Multiimage Super-Resolution Based on Feature Fusion With Transformers},
author={An, Tai and Zhang, Xin and Huo, Chunlei and Xue, Bin and Wang, Lingfeng and Pan, Chunhong},
journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
volume={15},
pages={1373--1388},
year={2022},
publisher={IEEE}
}