Awesome
Effective Variance Attention-enhanced Diffusion Model (EVADM) for Crop Field Aerial Image Super Resolution
Overview 💥
This is the repository includes the models, methods and data developed in paper:
Effective variance attention-enhanced diffusion model for crop field aerial image super resolution that published in ISPRS Journal of Photogrammetry and Remote Sensing.
ResearchGate: ResearchGate Article
The Effective Variance Attention-enhanced Diffusion Model (EVADM) is designed to enhance the resolution and quality of aerial imagery, particularly focusing on high-resolution cropland images. By leveraging emerging diffusion models (DM) and introducing the Variance-Average-Spatial Attention (VASA) mechanism, EVADM significantly improves image super-resolution (SR) tasks.
<div style="text-align: center;"> <img src="img/Fig Abst.jpg" data-align="center" width="80%">Efficient VASA-enhanced Diffusion Model (EVADM) and the elevated image Variance after SR.
</div>Main Contributions
- Development of the CropSR Dataset: Created a high-resolution aerial image dataset, namely CropSR, with over 321,000 samples for self-supervised SR training.
- Introduction of Variance-Average-Spatial Attention (VASA): Designed a novel attention mechanism inspired by the trend of decreasing image variance with increasing flight altitude, enhancing SR model performance.
- Efficient VASA-enhanced Diffusion Model (EVADM): Developed a robust model that leverages VASA to improve the quality of aerial imagery super-resolution.
- Comprehensive Evaluation Metrics: Introduced the Super-Resolution Relative Fidelity Index (SRFI) for a nuanced assessment of structural and perceptual similarities in SR outputs.
Dataset
CropSR (for training)
- Description: A high-resolution aerial image dataset comprising over 321,000 samples for self-supervised SR training.
CropSR-FP/OR (for real-SR testing)
- Description: A combined dataset constructed from matched orthomosaic mapping (CropSR-OR) and fixed-point photographs (CropSR-FP).
- Total Pairs: More than 5,000 pairs.
- The test datasets can be accessed at CropSR (for Crop Field Aerial Image Super Resolution), Mendeley Data.
Model Performance
- Achieved a FID reduction of 14.6, and 27% boost of SRFI for ×2 real SR datasets.
- Achieved a FID reduction of 8.0, and 6% boost of SRFI for ×4 real SR datasets.
Generalization Ability
EVADM has demonstrated superior generalization capabilities on the open Agriculture-Vision dataset, highlighting its robustness across various aerial imagery tasks.
Ablation Studies
The model's effectiveness is validated through ablation studies and feature-attention map analyses, providing insights into the mechanism of VASA and the SR process.
Practical Applications
EVADM offers a promising approach for realistic aerial imagery super-resolution, showcasing high practicality for various downstream applications in agriculture and beyond.
💎 Go to /EVADM/ for demo & code
Installation
All models were implemented using Python and the PyTorch framework and trained on an NVIDIA RTX 4090 GPU. The EVADM model is based on the LDM (Rombach et al., 2022), please refer to both EVADM and LDM setup instructions. Download weights to EVADM/weights/ folder from weights.
Go under EVADM/ and run for EVADM SR usage demo:
python eva101_EVADM_infer.py
SRFI metrics
For calculating the SRFI model, see :
eva102_SRFI_metrics.py
License
This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments
We thank all reviewers for their constructive feedback, which greatly contributed to the improvement of this project.