<font size='5'>RSGPT: A Remote Sensing Vision Language Model and Benchmark</font>
Yuan Hu, Jianlong Yuan, Congcong Wen, Xiaonan Lu, Xiang Li☨
☨corresponding author
<a href='https://arxiv.org/abs/2307.15266'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
This is an ongoing project. We are working on increasing the dataset size.
Related Projects
<font size='5'>VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding</font>
Xiang Li, Jian Ding, Mohamed Elhoseiny
<a href='https://vrsbench.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a> <a href='https://arxiv.org/abs/2406.12384'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a> <a href='https://huggingface.co/datasets/xiang709/VRSBench'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a>
<font size='5'>Vision-language models in remote sensing: Current progress and future trends</font>
Xiang Li☨, Congcong Wen, Yuan Hu, Zhenghang Yuan, Xiao Xiang Zhu
<a href='https://ieeexplore.ieee.org/abstract/document/10506064/'><img src='https://img.shields.io/badge/Paper-IEEE-red'></a>
:fire: Updates
- [2024.06.19] We release VRSBench, a Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding. VRSBench contains 29,614 images with 29,614 human-verified detailed captions, 52,472 object references, and 123,221 question-answer pairs. Check the VRSBench Project Page.
- [2024.05.23] We release the RSICap dataset. Please fill out this form to get both the RSICap and RSIEval datasets.
- [2023.11.10] We release our survey on vision-language models in remote sensing: RSVLM.
- [2023.10.22] The RSICap dataset and code will be released upon paper acceptance.
- [2023.10.22] We release the evaluation dataset RSIEval. Please fill out this form to get the RSIEval dataset.
Dataset
- RSICap: 2,585 image-text pairs with high-quality human-annotated captions.
- RSIEval: 100 high-quality human-annotated captions paired with 936 open-ended visual question-answer pairs (see the loading sketch after this list).
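For reference, below is a minimal sketch of iterating over RSICap-style image-text pairs. The annotation file name (`captions.json`) and the `filename`/`caption` fields are hypothetical placeholders for illustration, not the confirmed schema; check the released archive for the actual layout.

```python
import json
from PIL import Image

# Hypothetical layout: one JSON annotation file plus an image folder.
# The file name and field names are placeholders, not the confirmed schema.
with open("RSICap/captions.json") as f:
    records = json.load(f)

for rec in records:
    # Load each image together with its human-annotated caption.
    image = Image.open(f"RSICap/images/{rec['filename']}").convert("RGB")
    caption = rec["caption"]
    # ... pass (image, caption) to a training or evaluation pipeline
```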
Code
The idea of finetuning our vision-language model is borrowed from MiniGPT-4: RSGPT is built by finetuning InstructBLIP on our RSICap dataset. An illustrative finetuning sketch is shown below.
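Since this repository builds on LAVIS and RSGPT finetunes InstructBLIP, a single finetuning step might look like the following sketch. This is an illustration under assumptions: the model name `blip2_vicuna_instruct` / `vicuna13b` comes from the public LAVIS model zoo, while the prompt, optimizer, and learning rate are placeholders rather than the authors' training recipe (see the paper for the actual configuration).

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load InstructBLIP (Vicuna backbone) through LAVIS; RSGPT finetunes
# this architecture on RSICap.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_vicuna_instruct",
    model_type="vicuna13b",
    is_eval=False,
    device=device,
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # placeholder lr

# One illustrative step on a single image-caption pair.
pil_image = Image.open("example_rs_image.png").convert("RGB")
image = vis_processors["eval"](pil_image).unsqueeze(0).to(device)
samples = {
    "image": image,
    "text_input": ["Describe the remote sensing image in detail."],  # placeholder prompt
    "text_output": ["A human-annotated RSICap caption goes here."],
}
loss = model(samples)["loss"]
loss.backward()
optimizer.step()
```

At inference time, the same LAVIS model exposes `model.generate({"image": image, "prompt": "..."})`, which is one way to issue RSIEval-style captioning and VQA queries.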
Acknowledgement
- MiniGPT-4. A popular open-source vision-language model.
- InstructBLIP. The model architecture of RSGPT follows InstructBLIP. Don't forget to check out this great open-source work if you don't already know it!
- Lavis. This repository is built upon Lavis!
- Vicuna. The language ability of Vicuna with only 13B parameters is just amazing. And it is open-source!
If you're using RSGPT in your research or applications, please cite using this BibTeX:
@article{hu2023rsgpt,
  title={RSGPT: A Remote Sensing Vision Language Model and Benchmark},
  author={Hu, Yuan and Yuan, Jianlong and Wen, Congcong and Lu, Xiaonan and Li, Xiang},
  journal={arXiv preprint arXiv:2307.15266},
  year={2023}
}