SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model
<br> <p align="center"> <img src="images/SkyEyeGPT.png" width="250"/> </p> <br> <div align="center"> <strong>Authors: Yang Zhan, Zhitong Xiong, Yuan Yuan</strong> <br> <strong>School of Artificial Intelligence, OPtics, and ElectroNics (iOPEN), Northwestern Polytechnical University</strong>
</div>

This is the official repository for the paper "SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model". [paper] [SkyEye-968k]
Please share a <font color='orange'>STAR ⭐</font> if this project helps you!
You can also find our curated list of remote sensing multimodal large language models (Vision-Language) here.
📢 Latest Updates
This is an ongoing project, and we will keep improving it.
- 📦 Chatbot, codebase, datasets, and models coming soon! 🚀
- Jun-12-2024: The RS instruction dataset SkyEye-968k is released. [huggingface] 🔥🔥
- Jan-18-2024: The paper is released. 🔥🔥
- Jan-17-2024: A curated list of remote sensing multimodal large language models (Vision-Language) is created. 🔥🔥
💬 SkyEyeGPT: Remote Sensing Multi-modal Chatbot
The online demo will be released soon.
<div align="center"> <img src="images/chatbot.png"/> </div>

<img src="images/SkyEyeGPT.png" height="30"> SkyEyeGPT: Architecture
The model and checkpoint are coming soon! 🚀
<div align="center"> <img src="images/model.png"/> </div>

🌋 SkyEye-968k: Unified RS Vision-Language Instruction
The unified remote sensing vision-language instruction dataset is now available for download! 🚀
Download link: https://huggingface.co/datasets/ZhanYang-nwpu/SkyEye-968k
<div align="center"> <img src="images/dataset.png" height="400"/> </div>

📦 Performance
<div align="center"> <img src="images/performance.png" height="400"/> </div>

👁️ Visualization
1. Detailed description
<div align="center"> <img src="images/detailed_descr.png"/> </div>

2. Some testing samples of captioning, grounding, and VQA
<div align="center"> <img src="images/some_sample.png"/> </div>

👁️ Qualitative results
1. Remote Sensing Visual Grounding
<div align="center"> <img src="images/RSVG.png"/> </div>

2. Remote Sensing Phrase Grounding

<div align="center"> <img src="images/RSPG.png"/> </div>

3. Remote Sensing Image Captioning

<div align="center"> <img src="images/RSIC.png"/> </div>

4. UAV Aerial Video Captioning

<div align="center"> <img src="images/UAVC.png"/> </div>

5. Remote Sensing Visual Question Answering

<div align="center"> <img src="images/RSVQA.png"/> </div>

6. Remote Sensing Referring Expression Generation

<div align="center"> <img src="images/RSREG.png"/> </div>

7. Remote Sensing Scene Classification

<div align="center"> <img src="images/RSSC.png"/> </div>

🔍 Quantitative results
1. Remote Sensing Image Captioning
<div align="center"> <img src="images/T_RSIC1.png"/> </div> <div align="center"> <img src="images/T_RSIC2.png"/> </div>

2. UAV Aerial Video Captioning

<div align="center"> <img src="images/T_UAVC.png"/> </div>

3. Remote Sensing Visual Grounding

<div align="center"> <img src="images/T_RSVG.png" height="250"/> </div>

4. Remote Sensing Visual Question Answering

<div align="center"> <img src="images/T_RSVQA1.png"/> </div> <div align="center"> <img src="images/T_RSVQA2.png" height="250"/> </div>

📜 Citation
```bibtex
@misc{zhan2024skyeyegpt,
      title={SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model},
      author={Yang Zhan and Zhitong Xiong and Yuan Yuan},
      year={2024},
      eprint={2401.09712},
      archivePrefix={arXiv}
}
```
🙏 Acknowledgement
Our code is based on MiniGPT-4, shikra, and MiniGPT-v2. We sincerely thank their authors for releasing the source code. We are also grateful to EVA and LLaMA2 for open-sourcing their models. We would like to thank Zhitong Xiong and Yuan Yuan for their help with the manuscript, and the School of Artificial Intelligence, OPtics, and ElectroNics (iOPEN), Northwestern Polytechnical University for supporting this work.
🤖 Contact
If you have any questions about this project, please feel free to contact zhanyangnwpu@gmail.com.