🎮 Awesome Remote Sensing Image-Text Retrieval | Remote Sensing Cross-modal Retrieval | Remote Sensing Vision-Language Models

🧭 Guideline

A benchmark and curated collection of papers on Remote Sensing Image-Text Retrieval (RSITR) | Remote Sensing Cross-modal Retrieval (RSCMR), gathered from the Internet. If there are any omissions, please contact me at jiancheng.pan.plus@gmail.com. 🤝 If you want to join the Remote Sensing Vision-Language Models (RSVLMs) community, you can join the Slack Group.

💻 News

Major news from the RSVLMs community is recorded here.

📊 Remote Sensing Captions Dataset

A collection of the more popular remote sensing image-text pair datasets. Please get in touch if you know of others that should be added.

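| Dataset Name | Number of Images | Image Resolution | VLMs |
| --- | --- | --- | --- |
| UCM-Captions | 2,100 | 256 × 256 | - |
| Sydney-Captions | 613 | 500 × 500 | - |
| RSICD | 10,921 | 224 × 224 | - |
| RSITMD | 4,743 | 256 × 256 | - |
| NWPU-Captions | 31,500 | 256 × 256 | - |
| RS5M | 5 million+ | All Resolutions | GeoRSCLIP |
| SkyScript | 5.2 million+ | All Resolutions | SkyCLIP |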

🆚 RSITR | RSCMR Benchmark

Contributions of additional RSITR | RSCMR methods are welcome.

📌 Cross-Modal Retrieval on RSICD:

https://paperswithcode.com/sota/cross-modal-retrieval-on-rsicd

📌 Cross-Modal Retrieval on RSITMD:

https://paperswithcode.com/sota/cross-modal-retrieval-on-rsitmd

📖 RSITR | RSCMR Method

Closed-Domain Method: Training and testing on a single dataset.

Open-Domain Method: Using extra datasets for pre-training to gain more inter-domain knowledge.

Hashing Method: Encoding images and text as compact binary codes so that efficient retrieval on large-scale datasets becomes feasible.
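To make the hashing idea concrete, here is a minimal sketch. It is not taken from any method in this list. It assumes each image and caption has already been mapped to a real-valued embedding by some encoder, and it uses plain sign-based binarization as a stand-in for a learned hash function. The names `binarize`, `hamming_distance`, and `retrieve`, and the toy embeddings, are hypothetical and exist only for illustration.

```python
from typing import List, Sequence


def binarize(embedding: Sequence[float]) -> int:
    """Pack the sign pattern of an embedding into an integer hash code.

    Assumption: sign thresholding stands in for a learned hash function.
    """
    code = 0
    for value in embedding:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hash codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, gallery_codes: Sequence[int], top_k: int = 3) -> List[int]:
    """Return indices of the top_k gallery codes nearest to the query in Hamming distance."""
    ranked = sorted(
        range(len(gallery_codes)),
        key=lambda i: hamming_distance(query_code, gallery_codes[i]),
    )
    return ranked[:top_k]


# Toy 4-dimensional embeddings standing in for encoder outputs (made-up values).
image_embeddings = [
    [0.9, -0.2, 0.4, -0.7],   # image 0
    [-0.5, 0.8, -0.1, 0.3],   # image 1
    [0.7, -0.6, -0.3, -0.2],  # image 2
]
text_embedding = [0.8, -0.1, 0.5, -0.9]  # caption used as the query

gallery = [binarize(e) for e in image_embeddings]
query = binarize(text_embedding)
ranking = retrieve(query, gallery, top_k=3)
print(ranking)  # image indices ordered from nearest to farthest
```

Each comparison reduces to an XOR and a bit count on a short code, which is why this family of methods scales to large galleries. Real hashing methods learn the binarization so that matching image-text pairs land on nearby codes.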

Open-Domain Method

Closed-Domain Method

Hashing Method